Towards Cognitive Agents for BigData Discovery
![Page 1: Towards Cognitive Agents for BigData Discovery](https://reader034.fdocuments.us/reader034/viewer/2022051609/547bd2905806b5f93f8b469f/html5/thumbnails/1.jpg)
Towards Cognitive Agents for BigData Discovery
Finding Solutions to Complex, Urgent Problems
Jack Park
BigData Science Meetup
Fremont, CA: 19 April, 2014
Shyam Sarkar, Organizer
© 2014, TopicQuests Foundation
Licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 3.0 Unported License.
![Page 2: Towards Cognitive Agents for BigData Discovery](https://reader034.fdocuments.us/reader034/viewer/2022051609/547bd2905806b5f93f8b469f/html5/thumbnails/2.jpg)
A Narrative Arc
![Page 3: Towards Cognitive Agents for BigData Discovery](https://reader034.fdocuments.us/reader034/viewer/2022051609/547bd2905806b5f93f8b469f/html5/thumbnails/3.jpg)
Context: Two Kinds of Discovery
• Data-based
– Harvesting nuggets from collected data
• Literature-based: Deep Question Answering
– Discovering connections between dots in the literature
![Page 4: Towards Cognitive Agents for BigData Discovery](https://reader034.fdocuments.us/reader034/viewer/2022051609/547bd2905806b5f93f8b469f/html5/thumbnails/4.jpg)
Target: Deep Question Answering
[Diagram labels: Breadth (axis), Depth (axis), Information Retrieval, Semantic Representation, Goal]
Diagram adapted from a talk by Percy Liang at Stanford, 20140407
![Page 5: Towards Cognitive Agents for BigData Discovery](https://reader034.fdocuments.us/reader034/viewer/2022051609/547bd2905806b5f93f8b469f/html5/thumbnails/5.jpg)
Our Goals
• Improve Human-Tool Capabilities
• Augment existing analytic methods
– Increase opportunities for discovery
– Improve already sophisticated methods
“Discovery consists of seeing what everybody has seen and thinking what nobody has thought.”
–Albert Szent-Györgyi
![Page 6: Towards Cognitive Agents for BigData Discovery](https://reader034.fdocuments.us/reader034/viewer/2022051609/547bd2905806b5f93f8b469f/html5/thumbnails/6.jpg)
Our Approach
• Explore and develop the technologies of so-called Cognitive Agents
– Current examples
• IBM’s Watson
• SIRI
• An opportunity
– Couple two platforms
• Berkeley Data Analytics Stack (BDAS)
• SolrSherlock
![Page 7: Towards Cognitive Agents for BigData Discovery](https://reader034.fdocuments.us/reader034/viewer/2022051609/547bd2905806b5f93f8b469f/html5/thumbnails/7.jpg)
Berkeley Data Analytics Stack Deep QA Issues*
• Low latency queries
– Perform faster inferences
– Explore larger spaces
– Better decisions
• Sophisticated analysis
– Better forecasts
– Better decisions
• Unification of existing data computation models
– Integrate interactive queries, batch and streaming processing
*http://strata.oreilly.com/2013/02/the-future-of-big-data-with-bdas-the-berkeley-data-analytics-stack.html
![Page 8: Towards Cognitive Agents for BigData Discovery](https://reader034.fdocuments.us/reader034/viewer/2022051609/547bd2905806b5f93f8b469f/html5/thumbnails/8.jpg)
An Observation
In this context, interesting literature is about the social lives of data
![Page 9: Towards Cognitive Agents for BigData Discovery](https://reader034.fdocuments.us/reader034/viewer/2022051609/547bd2905806b5f93f8b469f/html5/thumbnails/9.jpg)
Literature-based Discovery
• Forming bisociative links* between information in different literature sources which are not known to be related
• Swanson example (simplified)**:
– Literature associated with Raynaud’s
• Raynaud’s therapy linked to blood thinners
– Literature associated with fish oils
• Fish oil linked to blood thinners
– “Blood thinners” as an implicit link between fish oil and Raynaud’s Syndrome
• Akin to the wormholes formed by tags on web pages or hashtags
*Arthur Koestler (1964). The Act of Creation
**Swanson, Don (1986) "Fish oil, Raynaud's syndrome, and undiscovered public knowledge." Perspectives in Biology and Medicine 30(1): 7-18.
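The Swanson example can be sketched as a search for intermediate terms shared by two literatures that are not already known to be related. This is a minimal illustration, not Swanson's actual procedure: the function name `implicit_links`, the `known_pairs` parameter, and the toy term sets are assumptions made up for this sketch.

```python
from itertools import combinations


def implicit_links(literature, known_pairs=frozenset()):
    """Return {(a, c): shared_terms} for topic pairs not already known to be related.

    `literature` maps a topic to the set of intermediate concepts that the
    papers about that topic mention (hypothetical, hand-made data).
    """
    links = {}
    for a, c in combinations(sorted(literature), 2):
        if (a, c) in known_pairs or (c, a) in known_pairs:
            continue  # already connected in the literature; nothing to discover
        shared = literature[a] & literature[c]
        if shared:
            links[(a, c)] = shared
    return links


# Toy data modeled on the simplified example above.
literature = {
    "fish oil": {"blood thinners", "omega-3"},
    "Raynaud's syndrome": {"blood thinners", "cold sensitivity"},
}

links = implicit_links(literature)
for (a, c), shared in links.items():
    print(f"{a} <-> {c} via {sorted(shared)}")
```

A real system would extract the intermediate terms from text and rank candidate links; here the shared set alone stands in for the implicit connection.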
![Page 10: Towards Cognitive Agents for BigData Discovery](https://reader034.fdocuments.us/reader034/viewer/2022051609/547bd2905806b5f93f8b469f/html5/thumbnails/10.jpg)
Cognitive Agents
• Examples
– Proprietary
• IBM’s Watson
• SIRI
• SRI’s CALO
– Part of which, IRIS, was made open source as OpenIRIS
• Others…
– Open Source
• Cougaar
– http://www.cougaar.org/
• OpenCog
– http://opencog.org/
• Open Advancement of Question Answering Systems
– Closely related to IBM’s Watson
– http://oaqa.github.io/
• SolrSherlock
– http://debategraph.org/SolrSherlock
• Many others…
![Page 11: Towards Cognitive Agents for BigData Discovery](https://reader034.fdocuments.us/reader034/viewer/2022051609/547bd2905806b5f93f8b469f/html5/thumbnails/11.jpg)
Use Cases for Big Data Harvesting
• Resource Collection
– Federation
• bring together and organize without filters
• Resource Augmentation
– Tagging
– Annotating
– Debate
• Knowledge Cartography
– Connecting resources
– Map maintenance
– More Debate
• Research Augmentation
– Crowd-sourced discovery
– Harvesting
– Automated inferences/reasoning
– Knowledge sharing
Federated Information Resources
Harvesting Activities
Adapted from http://www.slideshare.net/jackpark/big-datasciencemeetup-final Slide 29
![Page 12: Towards Cognitive Agents for BigData Discovery](https://reader034.fdocuments.us/reader034/viewer/2022051609/547bd2905806b5f93f8b469f/html5/thumbnails/12.jpg)
A Strong Conjecture
• A Knowledge Federation’s topic map provides a Rosetta Stone-like substrate
– Reasoning by analogy
– Big Data mined for clues
– Map:
• Where we have been
• Where we haven’t (Dragons be here)
Adapted from http://www.slideshare.net/jackpark/big-datasciencemeetup-final Slide 33
![Page 13: Towards Cognitive Agents for BigData Discovery](https://reader034.fdocuments.us/reader034/viewer/2022051609/547bd2905806b5f93f8b469f/html5/thumbnails/13.jpg)
Topic Maps for Knowledge Federation
• Maintain a structure that is well organized by topic
• Key issue:
– For any given information resource added to a map:
• Agents must answer this question:
– Have I seen this before by any other name or description?
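The "have I seen this before?" question can be sketched as a lookup over every name and identifier a topic has been known by. This is a minimal sketch under stated assumptions, not SolrSherlock's implementation: the `TopicMap` class, its methods, and the example.org identifier are hypothetical, and matching is reduced to case-insensitive names plus exact identifiers.

```python
class TopicMap:
    """Toy topic map: one topic per subject, however many names it goes by."""

    def __init__(self):
        self.topics = []   # every distinct topic seen so far
        self._index = {}   # (kind, value) -> topic

    @staticmethod
    def _keys(names, identifiers):
        # Names are compared case-insensitively; identifiers must match exactly.
        keys = {("name", n.strip().lower()) for n in names}
        keys |= {("id", i) for i in identifiers}
        return keys

    def find(self, names=(), identifiers=()):
        """Return a topic already known by any of these names or identifiers."""
        for key in self._keys(names, identifiers):
            if key in self._index:
                return self._index[key]
        return None

    def add(self, names=(), identifiers=()):
        """Merge into an existing topic if one matches, else create a new topic.

        Simplification: a resource that matches two different existing topics
        is merged into only one of them; a fuller version would merge both.
        """
        topic = self.find(names, identifiers)
        if topic is None:
            topic = {"names": set(), "identifiers": set()}
            self.topics.append(topic)
        topic["names"].update(names)
        topic["identifiers"].update(identifiers)
        for key in self._keys(names, identifiers):
            self._index[key] = topic
        return topic


tm = TopicMap()
first = tm.add(names=["Raynaud's syndrome"], identifiers=["http://example.org/raynaud"])
second = tm.add(names=["Raynaud phenomenon"], identifiers=["http://example.org/raynaud"])
third = tm.add(names=["  raynaud phenomenon "])
print(len(tm.topics), sorted(first["names"]))
```

The second resource arrives under a different name but the same identifier, and the third under a differently spelled name only; all three land on one topic.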
![Page 14: Towards Cognitive Agents for BigData Discovery](https://reader034.fdocuments.us/reader034/viewer/2022051609/547bd2905806b5f93f8b469f/html5/thumbnails/14.jpg)
Are We There Yet?
• We are now at the edges of discovery:
– Deeper ways of representing
– Deeper ways of knowing
• Relational Biology
![Page 15: Towards Cognitive Agents for BigData Discovery](https://reader034.fdocuments.us/reader034/viewer/2022051609/547bd2905806b5f93f8b469f/html5/thumbnails/15.jpg)
Relational Biology
• Paraphrasing Nicholas Rashevsky*:
– We can tease open a living cell and count all its components, but we cannot put it back together and we have no clue why
• Interpreting Robert Rosen**:
– Rashevsky’s quest for a relational mathematics for biology (complex systems) entails topological algebras (Category Theory)
• Category theory is said to facilitate modeling the social lives of members of the categories
*http://en.wikipedia.org/wiki/Nicholas_Rashevsky
**http://en.wikipedia.org/wiki/Robert_Rosen_(theoretical_biologist)
![Page 16: Towards Cognitive Agents for BigData Discovery](https://reader034.fdocuments.us/reader034/viewer/2022051609/547bd2905806b5f93f8b469f/html5/thumbnails/16.jpg)
Relational Modeling 1
• Starts with Ontologies
– Ontologies grant uniform vocabularies to universes of discourse
• Including describing data
– Ontology-based frameworks provide ways to model social and other relational structures
• SIOC: Semantically Interlinked Online Communities*
• SWAN: Semantic Web Applications in Neuromedicine**
*http://www.sioc-project.org/
**http://www.w3.org/TR/hcls-swan/
![Page 17: Towards Cognitive Agents for BigData Discovery](https://reader034.fdocuments.us/reader034/viewer/2022051609/547bd2905806b5f93f8b469f/html5/thumbnails/17.jpg)
SIOC Closer Look
• A way to model components entailed by a situation (blog post in this case)
– Uniform vocabulary
– Structural relations
• Creates a foundation for much deeper modeling
– Including:
• Other ontologies
• Other structures
• Feedback loops
SIOC Blog Post*
*http://rdfs.org/sioc/spec/
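The uniform vocabulary and structural relations of a SIOC blog post can be illustrated with plain subject-predicate-object triples. This is a sketch, not the figure from the slide: the `http://example.org/` resources and the helper names `objects` and `participants` are hypothetical, while the `sioc:` terms (`Post`, `UserAccount`, `has_creator`, `has_container`, `has_reply`) come from the SIOC Core vocabulary.

```python
SIOC = "http://rdfs.org/sioc/ns#"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
EX = "http://example.org/"  # hypothetical site used only for this sketch

# A blog post, its author, its container, and one reply, as triples.
triples = {
    (EX + "post/1", RDF_TYPE, SIOC + "Post"),
    (EX + "post/1", SIOC + "has_creator", EX + "user/alice"),
    (EX + "post/1", SIOC + "has_container", EX + "blog"),
    (EX + "post/1", SIOC + "has_reply", EX + "comment/7"),
    (EX + "comment/7", RDF_TYPE, SIOC + "Post"),
    (EX + "comment/7", SIOC + "has_creator", EX + "user/bob"),
    (EX + "user/alice", RDF_TYPE, SIOC + "UserAccount"),
    (EX + "user/bob", RDF_TYPE, SIOC + "UserAccount"),
}


def objects(subject, predicate):
    """All objects o such that (subject, predicate, o) is in the graph."""
    return {o for s, p, o in triples if s == subject and p == predicate}


def participants(post):
    """Creators of a post and of its direct replies: a small 'social' query."""
    people = objects(post, SIOC + "has_creator")
    for reply in objects(post, SIOC + "has_reply"):
        people |= objects(reply, SIOC + "has_creator")
    return people


print(sorted(participants(EX + "post/1")))
```

Because every resource uses the same vocabulary, the same query works on any post, which is the property the slide relies on for deeper modeling.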
![Page 18: Towards Cognitive Agents for BigData Discovery](https://reader034.fdocuments.us/reader034/viewer/2022051609/547bd2905806b5f93f8b469f/html5/thumbnails/18.jpg)
Massive Connectivity and Feedback
http://geography.oii.ox.ac.uk/?page=home
Complex Communication Processes
![Page 19: Towards Cognitive Agents for BigData Discovery](https://reader034.fdocuments.us/reader034/viewer/2022051609/547bd2905806b5f93f8b469f/html5/thumbnails/19.jpg)
Feedback Loops: Crucial to Learning
Image: FEDERAL HEALTH FUTURES SUMMIT LEADERSHIP LEARNING for TRANSFORMATIONAL CHANGE. September 10-11, 2012 Washington DC Metro Region Page 23
![Page 20: Towards Cognitive Agents for BigData Discovery](https://reader034.fdocuments.us/reader034/viewer/2022051609/547bd2905806b5f93f8b469f/html5/thumbnails/20.jpg)
Relational Biology: Context
• Context is about Relations among the components themselves
• Context is about Relations among the components and their environment
• Context is about Feedback
![Page 21: Towards Cognitive Agents for BigData Discovery](https://reader034.fdocuments.us/reader034/viewer/2022051609/547bd2905806b5f93f8b469f/html5/thumbnails/21.jpg)
Example from Breast Cancer 1
Extracellular Matrix (EM) as Context
Complex Communication Processes
Milk producing tissue
http://www.ted.com/talks/mina_bissell_experiments_that_point_to_a_new_understanding_of_cancer
![Page 22: Towards Cognitive Agents for BigData Discovery](https://reader034.fdocuments.us/reader034/viewer/2022051609/547bd2905806b5f93f8b469f/html5/thumbnails/22.jpg)
Example from Breast Cancer 2
Cells missing their EM
Cells with restored EM
http://www.ted.com/talks/mina_bissell_experiments_that_point_to_a_new_understanding_of_cancer
![Page 23: Towards Cognitive Agents for BigData Discovery](https://reader034.fdocuments.us/reader034/viewer/2022051609/547bd2905806b5f93f8b469f/html5/thumbnails/23.jpg)
Towards Cognitive Agents
• Harvest and represent
– Patterns
• Actors
• Relations
• States
– Context in which patterns exist
• Discover
– Processes
– Unrecognized connections
– …
![Page 24: Towards Cognitive Agents for BigData Discovery](https://reader034.fdocuments.us/reader034/viewer/2022051609/547bd2905806b5f93f8b469f/html5/thumbnails/24.jpg)
Watson’s Architecture* (Simplified)
• Analysis determines answer type and topics in play
• Hypothesis formation seeks candidate answers from sources
– Pattern matching
• Hypothesis scoring weighs evidence for each hypothesis
• Answer ranking uses models to select answer
[Diagram: Question → Analysis → Hypothesis Formation → Hypothesis Scoring → Answer Ranking → Answer; Hypothesis Formation draws on Answer Sources, Hypothesis Scoring draws on Evidence Sources]
*http://www.aaai.org/Magazine/Watson/watson.php
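The four stages above can be sketched as a toy pipeline. This is an illustration of the stage boundaries only, not Watson's code: every function name, the two source lists, and the word-overlap scoring are assumptions for this sketch, and a plain sort stands in for the trained ranking models.

```python
# Hypothetical, hand-made sources: (candidate, answer type, supporting text).
ANSWER_SOURCES = [
    ("Don Swanson", "person", "swanson linked fish oil and raynaud's syndrome"),
    ("Arthur Koestler", "person", "koestler linked unrelated frames of thought"),
    ("fish oil", "thing", "fish oil is linked to blood thinners"),
]
EVIDENCE_SOURCES = [
    "swanson 1986 fish oil raynaud's syndrome undiscovered public knowledge",
    "koestler 1964 the act of creation",
]


def analyze(question):
    """Analysis: guess the answer type and the topics in play (toy heuristic)."""
    words = question.lower().rstrip("?").split()
    answer_type = "person" if words[:1] == ["who"] else "thing"
    topics = {w for w in words if len(w) > 3}
    return answer_type, topics


def form_hypotheses(answer_type, topics):
    """Hypothesis formation: pattern-match candidates of the right type."""
    return [
        candidate
        for candidate, kind, text in ANSWER_SOURCES
        if kind == answer_type and topics & set(text.split())
    ]


def score(candidate, topics):
    """Hypothesis scoring: count topic words in evidence naming the candidate."""
    surname = candidate.split()[-1].lower()
    total = 0
    for passage in EVIDENCE_SOURCES:
        words = set(passage.split())
        if surname in words:
            total += len(topics & words)
    return total


def answer(question):
    """Answer ranking: the best-scored hypothesis, or None if there is none."""
    answer_type, topics = analyze(question)
    hypotheses = form_hypotheses(answer_type, topics)
    ranked = sorted(hypotheses, key=lambda c: score(c, topics), reverse=True)
    return ranked[0] if ranked else None


print(answer("Who linked fish oil to Raynaud's syndrome?"))
```

Keeping answer sources and evidence sources separate mirrors the diagram: candidates are generated from one set of material and then weighed against another.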
![Page 25: Towards Cognitive Agents for BigData Discovery](https://reader034.fdocuments.us/reader034/viewer/2022051609/547bd2905806b5f93f8b469f/html5/thumbnails/25.jpg)
SolrSherlock Architecture (Simplified)
Topic Map
Conceptual Graphs
Harvested Documents
Harvester: HyperMembrane
Information Fabrics, Agents
Literature-based Discovery: Process documents into structures (information fabrics) from which patterns are harvested.
Federate Data Analysis with Literature: Federate data observations and predictions with concepts and relations harvested from the literature.
Model Processes, Structures, and Analogies
![Page 26: Towards Cognitive Agents for BigData Discovery](https://reader034.fdocuments.us/reader034/viewer/2022051609/547bd2905806b5f93f8b469f/html5/thumbnails/26.jpg)
SolrSherlock Component Diagram
Topic Map
Conceptual Graph
Information Fabrics
TSC
TM Provider
CG Provider
Machine Reader
TSC Provider
Open DeepQA Harvester
Persistence
Providers
Agents
![Page 27: Towards Cognitive Agents for BigData Discovery](https://reader034.fdocuments.us/reader034/viewer/2022051609/547bd2905806b5f93f8b469f/html5/thumbnails/27.jpg)
Looking Forward
• Coupling Literature-based research with BigData analysis
– Common ontologies
– Hypothesis formation
– Evidence gathering
– Relation discovery
![Page 28: Towards Cognitive Agents for BigData Discovery](https://reader034.fdocuments.us/reader034/viewer/2022051609/547bd2905806b5f93f8b469f/html5/thumbnails/28.jpg)
Completed Representation
antioxidants kill free radicals
Contraindicates
macrophages use free radicals to kill bacteria
Bacterial Infection
Antioxidants
Because
Appropriate For
Compromised Host
Let us co-create Cognitive Agents for Discovery [email protected]
Thanks to Martin Radley, Patrick Durusau, Sherry Jones, and Mark Szpakowski for valuable comments
SolrSherlock at: http://debategraph.org/SolrSherlock and https://github.com/SolrSherlock