From Digital Library to Digital Librarian
A case study on Microsoft Academic Services
Kuansan Wang
Director and Principal Researcher
Internet Service Research Center
Microsoft Research, Redmond WA, USA
“As we may think” (Vannevar Bush, 1945)
Source: http://w3.org/Proposal.html
www.webat25.org
Lance Ulanoff, Mashable.com, December 4 2015
Hiro: Who worshipped Asherah?
Librarian: Everyone who lived between India and Spain, from the second millennium B.C. up into the Christian era. With the exception of the Hebrews, who only worshipped her until the religious reforms....
Hiro: I thought the Hebrews were monotheists….
Librarian: Monolatrists. They did not deny the existence of other gods. Asherah was venerated as the consort of Yahweh.
Hiro: I don't remember anything about God having a wife in the Bible.
Librarian: The Bible didn't exist at that point. Judaism was just a loose collection of Yahwistic cults, each with different shrines and practices.
Hiro and the Librarian, Chapter 30, Snow Crash (1992), Neal Stephenson
In Ugarit
In Egypt
In Israel and Judah
In Arabia
Semantic Web (Tim Berners-Lee, 2000)
“The intelligent agent that people have touted for ages will finally materialize.”
http://www.w3.org/2000/Talks/1206-xml2k-tbl/slide10-0.html
Semantic Web
• Human readable vs machine readable contents
• Human defines standard for data formats and models
• Explicit and precise specification of knowledge representation that everyone has to agree upon
Knowledge Web
• Machine reads human readable contents
• Machine learns to conflate different formats of the same thing
• Latent and fuzzy representation of knowledge learned by mining big data
Paradigm Shift in Web Search (the “Librarian”)
TRADITIONAL WEB SEARCH
• Index Keywords in Documents
• Match Keywords in Queries
• Relevance of “10 blue links”
KNOWLEDGE WEB SEARCH
• Digest World’s Knowledge
• Match User Intent
• Dialog Experience
1. “Bing Dialog Model: Knowledge, Intent and Dialog”, MSR Faculty Summit, July 2010
2. “Introducing the Knowledge Graph: things, not strings”, Official Google Blog, May 2012
3. “Chinese Search Engine – Baidu’s Practice”, SIRIP, SIGIR 2014, July 2014
“Dialog Acts” in Bing/Cortana
• Answer
• Confirmation
• Disambiguation
• Suggestion
  • Progressive: Refinement
  • Digressive: Recommendation (reactive + proactive)
• Key difference from human-to-human dialog
  • Not limited to anthropomorphic natural language dialogs
  • Each dialog turn can present multiple acts
  • Can overload back channel communications
[Screenshots with callouts: Answer, Confirmation, Progressive Suggestion, Digressive Suggestion]
Closed-loop Dynamic Bayesian Inference
[Diagram blocks: Knowledge + History; Intent Model; Interaction Model; Behavior Model; Previous Inferences (K, I_{t-1}); User Behavior (U_t); Expected Behavior (U'_t); Inferred Intent (I_t); Inferred Action (A_t)]
Bayesian Minimum Risk
$$I_t = \arg\max_{I} P(I \mid U_t, K, I_{t-1})$$
$$A_t = \arg\min_{A} E[\mathrm{Cost}(A, I_t)]$$
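The two Bayesian minimum risk rules can be sketched in a few lines. This is only an illustration, not the Bing/Cortana implementation: the intent names, the probabilities, the cost table, and the helper names `infer_intent`, `map_intent` and `min_risk_action` are all made up, and the expectation in the action rule is assumed to be taken over the posterior on intents.

```python
from typing import Dict


def infer_intent(prior: Dict[str, float],
                 likelihood: Dict[str, Dict[str, float]],
                 behavior: str) -> Dict[str, float]:
    """Posterior over intents: P(I | U_t, ...) is proportional to P(U_t | I) * prior(I)."""
    unnormalized = {i: prior[i] * likelihood[i].get(behavior, 0.0) for i in prior}
    total = sum(unnormalized.values())
    if total == 0.0:
        # The observed behavior is impossible under every intent: keep the prior.
        return dict(prior)
    return {i: p / total for i, p in unnormalized.items()}


def map_intent(posterior: Dict[str, float]) -> str:
    """I_t = arg max over intents of the posterior."""
    return max(posterior, key=posterior.get)


def min_risk_action(posterior: Dict[str, float],
                    cost: Dict[str, Dict[str, float]]) -> str:
    """A_t = arg min over actions of the expected cost under the posterior."""
    def expected_cost(action: str) -> float:
        return sum(posterior[i] * cost[action][i] for i in posterior)
    return min(cost, key=expected_cost)


# Hypothetical numbers: the previous inference plays the role of the prior.
prior = {"author": 0.5, "paper": 0.5}
likelihood = {
    "author": {"types_name": 0.8, "types_title": 0.2},
    "paper": {"types_name": 0.3, "types_title": 0.7},
}
cost = {
    "answer_author": {"author": 0.0, "paper": 1.0},
    "answer_paper": {"author": 1.0, "paper": 0.0},
    "disambiguate": {"author": 0.2, "paper": 0.2},
}

posterior = infer_intent(prior, likelihood, "types_name")
print(map_intent(posterior), min_risk_action(posterior, cost))
```

With these numbers the most likely intent is "author", yet the cheapest action in expectation is to disambiguate, because a wrong answer is assumed to cost more than asking.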
Digital Librarian for Researchers: How far are we?
A Case Study on Microsoft Academic Search
Predictive Completion and Disambiguation
Progressive Suggestion: Topic recognition, not just keyword extraction
Progressive Suggestion
Knowledge Driven Suggestions
Research Challenges
• How to complete academic queries that have never been seen before?
• How to rank completion suggestions?
• How to avoid completions that lead to no search results?
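The last two challenges can be illustrated with a toy prefix completer. It is a minimal sketch under stated assumptions: the candidate queries, their static scores, the result counts and the function name `complete` are hypothetical, and it does not address the first challenge, since it can only complete to candidates it already knows.

```python
from typing import Dict, List


def complete(prefix: str,
             static_rank: Dict[str, float],
             result_counts: Dict[str, int],
             k: int = 3) -> List[str]:
    """Return up to k completions of `prefix`, best static score first.

    Candidates with no search results are dropped so that no suggestion
    leads to an empty result page.
    """
    p = prefix.lower()
    candidates = [
        q for q in static_rank
        if q.lower().startswith(p) and result_counts.get(q, 0) > 0
    ]
    # Sort by descending score; break ties alphabetically for stable output.
    return sorted(candidates, key=lambda q: (-static_rank[q], q))[:k]


# Hypothetical candidates and scores.
static_rank = {
    "machine learning": 0.9,
    "machine knitting": 0.7,
    "machine translation": 0.6,
    "machine vision": 0.4,
}
result_counts = {
    "machine learning": 120,
    "machine knitting": 0,   # would lead to an empty result page
    "machine translation": 45,
    "machine vision": 12,
}

print(complete("mach", static_rank, result_counts, k=2))
```

Here "machine knitting" scores well but is filtered out because it has no results.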
More on Intent Inference
• Generative model approach:
  • Dynamic ranking: score depending on user behavior (e.g., query)
  • Static ranking: score determined by knowledge and dialog history
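The formulas for the two scores did not survive in this transcript. One plausible reading, offered as an assumption and not as the slide's original notation, is the Bayes factorization of the intent posterior used in the Bayesian minimum risk rule, with the likelihood playing the role of the dynamic rank and the prior the role of the static rank:

```latex
% Assumed reconstruction: the symbols on the original slide are missing.
P(I \mid U_t, K, I_{t-1})
  \;\propto\;
  \underbrace{P(U_t \mid I, K, I_{t-1})}_{\text{dynamic rank: depends on user behavior}}
  \;\cdot\;
  \underbrace{P(I \mid K, I_{t-1})}_{\text{static rank: knowledge and dialog history}}
```

Under this reading, before any user behavior is observed only the static term is available.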
Special Case 1: Static Rank at Onset
• Given the knowledge graph, find the static rank for all entity types
  • Journal, article, conference, author, institution
• Journal impact factor: E. Garfield, Science, 1972
• PageRank: A paper is important if cited by important papers
  • G. Pinski and F. Narin, Information Processing and Management, 1976
  • N. Geller, Information Processing and Management, 1978
  • Rediscovery of Perron-Frobenius theorem (1904)
• How to make better use of heterogeneity of the graph?
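The idea that a paper is important if cited by important papers can be sketched as a power iteration over a citation graph. This is a generic PageRank illustration, not the ranking used by Microsoft Academic: the four-paper graph, the damping factor 0.85 and the iteration count are assumptions, and the graph is homogeneous (papers only), so it does not touch the heterogeneity question.

```python
from typing import Dict, List


def pagerank(citations: Dict[str, List[str]],
             damping: float = 0.85,
             iterations: int = 50) -> Dict[str, float]:
    """Power iteration on a citation graph.

    citations[p] lists the papers that p cites. Papers that cite nothing
    spread their rank uniformly over all papers.
    """
    nodes = set(citations) | {c for cited in citations.values() for c in cited}
    n = len(nodes)
    rank = {v: 1.0 / n for v in nodes}
    for _ in range(iterations):
        # Rank held by papers with no outgoing citations.
        dangling = sum(rank[v] for v in nodes if not citations.get(v))
        new_rank = {v: (1.0 - damping) / n + damping * dangling / n for v in nodes}
        for v, cited in citations.items():
            for c in cited:
                # Each citing paper splits its rank evenly among its references.
                new_rank[c] += damping * rank[v] / len(cited)
        rank = new_rank
    return rank


# Hypothetical graph: A and B cite C, and C cites D.
ranks = pagerank({"A": ["C"], "B": ["C"], "C": ["D"], "D": []})
for paper in sorted(ranks, key=ranks.get, reverse=True):
    print(paper, round(ranks[paper], 3))
```

In this toy graph D receives a single citation, but it comes from the well-cited paper C, so D ends up ranked above C.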
Static Rank of a Paper
• Inaugural WSDM Cup, Autumn 2015
• Industry organizer: MSR and Elsevier
• http://www.wsdm-conference.org/2016/wsdm-cup.html
Microsoft Academic Graph
• Institution (20K)
• Paper (> 100M)
• Venue (> 23K)
• Event (> 46K)
• Field of Study (> 50K)
• Author (> 40M)
• Citations (billions)
Microsoft Academic Graph (MAG)
• Data Releases on Azure
• Free Azure access for research
• http://research.microsoft.com/MAG
• Web Service API coming!
• Community properties
Special Case 2: “Zero-query” Suggestion
• Digital librarian to notify me of new materials I should read
• Find them whenever the knowledge graph grows
• Best if:
  • Tailored to the user, based on interests inferred from aggregated behaviors
  • Following the user wherever, whenever and whatever
• Cortana: intelligent personal assistant
  • Windows, Android, iOS
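A zero-query suggestion can be sketched as scoring newly added papers against an interest profile built from past behavior. This is a minimal sketch under assumptions: the `Paper` record, the field-of-study labels, the frequency-based profile and the 0.3 threshold are all hypothetical and say nothing about how Cortana actually infers interests.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class Paper:
    title: str
    fields: Tuple[str, ...]  # fields of study attached to the paper


def interest_profile(history: List[Paper]) -> Dict[str, float]:
    """Relative frequency of each field among papers the user interacted with."""
    counts = Counter(f for paper in history for f in paper.fields)
    total = sum(counts.values())
    return {f: c / total for f, c in counts.items()} if total else {}


def suggest(new_papers: List[Paper],
            profile: Dict[str, float],
            threshold: float = 0.3) -> List[Paper]:
    """Papers newly added to the graph that match the profile, best match first."""
    scored = [(sum(profile.get(f, 0.0) for f in p.fields), p) for p in new_papers]
    scored.sort(key=lambda pair: -pair[0])
    return [p for score, p in scored if score >= threshold]


# Hypothetical reading history and newly added papers.
history = [
    Paper("Learning to rank", ("information retrieval", "machine learning")),
    Paper("Query auto-completion", ("information retrieval",)),
]
new_papers = [
    Paper("Protein folding survey", ("biology",)),
    Paper("Neural ranking models", ("information retrieval",)),
]

for paper in suggest(new_papers, interest_profile(history)):
    print("Notify:", paper.title)
```

Only the ranking paper is surfaced, since the user has never shown interest in biology. The user issues no query at any point, and the function would be rerun each time the graph grows.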
Summary
• Intelligent agent at web scale (“digital librarian”):
  • From keyword matching to intent/knowledge understanding
  • One year old for academic services!
• Conduct interactive dialog or forage on behalf of users behind the scenes
  • Albeit without an anthropomorphic façade
• Microsoft Academic Services:
  • Search (reactive)
  • Cortana notification (proactive)
  • Data and Intelligent API
    • We want to build a community