Slide 1: Scaling Textual Inference to the Web
Stefan Schoenmackers, Oren Etzioni, and Daniel S. Weld
Presented by Kristine Monteith, CS 652, 5/8/09
Slide 2: The Problem
There is lots of information on the web, but answers to questions aren't always stated explicitly.
Query: "What vegetables help prevent osteoporosis?"
- You are not going to find "Kale prevents osteoporosis" stated directly
- You need to infer it from:
  - kale is a vegetable
  - kale contains calcium
  - calcium helps prevent osteoporosis
Slide 3: Overview
- HOLMES architecture (performs textual inference)
- Scaling inference to the Web
- Experimental results
- Related work
Slide 4: The HOLMES Architecture
- Information from knowledge bases, e.g. IsHighIn(kale, calcium), Prevents(calcium, osteoporosis)
- Inference rules, e.g. Prevents(X,Z) :- IsHighIn(X,Y) ^ Prevents(Y,Z)
- Queries, e.g. query(X) :- IS-A(X, vegetable) ^ Prevents(X, osteoporosis)
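The three layers above fit together as a Datalog-style fixpoint computation. The sketch below is illustrative, not the actual HOLMES engine; it reads the example rule with its variables chained as Prevents(X,Z) :- IsHighIn(X,Y) ^ Prevents(Y,Z), and it treats every fact as certain (HOLMES itself attaches probabilities via Markov Logic).

```python
# Illustrative forward-chaining sketch of the slide's example, not the
# actual HOLMES engine. Facts are stored as (relation, arg1, arg2) triples.

facts = {
    ("IS-A", "kale", "vegetable"),
    ("IsHighIn", "kale", "calcium"),
    ("Prevents", "calcium", "osteoporosis"),
}

def apply_rule(facts):
    """Prevents(X,Z) :- IsHighIn(X,Y) ^ Prevents(Y,Z)"""
    new = set()
    for rel1, x, y in facts:
        if rel1 != "IsHighIn":
            continue
        for rel2, y2, z in facts:
            if rel2 == "Prevents" and y2 == y:
                new.add(("Prevents", x, z))
    return new - facts

# Forward-chain to a fixpoint: keep applying the rule until nothing new
# is derived.
while True:
    derived = apply_rule(facts)
    if not derived:
        break
    facts |= derived

# query(X) :- IS-A(X, vegetable) ^ Prevents(X, osteoporosis)
answers = {x for rel, x, y in facts
           if rel == "IS-A" and y == "vegetable"
           and ("Prevents", x, "osteoporosis") in facts}
print(answers)  # {'kale'}
```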
Slide 5: Partial proof tree (DAG) for the query "What vegetables help prevent osteoporosis?"
Slide 6: Incremental Expansion
Exact probabilistic inference is NP-complete. To deal with this, HOLMES:
- Uses approximate methods (loopy belief propagation); focused queries help keep probabilistic inference manageable
- Creates networks incrementally (searches for additional proof trees and updates the network if there is more time)
- Exploits standard Datalog optimizations (e.g. only expands proofs of recently added nodes)
Slide 7: Markov Logic Inference Rules
1. Observed relations are likely to be true:
   R(X,Y) :- ObservedInCorpus(X, R, Y)
2. Synonym substitution preserves meaning:
   RTR(X',Y) :- RTR(X,Y) ^ Synonym(X, X')
   RTR(X,Y') :- RTR(X,Y) ^ Synonym(Y, Y')
3. Generalizations preserve meaning:
   RTR(X',Y) :- RTR(X,Y) ^ IS-A(X, X')
   RTR(X,Y') :- RTR(X,Y) ^ IS-A(Y, Y')
4. Transitivity of part meronyms, where RTR matches '* in' (e.g., 'born in'):
   RTR(X,Y') :- RTR(X,Y) ^ Part-Of(Y, Y')
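Rules 2 and 3 have the same mechanical shape: rewrite one argument of a tuple through a background Synonym or IS-A assertion. A minimal sketch, in which the example facts (a hypothetical AcquiredBy tuple) are invented for illustration and only the rule shapes come from the slide:

```python
# Sketch of rules 2 and 3: substitute synonyms / generalizations into the
# arguments of a relation. The example facts are hypothetical.

facts = {
    ("AcquiredBy", "youtube", "google"),
    ("Synonym", "google", "google inc"),
    ("IS-A", "youtube", "video site"),
}

BACKGROUND = ("Synonym", "IS-A")

def substitute(facts):
    """RTR(X',Y) :- RTR(X,Y) ^ Synonym(X,X'), plus the Y and IS-A analogues."""
    new = set()
    for rel, x, y in facts:
        if rel in BACKGROUND:
            continue
        for rel2, a, b in facts:
            if rel2 in BACKGROUND:
                if a == x:
                    new.add((rel, b, y))   # rewrite the first argument
                if a == y:
                    new.add((rel, x, b))   # rewrite the second argument
    return new - facts

derived = substitute(facts)
print(sorted(derived))
```

In HOLMES each such rewrite is a weighted Markov Logic rule rather than a hard implication, so a substituted tuple is merely made more probable, not asserted outright.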
Slide 8: Scaling Inference to the Web
To scale textual inference to the Web, inference must run in time linear in the size of the corpus.
Assumptions:
- The number of ground assertions |A| grows linearly with the size of the corpus (true for assertions extracted by TextRunner)
- The size of every proof tree is bounded by some constant m (seems to be true in practice; could be enforced by terminating the search for proof trees at a fixed depth)
Given these assumptions, we need to show that constructing proof trees takes O(|A|) time.
Slide 9: Constructing Proof Trees in O(|A|) Time
Using function-free Horn clauses means logical inference can be done in polynomial time, but that is still not good enough to scale to the Web. Two more things must hold:
- The number of different types of proofs doesn't grow too quickly (e.g. a fixed number of rules results in a constant number of first-order search trees)
- The number of tuples participating in each relation doesn't grow too quickly
Slide 10: Approximately Pseudo-Functional Relations
Slide 11: Experimental Results
HOLMES uses two knowledge bases:
- TextRunner (183 million ground assertions from 117 million web pages)
- WordNet (159 thousand manually created IS-A, Part-Of, and Synonym assertions)
Twenty queries in three domains: geography, business, and nutrition.
Slide 12: Geography Queries
"Who was born in one of the following countries?"
Q(X) :- BornIn(X, {country})
Possible countries: France, Germany, China, Thailand, Kenya, Morocco, Peru, Colombia, Guatemala
Example:
- Ground assertion: BornIn(Alberto Fujimori, Lima)
- Background knowledge: LocatedIn(Lima, Peru)
- New conclusion: BornIn(Alberto Fujimori, Peru)
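The derivation on this slide is an instance of the part-meronym transitivity rule for '* in' relations, here read as BornIn(X, Y') :- BornIn(X, Y) ^ LocatedIn(Y, Y'). As a one-rule sketch using only the slide's facts:

```python
# One-rule sketch: BornIn(X, Y') :- BornIn(X, Y) ^ LocatedIn(Y, Y').
born_in = {("Alberto Fujimori", "Lima")}
located_in = {("Lima", "Peru")}

# Join the two relations on the shared place argument.
derived = {(person, region)
           for person, place in born_in
           for contained, region in located_in
           if contained == place}
print(derived)  # {('Alberto Fujimori', 'Peru')}
```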
Slide 13: Business Queries
"Which companies are acquiring software companies?"
Q(X) :- Acquired(X, Y) ^ Develops(Y, 'software')
This query tests HOLMES's ability to scalably join a large number of assertions from multiple pages.
"Which companies are headquartered in the USA?"
Q(X) :- HeadquarteredIn(X, 'USA') ^ IS-A(X, 'company')
This is a join on HeadquarteredIn and IS-A, plus transitive inference:
- Seattle is Part-Of Washington, which is Part-Of the USA
- Microsoft IS-A software company, which IS-A company
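The acquisition query on this slide is essentially a relational join over tuples extracted from different pages. A sketch with hypothetical company names (only the query shape comes from the slide):

```python
# Q(X) :- Acquired(X, Y) ^ Develops(Y, 'software'), as a join over two
# extracted relations. The company names are hypothetical.
acquired = {("BigCo", "TinySoft"), ("BigCo", "SteelWorks")}
develops = {("TinySoft", "software"), ("SteelWorks", "machinery")}

# Keep acquirers whose target is asserted to develop software.
answers = {acquirer for acquirer, target in acquired
           if (target, "software") in develops}
print(answers)  # {'BigCo'}
```

At Web scale the cost of such joins is what the APF property bounds: if each acquirer participates in a limited number of tuples, the join stays roughly linear in |A|.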
Slide 14: Nutrition Queries
"What foods prevent disease?"
Q(X, {disease}) :- Prevents(X, {disease}) ^ IS-A(X, {food})
Possible foods: fruit, vegetable, grain
Possible diseases: anemia, scurvy, osteoporosis
Slide 15: Effect of Inference on Recall
Baseline: the number of query answers derived from information explicitly stated in the knowledge bases (TextRunner and WordNet).
Inference increases the number of query answers by 102% for the geography domain, and considerably more for the other two domains.
Slide 16: (figure)
Slide 17: Prevalence of APF Relations
- Examined 500 binary relations selected randomly from TextRunner's assertions
- The largest two relations had over 1.25 million unique instances
- 52% of the relations had more than 10,000 instances
- For most relations, found the smallest value Kmin such that the relation was APF with degree Kmin
- 80% of relations were APF with degree less than 496
Slide 18: (figure)
Slide 19: Related Work
- Van Durme and Schubert (2008): use highly expressive representations (e.g. negation, temporal information); HOLMES is less expressive but more scalable
- Open-domain question-answering systems: attempt to find individual documents or sentences containing the answer; HOLMES can infer from multiple texts, but is not well suited to answering more abstract or open-ended questions
- Statistical relational learning: techniques for combining logical and probabilistic inference; HOLMES uses more restrictive inference rules, but again is more scalable
Slide 20: Conclusions
1. We introduce and evaluate the HOLMES system, which leverages KBMC methods to scale a class of textual inference (TI) methods to the Web.
2. We define the notion of Approximately Pseudo-Functional (APF) relations and prove that, for APF relations, HOLMES's inference time increases linearly with the size of the input corpus. We show empirically that APF relations appear to be prevalent in our Web corpus and that HOLMES's runtime does scale linearly with the size of its input, taking only a few CPU minutes when run over 183 million distinct ground assertions.
3. We present experiments demonstrating that, for a set of queries in the domains of geography, business, and nutrition, HOLMES substantially improves the quality of answers (measured by AuC) relative to a "no inference" baseline.
Slide 21: Questions?