Structured Probabilistic Inference in an Embodied Construction Grammar

Jerome Feldman
International Computer Science Institute
U. California at Berkeley
Berkeley, CA
[email protected]

Srini Narayanan
International Computer Science Institute

Berkeley, [email protected]

Language, Learning and Neural Modeling

• Scientific Goal: Understand how people learn and use language

• Practical Goal: Build systems that analyze and produce language

• Approach: Embodied linguistic theories with advanced biologically-based computational methods

State of the Art

• Limited Commercial Speech Applications: transcription, simple response systems

• Statistical NLP for Restricted Tasks: tagging, parsing, information retrieval

• Template-based Understanding programs: expensive, brittle, inflexible, unnatural

• Essentially no NLU in QA Systems

Hypothesis

• NLU is essential to large, open-domain QA.

• One-line dismissals (lack of world knowledge) are out of date (by at least 5 yrs.)

• Substantial Progress in Enabling Technologies
– Knowledge Representation/Inference Techniques
  • Active Knowledge
  • Dealing With Uncertainty
  • Simulation Semantics
– Scaling Up
  • CYC, WordNet, Term-bases
  • FrameNet, Semantic Web, MetaNet
– Extraction of Semantic Relations
  • Empirical
  • Linguistic

• The goal of NLU can be realized, perhaps! Anyway, it is time to try again.

Complex questions and semantic information (SH, UTD)

• Complex questions are not characterized only by a question class (e.g. manner questions)
– Example: How can a biological weapons program be detected?
– Associated with the pattern “How can X be detected?”
– And the topic X = “biological weapons program”

• Processing complex questions is also based on access to the semantics of the question topic
– The topic is modeled by a set of discriminating relations, e.g. Develop(program); Produce(biological weapons); Acquire(biological weapons) or stockpile(biological weapons)
– Such relations are extracted from topic-relevant texts

The need for Semantic Inference in QA

• Some questions are complex!

• Example (from UTD CNS QA database):
– What is the evidence that IRAQ has WMD?
– Answer: In recent months, Milton Leitenberg, an expert on biological weapons, has been looking at this murkiest and most dangerous corner of Saddam Hussein's armory. He says a series of reports add up to indications that Iraq may be trying to develop a new viral agent, possibly in underground laboratories at a military complex near Baghdad where Iraqis first chased away inspectors six years ago. A new assessment by the United Nations suggests Iraq still has chemical and biological weapons - as well as the rockets to deliver them to targets in other countries. The UN document says Iraq may have hidden a number of Scud missiles, as well as launchers and stocks of fuel. US intelligence believes Iraq still has stockpiles of chemical and biological weapons and guided missiles, which it hid from the UN inspectors.

ANSWER: Evidence-Combined: Pointer to Text Source:

A1: In recent months, Milton Leitenberg, an expert on biological weapons, has been looking at this murkiest and most dangerous corner of Saddam Hussein's armory.

A2: He says a series of reports add up to indications that Iraq may be trying to develop a new viral agent, possibly in underground laboratories at a military complex near Baghdad where Iraqis first chased away inspectors six years ago.

A3: A new assessment by the United Nations suggests Iraq still has chemical and biological weapons - as well as the rockets to deliver them to targets in other countries.

A4: The UN document says Iraq may have hidden a number of Scud missiles, as well as launchers and stocks of fuel.

A5: US intelligence believes Iraq still has stockpiles of chemical and biological weapons and guided missiles, which it hid from the UN inspectors.

Answer Structure

Content: Biological Weapons Program:

develop(Iraq, Viral_Agent(instance_of:new))
  Justification: POSSESSION Schema
  Previous (Intent and Ability): Prevent(ability, Inspection); Inspection terminated
  Status: Attempt ongoing
  Likelihood: Medium
  Confirmability: difficult, obtuse, hidden

possess(Iraq, Chemical and Biological Weapons)
  Justification: POSSESSION Schema
  Previous (Intent and Ability): Prevent(ability, Inspection)
  Status: Hidden from Inspectors
  Likelihood: Medium

possess(Iraq, delivery systems(type : rockets; target: other countries))
  Justification: POSSESSION Schema
  Previous (Intent and Ability): Hidden from Inspectors
  Status: Ongoing
  Likelihood: Medium

possess(Iraq, fuel stock(purpose: power launchers))
  Justification: POSSESSION Schema
  Previous (Intent and Ability): Hidden from Inspectors
  Status: Ongoing
  Likelihood: Medium

possess(Iraq, delivery systems(type : scud missiles; launchers; target: other countries))
  Justification: POSSESSION Schema
  Previous (Intent and Ability): Hidden from Inspectors
  Status: Ongoing
  Likelihood: Medium

hide(Iraq, Seeker: UN Inspectors; Hidden: CBW stockpiles & guided missiles)
  Justification: DETECTION Schema
  Inspection status: Past
  Likelihood: Medium

Embodiment

Of all of these fields, the learning of languages would be the most impressive, since it is the most human of these activities. This field, however, seems to depend rather too much on the sense organs and locomotion to be feasible.

Alan Turing (Intelligent Machinery, 1948)

NTL Manifesto

• Basic concepts and words derive their meaning from embodied experience.

• Abstract and Theoretical concepts derive their meaning from metaphorical maps to more basic embodied concepts.

• Structured Connectionist Models can capture both of these processes nicely.

General and Domain Knowledge

• Conceptual Knowledge and Inference
– Embodied
– Language and Domain Independent
– Powerful General Inferences
– Ubiquitous in Language

• Domain Specific Frames and Ontologies
– FrameNet

• Metaphor links domain specific to general
– E.g., France slipped into recession.

What are Image schemas?

– Regularities in our perceptual, motor and cognitive systems

– Structure our experiences and interactions with the world.

– May be grounded in a specific cognitive system, but are not situation-specific in their application (can apply to many domains of experience)

Basis of Image schemas

• Perceptual systems
• Motor routines
• Social Cognition
• Image Schema properties depend on
– Neural circuits
– Interactions with the world

Representing image schemas

semantic schema Container
roles:
  interior
  exterior
  portal
  boundary

semantic schema Source-Path-Goal
roles:
  source
  path
  goal
  trajector

[Figure: Container (Interior, Exterior, Boundary, Portal) and Source-Path-Goal (Source, Path, Goal, Trajector)]

These are abstractions over sensorimotor experiences.

Schema Formalism

SCHEMA <name>
SUBCASE OF <schema>
EVOKES <schema> AS <local name>
ROLES
  <self role name>: <role restriction>
  <self role name> <-> <role name>
CONSTRAINTS
  <role name> <- <value>
  <role name> <-> <role name>
  <setting name> :: <role name> <-> <role name>
  <setting name> :: <predicate> | <predicate>

A Simple Example

SCHEMA hypotenuse

SUBCASE OF line-segment

EVOKES right-triangle AS rt

ROLES (inherited from line-segment)

CONSTRAINTS

SELF <-> rt.long-side

Source-Path-Goal

SCHEMA: spg

ROLES:

source: Place

path: Directed Curve

goal: Place

trajector: Entity

Translational Motion

SCHEMA translational motion

SUBCASE OF motion

EVOKES spg AS s

ROLES

mover <-> s.trajector

source <-> s.source

goal <-> s.goal

CONSTRAINTS

before:: mover.location <-> source

after:: mover.location <-> goal
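The schema definitions above can be read as a small data structure. The following is a minimal Python sketch, not the authors' implementation: the class `Schema`, the helper `bound_type`, and the choice to declare `mover` on the parent `motion` schema are all assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Schema:
    """Hypothetical container for one SCHEMA definition."""
    name: str
    roles: dict = field(default_factory=dict)     # role name -> type restriction
    subcase_of: "Schema | None" = None            # SUBCASE OF
    evokes: dict = field(default_factory=dict)    # local alias -> evoked Schema
    bindings: list = field(default_factory=list)  # (role, "alias.role") for <->

    def all_roles(self):
        """Roles declared here plus those inherited through SUBCASE OF."""
        inherited = self.subcase_of.all_roles() if self.subcase_of else {}
        return {**inherited, **self.roles}


# Source-Path-Goal, with the role restrictions given on the slide.
spg = Schema("spg", roles={"source": "Place", "path": "Directed Curve",
                           "goal": "Place", "trajector": "Entity"})

# Assumption: the parent schema "motion" declares the mover role.
motion = Schema("motion", roles={"mover": "Entity"})

translational_motion = Schema(
    "translational motion",
    roles={"source": None, "goal": None},
    subcase_of=motion,
    evokes={"s": spg},
    bindings=[("mover", "s.trajector"),
              ("source", "s.source"),
              ("goal", "s.goal")],
)


def bound_type(schema, role):
    """Follow a <-> binding to the type restriction of the evoked role."""
    for local, target in schema.bindings:
        if local == role:
            alias, evoked_role = target.split(".")
            return schema.evokes[alias].roles[evoked_role]
    return schema.all_roles().get(role)
```

Here `bound_type(translational_motion, "goal")` looks through the evoked `spg` schema, so the unrestricted local role picks up the `Place` restriction from Source-Path-Goal. The `before::` and `after::` settings are left out of the sketch.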

Extending Inferential Capabilities

• Given the formalization of the conceptual schemas
– How to use them for inferencing?

• Earlier pilot systems
– Used metaphor and Bayesian belief networks
– Successfully construed certain inferences
– But don’t scale to large open domains

• New approach
– Probabilistic relational models
– Support an open ontology

Frames

• Frames are conceptual structures that may be culture specific

• Words evoke frames
– The word “talk” evokes the Communication frame.
– The words buy, sell, and pay evoke the Commercial Transaction (CT) frame.
– The words journey, set out, schedule, reach etc. evoke the Journey frame.

• Frames have roles and constraints like schemas.
– CT has roles vendor, goods, money, customer.

• Words bind to frames by specifying binding patterns
– Buyer binds to Customer, Seller binds to Vendor.

The FrameNet Project

C Fillmore, PI (ICSI)

Co-PI’s: S Narayanan (ICSI, SRI); D Jurafsky (U Colorado); J M Gawron (San Diego State U)

Staff: C Baker, Project Manager; B Cronin, Programmer; C Wooters, Database Designer

FrameNet in the Larger Context

• The long-term goal is to reason about the world in a way that humans understand and agree with.

• Such a system requires a knowledge representation that includes the level of frames.

• FrameNet can provide such knowledge for a number of domains.

• FrameNet representations complement ontologies and lexicons.

The core work of FrameNet

1. characterize frames

2. find words that fit the frames

3. develop descriptive terminology

4. extract sample sentences

5. annotate selected examples

6. derive "valence" descriptions

The Core Data

The basic data on which FrameNet descriptions are based take the form of a collection of annotated sentences, each coded for the combinatorial properties of one word in it. The annotation is done manually, but several steps are computer-assisted.

Types of Words / Frames

o events

o artifacts, built objects

o natural kinds, parts and aggregates

o terrain features

o institutions, belief systems, practices

o space, time, location, motion

o etc.

FrameNet Product

• For every target word,

• describe the frames or conceptual structures which underlie it,

• and annotate example sentences that cover the ways in which information from the associated frames is expressed in these sentences.

Applying Frame Structures to QA

• Parsing Questions

• Parsing Answers

• Result: exact answer= “approximately 7 kg of HEU”

Q: What kind of materials were stolen from the Russian navy?

FS(Q): What [GOODS: kind of nuclear materials] were [Target-Predicate:stolen] [VICTIM: from the Russian Navy]?

A(Q): Russia’s Pacific Fleet has also fallen prey to nuclear theft; in 1/96, approximately 7 kg of HEU was reportedly stolen from a naval base in Sovetskaya Gavan.

FS(A(Q)): [VICTIM(P1): Russia’s Pacific Fleet] has also fallen prey to [GOODS(P1): nuclear] [Target-Predicate(P1): theft]; in 1/96, [GOODS(P2): approximately 7 kg of HEU] was reportedly [Target-Predicate(P2): stolen] [VICTIM(P2): from a naval base] [SOURCE(P2): in Sovetskaya Gavan]
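One way to read the example above: the question asks for the GOODS element of a "stolen" predicate, and the exact answer is the GOODS element attached to the matching predicate in the annotated answer. The following Python sketch illustrates that lookup under simplifying assumptions. The bracket syntax is parsed with a hand-written regular expression, the function names are hypothetical, and predicates are matched by exact string equality rather than by any real frame parser.

```python
import re

# Matches annotations such as "[GOODS(P2): approximately 7 kg of HEU]".
FE_PATTERN = re.compile(
    r"\[\s*([A-Za-z-]+)\s*(?:\((P\d+)\))?\s*:\s*([^\]]*?)\s*\]")


def frame_elements(annotated):
    """Return (role, predicate id, text) triples, with roles upper-cased."""
    return [(role.upper(), pid, text)
            for role, pid, text in FE_PATTERN.findall(annotated)]


def exact_answer(question_fs, answer_fs, asked_role="GOODS"):
    """Return the asked role's filler for the predicate matching the question."""
    q_target = next(text for role, _, text in frame_elements(question_fs)
                    if role == "TARGET-PREDICATE")
    elements = frame_elements(answer_fs)
    for role, pid, text in elements:
        if role == "TARGET-PREDICATE" and text == q_target:
            return next((t for r, p, t in elements
                         if r == asked_role and p == pid), None)
    return None


QUESTION_FS = ("What [GOODS: kind of nuclear materials] were "
               "[Target-Predicate:stolen] [VICTIM: from the Russian Navy]?")

ANSWER_FS = ("[VICTIM(P1): Russia’s Pacific Fleet] has also fallen prey to "
             "[GOODS(P1): nuclear] [Target-Predicate(P1): theft]; in 1/96, "
             "[GOODS(P2): approximately 7 kg of HEU] was reportedly "
             "[Target-Predicate(P2): stolen] [VICTIM(P2): from a naval base] "
             "[SOURCE(P2): in Sovetskaya Gavan]")

answer = exact_answer(QUESTION_FS, ANSWER_FS)
```

The predicate P1 ("theft") is skipped because its target does not match "stolen", so the GOODS element of P2 is returned. A fuller system would relate "theft" and "stolen" through their shared frame instead of comparing strings.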

Language understanding via simulation

• Hypothesis: Linguistic input is converted into a mental simulation based on bodily grounded structures
– Semantic schemas, including image schemas (Johnson 1987) and executing schemas (Bailey, Narayanan 1997) are abstractions over neurally grounded perceptual and motor representations
– Linguistic units make reference to these structures in their semantic pole
– Analysis produces a simulation specification linking these structures and providing parameters for a simulation engine

• Model requires:
– schema representations (image, motor, frame, social, etc.)
– lexical and phrasal construction representations that invoke those schemas and other cultural frames (FrameNet+ frames)

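Simulation-based language understanding

[Figure: pipeline with the components Utterance, Analysis Process, Constructions, General Knowledge, Belief State, Simulation Specification, and Simulation. Example utterance: “Harry walked to the cafe.” Example simulation specification: Schema = walk, Trajector = Harry, Goal = cafe.]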

Simulation specification

The analysis process produces a simulation specification that

•includes image-schematic, motor control and conceptual structures

•provides parameters for a mental simulation

Simulation Semantics

• BASIC ASSUMPTION: SAME REPRESENTATION FOR PLANNING AND SIMULATIVE INFERENCE
– Evidence for common mechanisms for recognition and action (mirror neurons) in the F5 area (Rizzolatti et al (1996), Gallese 96, Buccino 2002, Tettamanti 2004) and from motor imagery (Jeannerod 1996)

• IMPLEMENTATION:
– x-schemas affect each other by enabling, disabling or modifying execution trajectories. Whenever the CONTROLLER schema makes a transition it may set, get, or modify state leading to triggering or modification of other x-schemas. State is completely distributed (a graph marking) over the network.

• RESULT: INTERPRETATION IS IMAGINATIVE SIMULATION!











Answer Structure: Temporal Reference/Grounding

[Figure: the answer structure for the Iraq WMD question, annotated with “Present Progressive Perfect” and “Present Progressive Continuing”]

Answer Structure: Uncertainty and Belief

[Figure: the same answer structure, annotated with “Multiple partly reliable sources”]

Answer Structure: Event Structure Metaphor

[Figure: answer A1 and the develop(Iraq, Viral_Agent(instance_of:new)) content entry, annotated with “Event Structure Metaphor”]

Temporal relations in QA

• Results of the workshop are accessible from http://www.cs.brandeis.edu/~jamesp/arda/time/documentation/TimeML-use-in-qa-v1.0.pdf

• A set of questions that require the extraction of temporal relations was created (TimeML question corpus)
– E.g.:
  • “When did the war between Iran and Iraq end?”
  • “Who was Secretary of Defense during the Gulf War?”

• A number of features of these questions were identified and annotated
– E.g.:
  • Number of TEMPEX relations in the question
  • Volatility of the question (how often does the answer change)
  • Reference to repetitive events
  • Number of events mentioned in the question

Event Structure for semantically based QA

• Reasoning about dynamics
– Complex event structure
  • Multiple stages, interruptions, resources, framing
– Evolving events
  • Conditional events, presuppositions
– Nested temporal and aspectual references
  • Past, future event references
– Metaphoric references
  • Use of motion domain to describe complex events

• Reasoning with Uncertainty
– Combining evidence from multiple, unreliable sources
– Non-monotonic inference
  • Retracting previous assertions
  • Conditioning on partial evidence
– Linguistic Ambiguity
– Figurative inference

Relevant Previous Work

Event Structure: Aspect (VDT, TimeML), Situation Calculus (Steedman), Frame Semantics (Fillmore), Cognitive Linguistics (Langacker, Talmy, Lakoff, Sweetser), Metaphor and Aspect (Narayanan)

Reasoning about Uncertainty: Bayes Nets (Pearl), Probabilistic Relational Models (Pfeffer, Koller), Graphical Models (Jordan), First Order Probabilistic Inference (Poole, Braz et al)

Reasoning about Dynamics: Dynamic Bayes Nets (Murphy, Friedman), Distributed Systems (Alur, Meseguer), Control Theory (Ramadge and Wonham), Causality (Pearl)

Active representations

• Many inferences about actions derive from what we know about executing them
• Representation based on stochastic Petri nets captures dynamic, parameterized nature of actions

Walking:
• bound to a specific walker with a direction or goal
• consumes resources (e.g., energy)
• may have termination condition (e.g., walker at goal)
• ongoing, iterative action

[Figure: Petri net for walking, labeled walker=Harry, goal=home, energy, walker at goal]

X-Schema Extensions to Petri Nets

• Parameterization
– x-schemas take parameter values (speed, force)
  • Walk(speed = slow, dest = store1)

• Dynamic Binding
– X-schemas allow run-time binding to different objects/entities
  • Grasp(cup1), push(cart1)

• Hierarchical control and durative transitions
– Walk is composed of steps which are composed of stance and swing phases

• Stochasticity and Inhibition
– Uncertainties in world evolution and in action selection
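To make the Petri net reading of an x-schema concrete, here is a minimal Python sketch of a parameterized walking net with a resource place and an inhibitor arc. It is an illustration under stated assumptions, not the authors' x-schema implementation: the class `PetriNet`, the place names, and the `walk` builder are hypothetical, and transitions fire deterministically in declaration order, so the stochastic and hierarchical extensions listed above are left out.

```python
class PetriNet:
    """Tiny place/transition net with inhibitor arcs."""

    def __init__(self, marking):
        self.marking = dict(marking)  # place -> token count
        self.transitions = {}         # name -> (inputs, outputs, inhibitors)

    def add(self, name, inputs, outputs, inhibitors=()):
        self.transitions[name] = (dict(inputs), dict(outputs), tuple(inhibitors))

    def enabled(self, name):
        inputs, _, inhibitors = self.transitions[name]
        has_tokens = all(self.marking.get(p, 0) >= n for p, n in inputs.items())
        not_inhibited = all(self.marking.get(p, 0) == 0 for p in inhibitors)
        return has_tokens and not_inhibited

    def fire(self, name):
        if not self.enabled(name):
            raise ValueError(f"{name} is not enabled")
        inputs, outputs, _ = self.transitions[name]
        for place, n in inputs.items():
            self.marking[place] -= n
        for place, n in outputs.items():
            self.marking[place] = self.marking.get(place, 0) + n


def walk(walker, goal, energy, distance):
    """Build a walking net bound to a walker and goal (run-time binding)."""
    net = PetriNet({"ready": 1, "ongoing": 0, "at_goal": 0,
                    "energy": energy, "steps_left": distance})
    net.bindings = {"walker": walker, "goal": goal}
    net.add("start", {"ready": 1}, {"ongoing": 1})
    # Each step consumes one unit of energy and one unit of remaining distance.
    net.add("step", {"ongoing": 1, "energy": 1, "steps_left": 1}, {"ongoing": 1})
    # Finishing is inhibited while any distance remains.
    net.add("finish", {"ongoing": 1}, {"at_goal": 1}, inhibitors=["steps_left"])
    return net


def run(net, max_firings=100):
    """Fire the first enabled transition until none is enabled."""
    trace = []
    for _ in range(max_firings):
        ready = [name for name in net.transitions if net.enabled(name)]
        if not ready:
            break
        net.fire(ready[0])
        trace.append(ready[0])
    return trace
```

With `walk("Harry", "home", energy=3, distance=2)` the run reaches `at_goal`. With `energy=1` the net stalls in `ongoing`, which is the kind of inference ("ran out of steam") that resource places make available.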

Event Structure in Language

• Fine-grained

• Rich Notion of Contingency Relationships.
– Phenomena: Aspect, Tense, Force-dynamics, Modals, Counterfactuals

• Event Structure Metaphor:
– Phenomena: Abstract Actions are conceptualized in Motion and Manipulation terms.
– Schematic Inferences are preserved.

Aspect

• Aspect is the name given to the ways languages describe the structure of events using a variety of lexical and grammatical devices.
– Viewpoints
  • is walking, walk
– Phases of events
  • Starting to walk, walking, finish walking
– Inherent Aspect
  • run vs. cough vs. rub
– Composition with
  • Temporal modifiers, tense..
  • Noun Phrases (count vs. mass) etc..

Phases, Viewpoints, and Aspects

• John is walking to the store.
• John is about to walk to the store.
• John walked to the store.
• John started walking to the store.
• John is starting to walk to the store.
• John has walked to the store.
• John has started to walk to the store.
• John is about to start walking to the store.
• John resumed walking to the store.
• John has been walking to the store.
• John has finished walking to the store.
• John almost walked to the store.

Phasal Aspect Maps to the Controller

[Figure: controller x-schema with nodes Ready, Start, Process, Finish, Done, Suspend, Cancel and transitions interrupt, resume, Iterate]

• Inceptive (start, begin)
• Iterative (repeat)
• Completive (finish, end)
• Resumptive (resume)
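The mapping from phasal aspect to controller transitions can be sketched as a small state machine. This Python sketch is illustrative only. The state and event names come from the slide, but the arcs in `TRANSITIONS` are an assumption, since the extracted text does not preserve which arcs connect which nodes. The names `ASPECT_TO_EVENT` and `advance` are hypothetical.

```python
# Assumed arcs: (state, event) -> next state.
TRANSITIONS = {
    ("Ready", "start"): "Process",
    ("Process", "iterate"): "Process",
    ("Process", "interrupt"): "Suspend",
    ("Suspend", "resume"): "Process",
    ("Process", "finish"): "Done",
    ("Ready", "cancel"): "Cancel",
    ("Suspend", "cancel"): "Cancel",
}

# Phasal aspect classes from the slide, mapped to controller events.
ASPECT_TO_EVENT = {
    "inceptive": "start",    # start, begin
    "iterative": "iterate",  # repeat
    "completive": "finish",  # finish, end
    "resumptive": "resume",  # resume
}


def advance(state, aspect):
    """Apply the controller event coded by an aspect marker."""
    event = ASPECT_TO_EVENT[aspect]
    try:
        return TRANSITIONS[(state, event)]
    except KeyError:
        raise ValueError(
            f"{aspect!r} is not applicable in state {state!r}") from None
```

Under these assumed arcs, a resumptive marker applies only from `Suspend`, which reflects why "John resumed walking to the store" presupposes an earlier interruption.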

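Embedding: The end of the beginning

[Figure: controller (Ready, Start, Process, Finish, Done, Suspend; interrupt, resume) with an embedded controller (R, S, P, F, D, SC; i, r), attached to the X-Schema for X with bindings]

Embedding: The beginning of the end

[Figure: controller phases Ongoing, Finish, Done with an embedded controller (R, S, P, F, D, SC; i, r), attached to the X-Schema for X with bindings]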

Inherent Aspect

• Much richer than traditional Linguistic Characterizations (VDT (durative/atomic, telic/atelic))

• Action patterns
– one-shot, repeated, periodic, punctual
– decomposition: concurrent, alternatives, sequential

• Goal based schema enabling/disabling

• Generic control features
– interruption, suspension, resumption

• Resource usage

Other Transitions in the Controller may be coded

• Lexical items may code interrupts
– Stumble is an interrupt to an ongoing walk

• A combination of grammatical devices and aktionsart may code the controller phases
– Ready to walk: Prospective
– Resuming his run: Resumptive
– Has been running: Embedded progressive
– About to finish the painting: Embedded Completive
– Canceling the meeting vs. Aborting the meeting

A Precise Notion of Contingency Relations

Activation: Executing one schema causes the enabling, start or continued execution of another schema. Concurrent and sequential activation.

Inhibition: Inhibitory links prevent execution of the inhibited x-schema by activating an inhibitory arc. The model distinguishes between concurrent and sequential inhibition, mutual inhibition and aperiodicity.

Modification: The modifying x-schema results in a control transition of the modified x-schema. The execution of the modifying x-schema could result in the interruption, termination, or resumption of the modified x-schema.

Summary of Aspect Results

• Controller mediates between linguistic markings and individual event/verbal x-schemas (Cogsci99, Coling2002)

• Captures regular event structure; inspired by biological control theory

• Flexible: specific events may require only a subset of controller; interaction of underlying x-schemas, linguistic markers and hierarchical abstraction/decomposition of controller accounts for wide range of aspectual phenomena.

• Important aspectual distinctions, both traditional and novel, can be precisely specified in terms of the interaction of x-schemas with the controller (CogSci97, CogSci 98, AAAI99, IJCAI99, CogSci04):
  • stative/dynamic, durative/punctual: natural in x-schemas
  • telic processes: depletion of resources
  • continuous processes: consumption of resources
  • temporary/effortful states; habituals
  • dynamic interactions with tense, nominals, temporal modifiers
  • incorporation of world knowledge, pragmatics

We use metaphors everyday

• The council attacked every weak point of his proposal.
• I don't know how to put my thoughts into words.
• My summer plans are still up in the air.
• France slipped into a recession.
• Something smells fishy, but I can't quite put my finger on it.
• The economy hasn’t turned the corner, it has just made a U-turn and reversed course (Kerry, 8/10/04)

What is the basis for metaphors?

• metaphor is understanding one thing in terms of another

• specifically, we reason about abstract concepts through our sensory-motor experience.

• that means we have:
– correlation
– inference

Event Structure Metaphor

Maps motion and manipulation concepts to abstract actions.

• States are Locations

• Changes are Movements

• Causes are Forces
– Force Dynamic patterns of causation

• Actions are Self-propelled Movements
– Speed, step size are parameters that are mapped.

• Means are Paths
– crossroads

• Difficulties are Impediments to Motion

• Long-term, Purposeful Activities are Journeys
– Set out, back on track…

Metaphorical Inference

There is a conceptual metaphor, Understanding Is Grasping, according to which one can grasp ideas.

• One can begin to grasp an idea, but not quite get a hold of it.
• If you fail to grasp an idea, it can go right by you, or over your head!
• If you grasp it, you can turn it over in your mind.
• You can’t hold onto an idea before having grasped it.

In short, reasoning patterns about physical grasping can be mapped by conceptual metaphor onto abstract reasoning patterns. Metaphors project the products of simulations to the target (abstract domain)!

Uncertainty and domain representation

• Factorized Representation of Domain uses Dynamic Belief Nets (DBN’s)
– Probabilistic Semantics
– Structured Representation

What is a Bayes (belief) net?

Compact representation of joint probability distributions via conditional independence

Qualitative part: Directed acyclic graph (DAG)
• Nodes - random vars.
• Edges - direct influence

Quantitative part: Set of conditional probability distributions

[Figure from N. Friedman: network over Earthquake, Burglary, Radio, Alarm, Call, with the conditional probability table P(A | E,B) for the family of Alarm; table entries: 0.9 / 0.1, 0.2 / 0.8, 0.01 / 0.99, 0.9 / 0.1]

Together: Define a unique distribution in a factored form

P(B,E,A,C,R) = P(B) P(E) P(A | B,E) P(R | E) P(C | A)

What is a Bayes net?

[Figure: network over Earthquake, Radio, Burglary, Alarm, Call]

A node is conditionally independent of its ancestors given its parents, e.g.

C ⊥ R,B,E | A

Hence, from 2^5 – 1 = 31 parameters to 1+1+2+4+2 = 10
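The parameter count above can be checked mechanically. The short Python sketch below assumes all five variables are binary, as in the slide's arithmetic. The dictionary `ALARM_NET` encodes the parent sets implied by the factorization P(B) P(E) P(A | B,E) P(R | E) P(C | A), and the function names are hypothetical.

```python
def full_joint_params(n_vars):
    """Free parameters of an unrestricted joint over n binary variables."""
    return 2 ** n_vars - 1


def bayes_net_params(parents):
    """One free parameter per parent configuration of each binary node."""
    return sum(2 ** len(ps) for ps in parents.values())


# Parent sets of the alarm network.
ALARM_NET = {
    "Burglary": [],
    "Earthquake": [],
    "Alarm": ["Burglary", "Earthquake"],
    "Radio": ["Earthquake"],
    "Call": ["Alarm"],
}

full = full_joint_params(len(ALARM_NET))
factored = bayes_net_params(ALARM_NET)
```

The gap widens quickly with more variables: the full joint grows as 2^n, while the factored form grows with the number of nodes times 2^(largest parent set).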

Why are Bayes nets useful?

- Graph structure supports
  - Modular representation of knowledge
  - Local, distributed algorithms for inference and learning
  - Intuitive (possibly causal) interpretation

- Factored representation may have exponentially fewer parameters than full joint P(X1,…,Xn) =>
  - lower sample complexity (less data for learning)
  - lower time complexity (less time for inference)

What can Bayes nets be used for?

• Posterior probabilities
– Probability of any event given any evidence

• Most likely explanation
– Scenario that explains evidence

• Rational decision making
– Maximize expected utility
– Value of Information

• Effect of intervention
– Causal analysis

[Figure from N. Friedman: network over Earthquake, Radio, Burglary, Alarm, Call, with Radio and Call marked; caption: Explaining away effect]
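The explaining away effect can be reproduced with inference by enumeration on the Burglary, Earthquake, Alarm part of the network. In the Python sketch below every number is an assumption chosen for illustration. The priors are invented, and the assignment of the values 0.9, 0.2, and 0.01 to parent configurations is a guess, since the table's row labels are not recoverable from the slide text.

```python
from itertools import product

P_B = 0.01  # assumed prior probability of a burglary
P_E = 0.02  # assumed prior probability of an earthquake

# Assumed P(Alarm = 1 | Burglary, Earthquake), keyed by (b, e).
P_A = {(1, 1): 0.9, (1, 0): 0.9, (0, 1): 0.2, (0, 0): 0.01}


def joint(b, e, a):
    """P(B=b, E=e, A=a) from the factored form P(B) P(E) P(A | B,E)."""
    pb = P_B if b else 1 - P_B
    pe = P_E if e else 1 - P_E
    pa = P_A[(b, e)] if a else 1 - P_A[(b, e)]
    return pb * pe * pa


def posterior_burglary(**evidence):
    """P(B=1 | evidence), summing the joint over all consistent worlds."""
    numerator = denominator = 0.0
    for b, e, a in product((0, 1), repeat=3):
        world = {"b": b, "e": e, "a": a}
        if any(world[name] != value for name, value in evidence.items()):
            continue
        p = joint(b, e, a)
        denominator += p
        if b:
            numerator += p
    return numerator / denominator


alarm_only = posterior_burglary(a=1)
alarm_and_quake = posterior_burglary(a=1, e=1)
```

Under these assumed numbers, hearing the alarm raises the probability of a burglary well above its prior, and additionally learning of an earthquake lowers it again, because the earthquake already accounts for the alarm.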

Dynamic Bayes Nets

• Dynamic Bayesian Networks (D(T)BNs) are an extension of Bayesian networks for modeling dynamic systems.
– The state at time t is represented by a set of random variables.
– The state at time t is dependent on the states at previous time steps.
– The process is assumed to be first-order Markovian, and thus we need to represent the transition distribution P(Z_t+1 | Z_t).

• This can be done using a two-time-slice Bayesian network fragment (2-TBN) B_t+1, which contains
– variables from Z_t+1 whose parents are variables from Z_t and/or Z_t+1, and variables from Z_t without any parents.
– Typically, we also assume that the process is stationary.

A Simple DBN for the Economics Domain

[Figure: two-time-slice network (T0, T1) over the variables below]

Economic State: [recession, nogrowth, lowgrowth, higrowth]
Goal: [Free Trade, Protection]
Policy: [Liberalization, Protectionism]
Outcome: [Success, failure]
Difficulty: [present, absent]

Probabilistic inference

– Filtering
  • P(X_t | o_1…t, X_1…t)
  • Update the state based on the observation sequence and state set

– MAP Estimation
  • argmax over h_1…h_n of P(X_t | o_1…t, X_1…t)
  • Return the best assignment of values to the hypothesis variables given the observation and states

– Smoothing
  • P(X_t-k | o_1…t, X_1…t)
  • Modify assumptions about previous states, given observation sequence and state set

– Projection/Prediction/Reachability
  • P(X_t+k | o_1..t, X_1..t)
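Filtering and prediction can be sketched on a single state variable. The Python example below is a simplified illustration and not the KARMA model. It tracks only the Economic State variable, and both the transition model and the observation likelihoods are invented numbers. It uses the standard predict-then-weight recursion for a first-order Markov process. Smoothing and MAP estimation are not shown.

```python
STATES = ["recession", "nogrowth", "lowgrowth", "higrowth"]

# Assumed transition model P(X_t+1 | X_t): stay put with 0.7, else 0.1 each.
TRANSITION = {s1: {s2: 0.7 if s1 == s2 else 0.1 for s2 in STATES}
              for s1 in STATES}


def normalize(dist):
    total = sum(dist.values())
    return {state: p / total for state, p in dist.items()}


def predict(belief, steps=1):
    """Projection: push the belief forward without new observations."""
    for _ in range(steps):
        belief = {s2: sum(belief[s1] * TRANSITION[s1][s2] for s1 in STATES)
                  for s2 in STATES}
    return belief


def filter_step(belief, likelihood):
    """Filtering: predict one step, then weight by the observation likelihood."""
    predicted = predict(belief)
    return normalize({s: predicted[s] * likelihood[s] for s in STATES})


# Assumed likelihood of reading "fell into recession" in each state.
FELL_INTO_RECESSION = {"recession": 0.6, "nogrowth": 0.3,
                       "lowgrowth": 0.08, "higrowth": 0.02}

prior = {state: 0.25 for state in STATES}
posterior = filter_step(prior, FELL_INTO_RECESSION)
forecast = predict(posterior, steps=3)
```

`posterior` is the filtered belief after one observation, and `forecast` is the projection three steps ahead. With no further evidence the forecast drifts back toward the uniform distribution under this symmetric transition model.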

Answer Type to Inference Method

ANSWER TYPE               INFERENCE             DESCRIPTION
Justify (Proposition)     MAP                   Proposition is part of the MAP
Ability (Agent, Act)      Filtering; Smoothing  Past/Current Action enabled given current state
Prediction (State)        P; R’ MAP             Propagate current information and estimate best new state
Hypothetical (Condition)  S, R_I                Smooth, intervene and compute state

Logical Action Theories

• Connection to ARD (or other Action Languages):
– The representation can be used to encode a causal model for a domain description D (in the syntax of ARD) in that it satisfies all the causal laws in D. Furthermore, a value proposition of the form C after A is entailed by D iff all the terms in C are in Si, the state that results after running the projection algorithm on the action set A. (IJCAI 99)

• Executing representation
– frame axioms are encoded in the topology of the network and transition firing rules respect them.

• Planning as backward reachability or computing downward closure (IJCAI 99, WWW2002)

• Links to linear logic. Perhaps a model of stochastic linear logic? (joint work with Jose Meseguer).

Task: Interpret simple discourse fragments/ blurbs

France fell into recession. Pulled out by Germany

US Economy on the verge of falling back into recession after moving forward on an anemic recovery.

Indian Government stumbling in implementing Liberalization plan.

Moving forward on all fronts, we are going to be ongoing and relentless as we tighten the net of justice.

The Government is taking bold new steps. We are loosening the stranglehold on business, slashing tariffs and removing obstacles to international trade.

Basic Result

• An implemented computational model that is
– Neurally plausible (Feldman and Narayanan 2004)
– Establishes that motion, manipulation and spatial concepts are used to convey important and subtle information about abstract domains such as International Economics.

• In 1991, India set out on a path of Liberalization. After making rapid strides in the first few years, the Government policy hit the first of a series of roadblocks in 1995. By 1998, the new BJP Government had reoriented the Government’s policy ..

I/O as Feature Structures

• Indian Government stumbling in implementing liberalization plan

Basic Primitives

• A fine-grained executing model of action and events (X-schemas).

• A factorized representation of probabilistic domain knowledge (DBN’s)

• A controller X-schema that models inter-event relations and forms the basis of inference through simulation.

• A model of metaphor maps that project bindings from embodied to application domains.

KARMA DEMO

• SOURCE DOMAINS: MOTION, HEALTH

• TARGET DOMAINS: INTERNATIONAL ECONOMICS

• METAPHOR MAPS: EVENT STRUCTURE METAPHOR

Results

• Model was implemented and tested on discourse fragments from a database of 50 newspaper stories in international economics from standard sources such as WSJ, NYT, and the Economist.

• Results show that motion terms are often the most effective method to provide the following types of information about abstract plans and actions.
– Information about uncertain events and dynamic changes in goals and resources (sluggish, fall, off-track, no steam).
– Information about evaluations of policies and economic actors and communicative intent (strangle-hold, bleed).
– Communicating complex, context-sensitive and dynamic economic scenarios (stumble, slide, slippery slope).
– Communicating complex event structure and aspectual information (on the verge of, sidestep, giant leap, small steps, ready, set out, back on track).

• ALL THESE BINDINGS RESULT FROM REFLEX, AUTOMATIC INFERENCES PROVIDED BY X-SCHEMA BASED INFERENCES.

Scaling Up

– Scaling Up Language
  • Embodied Construction Grammar
  • FrameNet for Scenarios

– Scaling Up Inference
  • Coordinated Probabilistic Relational Model (CPRM)
  • An Application to QA inference

– Scaling Up Ontological Knowledge
  • Semantic Web
    – OWL and OWL-S
    – An Event Ontology in OWL

Embodied Construction Grammar

• Embodied representations
– active perceptual and motor schemas (image schemas, x-schemas, frames, etc.)
– situational and discourse context

• Construction Grammar
– Linguistic units relate form and meaning/function.
– Both constituency and (lexical) dependencies allowed.

• Constraint-based
– based on feature unification (as in LFG, HPSG)
– Diverse factors can flexibly interact.

Embodied Construction Grammar provides formal tools for linguistic description and analysis motivated largely by cognitive/functional concerns.

• A shared theory and formalism for different cognitive mechanisms
– Constructions, metaphor, mental spaces, etc.

• Precise specifications of structures/processes involved in language understanding

• Bridge to detailed simulative inference using embodied representations

“Harry walked into the cafe.”

[Figure: the UTTERANCE shown against the levels Phonetics, Phonology, Morphology, Syntax, Semantics, Pragmatics]

ECG Structures

• Schemas
– image schemas, force-dynamic schemas, executing schemas, frames…

• Constructions
– lexical, grammatical, morphological, gestural…

• Maps
– metaphor, metonymy, mental space maps…

• Spaces
– discourse, hypothetical, counterfactual…

Embodied schemas

schema Container
  roles
    interior
    exterior
    portal
    boundary

schema Source-Path-Goal
  roles
    source
    path
    goal
    trajector

[Figure: a Container diagram labeled Interior, Exterior, Boundary and Portal, and a Source-Path-Goal diagram labeled Source, Path, Goal and Trajector. Callouts mark the schema name and a role name.]

These are abstractions over sensorimotor experiences.

Embodied constructions

construction HARRY
  form : /hEriy/
  meaning : Harry

construction CAFE
  form : /khaefej/
  meaning : Cafe

[Figure: ECG notation pairs a Form pole with a Meaning pole, shown for the HARRY and CAFE constructions.]

Constructions have form and meaning poles that are subject to type constraints.

The meaning pole may evoke schemas (e.g., image schemas) with a local alias. The meaning pole may include constraints on the schemas (e.g., identification constraints).

Representing constructions: TO

construction TO
  form
    self_f.phon /thuw/
  meaning
    evokes
      Trajector-Landmark as tl           (local alias)
      Source-Path-Goal as spg
    constraints:
      tl.trajector <-> spg.trajector     (identification constraint)
      tl.landmark <-> spg.goal

The INTO construction

TO vs. INTO: INTO adds a Container schema and appropriate bindings.

construction INTO
  form
    self_f.phon /Inthuw/
  meaning
    evokes
      Trajector-Landmark as tl
      Source-Path-Goal as spg
      Container as cont
    constraints:
      tl.trajector <-> spg.trajector
      tl.landmark <-> cont
      cont.interior <-> spg.goal
      cont.exterior <-> spg.source
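One way to read the INTO constraints is as pairs of identified roles (tl.trajector with spg.trajector, tl.landmark with cont, cont.interior with spg.goal, cont.exterior with spg.source) that end up sharing a binding. The following is a minimal sketch in Python, assuming a union-find representation; the `Bindings` class and its methods are hypothetical illustrations, not part of any ECG implementation.

```python
class Bindings:
    """Union-find over role names; identified roles share one representative."""

    def __init__(self):
        self.parent = {}

    def find(self, role):
        self.parent.setdefault(role, role)
        while self.parent[role] != role:
            # Path halving keeps later lookups short.
            self.parent[role] = self.parent[self.parent[role]]
            role = self.parent[role]
        return role

    def identify(self, a, b):
        """Record an identification constraint between roles a and b."""
        self.parent[self.find(a)] = self.find(b)

    def same(self, a, b):
        return self.find(a) == self.find(b)


# Identification constraints of the INTO construction (role names as on the slide).
INTO_CONSTRAINTS = [
    ("tl.trajector", "spg.trajector"),
    ("tl.landmark", "cont"),
    ("cont.interior", "spg.goal"),
    ("cont.exterior", "spg.source"),
]

into = Bindings()
for left, right in INTO_CONSTRAINTS:
    into.identify(left, right)

print(into.same("spg.goal", "cont.interior"))    # True
print(into.same("spg.source", "cont.interior"))  # False
```

Under this reading, the goal of the path is the interior of the container and the source is its exterior, which is the sense in which INTO extends TO.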

Constructions with constituents: The SPATIAL-PHRASE construction

Constructions may also specify constructional constituents and impose form and meaning constraints on them:
– order constraints
– identification constraints

construction SPATIAL-PHRASE
  constructional
    constituents
      sp : Trajector-Landmark        (local alias)
      lm : Thing
  form
    sp_f before lm_f                 (order constraint)
  meaning
    sp_m.landmark <-> lm_m           (identification constraint)

An argument structure construction

construction DIRECTED-MOTION
  subcase of Pred-Expr
  constructional
    constituents
      a : Ref-Exp
      m : Pred-Exp
      p : Spatial-Phrase
  form
    a_f before m_f
    m_f before p_f
  meaning
    evokes Directed-Motion as dm
    self_m.scene <-> dm
    dm.agent <-> a_m
    dm.motion <-> m_m
    dm.path <-> p_m

schema Directed-Motion
  roles
    agent : Entity
    motion : Motion
    path : SPG

Simulation-based language understanding

[Figure: components of simulation-based understanding: Utterance (“Harry walked into the cafe.”), Constructions, General Knowledge, Belief State, Analysis Process, Semantic Specification, Simulation.]

construction WALKED
  form
    self_f.phon [wakt]
  meaning : Walk-Action
  constraints
    self_m.time before Context.speech-time
    self_m.aspect encapsulated

The ECG Analyzer

• Goes bottom up, generating a semantic interpretation as well as syntactic constituency

• Uses a chart to store the constituent matches

• Instead of using FSMs, each construction is turned into a construction recognizer

Using a Chart

• A chart is a data structure that tracks the constituents found during an analysis.
– Note that the chart stores semantic information as well as grammatical information.

• After analysis, the chart is filled with all the semantic chunks (constituents) that were found.

Construction Recognizer

• Each construction is turned into a parameterized procedure.

• The procedure checks form and semantic constraints simultaneously.

• Constructions = active knowledge
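The idea of a construction as a parameterized procedure over a chart can be sketched as follows. This is an illustrative Python sketch only: the `Edge` record, the `recognize_spatial_phrase` function, and the use of strict adjacency for the "before" constraint are all assumptions made for the example, not the analyzer's actual design.

```python
from dataclasses import dataclass, field


@dataclass
class Edge:
    """A chart entry: a constituent spanning words [start, end) plus its meaning."""
    start: int
    end: int
    cxn_type: str
    meaning: dict = field(default_factory=dict)


def recognize_spatial_phrase(chart):
    """Hypothetical recognizer for a SPATIAL-PHRASE-like construction."""
    found = []
    for sp in chart:
        if sp.cxn_type != "Trajector-Landmark":  # type constraint on sp
            continue
        for lm in chart:
            # Form constraint: sp before lm (simplified here to adjacency).
            # Type constraint: lm must be a Thing.
            if lm.start == sp.end and lm.cxn_type == "Thing":
                # Meaning constraint: bind the landmark role to lm's meaning.
                meaning = dict(sp.meaning, landmark=lm.meaning["referent"])
                found.append(Edge(sp.start, lm.end, "Spatial-Phrase", meaning))
    return found


# Chart after lexical analysis of "into the cafe" (word positions 2 to 5).
chart = [
    Edge(2, 3, "Trajector-Landmark", {"schema": "INTO"}),
    Edge(3, 5, "Thing", {"referent": "Cafe"}),
]
chart.extend(recognize_spatial_phrase(chart))
print(chart[-1])
```

The form check and the meaning binding happen in the same pass, and the new edge is stored in the chart together with its semantics.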

ECG - Clausal Example

Construction Active-Caused-Motion
  constituents
    subj : RefExp[Agent]
    verb : Verb[Force-Application]
    DO : RefExp
    path : PP[SPG]
  form
    subj before verb
    verb before DO
    verb before path
  meaning : caused-motion-scene
    self_m.agent <-> subj_m
    self_m.action <-> verb_m
    self_m.patient <-> DO_m
    self_m.path <-> path_m

Chunking

Pos:   0    1      2   3    4    5     6        7     8     9
Word:  the  woman  in  the  lab  coat  thought  you   were  sleeping
L0:    D    N      P   D    N    N     V-tns    Pron  Aux   V-ing
L1:    [NP      ]  P   [NP          ]  VP       NP    [VP          ]
L2:    [NP      ]  [PP              ]  VP       NP    [VP          ]
L3:    [S                                   ]   [S                 ]

After Abney, 1996.

Construction Recognizers

You want to put a cloth on your hand ?

NP NP NP NP NP

Form        Meaning
“you”  <->  [Addressee]
D,N    <->  [Cloth num:sg]
PP$,N  <->  [Hand num:sg poss:addr]

Like Abney:
– One recognizer per rule
– Bottom up and level-based

Unlike Abney:
– Check form and semantics
– More powerful/slower than FSMs

Chunk Chart

• Interface between chunking and structure merging

• Each edge is linked to its corresponding semantics.

You want to put a cloth on your hand ?

Combining Partial Parses

• Prefer an analysis that spans the input utterance with the minimum number of chunks.

• When no spanning analysis exists, however, we still have a chart full of semantic chunks.

• The system tries to build a coherent analysis out of these semantics chunks.

• This is where structure merging comes in.

Structure Merging

• Closely related to abductive inferential mechanisms like Hobbs’
• Unify compatible structures (find fillers for frame roles)
• Intuition: Unify structures that would have been co-indexed had the missing construction been defined.
• There are many possible ways to merge structures.
• In fact, there are an exponential number of ways to merge structures (NP-hard), but using heuristics cuts down the search space.
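A small sketch of what "find fillers for frame roles" could look like, written in Python. The greedy first-fit strategy, the `SUPERTYPE` hierarchy and the `merge` function are assumptions chosen for illustration; the slides do not specify which heuristics the system actually uses.

```python
# Hypothetical type hierarchy: child type -> parent type.
SUPERTYPE = {"Addressee": "Animate", "Animate": "Entity", "Entity": None}


def is_a(t, ancestor):
    """True if type t equals ancestor or inherits from it."""
    while t is not None:
        if t == ancestor:
            return True
        t = SUPERTYPE.get(t)
    return False


def merge(frame, chunks):
    """Greedily fill each unfilled role with the first unused compatible chunk.

    frame:  dict mapping role -> (required_type, filler or None)
    chunks: list of (name, type) semantic chunks taken from the chart
    """
    merged, used = {}, set()
    for role, (required, filler) in frame.items():
        if filler is None:
            for name, chunk_type in chunks:
                if name not in used and is_a(chunk_type, required):
                    filler = name
                    used.add(name)
                    break
        merged[role] = (required, filler)
    return merged


frame = {"agent": ("Animate", None), "patient": ("Entity", None)}
chunks = [("it", "Entity"), ("you", "Addressee")]
print(merge(frame, chunks))
# {'agent': ('Animate', 'you'), 'patient': ('Entity', 'it')}
```

A greedy pass like this explores a single merge rather than the exponential space of alternatives, which is the kind of trade-off a heuristic makes.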

Caused-Motion-Actionagent:

patient: (1)

path:

Structure Merging Example

Utterance:It is too heavy for you to pour.

[Addressee < Anim]

It < Entitynum:sg…

Caused-Motion-Actionagent: [Animate]patient: (1) [Entity]

path:

Before Merging: After Merging:

It num:sg…

TrajectorLandmark trajector : (1) landmark : [Entity]

TrajectorLandmark trajector : (1) landmark :

[Addressee ]

Semantic Density

• Semantic density is a simple heuristic to choose between competing analyses.

• Density of an analysis = (filled roles) / (total roles)

• The system prefers higher density analyses because a higher density suggests that more frame roles are filled in than in competing analyses.

• Extremely simple / useful? but it certainly can be improved upon.

Measuring Semantic Density

Commercial Transaction

buyer : John

seller :

goods : Lexus

price : $1000

Commercial Transaction

buyer : John

seller : Bill

goods : Lexus

price : $1000

The frame on the left has a density of .75 while the frame on the right has a density of 1.
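The density figures for the two Commercial Transaction frames can be checked with a few lines of Python. This is a minimal sketch that assumes a frame is a flat dictionary and that `None` marks an unfilled role; the function name `semantic_density` is illustrative.

```python
def semantic_density(frame):
    """Fraction of roles that have a filler (None marks an unfilled role)."""
    filled = sum(1 for filler in frame.values() if filler is not None)
    return filled / len(frame)


left = {"buyer": "John", "seller": None, "goods": "Lexus", "price": "$1000"}
right = {"buyer": "John", "seller": "Bill", "goods": "Lexus", "price": "$1000"}

print(semantic_density(left))   # 0.75
print(semantic_density(right))  # 1.0
```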

Semantic Density At Work

[Addressee]

Caused-Motion-Actionagent : [Animate]patient : [Entity]path:[TrajLM]

TrajectorLandmark trajector : landmark : here

Motion-Action mover : path:[TrajLM]

You

move

over here

or

Assume that “You move over here” isn’t recognized and that “move” has two senses.

Semantic Density At Work 2

Caused-Motion-Actionagent: (1) [Addressee]patient: /**unfilled**/

path:TrajectorLandmark trajector : (1) landmark : here

Motion-Actionagent: (1) [Addressee]

path: TrajectorLandmark trajector : (1) landmark : here

There are 2 resulting merged analyses. The Motion-Action frame has a density of 3/3 = 1 and the Caused-Motion-Action frame has a density of 3/4 = .75 because of the missing patient.
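The comparison between the two merged analyses can be sketched in Python as follows. The sketch assumes that density is counted over leaf roles, so that the roles of the nested TrajectorLandmark frame count individually; this assumption reproduces the 3/3 and 3/4 figures, but the slides do not state the counting rule explicitly.

```python
def count_roles(frame):
    """Return (filled, total) over leaf roles; nested dicts are sub-frames."""
    filled = total = 0
    for value in frame.values():
        if isinstance(value, dict):
            sub_filled, sub_total = count_roles(value)
            filled += sub_filled
            total += sub_total
        else:
            total += 1
            filled += 1 if value is not None else 0
    return filled, total


def density(frame):
    filled, total = count_roles(frame)
    return filled / total


analyses = {
    "Caused-Motion-Action": {
        "agent": "Addressee",
        "patient": None,  # unfilled
        "path": {"trajector": "Addressee", "landmark": "here"},
    },
    "Motion-Action": {
        "agent": "Addressee",
        "path": {"trajector": "Addressee", "landmark": "here"},
    },
}

for name, frame in analyses.items():
    print(name, density(frame))

best = max(analyses, key=lambda name: density(analyses[name]))
print(best)  # Motion-Action
```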

Language Understanding Process

Simulation specification

A simulation specification consists of:
- schemas evoked by constructions
- bindings between schemas

Summary: ECG

• Linguistic constructions are tied to a model of simulated action and perception

• Embedded in a theory of language processing
– Constrains theory to be usable
– Frees structures to be just structures, used in processing

• Precise, computationally usable formalism
– Practical computational applications, like MT and NLU
– Testing of functionality, e.g. language learning

• A shared theory and formalism for different cognitive mechanisms
– Constructions, metaphor, mental spaces, etc.

ECG applications

• Grammar
– Spatial relations/events (Bergen & Chang 1999; Bretones et al. In press)
– Verbal morphology (Gurevich 2003, Bergen ms.)
– Reference: measure phrases (Dodge and Wright 2002), construal resolution (Porzel & Bryant 2003), reflexive pronouns (Sanders 2003)

• Semantic representations / inference
– Aspectual inference (Narayanan 1997; Chang, Gildea & Narayanan 1998)
– Perspective / frames (Chang, Narayanan & Petruck 2002)
– Metaphorical inference (Narayanan 1997, 1999)
– Simulation semantics (Narayanan 1997, 1999)

• Language acquisition
– Lexical acquisition (Regier 1996, Bailey 1997)
– Multi-word constructions (Chang 2004; Chang & Maia 2001)

FrameNet for AQUAINT Scenarios

• FrameNet is annotating the AQUAINT CNS data with frame information
– Design and build FN database of Frames and Schemas relevant to the AQUAINT data
– Select a set of representative documents and annotate them with FrameNet frames.
• Gold Standard annotations for the program.

• All the FrameNet data (DB and annotations) will be released in RDF/OWL in addition to the XML format.

Example Document: Country Profile- Libya

Frame Type     Description                              Number
Action         Verbs of action, often hostile action    50
Resources      Resources, assets, capabilities          5
Attitude       Different attitude frames                15
Weaponry       Different types and descriptions         75
Treaty         Different treaties named                 12
Organization   Organizations named                      19

FrameNet annotation of CNS data


Scaling Up





Structured Probabilistic Inference

Probabilistic inference

– Filtering
• P(X_t | o_1…t, X_1…t)
• Update the state based on the observation sequence and state set

– MAP Estimation
• argmax over h_1…h_n of P(X_t | o_1…t, X_1…t)
• Return the best assignment of values to the hypothesis variables given the observation and states

– Smoothing
• P(X_t-k | o_1…t, X_1…t)
• Modify assumptions about previous states, given observation sequence and state set

– Projection/Prediction/Reachability
• P(X_t+k | o_1…t, X_1…t)
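Filtering and projection can be illustrated on a small hidden Markov model. The sketch below is in Python and uses the standard forward recursion, which conditions only on the observations o_1…t; the two-state model and all of its numbers are assumptions made up for the example. Smoothing and MAP estimation are not shown.

```python
def normalize(dist):
    total = sum(dist)
    return [p / total for p in dist]


def predict(belief, transition, steps=1):
    """Projection: push the belief forward `steps` slices with no new evidence."""
    n = len(belief)
    for _ in range(steps):
        belief = [sum(belief[i] * transition[i][j] for i in range(n)) for j in range(n)]
    return belief


def filter_step(belief, transition, emission, obs):
    """Filtering: one prediction step, then reweight by the observation likelihood."""
    predicted = predict(belief, transition)
    weighted = [predicted[j] * emission[j][obs] for j in range(len(predicted))]
    return normalize(weighted)


# Illustrative two-state model (all numbers are assumptions).
transition = [[0.7, 0.3], [0.3, 0.7]]  # transition[i][j] = P(X_t+1 = j | X_t = i)
emission = [[0.9, 0.1], [0.2, 0.8]]    # emission[state][observation]

belief = [0.5, 0.5]
for obs in [0, 0, 1]:
    belief = filter_step(belief, transition, emission, obs)

print(belief)                                # filtered estimate of X_t given o_1..t
print(predict(belief, transition, steps=2))  # projected estimate of X_t+2 given o_1..t
```

Each filtering step touches only the previous belief, which is what makes on-line updating possible as observations arrive.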

Relational Models

• Relational models make some ontological commitments: the world consists of objects, and relations over them
• PRMs are based on a particular “relational logic” borrowed from databases:

Databases            Relational Logic
Table                Class
Tuple                Object
Standard Field       Descriptive Attribute
Foreign Key Field    Reference Slot

PRM Introduction

• PRMs allow objects to be augmented with a description of relations between instances.

• The object relational structure can be a relational database or logic program.

• The PRM model augments the database with probabilities.

• The model is useful in cases where the configuration of a system (instances) changes while the relational schema remains constant.

Probabilistic Relational Models

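A PRM for a relational schema S is defined as follows. For each class C and propositional attribute A ∈ A(C) we have:
– A set of parents Pa(C.A) of A (simple, complex or aggregate)
– A CPD P(C.A | Pa(C.A))

The distribution over the attributes of a set of objects O factors as

$$P(O) = \prod_{obj \in O} \; \prod_{A \in \mathcal{A}(obj)} P(obj.A \mid Pa(obj.A))$$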
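The factorization over objects and attributes can be made concrete with a toy example. The following Python sketch is hypothetical throughout: the Professor and Course classes, the `instructor` reference slot and every probability are invented for illustration, and the CPDs are attached to classes so that all objects of a class share them.

```python
# Class-level CPDs: (class, attribute) -> {parent values -> distribution}.
CPDS = {
    ("Professor", "popular"): {(): {True: 0.4, False: 0.6}},
    ("Course", "rating"): {
        (True,): {"high": 0.8, "low": 0.2},
        (False,): {"high": 0.3, "low": 0.7},
    },
}


def parent_values(obj, attr, objects):
    """Look up parent values; Course.rating depends on instructor.popular."""
    if (obj["class"], attr) == ("Course", "rating"):
        # Follow the reference slot `instructor` to the related object.
        return (objects[obj["instructor"]]["popular"],)
    return ()


def joint_probability(objects):
    """Product over objects and attributes of P(obj.A | Pa(obj.A))."""
    prob = 1.0
    for obj in objects.values():
        for (cls, attr), cpd in CPDS.items():
            if cls == obj["class"]:
                prob *= cpd[parent_values(obj, attr, objects)][obj[attr]]
    return prob


objects = {
    "smith": {"class": "Professor", "popular": True},
    "cs101": {"class": "Course", "instructor": "smith", "rating": "high"},
    "cs102": {"class": "Course", "instructor": "smith", "rating": "low"},
}
print(joint_probability(objects))  # the product 0.4 * 0.8 * 0.2
```

Adding or removing Course objects changes the instance configuration but not the CPDs, which belong to the schema.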

Inference With PRMs

SVE inference for a PRM P with q query variables and N attributes is O(Nkbk(m+2)bq) (Pfeffer 2000), where
– k is the maximum number of interface variables
– q is the number of query variables
– m is the maximum tree width for any object in P (related to the Markov blanket).

Controlling PRM inference

• The number of interface variables, k, is related to the number of relations that a variable participates in as well as the number of slot chains that the variable participates in.
– Careful selection of relations (only part-of) can make inference tractable.

• The tree width m depends on the Markov blanket of an attribute.
– Control of network topology can reduce this.

Adding time to PRM: D(T)PRM

A two-time-slice PRM (2TPRM) for a relational schema S is defined as follows. For each class C and each propositional attribute A ∈ A(C), we have:
– A set of parents Pa(C.A) = {Pa_1, Pa_2, …, Pa_l}, where each Pa_i has the form C.B or f(C.rho.B), where rho is a slot chain containing the attribute previous at most once, and f() is an aggregation function.
– A conditional probability model for P(C.A | Pa(C.A))

Adding Time to PRMs

• Since time is another relation, adding it doesn’t increase expressive power.
– Significant impact on inference tractability, since both k and m may become quite large.
• New Algorithm: Exploit the structure of time using the interface and frontier algorithm (Murphy 2002).
– Variables at slice t with links to variables at t+1 form the interface
– Interface variables d-separate the past (< t) from the future slices (> t).
– Allows for on-line inference algorithms similar to the inside-outside algorithm for SCFGs.
– Use SVE to roll over slices.
• Approximation using Rao-Blackwellized particle filtering (Sanghai, Domingos, and Weld 2003)

Structured Probabilistic Inference

CPRM inference

• Combines insights from
– the SVE algorithm for PRMs (Pfeffer 2000)
– the frontier algorithms for temporal models (Murphy 2002) and the BN SCFG algorithm (Narayanan 99)
– inference algorithms for complex, coordinated events (Narayanan 2002)

• Expressive Probabilistic Modeling paradigm with relations and branching dynamics.

• Offers principled methods to bound inferential complexity.

Temporal Projection in CPRM



the MAP






[Figure: QA architecture. Components: Retrieved Documents, Predicate Extraction, FrameNet Frames, OWL/OWL-S Topic Ontologies, Model Parameterization, CONTEXT, PRM, Event Simulation. Labeled links: <PRM Update>, <Pred(args), Topic Model, Answer Type>, <Simulation Triggering>.]

AnswerBank

AnswerBank is a collection of over 1200 QA annotations from the AQUAINT CNS corpus.

Questions and answers cover the different domains of the CNS data.

Questions and answers are POS tagged and syntactically parsed.

Question and Answer predicates are annotated with PropBank arguments and FrameNet (when available) tags. FrameNet is annotating CNS data with frame information for use by the AQUAINT QA community.

We are planning to add more semantic information, including temporal and aspectual information (TIMEML+) and information about event relations and figurative uses.

Answer Types for complex questions in AnswerBank

ANSWER TYPE | EXAMPLE | NUMBER
Justify (Proposition) | What is the evidence that IRAQ has WMD? | 89
Ability (Agent, Act) | How can a Biological Weapons Program be detected? | 71
Prediction (State) | What were the possible ramifications of India’s launch of the Prithvi missile? | 63
Hypothetical (Condition) | If Musharraf is removed from power, will Pakistan become a militant Islamic State? | 62

Event Structure Inferences

• For the annotations we classify complex event structure inferences as
– Aspectual
• Stages of events, viewpoints, temporal relations (such as start(ev1, ev2), interrupt(ev1, ev2))
– Action-Based
• Resources (produce, consume, lock), preconditions, maintenance conditions, effects.
– Metaphoric
• Event Structure Metaphor (ESM): events and predications (Motion => Action), objects (Motion.Mover => Action.Actor), parameters (Motion.speed => Action.rateOfProgress)
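The three ESM mappings can be read as a lookup table that carries source-domain bindings over to the target domain. Here is a minimal Python sketch under that reading; the `project` function and the example bindings for a "stumbling economy" are hypothetical and only meant to show the direction of the mapping.

```python
# Event Structure Metaphor entries from the slide (source role -> target role).
ESM = {
    "Motion": "Action",
    "Motion.Mover": "Action.Actor",
    "Motion.speed": "Action.rateOfProgress",
}


def project(source_bindings, mapping=ESM):
    """Carry bindings inferred in the source domain over to the target domain."""
    return {
        mapping[role]: value
        for role, value in source_bindings.items()
        if role in mapping
    }


# Hypothetical source-domain bindings; Motion.terrain has no ESM entry here.
source = {"Motion.Mover": "economy", "Motion.speed": "slow", "Motion.terrain": "rough"}
print(project(source))
# {'Action.Actor': 'economy', 'Action.rateOfProgress': 'slow'}
```

Roles without a mapping entry are simply dropped, so only inferences licensed by the metaphor reach the target domain.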

Scaling Up





Semantic Web

• The World Wide Web (WWW) contains a large and expanding information base.

• HTML is accessible to humans but does not formally describe data in a machine interpretable form.

• XML remedies this by allowing for the use of tags to describe data (ex. disambiguating crawl)

• Ontologies are useful to describe objects and their inter-relationships.

• DAML+OIL (http://www.daml.org) is a markup language based on XML and RDF that is grounded in description logic and is designed to allow for ontology development, transfer, and inference on the web.

Programmatic Access to the web

Web-accessible programs and devices

Knowledge Rep’n for the “Semantic Web”

XML Schema RDF (Resource Description Framework)

RDFS (RDF Schema)

OWL/DAML-L (Logic)

OWL (Ontology)

XML (Extensible Markup Language)

Knowledge Rep’n for “Semantic Web Services”

XML Schema RDF (Resource Description Framework)

RDFS (RDF Schema)

DAML-L (Logic)

DAML+OIL (Ontology)

XML (Extensible Markup Language)

DAML-S (Services)

The OWL Language

OWL REF

DAML-S: Semantic Markup for Web Services

DAML-S: A DARPA Agent Markup Language for Services
• DAML+OIL ontology for Web services:
– well-defined semantics
– ontologies support reuse, mapping, succinct markup, ...
• Developed by a coalition of researchers from Stanford, SRI, CMU, BBN, Nokia, and Yale, under the auspices of DARPA.
• DAML-S version 0.6 posted October 2001: http://www.daml.org/services/daml-s [DAML-S Coalition, 2001, 2002]

[Narayanan & McIlraith 2003]

DAML-S/OWL-S Compositional Primitives

process

atomicprocess

compositeprocess

inputs (conditional) outputs preconditions (conditional) effects

controlconstructs

composedBy

whilesequence

If-then-else

fork

...

Implementation

DAML-S translation to the modeling environment KarmaSIM [Narayanan, 97] (http://www.icsi.berkeley.edu/~snarayan)

Basic Program:

Input: DAML-S description of Events

Output: Network Description of Events in KarmaSIM

Procedure:
• Recursively construct a sub-network for each control construct. Bottom out at atomic event.
• Construct a net for each atomic event
• Return network
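The recursive procedure can be sketched in Python. Everything here is an assumption for illustration: the nested-dictionary process description, the generic node and edge lists, and the `build_net` function stand in for the DAML-S input and the KarmaSIM network format, neither of which is shown on the slides. Only `sequence` and `fork` are handled; `while` and `if-then-else` are omitted.

```python
import itertools

_ids = itertools.count()  # unique suffixes so repeated events get distinct nodes


def build_net(process, nodes, edges):
    """Recursively build a sub-network; return its (entry, exit) node names."""
    kind = process["type"]
    if kind == "atomic":
        # Base case: one node per atomic event.
        node = f"{process['name']}#{next(_ids)}"
        nodes.append(node)
        return node, node

    entry = f"{kind}-start#{next(_ids)}"
    exit_ = f"{kind}-end#{next(_ids)}"
    nodes.extend([entry, exit_])
    parts = [build_net(part, nodes, edges) for part in process["components"]]

    if kind == "sequence":
        # Chain: entry -> part 1 -> part 2 -> ... -> exit.
        chain = [(entry, entry)] + parts + [(exit_, exit_)]
        for (_, prev_exit), (next_entry, _) in zip(chain, chain[1:]):
            edges.append((prev_exit, next_entry))
    elif kind == "fork":
        # Parallel branches: entry fans out, exit joins.
        for part_entry, part_exit in parts:
            edges.append((entry, part_entry))
            edges.append((part_exit, exit_))
    else:
        raise ValueError(f"unsupported control construct: {kind}")
    return entry, exit_


process = {
    "type": "sequence",
    "components": [
        {"type": "atomic", "name": "A"},
        {"type": "fork", "components": [
            {"type": "atomic", "name": "B"},
            {"type": "atomic", "name": "C"},
        ]},
    ],
}
nodes, edges = [], []
entry, exit_ = build_net(process, nodes, edges)
print(len(nodes), len(edges))  # 7 7
```

Each control construct contributes its own start and end nodes, so sub-networks compose without knowing what they contain.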

Example of A WMD Ontology in OWL

<rdfs:Class rdf:ID="DevelopingWeaponOfMassDestruction">
  <rdfs:subClassOf rdf:resource="SUMO.owl#Making"/>
  <rdfs:comment>Making instances of WeaponOfMassDestruction.</rdfs:comment>
</rdfs:Class>

http://reliant.teknowledge.com/DAML/SUMO.owl


FrameNet in OWL

• Program FNtoOWL
– Implemented in Java
– Uses the JENA API
– Given
• an XML FrameNet database (file in the FN XML format, schema and dtd)
• the location (file, URL) of the resulting OWL markup
• a URL for the FrameNet ontology file
– Output
• An OWL file for the FrameNet database at the specified location.
– URL: http://www.icsi.berkeley.edu/~snarayan/FNtoOWL.zip.gz


Conclusion

• Answering complex questions requires semantic representations at multiple levels.
– NE and Extraction-based
– Predicate Argument Structures
– Frame, Topic and Domain Models

• All these representations should be capable of supporting inference about relational structures, uncertain information, and dynamic context.

• Both Semantic Extraction techniques and Structured Probabilistic KR and Inference methods have matured to the point that we understand the various algorithms and their properties.

• Flexible architectures that
– embody these KR and inference techniques and
– make use of the expanding linguistic and ontological resources (such as on the Semantic Web)
point the way to the future of semantically based QA systems!

References (URL)

• Semantic Resources
– FrameNet: http://www.icsi.berkeley.edu/framenet (Papers on FrameNet and Computational Modeling efforts using FrameNet can be found here).
– PropBank: http://www.cis.upenn.edu/~ace/
– Gildea’s Verb Index: http://www.cs.rochester.edu/~gildea/Verbs/ (links FrameNet, PropBank, and VerbNet)

• Probabilistic KR (PRM)
– http://robotics.stanford.edu/~koller/papers/lprm.ps (Learning PRM)
– http://www.eecs.harvard.edu/~avi/Papers/thesis.ps.gz (Avi Pfeffer’s PRM Stanford thesis)
– David Poole IJCAI 2003
– Dan Roth NIPS 2004 (submitted)

• Dynamic Bayes Nets
– http://www.ai.mit.edu/~murphyk/Thesis/thesis.pdf (Kevin Murphy’s Berkeley DBN thesis)
– Weld et al. 2003, DPRM

• Event Structure in Language
– http://www.icsi.berkeley.edu/~snarayan/thesis.pdf (Narayanan’s Berkeley PhD thesis on models of metaphor and aspect)
– ftp://ftp.cis.upenn.edu/pub/steedman/temporality/temporality.ps.gz (Steedman’s article on Temporality with links to previous work on aspect)
– http://www.icsi.berkeley.edu/NTL (publications on Cognitive Linguistics and computational models of cognitive linguistic phenomena can be found here)


References (URL)

• Semantic Web
– Scientific American Article
– http://www.semanticWeb.org

• OWL
– http://www.owl.org

• OWL-S
– http://www.daml.org/services


Outline

• Part V. From Ontologies to Inference
– From OWL to CPRM
– FrameNet in OWL
– FrameNet to CPRM mapping

• Part VI. A Pilot QA System Implementation
– AnswerBank examples
– Current results for Inference Type
– Current results for Answer Structure

