From Text To Reasoning - Marko Grobelnik - SWANK Workshop Stanford - 16 Apr 2014



text understanding topic - ...from plain text to logic


Page 1

From Text to Reasoning

Marko Grobelnik

Jozef Stefan Institute / Cycorp Europe, Slovenia

SWANK Workshop, Stanford, Apr 16th 2014

Thanks to Michael Witbrock, Janez Starc, Luka Bradesko, Blaz Fortuna

Page 2

Reflection on what should be the goal of NLP

• The (mostly) forgotten long-term aim of NLP is to understand the text
• …and not so much ‘processing’ itself (as the name NLP suggests)

• The curse of shallow solutions that work well enough for too many problems has kept people (and researchers) happy for too long

• …as useful as information retrieval and text mining are, they have delayed the development of “text understanding”

Page 3

Language vs. World

• …if we agree with the above statement, then at this point in time, we have ‘language’, but the ‘world’ is more or less missing

• So, what could a ‘world’ or ‘world model’ be?

Page 4

Creating a World Model (top-down approach: Cyc)

[Figure: CYC KNOWLEDGE BASE. A concept graph with the nodes Thing, Universe, Celestial Body, Planet, Earth, Animal and Human, connected by edges labeled "isa", "subclass" and "located in", surrounded by a cloud of further concepts: Physics, Money, Mathematics, Chemistry, Time, Learning, Food, Vehicles, Event, Education, School, Language, Love, Emotions, Going for a walk, Death, Cat, Euro, Working, Words, Driving, Rain, Stabbing someone, Nature, Tree, Hatred, Fear.]

Page 5

Why do we need a world model?

Model of the world…
• …beyond surface knowledge
• …to interconnect contextualized fragments

Why?
• To make reasoning capable of connecting isolated fragments of knowledge
• To derive new knowledge beyond materialized factual knowledge

[Figure labels: World model, Top-down KA, Bottom-up KA, Multimodal data]

Page 6

Disambiguation with a world model (CycKB)

World model used as a set of common-sense semantic constraints to disambiguate text
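
One way to picture how common-sense constraints can disambiguate text is the sketch below. It is a toy illustration and not Cyc's actual mechanism: the sense inventory, the type hierarchy and the argument constraint on "deposit" are all invented for this example.

```python
# Toy illustration only: the senses, types and constraints below are
# hypothetical and are NOT taken from the Cyc knowledge base.

# Hypothetical type hierarchy: child -> parent.
GENLS = {
    "FinancialInstitution": "Organization",
    "RiverBank": "GeographicalRegion",
    "Organization": "Agent",
}

# Hypothetical candidate senses for an ambiguous word.
SENSES = {"bank": ["FinancialInstitution", "RiverBank"]}

# Hypothetical constraint: the recipient of a deposit must be an Agent.
ARG_CONSTRAINT = {("deposit", "recipient"): "Agent"}


def is_a(concept, ancestor):
    """Return True if `concept` equals `ancestor` or specializes it."""
    while concept is not None:
        if concept == ancestor:
            return True
        concept = GENLS.get(concept)
    return False


def disambiguate(word, verb, role):
    """Keep only the senses of `word` that satisfy the (verb, role) constraint."""
    required = ARG_CONSTRAINT[(verb, role)]
    return [sense for sense in SENSES[word] if is_a(sense, required)]


print(disambiguate("bank", "deposit", "recipient"))
```

In this toy setup the river-bank sense is rejected because nothing in the hierarchy makes it an Agent, which is the kind of filtering a world model can provide on a much larger scale.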

Page 7

One of the challenges for the future: Micro-reading

• It is “easier” to understand millions of documents than one document
• …reading and understanding a single document is micro-reading

• The following experiment examines how much knowledge we can extract from individual documents

• …extraction is in the form of first-order, inferentially productive Cyc logic

• …allowing full reasoning to identify new facts

• …minimizing human involvement, optimizing precision and recall

Pipeline: Document → Assertions → Reasoning → Dialogue

Page 8

Example of text and extracted Cyc assertions (1/2)

Automatically Extracted Assertions:

• (isa ?V1 ProsecutingEvent)

• (agent ?V1 RudyGiuliani)

• (genls Entity Agent)

• (isa RudyGiuliani Agent)

• (isa RudyGiuliani Entity)

• (isa ?V3 OrganizingEvent)

• (patient ?V3 (IntersectionFn OrganizedCrime WallStreet))

• (isa (IntersectionFn OrganizedCrime WallStreet) Patient)

• (genls Entity Patient)

• (isa OrganizedCrime Patient)

• (isa OrganizedCrime Entity)

• (isa WallStreet Patient)

• (isa WallStreet Entity)

Sentence: He prosecuted a number of high-profile cases, including ones against organized crime and Wall_Street financiers.

Page 9

Example of text and extracted Cyc assertions (2/2)

Automatically Extracted Assertions:

• (isa ?V1 SubstitutingEvent)

• (temporal ?V1 Lincoln)

• (genls Entity Agent)

• (isa Lincoln Agent)

• (genls Person Entity)

• (isa Lincoln Entity)

• (isa Lincoln Person)

• (isa ?V3 SucceedingEvent)

• (temporal ?V3 Grant)

• (isa Grant Agent)

• (isa Grant Entity)

• (isa Grant Person)

Sentence: Each time a general failed, Lincoln substituted another until finally Grant succeeded in 1865.
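
Several of the assertions above follow from others: given (genls Person Entity) and (isa Lincoln Person), the fact (isa Lincoln Entity) can be derived. The sketch below shows that single inference pattern in Python. The triples are copied from the slide; the fixpoint loop is only an illustrative stand-in, not a description of how Cyc's inference engine works.

```python
# Minimal sketch: derive (isa X C2) from (isa X C1) and (genls C1 C2).
# The facts come from the extracted assertions on the slide; the loop
# below is an assumed, simplified substitute for Cyc's real inference.

genls = {("Person", "Entity"), ("Entity", "Agent")}
isa = {("Lincoln", "Person"), ("Grant", "Person")}


def close_isa(isa_facts, genls_facts):
    """Apply the isa/genls rule repeatedly until no new fact appears."""
    derived = set(isa_facts)
    changed = True
    while changed:
        changed = False
        for instance, collection in list(derived):
            for sub, sup in genls_facts:
                if sub == collection and (instance, sup) not in derived:
                    derived.add((instance, sup))
                    changed = True
    return derived


for fact in sorted(close_isa(isa, genls)):
    print("(isa %s %s)" % fact)
```

Starting from the two Person facts alone, the loop recovers the Entity and Agent memberships of Lincoln and Grant that appear in the extracted list.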

Page 10

Reasoning on extracted assertions (Cyc)

Query:

(and
  (isa ?Per Person)
  (birthDate ?Per ?BD)
  (occursBefore ?BD WorldWarII)
  (thereExistsAtLeast 2 ?Role
    (lifeRole ?Per ?Role)
    (roleInIndustry ?Role FilmIndustry)))

Answers:

Sir Derek_George_Jacobi

Sir Alexander_Korda

Victor Lonzo_Fleming

John_Francis_Junkin

Cornel_Wilde

George_Stevens

Bertrand_Blier

NL Query: Which people born before World War II had at least two roles in the film industry?
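
To make the shape of the query above concrete, here is a toy evaluation over an invented fact base. Everything in it is an assumption made for illustration: the individuals (PersonA, PersonB, PersonC), their birth years and roles are placeholders, and "before World War II" is approximated as "born before 1939". It does not reproduce the answers listed on the slide.

```python
# Toy evaluation of the query shape above. All individuals, years and
# roles are hypothetical placeholders, not data from the Cyc KB.

WWII_START_YEAR = 1939  # assumption: "before World War II" = born before 1939

birth_year = {"PersonA": 1910, "PersonB": 1950, "PersonC": 1925}
life_roles = {
    "PersonA": {"Director", "Producer"},
    "PersonB": {"Actor", "Director"},
    "PersonC": {"Actor", "Novelist"},
}
film_roles = {"Actor", "Director", "Producer"}  # stands in for roleInIndustry


def answers(min_roles=2):
    """Return people born before WWII with at least `min_roles` film roles."""
    result = []
    for person, year in sorted(birth_year.items()):
        if year >= WWII_START_YEAR:
            continue  # fails the occursBefore condition
        matching = life_roles.get(person, set()) & film_roles
        if len(matching) >= min_roles:  # the thereExistsAtLeast condition
            result.append(person)
    return result


print(answers())
```

Only PersonA satisfies both conditions here: PersonB is born too late and PersonC has a single film-industry role.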

Page 11

Knowledge Capture / Knowledge Use

Rule:

(implies
  (and
    (isa ?VENUE FoodTruck-Organization)
    (lastVenue ?USER ?VENUE)
    (suggestionsForCuriousCatQuestionType FoodTruckSecondaryTypeOfPlace-CuriousCatQuestion ?SUGGESTIONLIST))
  (curiousCatWantsToAskUser ?USER
    (secondaryTypeOfPlace ?VENUE FoodTruck-Organization ?TYPE) ?SUGGESTIONLIST))

Witbrock, M., Bradeško, L., 2013, Conversational Computation, in Michelucci, Pietro (Ed.), Handbook of Human Computation, 531-543.

Intelligent SIRI: http://curiouscat.cc/
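
The rule above says: if the user's last venue is a food truck and suggestions exist for the food-truck question type, then the system wants to ask the user what secondary type of place the venue is. A rough Python reading of that rule follows. Only the predicate and constant names come from the slide; the tuple encoding, the sample user, the sample venue and the suggestion list are invented for illustration, and real Cyc rule application is considerably more general.

```python
# Hypothetical sketch of applying the rule above once. The tuple encoding
# and the sample facts (User1, TacoTruck42, the suggestions) are invented.

QUESTION_TYPE = "FoodTruckSecondaryTypeOfPlace-CuriousCatQuestion"

facts = {
    ("isa", "TacoTruck42", "FoodTruck-Organization"),
    ("lastVenue", "User1", "TacoTruck42"),
    ("suggestionsForCuriousCatQuestionType", QUESTION_TYPE,
     ("Mexican", "Burgers")),
}


def fire_rule(facts):
    """Return the curiousCatWantsToAskUser conclusions the rule licenses."""
    suggestion_lists = [
        f[2] for f in facts
        if f[0] == "suggestionsForCuriousCatQuestionType" and f[1] == QUESTION_TYPE
    ]
    conclusions = set()
    for _, user, venue in (f for f in facts if f[0] == "lastVenue"):
        if ("isa", venue, "FoodTruck-Organization") not in facts:
            continue  # antecedent (isa ?VENUE FoodTruck-Organization) fails
        question = ("secondaryTypeOfPlace", venue,
                    "FoodTruck-Organization", "?TYPE")
        for suggestions in suggestion_lists:
            conclusions.add(
                ("curiousCatWantsToAskUser", user, question, suggestions))
    return conclusions


for conclusion in fire_rule(facts):
    print(conclusion)
```

The ?TYPE variable is left unbound on purpose: it is the slot the user's answer is meant to fill.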

Page 12

Some of the AI challenges for the coming years

• Background knowledge in the form of a World Model
• …to have knowledge contextualized

• Representing knowledge and reasoning over it at scale with operational soft logic

• …to decrease the brittleness of logic and increase scale

• Economically viable structured knowledge acquisition with high precision and recall

• …to increase the reach of what we can acquire

• Emphasizing understanding vs. applying black-box models