CS626-449: NLP, Speech and Web-Topics-in-AI
Transcript of CS626-449: NLP, Speech and Web-Topics-in-AI
Pushpak Bhattacharyya, CSE Dept., IIT Bombay
Lecture 37: Semantic Role Extraction (obtaining Dependency Parse)
Vauquois Triangle

Analysis (of the source language) forms one side of the triangle; Generation (of the target language) forms the other.
Direct: enter the target language immediately, through a dictionary.
Transfer based: analyse the source to an intermediate representation, then transfer into the target language.
Interlingua based: do deep semantic processing before entering the target language.

Vauquois: an eminent French machine translation researcher, originally a physicist.
Universal Networking Language (UNL): Universal Words (UWs), Relations, Attributes, Knowledge Base
UNL Graph

He forwarded the mail to the minister.

agt(forward(icl>send).@entry.@past, he(icl>person))
obj(forward(icl>send).@entry.@past, mail(icl>collection).@def)
gol(forward(icl>send).@entry.@past, minister(icl>person).@def)
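The graph can also be captured programmatically. A minimal sketch, assuming a simple triple representation (the `relations` list and `render` helper are illustrative, not part of any official UNL toolkit):

```python
# Minimal sketch (not an official UNL tool): the UNL graph for
# "He forwarded the mail to the minister." as relation triples,
# rendered in UNL expression syntax rel(head, dependent).

relations = [
    ("agt", "forward(icl>send).@entry.@past", "he(icl>person)"),
    ("obj", "forward(icl>send).@entry.@past", "mail(icl>collection).@def"),
    ("gol", "forward(icl>send).@entry.@past", "minister(icl>person).@def"),
]

def render(rels):
    """Render each relation triple as a UNL expression line rel(head, dep)."""
    return [f"{rel}({head}, {dep})" for rel, head, dep in rels]

for line in render(relations):
    print(line)
```

Attributes such as @entry, @past and @def stay attached to the UW they qualify, so the triple list carries the whole graph.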
AGT / AOJ / OBJ

AGT (Agent). Definition: Agt defines a thing which initiates an action.
AOJ (Thing with attribute). Definition: Aoj defines a thing which is in a state or has an attribute.
OBJ (Affected thing). Definition: Obj defines a thing in focus which is directly affected by an event or state.
Examples

John broke the window.
agt(break.@entry.@past, John)

This flower is beautiful.
aoj(beautiful.@entry, flower)

He blamed John for the accident.
obj(blame.@entry.@past, John)
BEN (Beneficiary)
Definition: Ben defines a not directly related beneficiary or victim of an event or state.

Can I do anything for you?
ben(do.@entry.@interrogation.@politeness, you)
obj(do.@entry.@interrogation.@politeness, anything)
agt(do.@entry.@interrogation.@politeness, I)
PUR (Purpose or objective)
Definition: Pur defines the purpose or objective of the agent of an event, or the purpose for which a thing exists.

This budget is for food.
pur(food.@entry, budget)
mod(budget, this)
RSN (Reason)
Definition: Rsn defines a reason why an event or a state happens.

They selected him for his honesty.
agt(select(icl>choose).@entry, they)
obj(select(icl>choose).@entry, he)
rsn(select(icl>choose).@entry, honesty)
TIM (Time)
Definition: Tim defines the time an event occurs or a state is true.

I wake up at noon.
agt(wake up.@entry, I)
tim(wake up.@entry, noon(icl>time))
TMF (Initial time)
Definition: Tmf defines the time an event starts.

The meeting started from morning.
obj(start.@entry.@past, meeting.@def)
tmf(start.@entry.@past, morning(icl>time))
TMT (Final time)
Definition: Tmt defines the time an event ends.

The meeting continued till evening.
obj(continue.@entry.@past, meeting.@def)
tmt(continue.@entry.@past, evening(icl>time))
PLC (Place)
Definition: Plc defines the place an event occurs, a state is true, or a thing exists.

He is very famous in India.
aoj(famous.@entry, he)
man(famous.@entry, very)
plc(famous.@entry, India)
PLF (Initial place)
Definition: Plf defines the place an event begins or a state becomes true.

Participants come from the whole world.
agt(come.@entry, participant.@pl)
plf(come.@entry, world)
mod(world, whole)
PLT (Final place)
Definition: Plt defines the place an event ends or a state becomes false.

We will go to Delhi.
agt(go.@entry.@future, we)
plt(go.@entry.@future, Delhi)
INS (Instrument)
Definition: Ins defines the instrument used to carry out an event.

I solved it with a computer.
agt(solve.@entry.@past, I)
ins(solve.@entry.@past, computer)
obj(solve.@entry.@past, it)
Attributes
Constitute the syntax of UNL.
Play the role of bridging the conceptual world and the real world in UNL expressions.
Show how and when the speaker views what is said, and with what intention, feeling, and so on.
Seven types: time with respect to the speaker; aspects; speaker's view of reference; speaker's emphasis, focus, topic, etc.; convention; speaker's attitudes; speaker's feelings and viewpoints.
Tense: @past
The past tense is normally expressed by @past.

{unl}agt(go.@entry.@past, he)…{/unl}
He went there yesterday.
Aspects: @progress

{unl}man(rain.@entry.@present.@progress, hard){/unl}
It's raining hard.
Speaker's view of reference
@def (specific concept, already referred): The house on the corner is for sale.
@indef (non-specific class): There is a book on the desk.
@not is always attached to the UW which is negated: He didn't come.
agt(come.@entry.@past.@not, he)
Speaker's emphasis: @emphasis
John his name is.
mod(name, he)
aoj(John.@emphasis.@entry, name)

@entry denotes the entry point or main UW of a UNL expression.
Subcategorization Frames
Specify the categorial class of the lexical item, and specify the environment.
Examples:
kick: [V; _ NP]
cry: [V; _ ]
rely: [V; _ PP]
put: [V; _ NP PP]
think: [V; _ S']
Subcategorization Rules

Subcategorization Rule:
V → y / { _ NP], _ ], _ PP], _ NP PP], _ S'] }
Subcategorization Rules
1. S → NP VP
2. VP → V (NP) (PP) (S') …
3. NP → Det N
4. V → rely / _ PP]
5. P → on / _ NP]
6. Det → the
7. N → boy, friend

The boy relied on the friend.
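The lexical rules above amount to a lookup: a verb is licensed only if its complements match its frame. A toy sketch (the `SUBCAT` table mirrors the slide; the checking function itself is an assumption for illustration):

```python
# Toy subcategorization lexicon, from the frames on the slide.
SUBCAT = {
    "kick": ["NP"],        # kick: [V; _ NP]
    "cry":  [],            # cry:  [V; _ ]
    "rely": ["PP"],        # rely: [V; _ PP]
    "put":  ["NP", "PP"],  # put:  [V; _ NP PP]
}

def licensed(verb, complements):
    """True if the complement sequence matches the verb's subcat frame."""
    return SUBCAT.get(verb) == list(complements)

print(licensed("rely", ["PP"]))   # "the boy relied [PP on the friend]"
print(licensed("rely", ["NP"]))   # *"the boy relied the friend"
```

A real grammar would allow optional complements (rule 2 above); this sketch checks one frame per verb.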
Semantically Odd Constructions
Can we exclude these two ill-formed structures?
*The boy frightened sincerity.
*Sincerity kicked the boy.
Selectional Restrictions
Selectional Restrictions
Inherent properties of nouns: [+/- ABSTRACT], [+/- ANIMATE]
E.g., sincerity [+ABSTRACT], boy [+ANIMATE]
Selectional Rules
A selectional rule specifies certain selectional restrictions associated with a verb.

V → y / [+/- ABSTRACT] ____ [+/- ANIMATE]
V → frighten / [+/- ABSTRACT] ____ [+ANIMATE]
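Such restrictions can be sketched as feature checks. The feature values follow the slides; the restriction tables and the `acceptable` helper are illustrative assumptions:

```python
# Inherent noun features, from the slides.
FEATURES = {
    "sincerity": {"ABSTRACT": True, "ANIMATE": False},
    "boy":       {"ABSTRACT": False, "ANIMATE": True},
}

# Toy selectional restrictions: features required of subject/object.
SUBJECT_RESTRICTIONS = {"kick": {"ANIMATE": True}}
OBJECT_RESTRICTIONS = {"frighten": {"ANIMATE": True}}

def acceptable(subj, verb, obj):
    """Reject semantically odd clauses like '*The boy frightened sincerity'."""
    subj_req = SUBJECT_RESTRICTIONS.get(verb, {})
    obj_req = OBJECT_RESTRICTIONS.get(verb, {})
    ok_subj = all(FEATURES[subj].get(f) == v for f, v in subj_req.items())
    ok_obj = all(FEATURES[obj].get(f) == v for f, v in obj_req.items())
    return ok_subj and ok_obj

print(acceptable("boy", "frighten", "sincerity"))   # False: object is [-ANIMATE]
print(acceptable("sincerity", "frighten", "boy"))   # True: object is [+ANIMATE]
print(acceptable("sincerity", "kick", "boy"))       # False: subject is [-ANIMATE]
```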
Subcategorization Frames
forward, V, __ NP PP (e.g., We will be forwarding our new catalogue to you)
invitation, N, __ PP (e.g., an invitation to the party)
accessible, A, __ PP (e.g., a program making science more accessible to young people)
Thematic Roles
The man forwarded the mail to the minister.
forward: V, __ NP PP
[Event FORWARD ([Thing THE MAN], [Thing THE MAIL], [Path TO THE MINISTER])]
How to define the UWs in the UNL Knowledge Base?
Nominal concept: Abstract, Concrete
Verbal concept: Do, Occur, Be
Adjective concept
Adverbial concept
Nominal Concept: Abstract thing

abstract thing{(icl>thing)}
culture(icl>abstract thing)
civilization(icl>culture{>abstract thing})
direction(icl>abstract thing)
east(icl>direction{>abstract thing})
duty(icl>abstract thing)
mission(icl>duty{>abstract thing})
responsibility(icl>duty{>abstract thing})
accountability{(icl>responsibility>duty)}
event(icl>abstract thing{,icl>time>abstract thing})
meeting(icl>event{>abstract thing,icl>group>abstract thing})
conference(icl>meeting{>event})
TV conference{(icl>conference>meeting)}
Nominal Concept: Concrete thing

concrete thing{(icl>thing,icl>place>thing)}
building(icl>concrete thing)
factory(icl>building{>concrete thing})
house(icl>building{>concrete thing})
substance(icl>concrete thing)
cloth(icl>substance{>concrete thing})
cotton(icl>cloth{>substance})
fiber(icl>substance{>concrete thing})
synthetic fiber{(icl>fiber>substance)}
textile fiber{(icl>fiber>substance)}
liquid(icl>substance{>concrete thing})
beverage(icl>food,icl>liquid{>substance})
coffee(icl>beverage{>food})
liquor(icl>beverage{>food})
beer(icl>liquor{>beverage})
Verbal concept: do

do({icl>do,}agt>thing,gol>thing,obj>thing)
express({icl>do(}agt>thing,gol>thing,obj>thing{)})
state(icl>express(agt>thing,gol>thing,obj>thing))
explain(icl>state(agt>thing,gol>thing,obj>thing))
add({icl>do(}agt>thing,gol>thing,obj>thing{)})
change({icl>do(}agt>thing,gol>thing,obj>thing{)})
convert(icl>change(agt>thing,gol>thing,obj>thing))
classify({icl>do(}agt>thing,gol>thing,obj>thing{)})
divide(icl>classify(agt>thing,gol>thing,obj>thing))
Verbal concept: occur and be

occur({icl>occur,}gol>thing,obj>thing)
melt({icl>occur(}gol>thing,obj>thing{)})
divide({icl>occur(}gol>thing,obj>thing{)})
arrive({icl>occur(}obj>thing{)})

be({icl>be,}aoj>thing{,^obj>thing})
exist({icl>be(}aoj>thing{)})
born({icl>be(}aoj>thing{)})
How to define the UWs in the UNL Knowledge Base?
In order to distinguish among the verb classes headed by 'do', 'occur' and 'be', the following features are used:

UW      | needs an agent | needs an object | English
'do'    | +              | +               | "to kill"
'occur' | -              | +               | "to fall"
'be'    | -              | -               | "to know"
How to define the UWs in the UNL Knowledge Base?
The verbal UWs (do, occur, be) also take some pre-defined semantic cases, as follows:

UW      | pre-defined cases           | English
'do'    | takes necessarily agt>thing | "to kill"
'occur' | takes necessarily obj>thing | "to fall"
'be'    | takes necessarily aoj>thing | "to know"
Complex sentence
I want to watch this movie.

agt(want(icl>).@entry.@past, I(iof>person))
obj(want(icl>).@entry.@past, :01)
:01 agt(watch(icl>do).@entry.@inf, I(iof>person))
:01 obj(watch(icl>do).@entry.@inf, movie(icl>).@def)
Approach to UNL Generation
Problem Definition
Generate UNL expressions for English sentences in a robust and scalable manner, using syntactic analysis and lexical resources extensively. This needs detecting semantically relatable entities and solving attachment problems.
Semantically Relatable Sequences (SRS)
Definition: A semantically relatable sequence (SRS) of a sentence is a group of words in the sentence (not necessarily consecutive) that appear in the semantic graph of the sentence as linked nodes or nodes with speech act labels.
(This is motivated by the UNL representation.)
SRS as an intermediary
Source Language Sentence → SRS → UNL → Target Language Sentence
Example to illustrate SRS
"The man bought a new car in June"

bought (past tense)
agent: bought → man (the: definite)
object: bought → car (a: indefinite; modifier: new)
time: bought → June (in: modifier)
Sequences from "the man bought a new car in June"
a. {man, bought}
b. {bought, car}
c. {bought, in, June}
d. {new, car}
e. {the, man}
f. {a, car}
Basic questions
Which words can form semantic constituents, which we call Semantically Relatable Sequences (SRS)?
What after all are the SRSs of the given sentence?
What semantic relations can link the words in an SRS and the SRSs themselves?
Postulate
A sentence needs to be broken into sequences of at most three forms:
{CW, CW}
{CW, FW, CW}
{FW, CW}
where CW refers to a content word or a clause and FW to a function word.
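The postulate can be checked mechanically once each word is classified as CW or FW. A toy sketch (the function-word list is an illustrative assumption; real systems read this off parse-tree tags):

```python
# Toy function-word list; everything else counts as a content word.
FUNCTION_WORDS = {"the", "a", "an", "in", "to", "that", "and", "has", "is"}

def shape(seq):
    """Return the CW/FW pattern of a word sequence, e.g. ('CW', 'FW', 'CW')."""
    return tuple("FW" if w in FUNCTION_WORDS else "CW" for w in seq)

# The three postulated forms.
ALLOWED = {("CW", "CW"), ("CW", "FW", "CW"), ("FW", "CW")}

for seq in [["man", "bought"], ["bought", "in", "June"], ["a", "car"]]:
    print(seq, shape(seq), shape(seq) in ALLOWED)
```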
SRS and Language Phenomena
Movement: Preposition Stranding
John, we laughed at.
(we, laughed.@entry) --------- (CW, CW)
(laughed.@entry, at, John) --- (CW, FW, CW)
Movement: Topicalization
The problem, we solved.
(we, solved.@entry) ------------ (CW, CW)
(solved.@entry, problem) ----- (CW, CW)
(the, problem) ------------------- (FW, CW)
Movement: Relative Clauses
John told a joke which we had already heard.
(John, told.@entry) ------------------- (CW, CW)
(told.@entry, :01) --------------------- (CW, CW)
SCOPE01(we, had, heard.@entry) ------- (CW, FW, CW)
SCOPE01(already, heard.@entry) ------- (CW, CW)
SCOPE01(heard.@entry, which, joke) --- (CW, FW, CW)
SCOPE01(a, joke) ---------------------- (FW, CW)
Movement: Interrogatives
Who did you refer her to?
(did, refer.@entry.@interrogative) ------- (FW, CW)
(you, refer.@entry.@interrogative) ------- (CW, CW)
(refer.@entry.@interrogative, her) ------- (CW, CW)
(refer.@entry.@interrogative, to, who) --- (CW, FW, CW)
Empty Pronominals: to-infinitivals
Bill was wise to sell the piano.
(wise.@entry, SCOPE01) --------------- (CW, CW)
SCOPE01(sell.@entry, piano) ---------- (CW, CW)
(Bill, was, wise.@entry) ------------- (CW, FW, CW)
SCOPE01(Bill, to, sell.@entry) ------- (CW, FW, CW)
SCOPE01(the, piano) ------------------ (FW, CW)
Empty Pronominal: Gerundial
The cat leapt down spotting a thrush on the lawn.
(The, cat) ----------------------------- (FW, CW)
(cat, leapt.@entry) -------------------- (CW, CW)
(leapt.@entry, down) ------------------- (CW, CW)
(leapt.@entry, SCOPE01) ---------------- (CW, CW)
SCOPE01(spotting.@entry, thrush) ------- (CW, CW)
SCOPE01(spotting.@entry, on, lawn) ----- (CW, FW, CW)
PP Attachment
John cracked the glass with a stone.
(John, cracked.@entry) -------------- (CW, CW)
(cracked.@entry, glass) ------------- (CW, CW)
(cracked.@entry, with, stone) ------- (CW, FW, CW)
(a, stone) -------------------------- (FW, CW)
(the, glass) ------------------------ (FW, CW)
SRS and PP attachment (Mohanty, Almeida, Bhattacharyya, 2004)

Condition | Sub-condition | Attachment point
[PP] is subcategorized by the verb [V] | [NP2] is licensed by a preposition [P] | [NP2] is attached to the verb [V] (e.g., He forwarded the mail to the minister)
[PP] is subcategorized by the noun in [NP1] | [NP2] is licensed by a preposition [P] | [NP2] is attached to the noun in [NP1] (e.g., John published six articles on machine translation)
[PP] is neither subcategorized by the verb [V] nor by the noun in [NP1] | [NP2] refers to a [PLACE]/[TIME] feature | [NP2] is attached to the verb [V] (e.g., I saw Mary in her office; The girls met the teacher on different days)
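The decision table can be sketched as a cascade of checks. The lexicons below are toy stand-ins for the subcategorization database and the PLACE/TIME features; the helper and its names are assumptions for illustration:

```python
# Toy stand-ins for the lexical resources used by the table above.
VERB_SUBCAT_PP = {("forward", "to")}     # verbs subcategorizing for this PP
NOUN_SUBCAT_PP = {("article", "on")}     # nouns subcategorizing for this PP
PLACE_TIME = {"office", "June", "days"}  # NP2 heads with PLACE/TIME features

def attach(verb, np1_head, prep, np2_head):
    """Return 'V' or 'N': the attachment point for [PP prep NP2]."""
    if (verb, prep) in VERB_SUBCAT_PP:
        return "V"                       # row 1: verb subcategorizes the PP
    if (np1_head, prep) in NOUN_SUBCAT_PP:
        return "N"                       # row 2: noun subcategorizes the PP
    if np2_head in PLACE_TIME:
        return "V"                       # row 3: PLACE/TIME attaches to the verb
    return "V"                           # otherwise fall back to the parser

print(attach("forward", "mail", "to", "minister"))  # V
print(attach("publish", "article", "on", "MT"))     # N
print(attach("see", "Mary", "in", "office"))        # V
```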
Linguistic Study to Computation
Syntactic constituents to semantic constituents
A probabilistic parser (Charniak, 2004) is used.
Other resources: WordNet and the Oxford Advanced Learner's Dictionary.
In a parse tree, tags give indications of CW and FW:
NP, VP, ADJP and ADVP → CW
PP (prepositional phrase), IN (preposition) and DT (determiner) → FW
Observation: Headwords of sibling nodes form SRSs
"John has bought a car."
SRS: {has, bought}, {a, car}, {bought, car}

(C) VP [bought]
  (F) AUX [has]
  (C) VP [bought]
    (C) VBD [bought]
    (C) NP [car]
      (F) DT [a]
      (C) NN [car]
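The observation can be implemented directly: walk the tree and pair each child's headword with its parent's headword. The tuple-based tree encoding (tag, CW/FW kind, head, children) is an assumption for illustration:

```python
# The VP subtree for "John has bought a car.", encoded as
# (tag, kind, headword, children).
tree = ("VP", "C", "bought", [
    ("AUX", "F", "has", []),
    ("VP", "C", "bought", [
        ("VBD", "C", "bought", []),
        ("NP", "C", "car", [
            ("DT", "F", "a", []),
            ("NN", "C", "car", []),
        ]),
    ]),
])

def srs_pairs(node):
    """Pair each non-head child's headword with the parent's headword."""
    tag, kind, head, children = node
    pairs = []
    for child in children:
        if child[2] != head:          # sibling whose head differs from parent's
            pairs.append((child[2], head))
        pairs.extend(srs_pairs(child))
    return pairs

print(srs_pairs(tree))  # [('has', 'bought'), ('car', 'bought'), ('a', 'car')]
```

These are exactly the SRSs on the slide: {has, bought}, {bought, car}, {a, car}.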
Need: Resilience to wrong PP attachment
"John has published an article on linguistics"
Use PP attachment heuristics; get {article, on, linguistics}.

(C) VP [published]
  (C) VBD [published]
  (C) NP [article]
    (F) DT [an]
    (C) NN [article]
  (F) PP [on]
    (F) IN [on]
    (C) NP [linguistics]
      (C) NNS [linguistics]
to-infinitival
"I forced him to watch this movie"
The clause boundary is the VP node, labeled with SCOPE.
The tag is modified to TO, a FW tag, indicating that it heads a to-infinitival clause.
The NP node with head him is duplicated and inserted (depicted by shaded nodes) as a sibling of the VBD node with head forced, to bring out the existence of a semantic relation between force and him.

(C) VP [forced]
  (C) VBD [forced]
  (C) NP [him]
    (C) PRP [him]
  (C) S [SCOPE]
    (F) TO [to]
    (C) VP [watch]
    (C) NP [him] (inserted copy)
Linking of clauses: "John said that he was reading a novel"
Head of the S node is marked as SCOPE. SRS: {said, that, SCOPE}.
Adverbial clauses have similar parse tree structures, except that the subordinating conjunctions are different from that.

(C) VP [said]
  (C) VBD [said]
  (F) SBAR [that]
    (F) IN [that]
    (C) S [SCOPE]
Implementation: Block Diagram of the system

Input Sentence → Charniak Parser → Parse Tree → Scope Handler (parse tree modification and augmentation with head and scope information) → Augmented Parse Tree → Semantically Relatable Sequences Generator → Attachment Resolver → Semantically Relatable Sequences

Supporting resources: WordNet 2.0 (noun classification; time and place features); Subcategorization Database (THAT clause as subcat property; preposition as subcat property)
Head determination
Uses a bottom-up strategy to determine the headword for every node in the parse tree.
Crucial in obtaining the SRSs, since wrong head information may end up getting propagated all the way up the tree.
Processes the children of every node starting from the rightmost child, and checks the head information already specified against the node's tag to determine the head of the node.
Some special cases are: the SBAR node; a VP node with PRO insertion, copula, phrasal verbs, etc.; NP nodes with of-PP cases and conjunctions under them, which lead to scope creation.
Scope handler
Performs modification on the parse trees by insertion of nodes in to-infinitival cases.
Adjusts the tag and head information in the case of SBAR nodes.
Attachment resolver
Takes a (CW1, FW, CW2) triple as input and checks the time and place features of CW2, the noun class of CW1, and the subcategorization information for the CW1 and FW pair to decide the attachment.
If none of these yields a deterministic result, it takes the attachment indicated by the parser.
SRS generator
Performs a breadth-first search on the parse tree and performs detailed processing at every node N1 of the tree.
S nodes which dominate entire clauses (main or embedded) are treated as CWs.
SBAR and TO nodes are treated as FWs.
Algorithm
If the node N1 is a CW (new/JJ, published/VBD, fact/NN, boy/NN, John/NNP), perform the following checks:
  If the sibling N2 of N1 is a CW (car/NN, article/NN, SCOPE/S),
    then create {CW, CW} ({new, car}, {published, article}, {boy, SCOPE}).
  If the sibling N2 is a FW (in/PP, that/SBAR, and/CC),
    then check if N2 has a child FW N3 (in/IN, that/IN) and a child CW N4 (June/NN, SCOPE/S).
    If yes, use the attachment resolver to decide the CW to which N3 and N4 attach, and create {CW, FW, CW} ({published, in, June}, {fact, that, SCOPE}).
    If no, check if the next sibling N5 of N1 is a CW (Mary/NN); if yes, create {CW, FW, CW} ({John, and, Mary}).
If the node N1 is a FW (the/DT, is/AUX, to/TO), perform the following checks:
  If the parent node is a CW (boy/NP, famous/VP), check if the sibling is an adjective.
    If yes (famous/JJ), then create {CW, FW, CW} ({She, is, famous}).
    If no (boy/NN), then create {FW, CW} ({the, boy}, {has, bought}).
  If the parent node N6 is a FW (to/TO) and the sibling node N7 is a CW (learn/VB),
    use the attachment resolver to decide on the preceding CW to which N6 and N7 can attach, and create {CW, FW, CW} ({exciting, to, learn}).
Evaluation
FrameNet corpus [Baker et al., 1998], a semantically annotated corpus, as the test data.
92,310 sentences (call this the gold standard), created automatically from the FrameNet corpus taking verbs, nouns and adjectives as the targets:
Verbs as the target: 37,984 (i.e., semantic frames of verbs)
Nouns as the target: 37,240
Adjectives as the target: 17,086
Score for high frequency verbs

Verb      | Frequency | Score
Swim      | 280       | 0.709
Depend    | 215       | 0.804
Look      | 187       | 0.835
Roll      | 173       | 0.7
Rush      | 172       | 0.775
Phone     | 162       | 0.695
Reproduce | 159       | 0.797
Step      | 159       | 0.795
Urge      | 157       | 0.765
Avoid     | 152       | 0.789
Scores of 10 verb groups of high frequency in the Gold Standard
Scores of 10 noun groups of high frequency in the Gold Standard
An actual sentence
Sentence: A form of asbestos once used to make Kent cigarette filters has caused a high percentage of cancer deaths among a group of workers exposed to it more than 30 years ago, researchers reported.
Relative performance on SRS constructs: a bar chart of recall and precision (0-100 scale) for the matched parameters: total SRSs, (FW,CW), (CW,FW,CW) and (CW,CW).
Results on sentence constructs: a bar chart of recall and precision (0-100 scale) for to-infinitival clause resolution, complement-clause resolution, clause linkings and PP resolution.

Rajat Mohanty, Anupama Dutta and Pushpak Bhattacharyya, Semantically Relatable Sets: Building Blocks for Representing Semantics, 10th Machine Translation Summit (MT Summit 05), Phuket, September 2005.
Statistical Approach
Use SRL marked corpora
Daniel Gildea and Daniel Jurafsky. 2002. Automatic labeling of semantic roles. Computational Linguistics, 28(3):245-288.
PropBank corpus: role-annotated WSJ part of the Penn Treebank [10]; PropBank role-set [2,4]
Core roles: ARG0 (Proto-agent), ARG1 (Proto-patient) to ARG5
Adjunctive roles: ARGM-LOC (for locatives), ARGM-TMP (for temporals), etc.
SRL marked corpora (contd.)
PropBank roles: an example
[ARG0 It] operates [ARG1 stores] [ARGM-LOC mostly in Iowa and Nebraska]
Preprocessing systems [2]: part-of-speech tagger, base chunker, full syntactic parser, named entity recognizer
Fig. 4: Parse tree output. Source: [5]
Probabilistic estimation [1]
Empirical probability estimation over candidate roles for each constituent, based upon extracted features. Here t is the target word, r is a candidate role, and h, pt, gov, position, voice are features.

P(r | h, pt, gov, position, voice, t) = #(r, h, pt, gov, position, voice, t) / #(h, pt, gov, position, voice, t)

Linear interpolation, with the condition Σ_i λ_i = 1:
P(r | constituent) = λ1 P(r | t) + λ2 P(r | pt, t) + λ3 P(r | pt, gov, t) + λ4 P(r | h) + λ5 P(r | h, pt, t)

Geometric mean, with the condition Σ_r P(r | constituent) = 1:
P(r | constituent) = (1/Z) exp{ λ1 log P(r | t) + λ2 log P(r | pt, t) + λ3 log P(r | pt, gov, t) + λ4 log P(r | h) + λ5 log P(r | h, pt, t) }
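The linear interpolation amounts to mixing back-off distributions with weights summing to 1. A sketch with invented toy distributions (only two of the five back-off distributions are shown, for brevity):

```python
# Toy back-off distributions over roles (invented for illustration).
p_t = {"ARG0": 0.5, "ARG1": 0.4, "ARGM-LOC": 0.1}  # P(r | t)
p_h = {"ARG0": 0.2, "ARG1": 0.7, "ARGM-LOC": 0.1}  # P(r | h)

def interpolate(dists, lambdas):
    """Linear interpolation: P(r) = sum_i lambda_i * P_i(r)."""
    assert abs(sum(lambdas) - 1.0) < 1e-9, "weights must sum to 1"
    roles = set().union(*dists)
    return {r: sum(l * d.get(r, 0.0) for l, d in zip(lambdas, dists))
            for r in roles}

mixed = interpolate([p_t, p_h], [0.6, 0.4])
print(mixed)
# Because the weights sum to 1, the mixture is again a distribution:
print(round(sum(mixed.values()), 6))
```

The geometric-mean variant instead mixes log-probabilities and renormalizes by Z, which is why it needs the explicit condition Σ_r P(r | constituent) = 1.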
A state-of-the-art SRL system: ASSERT [4]

Main points [3,4]:
Use of Support Vector Machines [13] as the classifier
Similar to FrameNet "domains", "Predicate Clusters" are introduced
Named Entities [14] are used as a new feature

Experiment I (parser dependency testing): use of the PropBank bracketed corpus; use of the Charniak parser trained on the Penn Treebank corpus.

Parse    | Task         | Precision (%) | Recall (%) | F-score (%) | Accuracy (%)
Treebank | Id.          | 97.5          | 96.1       | 96.8        | -
Treebank | Class.       | -             | -          | -           | 93.0
Treebank | Id. + Class. | 91.8          | 90.5       | 91.2        | -
Charniak | Id.          | 87.8          | 84.1       | 85.9        | -
Charniak | Class.       | -             | -          | -           | 92.0
Charniak | Id. + Class. | 81.7          | 78.4       | 80.0        | -

Table 1: Performance of ASSERT for Treebank and Charniak parser outputs. Id. stands for the identification task and Class. stands for the classification task. Data source: [4]
Experiments and Results
Experiment II (cross-genre testing):
1. Training on PropBanked WSJ data and testing on the Brown Corpus
2. Charniak parser trained first on PropBank, then on Brown

Table 2: Performance of ASSERT for various experimental combinations. Data source: [4]