A Joint Model For Semantic Role Labeling
Aria Haghighi, Kristina Toutanova, Christopher D. Manning
Computer Science Department, Stanford University
Most Previous Work: Local Models
• Extract features for each node and the predicate
• Classify nodes independently
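[Figure: parse tree for "The ogre cooked the children"; the NP "the children" is the node being classified]
Features for the node "the children":
• Phrase Type: NP
• Path: NP-up-VP-down V
• Head Word: children
• Predicate: cook
• Passive: false
• Position: after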
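The Path feature above can be sketched in a few lines. This is a minimal illustration, assuming a toy tree representation: the `Node` class and the helper names are hypothetical and are not taken from the slides, and the path is rendered with hyphens throughout.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass(eq=False)  # identity-based equality, so nodes with equal labels stay distinct
class Node:
    """Hypothetical parse-tree node, used only for this illustration."""
    label: str
    children: List["Node"] = field(default_factory=list)
    parent: Optional["Node"] = None

    def add(self, child: "Node") -> "Node":
        child.parent = self
        self.children.append(child)
        return child


def ancestors(node: Optional[Node]) -> List[Node]:
    """Chain from the node itself up to the root."""
    chain = []
    while node is not None:
        chain.append(node)
        node = node.parent
    return chain


def path_feature(node: Node, predicate: Node) -> str:
    """Labels going up to the lowest common ancestor, then down to the predicate."""
    up = ancestors(node)
    down = ancestors(predicate)
    common = next(n for n in up if n in down)
    up_labels = [n.label for n in up[: up.index(common) + 1]]
    down_labels = [n.label for n in reversed(down[: down.index(common)])]
    return "-up-".join(up_labels) + "".join("-down-" + lab for lab in down_labels)


# Toy tree for "The ogre cooked the children"
s = Node("S")
subj = s.add(Node("NP"))
vp = s.add(Node("VP"))
verb = vp.add(Node("V"))
obj = vp.add(Node("NP"))

print(path_feature(obj, verb))   # NP-up-VP-down-V
print(path_feature(subj, verb))  # NP-up-S-down-VP-down-V
```

A local model would combine this string with the other listed features (phrase type, head word, and so on) and classify each node on its own.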
A Drawback of Local Models
[Figure: parse tree for "The ogre cooked the children a meal"; node labels shown: NP-AGENT, NP-PATIENT, NP-BENEFICIARY, NP-PATIENT]
What we’d like to capture
• Core argument frame constraints
  • Hard constraints: no overlapping arguments
  • Soft constraints: AGENT occurs before other core arguments in an active sentence
  • Highly non-local constraints
• Model a predicate’s argument preferences
  • Having no core arguments is bad, and so is having 10
  • Verb-specific rules: obligatory arguments
• We’d like to do this statistically, without hand-coding constraints or conditions
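The one hard constraint named on this slide, no overlapping arguments, is easy to state in code. The sketch below is illustrative only: the span encoding (word indices, end exclusive) and the function name are assumptions, and the soft constraints are deliberately left out because the slides want them learned statistically rather than hand-coded.

```python
from typing import List, Tuple

# (start, end, role): word indices with end exclusive; a hypothetical encoding
Span = Tuple[int, int, str]


def has_overlap(args: List[Span]) -> bool:
    """True if any two argument spans share a word."""
    ordered = sorted(args)
    # After sorting by start, any overlap shows up between neighbours.
    return any(prev[1] > cur[0] for prev, cur in zip(ordered, ordered[1:]))


# "The ogre cooked the children a meal" -> word indices 0..6
good = [(0, 2, "AGENT"), (3, 5, "BENEFICIARY"), (5, 7, "PATIENT")]
bad = [(0, 2, "AGENT"), (3, 7, "PATIENT"), (3, 5, "BENEFICIARY")]

print(has_overlap(good))  # False
print(has_overlap(bad))   # True
```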
Previous Joint Approaches
• Argument language model and Viterbi decoding (Gildea and Jurafsky, 02)
• Linear programming over local scores (Punyakanok et al., 04 and 05)
• Our approach: Capture joint information between features and labels discriminatively
Joint Discriminative Reranking
• Use a reranking approach (Collins 00)
• Start with a local model with strong independence assumptions
• Find the top N non-overlapping assignments for the local model using a simple dynamic program (Toutanova, 05)
• Select the best assignment among the top N using a joint log-linear model
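The final reranking step can be sketched as a log-linear model normalized over the top N candidates. This is a minimal sketch under stated assumptions: the feature function, the weights, and all names below are made up for illustration, and the slides do not say whether the local model score is also combined into the final decision.

```python
import math
from typing import Callable, Dict, List, Tuple

Assignment = Tuple[str, ...]  # one role label per candidate node
FeatureFn = Callable[[Assignment], Dict[str, float]]


def joint_score(a: Assignment, weights: Dict[str, float], features: FeatureFn) -> float:
    """Linear score w . f(a) over sparse joint features."""
    return sum(weights.get(name, 0.0) * value for name, value in features(a).items())


def rerank(top_n: List[Assignment], weights: Dict[str, float],
           features: FeatureFn) -> Tuple[Assignment, List[float]]:
    """Softmax over the N candidates; return the most probable one and all probabilities."""
    scores = [joint_score(a, weights, features) for a in top_n]
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    z = sum(exps)
    probs = [e / z for e in exps]
    best = max(range(len(top_n)), key=lambda i: probs[i])
    return top_n[best], probs


def toy_features(a: Assignment) -> Dict[str, float]:
    """Hypothetical joint features: repeated labels and the whole label sequence."""
    core = [label for label in a if label != "NONE"]
    return {
        "repeated_label": float(len(core) - len(set(core))),
        "seq=" + "_".join(core): 1.0,
    }


# Hypothetical weights; a real model would learn these from training data.
toy_weights = {"repeated_label": -2.0, "seq=AGENT_BENEFICIARY_PATIENT": 1.5}
candidates = [
    ("AGENT", "PATIENT", "PATIENT"),
    ("AGENT", "BENEFICIARY", "PATIENT"),
]
best, probs = rerank(candidates, toy_weights, toy_features)
print(best)  # ('AGENT', 'BENEFICIARY', 'PATIENT')
```

Because the model only has to score N candidates, its features can look at the whole assignment at once, which is what a node-by-node classifier cannot do.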
Reranking Upperbounds
[Figure: performance (y-axis, 72 to 100) against top N (x-axis, 0 to 20); curves: core args f-measure, core args whole frame acc, all args f-measure, all args whole frame acc]
• Reranking not a serious bottleneck
• Core arguments top 20: f-measure 99.2, whole frame acc 97.4
• All arguments top 20: f-measure 98.8, whole frame acc 95.3
Global Reranking Features
[AGENT The company] offered [PATIENT a 20% stake] [BENEFICIARY to the public]
• Core Argument Sequence with predicate and voice
  • [NP-AGENT active:pred NP-PATIENT PP-to-BENEFICIARY]
  • Lexicalized version: active:pred to active:offer
  • [AGENT active:pred PATIENT BENEFICIARY]
  • [NP active:pred NP PP-to]
• Frame Feature
  • [NP active:pred NP-PATIENT PP-to]
  • Compare to less likely [NP active:pred NP-PATIENT NP]
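The core argument sequence templates on this slide can be generated mechanically from a labeled assignment. The sketch below is an assumption-laden illustration: the function names and the (phrase type, role) encoding are hypothetical, the hyphenated rendering is a notation chosen here, and applying the lexicalized variant to all three templates is a guess about what the slide intends.

```python
from typing import List, Tuple

Arg = Tuple[str, str]  # (phrase type, role), e.g. ("NP", "AGENT")


def render(before: List[Arg], after: List[Arg], voice: str, pred: str,
           show_type: bool, show_role: bool) -> str:
    """Render one template: arguments around a voice:predicate marker."""
    def item(arg: Arg) -> str:
        phrase, role = arg
        parts = ([phrase] if show_type else []) + ([role] if show_role else [])
        return "-".join(parts)

    tokens = [item(a) for a in before] + [f"{voice}:{pred}"] + [item(a) for a in after]
    return "[" + " ".join(tokens) + "]"


def core_sequence_features(before: List[Arg], after: List[Arg],
                           voice: str, predicate: str) -> List[str]:
    """Unlexicalized ('pred') and lexicalized templates at three levels of detail."""
    feats = []
    for pred in ("pred", predicate):
        feats.append(render(before, after, voice, pred, True, True))   # type + role
        feats.append(render(before, after, voice, pred, False, True))  # role only
        feats.append(render(before, after, voice, pred, True, False))  # type only
    return feats


# "[AGENT The company] offered [PATIENT a 20% stake] [BENEFICIARY to the public]"
feats = core_sequence_features(
    before=[("NP", "AGENT")],
    after=[("NP", "PATIENT"), ("PP-to", "BENEFICIARY")],
    voice="active",
    predicate="offer",
)
for f in feats:
    print(f)
```

The first three strings printed correspond to the three bracketed templates listed under "Core Argument Sequence"; the last three replace `pred` with `offer`.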
Joint Results and Improvements
• Improvement doesn’t match that obtained with gold parses (Toutanova, 05)
• Argument Identification Bottleneck
                               Flat Model   Joint Model   Error Reduction
Dev Set F-Measure              74.52        76.71         8.6 %
Dev Set Whole Frame Accuracy   51.02 %      54.92 %       7.1 %
Using Multiple Trees
• Argument identification sensitive to parser errors
• PP attachment, coordination, etc.
• Path feature becomes very noisy
• Use Top K trees (Charniak Parser ‘05)
• For top local assignments and trees, choose the assignment and tree to maximize:
[Equation not preserved in this transcript]
• Only a small boost in performance
Dealing with Dislocations
• Argument dislocation via control, subject raising, etc.
• IsMissingSubject and Path
• For local with overlap: 73.80 to 74.52
• AGENT improvement: 81.02 to 83.08
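[Figure: parse tree for "The trade gap is expected to widen"; the subject NP-i "The trade gap" is coindexed with an empty (-NONE-) NP-i inside the embedded clause "to widen"]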
Final Results
              F-Measure   Whole Frame
Test WSJ      78.45       56.52 %
Test Brown    67.71       37.06 %
Combined      77.04       44.83 %
• Generalizing to other domains
Thanks!
Why hasn’t it been done?
• Exponential blowup!
• A normal-sized tree in the Wall Street Journal will have about 40 internal nodes to be classified
• About 1 trillion possible assignments (binary ARG/NONE)
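The trillion figure follows from treating each of the roughly 40 nodes as an independent binary ARG/NONE choice:

```latex
\[
2^{40} = 1{,}099{,}511{,}627{,}776 \approx 1.1 \times 10^{12}
\]
```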
What we’d like to capture
• Model predicate’s argument preferences
  • Bad: no core arguments, 10 core arguments
  • Verb-specific rules: require A0 or A1 args
• Model dependencies between labels and features of argument sequence
  • Discourage repeated arguments
  • Model syntactic alternations: [NP-A0, gave, NP-A2, NP-A1] and [NP-A0, gave, NP-A1, PP-to-A2]
• Principled Parameter Estimation
Previous Work: Local Classifiers
• Extract features and classify each node independently
[Figure: parse tree for "Harry Potter gave the Dursleys a lesson in magic last week"; each candidate node is classified separately]
Features for the node "the Dursleys":
• Phrase Type: NP
• Path: NP-up-VP-down V
• Head Word: Dursleys
Problems with Local Classifiers
• Core argument frame strongly interdependent
  • Hard constraints: no overlapping arguments
  • Soft constraints: A0 occurs before A1, A2, etc.
• Doesn’t capture statistical tendencies in core argument sequences and their syntactic realization
[Figure: parse tree for "Harry Potter gave the Dursleys a lesson in magic last week" with node labels; labels shown: NP-A0 (twice), NP-A1, NP-TMP]