Corpus-based Induction of an LFG Syntax-Semantics Interface for Frame Semantic Processing
Anette Frank, Jiří Semecký
LFG 2004, Christchurch, July 10, 2004
Overview
State of the art: Frame Semantics and the FrameNet project; the SALSA frame annotation project; an LFG syntax-semantics interface for Frame Semantics
Our work: porting SALSA frame annotations to LFG; special phenomena; extraction of frame assignment rules
Conclusion: current data and results; summary; next steps [and application]
Frame Semantics
Frame Semantics (Fillmore 1976, 1977, ...)
  Frame: a conceptual structure or prototypical situation
  Example: "SPD requests that coalition talk about reform." evokes the frame REQUEST, with frame elements (frame semantic roles) that identify the participants:
    SPEAKER: SPD
    ADDRESSEE: coalition
    MESSAGE: talk about reform
  Frame evoking elements (verbs, nouns, adjectives, ...) introduce frames
FrameNet
  Berkeley FrameNet II Project: a database of frames for a lexicon of English
  Definition of frames and frame semantic roles
  Inheritance relations among frames
  Selected and manually annotated example sentences
SALSA: Saarbrücken Lexical Semantics Annotation and Analysis Project
German FrameNet "light"
  Creating a large semantically annotated corpus of German
  Building on the FrameNet database definitions of frames and roles
  Strongly corpus-oriented
Methods and aims
  Manual annotation on top of the syntactically annotated TIGER corpus
  (Semi-)automatic semantic annotation of larger corpora
  Automatic acquisition of a lexical semantic resource
  Semantics-based information access in NLP applications
Focus of our work
  Induction of an LFG syntax-semantics interface for frame semantics from a manually annotated corpus
SALSA Example
SPD fordert Koalition zu Gespräch über Reform auf.
"SPD requests that coalition talk about reform."
TIGER: newspaper corpus, 1.5 million words
TIGER annotation scheme
  Syntactic constituents
  Functional role labels (SB, HD, ...)
  Crossing edges (word order)
SALSA frame annotation
  The frame evoking element (FEE), here "fordert ... auf", projects the frame
  Frame elements (FEs) of the frame are connected to syntactic constituents
From SALSA to LFG
Automatic semantic frame assignment
  Broad-coverage grammar
  High accuracy
  Portability of manual SALSA/TIGER frame annotations
German LFG grammar (IMS, Univ. Stuttgart)
  Used for TIGER annotation: 50% coverage, 70% precision
  Further extension of coverage
  OT-based and statistical disambiguation
A general syntax-semantics interface
  LFG f-structures provide a good level of abstraction
  PARGRAM: common f-structure design principles for different languages allow the study of generalizations across languages
An LFG Frame Semantics Projection
Projection from f-structure
SPD fordert Koalition zu Gespräch über Reform auf.
"SPD requests that coalition talk about reform."
An LFG Frame Semantics Projection
Co-description: lexicon entry for frame projection
  auffordern V, (↑PRED) = ‘AUFFORDERN <(↑SUBJ) (↑OBJ) (↑OBL OBJ)>’
  ...
  (σ(↑) FRAME) = REQUEST
  (σ(↑) FEE) = (↑ PRED FN)
  (σ(↑) SPEAKER) = σ(↑ SUBJ)
  (σ(↑) ADDRESSEE) = σ(↑ OBJ)
  (σ(↑) MESSAGE) = σ(↑ OBL OBJ)
Description by analysis: transfer rule for frame projection
  pred(X, auffordern), subj(X, A), obj(X, B), obl(X, C), obj(C, D)
  ==>
  +σ(X, SemX), +frame(SemX, request), +fee(SemX, auffordern),
  +σ(A, SemA), +speaker(SemX, SemA),
  +σ(B, SemB), +addressee(SemX, SemB),
  +σ(D, SemD), +message(SemX, SemD).
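The effect of such a frame projection rule can be illustrated with a minimal Python sketch. The dictionary encoding of the f-structure, the node names (`f0` to `f4`), and the helper names `follow` and `project_request` are assumptions made for illustration only; they are not the XLE transfer system's format or API.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical encoding: each f-structure node maps attribute names either
# to an atomic value (PRED) or to the id of another node.
FStructure = Dict[str, Dict[str, str]]

# Role -> functional path from the FEE node, as in the auffordern rule.
REQUEST_ROLES: Dict[str, Tuple[str, ...]] = {
    "speaker": ("SUBJ",),
    "addressee": ("OBJ",),
    "message": ("OBL", "OBJ"),
}


def follow(fs: FStructure, node: str, path: Tuple[str, ...]) -> Optional[str]:
    """Follow a path of grammatical functions; return None if it breaks."""
    current: Optional[str] = node
    for attr in path:
        current = fs.get(current, {}).get(attr)
        if current is None:
            return None
    return current


def project_request(fs: FStructure) -> List[Tuple[str, str, str]]:
    """Emit frame facts for every node whose PRED is 'auffordern'."""
    facts: List[Tuple[str, str, str]] = []
    for node, attrs in fs.items():
        if attrs.get("PRED") != "auffordern":
            continue
        targets = {role: follow(fs, node, path)
                   for role, path in REQUEST_ROLES.items()}
        if None in targets.values():
            continue  # like the transfer rule: all functions must be present
        facts.append(("frame", node, "request"))
        facts.append(("fee", node, "auffordern"))
        for role, target in targets.items():
            facts.append((role, node, target))
    return facts


# Toy f-structure for "SPD fordert Koalition zu Gespräch über Reform auf."
example: FStructure = {
    "f0": {"PRED": "auffordern", "SUBJ": "f1", "OBJ": "f2", "OBL": "f3"},
    "f1": {"PRED": "SPD"},
    "f2": {"PRED": "Koalition"},
    "f3": {"PRED": "zu", "OBJ": "f4"},
    "f4": {"PRED": "Gespräch"},
}
facts = project_request(example)
```

In this sketch the rule fires only when every grammatical function on its left-hand side is found, which mirrors the all-or-nothing matching of the transfer rule.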
Corpus-based induction of frame assignment rules
Step 1: Porting SALSA annotations to LFG
  Using the "parallel" LFG corpus of TIGER
  To obtain an LFG-frame corpus
Step 2: Induction of general frame assignment rules from the LFG-frame corpus
  Can be applied to the f-structure output of LFG parsing of new sentences
Porting SALSA Annotations to LFG
Extracting frame-constituting information from SALSA/TIGER annotations: the FRAME and the TIGER constituent identifiers (IDs) of the FEE and the FEs
Frame evoking elements (FEE) and frame elements (FE) are connected to syntactic constituents identified by IDs
  FRAME: Request
  FEE ID: {2, 8}
  SPEAKER: 1
  ADDRESSEE: 3
  MESSAGE: 501
Porting SALSA Annotations to LFG
"Parallel" TIGER corpus consisting of automatically derived LFG f-structures (Forst 2003)
  Derived using treebank conversion methods
  Preserves TIGER constituent information (IDs)
Porting SALSA Annotations to LFG: an LFG corpus with frame semantic projection
  Identify the f-structure nodes of the FEE and FEs, using IDs as anchors
  Define the semantic projection for the frame and all its frame elements
  Using rewrite rules of the XLE transfer system
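The ID-anchored porting step can be sketched in Python as follows. This is only an illustration under stated assumptions: the function name `port_frame`, the `id_to_fnode` mapping, and the f-structure node names are hypothetical, and the actual porting is done with XLE transfer rewrite rules rather than Python.

```python
from typing import Dict, Iterable, Optional


def port_frame(
    frame: str,
    fee_ids: Iterable[int],
    role_ids: Dict[str, int],
    id_to_fnode: Dict[int, str],
) -> Optional[Dict[str, str]]:
    """Anchor a SALSA frame annotation to f-structure nodes via TIGER IDs.

    Returns None when an ID has no f-structure counterpart or when the FEE
    constituents do not resolve to exactly one node (assumed failure cases).
    """
    fee_nodes = {id_to_fnode.get(i) for i in fee_ids}
    if None in fee_nodes or len(fee_nodes) != 1:
        return None
    ported = {"FRAME": frame, "FEE": fee_nodes.pop()}
    for role, tiger_id in role_ids.items():
        node = id_to_fnode.get(tiger_id)
        if node is None:
            return None
        ported[role] = node
    return ported


# Hypothetical ID-to-node mapping: the verb (2) and its separated
# particle (8) are assumed to share one f-structure node.
id_to_fnode = {1: "f1", 2: "f0", 3: "f2", 8: "f0", 501: "f3"}

ported = port_frame(
    "Request",
    fee_ids=[2, 8],
    role_ids={"SPEAKER": 1, "ADDRESSEE": 3, "MESSAGE": 501},
    id_to_fnode=id_to_fnode,
)
```

The rules obtained this way are tied to node IDs of one particular sentence, which is why a separate generalization step is needed before they can be applied to new parses.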
Special Phenomena
Multiple constituents Asymmetric embedding
Coordination
Multiword expressions
Underspecification
Special Phenomena: Multiword Expressions
  An idiomatic expression evokes a frame for its non-literal meaning: "über die Ladentheke gehen" ("sell")
  Project the individual components to a set-valued FEE-MWE
Vier Artikel gingen über die Ladentheke.
Literally: "Four items went over the counter."
"Four items were sold."
Corpus-based induction of frame rules
Step 1: Porting SALSA annotations to LFG
  Using the "parallel" LFG corpus of TIGER
  To obtain an LFG-frame corpus
  Rules anchored to node IDs
Step 2: Induction of general frame assignment rules from the LFG-frame corpus
  Can be applied to the f-structure output of LFG parsing of new sentences
  Rules anchored to functional descriptions
FE assignment (auffordern)
  (SUBJ) – SPEAKER
  (OBJ) – ADDRESSEE
  (OBL OBJ) – MESSAGE
Extraction of Functional Paths
FE assignment paths
  Paths are relative to the FEE
  Local and non-local paths: non-local = with an inside-out relative path
  Prefer local to non-local
SPD verspricht Wählern, Beschlüsse mitzuteilen.
"SPD promises voters to report decisions."
Relative f-paths from the FEE:
  Role      Relative f-path    Type
  MESSAGE   OBJ                local
  SPEAKER   (XCOMP ↑) SUBJ     non-local
  SPEAKER   SUBJ               local
Extraction of Functional Paths
  Prefer local to non-local paths: for SPEAKER, choose SUBJ
  Among ambiguous non-local paths, choose the "shortest non-local sub-path": prefer (XCOMP ↑) SUBJ to (XCOMP XCOMP ↑) SUBJ
  Non-local paths of equal length are considered equally good: choose both (XCOMP ↑) OBJ and (ADJ ↑) OBJ
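These preferences can be written down as a small selection function. The sketch below is an assumption-laden illustration: a path is encoded as a pair (inside-out part, outside-in part), "shortest" is read as the length of the inside-out part, and the name `select_paths` is hypothetical.

```python
from typing import List, Tuple

# A relative f-path: (inside-out functions, outside-in functions).
# An empty inside-out part means the path is local to the FEE.
Path = Tuple[Tuple[str, ...], Tuple[str, ...]]


def select_paths(candidates: List[Path]) -> List[Path]:
    """Pick the preferred path(s) for one frame element."""
    if not candidates:
        return []
    # 1. Prefer local paths to non-local ones.
    local = [p for p in candidates if not p[0]]
    if local:
        return local
    # 2. Among non-local paths, keep those with the shortest inside-out
    #    part; paths of equal length are all kept.
    shortest = min(len(p[0]) for p in candidates)
    return [p for p in candidates if len(p[0]) == shortest]


# SPEAKER: local SUBJ wins over (XCOMP ↑) SUBJ.
speaker = select_paths([(("XCOMP",), ("SUBJ",)), ((), ("SUBJ",))])

# Two non-local candidates: the shorter inside-out part wins.
shorter = select_paths([
    (("XCOMP", "XCOMP"), ("SUBJ",)),
    (("XCOMP",), ("SUBJ",)),
])

# Equal length: both are kept.
ties = select_paths([(("XCOMP",), ("OBJ",)), (("ADJ",), ("OBJ",))])
```

Keeping all equally short non-local paths means a single frame element can give rise to several alternative rules, which is one source of the ambiguity that later disambiguation would have to resolve.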
Applying rules to new sentences
mitteilen: COMMUNICATION; SUBJ SPEAKER, OBJ MESSAGE
Complete frames with all frame elements, as instantiated in the corpus
  pred(X, mitteilen), 'SUBJ'(X, A), 'OBJ'(X, B)
  ==>
  +σ(X, SemX), +frame(SemX, communication), +fee(SemX, mitteilen),
  +σ(A, SemA), +speaker(SemX, SemA),
  +σ(B, SemB), +message(SemX, SemB).
Problem: unseen configurations (sparse data problem)
Partial annotation
  Individual rules for the FEE
    pred(X, mitteilen)
    ==>
    +σ(X, SemX), +frame(SemX, communication), +fee(SemX, mitteilen).
  Individual rules for each FE of the frame (conditioned on the FEE)
    pred(X, mitteilen), σ(X, SemX), frame(SemX, communication), 'SUBJ'(X, A)
    ==>
    +σ(A, SemA), +speaker(SemX, SemA).
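The difference between full-frame and partial rules can be illustrated with a short Python sketch. The rule dictionary, the node encoding, and the function names `apply_full` and `apply_partial` are assumptions for illustration; they stand in for the transfer rules and are not the XLE implementation.

```python
from typing import Dict, Optional

# Hypothetical rule extracted for mitteilen, as given on the slide:
# COMMUNICATION; SUBJ -> SPEAKER, OBJ -> MESSAGE.
RULE = {
    "pred": "mitteilen",
    "frame": "communication",
    "roles": {"SUBJ": "speaker", "OBJ": "message"},
}


def apply_full(node: Dict[str, str], rule: dict) -> Optional[Dict[str, str]]:
    """Full-frame rule: fires only if every grammatical function is present."""
    if node.get("PRED") != rule["pred"]:
        return None
    if not all(gf in node for gf in rule["roles"]):
        return None
    result = {"frame": rule["frame"], "fee": rule["pred"]}
    for gf, role in rule["roles"].items():
        result[role] = node[gf]
    return result


def apply_partial(node: Dict[str, str], rule: dict) -> Optional[Dict[str, str]]:
    """Partial rules: one rule for the FEE, then one per FE that is found."""
    if node.get("PRED") != rule["pred"]:
        return None
    result = {"frame": rule["frame"], "fee": rule["pred"]}
    for gf, role in rule["roles"].items():
        if gf in node:  # each FE rule applies independently
            result[role] = node[gf]
    return result


# Unseen configuration: mitteilen with a subject but no object.
unseen = {"PRED": "mitteilen", "SUBJ": "f1"}
full_result = apply_full(unseen, RULE)
partial_result = apply_partial(unseen, RULE)
```

Here the full rule yields nothing for the unseen configuration, while the partial rules still assign the frame and the SPEAKER role.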
Current Data and Results
Data used:
  12127 frame assignment rules
  10009 sentences
Successfully ported frames: 11612
Compiled transfer rules after path extraction: 9334
Local vs. non-local FE assignments: 87.18% vs. 12.82%
Ambiguity rate:
  Average 8.83 rules per FEE
  Average 41.27 rules per frame
Current Data and Results
Re-applying syntax-semantics mapping rules to the TIGER-LFG corpus
                   Recall     Precision   Ambiguity rate / sentence
  Full frame       93.98 %    25.94 %     8.46
  Partial frame    94.98 %    45.52 %     7.83
Applying syntax-semantics mapping rules to free LFG parsing (without statistical disambiguation)
                   Recall     Precision   Ambiguity rate / sentence
  Full frame       52.21 %    6.93 %      13.35
  Partial frame    76.41 %    18.32 %     15.79
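The slides do not state how recall and precision were computed. Assuming the standard set-based definitions, the following Python sketch shows why a high ambiguity rate tends to go with high recall and low precision. The function name, the sentence id, and the competing frame names are all invented for illustration and are not taken from the data.

```python
from typing import Set, Tuple

Assignment = Tuple[str, str]  # (sentence id, frame) - hypothetical encoding


def precision_recall(gold: Set[Assignment],
                     predicted: Set[Assignment]) -> Tuple[float, float]:
    """Standard precision and recall over sets of frame assignments."""
    correct = len(gold & predicted)
    precision = correct / len(predicted) if predicted else 0.0
    recall = correct / len(gold) if gold else 0.0
    return precision, recall


# One gold frame, four competing rule outputs for the same sentence.
gold = {("s1", "communication")}
predicted = {
    ("s1", "communication"),
    ("s1", "statement"),
    ("s1", "telling"),
    ("s1", "reporting"),
}
scores = precision_recall(gold, predicted)
```

In this toy case the correct frame is among the outputs, so recall is perfect, but three of the four alternatives are wrong, so precision is low.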
Summary
  Modeling frame semantics in the LFG framework
  Porting frame annotations from TIGER/SALSA to an LFG corpus
  Extracting general frame assignment rules for LFG parsing
  Applying frame assignment rules in an LFG parsing architecture
Next steps
Semantically driven syntactic disambiguation
  Reduce ambiguity of syntactic parses
  Prefer parses with corresponding semantic annotation
Stochastic modeling for semantic role assignment
  Training stochastic models on the basis of corpus annotations
  For disambiguation of disjunctive frame assignments
  XLE: statistical ME package for training and online disambiguation