Arabic Syntactic Trees: from Constituency to Dependency
Zdeněk Žabokrtský, Otakar Smrž
Center for Computational Linguistics, Faculty of Mathematics and Physics, Charles University in Prague
April 15, 2003 Arabic Syntactic Trees: from Constituency to Dependency 2
Motivation & Background
  Linguistic Data Consortium Arabic Treebank
    Constituent-syntax bracketing
    ~100k words published
    Modification from English to Arabic
  Prague Arabic Dependency Treebank
    Dependency approach to syntax
    ~50k words in progress
    Pre-step to tectogrammatical description
  Motivation: co-operation and resource exchange
  Our goal: transform the data from one annotation scheme to the other

Constituency vs. Dependency
  Constituency (Linguistic Data Consortium, University of Pennsylvania)
    Non-terminal nodes + Text tokens
    Constituent labeling on non-terminals
    Slots and traces
  Dependency (CCL & IFAL & ICL, Charles University in Prague)
    Sentence root node + Text tokens
    Analytical function for every tree node
    Government and roles

Model Arabic Phrase I
  Trace of the antecedent subject
  Compound function of the head of the clause – outer and inner perspectives
  Free word-order compliant

Outline of the Transformation
1. Build temporary dependency tree
   Contraction of the input phrase-structure tree
   Uniquely determined by head selection function
   Implementation: simple recursive procedure
2. Create analytical tree topology
   Post-processing (corrections) of the temporary dependency tree, e.g., substituting traces with their coindexed fillers
   Re-arrangement of special complex constructs
3. Assign analytical functions
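
Step 1 above can be illustrated with a minimal sketch of the recursive contraction. It is not the authors' implementation: the `Node` class, the token indices, the use of 0 for the artificial sentence root, and the default `rightmost` selector are all assumptions made for illustration. Traces and the corrections of step 2 are not handled.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional


@dataclass
class Node:
    """Hypothetical phrase-structure node; a terminal carries a token index."""
    label: str
    index: Optional[int] = None  # 1-based token position, terminals only
    children: List["Node"] = field(default_factory=list)


def rightmost(children: List[Node]) -> Node:
    """Placeholder head selector (assumption): always the rightmost child."""
    return children[-1]


def contract(node: Node, parent_of: Dict[int, int],
             select_head: Callable[[List[Node]], Node] = rightmost) -> int:
    """Contract the subtree under `node` and return its lexical head index.

    Every non-head child's lexical head is attached to the lexical head
    of the child chosen by `select_head`.
    """
    if not node.children:
        return node.index
    head_child = select_head(node.children)
    child_heads = [(c, contract(c, parent_of, select_head)) for c in node.children]
    head_index = next(h for c, h in child_heads if c is head_child)
    for child, h in child_heads:
        if child is not head_child:
            parent_of[h] = head_index
    return head_index


def to_dependency(root: Node,
                  select_head: Callable[[List[Node]], Node] = rightmost) -> Dict[int, int]:
    """Return a map from each token index to its governor (0 = sentence root)."""
    parent_of: Dict[int, int] = {}
    top = contract(root, parent_of, select_head)
    parent_of[top] = 0
    return parent_of


# Toy tree with three tokens: (S (NP t1 t2) (VP t3))
tree = Node("S", children=[
    Node("NP", children=[Node("DET", index=1), Node("NOUN", index=2)]),
    Node("VP", children=[Node("VERB", index=3)]),
])
dependencies = to_dependency(tree)
```

Because the head choice is the only free parameter, the resulting tree is fully determined by the head selection function, as the slide states.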

Head Selection Function
  For each constituent, select the head constituent among its children
  Based on (ordered) handcrafted rules
  Examples:
    If there is a node with tag=PREP among the children, then it is the head
    If there is a node with phrase_label=VP among the children, then it is the head
    ... etc ...
  If nothing was selected by the rules, then the rightmost child is selected
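
The ordered rules with a rightmost-child fallback can be sketched as follows. Only the two rules quoted on the slide are encoded; the `Child` class and the choice of the leftmost match when several children satisfy the same rule are assumptions, since the slides do not specify them.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Child:
    """Hypothetical child constituent with the two attributes the rules test."""
    tag: Optional[str] = None           # POS tag of a terminal, e.g. "PREP"
    phrase_label: Optional[str] = None  # label of a non-terminal, e.g. "VP"


# Ordered rules: an earlier rule takes priority over a later one.
HEAD_RULES: List[Callable[[Child], bool]] = [
    lambda c: c.tag == "PREP",
    lambda c: c.phrase_label == "VP",
    # further rules are elided on the slide and are not reconstructed here
]


def select_head(children: List[Child]) -> Child:
    """Return the head child according to the first rule that matches."""
    for rule in HEAD_RULES:
        for child in children:  # assumption: leftmost match wins
            if rule(child):
                return child
    return children[-1]  # fallback from the slide: rightmost child


pp = [Child(tag="PREP"), Child(phrase_label="NP")]
clause = [Child(phrase_label="NP"), Child(phrase_label="VP"), Child(phrase_label="NP")]
np = [Child(tag="NOUN"), Child(tag="ADJ")]

pp_head = select_head(pp)          # the PREP child
clause_head = select_head(clause)  # the VP child
np_head = select_head(np)          # no rule fires, so the rightmost child
```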

Analytical Function Assignment
  Based on (ordered) handcrafted rules and lexical lists
  Completes the process, does not override previous assignments
  Examples:
    phrase_label=NP-SBJ → afun=Sb
    lemma=wa- → afun=Coord
    pos_tag=CONJ → afun=AuxC
    ... etc ...
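
A minimal sketch of such a rule pass, assuming each tree node is a plain dictionary of attributes (a hypothetical representation). It encodes only the three example rules from the slide, applies them in order, and skips any node that already carries an afun.

```python
from typing import Callable, Dict, List, Optional, Tuple

TreeNode = Dict[str, Optional[str]]  # hypothetical node representation

# Ordered (condition, afun) pairs; only the slide's three examples are listed.
AFUN_RULES: List[Tuple[Callable[[TreeNode], bool], str]] = [
    (lambda n: n.get("phrase_label") == "NP-SBJ", "Sb"),
    (lambda n: n.get("lemma") == "wa-", "Coord"),
    (lambda n: n.get("pos_tag") == "CONJ", "AuxC"),
]


def assign_afuns(nodes: List[TreeNode]) -> List[TreeNode]:
    """Fill in missing afuns in place; earlier assignments are never overridden."""
    for node in nodes:
        if node.get("afun"):
            continue  # set by a previous step, keep it
        for matches, afun in AFUN_RULES:
            if matches(node):
                node["afun"] = afun
                break  # first matching rule wins
    return nodes


nodes = [
    {"lemma": "wa-", "pos_tag": "CONJ"},     # lemma rule precedes pos_tag rule
    {"phrase_label": "NP-SBJ"},
    {"pos_tag": "CONJ", "afun": "Coord"},    # already assigned, left untouched
    {"pos_tag": "NOUN"},                     # no rule applies
]
afuns = [n.get("afun") for n in assign_afuns(nodes)]
```

Rule order matters here: the first node matches both the lemma rule and the part-of-speech rule, and the earlier one decides.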

Model Arabic Phrase II
  Sister-like co-ordination
  Conjunction of co-ordination
  Status constructus

Model Arabic Phrase III
  Non-expressed subject (?)
  Complex modality constructs
  Principal discrepancies between descriptions – both in topology and labeling

Model Arabic Sentence
  Wa lam yakun mina ’s-sahli `alay hi muwāğahatu kāmīrāti ’t-tilfizyūni wa `adasāti ’l-muşawwirīna wa huwa yaş`adu ’l-bāşa.
  It was not easy for him to face the television cameras and the lenses of photographers as he was getting on the bus.

Constituency Annotation

Dependency Annotation

Evaluation & Conclusion
  Implementation still in progress, fine-tuning needed
  10,000 words manually annotated in both styles
    ~60% of dependencies aimed correctly
  2nd Prague Penn Arabic Treebanking Workshop, May 2003 in Prague
  Transfer from dependency to constituency?
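
The slides do not define how "correctly aimed dependencies" is counted. One plausible reading, assumed here, is the share of tokens whose governor in the converted tree matches the manual dependency annotation. The function name and the toy data below are illustrative only and are not taken from the evaluation.

```python
from typing import Dict


def attachment_accuracy(gold: Dict[int, int], predicted: Dict[int, int]) -> float:
    """Fraction of tokens whose predicted governor equals the gold governor.

    Both maps go from token index to governor index (0 = sentence root).
    """
    if not gold:
        raise ValueError("gold annotation is empty")
    correct = sum(1 for token, governor in gold.items()
                  if predicted.get(token) == governor)
    return correct / len(gold)


# Toy five-token sentence: tokens 3 and 5 are attached to the wrong governor.
gold = {1: 2, 2: 0, 3: 2, 4: 3, 5: 4}
predicted = {1: 2, 2: 0, 3: 4, 4: 3, 5: 2}
score = attachment_accuracy(gold, predicted)  # 3 of 5 governors agree
```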

Related Work
  New tool for assignment of analytical functions
    Based on machine learning (C5-trained decision trees)
    Error rate 17% (supposing the topology of the tree is correct)
  First experiments with Arabic dependency parser
  Incorporated into the process of annotation of Prague Arabic Dependency Treebank