The Exploration of Deterministic and Efficient Dependency Parsing
Yu-Chieh Wu, Yue-Shi Lee, Jie-Chi Yang
National Central University, Taiwan
Ming Chuan University, Taiwan
Date: 2006/6/8
Reporter: Yu-Chieh Wu
Context
• Nivre’s method is a LINEAR-TIME parsing algorithm
• But it assumes projective dependency relations in the text
  – One solution is to apply pseudo-projectivization (Nivre and Nilsson, 2005)
• In addition, non-projective words and roots are left in the stack
  – Un-parsed words
• In the multilingual scenario, some languages annotate labels for roots
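The projectivity assumption in the list above can be made concrete with a small check. This is a minimal sketch, not code from the presented system: the function name `is_projective` and the head-list encoding (token numbering from 1, with 0 as an artificial root) are assumptions chosen for illustration.

```python
from typing import List


def is_projective(heads: List[int]) -> bool:
    """Return True if no two dependency arcs cross.

    heads[k] is the head of token k+1 (tokens are numbered from 1);
    0 stands for an artificial root placed before the sentence.
    This encoding is an assumption made for the example.
    """
    # Store every arc as a (left, right) span over token positions.
    arcs = [(min(h, d), max(h, d)) for d, h in enumerate(heads, start=1)]
    for l1, r1 in arcs:
        for l2, r2 in arcs:
            # Two arcs cross when the second starts strictly inside the
            # first and ends strictly outside it.
            if l1 < l2 < r1 < r2:
                return False
    return True
```

For example, `is_projective([2, 0, 2])` holds for a three-word sentence whose middle word is the root, while `is_projective([3, 4, 0, 3])` does not, because the arcs 3→1 and 4→2 cross.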
In this paper
• Improve the time efficiency of Nivre’s method
• DO NOT scan the word sequence multiple times
  – Perform Nivre’s algorithm
  – Focus only on the “UN-PARSED” words
• Efficiently label the roots
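To illustrate why a single left-to-right pass is linear and why some words end up un-parsed, here is a minimal sketch of a deterministic stack-based parser in the arc-eager style usually associated with Nivre (2003). It is not the authors' implementation. The names `arc_eager_parse` and `gold_oracle` are hypothetical, and the `choose` callback stands in for the trained classifier; in the toy usage it is replaced by an oracle that reads the answer from a gold tree.

```python
from typing import Callable, Dict, List, Tuple

# choose(stack, next_word, heads) returns "LEFT", "RIGHT", "REDUCE" or "SHIFT".
Chooser = Callable[[List[int], int, Dict[int, int]], str]


def arc_eager_parse(n_words: int, choose: Chooser) -> Tuple[Dict[int, int], List[int]]:
    """Parse words 0..n_words-1 in one pass; return (heads, un-parsed words)."""
    stack: List[int] = []
    heads: Dict[int, int] = {}
    nxt = 0  # index of the next input word
    # Every iteration either pops the stack or consumes one input word,
    # so the loop runs at most 2 * n_words times.
    while nxt < n_words:
        action = choose(stack, nxt, heads) if stack else "SHIFT"
        if action == "LEFT" and stack[-1] not in heads:
            heads[stack.pop()] = nxt      # next word becomes head of stack top
        elif action == "RIGHT":
            heads[nxt] = stack[-1]        # stack top becomes head of next word
            stack.append(nxt)
            nxt += 1
        elif action == "REDUCE" and stack[-1] in heads:
            stack.pop()
        else:                             # SHIFT, also the fallback for illegal actions
            stack.append(nxt)
            nxt += 1
    # Words still on the stack without a head are the un-parsed words.
    unparsed = [w for w in stack if w not in heads]
    return heads, unparsed


def gold_oracle(gold: Dict[int, int]) -> Chooser:
    """Toy stand-in for the learner: pick actions from a known gold tree."""
    def choose(stack: List[int], nxt: int, heads: Dict[int, int]) -> str:
        top = stack[-1]
        if gold.get(top) == nxt:
            return "LEFT"
        if gold.get(nxt) == top:
            return "RIGHT"
        deeper = stack[:-1]
        if top in heads and any(gold.get(nxt) == s or gold.get(s) == nxt for s in deeper):
            return "REDUCE"
        return "SHIFT"
    return choose
```

With the gold tree `{0: 1, 1: 2, 3: 2}` over four words (word 2 is the root and has no entry), `arc_eager_parse(4, gold_oracle({0: 1, 1: 2, 3: 2}))` recovers all three arcs and reports word 2 as un-parsed, since a root never receives a head and stays on the stack.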
System Overview
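[Figure: pipeline of three components, each with its own learner: Nivre’s Parser (Learner 1), Root Parser (Learner 2), Post-Processor (Learner 3). Un-parsed text enters the pipeline, un-parsed words are passed on between components, and parsed text is the output.]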
Our solution is…
• To reduce the un-parsed rate
  – We parse in both directions
    • Forward parsing
    • Backward parsing (usually better)
• To classify the remaining words in the stack
  – A root parser identifies whether each word is
    • A root (including its root label) or not a root
• To re-connect the non-projective words
  – A post-processor re-constructs the arcs
    • Exhaustive search from the sentence start
  – Regardless of its children
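The last two steps above can be sketched as one pass over the un-parsed words. This is only an illustration under stated assumptions: the slides do not give the decision rules, so `root_label` (standing in for the root parser's learner) and `accepts_head` (standing in for the post-processor's learner) are hypothetical callbacks, as are the function name `resolve_unparsed`, the `ROOT` marker, and the choice to attach to the first accepted candidate.

```python
from typing import Callable, Dict, List, Optional, Tuple

ROOT = -1  # marker used in this sketch for "attached to the artificial root"


def resolve_unparsed(
    n_words: int,
    heads: Dict[int, int],
    unparsed: List[int],
    root_label: Callable[[int], Optional[str]],
    accepts_head: Callable[[int, int], bool],
) -> Tuple[Dict[int, int], Dict[int, str]]:
    """Classify each un-parsed word as root or non-root, then re-attach non-roots.

    root_label(w) returns a root label, or None if w is judged not to be a root.
    accepts_head(h, w) says whether h should be the head of w.
    Both are hypothetical stand-ins for trained classifiers.
    """
    heads = dict(heads)  # leave the caller's dictionary untouched
    labels: Dict[int, str] = {}
    for w in unparsed:
        label = root_label(w)
        if label is not None:
            heads[w] = ROOT
            labels[w] = label
            continue
        # Exhaustive scan of head candidates from the sentence start.
        # Candidates are not filtered by the word's existing children,
        # which is one possible reading of "regardless of its children".
        for cand in range(n_words):
            if cand != w and accepts_head(cand, w):
                heads[w] = cand
                break
    return heads, labels
```

Because only the words left on the stack are revisited, the extra work is proportional to the number of un-parsed words times the sentence length rather than a second full parse.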
Statistics of un-parsed rate (percentage)
Un-Parsed Rate

Language     Forward  Backward
Arabic       10.3     1.4
Chinese      4.01     2.3
Czech        16.1     5.6
Danish       12.8     2.5
Dutch        18.4     9.8
German       12.7     9.2
Japanese     1.1      4.4
Portuguese   24.3     3.17
Slovene      14.9     5.5
Spanish      20       0.5
Swedish      19.1     2.8
Turkish      2.5      4
Bulgarian    15.7     1.2
AVG          13.22    4.02
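The averages and the "backward is usually better" observation can be checked directly from the figures in the table. The snippet below is a small sketch that only re-enters the table values; the variable names are chosen for illustration.

```python
# Un-parsed rate (percent) per language: (forward, backward), copied from the table.
rates = {
    "Arabic": (10.3, 1.4),
    "Chinese": (4.01, 2.3),
    "Czech": (16.1, 5.6),
    "Danish": (12.8, 2.5),
    "Dutch": (18.4, 9.8),
    "German": (12.7, 9.2),
    "Japanese": (1.1, 4.4),
    "Portuguese": (24.3, 3.17),
    "Slovene": (14.9, 5.5),
    "Spanish": (20.0, 0.5),
    "Swedish": (19.1, 2.8),
    "Turkish": (2.5, 4.0),
    "Bulgarian": (15.7, 1.2),
}

forward_avg = sum(f for f, _ in rates.values()) / len(rates)
backward_avg = sum(b for _, b in rates.values()) / len(rates)

# Languages where backward parsing leaves fewer words un-parsed.
backward_better = [lang for lang, (f, b) in rates.items() if b < f]
forward_better = [lang for lang, (f, b) in rates.items() if f < b]

print(f"forward avg = {forward_avg:.2f}, backward avg = {backward_avg:.2f}")
print("forward better for:", forward_better)
```

The computed averages agree with the AVG row to within 0.01. Backward parsing has the lower un-parsed rate for 11 of the 13 languages; Japanese and Turkish are the two exceptions.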
Root Parser
For each un-parsed word (position i), the features are:

Position   Features
i-2        Word, Bigram, BiPOS
i-1        Word, Bigram, BiPOS
i          Word, Bigram, BiPOS
i+1        Word, Bigram, BiPOS
i+2        Word, Bigram, BiPOS
ChildR     Bigram, BiPOS
Child0     Bigram, BiPOS
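A window of this kind could be extracted as in the following sketch. Several details are assumptions, since the slide does not define them: Bigram at position j is taken to be the word at j joined with the word at j+1, BiPOS is the same over POS tags, out-of-range positions are padded with a `<pad>` symbol, and the function name `window_features` is hypothetical. The ChildR and Child0 features are left out because the slide does not say how the children are selected.

```python
from typing import Dict, List

PAD = "<pad>"  # assumed padding symbol for positions outside the sentence


def window_features(words: List[str], pos: List[str], i: int) -> Dict[str, str]:
    """Collect Word, Bigram and BiPOS features for positions i-2 .. i+2."""
    def at(seq: List[str], j: int) -> str:
        return seq[j] if 0 <= j < len(seq) else PAD

    feats: Dict[str, str] = {}
    for off in range(-2, 3):
        j = i + off
        # Keys look like "Word[-2]", "Word[+0]", "Word[+2]".
        feats[f"Word[{off:+d}]"] = at(words, j)
        feats[f"Bigram[{off:+d}]"] = at(words, j) + "_" + at(words, j + 1)
        feats[f"BiPOS[{off:+d}]"] = at(pos, j) + "_" + at(pos, j + 1)
    return feats
```

Each un-parsed word yields 15 string-valued features, which would then be mapped to a sparse binary vector before being passed to a linear classifier.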
Experimental Results
                                                              Statistic test
Language    A (New result)  B (Old result)  C (MaltParser)  A vs B  B vs C  A vs C
Arabic      63.75           63.81           54.11           No      Yes     Yes
Chinese     81.25           74.81           73.92           Yes     No      Yes
Czech       71.24           59.36           59.36           Yes     No      Yes
Danish      79.52           78.38           77.31           No      No      No
Dutch       68.45           68.45           63.61           No      Yes     Yes
German      79.57           76.52           76.52           Yes     No      Yes
Japanese    91.43           90.11           89.07           Yes     No      Yes
Portuguese  81.33           81.47           75.38           No      Yes     Yes
Slovene     68.41           67.83           55.04           No      Yes     Yes
Spanish     74.65           72.99           72.81           Yes     No      Yes
Swedish     79.53           71.72           76.28           Yes     Yes     Yes
Turkish     55.33           55.09           52.18           No      Yes     Yes
Bulgarian   81.23           79.73           79.73           No      No      No
AVG         74.53           72.32           69.64
Parsing performance with forward/backward parsing directions and with fine- and coarse-grained POS tags

Language  Parsing direction  LA-Score
Ja        Forward            91.43
Ja        Backward           85.75
Ar        Forward            60.62
Ar        Backward           63.55
Tu        Forward            55.47
Tu        Backward           55.59

Language  Parsing direction  POS granularity  LA-Score
Ja        Forward            Fine             91.43
Ja        Forward            Coarse           91.25
Ar        Backward           Fine             63.55
Ar        Backward           Coarse           63.63
Tu        Forward            Fine             55.47
Tu        Forward            Coarse           55.59
Conclusion
• In this paper, we investigate how effective the “fast parser” can be
• The employed features were quite simple
  – Only coarse-/fine-grained POS tags and word forms
• We extend Nivre’s method
  – Root parser
  – Exhaustive post-processing
Questions?
System Spec
Ⅰ. Parsing Algorithm:
1. Nivre's Algorithm (Nivre, 2003)
2. Root Parser
3. Exhaustive-based Post-processing
Ⅱ. Parser Characteristics:
1. Top-down + Bottom-up
2. Deterministic + Exhaustive
3. Labeling integrated
4. Projective
Ⅲ. Learner: SVMLight (Joachims, 1998)
(1) One-versus-One
(2) Linear Kernel
Ⅳ. Feature Set:
1. Lexical (Unigram/Bigram)
2. Fine-grained POS and coarse-grained BiCPOS
Ⅴ. Post-Processing: Another learner is used to re-recognize heads in stacks
Ⅵ. Additional/External Resources: Not used
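The one-versus-one setting listed under the learner can be sketched as pairwise voting: a problem with k classes is split into k(k-1)/2 binary problems, and the class that wins the most pairwise decisions is predicted. The code below is a generic illustration in plain Python, not a call into SVMLight; `linear_binary`, `one_vs_one_predict`, the weights, and the tie-breaking rule are all assumptions made for the example.

```python
from collections import Counter
from itertools import combinations
from typing import Callable, Dict, List, Sequence, Tuple

Binary = Callable[[Sequence[float]], str]


def linear_binary(weights: Sequence[float], bias: float,
                  positive: str, negative: str) -> Binary:
    """Build a binary decision function with a linear kernel: sign(w . x + b)."""
    def decide(x: Sequence[float]) -> str:
        score = sum(w * v for w, v in zip(weights, x)) + bias
        return positive if score >= 0 else negative
    return decide


def one_vs_one_predict(classes: List[str],
                       pairwise: Dict[Tuple[str, str], Binary],
                       x: Sequence[float]) -> str:
    """Let every pairwise classifier vote and return the most-voted class."""
    votes: Counter = Counter()
    for a, b in combinations(classes, 2):
        votes[pairwise[(a, b)](x)] += 1
    # Ties are broken by the order of `classes` (an assumption of this sketch).
    return max(classes, key=lambda c: votes[c])


# Toy usage with three parser actions and hand-set weights.
classes = ["SHIFT", "LEFT", "RIGHT"]
pairwise = {
    ("SHIFT", "LEFT"): linear_binary([-1.0, 0.0], 0.0, "SHIFT", "LEFT"),
    ("SHIFT", "RIGHT"): linear_binary([1.0, 0.0], 0.0, "SHIFT", "RIGHT"),
    ("LEFT", "RIGHT"): linear_binary([1.0, 1.0], 0.0, "LEFT", "RIGHT"),
}
prediction = one_vs_one_predict(classes, pairwise, [1.0, 0.0])
```

With a linear kernel each pairwise decision is a single dot product, which keeps per-action classification cheap and fits the emphasis on parsing speed.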