Reordering Model Using Syntactic Information of a Source Tree for Statistical Machine Translation
Reordering Model Using Syntactic Information of a Source Tree for Statistical Machine Translation

Kei Hashimoto (1), Hirohumi Yamamoto (2,3), Hideo Okuma (2,4), Eiichiro Sumita (2,4), and Keiichi Tokuda (1,2)

(1) Nagoya Institute of Technology
(2) National Institute of Information and Communications Technology
(3) Kinki University
(4) ATR Spoken Language Communication Research Labs.
Background (1/2)

Phrase-based statistical machine translation
- Can model local word reordering: short idioms, insertions and deletions of words
- Makes errors in global word reordering

Word reordering constraint techniques
- Linguistically syntax-based approaches: source tree, target tree, or both tree structures
- Formal constraints on word permutations: IBM distortion, lexical reordering model, ITG
Background (2/2)

Imposing a source tree on ITG (IST-ITG)
- Extension of the ITG constraints that introduces a source-sentence tree structure
- Cannot evaluate the accuracy of target word orders

Reordering model using syntactic information (this work)
- Extension of the IST-ITG constraints that models the rotation of the source-side parse tree
- Can easily be introduced into a phrase-based translation system
Outline
- Background
- ITG & IST-ITG constraints
- Proposed reordering model
- Training of the proposed model
- Decoding using the proposed model
- Experiments
- Conclusions and future work
Inversion Transduction Grammar (ITG) constraints
- All possible binary tree structures are generated from the source word sequence
- The target sentence is obtained by rotating any node of the generated binary trees
- Reduces the number of possible target word orders
- Does not take a particular tree-structure instance into account
Imposing a Source Tree on ITG (IST-ITG)
- Directly introduces a source-sentence tree structure into ITG

Example: source sentence "This is a pen" and its parse tree
- The target sentence is obtained by rotating any node of the source-sentence tree structure
- For a binary tree over an n-word sentence, the number of word orders is reduced to 2^(n-1) (one monotone/swap choice at each of the n-1 internal nodes)
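The rotation operation above can be sketched as follows; the nested-tuple tree encoding is an assumption for illustration, not the authors' data structure.

```python
from itertools import product

# Hypothetical tree encoding: a leaf is a word (str); an internal binary node
# is a 2-tuple of child subtrees. Example tree for "This is a pen".
tree = ("This", ("is", ("a", "pen")))

def orders(node):
    """Enumerate the target word orders obtained by optionally rotating each node."""
    if isinstance(node, str):              # leaf: a single word
        return [[node]]
    left, right = node
    results = []
    for l, r in product(orders(left), orders(right)):
        results.append(l + r)              # monotone
        results.append(r + l)              # swap
    return results

# A binary tree over n words has n-1 internal nodes, hence 2**(n-1) orders.
print(len(orders(tree)))                   # 8 = 2**(4-1)
```

Running this on the 4-word example confirms the 2^(n-1) count; under plain ITG, by contrast, every binary bracketing would be considered, not just this one tree.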
Non-binary trees
- The parsing results sometimes produce non-binary trees
- Any reordering of the child nodes of a non-binary subtree is allowed
- A subtree with n child nodes therefore has n! possible orders; e.g. children c, d, e yield the 3! = 6 orders cde, ced, dce, dec, ecd, edc
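The six orders in the example above can be checked directly with a permutation count (a trivial sketch):

```python
from itertools import permutations
from math import factorial

children = ["c", "d", "e"]                        # children of a non-binary node
orders = sorted("".join(p) for p in permutations(children))
print(orders)          # ['cde', 'ced', 'dce', 'dec', 'ecd', 'edc']
print(len(orders) == factorial(len(children)))    # n! orders -> True
```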
Problem of IST-ITG
- Cannot evaluate the accuracy of the target word reordering
  ⇒ assigns an equal probability to every rotation of the source sentence f = [f_1, f_2, ..., f_N]
⇒ We propose a reordering model using syntactic information
Reordering model using syntactic information: overview of the proposed method
- The rotation of each subtree type is modeled

Example: source sentence "This is a pen"
- Source-side parse tree: S(NP, VP(AUX, NP(DT, NN)))
- Subtree types: s_1 = S+NP+VP, s_2 = VP+AUX+NP, s_3 = NP+DT+NN
- Reordering probability: P_r = Π_k P(t_k | s_k), where t ∈ {m, s} (monotone or swap)
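The product of per-subtree probabilities can be sketched as below; the probability values are invented for illustration, not estimates from the paper.

```python
from math import prod  # Python 3.8+

# Hypothetical probability table P(t | s), t in {"m", "s"} (monotone/swap).
# The numbers are made up for illustration only.
p = {
    "S+NP+VP":   {"m": 0.9, "s": 0.1},
    "VP+AUX+NP": {"m": 0.3, "s": 0.7},
    "NP+DT+NN":  {"m": 0.8, "s": 0.2},
}

def reordering_prob(rotations):
    """P_r = prod_k P(t_k | s_k) over the subtrees rotated in a hypothesis."""
    return prod(p[s][t] for s, t in rotations)

# "This is a pen" with a swap at VP+AUX+NP, monotone elsewhere:
print(reordering_prob([("S+NP+VP", "m"), ("VP+AUX+NP", "s"), ("NP+DT+NN", "m")]))
# ≈ 0.504 (0.9 * 0.7 * 0.8)
```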
Related work 1
Statistical syntax-directed translation with extended domain of locality [Liang Huang et al., 2006]
- Extracts rules for tree-to-string translation
- Considers syntactic information
- Considers multi-level trees on the source side
- Example rule: S(x_1:NP, VP(x_2:VB, x_3:NP)) → x_2 x_1 x_3
Related work 2
Proposed reordering model
- Used in phrase-based translation; estimation of the proposed model is conducted independently of phrase extraction
- Models child-node reordering in one-level subtrees, so it cannot represent complex reordering
- Reordering using syntactic information can easily be introduced into phrase-based translation
Training algorithm (1/3)
Reordering model training:
1. Word alignment (example: source f_1 f_2 f_3 f_4 aligned to target e_1 e_3 e_4 e_2)
2. Parsing the source sentence (example: parse tree S(NP, VP(AUX, NP(DT, NN))) with subtrees s_1, s_2, s_3)
Training algorithm (2/3)
3. Word alignments and source-side parse trees are combined: each tree node is annotated with the target positions its span aligns to (root S covers 1,2,3,4; NP covers 1 and VP covers 2,3,4; AUX covers 4 and NP covers 2,3; DT covers 2 and NN covers 3)
4. The rotation position of each subtree is checked (monotone or swap):
   s_1 = S+NP+VP ⇒ monotone
   s_2 = VP+AUX+NP ⇒ swap
   s_3 = NP+DT+NN ⇒ monotone
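Step 4 can be sketched as a span comparison over the word alignment; the function name and data shapes here are assumptions for illustration, not the authors' code.

```python
def rotation_position(left_span, right_span, align):
    """Classify a binary subtree as monotone or swap from word alignments.
    left_span / right_span: source word indices of the two children;
    align: dict mapping a source index to its set of aligned target indices."""
    left_t = {t for i in left_span for t in align.get(i, set())}
    right_t = {t for i in right_span for t in align.get(i, set())}
    if not left_t or not right_t:
        return None                      # unaligned child: cannot determine
    if max(left_t) < min(right_t):
        return "monotone"
    if max(right_t) < min(left_t):
        return "swap"
    return None                          # interleaved spans: sample removed

# Slide example: source f_1 f_2 f_3 f_4, target order e_1 e_3 e_4 e_2
align = {0: {0}, 1: {3}, 2: {1}, 3: {2}}
print(rotation_position([0], [1, 2, 3], align))   # s_1: monotone
print(rotation_position([1], [2, 3], align))      # s_2: swap
print(rotation_position([2], [3], align))         # s_3: monotone
```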
Training algorithm (3/3)
5. The reordering probability of each subtree type is estimated by counting the rotation positions:

   P(t|s) = c_t(s) / Σ_{t'} c_{t'}(s)

   where c_t(s) is the count of rotation position t over all training samples for subtree type s

Non-binary subtrees: any ordering of the child nodes is allowed, so their rotation positions are categorized into only two types ⇒ monotone or other (swap)
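The relative-frequency estimate above amounts to normalizing counts per subtree type; a minimal sketch, with made-up sample counts:

```python
from collections import Counter

def estimate(samples):
    """P(t|s) = c_t(s) / sum_t' c_t'(s), from (subtree_type, rotation) pairs."""
    counts = Counter(samples)
    totals = Counter(s for s, _ in samples)
    return {(s, t): c / totals[s] for (s, t), c in counts.items()}

# Invented counts: 7 swap and 3 monotone samples for one subtree type.
samples = [("VP+AUX+NP", "swap")] * 7 + [("VP+AUX+NP", "monotone")] * 3
model = estimate(samples)
print(model[("VP+AUX+NP", "swap")])   # 0.7
```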
Removing training samples
Some target word orders cannot be derived by rotating nodes of the source-side parse tree:
- Linguistic reasons: differences between the sentence structures
- Non-linguistic reasons: errors in word alignment and syntactic analysis
Such subtree samples are removed. Example: source f_1 f_2 f_3 f_4 with subtrees s_1(s_2(f_1, f_2), s_3(f_3, f_4)) and target order e_1 e_3 e_2 e_4: subtrees s_2 and s_3 are used as training samples, while subtree s_1, whose children's target spans interleave, is removed from the training samples.
Clustering of subtree types
- The number of possible subtree types is large
- Unseen subtree types, or subtree types observed only a few times, cannot be modeled accurately
⇒ Cluster the subtree types whose number of training samples is less than a heuristic threshold, and estimate a clustered model from the counts of the clustered subtree types
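The clustering step can be sketched as pooling the counts of rare subtree types into one shared model; the data shapes, the `<CLUSTERED>` label, and the example counts are assumptions for illustration.

```python
from collections import Counter

THRESHOLD = 10   # heuristic threshold used in the experiments

def cluster_counts(counts):
    """Pool the counts of rare subtree types into a single clustered model.
    counts: dict subtree_type -> {"monotone": int, "swap": int}."""
    kept, clustered = {}, Counter()
    for s, c in counts.items():
        if sum(c.values()) < THRESHOLD:
            clustered.update(c)          # pooled into the clustered model
        else:
            kept[s] = c
    kept["<CLUSTERED>"] = dict(clustered)
    return kept

counts = {"S+NP+VP":   {"monotone": 90, "swap": 10},
          "ADJP+RB+JJ": {"monotone": 3,  "swap": 1}}   # rare type
m = cluster_counts(counts)
print(sorted(m))   # ['<CLUSTERED>', 'S+NP+VP']
```

At decoding time, a subtree type not found among the kept models would fall back to the clustered one, which is how unseen types keep a nonzero probability.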
Decoding using the proposed model
- A phrase-based decoder constrained by the IST-ITG constraints:
  - The target sentence is generated by rotating nodes of the source-side parse tree
  - Target word orderings that destroy a source phrase are not allowed
- The decoder checks the rotation positions of the subtrees and calculates the reordering probabilities
Decoding using the proposed model: calculating the reordering probability (1/2)
Example: source sentence A B C D E with subtrees s_1 (root), s_2, s_3; target sentence b a c d e
Rotation positions: s_1 ⇒ monotone, s_2 ⇒ swap, s_3 ⇒ monotone
P_r = Π_k P(t_k | s_k), t ∈ {m, s} (monotone or swap)
Decoding using the proposed model: calculating the reordering probability (2/2)
Example: same source sentence A B C D E; target sentence c d e a b
Rotation positions: s_1 ⇒ swap, s_2 ⇒ monotone, s_3 ⇒ monotone
P_r = Π_k P(t_k | s_k), t ∈ {m, s} (monotone or swap)
Rotation position included in a phrase
- The rotation position cannot be determined, because the word alignments inside a phrase are not clear
⇒ Assign the higher of the two probabilities, monotone or swap
Example: the target phrases "a b" and "c d e" each cover a whole subtree; s_2 and s_3 lie inside phrases and are assigned the higher probability, while s_1 ⇒ swap
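The "assign the higher probability" rule for phrase-internal subtrees reduces to taking a max over the two rotation probabilities; a minimal sketch with an invented probability table:

```python
# Hypothetical table P(t | s); the values are made up for illustration.
p = {"VP+AUX+NP": {"m": 0.3, "s": 0.7}}

def phrase_internal_prob(subtree_type):
    """A subtree entirely inside a phrase has no observable rotation, so the
    decoder scores it with the higher of its monotone/swap probabilities."""
    return max(p[subtree_type].values())

print(phrase_internal_prob("VP+AUX+NP"))   # 0.7
```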
Experimental conditions
Compared methods:
- Baseline: IBM distortion and lexical reordering models
- IST-ITG: Baseline + IST-ITG constraints
- Proposed: Baseline + proposed reordering model
Training: GIZA++ toolkit, SRI language model toolkit, minimum error rate training (BLEU-4), Charniak parser
Experimental conditions (E-J)
English-to-Japanese translation experiment: JST Japanese-English paper abstract corpus

                               English    Japanese
Training data      Sentences   1.0M
                   Words       24.6M      28.8M
Development data   Sentences   2.0K
                   Words       50.1K      58.7K
Test data          Sentences   2.0K
                   Words       49.5K      58.0K

Dev. and test data: single reference
Experimental results (E-J)
Results on the test set:

            Baseline   IST-ITG   Proposed
BLEU-4      27.87      29.31     29.80

Proposed reordering model statistics:
- Subtree samples: 13M (removed: 3M, 25.38%)
- Subtree types: 54K; threshold: 10; number of models: 6K + clustered; coverage: 99.29%

Improved by 0.49 BLEU points over IST-ITG
Experimental conditions (E-C)
English-to-Chinese translation experiment: NIST MT08 English-to-Chinese translation track

                               English    Chinese
Training data      Sentences   4.6M
                   Words       79.6M      73.4M
Development data   Sentences   1.6K
                   Words       46.4K      39.0K
Test data          Sentences   1.9K
                   Words       45.7K      47.0K (avg.)

Dev. data: single reference; test data: 4 references
Experimental results (E-C)
Results on the test set:

            Baseline   IST-ITG   Proposed
BLEU-4      17.54      18.60     18.93

Proposed reordering model statistics:
- Subtree samples: 50M (removed: 10M, 20.36%)
- Subtree types: 2M; threshold: 10; number of models: 19K + clustered; coverage: 99.45%

Improved by 0.33 BLEU points over IST-ITG
Conclusions and future work
Conclusions:
- Proposed an extension of the IST-ITG constraints
- Reordering using syntactic information can easily be introduced into phrase-based translation
- Improved BLEU by 0.49 points over IST-ITG (English-to-Japanese)
Future work:
- Simultaneous training of the translation and reordering models
- Dealing with the complex reordering caused by differences between sentence tree structures

Thank you very much!