Imposing Constraints from the Source Tree on ITG Constraints for SMT Hirofumi Yamamoto, Hideo Okuma,...

Post on 18-Jan-2016


Imposing Constraints from the Source Tree on ITG Constraints for SMT

Hirofumi Yamamoto, Hideo Okuma, Eiichiro Sumita

National Institute of Information and Communications Technology

ATR Spoken Language Communication Research Labs.

Kindai University School of Science and Engineering Department of Information

Background

In current SMT, erroneous word reordering is one of the most serious problems, especially for dissimilar language pairs such as English-Chinese or English-Japanese.

1) To introduce linguistic syntax directly.

Not robust to parsing errors

Tree-to-string, String-to-tree, Tree-to-tree


2) To assign probabilistic constraints for word reordering

Weaker constraints than the first type

To introduce syntax information into the second type

IBM distortion, Lexical reordering, ITG

ITG Constraints

Source sentences are represented by a binary tree. Target sentences are generated by rotating the branches at the nodes of the source tree.

Source: A B C D

Target: b d a c

Source: A B C D

Target: c a d b

The target word orders above cannot be generated from any source binary tree. In ITG constraints, the actual source binary tree instance is not considered.
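As a minimal sketch of the statement above, the following Python enumerates all orders of four words and tests each against ITG constraints. It assumes the standard characterization that an order is reachable if it can be split into two adjacent parts, each covering a contiguous range of source positions, with both parts reachable in turn. The function name `itg_reachable` is illustrative and does not come from the slides.

```python
from itertools import permutations

def itg_reachable(order):
    """True if `order` (a tuple of distinct source positions) can be built by
    binary splits in which each half covers a contiguous range of positions."""
    if len(order) <= 1:
        return True
    for cut in range(1, len(order)):
        left, right = order[:cut], order[cut:]
        contiguous = (max(left) - min(left) + 1 == len(left)
                      and max(right) - min(right) + 1 == len(right))
        if contiguous and itg_reachable(left) and itg_reachable(right):
            return True
    return False

words = "abcd"
all_orders = list(permutations(range(len(words))))
# Orders that no source binary tree can generate.
blocked = ["".join(words[i] for i in p) for p in all_orders if not itg_reachable(p)]
allowed_count = len(all_orders) - len(blocked)
print(allowed_count, blocked)  # 22 ['bdac', 'cadb']
```

Because the test tries every possible split, it corresponds to asking whether any binary tree over the source words could produce the order, which is exactly why the tree instance plays no role here.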

Basic Idea of IST-ITG

To use ITG constraints under the given source tree

Source tree ((A B) (C D)): abcd, abdc, bacd, badc, cdab, cdba, dcab, dcba

Source tree (((A B) C) D): abcd, bacd, cabd, cbad, dabc, dbac, dcab, dcba

Under the original ITG constraints, 22 combinations are allowed.
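The effect of fixing the source tree can be sketched by enumerating every order obtainable from one given tree. This is an illustration only: the nested-tuple tree encoding and the function name `yields` are assumptions made for the example, not part of the original method description.

```python
from itertools import permutations, product

def yields(tree):
    """All target orders obtainable by reordering the children of every node.
    A tree is either a word (str) or a tuple of sub-trees."""
    if isinstance(tree, str):
        return [tree]
    results = []
    for child_order in permutations(tree):        # keep or swap for a binary node
        for parts in product(*(yields(child) for child in child_order)):
            results.append("".join(parts))
    return results

balanced = (("a", "b"), ("c", "d"))
left_deep = ((("a", "b"), "c"), "d")
print(sorted(yields(balanced)))
# ['abcd', 'abdc', 'bacd', 'badc', 'cdab', 'cdba', 'dcab', 'dcba']
print(sorted(yields(left_deep)))
# ['abcd', 'bacd', 'cabd', 'cbad', 'dabc', 'dbac', 'dcab', 'dcba']
```

Each four-word binary tree yields 8 orders, compared with the 22 orders allowed when the tree is left unspecified.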

The Number of Word Order Combinations

For $N$ source words, $N!$ word order combinations are allowed without constraints. Under the IST-ITG constraints with a binary source tree, this number is reduced to $2^{N-1}$.

Constraints  $N=6$  $N=10$
Without constraints ($N!$)  720  3,628,800
ITG constraints  394  206,098
IST-ITG ($2^{N-1}$)  32  512
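The three counts can be reproduced with a short calculation. The sketch below assumes the known combinatorial fact that the orders reachable under ITG constraints for N words are counted by the large Schröder number S(N-1), computed here with its standard three-term recurrence; the function names are illustrative.

```python
from math import factorial

def itg_orders(n_words):
    """Orders reachable under ITG constraints: large Schroeder number S_(N-1)."""
    s_prev, s_cur = 1, 2                      # S_0, S_1
    if n_words == 1:
        return s_prev
    for n in range(2, n_words):
        # (n + 1) * S_n = 3 * (2n - 1) * S_(n-1) - (n - 2) * S_(n-2)
        s_prev, s_cur = s_cur, (3 * (2 * n - 1) * s_cur - (n - 2) * s_prev) // (n + 1)
    return s_cur

def counts(n_words):
    """(no constraints, ITG constraints, IST-ITG with a binary source tree)."""
    return factorial(n_words), itg_orders(n_words), 2 ** (n_words - 1)

for n_words in (4, 6, 10):
    print(n_words, counts(n_words))
# 4 (24, 22, 8)
# 6 (720, 394, 32)
# 10 (3628800, 206098, 512)
```

The gap widens quickly: at ten words IST-ITG keeps 512 orders, while ITG constraints alone still admit more than two hundred thousand.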

Extension to Non-binary Tree

Parsing results are sometimes not binary trees.

For nodes that have more than two branches, any reordering of the branches is allowed.

Source tree (A (B C D)): abcd, abdc, acbd, acdb, adbc, adcb, bcda, bdca, cbda, cdba, dbca, dcba


For a non-binary tree, the number of word order combinations under IST-ITG can be represented by $\prod_{i=1}^{n} (B_i)!$, where $B_i$ is the number of branches at the $i$-th node.
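As a worked check of this product, the sketch below multiplies the factorial of the branch count over every node of a tree. The nested-tuple encoding and the name `count_orders` are assumptions made for the illustration.

```python
from math import factorial

def count_orders(tree):
    """Number of IST-ITG word orders: product of (branch count)! over all nodes.
    A tree is either a word (str) or a tuple of sub-trees."""
    if isinstance(tree, str):          # a leaf offers no reordering choice
        return 1
    total = factorial(len(tree))       # orderings of this node's branches
    for child in tree:
        total *= count_orders(child)
    return total

print(count_orders(("a", ("b", "c", "d"))))      # 2! * 3! = 12
print(count_orders((("a", "b"), ("c", "d"))))    # 2! * 2! * 2! = 8
```

For a binary tree over N words there are N-1 nodes with two branches each, so the product reduces to 2^(N-1).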

IST-ITG in Phrase-based SMT (1)

×   The unit of the parse tree is the “word”, but the unit of phrase-based SMT is the “phrase”. The units are different.

Additional rules for phrase-based SMT

1) Word reordering that breaks a phrase is not allowed.

2) Phrase internal word reordering is not checked.

○   Word-to-word alignments are sometimes not one-to-one, but phrase-to-phrase alignments are always one-to-one.

IST-ITG in Phrase-based SMT (2)

[Figure: source tree over the words A B C D E F G with labels 1 to 5 and a phrase Ph]

1: NG, 2: NG, 3: OK, 4: NG, 5: OK (NG: unacceptable)


Decoding Algorithm with IST-ITG

0: Untranslated   1: Translated   2: Translating

[Figure: source tree over the phrases A to I in which every phrase and every node carries one of the three states above]

1) Current hypothesis. States: A B C D = 0 0 0 1, E F G = 1 0 0, H I = 0 0. Target so far: d e

2) If phrases A and B are translated next (A B C D = 1 1 0 1; target: d e a b), a sub-tree includes more than two “2”. NG

3) Consider the minimum Translating sub-tree (the sub-tree that includes both “0” and “1”), starting again from the target d e.

4) All of the minimum Translating sub-tree is translated (E F G = 1 1 1, H I = 1 0; target: d e f g h). OK

5) A sub-part of the minimum Translating sub-tree is translated (E F G = 1 0 1; target: d e g). OK
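The state bookkeeping on these slides can be sketched as a check applied each time the decoder extends a hypothesis. The rule used below is an assumed reformulation, not the CleopATRa implementation: after a phrase is translated, every node in state 2 must contain that phrase, so a sub-tree that has been started is finished before the decoder moves elsewhere. The tree encoding and all function names are hypothetical, and phrase labels are assumed to be unique.

```python
def leaves(tree):
    """Set of source phrases under a node (a phrase is a str, a node a tuple)."""
    if isinstance(tree, str):
        return {tree}
    return set().union(*(leaves(child) for child in tree))

def nodes(tree):
    """Yield every internal node of the tree."""
    if not isinstance(tree, str):
        yield tree
        for child in tree:
            yield from nodes(child)

def state(node, translated):
    """0: untranslated, 1: translated, 2: translating (mix of 0 and 1)."""
    covered = leaves(node) & translated
    if not covered:
        return 0
    return 1 if covered == leaves(node) else 2

def extension_ok(tree, translated, new_phrase):
    """Assumed rule: every node left in state 2 must contain the new phrase."""
    after = translated | {new_phrase}
    return all(new_phrase in leaves(node)
               for node in nodes(tree) if state(node, after) == 2)

def hypothesis_ok(tree, phrase_order):
    """Check a left-to-right target hypothesis one phrase at a time."""
    translated = set()
    for phrase in phrase_order:
        if not extension_ok(tree, translated, phrase):
            return False
        translated.add(phrase)
    return True

tree = (("a", "b"), ("c", "d"))
print(hypothesis_ok(tree, ["c", "d", "a"]))  # True: (c d) is finished before a
print(hypothesis_ok(tree, ["c", "a"]))       # False: (c d) is left half translated
```

Checking only the newest phrase at each step is enough under this rule, because a node stays in state 2 from its first translated phrase until its last, so its phrases necessarily end up adjacent in the target.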

English and Japanese Patent Corpus Experiments

Experimental corpus size

Data set  # of sent.  Total Words  # of entry
E/J Train  1.8M  60M/64M  188K/118K
E/J Dev  916  30K/32K  4,072/3,646
E/J Eval  899  29K/32K  3,967/3,682

Single reference

Other Experimental Conditions

LM training: SRI Language model toolkit (5-grams)

Word alignment for TM training: GIZA++

Decoder: Moses compatible in-house decoder named CleopATRa

Evaluation measures: BLEU, NIST, WER, PER

[Equation residue: two sequences over the words $e_1, \ldots, e_j, \ldots, e_n$ and a segment $X$; the exact ordering is not recoverable.]

English and Japanese Patent Translation Experimental Results

English-to-Japanese

System  BLEU  NIST  WER  PER
IBM+Lex  31.17  7.50  76.30  38.61
IBM+Lex+IST  32.20  7.61  71.18  38.15
IST-ITG  30.26  7.41  74.90  38.93
Monotone  24.91  6.95  79.97  40.02
No Constraint  26.83  7.19  81.10  39.52
IBM  28.34  7.29  78.35  39.25


English and Japanese Patent Translation Experimental Results

Japanese-to-English

System  BLEU  NIST  WER  PER
IBM+Lex  29.93  7.54  77.27  39.12
+IST-ITG  29.77  7.50  72.80  39.73


English-to-Chinese Translation Experiments

NIST MT08   English-to-Chinese track

Training data for TM  6.2M
Training data for LM  20.1M
Development data  1,664  1 reference
Evaluation data  1,859  4 references

Experimental Results

System  W-Bleu  C-Bleu  WER  CER
IBM+Lex  21.0  35.2  75.0  74.1
+IST-ITG  23.2  37.0  69.7  67.9

Conclusion

We proposed new word reordering constraints, IST-ITG, that use the source tree structure. They are an extension of ITG constraints.

We conducted three experiments with the proposed method: E-J and J-E patent translation and the NIST MT08 E-C track. In all experiments, improvements in BLEU and WER were confirmed.

In particular, the improvement in WER is very large, which confirms the effectiveness for global word reordering.

Thank you!