
Page 1: Linguistically-motivated Tree-based Probabilistic Phrase Alignment Toshiaki Nakazawa, Sadao Kurohashi (Kyoto University)

Linguistically-motivated Tree-based Probabilistic Phrase Alignment

Toshiaki Nakazawa, Sadao Kurohashi (Kyoto University)

Page 2

Outline
- Background
- Tree-based Probabilistic Phrase Alignment Model
- Model Training
- Symmetrization Algorithm
- Experiments
- Conclusions

2 05/03/23

Page 3

Background
- Many state-of-the-art SMT systems are based on "word-based" alignment results
  - Phrase-based SMT [Koehn et al., 2003]
  - Hierarchical Phrase-based SMT [Chiang, 2005]
  - and so on
- Some of them incorporate syntactic information "after" word-based alignment
  - [Quirk et al., 2005], [Galley et al., 2006], and so on
- Is it enough? Can it achieve "practical" translation quality?

Page 4

Background (cont.)
- The word-based alignment model works well for structurally similar language pairs
- It is not effective for language pairs with large differences in linguistic structure, such as Japanese and English (SOV versus SVO)
- For such language pairs, syntactic information is necessary even during the alignment process

Page 5

Related Work
- Syntactic tree-based models: [Yamada and Knight, 2001], [Gildea, 2003], ITG by Wu
  - They incorporate operations that manipulate subtrees (re-order, insert, delete, clone) to reproduce the opposite tree structure
  - Our model does not require any such operations
  - Our model utilizes dependency trees
- Dependency tree-based model: [Cherry and Lin, 2003]
  - Word-to-word, one-to-one alignment
  - Our model makes phrase-to-phrase alignments and can make many-to-many links

Page 6

Features of the Proposed Tree-based Probabilistic Phrase Alignment Model
- A generative model similar to the IBM models
- Uses phrase dependency structures
  - "phrase" means a linguistic phrase (cf. phrase-based SMT)
- Phrase-to-phrase alignment model
  - Each phrase (node) basically consists of one content word and zero or more function words
  - Source-side content words can be aligned only to target-side content words (the same holds for function words)
- Generation starts from the root node and ends at one of the leaf nodes (cf. the IBM models generate from the first word to the last word)

Page 7

Outline
- Background
- Tree-based Probabilistic Phrase Alignment Model
- Model Training
- Symmetrization Algorithm
- Experiments
- Conclusions

Page 8

Dependency Analysis of Sentences

Source (Japanese): プロピレングリコールは血中グルコースインスリンを上昇させ、血中NEFA 濃度を減少させる
Target (English): Propylene glycol increases in blood glucose and insulin and decreases in NEFA concentration in the blood

[figure: the source and target phrase dependency trees, with word order, head nodes, and root nodes marked]

Page 9

IBM Model vs. Tree-based Model

IBM Model [Brown et al., 93]:

  \hat{a} = \arg\max_a p_\theta(f \mid e, a)\, p_\theta(a \mid e)

  \hat{\theta} = \arg\max_\theta \prod_{s=1}^{S} \sum_a p_\theta(f_s \mid e_s, a)\, p_\theta(a \mid e_s)

Tree-based Model:

  \hat{a} = \arg\max_a p_\theta(T_f \mid T_e, a)\, p_\theta(a \mid T_e)

  \hat{\theta} = \arg\max_\theta \prod_{s=1}^{S} \sum_a p_\theta(T_{f,s} \mid T_{e,s}, a)\, p_\theta(a \mid T_{e,s})

where f: source sentence, e: target sentence, a: alignment, \theta: parameters, T_f: source tree, T_e: target tree.

Page 10

Model Decomposition: Lexicon Probability

Suppose T_f consists of J nodes and T_e consists of I nodes. p(T_f \mid T_e, a) is calculated as a product of phrase translation probabilities:

  p(T_f \mid T_e, a) = \prod_{j=1}^{J} p(f_j \mid e_{a_j})

and each phrase translation probability is a product of two probabilities:

  p(f_j \mid e_{a_j}) = p_{cont.}(f_j \mid e_{a_j}) \cdot p_{func.}(f_j \mid e_{a_j})

Ex) 濃度 を - "in concentration":

  p_{cont.}(濃度 \mid concentration) \cdot p_{func.}(を \mid in)

上昇 さ せ - "increase":

  p_{cont.}(上昇 \mid increase) \cdot p_{func.}(さ せ \mid EMPTY)

These are the phrase translation probabilities in \hat{a} = \arg\max_a p(T_f \mid T_e, a)\, p(a \mid T_e).
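The lexicon decomposition above can be sketched in code. This is a minimal illustration, not the authors' implementation: the probability tables, the (content, function) node encoding, and the 1e-9 smoothing floor are assumptions made for the sketch.

```python
def lexicon_prob(src_nodes, tgt_nodes, align, p_cont, p_func):
    """p(T_f | T_e, a) as a product over source nodes of
    p_cont(content | aligned content) * p_func(function | aligned function).
    Nodes are (content, function) pairs; tgt_nodes[0] is the NULL phrase."""
    prob = 1.0
    for (f_cont, f_func), a_j in zip(src_nodes, align):
        e_cont, e_func = tgt_nodes[a_j]
        prob *= p_cont.get((f_cont, e_cont), 1e-9)  # content-to-content only
        prob *= p_func.get((f_func, e_func), 1e-9)  # function-to-function only
    return prob
```

With the slide's example entry p_cont(濃度|concentration) = 0.5 and p_func(を|in) = 0.4, the node 濃度 を aligned to "in concentration" contributes 0.5 × 0.4 = 0.2.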

Page 11

Model Decomposition: Alignment Probability

Define the parent node of f_j as \tilde{f_j}. p(a \mid T_e) is decomposed as a product of target-side dependency relation probabilities conditioned on the source-side relation:

  p(a \mid T_e) = \prod_{j=1}^{J} p_e(\mathrm{rel}(e_{a_j}, e_{a_{\tilde{j}}}) \mid \mathrm{rel}(f_j, \tilde{f_j}))

If the parent node \tilde{f_j} has been aligned to NULL, \tilde{f_j} instead indicates the grandparent of f_j, and this continues until \tilde{f_j} is aligned to something other than NULL.

p(a \mid T_e), the dependency relation probability in \hat{a} = \arg\max_a p(T_f \mid T_e, a)\, p(a \mid T_e), models tree-based reordering.

Page 12

Outline
- Background
- Tree-based Probabilistic Phrase Alignment Model
- Model Training
- Symmetrization Algorithm
- Experiments
- Conclusions

Page 13

Model Training
- The proposed model is trained by the EM algorithm
- First, the phrase translation probability is learned (Model 1)
  - Model 1 can be learned efficiently without approximation (cf. IBM Models 1 and 2)
- Next, the dependency relation probability is learned (Model 2), using the probabilities learned in Model 1 as initial parameters
  - Model 2 needs some approximation (cf. IBM Model 3 or greater); we use a beam-search algorithm
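The Model 1 training step can be illustrated with an IBM-Model-1-style EM loop operating on phrases. This is a sketch under simplifying assumptions (corpus as plain phrase lists, uniform initialization, explicit NULL target), not the paper's implementation:

```python
from collections import defaultdict

def train_model1(corpus, iterations=5):
    """corpus: list of (source_phrases, target_phrases) pairs.
    Returns phrase translation probabilities t[(f, e)] ~ p(f | e)."""
    # Uniform (un-normalized) initialization over co-occurring pairs.
    t = {(f, e): 1.0
         for fs, es in corpus for f in fs for e in es + ["NULL"]}
    for _ in range(iterations):
        count = defaultdict(float)   # expected counts (E step)
        total = defaultdict(float)
        for fs, es in corpus:
            targets = es + ["NULL"]  # every phrase may also align to NULL
            for f in fs:
                z = sum(t[(f, e)] for e in targets)
                for e in targets:
                    c = t[(f, e)] / z
                    count[(f, e)] += c
                    total[e] += c
        # M step: renormalize per target phrase.
        t = {(f, e): count[(f, e)] / total[e] for (f, e) in count}
    return t
```

After a few iterations, co-occurrence statistics concentrate the probability mass on consistently co-occurring phrase pairs.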

Page 14

Model 1
- Each phrase f_j (1 \le j \le J) on the source side can correspond to an arbitrary phrase e_i (1 \le i \le I) on the target side, or to the NULL phrase (e_0)
- The probability of one possible alignment is:

  p(a, T_f \mid T_e) = \prod_{j=1}^{J} p_{cont.}(f_j \mid e_{a_j}) \cdot p_{func.}(f_j \mid e_{a_j})

- Then the tree translation probability is:

  p(T_f \mid T_e) = \sum_a p(a, T_f \mid T_e)

- Efficiently calculated as:

  \sum_a \prod_{j=1}^{J} p(f_j \mid e_{a_j}) = \prod_{j=1}^{J} \sum_{i=0}^{I} p(f_j \mid e_i)
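The last identity is the standard Model-1 rearrangement: the exponential sum over all alignments factorizes into one sum per source node. A quick numeric check, with made-up probabilities p[j][i] standing in for p(f_j | e_i):

```python
import math
from itertools import product

def brute_force(p):
    """Sum over every alignment a of prod_j p[j][a_j]: O((I+1)^J) terms."""
    J, width = len(p), len(p[0])
    return sum(math.prod(p[j][a[j]] for j in range(J))
               for a in product(range(width), repeat=J))

def factored(p):
    """prod_j sum_i p[j][i]: the efficient form from the slide."""
    return math.prod(sum(row) for row in p)

# Illustrative values only; rows are source nodes, columns target phrases.
p = [[0.1, 0.5, 0.2], [0.3, 0.1, 0.4], [0.2, 0.2, 0.1]]
assert abs(brute_force(p) - factored(p)) < 1e-12
```

The equality holds by the distributive law, which is why Model 1 needs no approximation.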

Page 15

Model 2 (imaginary ROOT node)
- The root node of a sentence is supposed to depend on an imaginary ROOT node, which works like the start-of-sentence (SOS) symbol in a word-based model
- The ROOT node in the source tree always corresponds to that of the target tree

[figure: Japanese dependency tree (確認 した / ポイント を / 必要な / 視点 に / 援助 の / 事例 を 通して) and English dependency tree (was confirmed / the point / necessary / in the viewpoint / of the assist / through the case), each headed by an imaginary ROOT node]

Page 16

Model 2 (beam-search algorithm)
- It is impossible to enumerate all possible alignments
- Consider only a subset of "good-looking" alignments using a beam-search algorithm
- Ex) beam width = 4

[figure: the example sentence pair from the previous slide, with NULL added as a possible alignment target]
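The beam search over alignment hypotheses can be sketched generically. This is an illustrative simplification, not the paper's algorithm: the traversal order and the local score function (standing in for the model's phrase and relation probabilities) are assumptions.

```python
def beam_search_align(src_nodes, tgt_options, score, beam_width=4):
    """Keep only the beam_width most promising partial alignments while
    assigning each source node a target phrase (or NULL) in turn.
    score(f, e) is a stand-in for the model's local probabilities."""
    beam = [((), 1.0)]  # (partial alignment, probability)
    for f in src_nodes:
        candidates = [(align + (e,), p * score(f, e))
                      for align, p in beam for e in tgt_options]
        candidates.sort(key=lambda c: c[1], reverse=True)
        beam = candidates[:beam_width]  # prune to the beam width
    return beam
```

With beam width 4, at most four hypotheses survive each extension step, exactly as in the slide's example.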

Page 17

Model 2 (beam-search algorithm)

[figure: four beam candidates (beam width = 4), each a partial alignment between the Japanese and English dependency trees of the running example, including NULL alignments]

Page 18

Model 2 (parameter notations)

The dependency relation rel(P_1, P_2) between two phrases P_1 and P_2 is defined as a path from P_1 to P_2, using the following notations:
- "c-" if P_2 is a pre-child of P_1
- "c+" if P_2 is a post-child of P_1
- "p-" if P_1 is a post-child of P_2
- "p+" if P_1 is a pre-child of P_2
- "INCL" if P_1 and P_2 are the same phrase
- "ROOT" if P_2 is the imaginary ROOT node
- "NULL" if P_1 is aligned to NULL

[figure: small tree diagrams illustrating the c-, c+, p-, p+, and ROOT relations between P_1 and P_2]
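The path notation can be sketched as a small tree walk. This is a hypothetical simplification of the slide's definition: the tree is encoded as a parent map plus a set of pre-children, and the single-node NULL/ROOT cases are omitted.

```python
def rel(p1, p2, parent, pre_child):
    """Path notation from p1 to p2 in a dependency tree.
    parent: child -> parent map; pre_child: nodes that precede their head.
    Going up: 'p+' from a pre-child, 'p-' from a post-child.
    Going down: 'c-' to a pre-child, 'c+' to a post-child."""
    if p1 == p2:
        return "INCL"
    def chain(n):  # the node and all its ancestors, bottom-up
        out = [n]
        while n in parent:
            n = parent[n]
            out.append(n)
        return out
    up, down = chain(p1), chain(p2)
    common = next(n for n in up if n in down)  # lowest common ancestor
    steps = []
    n = p1
    while n != common:          # climb from p1 to the common ancestor
        steps.append("p+" if n in pre_child else "p-")
        n = parent[n]
    for n in reversed(down[:down.index(common)]):  # descend toward p2
        steps.append("c-" if n in pre_child else "c+")
    return ";".join(steps)
```

Combined relations such as "p+;p+;c+" fall out naturally when p1 and p2 are two or more nodes apart, as on the next slide.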

Page 19

Model 2 (parameter notations, cont.)

When P_1 and P_2 are two or more nodes distant from each other, the relation is described by combining the notations.

Ex) "c-;c+", "p-;c+;c-"

[figure: tree diagrams illustrating the combined relations between P_1 and P_2]

Page 20

Dependency Relation Probability Examples

[figure: the four beam candidates from the beam-search example, each annotated with its dependency relation probabilities]

  p(ROOT|ROOT) · p(c-|c-)
  p(ROOT;c-|ROOT) · p(p-|c-)
  p(ROOT|ROOT) · p(c-;c+|c-)
  p(NULL|ROOT) · p(ROOT|ROOT;c-)

Page 21

Example

[figure: the fully aligned Japanese and English dependency trees of the running example, both headed by ROOT]

p(a, T_f|T_e) = p(確認|was confirmed) · p(した|EMPTY) · p(ROOT|ROOT)
  × p(事例|case) · p(を 通して|through) · p(c-;c+|c-)
  × p(ポイント|point) · p(を|EMPTY) · p(c-|c-)
  × p(必要な|necessary) · p(EMPTY|EMPTY) · p(c-|c-)
  × p(視点|viewpoint) · p(に|in) · p(c+|c-)
  × p(援助|assist) · p(の|of) · p(c+|c-)

Page 22

Outline
- Background
- Tree-based Probabilistic Phrase Alignment Model
- Model Training
- Symmetrization Algorithm
- Experiments
- Conclusions

Page 23

Symmetrization Algorithm
- Since our model is directional, we run it bi-directionally and symmetrize the two alignment results heuristically
- The symmetrization algorithm is similar to [Koehn et al., 2003], which uses the 1-best GIZA++ word alignment result of each direction
- Our algorithm exploits the n-best alignment results of each direction
- Three steps: superimposition, growing, handling isolations

Page 24

Symmetrization Algorithm: 1. Superimposition

[figure: the source-to-target 5-best and target-to-source 5-best alignments are superimposed into a single alignment score matrix]

Page 25

Symmetrization Algorithm: 1. Superimposition (cont.)
- Definitive alignment points are adopted: the points which don't have a point with the same or higher score in their row or column
- Conflicting points are discarded: the points which are in the same row or column as an adopted point and are not contiguous to the adopted point on the tree

[figure: the superimposed score matrix, with definitive points adopted and conflicting points discarded]

Page 26

Symmetrization Algorithm: 2. Growing
- Adopt points contiguous to already adopted points in both the source and target tree
  - In descending order of score
  - From top to bottom, from left to right
- Discard conflicting points: the points which have an adopted point both in the same row and in the same column

[figure: the score matrix before and after growing]

Page 27

Symmetrization Algorithm: 3. Handling Isolations
- Adopt points for phrases which are not yet aligned to any phrase, in both the source and target language

[figure: the score matrix before and after handling isolated phrases]
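The three steps can be sketched on a plain score matrix. This is a simplified illustration, not the paper's algorithm: in particular it uses matrix adjacency where the paper checks contiguity on the dependency trees, and it ignores traversal-order tie-breaking.

```python
def symmetrize(points):
    """points: {(i, j): score} from superimposing the n-best alignments of
    both directions. Returns the set of adopted alignment points."""
    def sharers(p):  # other points sharing a row or column with p
        return [q for q in points if q != p and (q[0] == p[0] or q[1] == p[1])]

    # 1. Superimposition: adopt definitive points, i.e. points with no
    #    equal-or-higher scored point in their row or column.
    adopted = {p for p in points
               if all(points[q] < points[p] for q in sharers(p))}

    # 2. Growing: in descending score order, adopt points adjacent to an
    #    already adopted point, unless adopted points already occupy both
    #    the same row and the same column (a conflict).
    for p in sorted(set(points) - adopted, key=points.get, reverse=True):
        adjacent = any(abs(p[0] - q[0]) + abs(p[1] - q[1]) == 1 for q in adopted)
        row_taken = any(q[0] == p[0] for q in adopted)
        col_taken = any(q[1] == p[1] for q in adopted)
        if adjacent and not (row_taken and col_taken):
            adopted.add(p)

    # 3. Handling isolations: adopt the best remaining point for any phrase
    #    still unaligned on both sides.
    for p in sorted(set(points) - adopted, key=points.get, reverse=True):
        if not any(q[0] == p[0] for q in adopted) and \
           not any(q[1] == p[1] for q in adopted):
            adopted.add(p)
    return adopted
```

On a small matrix, step 1 fixes the unambiguous points, step 2 extends them to neighbors, and step 3 rescues phrases that would otherwise stay unaligned.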

Page 28

Alignment Experiment
- Training corpus: a Japanese-English paper abstract corpus provided by JST, consisting of about 1M parallel sentences
- Gold-standard alignment: 100 manually annotated sentence pairs from the training corpus; Sure (S) alignments only [Och and Ney, 2003]
- Evaluation unit: morpheme-based for Japanese, word-based for English
- Iterations: 5 iterations for Model 1, then 5 iterations for Model 2

Page 29

Alignment Experiment (cont.)
- Comparative experiment (word-based alignment): GIZA++ and various symmetrization heuristics [Koehn et al., 2007]
- Default settings for GIZA++
- Original forms of words are used for both Japanese and English

Page 30

Results

  Method                           Precision  Recall  F-measure
  proposed  1-best-intersection    90.92      41.69   57.17
  proposed  1-best-grow            83.30      54.33   65.76
  proposed  3-best-grow            81.21      56.52   66.65
  proposed  5-best-grow            80.59      57.33   67.00
  GIZA++    intersection           88.14      40.18   55.20
  GIZA++    grow                   83.50      49.65   62.27
  GIZA++    grow-final             67.19      56.91   61.63
  GIZA++    grow-final-and         78.00      52.93   63.06
  GIZA++    grow-diag              77.34      53.18   63.03
  GIZA++    grow-diag-final        67.24      56.63   61.48
  GIZA++    grow-diag-final-and    74.95      54.26   62.95
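The F-measure column is the balanced harmonic mean of precision and recall; a quick sanity check against two rows of the table above (tolerance accounts for rounding in the slide):

```python
def f_measure(precision, recall):
    """Balanced F-measure: harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)

assert abs(f_measure(90.92, 41.69) - 57.17) < 0.005  # 1-best-intersection
assert abs(f_measure(80.59, 57.33) - 67.00) < 0.005  # 5-best-grow
```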

Page 31

Example of Alignment Improvement

[figure: alignment matrices for an example sentence pair, produced by the proposed model and by word-based alignment]

Page 32

Translation Experiments
- Training corpus: same as the alignment experiments
- Test corpus: 500 paper abstract sentences
- Decoder: Moses [Koehn et al., 2007]
  - Default options, except for the phrase table limit (20 → 10) and the distortion limit (6 → -1)
  - No minimum error rate training
- Evaluation: BLEU, with no punctuation and case-insensitive

Page 33

Results

  Method                           Pre    Rec    F      BLEU
  proposed  1-best-intersection    90.92  41.69  57.17  12.73
  proposed  5-best-grow            80.59  57.33  67.00  15.40
  GIZA++    intersection           88.14  40.18  55.20  16.35
  GIZA++    grow-diag              77.34  53.18  63.03  17.89
  GIZA++    grow-diag-final-and    74.95  54.26  62.95  17.76

- The definition of function words is improper (articles? auxiliary verbs? ...)
- A tree-based decoder is necessary: BLEU is essentially insensitive to syntactic structure
- Translation quality is potentially improved

Page 34

Potentially Improved Example

Input: これ は LB 膜 の 厚み が アビジン を 吸着 する こと で 増加 した こと に よる 。
Proposed (BLEU 30.13): this is due to the increase in the thickness of the lb film avidin adsorb
GIZA++ (BLEU 33.78): the thickness of the lb film avidin to adsorption increased by it
Reference: this was due to increased thickness of the lb film by adsorbing avidin

Page 35

Conclusion
- A tree-based probabilistic phrase alignment model using dependency tree structures
  - Phrase translation probability
  - Dependency relation probability
- An n-best symmetrization algorithm
- Achieves high alignment accuracy compared to word-based models: syntactic information is useful during the alignment process
- BUT: unable to improve the BLEU scores of translation

Page 36

Future Work
- A more flexible model: content words sometimes correspond to function words and vice versa
- Integrate parsing probabilities into the model
  - Parsing errors easily lead to alignment errors
  - By integrating parsing probabilities, parsing results and alignments can be revised complementarily
- More syntactic information: use POS or phrase categories in the model

Page 37

Thank You!