Translation Divergence LING 580MT Fei Xia 1/10/06.


Translation Divergence

LING 580MT

Fei Xia

1/10/06

Papers

• Bonnie Dorr (1994): Machine Translation Divergences: a Formal Description and Proposed Solution

Outline

• Formal definition of translation divergence

• Seven types of divergence

• Discussion

• Remaining questions

Formal definition of translation divergence

Distinction between the source and target languages

Two categories (Barnett et al., 1991):

• Translation divergence: same information, different structures

• Translation mismatches: different information is conveyed; important, but outside the scope of the paper

How to define translation divergence formally?

Define the language-to-language divergence via language-to-interlingua divergence:

Interlingua: lexical conceptual structure (LCS)

Language-to-interlingua: mapping from syntactic form to LCS

Lexical conceptual structure (LCS)

$$[_{T(X')}\ X'\ ([_{T(W')}\ W'],\ [_{T(Z'_1)}\ Z'_1] \ldots [_{T(Z'_n)}\ Z'_n]\ [_{T(Q'_1)}\ Q'_1] \ldots [_{T(Q'_m)}\ Q'_m])]$$

[Figure: tree with root X’ of type T(X’) and daughters W’, Z1’ … Zn’, Q1’ … Qm’, each labeled with its type T(·)]

X’: logical head
W’: logical subject
Z1’…Zn’: logical arguments
Q1’…Qm’: logical modifiers
T(Φ’) is the logical type (Event, Path, …) of the primitive Φ’ (CAUSE, LET, GO, …)

Root LCS (RLCS)

• An RLCS is an uninstantiated LCS that is associated with a word definition in the lexicon (i.e., an LCS with unfilled variable positions)

• LCSs are recursively defined.

RLCS representation for go

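$$[_{\text{Event}}\ \text{GO}_{\text{Loc}}\ ([_{\text{Thing}}\ X],\ [_{\text{Path}}\ \text{TO}_{\text{Loc}}\ ([_{\text{Position}}\ \text{AT}_{\text{Loc}}\ ([_{\text{Thing}}\ X],\ [_{\text{Location}}\ Z])])])]$$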

Note: an LCS is not the same as a dependency structure.
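The LCS form and the RLCS for go can be sketched as a small data structure. This is only an illustration: the class name `LCS`, its field names, and the choice of which daughter of a nested primitive counts as its logical subject are assumptions made here, not part of the paper.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Set


@dataclass
class LCS:
    """Hypothetical container for [T(X') X' ([W'], [Z'1]...[Z'n] [Q'1]...[Q'm])]."""
    ltype: str                            # logical type T(X'), e.g. "Event"
    head: str                             # primitive or variable name X'
    subject: Optional["LCS"] = None       # logical subject W'
    args: List["LCS"] = field(default_factory=list)   # logical arguments Z'
    mods: List["LCS"] = field(default_factory=list)   # logical modifiers Q'
    is_var: bool = False                  # True for an unfilled variable position

    def children(self) -> List["LCS"]:
        kids = [self.subject] if self.subject is not None else []
        return kids + self.args + self.mods

    def bracket(self) -> str:
        """Render the structure in bracket notation."""
        inner = ", ".join(c.bracket() for c in self.children())
        return f"[{self.ltype} {self.head}" + (f" ({inner})" if inner else "") + "]"

    def variables(self) -> Set[str]:
        """Names of the unfilled variables; an RLCS has at least one."""
        found = {self.head} if self.is_var else set()
        for child in self.children():
            found |= child.variables()
        return found


# RLCS for "go": X and Z are unfilled variable positions.
go_rlcs = LCS(
    "Event", "GO_Loc",
    subject=LCS("Thing", "X", is_var=True),
    args=[LCS("Path", "TO_Loc", args=[
        LCS("Position", "AT_Loc",
            subject=LCS("Thing", "X", is_var=True),
            args=[LCS("Location", "Z", is_var=True)]),
    ])],
)

print(go_rlcs.bracket())
print(sorted(go_rlcs.variables()))
```

Because every daughter is itself an `LCS`, the recursive definition mentioned on the RLCS slide falls out of the type directly.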

Composed LCS (CLCS)

• A CLCS is an instantiated LCS that is the result of combining two or more RLCSs by means of unification (roughly).

• This is the interlingua form that serves as the pivot between the source and target languages.

CLCS representation for “John went happily to school”

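$$[_{\text{Event}}\ \text{GO}_{\text{Loc}}\ ([_{\text{Thing}}\ \text{John}],\ [_{\text{Path}}\ \text{TO}_{\text{Loc}}\ ([_{\text{Position}}\ \text{AT}_{\text{Loc}}\ ([_{\text{Thing}}\ \text{John}],\ [_{\text{Location}}\ \text{School}])])],\ [_{\text{Manner}}\ \text{Happily}])]$$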

The operations of combining are not defined in this paper.
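Since the paper leaves the combining operations open, the following is only one possible reading: treat "unification (roughly)" as type-checked substitution of variables, plus attaching modifiers. The helper names `node`, `instantiate`, and `add_modifier` are hypothetical, and LCS nodes are plain dicts to keep the sketch short.

```python
def node(ltype, head, *children, var=False):
    """Build an LCS node as a plain dict; var=True marks an unfilled variable."""
    return {"type": ltype, "head": head, "children": list(children), "var": var}


def bracket(lcs):
    """Render an LCS in bracket notation."""
    kids = ", ".join(bracket(c) for c in lcs["children"])
    return f"[{lcs['type']} {lcs['head']}" + (f" ({kids})" if kids else "") + "]"


def instantiate(lcs, bindings):
    """Return a copy of lcs with variables replaced by type-compatible fillers."""
    if lcs["var"]:
        filler = bindings.get(lcs["head"])
        if filler is None:
            return lcs  # no filler supplied: the position stays open
        if filler["type"] != lcs["type"]:
            raise ValueError(f"type clash for {lcs['head']}: "
                             f"{filler['type']} vs {lcs['type']}")
        return filler
    return node(lcs["type"], lcs["head"],
                *(instantiate(c, bindings) for c in lcs["children"]))


def add_modifier(lcs, modifier):
    """Attach a logical modifier as the last daughter of the top node."""
    return node(lcs["type"], lcs["head"], *lcs["children"], modifier)


# RLCS for "go", with X and Z unfilled.
GO_RLCS = node(
    "Event", "GO_Loc",
    node("Thing", "X", var=True),
    node("Path", "TO_Loc",
         node("Position", "AT_Loc",
              node("Thing", "X", var=True),
              node("Location", "Z", var=True))),
)

# CLCS for "John went happily to school".
clcs = add_modifier(
    instantiate(GO_RLCS, {"X": node("Thing", "John"),
                          "Z": node("Location", "School")}),
    node("Manner", "Happily"),
)
print(bracket(clcs))
```

Both occurrences of X receive the same filler, which is the part of unification this sketch keeps; anything beyond substitution (for example, merging partially specified structures) is not modeled.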

Syntactic phrase

X: syntactic head
W: external argument
Z-MAX_i: internal arguments
Q-MAX_i: syntactic adjuncts

Similar to X-bar theory, GB theory, etc.

An example

Mapping between LCS and syntactic form

• Generalized linking routine (GLR):
– X’ ⇔ X (logical head ⇔ syntactic head)
– W’ ⇔ W (logical subject ⇔ external argument)
– Z’ ⇔ Z (logical argument ⇔ internal argument)
– Q’ ⇔ Q (logical modifier ⇔ syntactic adjunct)

• Canonical syntactic realization (CSR):
– Relate T(Φ’) to CAT(Φ) (logical type ⇔ syntactic category)
– Ex: THING ⇔ N, EVENT ⇔ V

Divergence problem

• Translation divergences occur when there is an exception either to the GLR or to the CSR (or to both) in one language, but not in the other.
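The definition can be made concrete with a small check. In this sketch the GLR and CSR are lookup tables (the CSR table holds only the two pairs given on the slide), a realization is a list of tuples, and the function names and tuple layout are assumptions made for illustration.

```python
# Default linking (GLR) and the two CSR pairs listed on the slide.
GLR = {"X'": "X", "W'": "W", "Z'": "Z", "Q'": "Q"}
CSR = {"THING": "N", "EVENT": "V"}


def exceptions(realization):
    """List GLR/CSR exceptions in one language's realization.

    realization: list of (logical position, logical type,
                          syntactic position, syntactic category).
    """
    found = []
    for lpos, ltype, spos, cat in realization:
        if GLR[lpos] != spos:
            found.append(("GLR", lpos, spos))
        if ltype in CSR and CSR[ltype] != cat:
            found.append(("CSR", ltype, cat))
    return found


def is_divergent(source, target):
    """True when exactly one of the two languages has an exception."""
    return bool(exceptions(source)) != bool(exceptions(target))


# "I like Mary": subject and object follow the default linking.
english = [("W'", "THING", "W", "N"), ("Z'", "THING", "Z", "N")]
# "Maria me gusta": the logical subject is internal, the logical argument external.
spanish = [("W'", "THING", "Z", "N"), ("Z'", "THING", "W", "N")]

print(exceptions(english))
print(exceptions(spanish))
print(is_divergent(english, spanish))
```

Note that `is_divergent` returns False when both languages have exceptions, which follows the slide's wording literally; whether that case should count is left open by the definition.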

Outline

• Formal definition of translation divergence

• Seven types of divergence

• Discussion

• Remaining questions

T1: Thematic divergence

• The repositioning of arguments w.r.t. a head.

• GLR: W’ ⇔ Z and Z’ ⇔ W

• Example: I like Mary ⇔ María me gusta

:INT and :EXT

General Solution

T2: Promotional Divergence

• Promoting a logical modifier into a main verb position (or vice versa)

• GLR: X’ ⇔ Z and Q’ ⇔ X

• Ex: John usually goes home ⇔ Juan suele ir a casa

:PROMOTE

General Solution

T3: Demotional Divergence

• Demoting a logical head into an internal argument (adjunct?) position (or vice versa).

• GLR: X’ ⇔ Q and Z’ ⇔ X

• Ex: I like to eat ⇔ Ich esse gern

:DEMOTE

General Solution

T4: Structural divergence

• It does not alter the positions used in the GLR mapping

• But it changes the nature of the relation between different positions (i.e., the “⇔” correspondence)

• Ex: John entered the house ⇔ Juan entró en la casa

* marker

Marker forces logical constituents to be realized compositionally at different levels

General solution

T5: Conflational Divergence

• The suppression of a CLCS constituent (or the inverse of the process)

• GLR: the correspondence in step (3) (Z’ ⇔ Z) or step (4) (Q’ ⇔ Q) is changed.

• Example: I stabbed John ⇔ Yo le di puñaladas a Juan

:CONFLATED

General solution

T6: Categorial divergence

• CAT(Φ) is different from CSR(T(Φ’)).

• Ex: I am hungry ⇔ Ich habe Hunger

:CAT

General solution

T7: Lexical divergence

• As a side effect of other divergences.

• Ex: John broke into the room ⇔ Juan forzó la entrada al cuarto

Summary of seven types

• Repositioning (GLR mappings): thematic, promotional, demotional divergences

• Changing correspondence: structural, conflational divergences

• Category: categorial divergence

• ??: Lexical divergence
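For the repositioning group, the three GLR patterns from T1 to T3 are enough to label a linking. The sketch below is an illustration only: the table and the function name `classify` are assumptions, and a linking is given as a plain dict from logical to syntactic positions.

```python
# GLR overrides that characterise each repositioning divergence (T1-T3).
REPOSITIONING = {
    "thematic":    {"W'": "Z", "Z'": "W"},
    "promotional": {"X'": "Z", "Q'": "X"},
    "demotional":  {"X'": "Q", "Z'": "X"},
}


def classify(linking):
    """Return the repositioning types whose pattern is fully present in linking."""
    return [name for name, pattern in REPOSITIONING.items()
            if all(linking.get(pos) == target for pos, target in pattern.items())]


gustar = {"X'": "X", "W'": "Z", "Z'": "W"}   # I like Mary / Maria me gusta
soler = {"W'": "W", "X'": "Z", "Q'": "X"}    # usually / soler
gern = {"W'": "W", "X'": "Q", "Z'": "X"}     # like / gern
default = {"X'": "X", "W'": "W", "Z'": "Z", "Q'": "Q"}

for name, linking in [("gustar", gustar), ("soler", soler),
                      ("gern", gern), ("default", default)]:
    print(name, classify(linking))
```

Structural, conflational, categorial, and lexical divergences do not change these position pairs, so they would need additional information (markers, categories) that this table does not carry.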

Discussion

• Limits on Repositioning Divergences

• Promotional vs. Demotional Divergences

• Lexical Selection: Full Coverage Constraint

• Interacting Divergence Types

Limits on Repositioning divergences

• Three types to cover all repositioning divergences:
– Thematic: W’ ⇔ Z, Z’ ⇔ W
– Promotional: X’ ⇔ Z, Q’ ⇔ X
– Demotional: X’ ⇔ Q, Z’ ⇔ X

• (X, W, Z, Q) ⇔ (X’, W’, Z’, Q’)
– W has a special status: 4^4 = 256 → 3^3 = 27
– a CLCS must contain exactly one head: 3^3 = 27 → 12

Limits on Repositioning Divergences (cont)

• Z can never be associated with Q’, and Q can never be associated with Z’: 12 → 5

• Modifying relation cannot be reversed: 5 → 4 (Q’ ⇔ X, X’ ⇔ Q, Z’ ⇔ Z)

• Argument relation cannot be reversed: 4 → 3 (Z’ ⇔ X, X’ ⇔ Z, Q’ ⇔ Q)

• Canonical positions: 3 → 2
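The counting argument can be replayed by brute force. The sketch assumes that each candidate is a function from logical positions to syntactic positions; that reading is an assumption made here, but it reproduces the sequence 256, 27, 12, 5, 4, 3, 2 from the two slides.

```python
from itertools import product

# All assignments of the four logical positions to the four syntactic ones.
all_maps = list(product(("X", "W", "Z", "Q"), repeat=4))

# W has a special status, so set W'/W aside and enumerate the remaining three.
logical = ("X'", "Z'", "Q'")
maps = [dict(zip(logical, targets))
        for targets in product(("X", "Z", "Q"), repeat=3)]

# Exactly one logical constituent may occupy the syntactic head X.
one_head = [m for m in maps if list(m.values()).count("X") == 1]

# Z is never associated with Q', and Q is never associated with Z'.
no_zq = [m for m in one_head if m["Z'"] != "Q" and m["Q'"] != "Z"]

# The modifying relation cannot be reversed.
reversed_mod = {"Q'": "X", "X'": "Q", "Z'": "Z"}
step4 = [m for m in no_zq if m != reversed_mod]

# The argument relation cannot be reversed.
reversed_arg = {"Z'": "X", "X'": "Z", "Q'": "Q"}
step3 = [m for m in step4 if m != reversed_arg]

# The canonical linking is not a divergence.
canonical = {"X'": "X", "Z'": "Z", "Q'": "Q"}
remaining = [m for m in step3 if m != canonical]

print(len(all_maps), len(maps), len(one_head),
      len(no_zq), len(step4), len(step3), len(remaining))
for m in remaining:
    print(m)
```

The two survivors are the promotional pattern (X’ to Z, Q’ to X) and the demotional pattern (X’ to Q, Z’ to X); the thematic case involves W and is handled separately once W is set aside.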

Promotional vs. Demotional Divergences

• Promotion is triggered by a main verb (e.g., soler in soler-usually)

• Demotion is triggered by an adverb (e.g., gern in like-gern)

Interacting Divergence Types

• Promotional and thematic divergence:

S: Leer libros le suele gustar a Juan

‘reading books (him) tends to please (to) John’

E: John usually likes reading books

Remaining questions

Remaining questions: Interlingua

• How to build an RLCS?
– What are the logical head, subject, arguments and modifiers? Ex: like vs. likingly
– How to represent a verb: stab as CAUSE GO_Poss KNIFE-WOUND

• How are RLCSs combined to form CLCSs?
– Unification = substitution?

• Are CLCSs really sufficient to handle all the languages?

Remaining issues: divergences

• Are the seven types really sufficient to cover all the divergences?
– Is the “proof” for limits on repositioning divergences convincing?
– “Translation divergences occur when there is an exception to GLR/CSR in one language, but not the other”: what if there are exceptions in both languages?
– Can a dependent of X become a dependent of Y?

Remaining issues: MT

• How to build a real MT system with this approach?