Post on 14-Dec-2015

Applying Machine Translation Metrics to Student-Written Translations

Lisa N. Michaud
Computer Science Department
Merrimack College
North Andover, Massachusetts, USA

Patricia Ann McCoy
Language Department
Universidad de las Americas Puebla
Puebla, Mexico

Michaud and McCoy3

Criteria for judging translations

fluency (is it well-formed?)

fidelity (does it convey original meaning?)

(Hovy et al., 2002)

Multiplicity of translations

In each one of these jobs the professor could have agreed to work 6 hours a day and therefore would not be surpassing the working day hour limit.

In each one of these jobs the teacher could have agreed to work 6 hours per day and therefore he wouldn't be bound by the limits of the working day.

In each of these examples the teaching could have been arranged so that he/she works six hours a day and would not be affected by any workday limitations.

In both of these jobs the professor could have agreed to work six hours daily and therefore he wouldn't be affecting his work shift limit.

BLEU

Hypothesis

Multiple References
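As a rough illustration of how BLEU compares one hypothesis against multiple references, the sketch below computes clipped n-gram precision, one component of the metric. It assumes whitespace tokenization; the function names are hypothetical and not from the slides. Full BLEU also combines several n-gram orders and applies a brevity penalty, both omitted here.

```python
from collections import Counter


def ngrams(tokens, n):
    """Return the list of n-grams (as tuples) in a token sequence."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def clipped_precision(hypothesis, references, n):
    """Clipped n-gram precision of a hypothesis against several references.

    Each hypothesis n-gram is credited at most as many times as it
    appears in the single reference that contains it most often.
    Tokenization by whitespace is an assumption of this sketch.
    """
    hyp_counts = Counter(ngrams(hypothesis.split(), n))
    if not hyp_counts:
        return 0.0
    # For every n-gram, remember its highest count in any one reference.
    max_ref = Counter()
    for ref in references:
        for gram, count in Counter(ngrams(ref.split(), n)).items():
            max_ref[gram] = max(max_ref[gram], count)
    matched = sum(min(count, max_ref[gram]) for gram, count in hyp_counts.items())
    return matched / sum(hyp_counts.values())
```

Having several references matters because an n-gram counts as matched if any one of them contains it, which is how the metric tolerates multiple valid translations of the same sentence.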

TERp

Hypothesis

Single Reference

PHRASAL EQUIVALENCE or SYNONYM

SHIFT

SUBSTITUTION

INSERTION

SAME STEM
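To make the idea of an edit-based score against a single reference concrete, here is a deliberately simplified sketch: a word-level edit distance (insertions, deletions, substitutions) divided by the reference length. This is not TERp itself. It omits shifts and the stem, synonym, and phrasal-equivalence matches named above, and the function names are hypothetical.

```python
def word_edit_distance(hyp_tokens, ref_tokens):
    """Minimum number of word insertions, deletions, and substitutions
    needed to turn hyp_tokens into ref_tokens (classic dynamic program)."""
    prev = list(range(len(ref_tokens) + 1))
    for i, hyp_word in enumerate(hyp_tokens, 1):
        curr = [i]
        for j, ref_word in enumerate(ref_tokens, 1):
            cost = 0 if hyp_word == ref_word else 1
            curr.append(min(prev[j] + 1,          # delete a hypothesis word
                            curr[j - 1] + 1,      # insert a reference word
                            prev[j - 1] + cost))  # match or substitute
        prev = curr
    return prev[-1]


def simple_edit_rate(hypothesis, reference):
    """Edits per reference word; 0.0 means the sentences are identical.
    Lower is better, so this is an error rate rather than a grade."""
    hyp, ref = hypothesis.split(), reference.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    return word_edit_distance(hyp, ref) / len(ref)
```

For example, `simple_edit_rate("the dog sat", "the cat sat down")` needs one substitution and one insertion over four reference words, giving 0.5.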

TERp alignment and tags

Student translation corpus

Number of Subjects: 13
Native English Speakers: 3
Native Spanish Speakers: 10
Number of Articles Translated: 11
Avg Number of Sentences per Article: 28
Total Translated Sentences: 2,982

Does TERp agree with an expert?

Instructor Scores vs Inverted TERp

650 sentences (22%)

Pearson Correlation: r = 0.232236
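The comparison can be sketched as follows. The slides do not say how the TERp score was inverted, so this example assumes `1 - score` (turning an error rate into a higher-is-better value); that choice and the function names are hypothetical. The correlation itself is the standard Pearson formula.

```python
from math import sqrt


def invert_terp(score):
    """Assumed inversion: map an error rate to a higher-is-better score."""
    return 1.0 - score


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(var_x * var_y)
```

Inversion only flips the sign of the correlation: a metric that tracked instructor grades perfectly would give r = -1 before inversion and r = 1 after, so a value near 0.23 indicates weak agreement either way.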

Score distribution

[Figure: histogram of Number of Sentences Receiving Grade (y-axis, 0 to 300) by Assigned Grade Decile (x-axis, 0 to 100), with one series for TERp-A and one for Instructor.]

Instructor rubric (original)

Conveys Original Meaning: 55%
Written in Natural Language: 20%
Uses Appropriate Vocabulary: 10%
Written in Accurate Language: 15%

10: Excellent
9: Good
8: Satisfactory
0-7: Deficient
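One way to read the rubric is as a weighted average of per-criterion scores. The sketch below assumes each criterion is scored on the 0-10 scale and that fractional totals are labeled by the band they reach; neither assumption is stated in the slides, and the names are hypothetical.

```python
ORIGINAL_WEIGHTS = {
    "Conveys Original Meaning": 55,
    "Written in Natural Language": 20,
    "Uses Appropriate Vocabulary": 10,
    "Written in Accurate Language": 15,
}


def weighted_rubric_score(scores, weights=ORIGINAL_WEIGHTS):
    """Weighted average of per-criterion scores (assumed 0-10 each)."""
    if set(scores) != set(weights):
        raise ValueError("scores must cover exactly the rubric criteria")
    total_weight = sum(weights.values())
    return sum(weights[c] * scores[c] for c in weights) / total_weight


def grade_label(score):
    """Map a 0-10 score to the rubric's labels (banding is an assumption)."""
    if score >= 10:
        return "Excellent"
    if score >= 9:
        return "Good"
    if score >= 8:
        return "Satisfactory"
    return "Deficient"
```

With scores of 10, 8, 9, and 8 on the four criteria in the order listed, the weighted result is 9.2, which this sketch labels "Good".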

Evaluating TERp tags (pilot)

TERp tag             Precision   Recall
Phrase equivalence   83%         68%
Stemming             100%        75%
Synonymy             89%         65%
Shifts               92%         89%
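For readers less familiar with the two measures, a minimal sketch follows. It assumes the pilot counted, for each tag type, the tags TERp assigned correctly, the tags it assigned incorrectly, and the relations a human judge found that TERp missed; the slides do not describe the counting procedure, and the counts in the usage note are illustrative only.

```python
def precision_recall(true_positives, false_positives, false_negatives):
    """Return (precision, recall) from raw counts.

    precision: share of assigned tags that were correct
    recall:    share of real instances that received the tag
    """
    predicted = true_positives + false_positives
    actual = true_positives + false_negatives
    precision = true_positives / predicted if predicted else 0.0
    recall = true_positives / actual if actual else 0.0
    return precision, recall
```

For instance, `precision_recall(3, 0, 1)` returns `(1.0, 0.75)`: every assigned tag was right, but one real instance in four went untagged. High precision with lower recall is the same shape as the Stemming row above.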

Future work

Instructor rubric (revised)

Uses Grammatical Language: 50%
Conveys Original Meaning: 50%

100: Excellent
90: Good
80: Satisfactory
0-70: Deficient

Modifying the TERp Score

Hypothesis

Single Reference

PHRASAL EQUIVALENCE or SYNONYM

SHIFT

SUBSTITUTION

INSERTION

SAME STEM

Recognizing false cognates

Hypothesis

Single Reference

SUBSTITUTION

cynical

brazen

Source: cínico
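One possible heuristic for this case, offered only as a sketch: flag a substitution when the student's word is spelled much like the source word while the reference word is not. The similarity measure (Python's `difflib.SequenceMatcher` on accent-stripped strings), the 0.5 threshold, and the function names are all assumptions of this example rather than the method proposed in the slides.

```python
import unicodedata
from difflib import SequenceMatcher


def strip_accents(word):
    """Remove combining accent marks, e.g. 'cínico' -> 'cinico'."""
    decomposed = unicodedata.normalize("NFD", word)
    return "".join(ch for ch in decomposed if unicodedata.category(ch) != "Mn")


def looks_like_cognate(source_word, english_word, threshold=0.5):
    """True if the two words are orthographically similar (assumed cutoff)."""
    a = strip_accents(source_word.lower())
    b = strip_accents(english_word.lower())
    return SequenceMatcher(None, a, b).ratio() >= threshold


def is_possible_false_cognate(source_word, hypothesis_word, reference_word,
                              threshold=0.5):
    """Flag a substitution whose hypothesis word resembles the source word
    while the reference word does not."""
    return (hypothesis_word.lower() != reference_word.lower()
            and looks_like_cognate(source_word, hypothesis_word, threshold)
            and not looks_like_cognate(source_word, reference_word, threshold))
```

Under these assumptions, `is_possible_false_cognate("cínico", "cynical", "brazen")` returns `True`. A spelling-based check like this would still need a dictionary or human review to separate false cognates from true ones.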

Extracting mistranslation pairs

SPANISH DICTIONARY

ENGLISH DICTIONARY

cynical

cínico

cynical brazen

zona zone