Automated Translation of Multi-word Expressions Application in English-Latvian SMT - poster booster

Automated Translation of Multi-word Expressions: Application in English-Latvian SMT. Prof. Inguna Skadiņa (1) and Matīss Rikters (2). (1, 2) University of Latvia, 19 Raina Blvd., Riga, Latvia; (1) Institute of Mathematics and Computer Science, 29 Raina Blvd., Riga, Latvia. 2nd PARSEME Training School, La Rochelle, France, June 27, 2016.

Transcript of Automated Translation of Multi-word Expressions Application in English-Latvian SMT - poster booster

Page 1: Automated Translation of Multi-word Expressions Application in English-Latvian SMT - poster booster

Automated Translation of Multi-word Expressions:

Application in English-Latvian SMT

Prof. Inguna Skadiņa (1) and Matīss Rikters (2)

(1, 2) University of Latvia, 19 Raina Blvd., Riga, Latvia
(1) Institute of Mathematics and Computer Science, 29 Raina Blvd., Riga, Latvia

2nd PARSEME Training School, La Rochelle, France

June 27, 2016

Page 2: Automated Translation of Multi-word Expressions Application in English-Latvian SMT - poster booster

General schema of experiments

Page 3: Automated Translation of Multi-word Expressions Application in English-Latvian SMT - poster booster

Data and Tools

• JRC Acquis corpus (v. 3.0):
  • 1 472 367 parallel sentences as training data
  • 1134 random sentences as development data
  • 1599 random sentences as test data
  • 64 290 multiword expressions

• Tools:
  • Moses toolkit (Koehn et al., 2007) for training the MT system
  • MPAligner (Pinnis, 2014) for alignment of multiword expressions
  • SRILM (Stolcke et al., 2011) for training a 5-gram language model
  • MERT (Och, 2003) for tuning the MT system
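
The data split listed above (1134 random development sentences and 1599 random test sentences) could be drawn along the following lines. This is only a minimal sketch: the slides do not say how the random selection was made, or whether the held-out sentences are disjoint from the training data. The function name `split_parallel_corpus`, the fixed seed, the toy corpus and the disjoint hold-out are all assumptions made for illustration.

```python
import random


def split_parallel_corpus(pairs, n_dev=1134, n_test=1599, seed=0):
    """Hold out random dev and test sets; everything else is training data.

    `pairs` is a list of (source, target) sentence pairs. The default sizes
    mirror the counts on the slide; the seed and the disjoint hold-out
    strategy are assumptions, not details taken from the slides.
    """
    if n_dev + n_test > len(pairs):
        raise ValueError("corpus is too small for the requested dev/test sizes")

    rng = random.Random(seed)
    indices = list(range(len(pairs)))
    rng.shuffle(indices)

    dev_idx = set(indices[:n_dev])
    test_idx = set(indices[n_dev:n_dev + n_test])

    train, dev, test = [], [], []
    for i, pair in enumerate(pairs):
        if i in dev_idx:
            dev.append(pair)
        elif i in test_idx:
            test.append(pair)
        else:
            train.append(pair)  # original corpus order is preserved
    return train, dev, test


# Toy stand-in for the English-Latvian corpus (hypothetical sentences).
toy = [(f"en sentence {i}", f"lv teikums {i}") for i in range(5000)]
train, dev, test = split_parallel_corpus(toy)
print(len(train), len(dev), len(test))  # 2267 1134 1599
```

Keeping source and target sentences together as pairs means the two sides of the corpus cannot drift out of alignment when the held-out sets are removed.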