The Web as a Parallel Corpus Philip Resnik, Noah A. Smith, Computational Linguistics, 29, 3, pp. 349…


Introduction

Parallel corpora (bitexts):
– are used for automatic lexical acquisition (Gale and Church 1991; Melamed 1997)
– provide indispensable training data for statistical translation models (Brown et al. 1990; Melamed 2000; Och and Ney 2002)
– provide the connection between vocabularies in cross-language information retrieval (Davis and Dunning 1995; Landauer and Littman 1990; see also Oard 1997)