CS 479, section 1: Natural Language Processing


Lecture #35: Word Alignment Models (cont.)

This work is licensed under a Creative Commons Attribution-Share Alike 3.0 Unported License.

Content by Eric Ringger, partially based on earlier slides from Dan Klein of U.C. Berkeley.

Announcements

Project #4: Your insights into treebank grammars?

Project #5: Model 2, discussed today!

Propose-your-own: Reminder: no presentation, unless you really want to give one!

Check the schedule. Plan enough time to succeed! Don't get or stay blocked. Get your questions answered early. Get the help you need to keep moving forward. No late work accepted after the last day of instruction.

Announcements (2)

Project Report: Early: Wednesday; Due: Friday

Homework 0.4: Due today

Reading Report #14 (phrase-based MT paper): Due next Monday (online again)

EM Revisited

1. What are the four steps of the Expectation Maximization (EM) algorithm? Think of document clustering and/or training IBM Model 1!

2. What are the two primary purposes of EM?

Objectives

Observe problems with IBM Model 1

Model ordering issues, arriving at IBM Model 2!

“Monotonic Translation”

Le Japon secoué par deux nouveaux séismes

Japan shaken by two new quakes

(NULL is an extra English token; French words with no English counterpart align to it.)

How would you implement a monotone decoder (to translate the French)?

MT System

You could now build a simple MT system using:

An English language model

An English-to-French alignment model (IBM Model 1), trained on Canadian Hansard data

A monotone decoder: greedy or Viterbi (a greedy sketch follows below)
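Here is a minimal sketch of the greedy variant. The names `t_table` (holding $t(f \mid e)$), `bigram_lm` (holding $P(e \mid e_{\text{prev}})$), and `e_vocab` (the English vocabulary) are assumed data structures for illustration, not the project's required interface.

    import math

    def greedy_monotone_decode(french, t_table, bigram_lm, e_vocab):
        """Greedy monotone decoding: walk the French sentence left to
        right and pick, for each French word, the English word that
        maximizes the channel score t(f|e) times the bigram LM score."""
        english = []
        prev = "<s>"
        for f in french:
            best_e, best_score = None, float("-inf")
            for e in e_vocab:
                t = t_table.get((f, e), 0.0)         # t(f | e) from Model 1
                if t == 0.0:
                    continue                         # e cannot produce f
                lm = bigram_lm.get((prev, e), 1e-9)  # smoothed P(e | prev)
                score = math.log(t) + math.log(lm)
                if score > best_score:
                    best_e, best_score = e, score
            english.append(best_e if best_e is not None else f)  # pass unknowns through
            prev = english[-1]
        return english

A Viterbi variant would instead keep, at each French position, the best running score for each possible previous English word, and recover the translation by backtracing.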

IBM Model 1

[Alignment figure: source (English) and target (French) sentences linked by alignment variables $a_1, \ldots, a_J$, one per target position.]

Model 1 scores a target sentence $f$ and an alignment $a$ together as

$$\hat{P}(f, a \mid e) = \prod_{j=1}^{J} \frac{1}{I+1}\; t(f_j \mid e_{a_j})$$

and sums out the alignments to score the sentence pair:

$$\hat{P}(f \mid e) = \sum_{a} \hat{P}(f, a \mid e)$$
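Since training this model comes up repeatedly (see the EM questions above), here is a compact EM sketch for estimating the Model 1 translation table; the corpus layout and names are assumptions for illustration.

    from collections import defaultdict

    def train_model1(corpus, iterations=10):
        """EM for IBM Model 1: estimate t(f|e) from sentence pairs.
        corpus: list of (french_words, english_words) pairs; each
        English sentence is treated as if prefixed with NULL."""
        f_vocab = {f for fr, _ in corpus for f in fr}
        t = defaultdict(lambda: 1.0 / len(f_vocab))  # uniform start
        for _ in range(iterations):
            count = defaultdict(float)  # expected counts C(f, e)
            total = defaultdict(float)  # normalizers, one per e
            # E-step: accumulate expected alignment counts under t.
            for fr, en in corpus:
                en = ["NULL"] + en
                for f in fr:
                    z = sum(t[(f, e)] for e in en)
                    for e in en:
                        p = t[(f, e)] / z  # posterior P(a_j = i | f, e)
                        count[(f, e)] += p
                        total[e] += p
            # M-step: renormalize expected counts into probabilities.
            t = defaultdict(float,
                            {(f, e): c / total[e]
                             for (f, e), c in count.items()})
        return t

Even a toy corpus such as `[("le chien".split(), "the dog".split()), ("le chat".split(), "the cat".split())]` lets EM tease apart which word produces which over a few iterations.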

One-to-Many Alignments

But there are other problems to think about, as the following examples show:

Problem: Many-to-One Alignments

Problem: Many-to-Many Alignments

Problem: Local Order Change

Le Japon est au confluent de quatre plaques tectoniques

Japan is at the junction of four tectonic plates

“Distortions”

Problem: More Distortions

Le tremblement de terre a fait 39 morts et 3,183 blessés.

The earthquake killed 39 and wounded 3,183.

Insights

How to include “distortion” in the model?

How to prefer nearby distortions over long-distance distortions?

IBM Model 2

Reminder: Model 1 treats every position as equally likely for every alignment link.

We could model distortions, without any strong assumptions about where they occur, as a distribution over target-language positions. Alternatively, we could build the model as a distribution over distortion distances. Both options are written out below.
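In symbols, with $a_j$ the position that French position $j$ links to (the table form is standard Model 2; the particular distance form shown is an assumed common choice):

$$P(a_j = i \mid j, I, J) = P_d(i \mid j, I, J) \quad \text{(a full table over positions)}$$

$$P(a_j = i \mid j, I, J) \propto P_d\!\left(i - \left\lfloor j\,\tfrac{I}{J} \right\rfloor\right) \quad \text{(a function of distance to the diagonal)}$$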

Matrix View of an Alignment

Preference for the Diagonal

But alignments for some language pairs tend toward the diagonal in general: we can use a normal distribution for the distortion model.
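Concretely, one such choice (illustrative; $\sigma$ controls how tightly alignments hug the diagonal):

$$P_d(i \mid j, I, J) \propto \exp\!\left(-\frac{\left(i - j\,\tfrac{I}{J}\right)^2}{2\sigma^2}\right)$$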

EM for Model 2

Model 2 parameters:

Translation probabilities: $t(f \mid e)$

Distortion parameters: $P_d(i \mid j, I, J)$

Initialize $t(f \mid e)$ with Model 1; initialize $P_d$ as uniform.

E-step: For each pair of sentences $(f, e)$, and for each French position $j = 1, \ldots, J$:

1. Calculate the posterior over English positions $i$:

$$P(a_j = i \mid f, e) = \frac{P_d(i \mid j, I, J)\; t(f_j \mid e_i)}{\sum_{i'=0}^{I} P_d(i' \mid j, I, J)\; t(f_j \mid e_{i'})}$$

2. Increment the count of word $f_j$ with word $e_i$ by this amount:

$$C(f_j, e_i) \mathrel{+}= P(a_j = i \mid f, e)$$

3. Similarly, for each English position $i$, update the distortion count:

$$C(i \mid j, I, J; f, e) \mathrel{+}= P(a_j = i \mid f, e)$$

EM for Model 2 (cont.)

M-step:

Re-estimate $P_d(i \mid j, I, J)$ by normalizing the distortion counts: one conditional distribution for each context $(j, I, J)$.

Re-estimate $t(f \mid e)$ by normalizing the earlier word-pair counts: one conditional distribution per English word $e$.

Iterate until the data likelihood converges, or for a handful of iterations.

See the directions for Project #5 on the course wiki for a more detailed version of this EM algorithm, including implementation tips.
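To make the bookkeeping concrete, here is a sketch of one EM iteration under the notation above. It is illustrative only (the wiki directions are authoritative), and the `.get` fallbacks are assumptions so the sketch runs even with partially specified tables.

    from collections import defaultdict

    def em_iteration_model2(corpus, t, d):
        """One EM iteration for IBM Model 2.
        corpus: list of (french_words, english_words) pairs.
        t[(f, e)] = t(f|e); d[(i, j, I, J)] = P_d(i | j, I, J).
        English position i = 0 stands for the NULL token."""
        c_t = defaultdict(float); z_t = defaultdict(float)  # word-pair counts
        c_d = defaultdict(float); z_d = defaultdict(float)  # distortion counts
        for fr, en in corpus:
            en = ["NULL"] + en
            I, J = len(en) - 1, len(fr)
            for j, f in enumerate(fr, start=1):
                # Step 1: posterior over English positions for f_j.
                scores = [d.get((i, j, I, J), 1.0 / (I + 1)) *
                          t.get((f, en[i]), 1e-9) for i in range(I + 1)]
                z = sum(scores)
                for i in range(I + 1):
                    p = scores[i] / z
                    # Step 2: expected word-pair counts C(f_j, e_i).
                    c_t[(f, en[i])] += p; z_t[en[i]] += p
                    # Step 3: expected distortion counts C(i | j, I, J).
                    c_d[(i, j, I, J)] += p; z_d[(j, I, J)] += p
        # M-step: normalize both families of counts.
        t_new = {(f, e): c / z_t[e] for (f, e), c in c_t.items()}
        d_new = {k: c / z_d[k[1:]] for k, c in c_d.items()}
        return t_new, d_new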

Next

Even better alignment models

Evaluating alignment models

Evaluating translation end-to-end!