CS460/626 : Natural Language Processing/Speech, NLP and the Web
(Lecture 18 – Alignment in SMT and Tutorial on Giza++ and Moses)
Pushpak Bhattacharyya, CSE Dept., IIT Bombay
15th Feb, 2011
Going forward from word alignment
Word alignment → Phrase alignment (going to bigger units of correspondence) → Decoding (best possible translation)
Abstract Problem
Given: e0 e1 e2 e3 … en en+1 (Entities)
Goal: l0 l1 l2 l3 … ln ln+1 (Labels)
The goal is to find the best possible label sequence:
L* = argmax_L P(L | E)
Generative Model
argmax_L P(L | E) = argmax_L P(L) · P(E | L)
Simplification
Using the Markov assumption, the language model can be represented using bigrams:
P(L) = Π_{i=0}^{n} P(l_{i+1} | l_i)
Similarly, the translation model can be represented in the following way:
P(E | L) = Π_{i=0}^{n} P(e_i | l_i)
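As a concrete illustration of the decomposition argmax_L P(L) · P(E | L), here is a minimal sketch that scores every candidate label sequence with a bigram language model and a per-word translation model. All entities, labels, and probabilities below are invented toy values, not from any trained model:

```python
from itertools import product

# Toy bigram language model P(l_{i+1} | l_i) and translation model P(e_i | l_i).
# "^" is the start-of-sequence marker; all numbers are hypothetical.
bigram_lm = {("^", "N"): 0.6, ("^", "V"): 0.4,
             ("N", "V"): 0.7, ("N", "N"): 0.3,
             ("V", "N"): 0.3, ("V", "V"): 0.3}
trans = {("dog", "N"): 0.8, ("dog", "V"): 0.1,
         ("barks", "N"): 0.2, ("barks", "V"): 0.7}

def score(labels, entities):
    """P(L) * P(E|L) under the bigram + word-independence decomposition."""
    p = 1.0
    prev = "^"
    for e, l in zip(entities, labels):
        p *= bigram_lm.get((prev, l), 1e-9) * trans.get((e, l), 1e-9)
        prev = l
    return p

entities = ["dog", "barks"]
best = max(product("NV", repeat=2), key=lambda L: score(L, entities))
print(best)  # the argmax label sequence over all candidates
```

Brute-force enumeration over label sequences is only feasible for toy sizes; the Viterbi algorithm later in the lecture does the same argmax efficiently.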
Statistical Machine Translation
Finding the best possible English sentence given the foreign sentence:
E* = argmax_E P(E | F) = argmax_E P(E) · P(F | E)
P(E) = Language Model
P(F | E) = Translation Model
E: English, F: Foreign Language
Problems in the framework
Labels are words of the target language, and they are very large in number.
Example (preposition stranding):
Who do you want to_go with ?
With whom do you want to go ?
आप किस के_साथ जाना चाहते_हो (Aap kis ke_sath jaana chahate_ho)
Place a column of candidate target-language words on each source-language word:
^ Aap kis ke_sath jaana chahate_ho .
(under each source word: who, do, you, want, to_go, with, and so on)
Find the best possible path from '^' to '.' using transition and observation probabilities.
Viterbi can be used.
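A minimal Viterbi sketch over such a lattice, applied to the Hindi example above. The transition and observation probabilities here are hypothetical (one-hot emissions and uniform transitions, so the example stays tiny); a real model would have soft distributions over many candidates per column:

```python
def viterbi(obs, states, trans, emit, start="^"):
    """Best label path for obs, using transition and observation probabilities."""
    # V[s] = probability of the best path ending in state s at the current position.
    V = {s: trans.get((start, s), 0.0) * emit.get((s, obs[0]), 0.0) for s in states}
    back = []  # backpointers, one dict per position after the first
    for o in obs[1:]:
        step, V_new = {}, {}
        for s in states:
            prev = max(states, key=lambda p: V[p] * trans.get((p, s), 0.0))
            step[s] = prev
            V_new[s] = V[prev] * trans.get((prev, s), 0.0) * emit.get((s, o), 0.0)
        back.append(step)
        V = V_new
    path = [max(states, key=V.get)]
    for step in reversed(back):       # follow backpointers to recover the path
        path.append(step[path[-1]])
    return path[::-1]

states = ["who", "do", "you", "want", "to_go", "with"]
obs = ["Aap", "kis", "ke_sath", "jaana", "chahate_ho"]
# One-hot observation probabilities P(source word | target word) -- hypothetical.
emit = {("you", "Aap"): 1.0, ("who", "kis"): 1.0, ("with", "ke_sath"): 1.0,
        ("to_go", "jaana"): 1.0, ("want", "chahate_ho"): 1.0}
# Uniform transitions, including from the start marker "^".
trans = {(a, b): 0.2 for a in states + ["^"] for b in states}
print(viterbi(obs, states, trans, emit))
```

With one-hot emissions the best path simply picks each word's translation in source order; real transition probabilities are what let the decoder reorder and disambiguate.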
TUTORIAL ON Giza++ and Moses tools (delivered by Kushal Ladha)
Word-based alignment
For each word in the source language, align words from the target language that this word possibly produces.
Based on IBM models 1–5:
Model 1 – simplest
As we go from models 1 to 5, the models get more complex but more realistic.
This is all that Giza++ does.
Alignment
A function from target position to source position.
The alignment sequence is: 2, 3, 4, 5, 6, 6, 6
Alignment function A: A(1) = 2, A(2) = 3, …
A different alignment function will give the sequence 1, 2, 1, 2, 3, 4, 3, 4 for A(1), A(2), …
To allow spurious insertion, allow alignment with word 0 (NULL).
No. of possible alignments: (I+1)^J
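The (I+1)^J count follows because each of the J target positions independently picks one of the I source positions or NULL (position 0). A brute-force check with tiny invented lengths:

```python
from itertools import product

I, J = 2, 3  # toy source length I and target length J
# Each of the J target positions picks a source position in 0..I (0 = NULL),
# so an alignment is a J-tuple over I+1 choices.
alignments = list(product(range(I + 1), repeat=J))
print(len(alignments))  # equals (I+1)**J
```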
IBM Model 1: Generative Process
Training Alignment Models
Given a parallel corpus, for each (F, E) learn the best alignment A and the component probabilities:
t(f|e) for Model 1
lexicon probability P(f|e) and alignment probability P(a_i | a_{i-1}, I)
How to compute these probabilities if all you have is a parallel corpus?
Intuition: Interdependence of Probabilities
If you knew which words are probable translations of each other, then you could guess which alignment is probable and which one is improbable.
If you were given alignments with probabilities, then you could compute translation probabilities.
Looks like a chicken-and-egg problem.
The EM algorithm comes to the rescue.
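The EM loop for Model 1's t(f|e) can be sketched in a few lines. The three-sentence parallel corpus below is invented for illustration, t(f|e) is initialized uniformly, and the NULL word is omitted for brevity:

```python
from collections import defaultdict

# Invented toy parallel corpus: (English words, foreign words).
corpus = [(["the", "house"], ["la", "maison"]),
          (["the", "book"], ["le", "livre"]),
          (["a", "book"], ["un", "livre"])]

f_vocab = {f for _, fs in corpus for f in fs}
t = defaultdict(lambda: 1.0 / len(f_vocab))  # t(f|e), uniform initialization

for _ in range(10):  # EM iterations
    count = defaultdict(float)   # expected counts c(f, e)
    total = defaultdict(float)   # expected counts c(e)
    for es, fs in corpus:
        for f in fs:
            # E-step: distribute one count for f over the e's it may align to,
            # in proportion to the current t(f|e).
            z = sum(t[(f, e)] for e in es)
            for e in es:
                c = t[(f, e)] / z
                count[(f, e)] += c
                total[e] += c
    # M-step: re-estimate t(f|e) from the expected counts.
    for (f, e) in count:
        t[(f, e)] = count[(f, e)] / total[e]

print(round(t[("livre", "book")], 3), round(t[("le", "book")], 3))
```

Because "livre" co-occurs with "book" in both sentences that contain either word, the expected counts reinforce that pair on every iteration, resolving the chicken-and-egg circularity without ever seeing a gold alignment.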
Limitation: only 1→many alignments allowed.
Phrase-based alignment
More natural
Many-to-one mappings allowed
Giza++ and Moses Package
http://cl.naist.jp/~eric-n/ubuntu-nlp/
Select your Ubuntu version
Browse the nlp folder
Download the Debian packages of giza++, moses, mkcls, srilm
Resolve all the dependencies and they get installed
For alternate installation, refer to http://www.statmt.org/moses_steps.html
Steps
Input – sentence-aligned parallel corpus
Output – target-side tagged data
Training
Tuning
Generate output on test corpus (decoding)
Training
Create a folder named corpus containing the test, train and tuning files.
Giza++ is used to generate the alignment.
The phrase table is generated after training.
Before training, a language model needs to be built on the target side:

mkdir lm; /usr/bin/ngram-count -order 3 -interpolate -kndiscount -text $PWD/corpus/train_surface.hi -lm lm/train.lm;
/usr/share/moses/scripts/training/train-factored-phrase-model.perl -scripts-root-dir /usr/share/moses/scripts -root-dir . -corpus train.clean -e hi -f en -lm 0:3:$PWD/lm/train.lm:0;
Example (grapheme-to-phoneme parallel data)

train.en                     train.pr
h e l l o                    hh ah l ow
h e l l o                    hh eh l ow
w o r l d                    w er l d
c o m p o u n d w o r d      k aa m p aw n d w er d
h y p h e n a t e d          hh ay f ah n ey t ih d
o n e                        ow eh n iy
o n e                        ow eh n iy
b o o m                      b uw m
k w e e z l e b o t t e r    k w iy z l ah b aa t ah r
Sample from Phrase-table

b o ||| b aa ||| (0) (1) ||| (0) (1) ||| 1 0.666667 1 0.181818 2.718
b ||| b ||| (0) ||| (0) ||| 1 1 1 1 2.718
c o m p o ||| aa m p ||| (2) (0,1) (1) (0) (1) ||| (1,3) (1,2,4) (0) ||| 1 0.0486111 1 0.154959 2.718
c ||| p ||| (0) ||| (0) ||| 1 1 1 1 2.718
d w ||| d w ||| (0) (1) ||| (0) (1) ||| 1 0.75 1 1 2.718
d ||| d ||| (0) ||| (0) ||| 1 1 1 1 2.718
e b ||| ah b ||| (0) (1) ||| (0) (1) ||| 1 1 1 0.6 2.718
e l l ||| ah l ||| (0) (1) (1) ||| (0) (1,2) ||| 1 1 0.5 0.5 2.718
e l l ||| eh l ||| (0) (0) (1) ||| (0,1) (2) ||| 1 0.111111 0.5 0.111111 2.718
e l ||| eh ||| (0) (0) ||| (0,1) ||| 1 0.111111 1 0.133333 2.718
e ||| ah ||| (0) ||| (0) ||| 1 1 0.666667 0.6 2.718
h e ||| hh ah ||| (0) (1) ||| (0) (1) ||| 1 1 1 0.6 2.718
h ||| hh ||| (0) ||| (0) ||| 1 1 1 1 2.718
l e b ||| l ah b ||| (0) (1) (2) ||| (0) (1) (2) ||| 1 1 1 0.5 2.718
l e ||| l ah ||| (0) (1) ||| (0) (1) ||| 1 1 1 0.5 2.718
l l o ||| l ow ||| (0) (0) (1) ||| (0,1) (2) ||| 0.5 1 1 0.227273 2.718
l l ||| l ||| (0) (0) ||| (0,1) ||| 0.25 1 1 0.833333 2.718
l o ||| l ow ||| (0) (1) ||| (0) (1) ||| 0.5 1 1 0.227273 2.718
l ||| l ||| (0) ||| (0) ||| 0.75 1 1 0.833333 2.718
m ||| m ||| (0) ||| (0) ||| 1 0.5 1 1 2.718
n d ||| n d ||| (0) (1) ||| (0) (1) ||| 1 1 1 1 2.718
n e ||| eh n iy ||| (1) (2) ||| () (0) (1) ||| 1 1 0.5 0.3 2.718
n e ||| n iy ||| (0) (1) ||| (0) (1) ||| 1 1 0.5 0.3 2.718
n ||| eh n ||| (1) ||| () (0) ||| 1 1 0.25 1 2.718
o o m ||| uw m ||| (0) (0) (1) ||| (0,1) (2) ||| 1 0.5 1 0.181818 2.718
o o ||| uw ||| (0) (0) ||| (0,1) ||| 1 1 1 0.181818 2.718
o ||| aa ||| (0) ||| (0) ||| 1 0.666667 0.2 0.181818 2.718
o ||| ow eh ||| (0) ||| (0) () ||| 1 1 0.2 0.272727 2.718
o ||| ow ||| (0) ||| (0) ||| 1 1 0.6 0.272727 2.718
w o r ||| w er ||| (0) (1) (1) ||| (0) (1,2) ||| 1 0.1875 1 0.424242 2.718
w ||| w ||| (0) ||| (0) ||| 1 0.75 1 1 2.718
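Each phrase-table line consists of |||-separated fields. A small parsing sketch, assuming the layout shown in this sample (source phrase, target phrase, two alignment fields, then the score column):

```python
# One line in the style of the sample phrase table above.
line = "l o ||| l ow ||| (0) (1) ||| (0) (1) ||| 0.5 1 1 0.227273 2.718"

# Split on the ||| separator and strip the surrounding spaces.
src, tgt, align_st, align_ts, score_str = (f.strip() for f in line.split("|||"))
scores = [float(s) for s in score_str.split()]

print(src, "->", tgt)
print(scores)
```

The exact number and meaning of fields varies across Moses versions; splitting on "|||" and treating the last field as the score vector is the part that stays stable.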
Tuning
Not a compulsory step, but it will improve the decoding by a small percentage.

mkdir tuning; cp $WDIR/corpus/tun.en tuning/input; cp $WDIR/corpus/tun.hi tuning/reference;
/usr/share/moses/scripts/training/mert-moses.pl $PWD/tuning/input $PWD/tuning/reference /usr/bin/moses $PWD/model/moses.ini --working-dir $PWD/tuning --rootdir /usr/share/moses/scripts

It will take around 1 hour on a server with 32GB RAM.
Testing
mkdir evaluation; /usr/bin/moses -config $WDIR/tuning/moses.ini -input-file $WDIR/corpus/test.en >evaluation/test.output;
The output will be in the evaluation/test.output file.
Sample output:
h o t        hh aa t
h i          h|UNK hh
p h o n e    p|UNK hh ow eh n iy
b o o k      b uw k