CS460/626: Natural Language Processing/Speech, NLP and the Web
(Lecture 1: Introduction)

Pushpak Bhattacharyya
CSE Dept., IIT Bombay
4th Jan, 2011

Persons involved

Faculty instructor: Dr. Pushpak Bhattacharyya (www.cse.iitb.ac.in/~pb)
TAs: Joydip Datta, Debarghya Majumdar {joydip,deb}@cse
Course home page (to be created): www.cse.iitb.ac.in/~cs626-460-2011

Perspectivising NLP: Areas of AI and their inter-dependencies

Search, Logic, Knowledge Representation
Machine Learning, Planning
Expert Systems, NLP, Vision, Robotics

Books etc.

Main Text(s):
  Natural Language Understanding: James Allen
  Speech and NLP: Jurafsky and Martin
  Foundations of Statistical NLP: Manning and Schutze
Other References:
  NLP a Paninian Perspective: Bharati, Chaitanya and Sangal
  Statistical NLP: Charniak
Journals:
  Computational Linguistics, Natural Language Engineering, AI, AI Magazine, IEEE SMC
Conferences:
  ACL, EACL, COLING, MT Summit, EMNLP, IJCNLP, HLT, ICON, SIGIR, WWW, ICML, ECML

Allied Disciplines

Philosophy: Semantics, Meaning of “meaning”, Logic (syllogism)
Linguistics: Study of Syntax, Lexicon, Lexical Semantics etc.
Probability and Statistics: Corpus Linguistics, Testing of Hypotheses, System Evaluation
Cognitive Science: Computational Models of Language Processing, Language Acquisition
Psychology: Behavioristic insights into Language Processing, Psychological Models
Brain Science: Language Processing Areas in Brain
Physics: Information Theory, Entropy, Random Fields
Computer Sc. & Engg.: Systems for NLP

Topics proposed to be covered

Shallow Processing
  Part of Speech Tagging and Chunking using HMM, MEMM, CRF, and Rule Based Systems
  EM Algorithm
Language Modeling
  N-grams
  Probabilistic CFGs
Basic Speech Processing
  Phonology and Phonetics
  Statistical Approach
  Automatic Speech Recognition and Speech Synthesis
Deep Parsing
  Classical Approaches: Top-Down, Bottom-Up and Hybrid Methods
  Chart Parsing, Earley Parsing
  Statistical Approach: Probabilistic Parsing, Tree Bank Corpora

Topics proposed to be covered (contd.)

Knowledge Representation and NLP
  Predicate Calculus, Semantic Net, Frames, Conceptual Dependency, Universal Networking Language (UNL)
Lexical Semantics
  Lexicons, Lexical Networks and Ontology
  Word Sense Disambiguation
Applications
  Machine Translation
  IR
  Summarization
  Question Answering

Grading

Based on
  Midsem
  Endsem
  Assignments
  Paper-reading/Seminar
Except the first two everything else in groups of 4. Weightages will be revealed soon.

Definitions etc.

What is NLP

Branch of AI
2 Goals
  Science Goal: Understand the way language operates
  Engineering Goal: Build systems that analyse and generate language; reduce the man-machine gap

The famous Turing Test: Language Based Interaction

Test conductor
Machine
Human

Can the test conductor find out which is the machine and which the human?

Inspired Eliza

http://www.manifestation.com/neurotoys/eliza.php3


Inspired Eliza (another sample interaction)

A Sample of Interaction:

“What is it” question: NLP is concerned with Grounding

Ground the language into perceptual, motor and cognitive capacities.

Grounding

Chair
Computer

Two Views of NLP and the Associated Challenges

1. Classical View
2. Statistical/Machine Learning View

Stages of processing

Phonetics and phonology
Morphology
Lexical Analysis
Syntactic Analysis
Semantic Analysis
Pragmatics
Discourse

Phonetics

Processing of speech
Challenges
  Homophones: bank (finance) vs. bank (riverbank)
  Near Homophones: maatraa vs. maatra (hin)
  Word Boundary
    aajaayenge: aa jaayenge (will come) or aaj aayenge (will come today)
    I got [ua]plate
  Phrase boundary
    mtech1 students are especially exhorted to attend as such seminars are integral to one's post-graduate education
  Disfluency: ah, um, ahem etc.

Morphology

Word formation rules from root words
  Nouns: Plural (boy-boys); Gender marking (czar-czarina)
  Verbs: Tense (stretch-stretched); Aspect (e.g. perfective sit-had sat); Modality (e.g. request khaanaa khaaiie)
Crucial first step in NLP
Languages rich in morphology: e.g., Dravidian, Hungarian, Turkish
Languages poor in morphology: Chinese, English
Languages with rich morphology have the advantage of easier processing at higher stages of processing
A task of interest to computer science: Finite State Machines for Word Morphology
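To make the finite-state idea concrete, here is a minimal sketch of a machine that reads a word from its last character backwards and strips the two suffixes used in the examples above (plural -s, past -ed). The state names, the tags `+PL` and `+PAST`, and the function `analyse` are illustrative assumptions, not part of the lecture; a real morphological analyser needs a lexicon of stems and would not accept, say, "bed" as a past tense.

```python
# Transitions of a tiny finite-state machine over the REVERSED word.
# (state, character) -> next state. All names here are hypothetical.
TRANSITIONS = {
    ("start", "s"): "plural",   # word ends in "s"
    ("start", "d"): "seen_d",   # word ends in "d"
    ("seen_d", "e"): "past",    # word ends in "ed"
}

# Accepting states: the suffix they recognise and the tag they emit.
SUFFIXES = {"plural": ("s", "+PL"), "past": ("ed", "+PAST")}


def analyse(word):
    """Return (stem, tag); tag is empty when no suffix is recognised."""
    state = "start"
    for ch in reversed(word.lower()):
        state = TRANSITIONS.get((state, ch))
        if state is None or state in SUFFIXES:
            break
    if state in SUFFIXES:
        suffix, tag = SUFFIXES[state]
        return word[: -len(suffix)], tag
    return word, ""


for w in ["boys", "stretched", "boy"]:
    print(w, "->", analyse(w))
```

The machine has no knowledge of valid stems, so it over-analyses words that merely end in "s" or "ed"; pairing it with a stem lexicon is the usual remedy.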

Lexical Analysis

Essentially refers to dictionary access and obtaining the properties of the word
  e.g. dog
    noun (lexical property)
    take-’s’-in-plural (morph property)
    animate (semantic property)
    4-legged (-do-)
    carnivore (-do-)
Challenge: Lexical or word sense disambiguation
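Dictionary access can be pictured as a lookup that returns every entry stored for a word form. The sketch below is only an illustration: the `LEXICON` layout, the field names and the `lookup` function are assumptions, and the second entry (dog as a verb) is included to show why a lookup alone leaves the disambiguation challenge open.

```python
# Hypothetical lexicon: one word form maps to a list of entries.
LEXICON = {
    "dog": [
        {
            "pos": "noun",                       # lexical property
            "plural": "takes -s",                # morphological property
            "semantic": ["animate", "4-legged", "carnivore"],
        },
        {"pos": "verb", "gloss": "to pursue"},   # a second reading of the same form
    ],
}


def lookup(word):
    """Return all lexicon entries for a word (empty list if unknown)."""
    return LEXICON.get(word.lower(), [])


entries = lookup("Dog")
print(len(entries), "entries:", [e["pos"] for e in entries])
```

Because `lookup` returns both readings, a later stage has to choose between them using context.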

Lexical Disambiguation

First step: Part of Speech Disambiguation
  Dog as a noun (animal)
  Dog as a verb (to pursue)
Sense Disambiguation
  Dog (as animal)
  Dog (as a very detestable person)
Needs word relationships in a context
  The chair emphasised the need for adult education
Very common in day to day communications
  Satellite Channel Ad: Watch what you want, when you want (two senses of watch)
  e.g., Ground breaking ceremony/research

Technological developments bring in new terms, additional meanings/nuances for existing terms

Justify as in justify the right margin (word processing context)
Xeroxed: a new verb
Digital Trace: a new expression
Communifaking: pretending to talk on mobile when you are actually not
Discomgooglation: anxiety/discomfort at not being able to access internet
Helicopter Parenting: over parenting

Syntax Processing Stage

Structure Detection

[S [NP I] [VP [V like] [NP mangoes]]]

Parsing Strategy

Driven by grammar
  S -> NP VP
  NP -> N | PRON
  VP -> V NP | V PP
  N -> Mangoes
  PRON -> I
  V -> like
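As a rough illustration of grammar-driven parsing, the grammar above can be run through a small top-down (recursive descent) parser with backtracking. This is a sketch under assumptions: words are lower-cased, trees are nested tuples, and the function names are made up for the example. PP has no rule in the slide's grammar, so the `V PP` alternative simply never succeeds here.

```python
# The slide's grammar, with terminals lower-cased.
GRAMMAR = {
    "S": [["NP", "VP"]],
    "NP": [["N"], ["PRON"]],
    "VP": [["V", "NP"], ["V", "PP"]],
    "N": [["mangoes"]],
    "PRON": [["i"]],
    "V": [["like"]],
}


def parse(symbol, tokens, pos):
    """Yield (tree, next_pos) for each way `symbol` derives tokens[pos:next_pos]."""
    if symbol not in GRAMMAR:
        # A terminal, or a symbol without rules (PP): match one token literally.
        if pos < len(tokens) and tokens[pos] == symbol:
            yield symbol, pos + 1
        return
    for rhs in GRAMMAR[symbol]:
        yield from expand(rhs, tokens, pos, [symbol])


def expand(rhs, tokens, pos, node):
    """Match the remaining right-hand-side symbols left to right."""
    if not rhs:
        yield tuple(node), pos
        return
    for child, nxt in parse(rhs[0], tokens, pos):
        yield from expand(rhs[1:], tokens, nxt, node + [child])


def parse_sentence(sentence):
    """Return all trees for S that cover the whole sentence."""
    tokens = sentence.lower().split()
    return [tree for tree, end in parse("S", tokens, 0) if end == len(tokens)]


print(parse_sentence("I like mangoes"))
```

The grammar has no left-recursive rule, which is what lets this naive top-down strategy terminate.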

Challenges in Syntactic Processing: Structural Ambiguity

Scope
  1. The old men and women were taken to safe locations
     (old men and women) vs. ((old men) and women)
  2. No smoking areas will allow Hookas inside
Preposition Phrase Attachment
  I saw the boy with a telescope
    (who has the telescope?)
  I saw the mountain with a telescope
    (world knowledge: mountain cannot be an instrument of seeing)
  I saw the boy with the pony-tail
    (world knowledge: pony-tail cannot be an instrument of seeing)
Very ubiquitous: newspaper headline “20 years later, BMC pays father 20 lakhs for causing son’s death”

Structural Ambiguity…

Overheard
  I did not know my PDA had a phone for 3 months
An actual sentence in the newspaper
  The camera man shot the man with the gun when he was near Tendulkar
(P.G. Wodehouse, Ring in Jeeves) Jill had rubbed ointment on Mike the Irish Terrier, taken a look at the goldfish belonging to the cook, which had caused anxiety in the kitchen by refusing its ant’s eggs…
(Times of India, 26/2/08) Aid for kins of cops killed in terrorist attacks

Headache for Parsing: Garden Path sentences

Garden Pathing
  The horse raced past the garden fell.
  The old man the boat.
  Twin Bomb Strike in Baghdad kill 25 (Times of India 05/09/07)

Semantic Analysis

Representation in terms of Predicate calculus/Semantic Nets/Frames/Conceptual Dependencies and Scripts
  John gave a book to Mary
    Give action: Agent: John, Object: Book, Recipient: Mary
Challenge: ambiguity in semantic role labeling
  (Eng) Visiting aunts can be a nuisance
  (Hin) aapko mujhe mithaai khilaanii padegii (ambiguous in Marathi and Bengali too; not in Dravidian languages)

Pragmatics

Very hard problem
Model user intention
  Tourist (in a hurry, checking out of the hotel, motioning to the service boy): Boy, go upstairs and see if my sandals are under the divan. Do not be late. I just have 15 minutes to catch the train.
  Boy (running upstairs and coming back panting): yes sir, they are there.
World knowledge
  WHY INDIA NEEDS A SECOND OCTOBER (ToI, 2/10/07)

Discourse

Processing of sequence of sentences
Mother to John:
  John go to school. It is open today. Should you bunk? Father will be very angry.
Ambiguity of open
bunk what?
Why will the father be angry?
  Complex chain of reasoning and application of world knowledge
Ambiguity of father
  father as parent
  or
  father as headmaster

Complexity of Connected Text

John was returning from school dejected – today was the math test
He couldn’t control the class
Teacher shouldn’t have made him responsible
After all he is just a janitor

A look at Textual Humour

1. Teacher (angrily): did you miss the class yesterday?
   Student: not much
2. A man coming back to his parked car sees the sticker "Parking fine". He goes and thanks the policeman for appreciating his parking skill.
3. Son: mother, I broke the neighbour's lamp shade.
   Mother: then we have to give them a new one.
   Son: no need, aunty said the lamp shade is irreplaceable.
4. Ram: I got a Jaguar car for my unemployed youngest son.
   Shyam: That's a great exchange!
5. Shane Warne should bowl maiden overs, instead of bowling maidens over

Giving a flavour of what is done in NLP: Structure Disambiguation

Scope, Clause and Preposition/Postposition

Structure Disambiguation is as critical as Sense Disambiguation

Scope (portion of text in the scope of a modifier)
  Old men and women will be taken to safe locations
  No smoking areas allow hookas inside
Clause
  I told the child that I liked that he came to the game on time
Preposition
  I saw the boy with a telescope

Structure Disambiguation is as critical as Sense Disambiguation (contd.)

Semantic role
  Visiting aunts can be a nuisance
  Mujhe aapko mithaai khilaani padegii (“I have to give you sweets” or “You have to give me sweets”)
Postposition
  unhone teji se bhaagte hue chor ko pakad liyaa (“he caught the thief that was running fast” or “he ran fast and caught the thief”)

All these ambiguities lead to the construction of multiple parse trees for each sentence and need semantic, pragmatic and discourse cues for disambiguation
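The claim that one sentence yields several parse trees can be checked mechanically. The sketch below counts trees with a CKY-style chart for the telescope sentence; the toy grammar (in Chomsky normal form), the lexicon and the function name `count_parses` are assumptions made for this illustration and are not taken from the lecture.

```python
from collections import defaultdict

# Hypothetical binary rules: (parent, left child, right child).
BINARY = [
    ("S", "NP", "VP"),
    ("VP", "V", "NP"),
    ("VP", "VP", "PP"),   # PP attached to the verb phrase
    ("NP", "NP", "PP"),   # PP attached to the noun phrase
    ("NP", "Det", "N"),
    ("PP", "P", "NP"),
]

# Hypothetical lexicon: word -> categories.
LEXICON = {
    "i": ["NP"], "saw": ["V"], "the": ["Det"], "a": ["Det"],
    "boy": ["N"], "telescope": ["N"], "with": ["P"],
}


def count_parses(sentence, root="S"):
    """Count distinct parse trees of `sentence` rooted in `root`."""
    words = sentence.lower().split()
    n = len(words)
    chart = defaultdict(int)  # (start, end, label) -> number of trees
    for i, word in enumerate(words):
        for label in LEXICON.get(word, []):
            chart[i, i + 1, label] += 1
    for span in range(2, n + 1):
        for start in range(n - span + 1):
            end = start + span
            for mid in range(start + 1, end):
                for parent, left, right in BINARY:
                    chart[start, end, parent] += (
                        chart[start, mid, left] * chart[mid, end, right]
                    )
    return chart[0, n, root]


print(count_parses("I saw the boy with a telescope"))
```

The two trees differ only in where the PP attaches; choosing between them is exactly the step that needs semantic, pragmatic or discourse cues.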

Higher level knowledge needed for disambiguation

Semantics
  I saw the boy with a pony tail (pony tail cannot be an instrument of seeing)
Pragmatics
  ((old men) and women) as opposed to (old men and women) in “Old men and women were taken to safe location”, since women, both young and old, were very likely taken to safe locations
Discourse
  No smoking areas allow hookas inside, except the one in Hotel Grand.
  No smoking areas allow hookas inside, but not cigars.


Preposition Attachment Disambiguation

Problem definition

4-tuples of the form V N1 P N2
  saw (V) boys (N1) with (P) telescopes (N2)
Attachment choice is between the matrix verb V and the object noun N1

Lexical Association Table (Hindle and Rooth, 1991 and 1993)

From a large corpus of parsed text
  first find all noun phrase heads
  then record the verb (if any) that precedes the head
  and the preposition (if any) that follows it
  as well as some other syntactic information about the sentence.
Extract attachment information from this table of co-occurrences

Example: lexical association

A table entry is considered a definite instance of the prepositional phrase attaching to the verb if:
  the verb definitely licenses the prepositional phrase
E.g. from Propbank,
  absolve
    frames
      absolve.XX: NP-ARG0 NP-ARG2-of obj-ARG1
    On Friday , the firms filed a suit *ICH*-1 against West Virginia in New York state court asking for [ARG0 a declaratory judgment] [rel absolving] [ARG1 them] of [ARG2-of liability] .

Core steps

Seven different procedures for deciding whether a table entry is an instance of no attachment, sure noun attach, sure verb attach, or ambiguous attach
Able to extract frequency information, counting the number of times a particular verb or noun attaches with a particular preposition

Core steps (contd.)

These frequencies serve as the training data for the statistical model used to predict correct attachment
To disambiguate a sentence, compute the likelihood of the particular preposition given the particular verb and contrast with the likelihood of the preposition given the particular noun
  i.e., compare P(with|saw) with P(with|boy), where boy is the object noun N1, as in I saw the boy with a telescope
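The comparison can be sketched with relative frequencies over attachment counts. Everything concrete below is an assumption for illustration: the toy observations, the function names, and the use of raw relative frequencies (Hindle and Rooth work with a smoothed likelihood ratio rather than this bare comparison). Following the problem definition, the noun is taken to be the object noun N1, i.e. boy in the example sentence.

```python
# Hypothetical "sure attach" observations: (head word, preposition).
VERB_PREP = [("saw", "with"), ("saw", "with"), ("saw", "in"), ("ate", "with")]
NOUN_PREP = [("boy", "with"), ("boy", "of"), ("boy", "of"), ("telescope", "on")]


def cond_prob(pairs, head, prep):
    """Relative-frequency estimate of P(prep | head); 0.0 if head is unseen."""
    head_total = sum(1 for h, _ in pairs if h == head)
    if head_total == 0:
        return 0.0
    joint = sum(1 for h, p in pairs if h == head and p == prep)
    return joint / head_total


def attach(verb, noun1, prep):
    """Pick the attachment site with the larger P(prep | head); ties go to the noun."""
    p_verb = cond_prob(VERB_PREP, verb, prep)
    p_noun = cond_prob(NOUN_PREP, noun1, prep)
    return "verb" if p_verb > p_noun else "noun"


print(attach("saw", "boy", "with"))
```

With these toy counts P(with|saw) = 2/3 and P(with|boy) = 1/3, so the preposition is attached to the verb. Note that N2 plays no role in the decision.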

Critique

Limited by the number of relationships in the training corpora
Too large a parameter space
Model acquired during training is represented in a huge table of probabilities, precluding any straightforward analysis of its workings

Approach based on Transformation Based Error Driven Learning, Brill and Resnik, COLING 1994

Example Transformations

Initial attachments by default are to N1 predominantly.

Transformation rules with word classes

Wordnet synsets and semantic classes used

Accuracy values of the transformation based approach: 12000 training and 500 test examples

Method                           Accuracy        # of transformation rules
Hindle and Rooth (baseline)      70.4 to 75.8%   NA
Transformations                  79.2            418
Transformations (word classes)   81.8            266

Maximum Entropy Based Approach (Ratnaparkhi, Reynar, Roukos, 1994)

Use more features than (V N1) bigram and (N1 P) bigram
Apply Maximum Entropy Principle

Core formulation

We denote the partially parsed verb phrase, i.e., the verb phrase without the attachment decision, as a history h, and the conditional probability of an attachment as P(d|h), where d ∈ {0, 1} corresponds to a noun or verb attachment, respectively.

Maximize the training data log likelihood

--(1)

--(2)

Equating the model expected parameters and training data parameters

--(3)

--(4)
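For a two-outcome decision, a log-linear (maximum entropy) model over binary features reduces to logistic regression, and the gradient of the log likelihood for each feature is its observed count minus its model-expected count, which vanishes exactly when the two are equal. The sketch below illustrates this under stated assumptions: the feature templates, the four training tuples and all function names are hypothetical, and plain gradient ascent is swapped in here for whatever estimation and feature-selection procedure the cited work uses.

```python
import math

# Hypothetical training events: ((V, N1, P, N2), d) with d = 1 for verb attachment.
DATA = [
    (("saw", "boy", "with", "telescope"), 1),
    (("ate", "rice", "with", "spoon"), 1),
    (("met", "king", "of", "spain"), 0),
    (("read", "book", "of", "poems"), 0),
]


def features(v, n1, p, n2):
    """Names of the binary word features that fire for this history (an assumed set)."""
    return {f"P={p}", f"V={v},P={p}", f"N1={n1},P={p}"}


def prob_verb(weights, feats):
    """P(d = 1 | h) under a two-outcome log-linear model."""
    score = sum(weights.get(f, 0.0) for f in feats)
    return 1.0 / (1.0 + math.exp(-score))


def train(data, epochs=200, rate=0.5):
    """Gradient ascent on the average log likelihood of the training data."""
    weights = {}
    for _ in range(epochs):
        grad = {}
        for history, d in data:
            feats = features(*history)
            error = d - prob_verb(weights, feats)  # observed minus expected
            for f in feats:
                grad[f] = grad.get(f, 0.0) + error
        for f, g in grad.items():
            weights[f] = weights.get(f, 0.0) + rate * g / len(data)
    return weights


weights = train(DATA)
print(round(prob_verb(weights, features("saw", "girl", "with", "binoculars")), 3))
```

On this toy data the weight of `P=with` grows and that of `P=of` shrinks, so an unseen "with" tuple is pushed towards verb attachment.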

Features

Two types of binary-valued questions:
  Questions about the presence of any n-gram of the four head words, e.g., a bigram may be V == ‘‘is’’, P == ‘‘of’’
  Features comprised solely of questions on words are denoted as “word” features

Features (contd.)

Questions that involve the class membership of a head word
Binary hierarchy of classes derived by mutual information

Features (contd.)

Given a binary class hierarchy,
  we can associate a bit string with every word in the vocabulary
  Then, by querying the value of certain bit positions we can construct binary questions.
Features comprised solely of questions about class bits are denoted as “class” features, and features containing questions about both class bits and words are denoted as “mixed” features.
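The three kinds of question can be written as small predicates. In this sketch the bit strings are invented placeholders for the paths a word would have in a binary class hierarchy, and the function names are assumptions made for the example.

```python
# Hypothetical class bit strings (path from the root of a binary hierarchy).
CLASS_BITS = {
    "boy": "0010",
    "girl": "0011",
    "telescope": "1100",
    "binoculars": "1101",
}


def word_question(word, target):
    """A "word" question: is this head word exactly `target`?"""
    return word == target


def class_question(word, bit_position, value):
    """A "class" question: does the word's bit string have `value` at `bit_position`?"""
    bits = CLASS_BITS.get(word)
    return bits is not None and bit_position < len(bits) and bits[bit_position] == value


def mixed_feature(v, n1, p, n2):
    """A "mixed" feature: a word question on P combined with a class question on N2."""
    return word_question(p, "with") and class_question(n2, 0, "1")


print(mixed_feature("saw", "boy", "with", "telescope"))
print(mixed_feature("saw", "boy", "with", "girl"))
```

Querying only the leading bits groups words that share a coarse class (here telescope and binoculars), which is how class questions generalise beyond the exact words seen in training.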

Word classes (Brown et al. 1992)


Experimental data size


Performance of ME Model on Test Events


Examples of Features Chosen for Wall St. Journal Data

Average Performance of Human & ME Model on 300 Events of WSJ Data

Human and ME model performance on consensus set for WSJ

Average Performance of Human & ME Model on 200 Events of Computer Manuals Data

Back-off model based approach (Collins and Brooks, 1995)

NP-attach:
  (joined ((the board) (as a non executive director)))
VP-attach:
  ((joined (the board)) (as a non executive director))
Correspondingly,
NP-attach:
  1 joined board as director
VP-attach:
  0 joined board as director
Quintuple of (attachment A: 0/1, V, N1, P, N2)
5 random variables

Probabilistic formulation

P(A = 1 | V = v, N1 = n1, P = p, N2 = n2)

Or briefly,

p(1 | v, n1, p, n2)

If

p(1 | v, n1, p, n2) >= 0.5

Then the attachment is to the noun, else to the verb


Maximum Likelihood estimate

The Back-off estimate

Inspired by speech recognition
  Prediction of the Nth word from previous (N-1) words
Data sparsity problem
  f(w1, w2, w3, …, wn) will frequently be 0 for large values of n

Back-off estimate contd.

The cut off frequencies (c1, c2, ...) are thresholds determining whether to back-off or not at each level: counts lower than ci at stage i are deemed to be too low to give an accurate estimate, so in this case backing-off continues.

Page 67:

Back-off for PP attachment

Note: the back-off tuples always retain the preposition

Page 68:

The backoff algorithm
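The algorithm of Collins and Brooks (1995) backs off from the full quadruple to triples, then pairs, then the preposition alone, pooling the counts at each level and defaulting to noun attachment when nothing has been seen. The Python sketch below follows that description; the function names and the toy training data are hypothetical and only serve to make the example runnable.

```python
from collections import Counter

# Each level lists which of (v, n1, n2) are kept; the preposition is always kept.
LEVELS = [
    [(True, True, True)],                                             # (v, n1, p, n2)
    [(True, True, False), (True, False, True), (False, True, True)],  # triples
    [(True, False, False), (False, True, False), (False, False, True)],  # pairs
    [(False, False, False)],                                          # (p) alone
]

def make_key(v, n1, p, n2, mask):
    keep_v, keep_n1, keep_n2 = mask
    return (v if keep_v else None, n1 if keep_n1 else None, p, n2 if keep_n2 else None)

def train(quintuples):
    """quintuples: (a, v, n1, p, n2) with a = 1 for noun attachment, 0 for verb."""
    total, noun = Counter(), Counter()
    for a, v, n1, p, n2 in quintuples:
        for level in LEVELS:
            for mask in level:
                key = make_key(v, n1, p, n2, mask)
                total[key] += 1
                noun[key] += a
    return total, noun

def p_noun(v, n1, p, n2, total, noun):
    """Back-off estimate of p(1 | v, n1, p, n2)."""
    for level in LEVELS:
        keys = [make_key(v, n1, p, n2, mask) for mask in level]
        denom = sum(total[k] for k in keys)
        if denom > 0:
            return sum(noun[k] for k in keys) / denom
    return 1.0  # nothing seen, not even the preposition: default to noun attachment

def attach(v, n1, p, n2, total, noun):
    return "noun" if p_noun(v, n1, p, n2, total, noun) >= 0.5 else "verb"

# Hypothetical toy training set, for illustration only.
TOY_DATA = [
    (1, "joined", "board", "as", "director"),
    (0, "joined", "company", "in", "january"),
    (0, "ate", "pizza", "with", "fork"),
    (1, "ate", "pizza", "with", "anchovies"),
    (0, "ate", "salad", "with", "fork"),
]
total, noun = train(TOY_DATA)
decision = attach("ate", "pasta", "with", "fork", total, noun)  # decided at the triple level
```

Every key built by `make_key` contains the preposition, which mirrors the observation that the back-off tuples always retain it.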

Page 69:

Lower and upper bounds on performance

Lower bound (most frequent)

Upper bound (human experts looking at 4 words only)

Page 70:

Results

Page 71:

Comparison with other systems

Maxent, Ratnaparkhi et al.

Transformation Learning, Brill et al.

Page 72:

Flexible Unsupervised PP Attachment using WSD and Data Sparsity Reduction: (Medimi Srinivas and Pushpak Bhattacharyya, IJCAI 2007)

Unsupervised approach (in some ways similar to Ratnaparkhi 1998):
The training data is extracted from raw text.
The unambiguous training data of the form V-P-N and N1-P-N2 TEACH the system how to resolve PP-attachment in ambiguous test data V-N1-P-N2.
Refinement of the extracted training data, and use of N2 in the PP-attachment resolution process.

Page 73:

Flexible Unsupervised PP Attachment using WSD and Data Sparsity Reduction: (Medimi Srinivas and Pushpak Bhattacharyya, IJCAI 2007)

PP-attachment is determined by the semantic property of lexical items in the context of the preposition, using WordNet.
An iterative graph-based unsupervised approach is used for Word Sense Disambiguation (similar to Mihalcea 2005).
Use of a Data Sparseness Reduction (DSR) Process (DSRP), which uses lemmatization, synset replacement and a form of inferencing. DSRP uses WordNet.
Flexible use of WSD and DSR processes for PP-attachment.

Page 74:

Graph-based disambiguation: PageRank-based algorithm, Mihalcea 2005

Page 75:

Experimental setup

Training Data:
Brown corpus (raw text). Corpus size is 6 MB, consists of 51763 sentences, nearly 1 million 27 thousand words.
Most frequent prepositions in the syntactic context N1-P-N2: of, in, for, to, with, on, at, from, by
Most frequent prepositions in the syntactic context V-P-N: in, to, by, with, on, for, from, at, of
The extracted unambiguous N1-P-N2: 54030 and V-P-N: 22362

Test Data:
Penn Treebank Wall Street Journal (WSJ) data extracted by Ratnaparkhi.
It consists of V-N1-P-N2 tuples: 20801 (training), 4039 (development) and 3097 (test).

Page 76:

Experimental setup contd.

BaseLine:
The unsupervised approach by Ratnaparkhi, 1998 (Base-RP).

Preprocessing:
Upper case to lower case
Any four digit number less than 2100 as a year
Any other number or % signs are converted to num

Experiments are performed using DSRP: with different stages of DSRP
Experiments are performed using GuWSD and DSRP: with different senses
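The three preprocessing rules can be sketched per token as follows. This is only an illustration under assumptions: the replacement token for years is taken to be the literal string `year` (the slide only says "as a year"), the number pattern is a guess at what counts as "any other number", and `normalize_token` is a hypothetical name.

```python
import re

# Assumed shape of a number: digits with optional sign, commas, decimals and trailing %.
NUMBER = re.compile(r"^[+-]?\d[\d,]*(\.\d+)?%?$")

def normalize_token(token):
    """Apply the preprocessing rules to a single token."""
    token = token.lower()                      # upper case to lower case
    if re.fullmatch(r"\d{4}", token) and int(token) < 2100:
        return "year"                          # four digit number below 2100
    if token == "%" or NUMBER.match(token):
        return "num"                           # any other number or % sign
    return token

normalized = [normalize_token(t) for t in "Joined the Board in 1998 with 45% of 3,000 shares".split()]
```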

Page 77:

The process of extracting training data: Data Sparsity Reduction

Tools/process | Output

Raw Text | The professional conduct of the doctors is guided by Indian Medical Association.

POS Tagger | The_DT professional_JJ conduct_NN of_IN the_DT doctors_NNS is_VBZ guided_VBN by_IN Indian_NNP Medical_NNP Association_NNP ._.

Chunker | [The_DT professional_JJ conduct_NN] of_IN [the_DT doctors_NNS] (is_VBZ guided_VBN) by_IN [Indian_NNP Medical_NNP Association_NNP]. After replacing each chunk by its head word it results in: conduct_NN of_IN doctors_NNS guided_VBN by_IN Association_NNP

Extraction Heuristics | N1PN2: conduct of doctors and VPN: guided by Association

Morphing | N1PN2: conduct of doctor and VPN: guide by association

DSRP (Synset Replacement) | N1PN2: {conduct, behavior} of {doctor, physician} can result in 4 combinations with the same sense, and similarly VPN: {guide, direct} by {association} can result in 2 combinations with the same sense.
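The combination counts in the synset replacement step come from taking every pairing of the two synsets around the preposition. A minimal Python sketch, assuming the synsets are given as a small hand-written dictionary rather than looked up in WordNet (the dictionary and the name `expand_with_synsets` are hypothetical):

```python
from itertools import product

# Hand-written synsets copied from the example; a real system would query WordNet.
SYNSETS = {
    "conduct": ["conduct", "behavior"],
    "doctor": ["doctor", "physician"],
    "guide": ["guide", "direct"],
    "association": ["association"],
}

def expand_with_synsets(triple, synsets):
    """Return all (head, preposition, dependent) variants obtained by synonym substitution."""
    head, prep, dependent = triple
    heads = synsets.get(head, [head])
    dependents = synsets.get(dependent, [dependent])
    return [(h, prep, d) for h, d in product(heads, dependents)]

n1pn2_variants = expand_with_synsets(("conduct", "of", "doctor"), SYNSETS)
vpn_variants = expand_with_synsets(("guide", "by", "association"), SYNSETS)
```

Here `n1pn2_variants` holds the 4 N1-P-N2 combinations and `vpn_variants` the 2 V-P-N combinations, each of which can be counted as an additional training tuple.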

Page 78:

Data Sparsity Reduction: Inferencing

If V1-P-N1 and V2-P-N1 exist, as also do V1-P-N2 and V2-P-N2, then if V3-P-Ni exists (i = 1, 2), we can infer the existence of V3-P-Nj (i ≠ j), with the frequency count of V3-P-Ni, which can be added to the corpus.

Page 79:

Example of DSR by inferencing

V1-P-N1: play in garden and V2-P-N1: sit in garden
V1-P-N2: play in house and V2-P-N2: sit in house
V3-P-N2: jump in house exists

Infer the existence of V3-P-N1: jump in garden
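The inference rule can be sketched in Python as below. The frequency counts are invented for illustration, `infer_triples` is a hypothetical name, and two choices are assumptions not stated on the slides: requiring at least two shared verbs (V1 and V2), and keeping the largest count when a triple could be inferred from several nouns.

```python
from collections import Counter, defaultdict
from itertools import permutations

def infer_triples(counts, min_shared_verbs=2):
    """Infer unseen V-P-N triples from nouns that share verbs in the same P context."""
    verbs_of = defaultdict(set)            # (p, n) -> verbs seen with it
    for v, p, n in counts:
        verbs_of[(p, n)].add(v)
    nouns_of_prep = defaultdict(set)       # p -> nouns seen after it
    for p, n in verbs_of:
        nouns_of_prep[p].add(n)

    inferred = Counter()
    for p, nouns in nouns_of_prep.items():
        for ni, nj in permutations(sorted(nouns), 2):
            shared = verbs_of[(p, ni)] & verbs_of[(p, nj)]
            if len(shared) < min_shared_verbs:
                continue                   # need V1 and V2 with both nouns
            for v3 in verbs_of[(p, ni)] - verbs_of[(p, nj)]:
                # V3-P-Nj is inferred with the frequency count of V3-P-Ni.
                inferred[(v3, p, nj)] = max(inferred[(v3, p, nj)], counts[(v3, p, ni)])
    return inferred

# Invented counts for the example triples.
corpus_counts = {
    ("play", "in", "garden"): 3,
    ("sit", "in", "garden"): 2,
    ("play", "in", "house"): 4,
    ("sit", "in", "house"): 1,
    ("jump", "in", "house"): 5,
}
new_triples = infer_triples(corpus_counts)
```

On these counts the only inferred triple is jump in garden, carrying the count of jump in house.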

Page 80:

Results

Page 81:

Effect of various processes on FlexPPAttach algorithm

Page 82:

Precision vs. various processes

[Figure: PP-attachment precision (y-axis, ticks from 80 to 86) against the processes Morph, Infer, WnSyn, Syn-Inf, WS-DSR and DSRP; the plotted values are not recoverable.]