Introduction to Computational Linguistics
Transcript of Introduction to Computational Linguistics
LING 2000 - 2006 NLP2
Natural Language Processing
• Machine Translation
• Predicate argument structures
• Syntactic parses
• Producing semantic representations
• Ambiguities in sentence interpretation
Machine Translation
• One of the first applications for computers
  – bilingual dictionary > word-word translation
• Good translation requires understanding!
  – War and Peace, The Sound and The Fury?
• What can we do? Sublanguages.
  – technical domains, static vocabulary
  – Meteo in Canada, Caterpillar Tractor Manuals, Botanical descriptions, Military Messages
Translation Issues: Korean to English
- Word order
- Dropped arguments
- Lexical ambiguities
- Structure vs morphology
Common Thread
• Predicate-argument structure
  – Basic constituents of the sentence and how they are related to each other
• Constituents
  – John, Mary, the dog, pleasure, the store.
• Relations
  – Loves, feeds, go, to, bring
Machine Translation Lexical Choice - Word Sense Disambiguation
Iraq lost the battle.
  Ilakuka centwey ciessta.
  [Iraq] [battle] [lost].
John lost his computer.
  John-i computer-lul ilepelyessta.
  [John] [computer] [misplaced].
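The lexical choice above can be sketched in code. This is a minimal illustration, not a real translation component: the two Korean verb forms come from the slide, while the object classes, the extra nouns, and the function name `translate_lost` are assumptions made for the example.

```python
# Semantic class of the direct object (illustrative assumptions).
OBJECT_CLASS = {
    "battle": "contest",
    "war": "contest",
    "computer": "possession",
    "keys": "possession",
}

# Korean renderings of "lost" taken from the slide, keyed by object class.
LOST_TRANSLATION = {
    "contest": "ciessta",          # lost a battle
    "possession": "ilepelyessta",  # misplaced an object
}

def translate_lost(direct_object):
    """Pick a Korean verb for English 'lost' from the class of its object.

    Returns None when the object is unknown and the sense cannot be chosen.
    """
    object_class = OBJECT_CLASS.get(direct_object)
    return LOST_TRANSLATION.get(object_class)

print(translate_lost("battle"))
print(translate_lost("computer"))
```

The point of the sketch is that the English verb alone does not determine the translation; some knowledge about its argument is needed.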
Natural Language Processing
• Syntax
  – Grammars, parsers, parse trees, dependency structures
• Semantics
  – Subcategorization frames, semantic classes, ontologies, formal semantics
• Pragmatics
  – Pronouns, reference resolution, discourse models
Syntactic Categories
• Nouns, pronouns, Proper nouns
• Verbs, intransitive verbs, transitive verbs, ditransitive verbs (subcategorization frames)
• Modifiers, Adjectives, Adverbs
• Prepositions
• Conjunctions
Syntactic Parsing
• The cat sat on the mat.
  Det Noun Verb Prep Det Noun
• Time flies like an arrow.
  Noun Verb Prep Det Noun
• Fruit flies like a banana.
  Noun Noun Verb Det Noun
Context Free Grammar
• S -> NP VP
• NP -> det (adj) N
• NP -> Proper N
• NP -> N
• VP -> V, VP -> V PP
• VP -> V NP
• VP -> V NP PP, PP -> Prep NP
• VP -> V NP NP
Parses
The cat sat on the mat
[S [NP [Det the] [N cat]] [VP [V sat] [PP [Prep on] [NP [Det the] [N mat]]]]]
Parses
Time flies like an arrow.
[S [NP [N time]] [VP [V flies] [PP [Prep like] [NP [Det an] [N arrow]]]]]
Parses
Time flies like an arrow.
[S [NP [N time] [N flies]] [VP [V like] [NP [Det an] [N arrow]]]]
Features
• C for Case, Subjective/Objective
  – She visited her.
• P for Person agreement (1st, 2nd, 3rd)
  – I like him, You like him, He likes him.
• N for Number agreement, Subject/Verb
  – He likes him, They like him.
• G for Gender agreement, Subject/Verb
  – English, reflexive pronouns: He washed himself.
  – Romance languages, det/noun
• T for Tense
  – auxiliaries, sentential complements, etc.
  – *will finished is bad
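A minimal sketch of how case, person, and number features could be checked for a pronoun-verb-pronoun sentence. The feature tables and the function `is_well_formed` are assumptions for illustration and cover only a handful of words.

```python
# Toy feature lexicon (entries are illustrative assumptions).
PRONOUNS = {
    "i":    {"cases": {"subj"}, "person": 1, "number": "sg"},
    "you":  {"cases": {"subj", "obj"}, "person": 2, "number": "sg"},
    "he":   {"cases": {"subj"}, "person": 3, "number": "sg"},
    "she":  {"cases": {"subj"}, "person": 3, "number": "sg"},
    "they": {"cases": {"subj"}, "person": 3, "number": "pl"},
    "him":  {"cases": {"obj"}, "person": 3, "number": "sg"},
    "her":  {"cases": {"obj"}, "person": 3, "number": "sg"},
}

# Present-tense verb forms: does the form require a 3rd person singular subject?
VERBS = {"like": False, "likes": True}

def is_well_formed(subject, verb, obj):
    """Check case on both pronouns and person/number agreement with the verb."""
    s, o = PRONOUNS[subject.lower()], PRONOUNS[obj.lower()]
    if "subj" not in s["cases"] or "obj" not in o["cases"]:
        return False  # case violation, e.g. "Him likes he"
    subject_is_3sg = s["person"] == 3 and s["number"] == "sg"
    return VERBS[verb.lower()] == subject_is_3sg

print(is_well_formed("He", "likes", "him"))
print(is_well_formed("He", "like", "him"))
```

In a feature-based grammar these checks would be attached to the rules themselves, for example requiring the NP and VP under S to share person and number values.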
Probabilistic Context Free Grammars
• Adding probabilities
• Lexicalizing the probabilities
Simple Context Free Grammar in BNF
S → NP VP
NP → Pronoun
   | Noun
   | Det Adj Noun
   | NP PP
PP → Prep NP
V → Verb
   | Aux Verb
VP → V
   | V NP
   | V NP NP
   | V NP PP
   | VP PP
Simple Probabilistic CFG
S → NP VP
NP → Pronoun [0.10]
   | Noun [0.20]
   | Det Adj Noun [0.50]
   | NP PP [0.20]
PP → Prep NP [1.00]
V → Verb [0.33]
   | Aux Verb [0.67]
VP → V [0.10]
   | V NP [0.40]
   | V NP NP [0.10]
   | V NP PP [0.20]
   | VP PP [0.20]
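Under a probabilistic CFG, the probability of a parse is the product of the probabilities of the rules used in it. The sketch below computes that product for one tree. The rule probabilities are copied from the slide; the value 1.00 for S → NP VP, the nested-tuple tree encoding, the example sentence, and the choice to ignore word probabilities are all assumptions.

```python
import math

# Rule probabilities from the slide. The slide gives no number for
# S -> NP VP; 1.00 is assumed here because S has only one expansion.
RULE_PROB = {
    ("S", ("NP", "VP")): 1.00,
    ("NP", ("Pronoun",)): 0.10,
    ("NP", ("Noun",)): 0.20,
    ("NP", ("Det", "Adj", "Noun")): 0.50,
    ("NP", ("NP", "PP")): 0.20,
    ("PP", ("Prep", "NP")): 1.00,
    ("V", ("Verb",)): 0.33,
    ("V", ("Aux", "Verb")): 0.67,
    ("VP", ("V",)): 0.10,
    ("VP", ("V", "NP")): 0.40,
    ("VP", ("V", "NP", "NP")): 0.10,
    ("VP", ("V", "NP", "PP")): 0.20,
    ("VP", ("VP", "PP")): 0.20,
}

def tree_prob(tree):
    """Probability of a tree given as (label, child, ...); leaf words are strings."""
    label, *children = tree
    if all(isinstance(child, str) for child in children):
        return 1.0  # lexical category over a word: word probabilities are ignored
    rhs = tuple(child[0] for child in children)
    prob = RULE_PROB[(label, rhs)]
    for child in children:
        prob *= tree_prob(child)
    return prob

# Hypothetical example sentence: "She likes cats".
tree = ("S",
        ("NP", ("Pronoun", "she")),
        ("VP", ("V", ("Verb", "likes")), ("NP", ("Noun", "cats"))))

# 1.00 * 0.10 * 0.40 * 0.33 * 0.20
print(tree_prob(tree))
```

When a sentence has several parses, comparing these products gives a way to prefer one parse over another.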
Simple Probabilistic Lexicalized CFG
S → NP VP
NP → Pronoun [0.10]
   | Noun [0.20]
   | Det Adj Noun [0.50]
   | NP PP [0.20]
PP → Prep NP [1.00]
V → Verb [0.33]
   | Aux Verb [0.67]
VP → V [0.87] {sleep, cry, laugh}
   | V NP [0.03]
   | V NP NP [0.00]
   | V NP PP [0.00]
   | VP PP [0.10]
Simple Probabilistic Lexicalized CFG
VP → V [0.30]
   | V NP [0.60] {break, split, crack..}
   | V NP NP [0.00]
   | V NP PP [0.00]
   | VP PP [0.10]
What about leave? leave1, leave2?
VP → V [0.10]
   | V NP [0.40]
   | V NP NP [0.10]
   | V NP PP [0.20]
   | VP PP [0.20]
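Lexicalizing the probabilities amounts to conditioning the VP expansion on the verb. A minimal sketch, using the three distributions from the slides: the class names, the function `vp_rule_prob`, and the fallback to the generic distribution for unlisted verbs are assumptions. The fallback is only one possible treatment and does not settle the question of whether a verb like leave should be split into senses.

```python
# VP expansion probabilities from the slides, keyed by right-hand side.
GENERIC_VP = {
    ("V",): 0.10, ("V", "NP"): 0.40, ("V", "NP", "NP"): 0.10,
    ("V", "NP", "PP"): 0.20, ("VP", "PP"): 0.20,
}
CLASS_VP = {
    "sleep-class": {  # {sleep, cry, laugh}
        ("V",): 0.87, ("V", "NP"): 0.03, ("V", "NP", "NP"): 0.00,
        ("V", "NP", "PP"): 0.00, ("VP", "PP"): 0.10,
    },
    "break-class": {  # {break, split, crack..}
        ("V",): 0.30, ("V", "NP"): 0.60, ("V", "NP", "NP"): 0.00,
        ("V", "NP", "PP"): 0.00, ("VP", "PP"): 0.10,
    },
}

# Verb-to-class table; the class names are hypothetical labels.
VERB_CLASS = {
    "sleep": "sleep-class", "cry": "sleep-class", "laugh": "sleep-class",
    "break": "break-class", "split": "break-class", "crack": "break-class",
}

def vp_rule_prob(verb, rhs):
    """Probability of expanding VP as `rhs` when the head verb is `verb`.

    Assumption: verbs without a class fall back to the generic distribution.
    """
    distribution = CLASS_VP.get(VERB_CLASS.get(verb), GENERIC_VP)
    return distribution[rhs]

print(vp_rule_prob("sleep", ("V",)))
print(vp_rule_prob("break", ("V", "NP")))
print(vp_rule_prob("leave", ("V", "NP")))
```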
Language to Logic
• John went to the book store.
  John store1, go(John, store1)
• John bought a book.
  buy(John,book1)
• John gave the book to Mary.
  give(John,book1,Mary)
• Mary put the book on the table.
  put(Mary,book1,table1)
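One reason to produce such logical forms is that they can be stored and queried. The sketch below holds the four predicates above as facts and matches them against simple patterns. The `Fact` type and the `query` function are assumptions for illustration, not a formal semantics.

```python
from typing import NamedTuple

class Fact(NamedTuple):
    predicate: str
    args: tuple

# The logical forms from the slide, stored as facts.
FACTS = [
    Fact("go", ("John", "store1")),
    Fact("buy", ("John", "book1")),
    Fact("give", ("John", "book1", "Mary")),
    Fact("put", ("Mary", "book1", "table1")),
]

def query(predicate, *pattern):
    """Return the facts matching a predicate; None in the pattern is a wildcard."""
    return [
        fact for fact in FACTS
        if fact.predicate == predicate
        and len(fact.args) == len(pattern)
        and all(p is None or p == a for p, a in zip(pattern, fact.args))
    ]

# "What did John buy?"
print(query("buy", "John", None))
# "Who gave the book to Mary?"
print(query("give", None, "book1", "Mary"))
```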
Semantics
Same event - different sentences
John broke the window with a hammer.
John broke the window with the crack.
The hammer broke the window.
The window broke.
Same event - different syntactic frames
John broke the window with a hammer.
  SUBJ VERB OBJ MODIFIER
John broke the window with the crack.
  SUBJ VERB OBJ MODIFIER
The hammer broke the window.
  SUBJ VERB OBJ
The window broke.
  SUBJ VERB
Semantics - predicate arguments
break(AGENT, INSTRUMENT, PATIENT)
John broke the window with a hammer.
  AGENT: John; PATIENT: the window; INSTRUMENT: a hammer
The hammer broke the window.
  INSTRUMENT: the hammer; PATIENT: the window
The window broke.
  PATIENT: the window
Fillmore 68 - The case for case
John broke the window with a hammer.
  AGENT = SUBJ: John; PATIENT = OBJ: the window; INSTRUMENT = MODIFIER: with a hammer
The hammer broke the window.
  INSTRUMENT = SUBJ: the hammer; PATIENT = OBJ: the window
The window broke.
  PATIENT = SUBJ: the window
Canonical Representation
break (Agent: animate, Instrument: tool, Patient: physical-object)
Agent <=> subj
Instrument <=> subj, with-pp
Patient <=> obj, subj
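The canonical representation can be read as a small linking procedure: each grammatical function is mapped to the one role that both allows that function and accepts the noun's semantic type. The sketch below follows the restrictions and linking stated above. The noun type table, the use of a single flat type per noun, and the function `assign_roles` are simplifying assumptions.

```python
# Semantic type of each noun (illustrative assumptions; one flat type per noun).
SEM_TYPE = {"John": "animate", "hammer": "tool", "window": "physical-object"}

# Selectional restrictions and linking for "break", as given on the slide.
ROLE_TYPE = {"Agent": "animate", "Instrument": "tool", "Patient": "physical-object"}
LINKING = {
    "Agent": {"subj"},
    "Instrument": {"subj", "with-pp"},
    "Patient": {"obj", "subj"},
}

def assign_roles(frame):
    """Map a syntactic frame {function: noun} to thematic roles {role: noun}.

    Returns None if some constituent cannot be given exactly one role.
    """
    roles = {}
    for function, noun in frame.items():
        candidates = [
            role for role in LINKING
            if function in LINKING[role]
            and SEM_TYPE.get(noun) == ROLE_TYPE[role]
            and role not in roles
        ]
        if len(candidates) != 1:
            return None
        roles[candidates[0]] = noun
    return roles

print(assign_roles({"subj": "John", "obj": "window", "with-pp": "hammer"}))
print(assign_roles({"subj": "hammer", "obj": "window"}))
print(assign_roles({"subj": "window"}))
```

All three syntactic frames end up with the same Patient, which is the sense in which the representation is canonical.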
Syntax/semantics interaction
• Parsers will produce syntactically valid parses for semantically anomalous sentences
• Lexical semantics can be used to rule them out
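A small sketch of this filtering, using the earlier pair "with a hammer" and "with the crack". Syntactically, the with-PP can attach to the verb phrase (as the Instrument of break) or to the object noun phrase (as a modifier of window); lexical knowledge keeps only the reading that makes sense. The two knowledge tables and the function `plausible_attachments` are assumptions made for the example.

```python
# Toy lexical knowledge (illustrative assumptions, not from the slides).
TOOLS = {"hammer", "rock"}
PARTS_OR_PROPERTIES = {"window": {"crack", "frame"}}

def plausible_attachments(object_noun, pp_noun):
    """Filter the two syntactically valid attachments of a with-PP.

    "VP": the PP is the Instrument of the verb (requires a tool).
    "NP": the PP modifies the object noun (requires a part or property of it).
    """
    readings = []
    if pp_noun in TOOLS:
        readings.append("VP")
    if pp_noun in PARTS_OR_PROPERTIES.get(object_noun, set()):
        readings.append("NP")
    return readings

print(plausible_attachments("window", "hammer"))
print(plausible_attachments("window", "crack"))
```

An empty result would signal that every syntactic analysis is semantically anomalous under the available lexical knowledge.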