Transcript of lecture slides from CS594 (fall 2013): web.eecs.utk.edu/~leparker/Courses/CS594-fall13/Lectures/20-Slide…
Chapter 23 (continued)
Natural Language for Communication
Phrase Structure Grammars
• Probabilistic context-free grammar (PCFG):
  – Context free: the left-hand side of each rule consists of a single nonterminal symbol
  – Probabilistic: the grammar assigns a probability to every string
  – Lexicon: list of allowable words
  – Grammar: a collection of rules that defines a language as a set of allowable strings of words
  – Example: Fish people fish tanks, in Backus–Naur Form (BNF)
Phrase Structure Grammars (continued)
• Probabilistic context-free grammar (PCFG): same definitions as above, with the example grammar now written as a PCFG rather than in BNF
  – Example: Fish people fish tanks
Phrase Structure Grammars (continued)
• Example: Fish people fish tanks
  [Figure: grammar and lexicon, with a parse tree whose rules have probabilities 0.2, 0.5, 0.6, 0.2, 0.1, 0.7, 0.5, and 0.9]
  Probability = 0.2 × 0.5 × 0.6 × 0.2 × 0.1 × 0.7 × 0.5 × 0.9 = 3.78 × 10⁻⁴
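In a PCFG, the probability of a parse tree is the product of the probabilities of all the rules used in its derivation. The slide's product can be checked directly (only the eight rule probabilities are taken from the slide; the tree itself is not reproduced here):

```python
from math import prod

# Rule probabilities along one parse of "Fish people fish tanks",
# read off the slide's parse tree.
rule_probs = [0.2, 0.5, 0.6, 0.2, 0.1, 0.7, 0.5, 0.9]

# A PCFG tree's probability is the product of the probabilities
# of every rule used in its derivation.
tree_prob = prod(rule_probs)
print(tree_prob)  # ≈ 3.78e-4
```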
Parsing
• Objective: analyze a string of words to uncover its phrase structure, given the lexicon and grammar
  – The result of parsing is a parse tree
• Top-down parsing and bottom-up parsing
  – Naïve solutions: left-to-right or right-to-left parse
  – Example: The wumpus is dead
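A naïve top-down parse can be sketched as recursive descent with backtracking: expand the current nonterminal using each of its rules in turn, left to right, and back up when a rule fails. The toy grammar and lexicon below are illustrative assumptions, not the slide's actual figure:

```python
# Minimal top-down (recursive-descent) parser sketch with backtracking.
# The grammar here is a toy stand-in for the slide's "The wumpus is dead".
GRAMMAR = {
    "S":  [["NP", "VP"]],
    "NP": [["Article", "Noun"]],
    "VP": [["Verb", "Adjective"]],
}
LEXICON = {
    "Article": {"the"}, "Noun": {"wumpus"},
    "Verb": {"is"}, "Adjective": {"dead"},
}

def parse(symbol, words, i):
    """Return (tree, next_index) or None, expanding `symbol` at position i."""
    if symbol in LEXICON:                      # preterminal: match one word
        if i < len(words) and words[i] in LEXICON[symbol]:
            return (symbol, words[i]), i + 1
        return None
    for rhs in GRAMMAR[symbol]:                # try each rule in order
        children, j = [], i
        for sub in rhs:
            result = parse(sub, words, j)
            if result is None:
                break                          # backtrack: try the next rule
            tree, j = result
            children.append(tree)
        else:
            return (symbol, children), j
    return None

words = "the wumpus is dead".split()
tree, end = parse("S", words, 0)
assert end == len(words)                       # all words consumed
print(tree)
```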
Parsing (continued)
• Naïve solutions: top-down parse and bottom-up parse
  – Example: The wumpus is dead
  – Efficient? Not in general: the two sentences below are identical for the first ten words, so a left-to-right parser cannot decide between the imperative and interrogative readings until near the end
  – Example:
    Have the students in section 2 of Computer Science 101 take the exam.
    Have the students in section 2 of Computer Science 101 taken the exam?
Parsing (continued)
• Efficient solutions: chart parsers
  – Use dynamic programming
• CYK algorithm
  – A bottom-up chart parser, named after its inventors John Cocke, Daniel Younger, and Tadao Kasami
  – Input: lexicon, grammar, and query strings
  – Output: a parse tree
  – Three major steps:
    • Assign lexical categories to the words
    • Compute probabilities of adjacent phrases
    • Resolve grammar conflicts by selecting the most probable phrases
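The three steps can be sketched as a small probabilistic CYK parser, assuming a toy grammar in Chomsky normal form. All categories and probabilities below are illustrative assumptions, not the course's actual figures:

```python
# Probabilistic CYK sketch: grammar must be in Chomsky normal form
# (binary rules A -> B C plus lexical entries A -> word).
# All rules and probabilities below are illustrative assumptions.
LEXICAL = {                      # word -> {category: P(category -> word)}
    "fish":   {"N": 0.2, "V": 0.6, "NP": 0.14},
    "people": {"N": 0.5, "V": 0.1, "NP": 0.35},
    "tanks":  {"N": 0.2, "V": 0.3, "NP": 0.14},
}
BINARY = [                       # (parent, left, right, probability)
    ("S",  "NP", "VP", 0.9),
    ("NP", "N",  "N",  0.2),
    ("NP", "N",  "NP", 0.1),
    ("VP", "V",  "NP", 0.5),
]

def cyk(words):
    n = len(words)
    # chart[i][j]: best probability for each category spanning words[i:j]
    chart = [[{} for _ in range(n + 1)] for _ in range(n + 1)]
    for i, w in enumerate(words):            # step 1: assign lexical categories
        chart[i][i + 1] = dict(LEXICAL[w])
    for span in range(2, n + 1):             # step 2: combine adjacent phrases
        for i in range(n - span + 1):
            j = i + span
            for k in range(i + 1, j):
                for parent, left, right, p in BINARY:
                    if left in chart[i][k] and right in chart[k][j]:
                        prob = p * chart[i][k][left] * chart[k][j][right]
                        # step 3: keep only the most probable analysis
                        if prob > chart[i][j].get(parent, 0.0):
                            chart[i][j][parent] = prob
    return chart[0][n]

result = cyk("fish people fish tanks".split())
print(result)                                # includes an "S" spanning all words
```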
Parsing (continued)
• CYK algorithm
  – Three major steps:
    • Assign lexical categories to the words
    • Compute probabilities of adjacent phrases
    • Resolve grammar conflicts by selecting the most probable phrases
  [Figure: CYK chart illustrating the three steps on an example sentence]
Parsing (continued)
• Example: Fish people fish tanks
  [Figure: grammar and lexicon for the example]
Parsing (continued)
• Example: by Dr. Christopher Manning, Stanford
Augmented Parsing Methods
• Lexicalized PCFGs
  – BNF notation for grammars is too restrictive
• Augmented grammar
  – Adds logical inference to construct sentence semantics
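One common way to augment a grammar is to attach a semantic function to each rule, so the meaning of a phrase is composed from the meanings of its parts. A minimal sketch, assuming a hypothetical two-rule grammar and logical-form notation (none of this is from the slides):

```python
# Semantic augmentation sketch: each rule carries a function that builds
# the phrase's meaning (a logical-form string) from its children's meanings.
# Grammar, lexicon, and logical forms are illustrative assumptions.
LEXICON = {
    "John":  ("NP", "John"),
    "Mary":  ("NP", "Mary"),
    "loves": ("V",  "Loves"),
}
RULES = {
    # VP -> V NP : pair the verb's predicate with the object
    ("V", "NP"): ("VP", lambda v, obj: (v, obj)),
    # S -> NP VP : apply the VP's predicate to the subject
    ("NP", "VP"): ("S", lambda subj, vp: f"{vp[0]}({subj}, {vp[1]})"),
}

def interpret(words):
    """Shift-reduce style interpretation for this tiny two-rule grammar."""
    stack = []  # list of (category, meaning) pairs
    for w in words:
        stack.append(LEXICON[w])
        # greedily reduce whenever the top two items match a rule
        while len(stack) >= 2 and (stack[-2][0], stack[-1][0]) in RULES:
            (c1, m1), (c2, m2) = stack[-2], stack[-1]
            cat, combine = RULES[(c1, c2)]
            stack[-2:] = [(cat, combine(m1, m2))]
    return stack

print(interpret("John loves Mary".split()))  # [('S', 'Loves(John, Mary)')]
```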
Real Language
• Real human languages present many problems for NLP:
  – Ambiguity
  – Anaphora
  – Indexicality
  – Vagueness
  – Discourse structure
  – Metonymy
  – Metaphor
  – Noncompositionality
Real Language (continued)
• Ambiguity: can be lexical (polysemy), syntactic, semantic, or referential
  I ate spaghetti with meatballs
  I ate spaghetti with salad
  I ate spaghetti with abandon
  I ate spaghetti with a fork
  I ate spaghetti with a friend
Real Language (continued)
• Anaphora: using pronouns to refer back to entities already introduced in the text
  After Mary proposed to John, they found a preacher and got married.
  For the honeymoon, they went to Hawaii.
  Mary saw a ring through the window and asked John for it.
  Mary threw a rock at the window and broke it.
Real Language (continued)
• Indexicality: indexical sentences refer to the utterance situation (place, time, speaker/hearer, etc.)
  I am over here
  Why did you do that?
Real Language (continued)
• Metonymy: using one noun phrase to stand for another
  I've read Shakespeare
  Chrysler announced record profits
  The ham sandwich on Table 4 wants another beer
Real Language (continued)
• Metaphor: "non-literal" usage of words and phrases
  I've tried killing the process but it won't die. Its parent keeps it alive.
Real Language (continued)
• Noncompositionality: the meaning of a compound is not always a simple function of the meanings of its parts
    basketball shoes        red book
    baby shoes              red pen
    alligator shoes         red hair
    designer shoes          red herring
    brake shoes
Real Language (continued)
• Interpreting natural language with computer agents is challenging and still an open problem (but we are doing better)