Natural Language Processing
Transcript of Natural Language Processing
Natural Language Processing
Finite State Morphology
Slides adapted from Jurafsky & Martin, Speech and Language Processing
Outline
• Finite State Automata
• (English) Morphology
• Finite State Transducers
Finite State Automata
• Let’s start with the sheep language from Chapter 2 /baa+!/
Sheep FSA
• We can say the following things about this machine:
  • It has 5 states
  • b, a, and ! are in its alphabet
  • q0 is the start state
  • q4 is an accept state
  • It has 5 transitions
More Formally
• You can specify an FSA by enumerating the following things:
  • The set of states: Q
  • A finite alphabet: Σ
  • A start state
  • A set of accept/final states
  • A transition relation that maps Q × Σ to Q
Yet Another View
• The guts of FSAs can ultimately be represented as tables
| State | b | a | ! | ε |
|-------|---|---|---|---|
| 0     | 1 |   |   |   |
| 1     |   | 2 |   |   |
| 2     |   | 3 |   |   |
| 3     |   | 3 | 4 |   |
| 4     |   |   |   |   |
If you’re in state 1 and you’re looking at an a, go to state 2
Recognition
• Recognition is the process of determining if a string should be accepted by a machine
• Or… it’s the process of determining if a string is in the language we’re defining with the machine
• Or… it’s the process of determining if a regular expression matches a string
• Those all amount to the same thing in the end
Recognition
• Traditionally (following Turing’s notion), this process is depicted with a tape.
Recognition
• Simply a process of starting in the start state
• Examining the current input
• Consulting the table
• Going to a new state and updating the tape pointer
• Until you run out of tape
D-Recognize
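The D-RECOGNIZE idea can be sketched as a short table-driven interpreter. The transition table below encodes the sheep FSA from the earlier slides; the dict-of-pairs representation is an assumption of this sketch, not notation from the slides.

```python
# A minimal table-driven D-RECOGNIZE sketch for the sheep language /baa+!/.
# Rows of the earlier table become (state, symbol) keys; a missing entry
# means there is no transition, so the machine rejects.

SHEEP_TABLE = {
    (0, "b"): 1,
    (1, "a"): 2,
    (2, "a"): 3,
    (3, "a"): 3,
    (3, "!"): 4,
}
ACCEPT_STATES = {4}

def d_recognize(tape, table=SHEEP_TABLE, accept=ACCEPT_STATES):
    """Return True iff the tape is accepted by the deterministic machine."""
    state = 0
    for symbol in tape:
        if (state, symbol) not in table:   # no defined transition: reject
            return False
        state = table[(state, symbol)]
    return state in accept                 # out of tape: check accept state

print(d_recognize("baa!"))    # True
print(d_recognize("baaaa!"))  # True
print(d_recognize("ba!"))     # False
```

Changing the machine really does mean changing only the table: the interpreter loop never mentions sheep.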
Key Points
• Deterministic means that at each point in processing there is always one unique thing to do (no choices).
• D-recognize is a simple table-driven interpreter
• The algorithm is universal for all unambiguous regular languages. To change the machine, you simply change the table.
Recognition as Search
• You can view this algorithm as a trivial kind of state-space search.
• States are pairings of tape positions and state numbers.
• Operators are compiled into the table
• Goal state is a pairing of the end-of-tape position and a final accept state
• It is trivial because?
Non-Determinism
Non-Determinism
• Yet another technique: epsilon transitions
• Key point: these transitions do not examine or advance the tape during recognition
Equivalence
• Non-deterministic machines can be converted to deterministic ones with a fairly simple construction
• That means that they have the same power; non-deterministic machines are not more powerful than deterministic ones in terms of the languages they can accept
ND Recognition
• Two basic approaches (used in all major implementations of regular expressions, see Friedl 2006)
1. Either take a ND machine and convert it to a D machine and then do recognition with that.
2. Or explicitly manage the process of recognition as a state-space search (leaving the machine as is).
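Approach 2 can be sketched as an agenda-driven search over (machine state, tape position) pairs. The non-deterministic table below is an illustrative variant of the sheep machine, an assumption of this sketch rather than a machine from the slides.

```python
# Non-deterministic recognition managed as a state-space search.
# Search states are (machine state, tape position) pairs; the agenda
# holds the as-yet-unexplored ones.

ND_TABLE = {
    (0, "b"): {1},
    (1, "a"): {2},
    (2, "a"): {2, 3},   # the non-deterministic choice: stay or move on
    (3, "!"): {4},
}
ACCEPT_STATES = {4}

def nd_recognize(tape, table=ND_TABLE, accept=ACCEPT_STATES):
    agenda = [(0, 0)]                       # start state, start of tape
    while agenda:
        state, pos = agenda.pop()           # depth-first; a queue also works
        if pos == len(tape):
            if state in accept:             # goal: end of tape + accept state
                return True
            continue
        for nxt in table.get((state, tape[pos]), ()):
            agenda.append((nxt, pos + 1))   # unexplored successors
    return False
```

Success means some path reached an accept state at the end of the tape; failure means the agenda emptied with every path exhausted, exactly as the following slides describe.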
Non-Deterministic Recognition: Search
• In a ND FSA there exists at least one path through the machine for a string that is in the language defined by the machine.
• But not all paths through the machine for a string that should be accepted lead to an accept state.
• No paths through the machine lead to an accept state for a string not in the language.
Non-Deterministic Recognition
• So success in non-deterministic recognition occurs when a path is found through the machine that ends in an accept.
• Failure occurs when all of the possible paths for a given string lead to failure.
Example
Key Points
• States in the search space are pairings of tape positions and states in the machine.
• By keeping track of as yet unexplored states, a recognizer can systematically explore all the paths through the machine given an input.
Why Bother?
• Non-determinism doesn’t get us more formal power, and it causes headaches, so why bother?
  • More natural (understandable) solutions
  • Not always equivalent
Compositional Machines
• Formal languages are just sets of strings
• Therefore, we can talk about various set operations (intersection, union, concatenation)
• This turns out to be a useful exercise
Union
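The union construction can be sketched with epsilon transitions: a new start state with ε-arcs into the start states of both original machines. The (start, accepts, table) triple representation here is an assumption of this sketch, not notation from the slides.

```python
# Union of two FSAs via a fresh start state with epsilon transitions.
# A machine is (start, accept_states, table); the table maps
# (state, symbol) -> set of destination states, with "" playing epsilon.

def union(m1, m2):
    start1, acc1, tab1 = m1
    start2, acc2, tab2 = m2
    table = {}
    # tag states 'A'/'B' so the two machines' state names cannot collide
    for (s, c), dests in tab1.items():
        table[(("A", s), c)] = {("A", d) for d in dests}
    for (s, c), dests in tab2.items():
        table[(("B", s), c)] = {("B", d) for d in dests}
    new_start = "S"
    # epsilon arcs from the new start into both original start states
    table[(new_start, "")] = {("A", start1), ("B", start2)}
    accepts = {("A", s) for s in acc1} | {("B", s) for s in acc2}
    return new_start, accepts, table
```

Concatenation works analogously: ε-arcs from the accept states of the first machine into the start state of the second.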
Concatenation
Words
• Finite-state methods are particularly useful in dealing with a lexicon
• Many devices, most with limited memory, need access to large lists of words
• And they need to perform fairly sophisticated tasks with those lists
• So we’ll first talk about some facts about words and then come back to computational methods
English Morphology
• Morphology is the study of the ways that words are built up from smaller meaningful units called morphemes
• We can usefully divide morphemes into two classes:
  • Stems: the core meaning-bearing units
  • Affixes: bits and pieces that adhere to stems to change their meanings and grammatical functions
English Morphology
• We can further divide morphology up into two broad classes:
  • Inflectional
  • Derivational
Inflectional Morphology
• Inflectional morphology concerns the combination of stems and affixes where the resulting word:
  • Has the same word class as the original
  • Serves a grammatical/semantic purpose that is different from the original
  • But is nevertheless transparently related to the original
Nouns and Verbs in English
• Nouns are simple
  • Markers for plural and possessive
• Verbs are only slightly more complex
  • Markers appropriate to the tense of the verb
Regulars and Irregulars
• It is a little complicated by the fact that some words misbehave (refuse to follow the rules)
  • Mouse/mice, goose/geese, ox/oxen
  • Go/went, fly/flew
• The terms regular and irregular are used to refer to words that follow the rules and those that don’t
Regular and Irregular Verbs
• Regulars
  • Walk, walks, walking, walked, walked
• Irregulars
  • Eat, eats, eating, ate, eaten
  • Catch, catches, catching, caught, caught
  • Cut, cuts, cutting, cut, cut
Derivational Morphology
• Derivational morphology is the messy stuff that no one ever taught you.
  • Quasi-systematicity
  • Irregular meaning change
  • Changes of word class
Derivational Examples
• Verbs and Adjectives to Nouns

| Suffix | Base        | Derived Noun    |
|--------|-------------|-----------------|
| -ation | computerize | computerization |
| -ee    | appoint     | appointee       |
| -er    | kill        | killer          |
| -ness  | fuzzy       | fuzziness       |
Derivational Examples
• Nouns and Verbs to Adjectives

| Suffix | Base        | Derived Adjective |
|--------|-------------|-------------------|
| -al    | computation | computational     |
| -able  | embrace     | embraceable       |
| -less  | clue        | clueless          |
Morphology and FSAs
• We’d like to use the machinery provided by FSAs to capture these facts about morphology:
  • Accept strings that are in the language
  • Reject strings that are not
  • And do so in a way that doesn’t require us to in effect list all the words in the language
Start Simple
• Regular singular nouns are ok
• Regular plural nouns have an -s on the end
• Irregulars are ok as is
Simple Rules
Now Plug in the Words
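The "plug in the words" idea can be sketched with arcs labeled by whole stems and affixes rather than single letters. The tiny word lists below are illustrative placeholders, not a real lexicon, and spelling changes like fox/foxes are deliberately ignored here since the transducer machinery for them comes later.

```python
# A noun FSA sketch at the word level: regular singulars, regular
# plurals formed with -s, and irregulars listed as-is.

REG_NOUNS = {"cat", "dog", "fox"}          # regular stems (placeholder list)
IRREG_SG  = {"goose", "mouse", "ox"}       # irregular singulars
IRREG_PL  = {"geese", "mice", "oxen"}      # irregular plurals

def accepts_noun(word):
    if word in IRREG_SG or word in IRREG_PL:        # irregulars are ok as is
        return True
    if word in REG_NOUNS:                           # regular singular
        return True
    if word.endswith("s") and word[:-1] in REG_NOUNS:  # regular plural -s
        return True
    return False

print(accepts_noun("cats"))    # True
print(accepts_noun("geese"))   # True
print(accepts_noun("gooses"))  # False
```

Note that nothing is listed exhaustively at the word-form level: the plural rule covers every regular stem at once, which is the whole point of the FSA approach.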
Derivational Rules
If everything is an accept state, how do things ever get rejected?
Parsing/Generation vs. Recognition
• We can now run strings through these machines to recognize strings in the language
• But recognition is usually not quite what we need
  • Often if we find some string in the language we might like to assign a structure to it (parsing)
  • Or we might have some structure and we want to produce a surface form for it (production/generation)
• Example: from “cats” to “cat +N +PL”
Finite State Transducers
• The simple story:
  • Add another tape
  • Add extra symbols to the transitions
• On one tape we read “cats”, on the other we write “cat +N +PL”
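The two-tape idea can be sketched with transitions that carry an input:output symbol pair. The single hard-wired path below, mapping surface "cats" to lexical "cat +N +PL", is illustrative only; a real analyzer would share arcs across a whole lexicon.

```python
# An FST sketch: (state, input symbol) -> (next state, output string).
# "" plays epsilon on the input side, so the +N arc reads nothing
# from the surface tape while writing "+N" to the lexical tape.

CAT_FST = {
    (0, "c"): (1, "c"),
    (1, "a"): (2, "a"),
    (2, "t"): (3, "t"),
    (3, ""):  (4, " +N"),    # the +N:epsilon pair, run surface-to-lexical
    (4, "s"): (5, " +PL"),   # the +PL:s pair
}
ACCEPT = {5}

def transduce(surface):
    state, out, i = 0, "", 0
    while True:
        if (state, "") in CAT_FST:                   # take an epsilon arc
            state, sym = CAT_FST[(state, "")]
            out += sym
            continue
        if i < len(surface) and (state, surface[i]) in CAT_FST:
            state, sym = CAT_FST[(state, surface[i])]
            out += sym
            i += 1
            continue
        break
    return out if state in ACCEPT and i == len(surface) else None

print(transduce("cats"))   # cat +N +PL
```

This toy path is deterministic, so a simple loop suffices; the ambiguity slides below explain why real FSTs need search over paths instead.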
FSTs
Applications
• The kind of parsing we’re talking about is normally called morphological analysis
• It can either be:
  • An important stand-alone component of many applications (spelling correction, information retrieval)
  • Or simply a link in a chain of further linguistic analysis
Transitions
• c:c means read a c on one tape and write a c on the other
• +N:ε means read a +N symbol on one tape and write nothing on the other
• +PL:s means read +PL and write an s

c:c  a:a  t:t  +N:ε  +PL:s
Ambiguity
• Recall that in non-deterministic recognition multiple paths through a machine may lead to an accept state.
  • It didn’t matter which path was actually traversed
• In FSTs the path to an accept state does matter, since different paths represent different parses and different outputs will result
Ambiguity
• What’s the right parse (segmentation) for:
  • Unionizable
  • Union-ize-able
  • Un-ion-ize-able
• Each represents a valid path through the derivational morphology machine.
Ambiguity
• There are a number of ways to deal with this problem:
  • Simply take the first output found
  • Find all the possible outputs (all paths) and return them all (without choosing)
  • Bias the search so that only one or a few likely paths are explored
The Gory Details
• Of course, it’s not as easy as “cat +N +PL” <-> “cats”
• As we saw earlier, there are geese, mice and oxen
• But there are also a whole host of spelling/pronunciation changes that go along with inflectional changes
  • Cats vs. dogs
  • Fox and foxes
Multi-Tape Machines
• To deal with these complications, we will add more tapes and use the output of one tape machine as the input to the next
• So to handle irregular spelling changes we’ll add intermediate tapes with intermediate symbols
Multi-Level Tape Machines
• We use one machine to transduce between the lexical and the intermediate level, and another to handle the spelling changes to the surface tape
Overall Scheme
Cascades
• This is an architecture that we’ll see again and again
• Overall processing is divided up into distinct rewrite steps
• The output of one layer serves as the input to the next
• The intermediate tapes may or may not wind up being useful in their own right
Overall Plan
Final Scheme
Conclusion
• Finite state machines provide flexible and efficient models of words
• Finite state transducers are the method of choice for morphological analysis
  • If there is a solved problem in NLP, this is it!
• Why not use finite state techniques for all problems in NLP?
Lexical to Intermediate Level
Intermediate to Surface
• The “add an e” rule, as in fox^s# <-> foxes#
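The e-insertion spelling rule can be sketched as a single left-to-right pass over the intermediate tape: insert an "e" between a stem ending in x, s, or z and a plural "s", where "^" marks a morpheme boundary and "#" the end of the word. A string-scanning function stands in here for the real rule transducer; it is a sketch of the rule's effect, not of the transducer itself.

```python
# Intermediate-to-surface sketch of the e-insertion rule:
# epsilon -> e in the context {x,s,z} ^ __ s #, as in fox^s# -> foxes.

def intermediate_to_surface(form):
    out = []
    chars = list(form)
    for i, c in enumerate(chars):
        if c == "^":
            # rule context: stem-final x/s/z, then boundary, then s#
            if i > 0 and chars[i - 1] in "xsz" and form[i + 1:] == "s#":
                out.append("e")
            continue                  # the boundary symbol is never printed
        if c == "#":
            continue                  # the end-of-word marker is not printed
        out.append(c)
    return "".join(out)

print(intermediate_to_surface("fox^s#"))  # foxes
print(intermediate_to_surface("cat^s#"))  # cats
```

When no context matches, the pass just erases the intermediate symbols, so regular forms like cats come through unchanged.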
Foxes
Composition
1. Create a set of new states that correspond to each pair of states from the original machines (New states are called (x,y), where x is a state from M1, and y is a state from M2)
2. Create a new FST transition table for the new machine according to the following intuition …
Composition
• There should be a transition between two states in the new machine if it’s the case that the output for a transition from a state of M1 is the same as the input to a transition from a state of M2, or …
Composition
• δ3((xa, ya), i:o) = (xb, yb) iff there exists some c such that δ1(xa, i:c) = xb and δ2(ya, c:o) = yb
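The δ3 definition above can be sketched directly: represent each transducer as a set of (state, input, output, next-state) transitions and pair up states wherever M1's output symbol matches M2's input symbol. This flat-set representation is an assumption of the sketch; it ignores epsilons, which a full composition algorithm must handle separately.

```python
# FST composition following the delta_3 definition: a composed transition
# ((xa,ya), i:o, (xb,yb)) exists iff some intermediate symbol c gives
# delta_1(xa, i:c) = xb and delta_2(ya, c:o) = yb.
from itertools import product

def compose(t1, t2):
    """t1, t2: iterables of (state, in_sym, out_sym, next_state)."""
    composed = set()
    for (xa, i, c1, xb), (ya, c2, o, yb) in product(t1, t2):
        if c1 == c2:                       # M1's output feeds M2's input
            composed.add(((xa, ya), i, o, (xb, yb)))
    return composed

# toy cascade: M1 rewrites a->b, M2 rewrites b->c, so M1 o M2 maps a->c
t1 = {(0, "a", "b", 1)}
t2 = {(0, "b", "c", 1)}
print(compose(t1, t2))   # {((0, 0), 'a', 'c', (1, 1))}
```

This is the formal counterpart of the cascade slides: composing the lexical-to-intermediate machine with the spelling-rule machine yields one transducer that goes straight from lexical to surface.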