Logic Form Representations Reading: Chap 14, Jurafsky & Martin
Context Free Grammars Reading: Chap 9, Jurafsky & Martin This slide set was adapted from J. Martin,...
-
Upload
reynard-banks -
Category
Documents
-
view
225 -
download
1
Transcript of Context Free Grammars Reading: Chap 9, Jurafsky & Martin This slide set was adapted from J. Martin,...
Context FreeGrammars
Reading: Chap 9, Jurafsky & Martin
This slide set was adapted from J. Martin, U. Colorado
Instructor: Rada Mihalcea
Slide 1
Syntax
Syntax = rules describing how words can connect to each other
* that and after year last I saw you yesterday colorless green ideas sleep furiously
• the kind of implicit knowledge of your native language that you had mastered by the time you were 3 or 4 years old without explicit instruction
• not necessarily the type of rules you were later taught in school.
Slide 1
Syntax
Why should you care?Grammar checkersQuestion answering Information extractionMachine translation
Slide 1
Context-Free Grammars
Capture constituency and orderingOrdering is easy
What are the rules that govern the ordering of words and bigger units in the language
What’s constituency?How do words group into units and what we say about how the various kinds of units behave
Slide 1
CFG Examples
S -> NP VPNP -> Det NOMINALNOMINAL -> NounVP -> VerbDet -> aNoun -> flightVerb -> left
these rules are defined independent of the context where they might occur -> CFG
Slide 1
CFGs
S -> NP VPThis says that there are units called S, NP, and VP in this
languageThat an S consists of an NP followed immediately by a VPDoesn’t say that that’s the only kind of SNor does it say that this is the only place that NPs and VPs occur
GenerativityAs with FSAs you can view these rules as either analysis or
synthesis machinesGenerate strings in the languageReject strings not in the languageImpose structures (trees) on strings in the language
Slide 1
Derivations
A derivation is a sequence of rules applied to a string that accounts for that stringCovers all the elements in the stringCovers only the elements in the string
Slide 1
Parsing
Parsing is the process of taking a string and a grammar and returning a (many?) parse tree(s) for that string
It is completely analogous to running a finite-state transducer It's just more powerful
Slide 1
Other Options
Regular languages (expressions)Too weak
Context-sensitive or Turing equivToo powerfulExample?
Slide 1
Context?
The notion of context in CFGs is not the same as the ordinary meaning of the word context in language.
All it really means is that the non-terminal on the left-hand side of a rule is out there all by itselfA -> B CMeans that I can rewrite an A as a B followed by a C regardless
of the context in which A is found
Slide 1
Sentence-Types
Declaratives: A plane leftS -> NP VP
Imperatives: Leave!S -> VP
Yes-No Questions: Did the plane leave?S -> Aux NP VP
WH Questions: When did the plane leave?S -> WH Aux NP VP
Slide 1
Recursion
We’ll have to deal with rules such as the following where the non-terminal on the left also appears somewhere on the right (directly).NP -> NP PP [[The flight] [to Boston]]VP -> VP PP [[departed Miami] [at noon]]
Slide 1
Recursion
Of course, this is what makes syntax interestingflights from DenverFlights from Denver to MiamiFlights from Denver to Miami in FebruaryFlights from Denver to Miami in February on a FridayFlights from Denver to Miami in February on a Friday under
$300Flights from Denver to Miami in February on a Friday under
$300 with lunch
Slide 1
The Point
If you have a rule likeVP -> V NP
It only cares that the thing after the verb is an NP. It doesn’t have to know about the internal affairs of that NP
Slide 1
The Point
VP -> V NPI hate
flights from DenverFlights from Denver to MiamiFlights from Denver to Miami in FebruaryFlights from Denver to Miami in February on a FridayFlights from Denver to Miami in February on a Friday under
$300Flights from Denver to Miami in February on a Friday under
$300 with lunch
Slide 1
Conjunctive Constructions
S -> S and SJohn went to NY and Mary followed him
NP -> NP and NPVP -> VP and VP…In fact the right rule for English is
X -> X and X
Slide 1
Agreement
This dogThose dogs
This dog eatsThose dogs eat
*This dogs*Those dog
*This dog eat*Those dogs eats
Slide 1
Subcategorization
Sneeze: John sneezedFind: Please find [a flight to NY]NPGive: Give [me]NP[a cheaper fare]NPHelp: Can you help [me]NP[with a flight]PPPrefer: I prefer [to leave earlier]TO-VPTold: I was told [United has a flight]S
…
*John sneezed the book*I prefer United has a flight*Give with a flight
Subcat expresses the constraints that a predicate (verb for now) places on the number and type of the argument it wants to take
Slide 1
So?
So the various rules for VPs overgenerate.They permit the presence of strings containing verbs and
arguments that don’t go togetherFor exampleVP -> V NP therefore
Sneezed the book is a VP since “sneeze” is a verb and “the book” is a valid NP
Subcategorization frames can fix this problem (“slow down” overgeneration)
Slide 1
Movement
• Core example– [[My travel agent]NP [booked [the flight]NP]VP]S
• I.e. “book” is a straightforward transitive verb. It expects a single NP arg within the VP as an argument, and a single NP arg as the subject.
Slide 1
Movement
• What about?– Which flight do you want me to have the travel agent
book?
• The direct object argument to “book” isn’t appearing in the right place. It is in fact a long way from where its supposed to appear.
• And note that its separated from its verb by 2 other verbs.
Slide 1
Formally…
To put all previous discussions/examples in a formal definition for CFG:
A context free grammar has four parameters:
1. A set of non-terminal symbols N2. A set of terminal symbols T3. A set of production rules P, each of the form A a, where
A is a non-terminal, and a is a string of symbols from the infinite set of strings (T N)*
4. A designated start symbol S
Slide 1
Grammar equivalence and normal form
Strong equivalence:– two grammars are strongly equivalent if:
• they generate the same set of strings• they assign the same phrase structure to each sentence
– two grammars are weakly equivalent if:• they generate the same set of strings• they do not assign the same phrase structure to each sentence
Normal form – Restrict the form of productions– Chomsky Normal Form (CNF)– Right hand side of the productions has either two terminals, or
one non-terminal– e.g. A -> BC A -> a– Any grammar can be translated into a weakly equivalent CNF– A -> B C D <=> A-> B X X -> C D