Basic Parsing with Context-Free Grammars


Page 1: Basic Parsing with Context-Free Grammars

CS 4705

Basic Parsing with Context-Free Grammars

Page 2: Basic Parsing with Context-Free Grammars

Analyzing Linguistic Units

• Morphological parsing:
  – analyze words into morphemes and affixes
  – rule-based, FSAs, FSTs
• N-grams for language modeling
• POS tagging
• Syntactic parsing:
  – identify constituents and their relationships
  – to see if a sentence is grammatical
  – to assign an abstract representation of meaning

Page 3: Basic Parsing with Context-Free Grammars

Syntactic Parsing

• Declarative formalisms like CFGs and FSAs define the legal strings of a language -- but only tell you ‘this is a legal string of the language X’
• Parsing algorithms specify how to recognize the strings of a language and assign each string one (or more) syntactic analyses
• Parsing is useful for grammar checking, semantic analysis, MT, QA, information extraction, speech recognition…almost every task in NLP…but…

Page 4: Basic Parsing with Context-Free Grammars

Parsing as a Form of Search

• Searching FSAs
  – Finding the right path through the automaton
  – Search space defined by structure of FSA
• Searching CFGs
  – Finding the right parse tree among all possible parse trees
  – Search space defined by the grammar
• Constraints provided by the input sentence and the automaton or grammar

Page 5: Basic Parsing with Context-Free Grammars

CFG for Fragment of English

S --> NP VP
S --> Aux NP VP
S --> VP
NP --> Det Nom
NP --> PropN
Nom --> N Nom
Nom --> N
Nom --> Nom PP
VP --> V NP
VP --> V
PP --> Prep NP
N --> book | flight | meal | money
V --> book | include | prefer
Aux --> does
Prep --> from | to | on
PropN --> Houston | TWA
Det --> that | this | a

Top-Down, Bottom-Up, e.g. left corners (LCs)

Page 6: Basic Parsing with Context-Free Grammars

Parse Tree for ‘Book that flight’ for Prior CFG

[S [VP [V Book] [NP [Det that] [Nom [N flight]]]]]

Page 7: Basic Parsing with Context-Free Grammars

Rule Expansion

S --> NP VP
S --> Aux NP VP
S --> VP (1)
NP --> Det Nom (3)
NP --> PropN
Nom --> N Nom
Nom --> N (4)
Nom --> Nom PP
VP --> V NP (2)
VP --> V
PP --> Prep NP
N --> book | flight | meal | money
V --> book | include | prefer
Aux --> does
Prep --> from | to | on
PropN --> Houston | TWA
Det --> that | this | a

Top-Down, Bottom-Up, e.g. left corners (LCs)

Page 8: Basic Parsing with Context-Free Grammars

Top-Down Parser

• Builds from the root S node to the leaves
• Assuming we build all trees in parallel:
  – Find all trees with root S (or all rules w/ lhs S)
  – Next expand all constituents in these trees/rules
  – Continue until the leaves are POS categories
  – Candidate trees whose POS leaves fail to match the POS of the input string are rejected (e.g. Book that flight matches only one subtree)

Page 9: Basic Parsing with Context-Free Grammars

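Top-Down Search Space for CFG (expanding only leftmost leaves)

[Figure: successive levels of the search space, shown here as bracketed trees]
First expansion: [S NP VP]   [S Aux NP VP]   [S VP]
Second expansion: [S [NP Det Nom] VP]   [S [NP PropN] VP]   [S Aux [NP Det Nom] VP]   [S Aux [NP PropN] VP]   [S [VP V NP]]   [S [VP V]]
Further down (partially shown): Det Nom, then N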

Page 10: Basic Parsing with Context-Free Grammars

Bottom-Up Parsing

• Parser begins with words of input and builds up trees, applying grammar rules whose rhs match
  – Book that flight

    N    Det  N            V    Det  N
    Book that flight       Book that flight

  – ‘Book’ ambiguous (2 POS appear in grammar)
  – Parse continues until an S root node is reached or no further node expansion is possible
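
The bottom-up idea above can be sketched in a few lines of Python. This is only an illustrative recognizer, not the algorithm from the course: the `RULES` and `LEXICON` layouts and the function names are assumptions made for this example. It tries every POS assignment for the input and then exhaustively applies any rule whose right-hand side matches a run of adjacent symbols, stopping when a lone S is reached or nothing more can be reduced.

```python
from itertools import product

# Toy grammar from the slides; this data layout is an assumption of the sketch.
RULES = [
    ("S", ("NP", "VP")), ("S", ("Aux", "NP", "VP")), ("S", ("VP",)),
    ("NP", ("Det", "Nom")), ("NP", ("PropN",)),
    ("Nom", ("N", "Nom")), ("Nom", ("N",)), ("Nom", ("Nom", "PP")),
    ("VP", ("V", "NP")), ("VP", ("V",)),
    ("PP", ("Prep", "NP")),
]
LEXICON = {
    "book": ["N", "V"], "flight": ["N"], "meal": ["N"], "money": ["N"],
    "include": ["V"], "prefer": ["V"], "does": ["Aux"],
    "from": ["Prep"], "to": ["Prep"], "on": ["Prep"],
    "houston": ["PropN"], "twa": ["PropN"],
    "that": ["Det"], "this": ["Det"], "a": ["Det"],
}


def pos_sequences(words):
    """Every POS assignment for the words ('book' yields both N and V)."""
    return [list(seq) for seq in product(*(LEXICON[w.lower()] for w in words))]


def reduces_to_s(symbols, seen=None):
    """Try every possible reduction; True if some order of reductions yields ['S']."""
    seen = set() if seen is None else seen
    key = tuple(symbols)
    if key == ("S",):
        return True
    if key in seen:          # this symbol sequence was already explored
        return False
    seen.add(key)
    for lhs, rhs in RULES:
        n = len(rhs)
        for i in range(len(symbols) - n + 1):
            if tuple(symbols[i:i + n]) == rhs:
                reduced = symbols[:i] + [lhs] + symbols[i + n:]
                if reduces_to_s(reduced, seen):
                    return True
    return False


def bottom_up_results(sentence):
    """Map each candidate POS sequence to whether it can be built up to S."""
    words = sentence.split()
    return {tuple(seq): reduces_to_s(seq) for seq in pos_sequences(words)}


if __name__ == "__main__":
    for tags, ok in bottom_up_results("Book that flight").items():
        print(tags, "parses" if ok else "dead end")
```

With this grammar, the reading of ‘Book’ as N builds constituents that never combine into an S, which is the kind of wasted work bottom-up search can incur.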

Page 11: Basic Parsing with Context-Free Grammars

Two Candidates: One Successful Parse

• Successful: [S [VP [V Book] [NP [Det that] [Nom [N flight]]]]]
• Unsuccessful: [VP [V Book]] [NP [Det that] [Nom [N flight]]] -- no rule S --> VP NP

Page 12: Basic Parsing with Context-Free Grammars

What’s right/wrong with….

• Top-Down parsers – they never explore illegal parses (e.g. which can’t form an S) -- but waste time on trees that can never match the input
• Bottom-Up parsers – they never explore trees inconsistent with input -- but waste time exploring illegal parses (with no S root)
• For both: find a control strategy -- how to explore the search space efficiently?
  – Pursue all parses in parallel, or backtrack, or …?
  – Which rule to apply next?
  – Which node to expand next?

Page 13: Basic Parsing with Context-Free Grammars

A Possible Top-Down Parsing Strategy

• Depth-first search:
  – Agenda of search states: expand search space incrementally, exploring most recently generated state (tree) each time
  – When you reach a state (tree) inconsistent with input, backtrack to most recent unexplored state (tree)
• Which node to expand?
  – Leftmost or rightmost
• Which grammar rule to use?
  – Order in the grammar? How?

Page 14: Basic Parsing with Context-Free Grammars

Top-Down, Depth-First, Left-Right Strategy

• Initialize agenda with ‘S’ tree and ptr to first word (cur)
• Loop: Until successful parse or empty agenda
  – Apply next applicable grammar rule to leftmost unexpanded node (n) of current tree (t) on agenda and push resulting tree (t’) onto agenda
    • If n is a POS category and matches the POS of cur, push new tree (t’’) onto agenda and increment cur
    • Else pop t’ from agenda
  – Final agenda contains history of successful parse
• Example: Does this flight include a meal?
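
A minimal Python sketch of the agenda loop above, under stated assumptions: the `GRAMMAR` and `LEXICON` dictionaries, the function name, and the representation of a search state as (symbols left to expand, index of cur, rules used so far) are all choices made for this example, not the course's implementation. One guard is added that the slide does not mention: because this grammar has no empty rules, a state with more pending symbols than remaining words can be discarded, which also keeps the left-recursive rule Nom --> Nom PP from looping forever.

```python
# Toy grammar from the slides; rule order here is the order rules are tried.
GRAMMAR = {
    "S": [["NP", "VP"], ["Aux", "NP", "VP"], ["VP"]],
    "NP": [["Det", "Nom"], ["PropN"]],
    "Nom": [["N", "Nom"], ["N"], ["Nom", "PP"]],
    "VP": [["V", "NP"], ["V"]],
    "PP": [["Prep", "NP"]],
}
LEXICON = {
    "book": ["N", "V"], "flight": ["N"], "meal": ["N"], "money": ["N"],
    "include": ["V"], "prefer": ["V"], "does": ["Aux"],
    "from": ["Prep"], "to": ["Prep"], "on": ["Prep"],
    "houston": ["PropN"], "twa": ["PropN"],
    "that": ["Det"], "this": ["Det"], "a": ["Det"],
}


def parse_top_down(sentence):
    """Return the leftmost derivation as a list of steps, or None if no parse."""
    words = sentence.lower().rstrip("?.!").split()
    # The agenda is a stack, so the most recently generated state is explored first.
    agenda = [(["S"], 0, [])]
    while agenda:
        symbols, cur, history = agenda.pop()
        if not symbols:
            if cur == len(words):
                return history            # every word consumed: success
            continue                      # words left over: backtrack
        # Added guard (assumption of this sketch): each symbol needs >= 1 word.
        if len(symbols) > len(words) - cur:
            continue
        first, rest = symbols[0], symbols[1:]
        if first in GRAMMAR:
            # Push in reverse so the first rule in grammar order is popped first.
            for rhs in reversed(GRAMMAR[first]):
                agenda.append((rhs + rest, cur, history + [(first, tuple(rhs))]))
        elif first in LEXICON.get(words[cur], []):
            # POS category matches the current word: consume it and advance cur.
            agenda.append((rest, cur + 1, history + [(first, words[cur])]))
        # Otherwise the state is inconsistent with the input and is dropped.
    return None


if __name__ == "__main__":
    for step in parse_top_down("Does this flight include a meal?"):
        print(step)
```

For the example sentence, the search first tries S --> NP VP, fails to match ‘Does’ against any left edge of NP, and only then backtracks to S --> Aux NP VP.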

Page 15: Basic Parsing with Context-Free Grammars


Fig 10.7

CFG

Page 16: Basic Parsing with Context-Free Grammars

Left Corners: Top-Down Parsing with Bottom-Up Filtering

• We saw: Top-Down, depth-first, L2R parsing
  – Expands non-terminals along the tree’s left edge down to leftmost leaf of tree
  – Moves on to expand down to next leftmost leaf…
  – Note: In a successful parse, the current input word will be the first word in the derivation of the node the parser is currently processing
  – So….look ahead to left-corner of the tree
    • B is a left-corner of A if A =*=> B α
    • Build table with left-corners of all non-terminals in grammar and consult before applying rule

Page 17: Basic Parsing with Context-Free Grammars


Left Corners

Page 18: Basic Parsing with Context-Free Grammars

Left-Corner Table for CFG

Category    Left Corners (POS)
S           Det, PropN, Aux, V
NP          Det, PropN
Nom         N
VP          V
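
A table like the one above can be computed from the grammar rather than written by hand. The following is a sketch under assumptions: the `GRAMMAR` dictionary, `left_corner_table`, and `can_start` are names invented for this example. It repeatedly propagates the first symbol of each rule's right-hand side until nothing changes, treating any symbol without rules of its own as a POS category.

```python
# Toy grammar from the slides; symbols that are not keys are POS categories.
GRAMMAR = {
    "S": [["NP", "VP"], ["Aux", "NP", "VP"], ["VP"]],
    "NP": [["Det", "Nom"], ["PropN"]],
    "Nom": [["N", "Nom"], ["N"], ["Nom", "PP"]],
    "VP": [["V", "NP"], ["V"]],
    "PP": [["Prep", "NP"]],
}


def left_corner_table(grammar):
    """Map each non-terminal to the set of POS categories that can begin it."""
    table = {cat: set() for cat in grammar}
    changed = True
    while changed:                      # iterate to a fixed point
        changed = False
        for cat, expansions in grammar.items():
            for rhs in expansions:
                first = rhs[0]
                # A POS category is its own left corner; a non-terminal
                # contributes whatever left corners it has collected so far.
                new = table[first] if first in grammar else {first}
                if not new <= table[cat]:
                    table[cat] |= new
                    changed = True
    return table


def can_start(table, category, pos):
    """Filter: is it worth expanding `category` when the next word has tag `pos`?"""
    return pos in table.get(category, {category})


if __name__ == "__main__":
    table = left_corner_table(GRAMMAR)
    for cat in ("S", "NP", "Nom", "VP"):
        print(cat, sorted(table[cat]))
    # 'Does' is an Aux, so the rule S --> NP VP can be skipped without expanding NP.
    print(can_start(table, "NP", "Aux"))
```

The fixed-point loop is used instead of plain recursion so that the left-recursive rule Nom --> Nom PP does not cause trouble while building the table.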

Page 19: Basic Parsing with Context-Free Grammars

Left Recursion vs. Right Recursion

• Depth-first search will never terminate if grammar is left recursive (e.g. NP --> NP PP)
  – A grammar is left-recursive if, for some non-terminal A, A =*=> α A β with α =*=> ε

Page 20: Basic Parsing with Context-Free Grammars

• Solutions:
  – Rewrite the grammar (automatically?) to a weakly equivalent one which is not left-recursive
    e.g. The man {on the hill with the telescope…}
      NP --> NP PP (wanted: Nom plus a sequence of PPs)
      NP --> Nom PP
      NP --> Nom
      Nom --> Det N
    …becomes…
      NP --> Nom NP’
      Nom --> Det N
      NP’ --> PP NP’ (wanted: a sequence of PPs)
      NP’ --> ε
    • Not so obvious what these rules mean…

Page 21: Basic Parsing with Context-Free Grammars

  – Harder to detect and eliminate non-immediate left recursion
    • NP --> Nom PP
    • Nom --> NP
  – Fix depth of search explicitly
  – Rule ordering: non-recursive rules first
    • NP --> Det Nom
    • NP --> NP PP
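
The grammar rewrite listed among the solutions can be automated for the immediate case. The sketch below applies the standard textbook transformation (A --> A α | β becomes A --> β A’ and A’ --> α A’ | ε); the function name and the list-of-lists grammar layout are assumptions of this example, and it deliberately does not handle non-immediate left recursion such as NP --> Nom PP, Nom --> NP.

```python
def remove_immediate_left_recursion(grammar):
    """Rewrite A --> A alpha | beta as A --> beta A', A' --> alpha A' | epsilon.

    `grammar` maps a non-terminal to a list of right-hand sides (lists of
    symbols). An empty list stands for epsilon. Only immediate left
    recursion is handled.
    """
    new_grammar = {}
    for lhs, expansions in grammar.items():
        # alpha parts: what follows the recursive occurrence of lhs
        recursive = [rhs[1:] for rhs in expansions if rhs and rhs[0] == lhs]
        # beta parts: expansions that do not start with lhs
        others = [list(rhs) for rhs in expansions if not rhs or rhs[0] != lhs]
        if not recursive:
            new_grammar[lhs] = others
            continue
        tail = lhs + "'"                      # hypothetical name for the new symbol
        new_grammar[lhs] = [beta + [tail] for beta in others]
        new_grammar[tail] = [alpha + [tail] for alpha in recursive] + [[]]
    return new_grammar


if __name__ == "__main__":
    np_grammar = {
        "NP": [["NP", "PP"], ["Nom", "PP"], ["Nom"]],
        "Nom": [["Det", "N"]],
    }
    for lhs, expansions in remove_immediate_left_recursion(np_grammar).items():
        for rhs in expansions:
            print(lhs, "-->", " ".join(rhs) if rhs else "ε")
```

Run on the NP rules from the slide, the mechanical transformation also produces NP --> Nom PP NP’, which the hand-written version omits because NP --> Nom NP’ already covers a Nom followed by any sequence of PPs.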

Page 22: Basic Parsing with Context-Free Grammars

An Exercise: The city hall parking lot in town

• NP --> NP NP PP
• NP --> Det Nom
• NP --> Adj Nom
• NP --> Nom Nom
• Nom --> NP Nom
• Nom --> N
• PP --> Prep NP
• N --> city | hall | lot | town
• Adj --> parking
• Prep --> to | for | in

Page 23: Basic Parsing with Context-Free Grammars

Another Problem: Structural ambiguity

• Multiple legal structures
  – Attachment (e.g. I saw a man on a hill with a telescope)
  – Coordination (e.g. younger cats and dogs)
  – NP bracketing (e.g. Spanish language teachers)

Page 24: Basic Parsing with Context-Free Grammars


NP vs. VP Attachment

Page 25: Basic Parsing with Context-Free Grammars

• Solution?
  – Return all possible parses and disambiguate using “other methods”

Page 26: Basic Parsing with Context-Free Grammars

Summing Up

• Parsing is a search problem which may be implemented with many control strategies
  – Top-Down or Bottom-Up approaches each have problems
    • Combining the two solves some but not all issues
  – Left recursion
  – Syntactic ambiguity
• Next time: Making use of statistical information about syntactic constituents
  – Read Ch 12