Basic Parsing with Context- Free Grammarskathy/NLP/ClassSlides/Class7... · 2009-09-29 ·...

Post on 11-Aug-2020

0 views 0 download

Transcript of Basic Parsing with Context- Free Grammarskathy/NLP/ClassSlides/Class7... · 2009-09-29 ·...

Basic Parsing with Context-Free Grammars

1

Some slides adapted from Julia Hirschberg and Dan Jurafsky

To view past videos httpglobecvncolumbiaedu8080oncampusph

pc=133ae14752e27fde909fdbd64c06b337

Usually available only for 1 week Right now available for all previous lectures

2

3

4

5

Declarative formalisms like CFGs FSAs define the legal strings of a language -- but only tell you lsquothis is a legal string of the language XrsquoParsing algorithms specify how to recognize the strings of a language and assign each string one (or more) syntactic analyses

6

Many possible CFGs for English here is an example (fragment) S rarr NP VP VP rarr V NP NP rarr Det N | Adj NP N rarr boy | girl V rarr sees | likes Adj rarr big | small DetP rarr a | the

big the small girl sees a boy John likes a girl I like a girl I sleep The old dog the footsteps of the young

the small boy likes a girl

S NP VP VP VS Aux NP VP VP -gt V PPS -gt VP PP -gt Prep NPNP Det Nom N old | dog | footsteps |

young | flight

NP PropN V dog | include | prefer | book

NP -gt PronounNom -gt Adj Nom Aux doesNom N Prep from | to | on | ofNom N Nom PropN Bush | McCain |

ObamaNom Nom PP Det that | this | a| theVP V NP Adj -gt old | green | red

Parse Tree for lsquoThe old dog the footsteps of the youngrsquo for Prior CFG

S

NP VP

NPV

DETNOM

N PP

DET NOM

N

The old dog the

footstepsof the young

Searching FSAs Finding the right path through the automaton Search space defined by structure of FSASearching CFGs Finding the right parse tree among all possible

parse trees Search space defined by the grammarConstraints provided by the input sentenceand the automaton or grammar

10

Builds from the root S node to the leavesExpectation-basedCommon search strategy Top-down left-to-right backtracking Try first rule with LHS = S Next expand all constituents in these treesrules Continue until leaves are POS Backtrack when candidate POS does not match input string

11

ldquoThe old dog the footsteps of the youngrdquoWhere does backtracking happen

What are the computational disadvantages

What are the advantages

12

Parser begins with words of input and builds up trees applying grammar rules whose RHS matches

Det N V Det N Prep Det NThe old dog the footsteps of the young

Det Adj N Det N Prep Det NThe old dog the footsteps of the young

Parse continues until an S root node reached or no further node expansion possible

13

Det N V Det N Prep Det NThe old dog the footsteps of the youngDet Adj N Det N Prep Det N

14

When does disambiguation occur

What are the computational advantages and disadvantages

15

Top-Down parsers ndash they never explore illegal parses (eg which canrsquot form an S) -- but waste time on trees that can never match the inputBottom-Up parsers ndash they never explore trees inconsistent with input -- but waste time exploring illegal parses (with no S root)For both find a control strategy -- how explore search space efficiently Pursuing all parses in parallel or backtrack or hellip Which rule to apply next Which node to expand next

16

Dynamic Programming Approaches ndash Use a chart to represent partial results

CKY Parsing Algorithm Bottom-up Grammar must be in Normal Form The parse tree might not be consistent with linguistic

theoryEarly Parsing Algorithm Top-down Expectations about constituents are confirmed by input A POS tag for a word that is not predicted is never addedChart Parser

17

Allows arbitrary CFGsFills a table in a single sweep over the input words Table is length N+1 N is number of words Table entries represent

Completed constituents and their locationsIn-progress constituentsPredicted constituents

18

The table-entries are called states and are represented with dotted-rulesS -gt VP A VP is predicted

NP -gt Det Nominal An NP is in progress

VP -gt V NP A VP has been found

19

It would be nice to know where these things are in the input sohellipS -gt VP [00] A VP is predicted at the

start of the sentence

NP -gt Det Nominal [12] An NP is in progress the Det goes from 1 to 2

VP -gt V NP [03] A VP has been found starting at 0 and ending at 3

20

21

As with most dynamic programming approaches the answer is found by looking in the table in the right placeIn this case there should be an S state in the final column that spans from 0 to n+1 and is completeIf thatrsquos the case yoursquore done S ndashgt α [0n+1]

22

March through chart left-to-rightAt each step apply 1 of 3 operators Predictor

Create new states representing top-down expectations Scanner

Match word predictions (rule with word after dot) to words

CompleterWhen a state is complete see what rules were looking for that completed constituent

23

Given a state With a non-terminal to right of dot (not a part-

of-speech category) Create a new state for each expansion of the

non-terminal Place these new states into same chart entry as

generated state beginning and ending where generating state ends So predictor looking at

S -gt VP [00] results in

VP -gt Verb [00]VP -gt Verb NP [00]

24

Given a state With a non-terminal to right of dot that is a part-of-

speech category If the next word in the input matches this POS Create a new state with dot moved over the non-

terminal So scanner looking at VP -gt Verb NP [00] If the next word ldquobookrdquo can be a verb add new state

VP -gt Verb NP [01] Add this state to chart entry following current one Note Earley algorithm uses top-down input to

disambiguate POS Only POS predicted by some state can get added to chart

25

Applied to a state when its dot has reached right end of roleParser has discovered a category over some span of inputFind and advance all previous states that were looking for this category copy state move dot insert in current chart entryGiven NP -gt Det Nominal [13] VP -gt Verb NP [01]Add VP -gt Verb NP [03]

26

Find an S state in the final column that spans from 0 to n+1 and is complete

If thatrsquos the case yoursquore done S ndashgt α [0n+1]

27

More specificallyhellip

1 Predict all the states you can upfront

2 Read a word1 Extend states based on matches2 Add new predictions3 Go to 2

3 Look at N+1 to see if you have a winner

28

Book that flightWe should findhellip an S from 0 to 3 that is a completed statehellip

29

S NP VP VP VS Aux NP VP PP -gt Prep NPNP Det Nom N old | dog | footsteps |

young

NP PropN V dog | include | preferNom -gt Adj Nom Aux doesNom N Prep from | to | on | ofNom N Nom PropN Bush | McCain |

ObamaNom Nom PP Det that | this | a| theVP V NP Adj -gt old | green | red

31

32

33

What kind of algorithms did we just describe Not parsers ndash recognizers

The presence of an S state with the right attributes in the right place indicates a successful recognitionBut no parse treehellip no parserThatrsquos how we solve (not) an exponential problem in polynomial time

34

With the addition of a few pointers we have a parserAugment the ldquoCompleterrdquo to point to where we came from

35

S8S9

S10

S11

S13S12

S8

S9S8

All the possible parses for an input are in the table

We just need to read off all the backpointers from every complete S in the last column of the table

Find all the S -gt X [0N+1]

Follow the structural traces from the Completer

Of course this wonrsquot be polynomial time since there could be an exponential number of trees

We can at least represent ambiguity efficiently

37

Depth-first search will never terminate if grammar is left recursive (eg NP --gt NP PP)

38

)( εαα ⎯rarr⎯ΑΒ⎯rarr⎯Α

Solutions Rewrite the grammar (automatically) to a weakly

equivalent one which is not left-recursiveeg The man on the hill with the telescopehellipNP NP PP (wanted Nom plus a sequence of PPs)NP Nom PPNP NomNom Det NhellipbecomeshellipNP Nom NPrsquoNom Det NNPrsquo PP NPrsquo (wanted a sequence of PPs)NPrsquo e

Not so obvious what these rules meanhellip

Harder to detect and eliminate non-immediate left recursion

NP --gt Nom PPNom --gt NP

Fix depth of search explicitly

Rule ordering non-recursive rules firstNP --gt Det NomNP --gt NP PP

40

Multiple legal structures Attachment (eg I saw a man on a hill with a

telescope) Coordination (eg younger cats and dogs) NP bracketing (eg Spanish language teachers)

41

42

NP vs VP Attachment

Solution Return all possible parses and disambiguate

using ldquoother methodsrdquo

43

Parsing is a search problem which may be implemented with many control strategies Top-Down or Bottom-Up approaches each have

problemsCombining the two solves some but not all issues

Left recursion Syntactic ambiguityNext time Making use of statistical information about syntactic constituents Read Ch 14

44

  • Slide Number 1
  • Announcements
  • Homework Questions
  • Evaluation
  • Syntactic Parsing
  • Syntactic Parsing
  • CFG Example
  • Modified CFG
  • Slide Number 9
  • Parsing as a Form of Search
  • Top-Down Parser
  • Rule Expansion
  • Bottom-Up Parsing
  • Slide Number 14
  • Bottom-up parsing
  • Whatrsquos rightwrong withhellip
  • Some Solutions
  • Earley Parsing
  • States
  • StatesLocations
  • Graphically
  • Earley
  • Earley Algorithm
  • Predictor
  • Scanner
  • Completer
  • How do we know we are done
  • Earley
  • Example
  • CFG for Fragment of English
  • Example
  • Example
  • Example
  • Details
  • Converting Earley from Recognizer to Parser
  • Augmenting the chart with structural information
  • Retrieving Parse Trees from Chart
  • Left Recursion vs Right Recursion
  • Slide Number 39
  • Slide Number 40
  • Another Problem Structural ambiguity
  • Slide Number 42
  • Slide Number 43
  • Summing Up

To view past videos httpglobecvncolumbiaedu8080oncampusph

pc=133ae14752e27fde909fdbd64c06b337

Usually available only for 1 week Right now available for all previous lectures

2

3

4

5

Declarative formalisms like CFGs FSAs define the legal strings of a language -- but only tell you lsquothis is a legal string of the language XrsquoParsing algorithms specify how to recognize the strings of a language and assign each string one (or more) syntactic analyses

6

Many possible CFGs for English here is an example (fragment) S rarr NP VP VP rarr V NP NP rarr Det N | Adj NP N rarr boy | girl V rarr sees | likes Adj rarr big | small DetP rarr a | the

big the small girl sees a boy John likes a girl I like a girl I sleep The old dog the footsteps of the young

the small boy likes a girl

S NP VP VP VS Aux NP VP VP -gt V PPS -gt VP PP -gt Prep NPNP Det Nom N old | dog | footsteps |

young | flight

NP PropN V dog | include | prefer | book

NP -gt PronounNom -gt Adj Nom Aux doesNom N Prep from | to | on | ofNom N Nom PropN Bush | McCain |

ObamaNom Nom PP Det that | this | a| theVP V NP Adj -gt old | green | red

Parse Tree for lsquoThe old dog the footsteps of the youngrsquo for Prior CFG

S

NP VP

NPV

DETNOM

N PP

DET NOM

N

The old dog the

footstepsof the young

Searching FSAs Finding the right path through the automaton Search space defined by structure of FSASearching CFGs Finding the right parse tree among all possible

parse trees Search space defined by the grammarConstraints provided by the input sentenceand the automaton or grammar

10

Builds from the root S node to the leavesExpectation-basedCommon search strategy Top-down left-to-right backtracking Try first rule with LHS = S Next expand all constituents in these treesrules Continue until leaves are POS Backtrack when candidate POS does not match input string

11

ldquoThe old dog the footsteps of the youngrdquoWhere does backtracking happen

What are the computational disadvantages

What are the advantages

12

Parser begins with words of input and builds up trees applying grammar rules whose RHS matches

Det N V Det N Prep Det NThe old dog the footsteps of the young

Det Adj N Det N Prep Det NThe old dog the footsteps of the young

Parse continues until an S root node reached or no further node expansion possible

13

Det N V Det N Prep Det NThe old dog the footsteps of the youngDet Adj N Det N Prep Det N

14

When does disambiguation occur

What are the computational advantages and disadvantages

15

Top-Down parsers ndash they never explore illegal parses (eg which canrsquot form an S) -- but waste time on trees that can never match the inputBottom-Up parsers ndash they never explore trees inconsistent with input -- but waste time exploring illegal parses (with no S root)For both find a control strategy -- how explore search space efficiently Pursuing all parses in parallel or backtrack or hellip Which rule to apply next Which node to expand next

16

Dynamic Programming Approaches ndash Use a chart to represent partial results

CKY Parsing Algorithm Bottom-up Grammar must be in Normal Form The parse tree might not be consistent with linguistic

theoryEarly Parsing Algorithm Top-down Expectations about constituents are confirmed by input A POS tag for a word that is not predicted is never addedChart Parser

17

Allows arbitrary CFGsFills a table in a single sweep over the input words Table is length N+1 N is number of words Table entries represent

Completed constituents and their locationsIn-progress constituentsPredicted constituents

18

The table-entries are called states and are represented with dotted-rulesS -gt VP A VP is predicted

NP -gt Det Nominal An NP is in progress

VP -gt V NP A VP has been found

19

It would be nice to know where these things are in the input sohellipS -gt VP [00] A VP is predicted at the

start of the sentence

NP -gt Det Nominal [12] An NP is in progress the Det goes from 1 to 2

VP -gt V NP [03] A VP has been found starting at 0 and ending at 3

20

21

As with most dynamic programming approaches the answer is found by looking in the table in the right placeIn this case there should be an S state in the final column that spans from 0 to n+1 and is completeIf thatrsquos the case yoursquore done S ndashgt α [0n+1]

22

March through chart left-to-rightAt each step apply 1 of 3 operators Predictor

Create new states representing top-down expectations Scanner

Match word predictions (rule with word after dot) to words

CompleterWhen a state is complete see what rules were looking for that completed constituent

23

Given a state With a non-terminal to right of dot (not a part-

of-speech category) Create a new state for each expansion of the

non-terminal Place these new states into same chart entry as

generated state beginning and ending where generating state ends So predictor looking at

S -gt VP [00] results in

VP -gt Verb [00]VP -gt Verb NP [00]

24

Given a state With a non-terminal to right of dot that is a part-of-

speech category If the next word in the input matches this POS Create a new state with dot moved over the non-

terminal So scanner looking at VP -gt Verb NP [00] If the next word ldquobookrdquo can be a verb add new state

VP -gt Verb NP [01] Add this state to chart entry following current one Note Earley algorithm uses top-down input to

disambiguate POS Only POS predicted by some state can get added to chart

25

Applied to a state when its dot has reached right end of roleParser has discovered a category over some span of inputFind and advance all previous states that were looking for this category copy state move dot insert in current chart entryGiven NP -gt Det Nominal [13] VP -gt Verb NP [01]Add VP -gt Verb NP [03]

26

Find an S state in the final column that spans from 0 to n+1 and is complete

If thatrsquos the case yoursquore done S ndashgt α [0n+1]

27

More specificallyhellip

1 Predict all the states you can upfront

2 Read a word1 Extend states based on matches2 Add new predictions3 Go to 2

3 Look at N+1 to see if you have a winner

28

Book that flightWe should findhellip an S from 0 to 3 that is a completed statehellip

29

S NP VP VP VS Aux NP VP PP -gt Prep NPNP Det Nom N old | dog | footsteps |

young

NP PropN V dog | include | preferNom -gt Adj Nom Aux doesNom N Prep from | to | on | ofNom N Nom PropN Bush | McCain |

ObamaNom Nom PP Det that | this | a| theVP V NP Adj -gt old | green | red

31

32

33

What kind of algorithms did we just describe Not parsers ndash recognizers

The presence of an S state with the right attributes in the right place indicates a successful recognitionBut no parse treehellip no parserThatrsquos how we solve (not) an exponential problem in polynomial time

34

With the addition of a few pointers we have a parserAugment the ldquoCompleterrdquo to point to where we came from

35

S8S9

S10

S11

S13S12

S8

S9S8

All the possible parses for an input are in the table

We just need to read off all the backpointers from every complete S in the last column of the table

Find all the S -gt X [0N+1]

Follow the structural traces from the Completer

Of course this wonrsquot be polynomial time since there could be an exponential number of trees

We can at least represent ambiguity efficiently

37

Depth-first search will never terminate if grammar is left recursive (eg NP --gt NP PP)

38

)( εαα ⎯rarr⎯ΑΒ⎯rarr⎯Α

Solutions Rewrite the grammar (automatically) to a weakly

equivalent one which is not left-recursiveeg The man on the hill with the telescopehellipNP NP PP (wanted Nom plus a sequence of PPs)NP Nom PPNP NomNom Det NhellipbecomeshellipNP Nom NPrsquoNom Det NNPrsquo PP NPrsquo (wanted a sequence of PPs)NPrsquo e

Not so obvious what these rules meanhellip

Harder to detect and eliminate non-immediate left recursion

NP --gt Nom PPNom --gt NP

Fix depth of search explicitly

Rule ordering non-recursive rules firstNP --gt Det NomNP --gt NP PP

40

Multiple legal structures Attachment (eg I saw a man on a hill with a

telescope) Coordination (eg younger cats and dogs) NP bracketing (eg Spanish language teachers)

41

42

NP vs VP Attachment

Solution Return all possible parses and disambiguate

using ldquoother methodsrdquo

43

Parsing is a search problem which may be implemented with many control strategies Top-Down or Bottom-Up approaches each have

problemsCombining the two solves some but not all issues

Left recursion Syntactic ambiguityNext time Making use of statistical information about syntactic constituents Read Ch 14

44

  • Slide Number 1
  • Announcements
  • Homework Questions
  • Evaluation
  • Syntactic Parsing
  • Syntactic Parsing
  • CFG Example
  • Modified CFG
  • Slide Number 9
  • Parsing as a Form of Search
  • Top-Down Parser
  • Rule Expansion
  • Bottom-Up Parsing
  • Slide Number 14
  • Bottom-up parsing
  • Whatrsquos rightwrong withhellip
  • Some Solutions
  • Earley Parsing
  • States
  • StatesLocations
  • Graphically
  • Earley
  • Earley Algorithm
  • Predictor
  • Scanner
  • Completer
  • How do we know we are done
  • Earley
  • Example
  • CFG for Fragment of English
  • Example
  • Example
  • Example
  • Details
  • Converting Earley from Recognizer to Parser
  • Augmenting the chart with structural information
  • Retrieving Parse Trees from Chart
  • Left Recursion vs Right Recursion
  • Slide Number 39
  • Slide Number 40
  • Another Problem Structural ambiguity
  • Slide Number 42
  • Slide Number 43
  • Summing Up

3

4

5

Declarative formalisms like CFGs FSAs define the legal strings of a language -- but only tell you lsquothis is a legal string of the language XrsquoParsing algorithms specify how to recognize the strings of a language and assign each string one (or more) syntactic analyses

6

Many possible CFGs for English here is an example (fragment) S rarr NP VP VP rarr V NP NP rarr Det N | Adj NP N rarr boy | girl V rarr sees | likes Adj rarr big | small DetP rarr a | the

big the small girl sees a boy John likes a girl I like a girl I sleep The old dog the footsteps of the young

the small boy likes a girl

S NP VP VP VS Aux NP VP VP -gt V PPS -gt VP PP -gt Prep NPNP Det Nom N old | dog | footsteps |

young | flight

NP PropN V dog | include | prefer | book

NP -gt PronounNom -gt Adj Nom Aux doesNom N Prep from | to | on | ofNom N Nom PropN Bush | McCain |

ObamaNom Nom PP Det that | this | a| theVP V NP Adj -gt old | green | red

Parse Tree for lsquoThe old dog the footsteps of the youngrsquo for Prior CFG

S

NP VP

NPV

DETNOM

N PP

DET NOM

N

The old dog the

footstepsof the young

Searching FSAs Finding the right path through the automaton Search space defined by structure of FSASearching CFGs Finding the right parse tree among all possible

parse trees Search space defined by the grammarConstraints provided by the input sentenceand the automaton or grammar

10

Builds from the root S node to the leavesExpectation-basedCommon search strategy Top-down left-to-right backtracking Try first rule with LHS = S Next expand all constituents in these treesrules Continue until leaves are POS Backtrack when candidate POS does not match input string

11

ldquoThe old dog the footsteps of the youngrdquoWhere does backtracking happen

What are the computational disadvantages

What are the advantages

12

Parser begins with words of input and builds up trees applying grammar rules whose RHS matches

Det N V Det N Prep Det NThe old dog the footsteps of the young

Det Adj N Det N Prep Det NThe old dog the footsteps of the young

Parse continues until an S root node reached or no further node expansion possible

13

Det N V Det N Prep Det NThe old dog the footsteps of the youngDet Adj N Det N Prep Det N

14

When does disambiguation occur

What are the computational advantages and disadvantages

15

Top-Down parsers ndash they never explore illegal parses (eg which canrsquot form an S) -- but waste time on trees that can never match the inputBottom-Up parsers ndash they never explore trees inconsistent with input -- but waste time exploring illegal parses (with no S root)For both find a control strategy -- how explore search space efficiently Pursuing all parses in parallel or backtrack or hellip Which rule to apply next Which node to expand next

16

Dynamic Programming Approaches ndash Use a chart to represent partial results

CKY Parsing Algorithm Bottom-up Grammar must be in Normal Form The parse tree might not be consistent with linguistic

theoryEarly Parsing Algorithm Top-down Expectations about constituents are confirmed by input A POS tag for a word that is not predicted is never addedChart Parser

17

Allows arbitrary CFGsFills a table in a single sweep over the input words Table is length N+1 N is number of words Table entries represent

Completed constituents and their locationsIn-progress constituentsPredicted constituents

18

The table-entries are called states and are represented with dotted-rulesS -gt VP A VP is predicted

NP -gt Det Nominal An NP is in progress

VP -gt V NP A VP has been found

19

It would be nice to know where these things are in the input sohellipS -gt VP [00] A VP is predicted at the

start of the sentence

NP -gt Det Nominal [12] An NP is in progress the Det goes from 1 to 2

VP -gt V NP [03] A VP has been found starting at 0 and ending at 3

20

21

As with most dynamic programming approaches the answer is found by looking in the table in the right placeIn this case there should be an S state in the final column that spans from 0 to n+1 and is completeIf thatrsquos the case yoursquore done S ndashgt α [0n+1]

22

March through chart left-to-rightAt each step apply 1 of 3 operators Predictor

Create new states representing top-down expectations Scanner

Match word predictions (rule with word after dot) to words

CompleterWhen a state is complete see what rules were looking for that completed constituent

23

Given a state With a non-terminal to right of dot (not a part-

of-speech category) Create a new state for each expansion of the

non-terminal Place these new states into same chart entry as

generated state beginning and ending where generating state ends So predictor looking at

S -gt VP [00] results in

VP -gt Verb [00]VP -gt Verb NP [00]

24

Given a state With a non-terminal to right of dot that is a part-of-

speech category If the next word in the input matches this POS Create a new state with dot moved over the non-

terminal So scanner looking at VP -gt Verb NP [00] If the next word ldquobookrdquo can be a verb add new state

VP -gt Verb NP [01] Add this state to chart entry following current one Note Earley algorithm uses top-down input to

disambiguate POS Only POS predicted by some state can get added to chart

25

Applied to a state when its dot has reached right end of roleParser has discovered a category over some span of inputFind and advance all previous states that were looking for this category copy state move dot insert in current chart entryGiven NP -gt Det Nominal [13] VP -gt Verb NP [01]Add VP -gt Verb NP [03]

26

Find an S state in the final column that spans from 0 to n+1 and is complete

If thatrsquos the case yoursquore done S ndashgt α [0n+1]

27

More specificallyhellip

1 Predict all the states you can upfront

2 Read a word1 Extend states based on matches2 Add new predictions3 Go to 2

3 Look at N+1 to see if you have a winner

28

Book that flightWe should findhellip an S from 0 to 3 that is a completed statehellip

29

S NP VP VP VS Aux NP VP PP -gt Prep NPNP Det Nom N old | dog | footsteps |

young

NP PropN V dog | include | preferNom -gt Adj Nom Aux doesNom N Prep from | to | on | ofNom N Nom PropN Bush | McCain |

ObamaNom Nom PP Det that | this | a| theVP V NP Adj -gt old | green | red

31

32

33

What kind of algorithms did we just describe Not parsers ndash recognizers

The presence of an S state with the right attributes in the right place indicates a successful recognitionBut no parse treehellip no parserThatrsquos how we solve (not) an exponential problem in polynomial time

34

With the addition of a few pointers we have a parserAugment the ldquoCompleterrdquo to point to where we came from

35

S8S9

S10

S11

S13S12

S8

S9S8

All the possible parses for an input are in the table

We just need to read off all the backpointers from every complete S in the last column of the table

Find all the S -gt X [0N+1]

Follow the structural traces from the Completer

Of course this wonrsquot be polynomial time since there could be an exponential number of trees

We can at least represent ambiguity efficiently

37

Depth-first search will never terminate if grammar is left recursive (eg NP --gt NP PP)

38

)( εαα ⎯rarr⎯ΑΒ⎯rarr⎯Α

Solutions Rewrite the grammar (automatically) to a weakly

equivalent one which is not left-recursiveeg The man on the hill with the telescopehellipNP NP PP (wanted Nom plus a sequence of PPs)NP Nom PPNP NomNom Det NhellipbecomeshellipNP Nom NPrsquoNom Det NNPrsquo PP NPrsquo (wanted a sequence of PPs)NPrsquo e

Not so obvious what these rules meanhellip

Harder to detect and eliminate non-immediate left recursion

NP --gt Nom PPNom --gt NP

Fix depth of search explicitly

Rule ordering non-recursive rules firstNP --gt Det NomNP --gt NP PP

40

Multiple legal structures Attachment (eg I saw a man on a hill with a

telescope) Coordination (eg younger cats and dogs) NP bracketing (eg Spanish language teachers)

41

42

NP vs VP Attachment

Solution Return all possible parses and disambiguate

using ldquoother methodsrdquo

43

Parsing is a search problem which may be implemented with many control strategies Top-Down or Bottom-Up approaches each have

problemsCombining the two solves some but not all issues

Left recursion Syntactic ambiguityNext time Making use of statistical information about syntactic constituents Read Ch 14

44

  • Slide Number 1
  • Announcements
  • Homework Questions
  • Evaluation
  • Syntactic Parsing
  • Syntactic Parsing
  • CFG Example
  • Modified CFG
  • Slide Number 9
  • Parsing as a Form of Search
  • Top-Down Parser
  • Rule Expansion
  • Bottom-Up Parsing
  • Slide Number 14
  • Bottom-up parsing
  • Whatrsquos rightwrong withhellip
  • Some Solutions
  • Earley Parsing
  • States
  • StatesLocations
  • Graphically
  • Earley
  • Earley Algorithm
  • Predictor
  • Scanner
  • Completer
  • How do we know we are done
  • Earley
  • Example
  • CFG for Fragment of English
  • Example
  • Example
  • Example
  • Details
  • Converting Earley from Recognizer to Parser
  • Augmenting the chart with structural information
  • Retrieving Parse Trees from Chart
  • Left Recursion vs Right Recursion
  • Slide Number 39
  • Slide Number 40
  • Another Problem Structural ambiguity
  • Slide Number 42
  • Slide Number 43
  • Summing Up

4

5

Declarative formalisms like CFGs FSAs define the legal strings of a language -- but only tell you lsquothis is a legal string of the language XrsquoParsing algorithms specify how to recognize the strings of a language and assign each string one (or more) syntactic analyses

6

Many possible CFGs for English here is an example (fragment) S rarr NP VP VP rarr V NP NP rarr Det N | Adj NP N rarr boy | girl V rarr sees | likes Adj rarr big | small DetP rarr a | the

big the small girl sees a boy John likes a girl I like a girl I sleep The old dog the footsteps of the young

the small boy likes a girl

S NP VP VP VS Aux NP VP VP -gt V PPS -gt VP PP -gt Prep NPNP Det Nom N old | dog | footsteps |

young | flight

NP PropN V dog | include | prefer | book

NP -gt PronounNom -gt Adj Nom Aux doesNom N Prep from | to | on | ofNom N Nom PropN Bush | McCain |

ObamaNom Nom PP Det that | this | a| theVP V NP Adj -gt old | green | red

Parse Tree for lsquoThe old dog the footsteps of the youngrsquo for Prior CFG

S

NP VP

NPV

DETNOM

N PP

DET NOM

N

The old dog the

footstepsof the young

Searching FSAs Finding the right path through the automaton Search space defined by structure of FSASearching CFGs Finding the right parse tree among all possible

parse trees Search space defined by the grammarConstraints provided by the input sentenceand the automaton or grammar

10

Builds from the root S node to the leavesExpectation-basedCommon search strategy Top-down left-to-right backtracking Try first rule with LHS = S Next expand all constituents in these treesrules Continue until leaves are POS Backtrack when candidate POS does not match input string

11

ldquoThe old dog the footsteps of the youngrdquoWhere does backtracking happen

What are the computational disadvantages

What are the advantages

12

Parser begins with words of input and builds up trees applying grammar rules whose RHS matches

Det N V Det N Prep Det NThe old dog the footsteps of the young

Det Adj N Det N Prep Det NThe old dog the footsteps of the young

Parse continues until an S root node reached or no further node expansion possible

13

Det N V Det N Prep Det NThe old dog the footsteps of the youngDet Adj N Det N Prep Det N

14

When does disambiguation occur

What are the computational advantages and disadvantages

15

Top-Down parsers ndash they never explore illegal parses (eg which canrsquot form an S) -- but waste time on trees that can never match the inputBottom-Up parsers ndash they never explore trees inconsistent with input -- but waste time exploring illegal parses (with no S root)For both find a control strategy -- how explore search space efficiently Pursuing all parses in parallel or backtrack or hellip Which rule to apply next Which node to expand next

16

Dynamic Programming Approaches ndash Use a chart to represent partial results

CKY Parsing Algorithm Bottom-up Grammar must be in Normal Form The parse tree might not be consistent with linguistic

theoryEarly Parsing Algorithm Top-down Expectations about constituents are confirmed by input A POS tag for a word that is not predicted is never addedChart Parser

17

Allows arbitrary CFGsFills a table in a single sweep over the input words Table is length N+1 N is number of words Table entries represent

Completed constituents and their locationsIn-progress constituentsPredicted constituents

18

The table-entries are called states and are represented with dotted-rulesS -gt VP A VP is predicted

NP -gt Det Nominal An NP is in progress

VP -gt V NP A VP has been found

19

It would be nice to know where these things are in the input sohellipS -gt VP [00] A VP is predicted at the

start of the sentence

NP -gt Det Nominal [12] An NP is in progress the Det goes from 1 to 2

VP -gt V NP [03] A VP has been found starting at 0 and ending at 3

20

21

As with most dynamic programming approaches the answer is found by looking in the table in the right placeIn this case there should be an S state in the final column that spans from 0 to n+1 and is completeIf thatrsquos the case yoursquore done S ndashgt α [0n+1]

22

March through chart left-to-rightAt each step apply 1 of 3 operators Predictor

Create new states representing top-down expectations Scanner

Match word predictions (rule with word after dot) to words

CompleterWhen a state is complete see what rules were looking for that completed constituent

23

Given a state With a non-terminal to right of dot (not a part-

of-speech category) Create a new state for each expansion of the

non-terminal Place these new states into same chart entry as

generated state beginning and ending where generating state ends So predictor looking at

S -gt VP [00] results in

VP -gt Verb [00]VP -gt Verb NP [00]

24

Given a state With a non-terminal to right of dot that is a part-of-

speech category If the next word in the input matches this POS Create a new state with dot moved over the non-

terminal So scanner looking at VP -gt Verb NP [00] If the next word ldquobookrdquo can be a verb add new state

VP -gt Verb NP [01] Add this state to chart entry following current one Note Earley algorithm uses top-down input to

disambiguate POS Only POS predicted by some state can get added to chart

25

Applied to a state when its dot has reached right end of roleParser has discovered a category over some span of inputFind and advance all previous states that were looking for this category copy state move dot insert in current chart entryGiven NP -gt Det Nominal [13] VP -gt Verb NP [01]Add VP -gt Verb NP [03]

26

Find an S state in the final column that spans from 0 to n+1 and is complete

If thatrsquos the case yoursquore done S ndashgt α [0n+1]

27

More specificallyhellip

1 Predict all the states you can upfront

2 Read a word1 Extend states based on matches2 Add new predictions3 Go to 2

3 Look at N+1 to see if you have a winner

28

Book that flightWe should findhellip an S from 0 to 3 that is a completed statehellip

29

S NP VP VP VS Aux NP VP PP -gt Prep NPNP Det Nom N old | dog | footsteps |

young

NP PropN V dog | include | preferNom -gt Adj Nom Aux doesNom N Prep from | to | on | ofNom N Nom PropN Bush | McCain |

ObamaNom Nom PP Det that | this | a| theVP V NP Adj -gt old | green | red

31

32

33

What kind of algorithms did we just describe Not parsers ndash recognizers

The presence of an S state with the right attributes in the right place indicates a successful recognitionBut no parse treehellip no parserThatrsquos how we solve (not) an exponential problem in polynomial time

34

With the addition of a few pointers we have a parserAugment the ldquoCompleterrdquo to point to where we came from

35

S8S9

S10

S11

S13S12

S8

S9S8

All the possible parses for an input are in the table

We just need to read off all the backpointers from every complete S in the last column of the table

Find all the S -gt X [0N+1]

Follow the structural traces from the Completer

Of course this wonrsquot be polynomial time since there could be an exponential number of trees

We can at least represent ambiguity efficiently

37

Depth-first search will never terminate if grammar is left recursive (eg NP --gt NP PP)

38

)( εαα ⎯rarr⎯ΑΒ⎯rarr⎯Α

Solutions Rewrite the grammar (automatically) to a weakly

equivalent one which is not left-recursiveeg The man on the hill with the telescopehellipNP NP PP (wanted Nom plus a sequence of PPs)NP Nom PPNP NomNom Det NhellipbecomeshellipNP Nom NPrsquoNom Det NNPrsquo PP NPrsquo (wanted a sequence of PPs)NPrsquo e

Not so obvious what these rules meanhellip

Harder to detect and eliminate non-immediate left recursion

NP --gt Nom PPNom --gt NP

Fix depth of search explicitly

Rule ordering non-recursive rules firstNP --gt Det NomNP --gt NP PP

40

Multiple legal structures Attachment (eg I saw a man on a hill with a

telescope) Coordination (eg younger cats and dogs) NP bracketing (eg Spanish language teachers)

41

42

NP vs VP Attachment

Solution Return all possible parses and disambiguate

using ldquoother methodsrdquo

43

Parsing is a search problem which may be implemented with many control strategies Top-Down or Bottom-Up approaches each have

problemsCombining the two solves some but not all issues

Left recursion Syntactic ambiguityNext time Making use of statistical information about syntactic constituents Read Ch 14

44

  • Slide Number 1
  • Announcements
  • Homework Questions
  • Evaluation
  • Syntactic Parsing
  • Syntactic Parsing
  • CFG Example
  • Modified CFG
  • Slide Number 9
  • Parsing as a Form of Search
  • Top-Down Parser
  • Rule Expansion
  • Bottom-Up Parsing
  • Slide Number 14
  • Bottom-up parsing
  • Whatrsquos rightwrong withhellip
  • Some Solutions
  • Earley Parsing
  • States
  • StatesLocations
  • Graphically
  • Earley
  • Earley Algorithm
  • Predictor
  • Scanner
  • Completer
  • How do we know we are done
  • Earley
  • Example
  • CFG for Fragment of English
  • Example
  • Example
  • Example
  • Details
  • Converting Earley from Recognizer to Parser
  • Augmenting the chart with structural information
  • Retrieving Parse Trees from Chart
  • Left Recursion vs Right Recursion
  • Slide Number 39
  • Slide Number 40
  • Another Problem Structural ambiguity
  • Slide Number 42
  • Slide Number 43
  • Summing Up

5

Declarative formalisms like CFGs FSAs define the legal strings of a language -- but only tell you lsquothis is a legal string of the language XrsquoParsing algorithms specify how to recognize the strings of a language and assign each string one (or more) syntactic analyses

6

Many possible CFGs for English here is an example (fragment) S rarr NP VP VP rarr V NP NP rarr Det N | Adj NP N rarr boy | girl V rarr sees | likes Adj rarr big | small DetP rarr a | the

big the small girl sees a boy John likes a girl I like a girl I sleep The old dog the footsteps of the young

the small boy likes a girl

S NP VP VP VS Aux NP VP VP -gt V PPS -gt VP PP -gt Prep NPNP Det Nom N old | dog | footsteps |

young | flight

NP PropN V dog | include | prefer | book

NP -gt PronounNom -gt Adj Nom Aux doesNom N Prep from | to | on | ofNom N Nom PropN Bush | McCain |

ObamaNom Nom PP Det that | this | a| theVP V NP Adj -gt old | green | red

Parse Tree for lsquoThe old dog the footsteps of the youngrsquo for Prior CFG

S

NP VP

NPV

DETNOM

N PP

DET NOM

N

The old dog the

footstepsof the young

Searching FSAs Finding the right path through the automaton Search space defined by structure of FSASearching CFGs Finding the right parse tree among all possible

parse trees Search space defined by the grammarConstraints provided by the input sentenceand the automaton or grammar

10

Builds from the root S node to the leavesExpectation-basedCommon search strategy Top-down left-to-right backtracking Try first rule with LHS = S Next expand all constituents in these treesrules Continue until leaves are POS Backtrack when candidate POS does not match input string

11

ldquoThe old dog the footsteps of the youngrdquoWhere does backtracking happen

What are the computational disadvantages

What are the advantages

12

Parser begins with words of input and builds up trees applying grammar rules whose RHS matches

Det N V Det N Prep Det NThe old dog the footsteps of the young

Det Adj N Det N Prep Det NThe old dog the footsteps of the young

Parse continues until an S root node reached or no further node expansion possible

13

Det N V Det N Prep Det NThe old dog the footsteps of the youngDet Adj N Det N Prep Det N

14

When does disambiguation occur

What are the computational advantages and disadvantages

15

Top-Down parsers ndash they never explore illegal parses (eg which canrsquot form an S) -- but waste time on trees that can never match the inputBottom-Up parsers ndash they never explore trees inconsistent with input -- but waste time exploring illegal parses (with no S root)For both find a control strategy -- how explore search space efficiently Pursuing all parses in parallel or backtrack or hellip Which rule to apply next Which node to expand next

16

Dynamic Programming Approaches ndash Use a chart to represent partial results

CKY Parsing Algorithm Bottom-up Grammar must be in Normal Form The parse tree might not be consistent with linguistic

theoryEarly Parsing Algorithm Top-down Expectations about constituents are confirmed by input A POS tag for a word that is not predicted is never addedChart Parser

17

Allows arbitrary CFGsFills a table in a single sweep over the input words Table is length N+1 N is number of words Table entries represent

Completed constituents and their locationsIn-progress constituentsPredicted constituents

18

The table-entries are called states and are represented with dotted-rulesS -gt VP A VP is predicted

NP -gt Det Nominal An NP is in progress

VP -gt V NP A VP has been found

19

It would be nice to know where these things are in the input sohellipS -gt VP [00] A VP is predicted at the

start of the sentence

NP -gt Det Nominal [12] An NP is in progress the Det goes from 1 to 2

VP -gt V NP [03] A VP has been found starting at 0 and ending at 3

20

21

As with most dynamic programming approaches the answer is found by looking in the table in the right placeIn this case there should be an S state in the final column that spans from 0 to n+1 and is completeIf thatrsquos the case yoursquore done S ndashgt α [0n+1]

22

March through chart left-to-rightAt each step apply 1 of 3 operators Predictor

Create new states representing top-down expectations Scanner

Match word predictions (rule with word after dot) to words

CompleterWhen a state is complete see what rules were looking for that completed constituent

23

Given a state With a non-terminal to right of dot (not a part-

of-speech category) Create a new state for each expansion of the

non-terminal Place these new states into same chart entry as

generated state beginning and ending where generating state ends So predictor looking at

S -gt VP [00] results in

VP -gt Verb [00]VP -gt Verb NP [00]

24

Given a state With a non-terminal to right of dot that is a part-of-

speech category If the next word in the input matches this POS Create a new state with dot moved over the non-

terminal So scanner looking at VP -gt Verb NP [00] If the next word ldquobookrdquo can be a verb add new state

VP -gt Verb NP [01] Add this state to chart entry following current one Note Earley algorithm uses top-down input to

disambiguate POS Only POS predicted by some state can get added to chart

25

Applied to a state when its dot has reached right end of roleParser has discovered a category over some span of inputFind and advance all previous states that were looking for this category copy state move dot insert in current chart entryGiven NP -gt Det Nominal [13] VP -gt Verb NP [01]Add VP -gt Verb NP [03]

26

Find an S state in the final column that spans from 0 to n+1 and is complete

If thatrsquos the case yoursquore done S ndashgt α [0n+1]

27

More specificallyhellip

1 Predict all the states you can upfront

2 Read a word1 Extend states based on matches2 Add new predictions3 Go to 2

3 Look at N+1 to see if you have a winner

28

Book that flightWe should findhellip an S from 0 to 3 that is a completed statehellip

29

S NP VP VP VS Aux NP VP PP -gt Prep NPNP Det Nom N old | dog | footsteps |

young

NP PropN V dog | include | preferNom -gt Adj Nom Aux doesNom N Prep from | to | on | ofNom N Nom PropN Bush | McCain |

ObamaNom Nom PP Det that | this | a| theVP V NP Adj -gt old | green | red

31

32

33

What kind of algorithms did we just describe Not parsers ndash recognizers

The presence of an S state with the right attributes in the right place indicates a successful recognitionBut no parse treehellip no parserThatrsquos how we solve (not) an exponential problem in polynomial time

34

With the addition of a few pointers we have a parserAugment the ldquoCompleterrdquo to point to where we came from

35

S8S9

S10

S11

S13S12

S8

S9S8

All the possible parses for an input are in the table

We just need to read off all the backpointers from every complete S in the last column of the table

Find all the S -gt X [0N+1]

Follow the structural traces from the Completer

Of course this wonrsquot be polynomial time since there could be an exponential number of trees

We can at least represent ambiguity efficiently

37

Depth-first search will never terminate if grammar is left recursive (eg NP --gt NP PP)

38

)( εαα ⎯rarr⎯ΑΒ⎯rarr⎯Α

Solutions Rewrite the grammar (automatically) to a weakly

equivalent one which is not left-recursiveeg The man on the hill with the telescopehellipNP NP PP (wanted Nom plus a sequence of PPs)NP Nom PPNP NomNom Det NhellipbecomeshellipNP Nom NPrsquoNom Det NNPrsquo PP NPrsquo (wanted a sequence of PPs)NPrsquo e

Not so obvious what these rules meanhellip

Harder to detect and eliminate non-immediate left recursion

NP --gt Nom PPNom --gt NP

Fix depth of search explicitly

Rule ordering non-recursive rules firstNP --gt Det NomNP --gt NP PP

40

Multiple legal structures Attachment (eg I saw a man on a hill with a

telescope) Coordination (eg younger cats and dogs) NP bracketing (eg Spanish language teachers)

41

42

NP vs VP Attachment

Solution Return all possible parses and disambiguate

using ldquoother methodsrdquo

43

Parsing is a search problem which may be implemented with many control strategies Top-Down or Bottom-Up approaches each have

problemsCombining the two solves some but not all issues

Left recursion Syntactic ambiguityNext time Making use of statistical information about syntactic constituents Read Ch 14

44

  • Slide Number 1
  • Announcements
  • Homework Questions
  • Evaluation
  • Syntactic Parsing
  • Syntactic Parsing
  • CFG Example
  • Modified CFG
  • Slide Number 9
  • Parsing as a Form of Search
  • Top-Down Parser
  • Rule Expansion
  • Bottom-Up Parsing
  • Slide Number 14
  • Bottom-up parsing
  • Whatrsquos rightwrong withhellip
  • Some Solutions
  • Earley Parsing
  • States
  • StatesLocations
  • Graphically
  • Earley
  • Earley Algorithm
  • Predictor
  • Scanner
  • Completer
  • How do we know we are done
  • Earley
  • Example
  • CFG for Fragment of English
  • Example
  • Example
  • Example
  • Details
  • Converting Earley from Recognizer to Parser
  • Augmenting the chart with structural information
  • Retrieving Parse Trees from Chart
  • Left Recursion vs Right Recursion
  • Slide Number 39
  • Slide Number 40
  • Another Problem Structural ambiguity
  • Slide Number 42
  • Slide Number 43
  • Summing Up

Declarative formalisms like CFGs FSAs define the legal strings of a language -- but only tell you lsquothis is a legal string of the language XrsquoParsing algorithms specify how to recognize the strings of a language and assign each string one (or more) syntactic analyses

6

Many possible CFGs for English here is an example (fragment) S rarr NP VP VP rarr V NP NP rarr Det N | Adj NP N rarr boy | girl V rarr sees | likes Adj rarr big | small DetP rarr a | the

big the small girl sees a boy John likes a girl I like a girl I sleep The old dog the footsteps of the young

the small boy likes a girl

S NP VP VP VS Aux NP VP VP -gt V PPS -gt VP PP -gt Prep NPNP Det Nom N old | dog | footsteps |

young | flight

NP PropN V dog | include | prefer | book

NP -gt PronounNom -gt Adj Nom Aux doesNom N Prep from | to | on | ofNom N Nom PropN Bush | McCain |

ObamaNom Nom PP Det that | this | a| theVP V NP Adj -gt old | green | red

Parse Tree for lsquoThe old dog the footsteps of the youngrsquo for Prior CFG

S

NP VP

NPV

DETNOM

N PP

DET NOM

N

The old dog the

footstepsof the young

Searching FSAs Finding the right path through the automaton Search space defined by structure of FSASearching CFGs Finding the right parse tree among all possible

parse trees Search space defined by the grammarConstraints provided by the input sentenceand the automaton or grammar

10

Builds from the root S node to the leavesExpectation-basedCommon search strategy Top-down left-to-right backtracking Try first rule with LHS = S Next expand all constituents in these treesrules Continue until leaves are POS Backtrack when candidate POS does not match input string

11

ldquoThe old dog the footsteps of the youngrdquoWhere does backtracking happen

What are the computational disadvantages

What are the advantages

12

Parser begins with words of input and builds up trees applying grammar rules whose RHS matches

Det N V Det N Prep Det NThe old dog the footsteps of the young

Det Adj N Det N Prep Det NThe old dog the footsteps of the young

Parse continues until an S root node reached or no further node expansion possible

13

Det N V Det N Prep Det NThe old dog the footsteps of the youngDet Adj N Det N Prep Det N

14

When does disambiguation occur

What are the computational advantages and disadvantages

15

Top-Down parsers ndash they never explore illegal parses (eg which canrsquot form an S) -- but waste time on trees that can never match the inputBottom-Up parsers ndash they never explore trees inconsistent with input -- but waste time exploring illegal parses (with no S root)For both find a control strategy -- how explore search space efficiently Pursuing all parses in parallel or backtrack or hellip Which rule to apply next Which node to expand next

16

Dynamic Programming Approaches ndash Use a chart to represent partial results

CKY Parsing Algorithm Bottom-up Grammar must be in Normal Form The parse tree might not be consistent with linguistic

theoryEarly Parsing Algorithm Top-down Expectations about constituents are confirmed by input A POS tag for a word that is not predicted is never addedChart Parser

17

Allows arbitrary CFGsFills a table in a single sweep over the input words Table is length N+1 N is number of words Table entries represent

Completed constituents and their locationsIn-progress constituentsPredicted constituents

18

The table-entries are called states and are represented with dotted-rulesS -gt VP A VP is predicted

NP -gt Det Nominal An NP is in progress

VP -gt V NP A VP has been found

19

It would be nice to know where these things are in the input sohellipS -gt VP [00] A VP is predicted at the

start of the sentence

NP -gt Det Nominal [12] An NP is in progress the Det goes from 1 to 2

VP -gt V NP [03] A VP has been found starting at 0 and ending at 3

20

21

As with most dynamic programming approaches the answer is found by looking in the table in the right placeIn this case there should be an S state in the final column that spans from 0 to n+1 and is completeIf thatrsquos the case yoursquore done S ndashgt α [0n+1]

22

March through chart left-to-rightAt each step apply 1 of 3 operators Predictor

Create new states representing top-down expectations Scanner

Match word predictions (rule with word after dot) to words

CompleterWhen a state is complete see what rules were looking for that completed constituent

23

Given a state With a non-terminal to right of dot (not a part-

of-speech category) Create a new state for each expansion of the

non-terminal Place these new states into same chart entry as

generated state beginning and ending where generating state ends So predictor looking at

S -gt VP [00] results in

VP -gt Verb [00]VP -gt Verb NP [00]

24

Given a state With a non-terminal to right of dot that is a part-of-

speech category If the next word in the input matches this POS Create a new state with dot moved over the non-

terminal So scanner looking at VP -gt Verb NP [00] If the next word ldquobookrdquo can be a verb add new state

VP -gt Verb NP [01] Add this state to chart entry following current one Note Earley algorithm uses top-down input to

disambiguate POS Only POS predicted by some state can get added to chart

25

Applied to a state when its dot has reached right end of roleParser has discovered a category over some span of inputFind and advance all previous states that were looking for this category copy state move dot insert in current chart entryGiven NP -gt Det Nominal [13] VP -gt Verb NP [01]Add VP -gt Verb NP [03]

26

Find an S state in the final column that spans from 0 to n+1 and is complete

If thatrsquos the case yoursquore done S ndashgt α [0n+1]

27

More specificallyhellip

1 Predict all the states you can upfront

2 Read a word1 Extend states based on matches2 Add new predictions3 Go to 2

3 Look at N+1 to see if you have a winner

28

Book that flightWe should findhellip an S from 0 to 3 that is a completed statehellip

29

S NP VP VP VS Aux NP VP PP -gt Prep NPNP Det Nom N old | dog | footsteps |

young

NP PropN V dog | include | preferNom -gt Adj Nom Aux doesNom N Prep from | to | on | ofNom N Nom PropN Bush | McCain |

ObamaNom Nom PP Det that | this | a| theVP V NP Adj -gt old | green | red

31

32

33

What kind of algorithms did we just describe Not parsers ndash recognizers

The presence of an S state with the right attributes in the right place indicates a successful recognitionBut no parse treehellip no parserThatrsquos how we solve (not) an exponential problem in polynomial time

34

With the addition of a few pointers we have a parserAugment the ldquoCompleterrdquo to point to where we came from

35

S8S9

S10

S11

S13S12

S8

S9S8

All the possible parses for an input are in the table

We just need to read off all the backpointers from every complete S in the last column of the table

Find all the S -gt X [0N+1]

Follow the structural traces from the Completer

Of course this wonrsquot be polynomial time since there could be an exponential number of trees

We can at least represent ambiguity efficiently

37

Depth-first search will never terminate if grammar is left recursive (eg NP --gt NP PP)

38

)( εαα ⎯rarr⎯ΑΒ⎯rarr⎯Α

Solutions Rewrite the grammar (automatically) to a weakly

equivalent one which is not left-recursiveeg The man on the hill with the telescopehellipNP NP PP (wanted Nom plus a sequence of PPs)NP Nom PPNP NomNom Det NhellipbecomeshellipNP Nom NPrsquoNom Det NNPrsquo PP NPrsquo (wanted a sequence of PPs)NPrsquo e

Not so obvious what these rules meanhellip

Harder to detect and eliminate non-immediate left recursion

NP --gt Nom PPNom --gt NP

Fix depth of search explicitly

Rule ordering non-recursive rules firstNP --gt Det NomNP --gt NP PP

40

Multiple legal structures Attachment (eg I saw a man on a hill with a

telescope) Coordination (eg younger cats and dogs) NP bracketing (eg Spanish language teachers)

41

42

NP vs VP Attachment

Solution Return all possible parses and disambiguate

using ldquoother methodsrdquo

43

Parsing is a search problem which may be implemented with many control strategies Top-Down or Bottom-Up approaches each have

problemsCombining the two solves some but not all issues

Left recursion Syntactic ambiguityNext time Making use of statistical information about syntactic constituents Read Ch 14

44

  • Slide Number 1
  • Announcements
  • Homework Questions
  • Evaluation
  • Syntactic Parsing
  • Syntactic Parsing
  • CFG Example
  • Modified CFG
  • Slide Number 9
  • Parsing as a Form of Search
  • Top-Down Parser
  • Rule Expansion
  • Bottom-Up Parsing
  • Slide Number 14
  • Bottom-up parsing
  • Whatrsquos rightwrong withhellip
  • Some Solutions
  • Earley Parsing
  • States
  • StatesLocations
  • Graphically
  • Earley
  • Earley Algorithm
  • Predictor
  • Scanner
  • Completer
  • How do we know we are done
  • Earley
  • Example
  • CFG for Fragment of English
  • Example
  • Example
  • Example
  • Details
  • Converting Earley from Recognizer to Parser
  • Augmenting the chart with structural information
  • Retrieving Parse Trees from Chart
  • Left Recursion vs Right Recursion
  • Slide Number 39
  • Slide Number 40
  • Another Problem Structural ambiguity
  • Slide Number 42
  • Slide Number 43
  • Summing Up

Many possible CFGs for English here is an example (fragment) S rarr NP VP VP rarr V NP NP rarr Det N | Adj NP N rarr boy | girl V rarr sees | likes Adj rarr big | small DetP rarr a | the

big the small girl sees a boy John likes a girl I like a girl I sleep The old dog the footsteps of the young

the small boy likes a girl

S NP VP VP VS Aux NP VP VP -gt V PPS -gt VP PP -gt Prep NPNP Det Nom N old | dog | footsteps |

young | flight

NP PropN V dog | include | prefer | book

NP -gt PronounNom -gt Adj Nom Aux doesNom N Prep from | to | on | ofNom N Nom PropN Bush | McCain |

ObamaNom Nom PP Det that | this | a| theVP V NP Adj -gt old | green | red

Parse Tree for lsquoThe old dog the footsteps of the youngrsquo for Prior CFG

S

NP VP

NPV

DETNOM

N PP

DET NOM

N

The old dog the

footstepsof the young

Searching FSAs Finding the right path through the automaton Search space defined by structure of FSASearching CFGs Finding the right parse tree among all possible

parse trees Search space defined by the grammarConstraints provided by the input sentenceand the automaton or grammar

10

Builds from the root S node to the leavesExpectation-basedCommon search strategy Top-down left-to-right backtracking Try first rule with LHS = S Next expand all constituents in these treesrules Continue until leaves are POS Backtrack when candidate POS does not match input string

11

ldquoThe old dog the footsteps of the youngrdquoWhere does backtracking happen

What are the computational disadvantages

What are the advantages

12

Parser begins with words of input and builds up trees applying grammar rules whose RHS matches

Det N V Det N Prep Det NThe old dog the footsteps of the young

Det Adj N Det N Prep Det NThe old dog the footsteps of the young

Parse continues until an S root node reached or no further node expansion possible

13

Det N V Det N Prep Det NThe old dog the footsteps of the youngDet Adj N Det N Prep Det N

14

When does disambiguation occur

What are the computational advantages and disadvantages

15

Top-Down parsers ndash they never explore illegal parses (eg which canrsquot form an S) -- but waste time on trees that can never match the inputBottom-Up parsers ndash they never explore trees inconsistent with input -- but waste time exploring illegal parses (with no S root)For both find a control strategy -- how explore search space efficiently Pursuing all parses in parallel or backtrack or hellip Which rule to apply next Which node to expand next

16

Dynamic Programming Approaches ndash Use a chart to represent partial results

CKY Parsing Algorithm Bottom-up Grammar must be in Normal Form The parse tree might not be consistent with linguistic

theoryEarly Parsing Algorithm Top-down Expectations about constituents are confirmed by input A POS tag for a word that is not predicted is never addedChart Parser

17

Allows arbitrary CFGsFills a table in a single sweep over the input words Table is length N+1 N is number of words Table entries represent

Completed constituents and their locationsIn-progress constituentsPredicted constituents

18

The table-entries are called states and are represented with dotted-rulesS -gt VP A VP is predicted

NP -gt Det Nominal An NP is in progress

VP -gt V NP A VP has been found

19

It would be nice to know where these things are in the input sohellipS -gt VP [00] A VP is predicted at the

start of the sentence

NP -gt Det Nominal [12] An NP is in progress the Det goes from 1 to 2

VP -gt V NP [03] A VP has been found starting at 0 and ending at 3

20

21

As with most dynamic programming approaches the answer is found by looking in the table in the right placeIn this case there should be an S state in the final column that spans from 0 to n+1 and is completeIf thatrsquos the case yoursquore done S ndashgt α [0n+1]

22

March through chart left-to-rightAt each step apply 1 of 3 operators Predictor

Create new states representing top-down expectations Scanner

Match word predictions (rule with word after dot) to words

CompleterWhen a state is complete see what rules were looking for that completed constituent

23

Given a state With a non-terminal to right of dot (not a part-

of-speech category) Create a new state for each expansion of the

non-terminal Place these new states into same chart entry as

generated state beginning and ending where generating state ends So predictor looking at

S -gt VP [00] results in

VP -gt Verb [00]VP -gt Verb NP [00]

24

Given a state With a non-terminal to right of dot that is a part-of-

speech category If the next word in the input matches this POS Create a new state with dot moved over the non-

terminal So scanner looking at VP -gt Verb NP [00] If the next word ldquobookrdquo can be a verb add new state

VP -gt Verb NP [01] Add this state to chart entry following current one Note Earley algorithm uses top-down input to

disambiguate POS Only POS predicted by some state can get added to chart

25

Applied to a state when its dot has reached right end of roleParser has discovered a category over some span of inputFind and advance all previous states that were looking for this category copy state move dot insert in current chart entryGiven NP -gt Det Nominal [13] VP -gt Verb NP [01]Add VP -gt Verb NP [03]

26

Find an S state in the final column that spans from 0 to n+1 and is complete

If thatrsquos the case yoursquore done S ndashgt α [0n+1]

27

More specificallyhellip

1 Predict all the states you can upfront

2 Read a word1 Extend states based on matches2 Add new predictions3 Go to 2

3 Look at N+1 to see if you have a winner

28

Book that flightWe should findhellip an S from 0 to 3 that is a completed statehellip

29

S NP VP VP VS Aux NP VP PP -gt Prep NPNP Det Nom N old | dog | footsteps |

young

NP PropN V dog | include | preferNom -gt Adj Nom Aux doesNom N Prep from | to | on | ofNom N Nom PropN Bush | McCain |

ObamaNom Nom PP Det that | this | a| theVP V NP Adj -gt old | green | red

31

32

33

What kind of algorithms did we just describe Not parsers ndash recognizers

The presence of an S state with the right attributes in the right place indicates a successful recognitionBut no parse treehellip no parserThatrsquos how we solve (not) an exponential problem in polynomial time

34

With the addition of a few pointers we have a parserAugment the ldquoCompleterrdquo to point to where we came from

35

S8S9

S10

S11

S13S12

S8

S9S8

All the possible parses for an input are in the table

We just need to read off all the backpointers from every complete S in the last column of the table

Find all the S -gt X [0N+1]

Follow the structural traces from the Completer

Of course this wonrsquot be polynomial time since there could be an exponential number of trees

We can at least represent ambiguity efficiently

37

Depth-first search will never terminate if grammar is left recursive (eg NP --gt NP PP)

38

)( εαα ⎯rarr⎯ΑΒ⎯rarr⎯Α

Solutions Rewrite the grammar (automatically) to a weakly

equivalent one which is not left-recursiveeg The man on the hill with the telescopehellipNP NP PP (wanted Nom plus a sequence of PPs)NP Nom PPNP NomNom Det NhellipbecomeshellipNP Nom NPrsquoNom Det NNPrsquo PP NPrsquo (wanted a sequence of PPs)NPrsquo e

Not so obvious what these rules meanhellip

Harder to detect and eliminate non-immediate left recursion

NP --gt Nom PPNom --gt NP

Fix depth of search explicitly

Rule ordering non-recursive rules firstNP --gt Det NomNP --gt NP PP

40

Multiple legal structures Attachment (eg I saw a man on a hill with a

telescope) Coordination (eg younger cats and dogs) NP bracketing (eg Spanish language teachers)

41

42

NP vs VP Attachment

Solution Return all possible parses and disambiguate

using ldquoother methodsrdquo

43

Parsing is a search problem which may be implemented with many control strategies Top-Down or Bottom-Up approaches each have

problemsCombining the two solves some but not all issues

Left recursion Syntactic ambiguityNext time Making use of statistical information about syntactic constituents Read Ch 14

44

  • Slide Number 1
  • Announcements
  • Homework Questions
  • Evaluation
  • Syntactic Parsing
  • Syntactic Parsing
  • CFG Example
  • Modified CFG
  • Slide Number 9
  • Parsing as a Form of Search
  • Top-Down Parser
  • Rule Expansion
  • Bottom-Up Parsing
  • Slide Number 14
  • Bottom-up parsing
  • Whatrsquos rightwrong withhellip
  • Some Solutions
  • Earley Parsing
  • States
  • StatesLocations
  • Graphically
  • Earley
  • Earley Algorithm
  • Predictor
  • Scanner
  • Completer
  • How do we know we are done
  • Earley
  • Example
  • CFG for Fragment of English
  • Example
  • Example
  • Example
  • Details
  • Converting Earley from Recognizer to Parser
  • Augmenting the chart with structural information
  • Retrieving Parse Trees from Chart
  • Left Recursion vs Right Recursion
  • Slide Number 39
  • Slide Number 40
  • Another Problem Structural ambiguity
  • Slide Number 42
  • Slide Number 43
  • Summing Up

S NP VP VP VS Aux NP VP VP -gt V PPS -gt VP PP -gt Prep NPNP Det Nom N old | dog | footsteps |

young | flight

NP PropN V dog | include | prefer | book

NP -gt PronounNom -gt Adj Nom Aux doesNom N Prep from | to | on | ofNom N Nom PropN Bush | McCain |

ObamaNom Nom PP Det that | this | a| theVP V NP Adj -gt old | green | red

Parse Tree for lsquoThe old dog the footsteps of the youngrsquo for Prior CFG

S

NP VP

NPV

DETNOM

N PP

DET NOM

N

The old dog the

footstepsof the young

Searching FSAs Finding the right path through the automaton Search space defined by structure of FSASearching CFGs Finding the right parse tree among all possible

parse trees Search space defined by the grammarConstraints provided by the input sentenceand the automaton or grammar

10

Builds from the root S node to the leavesExpectation-basedCommon search strategy Top-down left-to-right backtracking Try first rule with LHS = S Next expand all constituents in these treesrules Continue until leaves are POS Backtrack when candidate POS does not match input string

11

ldquoThe old dog the footsteps of the youngrdquoWhere does backtracking happen

What are the computational disadvantages

What are the advantages

12

Parser begins with words of input and builds up trees applying grammar rules whose RHS matches

Det N V Det N Prep Det NThe old dog the footsteps of the young

Det Adj N Det N Prep Det NThe old dog the footsteps of the young

Parse continues until an S root node reached or no further node expansion possible

13

Det N V Det N Prep Det NThe old dog the footsteps of the youngDet Adj N Det N Prep Det N

14

When does disambiguation occur

What are the computational advantages and disadvantages

15

Top-Down parsers ndash they never explore illegal parses (eg which canrsquot form an S) -- but waste time on trees that can never match the inputBottom-Up parsers ndash they never explore trees inconsistent with input -- but waste time exploring illegal parses (with no S root)For both find a control strategy -- how explore search space efficiently Pursuing all parses in parallel or backtrack or hellip Which rule to apply next Which node to expand next

16

Dynamic Programming Approaches ndash Use a chart to represent partial results

CKY Parsing Algorithm Bottom-up Grammar must be in Normal Form The parse tree might not be consistent with linguistic

theoryEarly Parsing Algorithm Top-down Expectations about constituents are confirmed by input A POS tag for a word that is not predicted is never addedChart Parser

17

Allows arbitrary CFGsFills a table in a single sweep over the input words Table is length N+1 N is number of words Table entries represent

Completed constituents and their locationsIn-progress constituentsPredicted constituents

18

The table-entries are called states and are represented with dotted-rulesS -gt VP A VP is predicted

NP -gt Det Nominal An NP is in progress

VP -gt V NP A VP has been found

19

It would be nice to know where these things are in the input sohellipS -gt VP [00] A VP is predicted at the

start of the sentence

NP -gt Det Nominal [12] An NP is in progress the Det goes from 1 to 2

VP -gt V NP [03] A VP has been found starting at 0 and ending at 3

20

21

As with most dynamic programming approaches the answer is found by looking in the table in the right placeIn this case there should be an S state in the final column that spans from 0 to n+1 and is completeIf thatrsquos the case yoursquore done S ndashgt α [0n+1]

22

March through chart left-to-rightAt each step apply 1 of 3 operators Predictor

Create new states representing top-down expectations Scanner

Match word predictions (rule with word after dot) to words

CompleterWhen a state is complete see what rules were looking for that completed constituent

23

Given a state With a non-terminal to right of dot (not a part-

of-speech category) Create a new state for each expansion of the

non-terminal Place these new states into same chart entry as

generated state beginning and ending where generating state ends So predictor looking at

S -gt VP [00] results in

VP -gt Verb [00]VP -gt Verb NP [00]

24

Given a state With a non-terminal to right of dot that is a part-of-

speech category If the next word in the input matches this POS Create a new state with dot moved over the non-

terminal So scanner looking at VP -gt Verb NP [00] If the next word ldquobookrdquo can be a verb add new state

VP -gt Verb NP [01] Add this state to chart entry following current one Note Earley algorithm uses top-down input to

disambiguate POS Only POS predicted by some state can get added to chart

25

Applied to a state when its dot has reached right end of roleParser has discovered a category over some span of inputFind and advance all previous states that were looking for this category copy state move dot insert in current chart entryGiven NP -gt Det Nominal [13] VP -gt Verb NP [01]Add VP -gt Verb NP [03]

26

Find an S state in the final column that spans from 0 to n+1 and is complete

If thatrsquos the case yoursquore done S ndashgt α [0n+1]

27

More specificallyhellip

1 Predict all the states you can upfront

2 Read a word1 Extend states based on matches2 Add new predictions3 Go to 2

3 Look at N+1 to see if you have a winner

28

Book that flightWe should findhellip an S from 0 to 3 that is a completed statehellip

29

S NP VP VP VS Aux NP VP PP -gt Prep NPNP Det Nom N old | dog | footsteps |

young

NP PropN V dog | include | preferNom -gt Adj Nom Aux doesNom N Prep from | to | on | ofNom N Nom PropN Bush | McCain |

ObamaNom Nom PP Det that | this | a| theVP V NP Adj -gt old | green | red

31

32

33

What kind of algorithms did we just describe Not parsers ndash recognizers

The presence of an S state with the right attributes in the right place indicates a successful recognitionBut no parse treehellip no parserThatrsquos how we solve (not) an exponential problem in polynomial time

34

With the addition of a few pointers we have a parserAugment the ldquoCompleterrdquo to point to where we came from

35

S8S9

S10

S11

S13S12

S8

S9S8

All the possible parses for an input are in the table

We just need to read off all the backpointers from every complete S in the last column of the table

Find all the S -gt X [0N+1]

Follow the structural traces from the Completer

Of course this wonrsquot be polynomial time since there could be an exponential number of trees

We can at least represent ambiguity efficiently

37

Depth-first search will never terminate if grammar is left recursive (eg NP --gt NP PP)

38

)( εαα ⎯rarr⎯ΑΒ⎯rarr⎯Α

Solutions Rewrite the grammar (automatically) to a weakly

equivalent one which is not left-recursiveeg The man on the hill with the telescopehellipNP NP PP (wanted Nom plus a sequence of PPs)NP Nom PPNP NomNom Det NhellipbecomeshellipNP Nom NPrsquoNom Det NNPrsquo PP NPrsquo (wanted a sequence of PPs)NPrsquo e

Not so obvious what these rules meanhellip

Harder to detect and eliminate non-immediate left recursion

NP --gt Nom PPNom --gt NP

Fix depth of search explicitly

Rule ordering non-recursive rules firstNP --gt Det NomNP --gt NP PP

40

Multiple legal structures Attachment (eg I saw a man on a hill with a

telescope) Coordination (eg younger cats and dogs) NP bracketing (eg Spanish language teachers)

41

42

NP vs VP Attachment

Solution Return all possible parses and disambiguate

using ldquoother methodsrdquo

43

Parsing is a search problem which may be implemented with many control strategies Top-Down or Bottom-Up approaches each have

problemsCombining the two solves some but not all issues

Left recursion Syntactic ambiguityNext time Making use of statistical information about syntactic constituents Read Ch 14

44

  • Slide Number 1
  • Announcements
  • Homework Questions
  • Evaluation
  • Syntactic Parsing
  • Syntactic Parsing
  • CFG Example
  • Modified CFG
  • Slide Number 9
  • Parsing as a Form of Search
  • Top-Down Parser
  • Rule Expansion
  • Bottom-Up Parsing
  • Slide Number 14
  • Bottom-up parsing
  • Whatrsquos rightwrong withhellip
  • Some Solutions
  • Earley Parsing
  • States
  • StatesLocations
  • Graphically
  • Earley
  • Earley Algorithm
  • Predictor
  • Scanner
  • Completer
  • How do we know we are done
  • Earley
  • Example
  • CFG for Fragment of English
  • Example
  • Example
  • Example
  • Details
  • Converting Earley from Recognizer to Parser
  • Augmenting the chart with structural information
  • Retrieving Parse Trees from Chart
  • Left Recursion vs Right Recursion
  • Slide Number 39
  • Slide Number 40
  • Another Problem Structural ambiguity
  • Slide Number 42
  • Slide Number 43
  • Summing Up

Parse Tree for lsquoThe old dog the footsteps of the youngrsquo for Prior CFG

S

NP VP

NPV

DETNOM

N PP

DET NOM

N

The old dog the

footstepsof the young

Searching FSAs Finding the right path through the automaton Search space defined by structure of FSASearching CFGs Finding the right parse tree among all possible

parse trees Search space defined by the grammarConstraints provided by the input sentenceand the automaton or grammar

10

Builds from the root S node to the leavesExpectation-basedCommon search strategy Top-down left-to-right backtracking Try first rule with LHS = S Next expand all constituents in these treesrules Continue until leaves are POS Backtrack when candidate POS does not match input string

11

ldquoThe old dog the footsteps of the youngrdquoWhere does backtracking happen

What are the computational disadvantages

What are the advantages

12

Parser begins with words of input and builds up trees applying grammar rules whose RHS matches

Det N V Det N Prep Det NThe old dog the footsteps of the young

Det Adj N Det N Prep Det NThe old dog the footsteps of the young

Parse continues until an S root node reached or no further node expansion possible

13

Det N V Det N Prep Det NThe old dog the footsteps of the youngDet Adj N Det N Prep Det N

14

When does disambiguation occur

What are the computational advantages and disadvantages

15

Top-Down parsers ndash they never explore illegal parses (eg which canrsquot form an S) -- but waste time on trees that can never match the inputBottom-Up parsers ndash they never explore trees inconsistent with input -- but waste time exploring illegal parses (with no S root)For both find a control strategy -- how explore search space efficiently Pursuing all parses in parallel or backtrack or hellip Which rule to apply next Which node to expand next

16

Dynamic Programming Approaches ndash Use a chart to represent partial results

CKY Parsing Algorithm Bottom-up Grammar must be in Normal Form The parse tree might not be consistent with linguistic

theoryEarly Parsing Algorithm Top-down Expectations about constituents are confirmed by input A POS tag for a word that is not predicted is never addedChart Parser

17

Allows arbitrary CFGsFills a table in a single sweep over the input words Table is length N+1 N is number of words Table entries represent

Completed constituents and their locationsIn-progress constituentsPredicted constituents

18

The table-entries are called states and are represented with dotted-rulesS -gt VP A VP is predicted

NP -gt Det Nominal An NP is in progress

VP -gt V NP A VP has been found

19

It would be nice to know where these things are in the input sohellipS -gt VP [00] A VP is predicted at the

start of the sentence

NP -gt Det Nominal [12] An NP is in progress the Det goes from 1 to 2

VP -gt V NP [03] A VP has been found starting at 0 and ending at 3

20

21

As with most dynamic programming approaches the answer is found by looking in the table in the right placeIn this case there should be an S state in the final column that spans from 0 to n+1 and is completeIf thatrsquos the case yoursquore done S ndashgt α [0n+1]

22

March through chart left-to-rightAt each step apply 1 of 3 operators Predictor

Create new states representing top-down expectations Scanner

Match word predictions (rule with word after dot) to words

CompleterWhen a state is complete see what rules were looking for that completed constituent

23

Given a state With a non-terminal to right of dot (not a part-

of-speech category) Create a new state for each expansion of the

non-terminal Place these new states into same chart entry as

generated state beginning and ending where generating state ends So predictor looking at

S -gt VP [00] results in

VP -gt Verb [00]VP -gt Verb NP [00]

24

Given a state With a non-terminal to right of dot that is a part-of-

speech category If the next word in the input matches this POS Create a new state with dot moved over the non-

terminal So scanner looking at VP -gt Verb NP [00] If the next word ldquobookrdquo can be a verb add new state

VP -gt Verb NP [01] Add this state to chart entry following current one Note Earley algorithm uses top-down input to

disambiguate POS Only POS predicted by some state can get added to chart

25

Applied to a state when its dot has reached right end of roleParser has discovered a category over some span of inputFind and advance all previous states that were looking for this category copy state move dot insert in current chart entryGiven NP -gt Det Nominal [13] VP -gt Verb NP [01]Add VP -gt Verb NP [03]

26

Find an S state in the final column that spans from 0 to n+1 and is complete

If thatrsquos the case yoursquore done S ndashgt α [0n+1]

27

More specificallyhellip

1 Predict all the states you can upfront

2 Read a word1 Extend states based on matches2 Add new predictions3 Go to 2

3 Look at N+1 to see if you have a winner

28

Book that flightWe should findhellip an S from 0 to 3 that is a completed statehellip

29

S NP VP VP VS Aux NP VP PP -gt Prep NPNP Det Nom N old | dog | footsteps |

young

NP PropN V dog | include | preferNom -gt Adj Nom Aux doesNom N Prep from | to | on | ofNom N Nom PropN Bush | McCain |

ObamaNom Nom PP Det that | this | a| theVP V NP Adj -gt old | green | red

31

32

33

What kind of algorithms did we just describe Not parsers ndash recognizers

The presence of an S state with the right attributes in the right place indicates a successful recognitionBut no parse treehellip no parserThatrsquos how we solve (not) an exponential problem in polynomial time

34

With the addition of a few pointers we have a parserAugment the ldquoCompleterrdquo to point to where we came from

35

S8S9

S10

S11

S13S12

S8

S9S8

All the possible parses for an input are in the table

We just need to read off all the backpointers from every complete S in the last column of the table

Find all the S -gt X [0N+1]

Follow the structural traces from the Completer

Of course this wonrsquot be polynomial time since there could be an exponential number of trees

We can at least represent ambiguity efficiently

37

Depth-first search will never terminate if grammar is left recursive (eg NP --gt NP PP)

38

)( εαα ⎯rarr⎯ΑΒ⎯rarr⎯Α

Solutions Rewrite the grammar (automatically) to a weakly

equivalent one which is not left-recursiveeg The man on the hill with the telescopehellipNP NP PP (wanted Nom plus a sequence of PPs)NP Nom PPNP NomNom Det NhellipbecomeshellipNP Nom NPrsquoNom Det NNPrsquo PP NPrsquo (wanted a sequence of PPs)NPrsquo e

Not so obvious what these rules meanhellip

Harder to detect and eliminate non-immediate left recursion

NP --gt Nom PPNom --gt NP

Fix depth of search explicitly

Rule ordering non-recursive rules firstNP --gt Det NomNP --gt NP PP

40

Multiple legal structures Attachment (eg I saw a man on a hill with a

telescope) Coordination (eg younger cats and dogs) NP bracketing (eg Spanish language teachers)

41

42

NP vs VP Attachment

Solution Return all possible parses and disambiguate

using ldquoother methodsrdquo

43

Parsing is a search problem which may be implemented with many control strategies Top-Down or Bottom-Up approaches each have

problemsCombining the two solves some but not all issues

Left recursion Syntactic ambiguityNext time Making use of statistical information about syntactic constituents Read Ch 14

44

  • Slide Number 1
  • Announcements
  • Homework Questions
  • Evaluation
  • Syntactic Parsing
  • Syntactic Parsing
  • CFG Example
  • Modified CFG
  • Slide Number 9
  • Parsing as a Form of Search
  • Top-Down Parser
  • Rule Expansion
  • Bottom-Up Parsing
  • Slide Number 14
  • Bottom-up parsing
  • Whatrsquos rightwrong withhellip
  • Some Solutions
  • Earley Parsing
  • States
  • StatesLocations
  • Graphically
  • Earley
  • Earley Algorithm
  • Predictor
  • Scanner
  • Completer
  • How do we know we are done
  • Earley
  • Example
  • CFG for Fragment of English
  • Example
  • Example
  • Example
  • Details
  • Converting Earley from Recognizer to Parser
  • Augmenting the chart with structural information
  • Retrieving Parse Trees from Chart
  • Left Recursion vs Right Recursion
  • Slide Number 39
  • Slide Number 40
  • Another Problem Structural ambiguity
  • Slide Number 42
  • Slide Number 43
  • Summing Up

Searching FSAs Finding the right path through the automaton Search space defined by structure of FSASearching CFGs Finding the right parse tree among all possible

parse trees Search space defined by the grammarConstraints provided by the input sentenceand the automaton or grammar

10

Builds from the root S node to the leavesExpectation-basedCommon search strategy Top-down left-to-right backtracking Try first rule with LHS = S Next expand all constituents in these treesrules Continue until leaves are POS Backtrack when candidate POS does not match input string

11

ldquoThe old dog the footsteps of the youngrdquoWhere does backtracking happen

What are the computational disadvantages

What are the advantages

12

Parser begins with words of input and builds up trees applying grammar rules whose RHS matches

Det N V Det N Prep Det NThe old dog the footsteps of the young

Det Adj N Det N Prep Det NThe old dog the footsteps of the young

Parse continues until an S root node reached or no further node expansion possible

13

Det N V Det N Prep Det NThe old dog the footsteps of the youngDet Adj N Det N Prep Det N

14

When does disambiguation occur

What are the computational advantages and disadvantages

15

Top-Down parsers ndash they never explore illegal parses (eg which canrsquot form an S) -- but waste time on trees that can never match the inputBottom-Up parsers ndash they never explore trees inconsistent with input -- but waste time exploring illegal parses (with no S root)For both find a control strategy -- how explore search space efficiently Pursuing all parses in parallel or backtrack or hellip Which rule to apply next Which node to expand next

16

Dynamic Programming Approaches ndash Use a chart to represent partial results

CKY Parsing Algorithm Bottom-up Grammar must be in Normal Form The parse tree might not be consistent with linguistic

theoryEarly Parsing Algorithm Top-down Expectations about constituents are confirmed by input A POS tag for a word that is not predicted is never addedChart Parser

17

Allows arbitrary CFGsFills a table in a single sweep over the input words Table is length N+1 N is number of words Table entries represent

Completed constituents and their locationsIn-progress constituentsPredicted constituents

18

The table-entries are called states and are represented with dotted-rulesS -gt VP A VP is predicted

NP -gt Det Nominal An NP is in progress

VP -gt V NP A VP has been found

19

It would be nice to know where these things are in the input sohellipS -gt VP [00] A VP is predicted at the

start of the sentence

NP -gt Det Nominal [12] An NP is in progress the Det goes from 1 to 2

VP -gt V NP [03] A VP has been found starting at 0 and ending at 3

20

21

As with most dynamic programming approaches the answer is found by looking in the table in the right placeIn this case there should be an S state in the final column that spans from 0 to n+1 and is completeIf thatrsquos the case yoursquore done S ndashgt α [0n+1]

22

March through chart left-to-rightAt each step apply 1 of 3 operators Predictor

Create new states representing top-down expectations Scanner

Match word predictions (rule with word after dot) to words

CompleterWhen a state is complete see what rules were looking for that completed constituent

23

Given a state With a non-terminal to right of dot (not a part-

of-speech category) Create a new state for each expansion of the

non-terminal Place these new states into same chart entry as

generated state beginning and ending where generating state ends So predictor looking at

S -gt VP [00] results in

VP -gt Verb [00]VP -gt Verb NP [00]

24

Given a state With a non-terminal to right of dot that is a part-of-

speech category If the next word in the input matches this POS Create a new state with dot moved over the non-

terminal So scanner looking at VP -gt Verb NP [00] If the next word ldquobookrdquo can be a verb add new state

VP -gt Verb NP [01] Add this state to chart entry following current one Note Earley algorithm uses top-down input to

disambiguate POS Only POS predicted by some state can get added to chart

25

Applied to a state when its dot has reached right end of roleParser has discovered a category over some span of inputFind and advance all previous states that were looking for this category copy state move dot insert in current chart entryGiven NP -gt Det Nominal [13] VP -gt Verb NP [01]Add VP -gt Verb NP [03]

26

Find an S state in the final column that spans from 0 to n+1 and is complete

If thatrsquos the case yoursquore done S ndashgt α [0n+1]

27

More specificallyhellip

1 Predict all the states you can upfront

2 Read a word1 Extend states based on matches2 Add new predictions3 Go to 2

3 Look at N+1 to see if you have a winner

28

Book that flightWe should findhellip an S from 0 to 3 that is a completed statehellip

29

S NP VP VP VS Aux NP VP PP -gt Prep NPNP Det Nom N old | dog | footsteps |

young

NP PropN V dog | include | preferNom -gt Adj Nom Aux doesNom N Prep from | to | on | ofNom N Nom PropN Bush | McCain |

ObamaNom Nom PP Det that | this | a| theVP V NP Adj -gt old | green | red

31

32

33

What kind of algorithms did we just describe Not parsers ndash recognizers

The presence of an S state with the right attributes in the right place indicates a successful recognitionBut no parse treehellip no parserThatrsquos how we solve (not) an exponential problem in polynomial time

34

With the addition of a few pointers we have a parserAugment the ldquoCompleterrdquo to point to where we came from

35

S8S9

S10

S11

S13S12

S8

S9S8

All the possible parses for an input are in the table

We just need to read off all the backpointers from every complete S in the last column of the table

Find all the S -gt X [0N+1]

Follow the structural traces from the Completer

Of course this wonrsquot be polynomial time since there could be an exponential number of trees

We can at least represent ambiguity efficiently

37

Depth-first search will never terminate if grammar is left recursive (eg NP --gt NP PP)

38

)( εαα ⎯rarr⎯ΑΒ⎯rarr⎯Α

Solutions Rewrite the grammar (automatically) to a weakly

equivalent one which is not left-recursiveeg The man on the hill with the telescopehellipNP NP PP (wanted Nom plus a sequence of PPs)NP Nom PPNP NomNom Det NhellipbecomeshellipNP Nom NPrsquoNom Det NNPrsquo PP NPrsquo (wanted a sequence of PPs)NPrsquo e

Not so obvious what these rules meanhellip

Harder to detect and eliminate non-immediate left recursion

NP --gt Nom PPNom --gt NP

Fix depth of search explicitly

Rule ordering non-recursive rules firstNP --gt Det NomNP --gt NP PP

40

Multiple legal structures Attachment (eg I saw a man on a hill with a

telescope) Coordination (eg younger cats and dogs) NP bracketing (eg Spanish language teachers)

41

42

NP vs VP Attachment

Solution Return all possible parses and disambiguate

using ldquoother methodsrdquo

43

Parsing is a search problem which may be implemented with many control strategies Top-Down or Bottom-Up approaches each have

problemsCombining the two solves some but not all issues

Left recursion Syntactic ambiguityNext time Making use of statistical information about syntactic constituents Read Ch 14

44

  • Slide Number 1
  • Announcements
  • Homework Questions
  • Evaluation
  • Syntactic Parsing
  • Syntactic Parsing
  • CFG Example
  • Modified CFG
  • Slide Number 9
  • Parsing as a Form of Search
  • Top-Down Parser
  • Rule Expansion
  • Bottom-Up Parsing
  • Slide Number 14
  • Bottom-up parsing
  • Whatrsquos rightwrong withhellip
  • Some Solutions
  • Earley Parsing
  • States
  • StatesLocations
  • Graphically
  • Earley
  • Earley Algorithm
  • Predictor
  • Scanner
  • Completer
  • How do we know we are done
  • Earley
  • Example
  • CFG for Fragment of English
  • Example
  • Example
  • Example
  • Details
  • Converting Earley from Recognizer to Parser
  • Augmenting the chart with structural information
  • Retrieving Parse Trees from Chart
  • Left Recursion vs Right Recursion
  • Slide Number 39
  • Slide Number 40
  • Another Problem Structural ambiguity
  • Slide Number 42
  • Slide Number 43
  • Summing Up

Builds from the root S node to the leavesExpectation-basedCommon search strategy Top-down left-to-right backtracking Try first rule with LHS = S Next expand all constituents in these treesrules Continue until leaves are POS Backtrack when candidate POS does not match input string

11

ldquoThe old dog the footsteps of the youngrdquoWhere does backtracking happen

What are the computational disadvantages

What are the advantages

12

Parser begins with words of input and builds up trees applying grammar rules whose RHS matches

Det N V Det N Prep Det NThe old dog the footsteps of the young

Det Adj N Det N Prep Det NThe old dog the footsteps of the young

Parse continues until an S root node reached or no further node expansion possible

13

Det N V Det N Prep Det NThe old dog the footsteps of the youngDet Adj N Det N Prep Det N

14

When does disambiguation occur

What are the computational advantages and disadvantages

15

Top-Down parsers ndash they never explore illegal parses (eg which canrsquot form an S) -- but waste time on trees that can never match the inputBottom-Up parsers ndash they never explore trees inconsistent with input -- but waste time exploring illegal parses (with no S root)For both find a control strategy -- how explore search space efficiently Pursuing all parses in parallel or backtrack or hellip Which rule to apply next Which node to expand next

16

Dynamic Programming Approaches ndash Use a chart to represent partial results

CKY Parsing Algorithm Bottom-up Grammar must be in Normal Form The parse tree might not be consistent with linguistic

theoryEarly Parsing Algorithm Top-down Expectations about constituents are confirmed by input A POS tag for a word that is not predicted is never addedChart Parser

17

Allows arbitrary CFGsFills a table in a single sweep over the input words Table is length N+1 N is number of words Table entries represent

Completed constituents and their locationsIn-progress constituentsPredicted constituents

18

The table-entries are called states and are represented with dotted-rulesS -gt VP A VP is predicted

NP -gt Det Nominal An NP is in progress

VP -gt V NP A VP has been found

19

It would be nice to know where these things are in the input sohellipS -gt VP [00] A VP is predicted at the

start of the sentence

NP -gt Det Nominal [12] An NP is in progress the Det goes from 1 to 2

VP -gt V NP [03] A VP has been found starting at 0 and ending at 3

20

21

As with most dynamic programming approaches the answer is found by looking in the table in the right placeIn this case there should be an S state in the final column that spans from 0 to n+1 and is completeIf thatrsquos the case yoursquore done S ndashgt α [0n+1]

22

March through chart left-to-rightAt each step apply 1 of 3 operators Predictor

Create new states representing top-down expectations Scanner

Match word predictions (rule with word after dot) to words

CompleterWhen a state is complete see what rules were looking for that completed constituent

23

Given a state With a non-terminal to right of dot (not a part-

of-speech category) Create a new state for each expansion of the

non-terminal Place these new states into same chart entry as

generated state beginning and ending where generating state ends So predictor looking at

S -gt VP [00] results in

VP -gt Verb [00]VP -gt Verb NP [00]

24

Given a state With a non-terminal to right of dot that is a part-of-

speech category If the next word in the input matches this POS Create a new state with dot moved over the non-

terminal So scanner looking at VP -gt Verb NP [00] If the next word ldquobookrdquo can be a verb add new state

VP -gt Verb NP [01] Add this state to chart entry following current one Note Earley algorithm uses top-down input to

disambiguate POS Only POS predicted by some state can get added to chart

25

Applied to a state when its dot has reached right end of roleParser has discovered a category over some span of inputFind and advance all previous states that were looking for this category copy state move dot insert in current chart entryGiven NP -gt Det Nominal [13] VP -gt Verb NP [01]Add VP -gt Verb NP [03]

26

Find an S state in the final column that spans from 0 to n+1 and is complete

If thatrsquos the case yoursquore done S ndashgt α [0n+1]

27

More specificallyhellip

1 Predict all the states you can upfront

2 Read a word1 Extend states based on matches2 Add new predictions3 Go to 2

3 Look at N+1 to see if you have a winner

28

Book that flightWe should findhellip an S from 0 to 3 that is a completed statehellip

29

S NP VP VP VS Aux NP VP PP -gt Prep NPNP Det Nom N old | dog | footsteps |

young

NP PropN V dog | include | preferNom -gt Adj Nom Aux doesNom N Prep from | to | on | ofNom N Nom PropN Bush | McCain |

ObamaNom Nom PP Det that | this | a| theVP V NP Adj -gt old | green | red

31

32

33

What kind of algorithms did we just describe Not parsers ndash recognizers

The presence of an S state with the right attributes in the right place indicates a successful recognitionBut no parse treehellip no parserThatrsquos how we solve (not) an exponential problem in polynomial time

34

With the addition of a few pointers we have a parserAugment the ldquoCompleterrdquo to point to where we came from

35

S8S9

S10

S11

S13S12

S8

S9S8

All the possible parses for an input are in the table

We just need to read off all the backpointers from every complete S in the last column of the table

Find all the S -gt X [0N+1]

Follow the structural traces from the Completer

Of course this wonrsquot be polynomial time since there could be an exponential number of trees

We can at least represent ambiguity efficiently

37

Depth-first search will never terminate if grammar is left recursive (eg NP --gt NP PP)

38

)( εαα ⎯rarr⎯ΑΒ⎯rarr⎯Α

Solutions Rewrite the grammar (automatically) to a weakly

equivalent one which is not left-recursiveeg The man on the hill with the telescopehellipNP NP PP (wanted Nom plus a sequence of PPs)NP Nom PPNP NomNom Det NhellipbecomeshellipNP Nom NPrsquoNom Det NNPrsquo PP NPrsquo (wanted a sequence of PPs)NPrsquo e

Not so obvious what these rules meanhellip

Harder to detect and eliminate non-immediate left recursion

NP --gt Nom PPNom --gt NP

Fix depth of search explicitly

Rule ordering non-recursive rules firstNP --gt Det NomNP --gt NP PP

40

Multiple legal structures Attachment (eg I saw a man on a hill with a

telescope) Coordination (eg younger cats and dogs) NP bracketing (eg Spanish language teachers)

41

42

NP vs VP Attachment

Solution Return all possible parses and disambiguate

using ldquoother methodsrdquo

43

Parsing is a search problem which may be implemented with many control strategies Top-Down or Bottom-Up approaches each have

problemsCombining the two solves some but not all issues

Left recursion Syntactic ambiguityNext time Making use of statistical information about syntactic constituents Read Ch 14

44

  • Slide Number 1
  • Announcements
  • Homework Questions
  • Evaluation
  • Syntactic Parsing
  • Syntactic Parsing
  • CFG Example
  • Modified CFG
  • Slide Number 9
  • Parsing as a Form of Search
  • Top-Down Parser
  • Rule Expansion
  • Bottom-Up Parsing
  • Slide Number 14
  • Bottom-up parsing
  • Whatrsquos rightwrong withhellip
  • Some Solutions
  • Earley Parsing
  • States
  • StatesLocations
  • Graphically
  • Earley
  • Earley Algorithm
  • Predictor
  • Scanner
  • Completer
  • How do we know we are done
  • Earley
  • Example
  • CFG for Fragment of English
  • Example
  • Example
  • Example
  • Details
  • Converting Earley from Recognizer to Parser
  • Augmenting the chart with structural information
  • Retrieving Parse Trees from Chart
  • Left Recursion vs Right Recursion
  • Slide Number 39
  • Slide Number 40
  • Another Problem Structural ambiguity
  • Slide Number 42
  • Slide Number 43
  • Summing Up

ldquoThe old dog the footsteps of the youngrdquoWhere does backtracking happen

What are the computational disadvantages

What are the advantages

12

Parser begins with words of input and builds up trees applying grammar rules whose RHS matches

Det N V Det N Prep Det NThe old dog the footsteps of the young

Det Adj N Det N Prep Det NThe old dog the footsteps of the young

Parse continues until an S root node reached or no further node expansion possible

13

Det N V Det N Prep Det NThe old dog the footsteps of the youngDet Adj N Det N Prep Det N

14

When does disambiguation occur

What are the computational advantages and disadvantages

15

Top-Down parsers ndash they never explore illegal parses (eg which canrsquot form an S) -- but waste time on trees that can never match the inputBottom-Up parsers ndash they never explore trees inconsistent with input -- but waste time exploring illegal parses (with no S root)For both find a control strategy -- how explore search space efficiently Pursuing all parses in parallel or backtrack or hellip Which rule to apply next Which node to expand next

16

Dynamic Programming Approaches ndash Use a chart to represent partial results

CKY Parsing Algorithm Bottom-up Grammar must be in Normal Form The parse tree might not be consistent with linguistic

theoryEarly Parsing Algorithm Top-down Expectations about constituents are confirmed by input A POS tag for a word that is not predicted is never addedChart Parser

17

Allows arbitrary CFGsFills a table in a single sweep over the input words Table is length N+1 N is number of words Table entries represent

Completed constituents and their locationsIn-progress constituentsPredicted constituents

18

The table-entries are called states and are represented with dotted-rulesS -gt VP A VP is predicted

NP -gt Det Nominal An NP is in progress

VP -gt V NP A VP has been found

19

It would be nice to know where these things are in the input sohellipS -gt VP [00] A VP is predicted at the

start of the sentence

NP -gt Det Nominal [12] An NP is in progress the Det goes from 1 to 2

VP -gt V NP [03] A VP has been found starting at 0 and ending at 3

20

21

As with most dynamic programming approaches the answer is found by looking in the table in the right placeIn this case there should be an S state in the final column that spans from 0 to n+1 and is completeIf thatrsquos the case yoursquore done S ndashgt α [0n+1]

22

March through chart left-to-rightAt each step apply 1 of 3 operators Predictor

Create new states representing top-down expectations Scanner

Match word predictions (rule with word after dot) to words

CompleterWhen a state is complete see what rules were looking for that completed constituent

23

Given a state With a non-terminal to right of dot (not a part-

of-speech category) Create a new state for each expansion of the

non-terminal Place these new states into same chart entry as

generated state beginning and ending where generating state ends So predictor looking at

S -gt VP [00] results in

VP -gt Verb [00]VP -gt Verb NP [00]

24

Given a state With a non-terminal to right of dot that is a part-of-

speech category If the next word in the input matches this POS Create a new state with dot moved over the non-

terminal So scanner looking at VP -gt Verb NP [00] If the next word ldquobookrdquo can be a verb add new state

VP -gt Verb NP [01] Add this state to chart entry following current one Note Earley algorithm uses top-down input to

disambiguate POS Only POS predicted by some state can get added to chart

25

Applied to a state when its dot has reached right end of roleParser has discovered a category over some span of inputFind and advance all previous states that were looking for this category copy state move dot insert in current chart entryGiven NP -gt Det Nominal [13] VP -gt Verb NP [01]Add VP -gt Verb NP [03]

26

Find an S state in the final column that spans from 0 to n+1 and is complete

If thatrsquos the case yoursquore done S ndashgt α [0n+1]

27

More specificallyhellip

1 Predict all the states you can upfront

2 Read a word1 Extend states based on matches2 Add new predictions3 Go to 2

3 Look at N+1 to see if you have a winner

28

Book that flightWe should findhellip an S from 0 to 3 that is a completed statehellip

29

S NP VP VP VS Aux NP VP PP -gt Prep NPNP Det Nom N old | dog | footsteps |

young

NP PropN V dog | include | preferNom -gt Adj Nom Aux doesNom N Prep from | to | on | ofNom N Nom PropN Bush | McCain |

ObamaNom Nom PP Det that | this | a| theVP V NP Adj -gt old | green | red

31

32

33

What kind of algorithms did we just describe Not parsers ndash recognizers

The presence of an S state with the right attributes in the right place indicates a successful recognitionBut no parse treehellip no parserThatrsquos how we solve (not) an exponential problem in polynomial time

34

With the addition of a few pointers we have a parserAugment the ldquoCompleterrdquo to point to where we came from

35

S8S9

S10

S11

S13S12

S8

S9S8

All the possible parses for an input are in the table

We just need to read off all the backpointers from every complete S in the last column of the table

Find all the S -gt X [0N+1]

Follow the structural traces from the Completer

Of course this wonrsquot be polynomial time since there could be an exponential number of trees

We can at least represent ambiguity efficiently

37

Depth-first search will never terminate if grammar is left recursive (eg NP --gt NP PP)

38

)( εαα ⎯rarr⎯ΑΒ⎯rarr⎯Α

Solutions Rewrite the grammar (automatically) to a weakly

equivalent one which is not left-recursiveeg The man on the hill with the telescopehellipNP NP PP (wanted Nom plus a sequence of PPs)NP Nom PPNP NomNom Det NhellipbecomeshellipNP Nom NPrsquoNom Det NNPrsquo PP NPrsquo (wanted a sequence of PPs)NPrsquo e

Not so obvious what these rules meanhellip

Harder to detect and eliminate non-immediate left recursion

NP --gt Nom PPNom --gt NP

Fix depth of search explicitly

Rule ordering non-recursive rules firstNP --gt Det NomNP --gt NP PP

40

Multiple legal structures Attachment (eg I saw a man on a hill with a

telescope) Coordination (eg younger cats and dogs) NP bracketing (eg Spanish language teachers)

41

42

NP vs VP Attachment

Solution Return all possible parses and disambiguate

using ldquoother methodsrdquo

43

Parsing is a search problem which may be implemented with many control strategies Top-Down or Bottom-Up approaches each have

problemsCombining the two solves some but not all issues

Left recursion Syntactic ambiguityNext time Making use of statistical information about syntactic constituents Read Ch 14

44

  • Slide Number 1
  • Announcements
  • Homework Questions
  • Evaluation
  • Syntactic Parsing
  • Syntactic Parsing
  • CFG Example
  • Modified CFG
  • Slide Number 9
  • Parsing as a Form of Search
  • Top-Down Parser
  • Rule Expansion
  • Bottom-Up Parsing
  • Slide Number 14
  • Bottom-up parsing
  • Whatrsquos rightwrong withhellip
  • Some Solutions
  • Earley Parsing
  • States
  • StatesLocations
  • Graphically
  • Earley
  • Earley Algorithm
  • Predictor
  • Scanner
  • Completer
  • How do we know we are done
  • Earley
  • Example
  • CFG for Fragment of English
  • Example
  • Example
  • Example
  • Details
  • Converting Earley from Recognizer to Parser
  • Augmenting the chart with structural information
  • Retrieving Parse Trees from Chart
  • Left Recursion vs Right Recursion
  • Slide Number 39
  • Slide Number 40
  • Another Problem Structural ambiguity
  • Slide Number 42
  • Slide Number 43
  • Summing Up

Parser begins with words of input and builds up trees applying grammar rules whose RHS matches

Det N V Det N Prep Det NThe old dog the footsteps of the young

Det Adj N Det N Prep Det NThe old dog the footsteps of the young

Parse continues until an S root node reached or no further node expansion possible

13

Det N V Det N Prep Det NThe old dog the footsteps of the youngDet Adj N Det N Prep Det N

14

When does disambiguation occur

What are the computational advantages and disadvantages

15

Top-Down parsers ndash they never explore illegal parses (eg which canrsquot form an S) -- but waste time on trees that can never match the inputBottom-Up parsers ndash they never explore trees inconsistent with input -- but waste time exploring illegal parses (with no S root)For both find a control strategy -- how explore search space efficiently Pursuing all parses in parallel or backtrack or hellip Which rule to apply next Which node to expand next

16

Dynamic Programming Approaches ndash Use a chart to represent partial results

CKY Parsing Algorithm Bottom-up Grammar must be in Normal Form The parse tree might not be consistent with linguistic

theoryEarly Parsing Algorithm Top-down Expectations about constituents are confirmed by input A POS tag for a word that is not predicted is never addedChart Parser

17

Allows arbitrary CFGsFills a table in a single sweep over the input words Table is length N+1 N is number of words Table entries represent

Completed constituents and their locationsIn-progress constituentsPredicted constituents

18

The table-entries are called states and are represented with dotted-rulesS -gt VP A VP is predicted

NP -gt Det Nominal An NP is in progress

VP -gt V NP A VP has been found

19

It would be nice to know where these things are in the input sohellipS -gt VP [00] A VP is predicted at the

start of the sentence

NP -gt Det Nominal [12] An NP is in progress the Det goes from 1 to 2

VP -gt V NP [03] A VP has been found starting at 0 and ending at 3

20

21

As with most dynamic programming approaches the answer is found by looking in the table in the right placeIn this case there should be an S state in the final column that spans from 0 to n+1 and is completeIf thatrsquos the case yoursquore done S ndashgt α [0n+1]

22

March through chart left-to-rightAt each step apply 1 of 3 operators Predictor

Create new states representing top-down expectations Scanner

Match word predictions (rule with word after dot) to words

CompleterWhen a state is complete see what rules were looking for that completed constituent

23

Given a state With a non-terminal to right of dot (not a part-

of-speech category) Create a new state for each expansion of the

non-terminal Place these new states into same chart entry as

generated state beginning and ending where generating state ends So predictor looking at

S -gt VP [00] results in

VP -gt Verb [00]VP -gt Verb NP [00]

24

Given a state With a non-terminal to right of dot that is a part-of-

speech category If the next word in the input matches this POS Create a new state with dot moved over the non-

terminal So scanner looking at VP -gt Verb NP [00] If the next word ldquobookrdquo can be a verb add new state

VP -gt Verb NP [01] Add this state to chart entry following current one Note Earley algorithm uses top-down input to

disambiguate POS Only POS predicted by some state can get added to chart

25

Applied to a state when its dot has reached right end of roleParser has discovered a category over some span of inputFind and advance all previous states that were looking for this category copy state move dot insert in current chart entryGiven NP -gt Det Nominal [13] VP -gt Verb NP [01]Add VP -gt Verb NP [03]

26

Find an S state in the final column that spans from 0 to n+1 and is complete

If thatrsquos the case yoursquore done S ndashgt α [0n+1]

27

More specificallyhellip

1 Predict all the states you can upfront

2 Read a word1 Extend states based on matches2 Add new predictions3 Go to 2

3 Look at N+1 to see if you have a winner

28

Book that flightWe should findhellip an S from 0 to 3 that is a completed statehellip

29

S NP VP VP VS Aux NP VP PP -gt Prep NPNP Det Nom N old | dog | footsteps |

young

NP PropN V dog | include | preferNom -gt Adj Nom Aux doesNom N Prep from | to | on | ofNom N Nom PropN Bush | McCain |

ObamaNom Nom PP Det that | this | a| theVP V NP Adj -gt old | green | red

31

32

33

What kind of algorithms did we just describe Not parsers ndash recognizers

The presence of an S state with the right attributes in the right place indicates a successful recognitionBut no parse treehellip no parserThatrsquos how we solve (not) an exponential problem in polynomial time

34

With the addition of a few pointers we have a parserAugment the ldquoCompleterrdquo to point to where we came from

35

S8S9

S10

S11

S13S12

S8

S9S8

All the possible parses for an input are in the table

We just need to read off all the backpointers from every complete S in the last column of the table

Find all the S -gt X [0N+1]

Follow the structural traces from the Completer

Of course this wonrsquot be polynomial time since there could be an exponential number of trees

We can at least represent ambiguity efficiently

37

Depth-first search will never terminate if grammar is left recursive (eg NP --gt NP PP)

38

)( εαα ⎯rarr⎯ΑΒ⎯rarr⎯Α

Solutions Rewrite the grammar (automatically) to a weakly

equivalent one which is not left-recursiveeg The man on the hill with the telescopehellipNP NP PP (wanted Nom plus a sequence of PPs)NP Nom PPNP NomNom Det NhellipbecomeshellipNP Nom NPrsquoNom Det NNPrsquo PP NPrsquo (wanted a sequence of PPs)NPrsquo e

Not so obvious what these rules meanhellip

Harder to detect and eliminate non-immediate left recursion

NP --gt Nom PPNom --gt NP

Fix depth of search explicitly

Rule ordering non-recursive rules firstNP --gt Det NomNP --gt NP PP

40

Multiple legal structures Attachment (eg I saw a man on a hill with a

telescope) Coordination (eg younger cats and dogs) NP bracketing (eg Spanish language teachers)

41

42

NP vs VP Attachment

Solution Return all possible parses and disambiguate

using ldquoother methodsrdquo

43

Parsing is a search problem which may be implemented with many control strategies Top-Down or Bottom-Up approaches each have

problemsCombining the two solves some but not all issues

Left recursion Syntactic ambiguityNext time Making use of statistical information about syntactic constituents Read Ch 14

44

  • Slide Number 1
  • Announcements
  • Homework Questions
  • Evaluation
  • Syntactic Parsing
  • Syntactic Parsing
  • CFG Example
  • Modified CFG
  • Slide Number 9
  • Parsing as a Form of Search
  • Top-Down Parser
  • Rule Expansion
  • Bottom-Up Parsing
  • Slide Number 14
  • Bottom-up parsing
  • Whatrsquos rightwrong withhellip
  • Some Solutions
  • Earley Parsing
  • States
  • StatesLocations
  • Graphically
  • Earley
  • Earley Algorithm
  • Predictor
  • Scanner
  • Completer
  • How do we know we are done
  • Earley
  • Example
  • CFG for Fragment of English
  • Example
  • Example
  • Example
  • Details
  • Converting Earley from Recognizer to Parser
  • Augmenting the chart with structural information
  • Retrieving Parse Trees from Chart
  • Left Recursion vs Right Recursion
  • Slide Number 39
  • Slide Number 40
  • Another Problem Structural ambiguity
  • Slide Number 42
  • Slide Number 43
  • Summing Up

Det N V Det N Prep Det NThe old dog the footsteps of the youngDet Adj N Det N Prep Det N

14

When does disambiguation occur

What are the computational advantages and disadvantages

15

Top-Down parsers ndash they never explore illegal parses (eg which canrsquot form an S) -- but waste time on trees that can never match the inputBottom-Up parsers ndash they never explore trees inconsistent with input -- but waste time exploring illegal parses (with no S root)For both find a control strategy -- how explore search space efficiently Pursuing all parses in parallel or backtrack or hellip Which rule to apply next Which node to expand next

16

Dynamic Programming Approaches ndash Use a chart to represent partial results

CKY Parsing Algorithm Bottom-up Grammar must be in Normal Form The parse tree might not be consistent with linguistic

theoryEarly Parsing Algorithm Top-down Expectations about constituents are confirmed by input A POS tag for a word that is not predicted is never addedChart Parser

17

Allows arbitrary CFGsFills a table in a single sweep over the input words Table is length N+1 N is number of words Table entries represent

Completed constituents and their locationsIn-progress constituentsPredicted constituents

18

The table-entries are called states and are represented with dotted-rulesS -gt VP A VP is predicted

NP -gt Det Nominal An NP is in progress

VP -gt V NP A VP has been found

19

It would be nice to know where these things are in the input sohellipS -gt VP [00] A VP is predicted at the

start of the sentence

NP -gt Det Nominal [12] An NP is in progress the Det goes from 1 to 2

VP -gt V NP [03] A VP has been found starting at 0 and ending at 3

20

21

As with most dynamic programming approaches the answer is found by looking in the table in the right placeIn this case there should be an S state in the final column that spans from 0 to n+1 and is completeIf thatrsquos the case yoursquore done S ndashgt α [0n+1]

22

March through chart left-to-rightAt each step apply 1 of 3 operators Predictor

Create new states representing top-down expectations Scanner

Match word predictions (rule with word after dot) to words

CompleterWhen a state is complete see what rules were looking for that completed constituent

23

Given a state With a non-terminal to right of dot (not a part-

of-speech category) Create a new state for each expansion of the

non-terminal Place these new states into same chart entry as

generated state beginning and ending where generating state ends So predictor looking at

S -gt VP [00] results in

VP -gt Verb [00]VP -gt Verb NP [00]

24

Given a state With a non-terminal to right of dot that is a part-of-

speech category If the next word in the input matches this POS Create a new state with dot moved over the non-

terminal So scanner looking at VP -gt Verb NP [00] If the next word ldquobookrdquo can be a verb add new state

VP -gt Verb NP [01] Add this state to chart entry following current one Note Earley algorithm uses top-down input to

disambiguate POS Only POS predicted by some state can get added to chart

25

Applied to a state when its dot has reached right end of roleParser has discovered a category over some span of inputFind and advance all previous states that were looking for this category copy state move dot insert in current chart entryGiven NP -gt Det Nominal [13] VP -gt Verb NP [01]Add VP -gt Verb NP [03]

26

Find an S state in the final column that spans from 0 to n+1 and is complete

If thatrsquos the case yoursquore done S ndashgt α [0n+1]

27

More specificallyhellip

1 Predict all the states you can upfront

2 Read a word1 Extend states based on matches2 Add new predictions3 Go to 2

3 Look at N+1 to see if you have a winner

28

Book that flightWe should findhellip an S from 0 to 3 that is a completed statehellip

29

S NP VP VP VS Aux NP VP PP -gt Prep NPNP Det Nom N old | dog | footsteps |

young

NP PropN V dog | include | preferNom -gt Adj Nom Aux doesNom N Prep from | to | on | ofNom N Nom PropN Bush | McCain |

ObamaNom Nom PP Det that | this | a| theVP V NP Adj -gt old | green | red

31

32

33

What kind of algorithms did we just describe Not parsers ndash recognizers

The presence of an S state with the right attributes in the right place indicates a successful recognitionBut no parse treehellip no parserThatrsquos how we solve (not) an exponential problem in polynomial time

34

With the addition of a few pointers we have a parserAugment the ldquoCompleterrdquo to point to where we came from

35

S8S9

S10

S11

S13S12

S8

S9S8

All the possible parses for an input are in the table

We just need to read off all the backpointers from every complete S in the last column of the table

Find all the S -gt X [0N+1]

Follow the structural traces from the Completer

Of course this wonrsquot be polynomial time since there could be an exponential number of trees

We can at least represent ambiguity efficiently

37

Depth-first search will never terminate if grammar is left recursive (eg NP --gt NP PP)

38

)( εαα ⎯rarr⎯ΑΒ⎯rarr⎯Α

Solutions Rewrite the grammar (automatically) to a weakly

equivalent one which is not left-recursiveeg The man on the hill with the telescopehellipNP NP PP (wanted Nom plus a sequence of PPs)NP Nom PPNP NomNom Det NhellipbecomeshellipNP Nom NPrsquoNom Det NNPrsquo PP NPrsquo (wanted a sequence of PPs)NPrsquo e

Not so obvious what these rules meanhellip

Harder to detect and eliminate non-immediate left recursion

NP --gt Nom PPNom --gt NP

Fix depth of search explicitly

Rule ordering non-recursive rules firstNP --gt Det NomNP --gt NP PP

40

Multiple legal structures Attachment (eg I saw a man on a hill with a

telescope) Coordination (eg younger cats and dogs) NP bracketing (eg Spanish language teachers)

41

42

NP vs VP Attachment

Solution Return all possible parses and disambiguate

using ldquoother methodsrdquo

43

Parsing is a search problem which may be implemented with many control strategies Top-Down or Bottom-Up approaches each have

problemsCombining the two solves some but not all issues

Left recursion Syntactic ambiguityNext time Making use of statistical information about syntactic constituents Read Ch 14

44

  • Slide Number 1
  • Announcements
  • Homework Questions
  • Evaluation
  • Syntactic Parsing
  • Syntactic Parsing
  • CFG Example
  • Modified CFG
  • Slide Number 9
  • Parsing as a Form of Search
  • Top-Down Parser
  • Rule Expansion
  • Bottom-Up Parsing
  • Slide Number 14
  • Bottom-up parsing
  • Whatrsquos rightwrong withhellip
  • Some Solutions
  • Earley Parsing
  • States
  • StatesLocations
  • Graphically
  • Earley
  • Earley Algorithm
  • Predictor
  • Scanner
  • Completer
  • How do we know we are done
  • Earley
  • Example
  • CFG for Fragment of English
  • Example
  • Example
  • Example
  • Details
  • Converting Earley from Recognizer to Parser
  • Augmenting the chart with structural information
  • Retrieving Parse Trees from Chart
  • Left Recursion vs Right Recursion
  • Slide Number 39
  • Slide Number 40
  • Another Problem Structural ambiguity
  • Slide Number 42
  • Slide Number 43
  • Summing Up

When does disambiguation occur

What are the computational advantages and disadvantages

15

Top-Down parsers ndash they never explore illegal parses (eg which canrsquot form an S) -- but waste time on trees that can never match the inputBottom-Up parsers ndash they never explore trees inconsistent with input -- but waste time exploring illegal parses (with no S root)For both find a control strategy -- how explore search space efficiently Pursuing all parses in parallel or backtrack or hellip Which rule to apply next Which node to expand next

16

Dynamic Programming Approaches ndash Use a chart to represent partial results

CKY Parsing Algorithm Bottom-up Grammar must be in Normal Form The parse tree might not be consistent with linguistic

theoryEarly Parsing Algorithm Top-down Expectations about constituents are confirmed by input A POS tag for a word that is not predicted is never addedChart Parser

17

Allows arbitrary CFGsFills a table in a single sweep over the input words Table is length N+1 N is number of words Table entries represent

Completed constituents and their locationsIn-progress constituentsPredicted constituents

18

The table-entries are called states and are represented with dotted-rulesS -gt VP A VP is predicted

NP -gt Det Nominal An NP is in progress

VP -gt V NP A VP has been found

19

It would be nice to know where these things are in the input sohellipS -gt VP [00] A VP is predicted at the

start of the sentence

NP -gt Det Nominal [12] An NP is in progress the Det goes from 1 to 2

VP -gt V NP [03] A VP has been found starting at 0 and ending at 3

20

21

As with most dynamic programming approaches the answer is found by looking in the table in the right placeIn this case there should be an S state in the final column that spans from 0 to n+1 and is completeIf thatrsquos the case yoursquore done S ndashgt α [0n+1]

22

March through chart left-to-rightAt each step apply 1 of 3 operators Predictor

Create new states representing top-down expectations Scanner

Match word predictions (rule with word after dot) to words

CompleterWhen a state is complete see what rules were looking for that completed constituent

23

Given a state With a non-terminal to right of dot (not a part-

of-speech category) Create a new state for each expansion of the

non-terminal Place these new states into same chart entry as

generated state beginning and ending where generating state ends So predictor looking at

S -gt VP [00] results in

VP -gt Verb [00]VP -gt Verb NP [00]

24

Given a state With a non-terminal to right of dot that is a part-of-

speech category If the next word in the input matches this POS Create a new state with dot moved over the non-

terminal So scanner looking at VP -gt Verb NP [00] If the next word ldquobookrdquo can be a verb add new state

VP -gt Verb NP [01] Add this state to chart entry following current one Note Earley algorithm uses top-down input to

disambiguate POS Only POS predicted by some state can get added to chart

25

Applied to a state when its dot has reached right end of roleParser has discovered a category over some span of inputFind and advance all previous states that were looking for this category copy state move dot insert in current chart entryGiven NP -gt Det Nominal [13] VP -gt Verb NP [01]Add VP -gt Verb NP [03]

26

Find an S state in the final column that spans from 0 to n+1 and is complete

If thatrsquos the case yoursquore done S ndashgt α [0n+1]

27

More specificallyhellip

1 Predict all the states you can upfront

2 Read a word1 Extend states based on matches2 Add new predictions3 Go to 2

3 Look at N+1 to see if you have a winner

28

Book that flightWe should findhellip an S from 0 to 3 that is a completed statehellip

29

S NP VP VP VS Aux NP VP PP -gt Prep NPNP Det Nom N old | dog | footsteps |

young

NP PropN V dog | include | preferNom -gt Adj Nom Aux doesNom N Prep from | to | on | ofNom N Nom PropN Bush | McCain |

ObamaNom Nom PP Det that | this | a| theVP V NP Adj -gt old | green | red

31

32

33

What kind of algorithms did we just describe Not parsers ndash recognizers

The presence of an S state with the right attributes in the right place indicates a successful recognitionBut no parse treehellip no parserThatrsquos how we solve (not) an exponential problem in polynomial time

34

With the addition of a few pointers we have a parserAugment the ldquoCompleterrdquo to point to where we came from

35

S8S9

S10

S11

S13S12

S8

S9S8

All the possible parses for an input are in the table

We just need to read off all the backpointers from every complete S in the last column of the table

Find all the S -gt X [0N+1]

Follow the structural traces from the Completer

Of course this wonrsquot be polynomial time since there could be an exponential number of trees

We can at least represent ambiguity efficiently

37

Depth-first search will never terminate if grammar is left recursive (eg NP --gt NP PP)

38

)( εαα ⎯rarr⎯ΑΒ⎯rarr⎯Α

Solutions Rewrite the grammar (automatically) to a weakly

equivalent one which is not left-recursiveeg The man on the hill with the telescopehellipNP NP PP (wanted Nom plus a sequence of PPs)NP Nom PPNP NomNom Det NhellipbecomeshellipNP Nom NPrsquoNom Det NNPrsquo PP NPrsquo (wanted a sequence of PPs)NPrsquo e

Not so obvious what these rules meanhellip

Harder to detect and eliminate non-immediate left recursion

NP --gt Nom PPNom --gt NP

Fix depth of search explicitly

Rule ordering non-recursive rules firstNP --gt Det NomNP --gt NP PP

40

Multiple legal structures Attachment (eg I saw a man on a hill with a

telescope) Coordination (eg younger cats and dogs) NP bracketing (eg Spanish language teachers)

41

42

NP vs VP Attachment

Solution Return all possible parses and disambiguate

using ldquoother methodsrdquo

43

Parsing is a search problem which may be implemented with many control strategies Top-Down or Bottom-Up approaches each have

problemsCombining the two solves some but not all issues

Left recursion Syntactic ambiguityNext time Making use of statistical information about syntactic constituents Read Ch 14

44

  • Slide Number 1
  • Announcements
  • Homework Questions
  • Evaluation
  • Syntactic Parsing
  • Syntactic Parsing
  • CFG Example
  • Modified CFG
  • Slide Number 9
  • Parsing as a Form of Search
  • Top-Down Parser
  • Rule Expansion
  • Bottom-Up Parsing
  • Slide Number 14
  • Bottom-up parsing
  • Whatrsquos rightwrong withhellip
  • Some Solutions
  • Earley Parsing
  • States
  • StatesLocations
  • Graphically
  • Earley
  • Earley Algorithm
  • Predictor
  • Scanner
  • Completer
  • How do we know we are done
  • Earley
  • Example
  • CFG for Fragment of English
  • Example
  • Example
  • Example
  • Details
  • Converting Earley from Recognizer to Parser
  • Augmenting the chart with structural information
  • Retrieving Parse Trees from Chart
  • Left Recursion vs Right Recursion
  • Slide Number 39
  • Slide Number 40
  • Another Problem Structural ambiguity
  • Slide Number 42
  • Slide Number 43
  • Summing Up

Top-Down parsers ndash they never explore illegal parses (eg which canrsquot form an S) -- but waste time on trees that can never match the inputBottom-Up parsers ndash they never explore trees inconsistent with input -- but waste time exploring illegal parses (with no S root)For both find a control strategy -- how explore search space efficiently Pursuing all parses in parallel or backtrack or hellip Which rule to apply next Which node to expand next

16

Dynamic Programming Approaches ndash Use a chart to represent partial results

CKY Parsing Algorithm Bottom-up Grammar must be in Normal Form The parse tree might not be consistent with linguistic

theoryEarly Parsing Algorithm Top-down Expectations about constituents are confirmed by input A POS tag for a word that is not predicted is never addedChart Parser

17

Allows arbitrary CFGsFills a table in a single sweep over the input words Table is length N+1 N is number of words Table entries represent

Completed constituents and their locationsIn-progress constituentsPredicted constituents

18

The table-entries are called states and are represented with dotted-rulesS -gt VP A VP is predicted

NP -gt Det Nominal An NP is in progress

VP -gt V NP A VP has been found

19

It would be nice to know where these things are in the input sohellipS -gt VP [00] A VP is predicted at the

start of the sentence

NP -gt Det Nominal [12] An NP is in progress the Det goes from 1 to 2

VP -gt V NP [03] A VP has been found starting at 0 and ending at 3

20

21

As with most dynamic programming approaches the answer is found by looking in the table in the right placeIn this case there should be an S state in the final column that spans from 0 to n+1 and is completeIf thatrsquos the case yoursquore done S ndashgt α [0n+1]

22

March through chart left-to-rightAt each step apply 1 of 3 operators Predictor

Create new states representing top-down expectations Scanner

Match word predictions (rule with word after dot) to words

CompleterWhen a state is complete see what rules were looking for that completed constituent

23

Given a state With a non-terminal to right of dot (not a part-

of-speech category) Create a new state for each expansion of the

non-terminal Place these new states into same chart entry as

generated state beginning and ending where generating state ends So predictor looking at

S -gt VP [00] results in

VP -gt Verb [00]VP -gt Verb NP [00]

24

Given a state With a non-terminal to right of dot that is a part-of-

speech category If the next word in the input matches this POS Create a new state with dot moved over the non-

terminal So scanner looking at VP -gt Verb NP [00] If the next word ldquobookrdquo can be a verb add new state

VP -gt Verb NP [01] Add this state to chart entry following current one Note Earley algorithm uses top-down input to

disambiguate POS Only POS predicted by some state can get added to chart

25

Applied to a state when its dot has reached right end of roleParser has discovered a category over some span of inputFind and advance all previous states that were looking for this category copy state move dot insert in current chart entryGiven NP -gt Det Nominal [13] VP -gt Verb NP [01]Add VP -gt Verb NP [03]

26

Find an S state in the final column that spans from 0 to n+1 and is complete

If thatrsquos the case yoursquore done S ndashgt α [0n+1]

27

More specificallyhellip

1 Predict all the states you can upfront

2 Read a word1 Extend states based on matches2 Add new predictions3 Go to 2

3 Look at N+1 to see if you have a winner

28

Book that flightWe should findhellip an S from 0 to 3 that is a completed statehellip

29

S NP VP VP VS Aux NP VP PP -gt Prep NPNP Det Nom N old | dog | footsteps |

young

NP PropN V dog | include | preferNom -gt Adj Nom Aux doesNom N Prep from | to | on | ofNom N Nom PropN Bush | McCain |

ObamaNom Nom PP Det that | this | a| theVP V NP Adj -gt old | green | red

31

32

33

What kind of algorithms did we just describe Not parsers ndash recognizers

The presence of an S state with the right attributes in the right place indicates a successful recognitionBut no parse treehellip no parserThatrsquos how we solve (not) an exponential problem in polynomial time

34

With the addition of a few pointers we have a parserAugment the ldquoCompleterrdquo to point to where we came from

35

S8S9

S10

S11

S13S12

S8

S9S8

All the possible parses for an input are in the table

We just need to read off all the backpointers from every complete S in the last column of the table

Find all the S -gt X [0N+1]

Follow the structural traces from the Completer

Of course this wonrsquot be polynomial time since there could be an exponential number of trees

We can at least represent ambiguity efficiently

37

Depth-first search will never terminate if grammar is left recursive (eg NP --gt NP PP)

38

)( εαα ⎯rarr⎯ΑΒ⎯rarr⎯Α

Solutions Rewrite the grammar (automatically) to a weakly

equivalent one which is not left-recursiveeg The man on the hill with the telescopehellipNP NP PP (wanted Nom plus a sequence of PPs)NP Nom PPNP NomNom Det NhellipbecomeshellipNP Nom NPrsquoNom Det NNPrsquo PP NPrsquo (wanted a sequence of PPs)NPrsquo e

Not so obvious what these rules meanhellip

Harder to detect and eliminate non-immediate left recursion

NP --gt Nom PPNom --gt NP

Fix depth of search explicitly

Rule ordering non-recursive rules firstNP --gt Det NomNP --gt NP PP

40

Multiple legal structures Attachment (eg I saw a man on a hill with a

telescope) Coordination (eg younger cats and dogs) NP bracketing (eg Spanish language teachers)

41

42

NP vs VP Attachment

Solution Return all possible parses and disambiguate

using ldquoother methodsrdquo

43

Parsing is a search problem which may be implemented with many control strategies Top-Down or Bottom-Up approaches each have

problemsCombining the two solves some but not all issues

Left recursion Syntactic ambiguityNext time Making use of statistical information about syntactic constituents Read Ch 14

44

  • Slide Number 1
  • Announcements
  • Homework Questions
  • Evaluation
  • Syntactic Parsing
  • Syntactic Parsing
  • CFG Example
  • Modified CFG
  • Slide Number 9
  • Parsing as a Form of Search
  • Top-Down Parser
  • Rule Expansion
  • Bottom-Up Parsing
  • Slide Number 14
  • Bottom-up parsing
  • Whatrsquos rightwrong withhellip
  • Some Solutions
  • Earley Parsing
  • States
  • StatesLocations
  • Graphically
  • Earley
  • Earley Algorithm
  • Predictor
  • Scanner
  • Completer
  • How do we know we are done
  • Earley
  • Example
  • CFG for Fragment of English
  • Example
  • Example
  • Example
  • Details
  • Converting Earley from Recognizer to Parser
  • Augmenting the chart with structural information
  • Retrieving Parse Trees from Chart
  • Left Recursion vs Right Recursion
  • Slide Number 39
  • Slide Number 40
  • Another Problem Structural ambiguity
  • Slide Number 42
  • Slide Number 43
  • Summing Up

Dynamic Programming Approaches ndash Use a chart to represent partial results

CKY Parsing Algorithm Bottom-up Grammar must be in Normal Form The parse tree might not be consistent with linguistic

theoryEarly Parsing Algorithm Top-down Expectations about constituents are confirmed by input A POS tag for a word that is not predicted is never addedChart Parser

17

Allows arbitrary CFGsFills a table in a single sweep over the input words Table is length N+1 N is number of words Table entries represent

Completed constituents and their locationsIn-progress constituentsPredicted constituents

18

The table-entries are called states and are represented with dotted-rulesS -gt VP A VP is predicted

NP -gt Det Nominal An NP is in progress

VP -gt V NP A VP has been found

19

It would be nice to know where these things are in the input sohellipS -gt VP [00] A VP is predicted at the

start of the sentence

NP -gt Det Nominal [12] An NP is in progress the Det goes from 1 to 2

VP -gt V NP [03] A VP has been found starting at 0 and ending at 3

20

21

As with most dynamic programming approaches the answer is found by looking in the table in the right placeIn this case there should be an S state in the final column that spans from 0 to n+1 and is completeIf thatrsquos the case yoursquore done S ndashgt α [0n+1]

22

March through chart left-to-rightAt each step apply 1 of 3 operators Predictor

Create new states representing top-down expectations Scanner

Match word predictions (rule with word after dot) to words

CompleterWhen a state is complete see what rules were looking for that completed constituent

23

Given a state With a non-terminal to right of dot (not a part-

of-speech category) Create a new state for each expansion of the

non-terminal Place these new states into same chart entry as

generated state beginning and ending where generating state ends So predictor looking at

S -gt VP [00] results in

VP -gt Verb [00]VP -gt Verb NP [00]

24

Given a state With a non-terminal to right of dot that is a part-of-

speech category If the next word in the input matches this POS Create a new state with dot moved over the non-

terminal So scanner looking at VP -gt Verb NP [00] If the next word ldquobookrdquo can be a verb add new state

VP -gt Verb NP [01] Add this state to chart entry following current one Note Earley algorithm uses top-down input to

disambiguate POS Only POS predicted by some state can get added to chart

25

Applied to a state when its dot has reached right end of roleParser has discovered a category over some span of inputFind and advance all previous states that were looking for this category copy state move dot insert in current chart entryGiven NP -gt Det Nominal [13] VP -gt Verb NP [01]Add VP -gt Verb NP [03]

26

Find an S state in the final column that spans from 0 to n+1 and is complete

If thatrsquos the case yoursquore done S ndashgt α [0n+1]

27

More specificallyhellip

1 Predict all the states you can upfront

2 Read a word1 Extend states based on matches2 Add new predictions3 Go to 2

3 Look at N+1 to see if you have a winner

28

Book that flightWe should findhellip an S from 0 to 3 that is a completed statehellip

29

S NP VP VP VS Aux NP VP PP -gt Prep NPNP Det Nom N old | dog | footsteps |

young

NP PropN V dog | include | preferNom -gt Adj Nom Aux doesNom N Prep from | to | on | ofNom N Nom PropN Bush | McCain |

ObamaNom Nom PP Det that | this | a| theVP V NP Adj -gt old | green | red

31

32

33

What kind of algorithms did we just describe Not parsers ndash recognizers

The presence of an S state with the right attributes in the right place indicates a successful recognitionBut no parse treehellip no parserThatrsquos how we solve (not) an exponential problem in polynomial time

34

With the addition of a few pointers we have a parserAugment the ldquoCompleterrdquo to point to where we came from

35

S8S9

S10

S11

S13S12

S8

S9S8

All the possible parses for an input are in the table

We just need to read off all the backpointers from every complete S in the last column of the table

Find all the S -gt X [0N+1]

Follow the structural traces from the Completer

Of course this wonrsquot be polynomial time since there could be an exponential number of trees

We can at least represent ambiguity efficiently

37

Depth-first search will never terminate if grammar is left recursive (eg NP --gt NP PP)

38

)( εαα ⎯rarr⎯ΑΒ⎯rarr⎯Α

Solutions Rewrite the grammar (automatically) to a weakly

equivalent one which is not left-recursiveeg The man on the hill with the telescopehellipNP NP PP (wanted Nom plus a sequence of PPs)NP Nom PPNP NomNom Det NhellipbecomeshellipNP Nom NPrsquoNom Det NNPrsquo PP NPrsquo (wanted a sequence of PPs)NPrsquo e

Not so obvious what these rules meanhellip

Harder to detect and eliminate non-immediate left recursion

NP --gt Nom PPNom --gt NP

Fix depth of search explicitly

Rule ordering non-recursive rules firstNP --gt Det NomNP --gt NP PP

40

Multiple legal structures Attachment (eg I saw a man on a hill with a

telescope) Coordination (eg younger cats and dogs) NP bracketing (eg Spanish language teachers)

41

42

NP vs VP Attachment

Solution Return all possible parses and disambiguate

using ldquoother methodsrdquo

43

Parsing is a search problem which may be implemented with many control strategies Top-Down or Bottom-Up approaches each have

problemsCombining the two solves some but not all issues

Left recursion Syntactic ambiguityNext time Making use of statistical information about syntactic constituents Read Ch 14

44

  • Slide Number 1
  • Announcements
  • Homework Questions
  • Evaluation
  • Syntactic Parsing
  • Syntactic Parsing
  • CFG Example
  • Modified CFG
  • Slide Number 9
  • Parsing as a Form of Search
  • Top-Down Parser
  • Rule Expansion
  • Bottom-Up Parsing
  • Slide Number 14
  • Bottom-up parsing
  • Whatrsquos rightwrong withhellip
  • Some Solutions
  • Earley Parsing
  • States
  • StatesLocations
  • Graphically
  • Earley
  • Earley Algorithm
  • Predictor
  • Scanner
  • Completer
  • How do we know we are done
  • Earley
  • Example
  • CFG for Fragment of English
  • Example
  • Example
  • Example
  • Details
  • Converting Earley from Recognizer to Parser
  • Augmenting the chart with structural information
  • Retrieving Parse Trees from Chart
  • Left Recursion vs Right Recursion
  • Slide Number 39
  • Slide Number 40
  • Another Problem Structural ambiguity
  • Slide Number 42
  • Slide Number 43
  • Summing Up

Allows arbitrary CFGsFills a table in a single sweep over the input words Table is length N+1 N is number of words Table entries represent

Completed constituents and their locationsIn-progress constituentsPredicted constituents

18

The table-entries are called states and are represented with dotted-rulesS -gt VP A VP is predicted

NP -gt Det Nominal An NP is in progress

VP -gt V NP A VP has been found

19

It would be nice to know where these things are in the input sohellipS -gt VP [00] A VP is predicted at the

start of the sentence

NP -gt Det Nominal [12] An NP is in progress the Det goes from 1 to 2

VP -gt V NP [03] A VP has been found starting at 0 and ending at 3

20

21

As with most dynamic programming approaches the answer is found by looking in the table in the right placeIn this case there should be an S state in the final column that spans from 0 to n+1 and is completeIf thatrsquos the case yoursquore done S ndashgt α [0n+1]

22

March through chart left-to-rightAt each step apply 1 of 3 operators Predictor

Create new states representing top-down expectations Scanner

Match word predictions (rule with word after dot) to words

CompleterWhen a state is complete see what rules were looking for that completed constituent

23

Given a state With a non-terminal to right of dot (not a part-

of-speech category) Create a new state for each expansion of the

non-terminal Place these new states into same chart entry as

generated state beginning and ending where generating state ends So predictor looking at

S -gt VP [00] results in

VP -gt Verb [00]VP -gt Verb NP [00]

24

Given a state With a non-terminal to right of dot that is a part-of-

speech category If the next word in the input matches this POS Create a new state with dot moved over the non-

terminal So scanner looking at VP -gt Verb NP [00] If the next word ldquobookrdquo can be a verb add new state

VP -gt Verb NP [01] Add this state to chart entry following current one Note Earley algorithm uses top-down input to

disambiguate POS Only POS predicted by some state can get added to chart

25

Applied to a state when its dot has reached right end of roleParser has discovered a category over some span of inputFind and advance all previous states that were looking for this category copy state move dot insert in current chart entryGiven NP -gt Det Nominal [13] VP -gt Verb NP [01]Add VP -gt Verb NP [03]

26

Find an S state in the final column that spans from 0 to n+1 and is complete

If thatrsquos the case yoursquore done S ndashgt α [0n+1]

27

More specificallyhellip

1 Predict all the states you can upfront

2 Read a word1 Extend states based on matches2 Add new predictions3 Go to 2

3 Look at N+1 to see if you have a winner

28

Book that flightWe should findhellip an S from 0 to 3 that is a completed statehellip

29

S NP VP VP VS Aux NP VP PP -gt Prep NPNP Det Nom N old | dog | footsteps |

young

NP PropN V dog | include | preferNom -gt Adj Nom Aux doesNom N Prep from | to | on | ofNom N Nom PropN Bush | McCain |

ObamaNom Nom PP Det that | this | a| theVP V NP Adj -gt old | green | red

31

32

33

What kind of algorithms did we just describe Not parsers ndash recognizers

The presence of an S state with the right attributes in the right place indicates a successful recognitionBut no parse treehellip no parserThatrsquos how we solve (not) an exponential problem in polynomial time

34

With the addition of a few pointers we have a parserAugment the ldquoCompleterrdquo to point to where we came from

35

S8S9

S10

S11

S13S12

S8

S9S8

All the possible parses for an input are in the table

We just need to read off all the backpointers from every complete S in the last column of the table

Find all the S -gt X [0N+1]

Follow the structural traces from the Completer

Of course this wonrsquot be polynomial time since there could be an exponential number of trees

We can at least represent ambiguity efficiently

37

Depth-first search will never terminate if grammar is left recursive (eg NP --gt NP PP)

38

)( εαα ⎯rarr⎯ΑΒ⎯rarr⎯Α

Solutions Rewrite the grammar (automatically) to a weakly

equivalent one which is not left-recursiveeg The man on the hill with the telescopehellipNP NP PP (wanted Nom plus a sequence of PPs)NP Nom PPNP NomNom Det NhellipbecomeshellipNP Nom NPrsquoNom Det NNPrsquo PP NPrsquo (wanted a sequence of PPs)NPrsquo e

Not so obvious what these rules meanhellip

Harder to detect and eliminate non-immediate left recursion

NP --gt Nom PPNom --gt NP

Fix depth of search explicitly

Rule ordering non-recursive rules firstNP --gt Det NomNP --gt NP PP

40

Multiple legal structures Attachment (eg I saw a man on a hill with a

telescope) Coordination (eg younger cats and dogs) NP bracketing (eg Spanish language teachers)

41

42

NP vs VP Attachment

Solution Return all possible parses and disambiguate

using ldquoother methodsrdquo

43

Parsing is a search problem which may be implemented with many control strategies Top-Down or Bottom-Up approaches each have

problemsCombining the two solves some but not all issues

Left recursion Syntactic ambiguityNext time Making use of statistical information about syntactic constituents Read Ch 14

44

  • Slide Number 1
  • Announcements
  • Homework Questions
  • Evaluation
  • Syntactic Parsing
  • Syntactic Parsing
  • CFG Example
  • Modified CFG
  • Slide Number 9
  • Parsing as a Form of Search
  • Top-Down Parser
  • Rule Expansion
  • Bottom-Up Parsing
  • Slide Number 14
  • Bottom-up parsing
  • Whatrsquos rightwrong withhellip
  • Some Solutions
  • Earley Parsing
  • States
  • StatesLocations
  • Graphically
  • Earley
  • Earley Algorithm
  • Predictor
  • Scanner
  • Completer
  • How do we know we are done
  • Earley
  • Example
  • CFG for Fragment of English
  • Example
  • Example
  • Example
  • Details
  • Converting Earley from Recognizer to Parser
  • Augmenting the chart with structural information
  • Retrieving Parse Trees from Chart
  • Left Recursion vs Right Recursion
  • Slide Number 39
  • Slide Number 40
  • Another Problem Structural ambiguity
  • Slide Number 42
  • Slide Number 43
  • Summing Up

The table-entries are called states and are represented with dotted-rulesS -gt VP A VP is predicted

NP -gt Det Nominal An NP is in progress

VP -gt V NP A VP has been found

19

It would be nice to know where these things are in the input sohellipS -gt VP [00] A VP is predicted at the

start of the sentence

NP -gt Det Nominal [12] An NP is in progress the Det goes from 1 to 2

VP -gt V NP [03] A VP has been found starting at 0 and ending at 3

20

21

As with most dynamic programming approaches the answer is found by looking in the table in the right placeIn this case there should be an S state in the final column that spans from 0 to n+1 and is completeIf thatrsquos the case yoursquore done S ndashgt α [0n+1]

22

March through chart left-to-rightAt each step apply 1 of 3 operators Predictor

Create new states representing top-down expectations Scanner

Match word predictions (rule with word after dot) to words

CompleterWhen a state is complete see what rules were looking for that completed constituent

23

Given a state With a non-terminal to right of dot (not a part-

of-speech category) Create a new state for each expansion of the

non-terminal Place these new states into same chart entry as

generated state beginning and ending where generating state ends So predictor looking at

S -gt VP [00] results in

VP -gt Verb [00]VP -gt Verb NP [00]

24

Given a state With a non-terminal to right of dot that is a part-of-

speech category If the next word in the input matches this POS Create a new state with dot moved over the non-

terminal So scanner looking at VP -gt Verb NP [00] If the next word ldquobookrdquo can be a verb add new state

VP -gt Verb NP [01] Add this state to chart entry following current one Note Earley algorithm uses top-down input to

disambiguate POS Only POS predicted by some state can get added to chart

25

Applied to a state when its dot has reached right end of roleParser has discovered a category over some span of inputFind and advance all previous states that were looking for this category copy state move dot insert in current chart entryGiven NP -gt Det Nominal [13] VP -gt Verb NP [01]Add VP -gt Verb NP [03]

26

Find an S state in the final column that spans from 0 to n+1 and is complete

If thatrsquos the case yoursquore done S ndashgt α [0n+1]

27

More specificallyhellip

1 Predict all the states you can upfront

2 Read a word1 Extend states based on matches2 Add new predictions3 Go to 2

3 Look at N+1 to see if you have a winner

28

Book that flightWe should findhellip an S from 0 to 3 that is a completed statehellip

29

S NP VP VP VS Aux NP VP PP -gt Prep NPNP Det Nom N old | dog | footsteps |

young

NP PropN V dog | include | preferNom -gt Adj Nom Aux doesNom N Prep from | to | on | ofNom N Nom PropN Bush | McCain |

ObamaNom Nom PP Det that | this | a| theVP V NP Adj -gt old | green | red

31

32

33

What kind of algorithms did we just describe Not parsers ndash recognizers

The presence of an S state with the right attributes in the right place indicates a successful recognitionBut no parse treehellip no parserThatrsquos how we solve (not) an exponential problem in polynomial time

34

With the addition of a few pointers we have a parserAugment the ldquoCompleterrdquo to point to where we came from

35

S8S9

S10

S11

S13S12

S8

S9S8

All the possible parses for an input are in the table

We just need to read off all the backpointers from every complete S in the last column of the table

Find all the S -gt X [0N+1]

Follow the structural traces from the Completer

Of course this wonrsquot be polynomial time since there could be an exponential number of trees

We can at least represent ambiguity efficiently

37

Depth-first search will never terminate if grammar is left recursive (eg NP --gt NP PP)

38

)( εαα ⎯rarr⎯ΑΒ⎯rarr⎯Α

Solutions Rewrite the grammar (automatically) to a weakly

equivalent one which is not left-recursiveeg The man on the hill with the telescopehellipNP NP PP (wanted Nom plus a sequence of PPs)NP Nom PPNP NomNom Det NhellipbecomeshellipNP Nom NPrsquoNom Det NNPrsquo PP NPrsquo (wanted a sequence of PPs)NPrsquo e

Not so obvious what these rules meanhellip

Harder to detect and eliminate non-immediate left recursion

NP --gt Nom PPNom --gt NP

Fix depth of search explicitly

Rule ordering non-recursive rules firstNP --gt Det NomNP --gt NP PP

40

Multiple legal structures Attachment (eg I saw a man on a hill with a

telescope) Coordination (eg younger cats and dogs) NP bracketing (eg Spanish language teachers)

41

42

NP vs VP Attachment

Solution Return all possible parses and disambiguate

using ldquoother methodsrdquo

43

Parsing is a search problem which may be implemented with many control strategies Top-Down or Bottom-Up approaches each have

problemsCombining the two solves some but not all issues

Left recursion Syntactic ambiguityNext time Making use of statistical information about syntactic constituents Read Ch 14

44

  • Slide Number 1
  • Announcements
  • Homework Questions
  • Evaluation
  • Syntactic Parsing
  • Syntactic Parsing
  • CFG Example
  • Modified CFG
  • Slide Number 9
  • Parsing as a Form of Search
  • Top-Down Parser
  • Rule Expansion
  • Bottom-Up Parsing
  • Slide Number 14
  • Bottom-up parsing
  • Whatrsquos rightwrong withhellip
  • Some Solutions
  • Earley Parsing
  • States
  • StatesLocations
  • Graphically
  • Earley
  • Earley Algorithm
  • Predictor
  • Scanner
  • Completer
  • How do we know we are done
  • Earley
  • Example
  • CFG for Fragment of English
  • Example
  • Example
  • Example
  • Details
  • Converting Earley from Recognizer to Parser
  • Augmenting the chart with structural information
  • Retrieving Parse Trees from Chart
  • Left Recursion vs Right Recursion
  • Slide Number 39
  • Slide Number 40
  • Another Problem Structural ambiguity
  • Slide Number 42
  • Slide Number 43
  • Summing Up

It would be nice to know where these things are in the input sohellipS -gt VP [00] A VP is predicted at the

start of the sentence

NP -gt Det Nominal [12] An NP is in progress the Det goes from 1 to 2

VP -gt V NP [03] A VP has been found starting at 0 and ending at 3

20

21

As with most dynamic programming approaches the answer is found by looking in the table in the right placeIn this case there should be an S state in the final column that spans from 0 to n+1 and is completeIf thatrsquos the case yoursquore done S ndashgt α [0n+1]

22

March through chart left-to-rightAt each step apply 1 of 3 operators Predictor

Create new states representing top-down expectations Scanner

Match word predictions (rule with word after dot) to words

CompleterWhen a state is complete see what rules were looking for that completed constituent

23

Given a state With a non-terminal to right of dot (not a part-

of-speech category) Create a new state for each expansion of the

non-terminal Place these new states into same chart entry as

generated state beginning and ending where generating state ends So predictor looking at

S -gt VP [00] results in

VP -gt Verb [00]VP -gt Verb NP [00]

24

Given a state With a non-terminal to right of dot that is a part-of-

speech category If the next word in the input matches this POS Create a new state with dot moved over the non-

terminal So scanner looking at VP -gt Verb NP [00] If the next word ldquobookrdquo can be a verb add new state

VP -gt Verb NP [01] Add this state to chart entry following current one Note Earley algorithm uses top-down input to

disambiguate POS Only POS predicted by some state can get added to chart

25

Applied to a state when its dot has reached right end of roleParser has discovered a category over some span of inputFind and advance all previous states that were looking for this category copy state move dot insert in current chart entryGiven NP -gt Det Nominal [13] VP -gt Verb NP [01]Add VP -gt Verb NP [03]

26

Find an S state in the final column that spans from 0 to n+1 and is complete

If thatrsquos the case yoursquore done S ndashgt α [0n+1]

27

More specificallyhellip

1 Predict all the states you can upfront

2 Read a word1 Extend states based on matches2 Add new predictions3 Go to 2

3 Look at N+1 to see if you have a winner

28

Book that flightWe should findhellip an S from 0 to 3 that is a completed statehellip

29

S NP VP VP VS Aux NP VP PP -gt Prep NPNP Det Nom N old | dog | footsteps |

young

NP PropN V dog | include | preferNom -gt Adj Nom Aux doesNom N Prep from | to | on | ofNom N Nom PropN Bush | McCain |

ObamaNom Nom PP Det that | this | a| theVP V NP Adj -gt old | green | red

31

32

33

What kind of algorithms did we just describe Not parsers ndash recognizers

The presence of an S state with the right attributes in the right place indicates a successful recognitionBut no parse treehellip no parserThatrsquos how we solve (not) an exponential problem in polynomial time

34

With the addition of a few pointers we have a parserAugment the ldquoCompleterrdquo to point to where we came from

35

S8S9

S10

S11

S13S12

S8

S9S8

All the possible parses for an input are in the table

We just need to read off all the backpointers from every complete S in the last column of the table

Find all the S -gt X [0N+1]

Follow the structural traces from the Completer

Of course this wonrsquot be polynomial time since there could be an exponential number of trees

We can at least represent ambiguity efficiently

37

Depth-first search will never terminate if grammar is left recursive (eg NP --gt NP PP)

38

)( εαα ⎯rarr⎯ΑΒ⎯rarr⎯Α

Solutions Rewrite the grammar (automatically) to a weakly

equivalent one which is not left-recursiveeg The man on the hill with the telescopehellipNP NP PP (wanted Nom plus a sequence of PPs)NP Nom PPNP NomNom Det NhellipbecomeshellipNP Nom NPrsquoNom Det NNPrsquo PP NPrsquo (wanted a sequence of PPs)NPrsquo e

Not so obvious what these rules meanhellip

Harder to detect and eliminate non-immediate left recursion

NP --gt Nom PPNom --gt NP

Fix depth of search explicitly

Rule ordering non-recursive rules firstNP --gt Det NomNP --gt NP PP

40

Multiple legal structures Attachment (eg I saw a man on a hill with a

telescope) Coordination (eg younger cats and dogs) NP bracketing (eg Spanish language teachers)

41

42

NP vs VP Attachment

Solution Return all possible parses and disambiguate

using ldquoother methodsrdquo

43

Parsing is a search problem which may be implemented with many control strategies Top-Down or Bottom-Up approaches each have

problemsCombining the two solves some but not all issues

Left recursion Syntactic ambiguityNext time Making use of statistical information about syntactic constituents Read Ch 14

44

  • Slide Number 1
  • Announcements
  • Homework Questions
  • Evaluation
  • Syntactic Parsing
  • Syntactic Parsing
  • CFG Example
  • Modified CFG
  • Slide Number 9
  • Parsing as a Form of Search
  • Top-Down Parser
  • Rule Expansion
  • Bottom-Up Parsing
  • Slide Number 14
  • Bottom-up parsing
  • Whatrsquos rightwrong withhellip
  • Some Solutions
  • Earley Parsing
  • States
  • StatesLocations
  • Graphically
  • Earley
  • Earley Algorithm
  • Predictor
  • Scanner
  • Completer
  • How do we know we are done
  • Earley
  • Example
  • CFG for Fragment of English
  • Example
  • Example
  • Example
  • Details
  • Converting Earley from Recognizer to Parser
  • Augmenting the chart with structural information
  • Retrieving Parse Trees from Chart
  • Left Recursion vs Right Recursion
  • Slide Number 39
  • Slide Number 40
  • Another Problem Structural ambiguity
  • Slide Number 42
  • Slide Number 43
  • Summing Up

21

As with most dynamic programming approaches the answer is found by looking in the table in the right placeIn this case there should be an S state in the final column that spans from 0 to n+1 and is completeIf thatrsquos the case yoursquore done S ndashgt α [0n+1]

22

March through chart left-to-rightAt each step apply 1 of 3 operators Predictor

Create new states representing top-down expectations Scanner

Match word predictions (rule with word after dot) to words

CompleterWhen a state is complete see what rules were looking for that completed constituent

23

Given a state With a non-terminal to right of dot (not a part-

of-speech category) Create a new state for each expansion of the

non-terminal Place these new states into same chart entry as

generated state beginning and ending where generating state ends So predictor looking at

S -gt VP [00] results in

VP -gt Verb [00]VP -gt Verb NP [00]

24

Given a state With a non-terminal to right of dot that is a part-of-

speech category If the next word in the input matches this POS Create a new state with dot moved over the non-

terminal So scanner looking at VP -gt Verb NP [00] If the next word ldquobookrdquo can be a verb add new state

VP -gt Verb NP [01] Add this state to chart entry following current one Note Earley algorithm uses top-down input to

disambiguate POS Only POS predicted by some state can get added to chart

25

Applied to a state when its dot has reached right end of roleParser has discovered a category over some span of inputFind and advance all previous states that were looking for this category copy state move dot insert in current chart entryGiven NP -gt Det Nominal [13] VP -gt Verb NP [01]Add VP -gt Verb NP [03]

26

Find an S state in the final column that spans from 0 to n+1 and is complete

If thatrsquos the case yoursquore done S ndashgt α [0n+1]

27

More specificallyhellip

1 Predict all the states you can upfront

2 Read a word1 Extend states based on matches2 Add new predictions3 Go to 2

3 Look at N+1 to see if you have a winner

28

Book that flightWe should findhellip an S from 0 to 3 that is a completed statehellip

29

S NP VP VP VS Aux NP VP PP -gt Prep NPNP Det Nom N old | dog | footsteps |

young

NP PropN V dog | include | preferNom -gt Adj Nom Aux doesNom N Prep from | to | on | ofNom N Nom PropN Bush | McCain |

ObamaNom Nom PP Det that | this | a| theVP V NP Adj -gt old | green | red

31

32

33

What kind of algorithms did we just describe Not parsers ndash recognizers

The presence of an S state with the right attributes in the right place indicates a successful recognitionBut no parse treehellip no parserThatrsquos how we solve (not) an exponential problem in polynomial time

34

With the addition of a few pointers we have a parserAugment the ldquoCompleterrdquo to point to where we came from

35

S8S9

S10

S11

S13S12

S8

S9S8

All the possible parses for an input are in the table

We just need to read off all the backpointers from every complete S in the last column of the table

Find all the S -gt X [0N+1]

Follow the structural traces from the Completer

Of course this wonrsquot be polynomial time since there could be an exponential number of trees

We can at least represent ambiguity efficiently

37

Depth-first search will never terminate if grammar is left recursive (eg NP --gt NP PP)

38

)( εαα ⎯rarr⎯ΑΒ⎯rarr⎯Α

Solutions Rewrite the grammar (automatically) to a weakly

equivalent one which is not left-recursiveeg The man on the hill with the telescopehellipNP NP PP (wanted Nom plus a sequence of PPs)NP Nom PPNP NomNom Det NhellipbecomeshellipNP Nom NPrsquoNom Det NNPrsquo PP NPrsquo (wanted a sequence of PPs)NPrsquo e

Not so obvious what these rules meanhellip

Harder to detect and eliminate non-immediate left recursion

NP --gt Nom PPNom --gt NP

Fix depth of search explicitly

Rule ordering non-recursive rules firstNP --gt Det NomNP --gt NP PP

40

Multiple legal structures Attachment (eg I saw a man on a hill with a

telescope) Coordination (eg younger cats and dogs) NP bracketing (eg Spanish language teachers)

41

42

NP vs VP Attachment

Solution Return all possible parses and disambiguate

using ldquoother methodsrdquo

43

Parsing is a search problem which may be implemented with many control strategies Top-Down or Bottom-Up approaches each have

problemsCombining the two solves some but not all issues

Left recursion Syntactic ambiguityNext time Making use of statistical information about syntactic constituents Read Ch 14

44

  • Slide Number 1
  • Announcements
  • Homework Questions
  • Evaluation
  • Syntactic Parsing
  • Syntactic Parsing
  • CFG Example
  • Modified CFG
  • Slide Number 9
  • Parsing as a Form of Search
  • Top-Down Parser
  • Rule Expansion
  • Bottom-Up Parsing
  • Slide Number 14
  • Bottom-up parsing
  • Whatrsquos rightwrong withhellip
  • Some Solutions
  • Earley Parsing
  • States
  • StatesLocations
  • Graphically
  • Earley
  • Earley Algorithm
  • Predictor
  • Scanner
  • Completer
  • How do we know we are done
  • Earley
  • Example
  • CFG for Fragment of English
  • Example
  • Example
  • Example
  • Details
  • Converting Earley from Recognizer to Parser
  • Augmenting the chart with structural information
  • Retrieving Parse Trees from Chart
  • Left Recursion vs Right Recursion
  • Slide Number 39
  • Slide Number 40
  • Another Problem Structural ambiguity
  • Slide Number 42
  • Slide Number 43
  • Summing Up

As with most dynamic programming approaches the answer is found by looking in the table in the right placeIn this case there should be an S state in the final column that spans from 0 to n+1 and is completeIf thatrsquos the case yoursquore done S ndashgt α [0n+1]

22

March through chart left-to-rightAt each step apply 1 of 3 operators Predictor

Create new states representing top-down expectations Scanner

Match word predictions (rule with word after dot) to words

CompleterWhen a state is complete see what rules were looking for that completed constituent

23

Given a state With a non-terminal to right of dot (not a part-

of-speech category) Create a new state for each expansion of the

non-terminal Place these new states into same chart entry as

generated state beginning and ending where generating state ends So predictor looking at

S -gt VP [00] results in

VP -gt Verb [00]VP -gt Verb NP [00]

24

Given a state With a non-terminal to right of dot that is a part-of-

speech category If the next word in the input matches this POS Create a new state with dot moved over the non-

terminal So scanner looking at VP -gt Verb NP [00] If the next word ldquobookrdquo can be a verb add new state

VP -gt Verb NP [01] Add this state to chart entry following current one Note Earley algorithm uses top-down input to

disambiguate POS Only POS predicted by some state can get added to chart

25

Applied to a state when its dot has reached right end of roleParser has discovered a category over some span of inputFind and advance all previous states that were looking for this category copy state move dot insert in current chart entryGiven NP -gt Det Nominal [13] VP -gt Verb NP [01]Add VP -gt Verb NP [03]

26

Find an S state in the final column that spans from 0 to n+1 and is complete

If thatrsquos the case yoursquore done S ndashgt α [0n+1]

27

More specificallyhellip

1 Predict all the states you can upfront

2 Read a word1 Extend states based on matches2 Add new predictions3 Go to 2

3 Look at N+1 to see if you have a winner

28

Book that flightWe should findhellip an S from 0 to 3 that is a completed statehellip

29

S NP VP VP VS Aux NP VP PP -gt Prep NPNP Det Nom N old | dog | footsteps |

young

NP PropN V dog | include | preferNom -gt Adj Nom Aux doesNom N Prep from | to | on | ofNom N Nom PropN Bush | McCain |

ObamaNom Nom PP Det that | this | a| theVP V NP Adj -gt old | green | red

31

32

33

What kind of algorithms did we just describe Not parsers ndash recognizers

The presence of an S state with the right attributes in the right place indicates a successful recognitionBut no parse treehellip no parserThatrsquos how we solve (not) an exponential problem in polynomial time

34

With the addition of a few pointers we have a parserAugment the ldquoCompleterrdquo to point to where we came from

35

S8S9

S10

S11

S13S12

S8

S9S8

All the possible parses for an input are in the table

We just need to read off all the backpointers from every complete S in the last column of the table

Find all the S -gt X [0N+1]

Follow the structural traces from the Completer

Of course this wonrsquot be polynomial time since there could be an exponential number of trees

We can at least represent ambiguity efficiently

37

Depth-first search will never terminate if grammar is left recursive (eg NP --gt NP PP)

38

)( εαα ⎯rarr⎯ΑΒ⎯rarr⎯Α

Solutions Rewrite the grammar (automatically) to a weakly

equivalent one which is not left-recursiveeg The man on the hill with the telescopehellipNP NP PP (wanted Nom plus a sequence of PPs)NP Nom PPNP NomNom Det NhellipbecomeshellipNP Nom NPrsquoNom Det NNPrsquo PP NPrsquo (wanted a sequence of PPs)NPrsquo e

Not so obvious what these rules meanhellip

Harder to detect and eliminate non-immediate left recursion

NP --gt Nom PPNom --gt NP

Fix depth of search explicitly

Rule ordering non-recursive rules firstNP --gt Det NomNP --gt NP PP

40

Multiple legal structures Attachment (eg I saw a man on a hill with a

telescope) Coordination (eg younger cats and dogs) NP bracketing (eg Spanish language teachers)

41

42

NP vs VP Attachment

Solution Return all possible parses and disambiguate

using ldquoother methodsrdquo

43

Parsing is a search problem which may be implemented with many control strategies Top-Down or Bottom-Up approaches each have

problemsCombining the two solves some but not all issues

Left recursion Syntactic ambiguityNext time Making use of statistical information about syntactic constituents Read Ch 14

44

  • Slide Number 1
  • Announcements
  • Homework Questions
  • Evaluation
  • Syntactic Parsing
  • Syntactic Parsing
  • CFG Example
  • Modified CFG
  • Slide Number 9
  • Parsing as a Form of Search
  • Top-Down Parser
  • Rule Expansion
  • Bottom-Up Parsing
  • Slide Number 14
  • Bottom-up parsing
  • Whatrsquos rightwrong withhellip
  • Some Solutions
  • Earley Parsing
  • States
  • StatesLocations
  • Graphically
  • Earley
  • Earley Algorithm
  • Predictor
  • Scanner
  • Completer
  • How do we know we are done
  • Earley
  • Example
  • CFG for Fragment of English
  • Example
  • Example
  • Example
  • Details
  • Converting Earley from Recognizer to Parser
  • Augmenting the chart with structural information
  • Retrieving Parse Trees from Chart
  • Left Recursion vs Right Recursion
  • Slide Number 39
  • Slide Number 40
  • Another Problem Structural ambiguity
  • Slide Number 42
  • Slide Number 43
  • Summing Up

March through chart left-to-rightAt each step apply 1 of 3 operators Predictor

Create new states representing top-down expectations Scanner

Match word predictions (rule with word after dot) to words

CompleterWhen a state is complete see what rules were looking for that completed constituent

23

Given a state With a non-terminal to right of dot (not a part-

of-speech category) Create a new state for each expansion of the

non-terminal Place these new states into same chart entry as

generated state beginning and ending where generating state ends So predictor looking at

S -gt VP [00] results in

VP -gt Verb [00]VP -gt Verb NP [00]

24

Given a state With a non-terminal to right of dot that is a part-of-

speech category If the next word in the input matches this POS Create a new state with dot moved over the non-

terminal So scanner looking at VP -gt Verb NP [00] If the next word ldquobookrdquo can be a verb add new state

VP -gt Verb NP [01] Add this state to chart entry following current one Note Earley algorithm uses top-down input to

disambiguate POS Only POS predicted by some state can get added to chart

25

Applied to a state when its dot has reached right end of roleParser has discovered a category over some span of inputFind and advance all previous states that were looking for this category copy state move dot insert in current chart entryGiven NP -gt Det Nominal [13] VP -gt Verb NP [01]Add VP -gt Verb NP [03]

26

Find an S state in the final column that spans from 0 to n+1 and is complete

If thatrsquos the case yoursquore done S ndashgt α [0n+1]

27

More specificallyhellip

1 Predict all the states you can upfront

2 Read a word1 Extend states based on matches2 Add new predictions3 Go to 2

3 Look at N+1 to see if you have a winner

28

Book that flightWe should findhellip an S from 0 to 3 that is a completed statehellip

29

S NP VP VP VS Aux NP VP PP -gt Prep NPNP Det Nom N old | dog | footsteps |

young

NP PropN V dog | include | preferNom -gt Adj Nom Aux doesNom N Prep from | to | on | ofNom N Nom PropN Bush | McCain |

ObamaNom Nom PP Det that | this | a| theVP V NP Adj -gt old | green | red

31

32

33

What kind of algorithms did we just describe Not parsers ndash recognizers

The presence of an S state with the right attributes in the right place indicates a successful recognitionBut no parse treehellip no parserThatrsquos how we solve (not) an exponential problem in polynomial time

34

With the addition of a few pointers we have a parserAugment the ldquoCompleterrdquo to point to where we came from

35

S8S9

S10

S11

S13S12

S8

S9S8

All the possible parses for an input are in the table

We just need to read off all the backpointers from every complete S in the last column of the table

Find all the S -gt X [0N+1]

Follow the structural traces from the Completer

Of course this wonrsquot be polynomial time since there could be an exponential number of trees

We can at least represent ambiguity efficiently

37

Depth-first search will never terminate if grammar is left recursive (eg NP --gt NP PP)

38

)( εαα ⎯rarr⎯ΑΒ⎯rarr⎯Α

Solutions Rewrite the grammar (automatically) to a weakly

equivalent one which is not left-recursiveeg The man on the hill with the telescopehellipNP NP PP (wanted Nom plus a sequence of PPs)NP Nom PPNP NomNom Det NhellipbecomeshellipNP Nom NPrsquoNom Det NNPrsquo PP NPrsquo (wanted a sequence of PPs)NPrsquo e

Not so obvious what these rules meanhellip

Harder to detect and eliminate non-immediate left recursion

NP --gt Nom PPNom --gt NP

Fix depth of search explicitly

Rule ordering non-recursive rules firstNP --gt Det NomNP --gt NP PP

40

Multiple legal structures Attachment (eg I saw a man on a hill with a

telescope) Coordination (eg younger cats and dogs) NP bracketing (eg Spanish language teachers)

41

42

NP vs VP Attachment

Solution Return all possible parses and disambiguate

using ldquoother methodsrdquo

43

Parsing is a search problem which may be implemented with many control strategies Top-Down or Bottom-Up approaches each have

problemsCombining the two solves some but not all issues

Left recursion Syntactic ambiguityNext time Making use of statistical information about syntactic constituents Read Ch 14

44

  • Slide Number 1
  • Announcements
  • Homework Questions
  • Evaluation
  • Syntactic Parsing
  • Syntactic Parsing
  • CFG Example
  • Modified CFG
  • Slide Number 9
  • Parsing as a Form of Search
  • Top-Down Parser
  • Rule Expansion
  • Bottom-Up Parsing
  • Slide Number 14
  • Bottom-up parsing
  • Whatrsquos rightwrong withhellip
  • Some Solutions
  • Earley Parsing
  • States
  • StatesLocations
  • Graphically
  • Earley
  • Earley Algorithm
  • Predictor
  • Scanner
  • Completer
  • How do we know we are done
  • Earley
  • Example
  • CFG for Fragment of English
  • Example
  • Example
  • Example
  • Details
  • Converting Earley from Recognizer to Parser
  • Augmenting the chart with structural information
  • Retrieving Parse Trees from Chart
  • Left Recursion vs Right Recursion
  • Slide Number 39
  • Slide Number 40
  • Another Problem Structural ambiguity
  • Slide Number 42
  • Slide Number 43
  • Summing Up

Given a state With a non-terminal to right of dot (not a part-

of-speech category) Create a new state for each expansion of the

non-terminal Place these new states into same chart entry as

generated state beginning and ending where generating state ends So predictor looking at

S -gt VP [00] results in

VP -gt Verb [00]VP -gt Verb NP [00]

24

Given a state With a non-terminal to right of dot that is a part-of-

speech category If the next word in the input matches this POS Create a new state with dot moved over the non-

terminal So scanner looking at VP -gt Verb NP [00] If the next word ldquobookrdquo can be a verb add new state

VP -gt Verb NP [01] Add this state to chart entry following current one Note Earley algorithm uses top-down input to

disambiguate POS Only POS predicted by some state can get added to chart

25

Applied to a state when its dot has reached right end of roleParser has discovered a category over some span of inputFind and advance all previous states that were looking for this category copy state move dot insert in current chart entryGiven NP -gt Det Nominal [13] VP -gt Verb NP [01]Add VP -gt Verb NP [03]

26

Find an S state in the final column that spans from 0 to n+1 and is complete

If thatrsquos the case yoursquore done S ndashgt α [0n+1]

27

More specificallyhellip

1 Predict all the states you can upfront

2 Read a word1 Extend states based on matches2 Add new predictions3 Go to 2

3 Look at N+1 to see if you have a winner

28

Book that flightWe should findhellip an S from 0 to 3 that is a completed statehellip

29

S NP VP VP VS Aux NP VP PP -gt Prep NPNP Det Nom N old | dog | footsteps |

young

NP PropN V dog | include | preferNom -gt Adj Nom Aux doesNom N Prep from | to | on | ofNom N Nom PropN Bush | McCain |

ObamaNom Nom PP Det that | this | a| theVP V NP Adj -gt old | green | red

31

32

33

What kind of algorithms did we just describe Not parsers ndash recognizers

The presence of an S state with the right attributes in the right place indicates a successful recognitionBut no parse treehellip no parserThatrsquos how we solve (not) an exponential problem in polynomial time

34

With the addition of a few pointers we have a parserAugment the ldquoCompleterrdquo to point to where we came from

35

S8S9

S10

S11

S13S12

S8

S9S8

All the possible parses for an input are in the table

We just need to read off all the backpointers from every complete S in the last column of the table

Find all the S -gt X [0N+1]

Follow the structural traces from the Completer

Of course this wonrsquot be polynomial time since there could be an exponential number of trees

We can at least represent ambiguity efficiently

37

Depth-first search will never terminate if grammar is left recursive (eg NP --gt NP PP)

38

)( εαα ⎯rarr⎯ΑΒ⎯rarr⎯Α

Solutions Rewrite the grammar (automatically) to a weakly

equivalent one which is not left-recursiveeg The man on the hill with the telescopehellipNP NP PP (wanted Nom plus a sequence of PPs)NP Nom PPNP NomNom Det NhellipbecomeshellipNP Nom NPrsquoNom Det NNPrsquo PP NPrsquo (wanted a sequence of PPs)NPrsquo e

Not so obvious what these rules meanhellip

Harder to detect and eliminate non-immediate left recursion

NP --gt Nom PPNom --gt NP

Fix depth of search explicitly

Rule ordering non-recursive rules firstNP --gt Det NomNP --gt NP PP

40

Multiple legal structures Attachment (eg I saw a man on a hill with a

telescope) Coordination (eg younger cats and dogs) NP bracketing (eg Spanish language teachers)

41

42

NP vs VP Attachment

Solution Return all possible parses and disambiguate

using ldquoother methodsrdquo

43

Parsing is a search problem which may be implemented with many control strategies Top-Down or Bottom-Up approaches each have

problemsCombining the two solves some but not all issues

Left recursion Syntactic ambiguityNext time Making use of statistical information about syntactic constituents Read Ch 14

44

  • Slide Number 1
  • Announcements
  • Homework Questions
  • Evaluation
  • Syntactic Parsing
  • Syntactic Parsing
  • CFG Example
  • Modified CFG
  • Slide Number 9
  • Parsing as a Form of Search
  • Top-Down Parser
  • Rule Expansion
  • Bottom-Up Parsing
  • Slide Number 14
  • Bottom-up parsing
  • Whatrsquos rightwrong withhellip
  • Some Solutions
  • Earley Parsing
  • States
  • StatesLocations
  • Graphically
  • Earley
  • Earley Algorithm
  • Predictor
  • Scanner
  • Completer
  • How do we know we are done
  • Earley
  • Example
  • CFG for Fragment of English
  • Example
  • Example
  • Example
  • Details
  • Converting Earley from Recognizer to Parser
  • Augmenting the chart with structural information
  • Retrieving Parse Trees from Chart
  • Left Recursion vs Right Recursion
  • Slide Number 39
  • Slide Number 40
  • Another Problem Structural ambiguity
  • Slide Number 42
  • Slide Number 43
  • Summing Up

Given a state With a non-terminal to right of dot that is a part-of-

speech category If the next word in the input matches this POS Create a new state with dot moved over the non-

terminal So scanner looking at VP -gt Verb NP [00] If the next word ldquobookrdquo can be a verb add new state

VP -gt Verb NP [01] Add this state to chart entry following current one Note Earley algorithm uses top-down input to

disambiguate POS Only POS predicted by some state can get added to chart

25

Applied to a state when its dot has reached right end of roleParser has discovered a category over some span of inputFind and advance all previous states that were looking for this category copy state move dot insert in current chart entryGiven NP -gt Det Nominal [13] VP -gt Verb NP [01]Add VP -gt Verb NP [03]

26

Find an S state in the final column that spans from 0 to n+1 and is complete

If thatrsquos the case yoursquore done S ndashgt α [0n+1]

27

More specificallyhellip

1 Predict all the states you can upfront

2 Read a word1 Extend states based on matches2 Add new predictions3 Go to 2

3 Look at N+1 to see if you have a winner

28

Book that flightWe should findhellip an S from 0 to 3 that is a completed statehellip

29

S NP VP VP VS Aux NP VP PP -gt Prep NPNP Det Nom N old | dog | footsteps |

young

NP PropN V dog | include | preferNom -gt Adj Nom Aux doesNom N Prep from | to | on | ofNom N Nom PropN Bush | McCain |

ObamaNom Nom PP Det that | this | a| theVP V NP Adj -gt old | green | red

31

32

33

What kind of algorithms did we just describe Not parsers ndash recognizers

The presence of an S state with the right attributes in the right place indicates a successful recognitionBut no parse treehellip no parserThatrsquos how we solve (not) an exponential problem in polynomial time

34

With the addition of a few pointers we have a parserAugment the ldquoCompleterrdquo to point to where we came from

35

S8S9

S10

S11

S13S12

S8

S9S8

All the possible parses for an input are in the table

We just need to read off all the backpointers from every complete S in the last column of the table

Find all the S -gt X [0N+1]

Follow the structural traces from the Completer

Of course this wonrsquot be polynomial time since there could be an exponential number of trees

We can at least represent ambiguity efficiently

37

Depth-first search will never terminate if grammar is left recursive (eg NP --gt NP PP)

38

)( εαα ⎯rarr⎯ΑΒ⎯rarr⎯Α

Solutions Rewrite the grammar (automatically) to a weakly

equivalent one which is not left-recursiveeg The man on the hill with the telescopehellipNP NP PP (wanted Nom plus a sequence of PPs)NP Nom PPNP NomNom Det NhellipbecomeshellipNP Nom NPrsquoNom Det NNPrsquo PP NPrsquo (wanted a sequence of PPs)NPrsquo e

Not so obvious what these rules meanhellip

Harder to detect and eliminate non-immediate left recursion

NP --gt Nom PPNom --gt NP

Fix depth of search explicitly

Rule ordering non-recursive rules firstNP --gt Det NomNP --gt NP PP

40

Multiple legal structures Attachment (eg I saw a man on a hill with a

telescope) Coordination (eg younger cats and dogs) NP bracketing (eg Spanish language teachers)

41

42

NP vs VP Attachment

Solution Return all possible parses and disambiguate

using ldquoother methodsrdquo

43

Parsing is a search problem which may be implemented with many control strategies Top-Down or Bottom-Up approaches each have

problemsCombining the two solves some but not all issues

Left recursion Syntactic ambiguityNext time Making use of statistical information about syntactic constituents Read Ch 14

44

  • Slide Number 1
  • Announcements
  • Homework Questions
  • Evaluation
  • Syntactic Parsing
  • Syntactic Parsing
  • CFG Example
  • Modified CFG
  • Slide Number 9
  • Parsing as a Form of Search
  • Top-Down Parser
  • Rule Expansion
  • Bottom-Up Parsing
  • Slide Number 14
  • Bottom-up parsing
  • Whatrsquos rightwrong withhellip
  • Some Solutions
  • Earley Parsing
  • States
  • StatesLocations
  • Graphically
  • Earley
  • Earley Algorithm
  • Predictor
  • Scanner
  • Completer
  • How do we know we are done
  • Earley
  • Example
  • CFG for Fragment of English
  • Example
  • Example
  • Example
  • Details
  • Converting Earley from Recognizer to Parser
  • Augmenting the chart with structural information
  • Retrieving Parse Trees from Chart
  • Left Recursion vs Right Recursion
  • Slide Number 39
  • Slide Number 40
  • Another Problem Structural ambiguity
  • Slide Number 42
  • Slide Number 43
  • Summing Up

Applied to a state when its dot has reached right end of roleParser has discovered a category over some span of inputFind and advance all previous states that were looking for this category copy state move dot insert in current chart entryGiven NP -gt Det Nominal [13] VP -gt Verb NP [01]Add VP -gt Verb NP [03]

26

Find an S state in the final column that spans from 0 to n+1 and is complete

If thatrsquos the case yoursquore done S ndashgt α [0n+1]

27

More specificallyhellip

1 Predict all the states you can upfront

2 Read a word1 Extend states based on matches2 Add new predictions3 Go to 2

3 Look at N+1 to see if you have a winner

28

Book that flightWe should findhellip an S from 0 to 3 that is a completed statehellip

29

S NP VP VP VS Aux NP VP PP -gt Prep NPNP Det Nom N old | dog | footsteps |

young

NP PropN V dog | include | preferNom -gt Adj Nom Aux doesNom N Prep from | to | on | ofNom N Nom PropN Bush | McCain |

ObamaNom Nom PP Det that | this | a| theVP V NP Adj -gt old | green | red

31

32

33

What kind of algorithms did we just describe Not parsers ndash recognizers

The presence of an S state with the right attributes in the right place indicates a successful recognitionBut no parse treehellip no parserThatrsquos how we solve (not) an exponential problem in polynomial time

34

With the addition of a few pointers we have a parserAugment the ldquoCompleterrdquo to point to where we came from

35

S8S9

S10

S11

S13S12

S8

S9S8

All the possible parses for an input are in the table

We just need to read off all the backpointers from every complete S in the last column of the table

Find all the S -gt X [0N+1]

Follow the structural traces from the Completer

Of course this wonrsquot be polynomial time since there could be an exponential number of trees

We can at least represent ambiguity efficiently

37

Depth-first search will never terminate if grammar is left recursive (eg NP --gt NP PP)

38

)( εαα ⎯rarr⎯ΑΒ⎯rarr⎯Α

Solutions Rewrite the grammar (automatically) to a weakly

equivalent one which is not left-recursiveeg The man on the hill with the telescopehellipNP NP PP (wanted Nom plus a sequence of PPs)NP Nom PPNP NomNom Det NhellipbecomeshellipNP Nom NPrsquoNom Det NNPrsquo PP NPrsquo (wanted a sequence of PPs)NPrsquo e

Not so obvious what these rules meanhellip

Harder to detect and eliminate non-immediate left recursion

NP --gt Nom PPNom --gt NP

Fix depth of search explicitly

Rule ordering non-recursive rules firstNP --gt Det NomNP --gt NP PP

40

Multiple legal structures Attachment (eg I saw a man on a hill with a

telescope) Coordination (eg younger cats and dogs) NP bracketing (eg Spanish language teachers)

41

42

NP vs VP Attachment

Solution Return all possible parses and disambiguate

using ldquoother methodsrdquo

43

Parsing is a search problem which may be implemented with many control strategies Top-Down or Bottom-Up approaches each have

problemsCombining the two solves some but not all issues

Left recursion Syntactic ambiguityNext time Making use of statistical information about syntactic constituents Read Ch 14

44

  • Slide Number 1
  • Announcements
  • Homework Questions
  • Evaluation
  • Syntactic Parsing
  • Syntactic Parsing
  • CFG Example
  • Modified CFG
  • Slide Number 9
  • Parsing as a Form of Search
  • Top-Down Parser
  • Rule Expansion
  • Bottom-Up Parsing
  • Slide Number 14
  • Bottom-up parsing
  • Whatrsquos rightwrong withhellip
  • Some Solutions
  • Earley Parsing
  • States
  • StatesLocations
  • Graphically
  • Earley
  • Earley Algorithm
  • Predictor
  • Scanner
  • Completer
  • How do we know we are done
  • Earley
  • Example
  • CFG for Fragment of English
  • Example
  • Example
  • Example
  • Details
  • Converting Earley from Recognizer to Parser
  • Augmenting the chart with structural information
  • Retrieving Parse Trees from Chart
  • Left Recursion vs Right Recursion
  • Slide Number 39
  • Slide Number 40
  • Another Problem Structural ambiguity
  • Slide Number 42
  • Slide Number 43
  • Summing Up

Find an S state in the final column that spans from 0 to n+1 and is complete

If thatrsquos the case yoursquore done S ndashgt α [0n+1]

27

More specificallyhellip

1 Predict all the states you can upfront

2 Read a word1 Extend states based on matches2 Add new predictions3 Go to 2

3 Look at N+1 to see if you have a winner

28

Book that flightWe should findhellip an S from 0 to 3 that is a completed statehellip

29

S NP VP VP VS Aux NP VP PP -gt Prep NPNP Det Nom N old | dog | footsteps |

young

NP PropN V dog | include | preferNom -gt Adj Nom Aux doesNom N Prep from | to | on | ofNom N Nom PropN Bush | McCain |

ObamaNom Nom PP Det that | this | a| theVP V NP Adj -gt old | green | red

31

32

33

What kind of algorithms did we just describe Not parsers ndash recognizers

The presence of an S state with the right attributes in the right place indicates a successful recognitionBut no parse treehellip no parserThatrsquos how we solve (not) an exponential problem in polynomial time

34

With the addition of a few pointers we have a parserAugment the ldquoCompleterrdquo to point to where we came from

35

S8S9

S10

S11

S13S12

S8

S9S8

All the possible parses for an input are in the table

We just need to read off all the backpointers from every complete S in the last column of the table

Find all the S -gt X [0N+1]

Follow the structural traces from the Completer

Of course this wonrsquot be polynomial time since there could be an exponential number of trees

We can at least represent ambiguity efficiently

37

Depth-first search will never terminate if grammar is left recursive (eg NP --gt NP PP)

38

)( εαα ⎯rarr⎯ΑΒ⎯rarr⎯Α

Solutions Rewrite the grammar (automatically) to a weakly

equivalent one which is not left-recursiveeg The man on the hill with the telescopehellipNP NP PP (wanted Nom plus a sequence of PPs)NP Nom PPNP NomNom Det NhellipbecomeshellipNP Nom NPrsquoNom Det NNPrsquo PP NPrsquo (wanted a sequence of PPs)NPrsquo e

Not so obvious what these rules meanhellip

Harder to detect and eliminate non-immediate left recursion

NP --gt Nom PPNom --gt NP

Fix depth of search explicitly

Rule ordering non-recursive rules firstNP --gt Det NomNP --gt NP PP

40

Multiple legal structures Attachment (eg I saw a man on a hill with a

telescope) Coordination (eg younger cats and dogs) NP bracketing (eg Spanish language teachers)

41

42

NP vs VP Attachment

Solution Return all possible parses and disambiguate

using ldquoother methodsrdquo

43

Parsing is a search problem which may be implemented with many control strategies Top-Down or Bottom-Up approaches each have

problemsCombining the two solves some but not all issues

Left recursion Syntactic ambiguityNext time Making use of statistical information about syntactic constituents Read Ch 14

44

  • Slide Number 1
  • Announcements
  • Homework Questions
  • Evaluation
  • Syntactic Parsing
  • Syntactic Parsing
  • CFG Example
  • Modified CFG
  • Slide Number 9
  • Parsing as a Form of Search
  • Top-Down Parser
  • Rule Expansion
  • Bottom-Up Parsing
  • Slide Number 14
  • Bottom-up parsing
  • Whatrsquos rightwrong withhellip
  • Some Solutions
  • Earley Parsing
  • States
  • StatesLocations
  • Graphically
  • Earley
  • Earley Algorithm
  • Predictor
  • Scanner
  • Completer
  • How do we know we are done
  • Earley
  • Example
  • CFG for Fragment of English
  • Example
  • Example
  • Example
  • Details
  • Converting Earley from Recognizer to Parser
  • Augmenting the chart with structural information
  • Retrieving Parse Trees from Chart
  • Left Recursion vs Right Recursion
  • Slide Number 39
  • Slide Number 40
  • Another Problem Structural ambiguity
  • Slide Number 42
  • Slide Number 43
  • Summing Up

More specificallyhellip

1 Predict all the states you can upfront

2 Read a word1 Extend states based on matches2 Add new predictions3 Go to 2

3 Look at N+1 to see if you have a winner

28

Book that flightWe should findhellip an S from 0 to 3 that is a completed statehellip

29

S NP VP VP VS Aux NP VP PP -gt Prep NPNP Det Nom N old | dog | footsteps |

young

NP PropN V dog | include | preferNom -gt Adj Nom Aux doesNom N Prep from | to | on | ofNom N Nom PropN Bush | McCain |

ObamaNom Nom PP Det that | this | a| theVP V NP Adj -gt old | green | red

31

32

33

What kind of algorithms did we just describe Not parsers ndash recognizers

The presence of an S state with the right attributes in the right place indicates a successful recognitionBut no parse treehellip no parserThatrsquos how we solve (not) an exponential problem in polynomial time

34

With the addition of a few pointers we have a parserAugment the ldquoCompleterrdquo to point to where we came from

35

S8S9

S10

S11

S13S12

S8

S9S8

All the possible parses for an input are in the table

We just need to read off all the backpointers from every complete S in the last column of the table

Find all the S -gt X [0N+1]

Follow the structural traces from the Completer

Of course this wonrsquot be polynomial time since there could be an exponential number of trees

We can at least represent ambiguity efficiently

37

Depth-first search will never terminate if grammar is left recursive (eg NP --gt NP PP)

38

)( εαα ⎯rarr⎯ΑΒ⎯rarr⎯Α

Solutions Rewrite the grammar (automatically) to a weakly

equivalent one which is not left-recursiveeg The man on the hill with the telescopehellipNP NP PP (wanted Nom plus a sequence of PPs)NP Nom PPNP NomNom Det NhellipbecomeshellipNP Nom NPrsquoNom Det NNPrsquo PP NPrsquo (wanted a sequence of PPs)NPrsquo e

Not so obvious what these rules meanhellip

Harder to detect and eliminate non-immediate left recursion

NP --gt Nom PPNom --gt NP

Fix depth of search explicitly

Rule ordering non-recursive rules firstNP --gt Det NomNP --gt NP PP

40

Multiple legal structures Attachment (eg I saw a man on a hill with a

telescope) Coordination (eg younger cats and dogs) NP bracketing (eg Spanish language teachers)

41

42

NP vs VP Attachment

Solution Return all possible parses and disambiguate

using ldquoother methodsrdquo

43

Parsing is a search problem which may be implemented with many control strategies Top-Down or Bottom-Up approaches each have

problemsCombining the two solves some but not all issues

Left recursion Syntactic ambiguityNext time Making use of statistical information about syntactic constituents Read Ch 14

44

  • Slide Number 1
  • Announcements
  • Homework Questions
  • Evaluation
  • Syntactic Parsing
  • Syntactic Parsing
  • CFG Example
  • Modified CFG
  • Slide Number 9
  • Parsing as a Form of Search
  • Top-Down Parser
  • Rule Expansion
  • Bottom-Up Parsing
  • Slide Number 14
  • Bottom-up parsing
  • Whatrsquos rightwrong withhellip
  • Some Solutions
  • Earley Parsing
  • States
  • StatesLocations
  • Graphically
  • Earley
  • Earley Algorithm
  • Predictor
  • Scanner
  • Completer
  • How do we know we are done
  • Earley
  • Example
  • CFG for Fragment of English
  • Example
  • Example
  • Example
  • Details
  • Converting Earley from Recognizer to Parser
  • Augmenting the chart with structural information
  • Retrieving Parse Trees from Chart
  • Left Recursion vs Right Recursion
  • Slide Number 39
  • Slide Number 40
  • Another Problem Structural ambiguity
  • Slide Number 42
  • Slide Number 43
  • Summing Up

Book that flightWe should findhellip an S from 0 to 3 that is a completed statehellip

29

S NP VP VP VS Aux NP VP PP -gt Prep NPNP Det Nom N old | dog | footsteps |

young

NP PropN V dog | include | preferNom -gt Adj Nom Aux doesNom N Prep from | to | on | ofNom N Nom PropN Bush | McCain |

ObamaNom Nom PP Det that | this | a| theVP V NP Adj -gt old | green | red

31

32

33

What kind of algorithms did we just describe Not parsers ndash recognizers

The presence of an S state with the right attributes in the right place indicates a successful recognitionBut no parse treehellip no parserThatrsquos how we solve (not) an exponential problem in polynomial time

34

With the addition of a few pointers we have a parserAugment the ldquoCompleterrdquo to point to where we came from

35

S8S9

S10

S11

S13S12

S8

S9S8

All the possible parses for an input are in the table

We just need to read off all the backpointers from every complete S in the last column of the table

Find all the S -gt X [0N+1]

Follow the structural traces from the Completer

Of course this wonrsquot be polynomial time since there could be an exponential number of trees

We can at least represent ambiguity efficiently

37

Depth-first search will never terminate if grammar is left recursive (eg NP --gt NP PP)

38

)( εαα ⎯rarr⎯ΑΒ⎯rarr⎯Α

Solutions Rewrite the grammar (automatically) to a weakly

equivalent one which is not left-recursiveeg The man on the hill with the telescopehellipNP NP PP (wanted Nom plus a sequence of PPs)NP Nom PPNP NomNom Det NhellipbecomeshellipNP Nom NPrsquoNom Det NNPrsquo PP NPrsquo (wanted a sequence of PPs)NPrsquo e

Not so obvious what these rules meanhellip

Harder to detect and eliminate non-immediate left recursion

NP --gt Nom PPNom --gt NP

Fix depth of search explicitly

Rule ordering non-recursive rules firstNP --gt Det NomNP --gt NP PP

40

Multiple legal structures Attachment (eg I saw a man on a hill with a

telescope) Coordination (eg younger cats and dogs) NP bracketing (eg Spanish language teachers)

41

42

NP vs VP Attachment

Solution Return all possible parses and disambiguate

using ldquoother methodsrdquo

43

Parsing is a search problem which may be implemented with many control strategies Top-Down or Bottom-Up approaches each have

problemsCombining the two solves some but not all issues

Left recursion Syntactic ambiguityNext time Making use of statistical information about syntactic constituents Read Ch 14

44

  • Slide Number 1
  • Announcements
  • Homework Questions
  • Evaluation
  • Syntactic Parsing
  • Syntactic Parsing
  • CFG Example
  • Modified CFG
  • Slide Number 9
  • Parsing as a Form of Search
  • Top-Down Parser
  • Rule Expansion
  • Bottom-Up Parsing
  • Slide Number 14
  • Bottom-up parsing
  • Whatrsquos rightwrong withhellip
  • Some Solutions
  • Earley Parsing
  • States
  • StatesLocations
  • Graphically
  • Earley
  • Earley Algorithm
  • Predictor
  • Scanner
  • Completer
  • How do we know we are done
  • Earley
  • Example
  • CFG for Fragment of English
  • Example
  • Example
  • Example
  • Details
  • Converting Earley from Recognizer to Parser
  • Augmenting the chart with structural information
  • Retrieving Parse Trees from Chart
  • Left Recursion vs Right Recursion
  • Slide Number 39
  • Slide Number 40
  • Another Problem Structural ambiguity
  • Slide Number 42
  • Slide Number 43
  • Summing Up

S NP VP VP VS Aux NP VP PP -gt Prep NPNP Det Nom N old | dog | footsteps |

young

NP PropN V dog | include | preferNom -gt Adj Nom Aux doesNom N Prep from | to | on | ofNom N Nom PropN Bush | McCain |

ObamaNom Nom PP Det that | this | a| theVP V NP Adj -gt old | green | red

31

32

33

What kind of algorithms did we just describe Not parsers ndash recognizers

The presence of an S state with the right attributes in the right place indicates a successful recognitionBut no parse treehellip no parserThatrsquos how we solve (not) an exponential problem in polynomial time

34

With the addition of a few pointers we have a parserAugment the ldquoCompleterrdquo to point to where we came from

35

S8S9

S10

S11

S13S12

S8

S9S8

All the possible parses for an input are in the table

We just need to read off all the backpointers from every complete S in the last column of the table

Find all the S -gt X [0N+1]

Follow the structural traces from the Completer

Of course this wonrsquot be polynomial time since there could be an exponential number of trees

We can at least represent ambiguity efficiently

37

Depth-first search will never terminate if grammar is left recursive (eg NP --gt NP PP)

38

)( εαα ⎯rarr⎯ΑΒ⎯rarr⎯Α

Solutions Rewrite the grammar (automatically) to a weakly

equivalent one which is not left-recursiveeg The man on the hill with the telescopehellipNP NP PP (wanted Nom plus a sequence of PPs)NP Nom PPNP NomNom Det NhellipbecomeshellipNP Nom NPrsquoNom Det NNPrsquo PP NPrsquo (wanted a sequence of PPs)NPrsquo e

Not so obvious what these rules meanhellip

Harder to detect and eliminate non-immediate left recursion

NP --gt Nom PPNom --gt NP

Fix depth of search explicitly

Rule ordering non-recursive rules firstNP --gt Det NomNP --gt NP PP

40

Multiple legal structures Attachment (eg I saw a man on a hill with a

telescope) Coordination (eg younger cats and dogs) NP bracketing (eg Spanish language teachers)

41

42

NP vs VP Attachment

Solution Return all possible parses and disambiguate

using ldquoother methodsrdquo

43

Parsing is a search problem which may be implemented with many control strategies Top-Down or Bottom-Up approaches each have

problemsCombining the two solves some but not all issues

Left recursion Syntactic ambiguityNext time Making use of statistical information about syntactic constituents Read Ch 14

44

  • Slide Number 1
  • Announcements
  • Homework Questions
  • Evaluation
  • Syntactic Parsing
  • Syntactic Parsing
  • CFG Example
  • Modified CFG
  • Slide Number 9
  • Parsing as a Form of Search
  • Top-Down Parser
  • Rule Expansion
  • Bottom-Up Parsing
  • Slide Number 14
  • Bottom-up parsing
  • Whatrsquos rightwrong withhellip
  • Some Solutions
  • Earley Parsing
  • States
  • StatesLocations
  • Graphically
  • Earley
  • Earley Algorithm
  • Predictor
  • Scanner
  • Completer
  • How do we know we are done
  • Earley
  • Example
  • CFG for Fragment of English
  • Example
  • Example
  • Example
  • Details
  • Converting Earley from Recognizer to Parser
  • Augmenting the chart with structural information
  • Retrieving Parse Trees from Chart
  • Left Recursion vs Right Recursion
  • Slide Number 39
  • Slide Number 40
  • Another Problem Structural ambiguity
  • Slide Number 42
  • Slide Number 43
  • Summing Up

31

32

33

What kind of algorithms did we just describe Not parsers ndash recognizers

The presence of an S state with the right attributes in the right place indicates a successful recognitionBut no parse treehellip no parserThatrsquos how we solve (not) an exponential problem in polynomial time

34

With the addition of a few pointers we have a parserAugment the ldquoCompleterrdquo to point to where we came from

35

S8S9

S10

S11

S13S12

S8

S9S8

All the possible parses for an input are in the table

We just need to read off all the backpointers from every complete S in the last column of the table

Find all the S -gt X [0N+1]

Follow the structural traces from the Completer

Of course this wonrsquot be polynomial time since there could be an exponential number of trees

We can at least represent ambiguity efficiently

37

Depth-first search will never terminate if grammar is left recursive (eg NP --gt NP PP)

38

)( εαα ⎯rarr⎯ΑΒ⎯rarr⎯Α

Solutions Rewrite the grammar (automatically) to a weakly

equivalent one which is not left-recursiveeg The man on the hill with the telescopehellipNP NP PP (wanted Nom plus a sequence of PPs)NP Nom PPNP NomNom Det NhellipbecomeshellipNP Nom NPrsquoNom Det NNPrsquo PP NPrsquo (wanted a sequence of PPs)NPrsquo e

Not so obvious what these rules meanhellip

Harder to detect and eliminate non-immediate left recursion

NP --gt Nom PPNom --gt NP

Fix depth of search explicitly

Rule ordering non-recursive rules firstNP --gt Det NomNP --gt NP PP

40

Multiple legal structures Attachment (eg I saw a man on a hill with a

telescope) Coordination (eg younger cats and dogs) NP bracketing (eg Spanish language teachers)

41

42

NP vs VP Attachment

Solution Return all possible parses and disambiguate

using ldquoother methodsrdquo

43

Parsing is a search problem which may be implemented with many control strategies Top-Down or Bottom-Up approaches each have

problemsCombining the two solves some but not all issues

Left recursion Syntactic ambiguityNext time Making use of statistical information about syntactic constituents Read Ch 14

44

  • Slide Number 1
  • Announcements
  • Homework Questions
  • Evaluation
  • Syntactic Parsing
  • Syntactic Parsing
  • CFG Example
  • Modified CFG
  • Slide Number 9
  • Parsing as a Form of Search
  • Top-Down Parser
  • Rule Expansion
  • Bottom-Up Parsing
  • Slide Number 14
  • Bottom-up parsing
  • Whatrsquos rightwrong withhellip
  • Some Solutions
  • Earley Parsing
  • States
  • StatesLocations
  • Graphically
  • Earley
  • Earley Algorithm
  • Predictor
  • Scanner
  • Completer
  • How do we know we are done
  • Earley
  • Example
  • CFG for Fragment of English
  • Example
  • Example
  • Example
  • Details
  • Converting Earley from Recognizer to Parser
  • Augmenting the chart with structural information
  • Retrieving Parse Trees from Chart
  • Left Recursion vs Right Recursion
  • Slide Number 39
  • Slide Number 40
  • Another Problem Structural ambiguity
  • Slide Number 42
  • Slide Number 43
  • Summing Up

32

33

What kind of algorithms did we just describe Not parsers ndash recognizers

The presence of an S state with the right attributes in the right place indicates a successful recognitionBut no parse treehellip no parserThatrsquos how we solve (not) an exponential problem in polynomial time

34

With the addition of a few pointers we have a parserAugment the ldquoCompleterrdquo to point to where we came from

35

S8S9

S10

S11

S13S12

S8

S9S8

All the possible parses for an input are in the table

We just need to read off all the backpointers from every complete S in the last column of the table

Find all the S -gt X [0N+1]

Follow the structural traces from the Completer

Of course this wonrsquot be polynomial time since there could be an exponential number of trees

We can at least represent ambiguity efficiently

37

Depth-first search will never terminate if grammar is left recursive (eg NP --gt NP PP)

38

)( εαα ⎯rarr⎯ΑΒ⎯rarr⎯Α

Solutions Rewrite the grammar (automatically) to a weakly

equivalent one which is not left-recursiveeg The man on the hill with the telescopehellipNP NP PP (wanted Nom plus a sequence of PPs)NP Nom PPNP NomNom Det NhellipbecomeshellipNP Nom NPrsquoNom Det NNPrsquo PP NPrsquo (wanted a sequence of PPs)NPrsquo e

Not so obvious what these rules meanhellip

Harder to detect and eliminate non-immediate left recursion

NP --gt Nom PPNom --gt NP

Fix depth of search explicitly

Rule ordering non-recursive rules firstNP --gt Det NomNP --gt NP PP

40

Multiple legal structures Attachment (eg I saw a man on a hill with a

telescope) Coordination (eg younger cats and dogs) NP bracketing (eg Spanish language teachers)

41

42

NP vs VP Attachment

Solution Return all possible parses and disambiguate

using ldquoother methodsrdquo

43

Parsing is a search problem which may be implemented with many control strategies Top-Down or Bottom-Up approaches each have

problemsCombining the two solves some but not all issues

Left recursion Syntactic ambiguityNext time Making use of statistical information about syntactic constituents Read Ch 14

44

  • Slide Number 1
  • Announcements
  • Homework Questions
  • Evaluation
  • Syntactic Parsing
  • Syntactic Parsing
  • CFG Example
  • Modified CFG
  • Slide Number 9
  • Parsing as a Form of Search
  • Top-Down Parser
  • Rule Expansion
  • Bottom-Up Parsing
  • Slide Number 14
  • Bottom-up parsing
  • Whatrsquos rightwrong withhellip
  • Some Solutions
  • Earley Parsing
  • States
  • StatesLocations
  • Graphically
  • Earley
  • Earley Algorithm
  • Predictor
  • Scanner
  • Completer
  • How do we know we are done
  • Earley
  • Example
  • CFG for Fragment of English
  • Example
  • Example
  • Example
  • Details
  • Converting Earley from Recognizer to Parser
  • Augmenting the chart with structural information
  • Retrieving Parse Trees from Chart
  • Left Recursion vs Right Recursion
  • Slide Number 39
  • Slide Number 40
  • Another Problem Structural ambiguity
  • Slide Number 42
  • Slide Number 43
  • Summing Up

33

What kind of algorithms did we just describe Not parsers ndash recognizers

The presence of an S state with the right attributes in the right place indicates a successful recognitionBut no parse treehellip no parserThatrsquos how we solve (not) an exponential problem in polynomial time

34

With the addition of a few pointers we have a parserAugment the ldquoCompleterrdquo to point to where we came from

35

S8S9

S10

S11

S13S12

S8

S9S8

All the possible parses for an input are in the table

We just need to read off all the backpointers from every complete S in the last column of the table

Find all the S -gt X [0N+1]

Follow the structural traces from the Completer

Of course this wonrsquot be polynomial time since there could be an exponential number of trees

We can at least represent ambiguity efficiently

37

Depth-first search will never terminate if grammar is left recursive (eg NP --gt NP PP)

38

)( εαα ⎯rarr⎯ΑΒ⎯rarr⎯Α

Solutions Rewrite the grammar (automatically) to a weakly

equivalent one which is not left-recursiveeg The man on the hill with the telescopehellipNP NP PP (wanted Nom plus a sequence of PPs)NP Nom PPNP NomNom Det NhellipbecomeshellipNP Nom NPrsquoNom Det NNPrsquo PP NPrsquo (wanted a sequence of PPs)NPrsquo e

Not so obvious what these rules meanhellip

Harder to detect and eliminate non-immediate left recursion

NP --gt Nom PPNom --gt NP

Fix depth of search explicitly

Rule ordering non-recursive rules firstNP --gt Det NomNP --gt NP PP

40

Multiple legal structures Attachment (eg I saw a man on a hill with a

telescope) Coordination (eg younger cats and dogs) NP bracketing (eg Spanish language teachers)

41

42

NP vs VP Attachment

Solution Return all possible parses and disambiguate

using ldquoother methodsrdquo

43

Parsing is a search problem which may be implemented with many control strategies Top-Down or Bottom-Up approaches each have

problemsCombining the two solves some but not all issues

Left recursion Syntactic ambiguityNext time Making use of statistical information about syntactic constituents Read Ch 14

44

  • Slide Number 1
  • Announcements
  • Homework Questions
  • Evaluation
  • Syntactic Parsing
  • Syntactic Parsing
  • CFG Example
  • Modified CFG
  • Slide Number 9
  • Parsing as a Form of Search
  • Top-Down Parser
  • Rule Expansion
  • Bottom-Up Parsing
  • Slide Number 14
  • Bottom-up parsing
  • Whatrsquos rightwrong withhellip
  • Some Solutions
  • Earley Parsing
  • States
  • StatesLocations
  • Graphically
  • Earley
  • Earley Algorithm
  • Predictor
  • Scanner
  • Completer
  • How do we know we are done
  • Earley
  • Example
  • CFG for Fragment of English
  • Example
  • Example
  • Example
  • Details
  • Converting Earley from Recognizer to Parser
  • Augmenting the chart with structural information
  • Retrieving Parse Trees from Chart
  • Left Recursion vs Right Recursion
  • Slide Number 39
  • Slide Number 40
  • Another Problem Structural ambiguity
  • Slide Number 42
  • Slide Number 43
  • Summing Up

What kind of algorithms did we just describe Not parsers ndash recognizers

The presence of an S state with the right attributes in the right place indicates a successful recognitionBut no parse treehellip no parserThatrsquos how we solve (not) an exponential problem in polynomial time

34

With the addition of a few pointers we have a parserAugment the ldquoCompleterrdquo to point to where we came from

35

S8S9

S10

S11

S13S12

S8

S9S8

All the possible parses for an input are in the table

We just need to read off all the backpointers from every complete S in the last column of the table

Find all the S -gt X [0N+1]

Follow the structural traces from the Completer

Of course this wonrsquot be polynomial time since there could be an exponential number of trees

We can at least represent ambiguity efficiently

37

Depth-first search will never terminate if grammar is left recursive (eg NP --gt NP PP)

38

)( εαα ⎯rarr⎯ΑΒ⎯rarr⎯Α

Solutions Rewrite the grammar (automatically) to a weakly

equivalent one which is not left-recursiveeg The man on the hill with the telescopehellipNP NP PP (wanted Nom plus a sequence of PPs)NP Nom PPNP NomNom Det NhellipbecomeshellipNP Nom NPrsquoNom Det NNPrsquo PP NPrsquo (wanted a sequence of PPs)NPrsquo e

Not so obvious what these rules meanhellip

Harder to detect and eliminate non-immediate left recursion

NP --gt Nom PPNom --gt NP

Fix depth of search explicitly

Rule ordering non-recursive rules firstNP --gt Det NomNP --gt NP PP

40

Multiple legal structures Attachment (eg I saw a man on a hill with a

telescope) Coordination (eg younger cats and dogs) NP bracketing (eg Spanish language teachers)

41

42

NP vs VP Attachment

Solution Return all possible parses and disambiguate

using ldquoother methodsrdquo

43

Parsing is a search problem which may be implemented with many control strategies Top-Down or Bottom-Up approaches each have

problemsCombining the two solves some but not all issues

Left recursion Syntactic ambiguityNext time Making use of statistical information about syntactic constituents Read Ch 14

44

  • Slide Number 1
  • Announcements
  • Homework Questions
  • Evaluation
  • Syntactic Parsing
  • Syntactic Parsing
  • CFG Example
  • Modified CFG
  • Slide Number 9
  • Parsing as a Form of Search
  • Top-Down Parser
  • Rule Expansion
  • Bottom-Up Parsing
  • Slide Number 14
  • Bottom-up parsing
  • Whatrsquos rightwrong withhellip
  • Some Solutions
  • Earley Parsing
  • States
  • StatesLocations
  • Graphically
  • Earley
  • Earley Algorithm
  • Predictor
  • Scanner
  • Completer
  • How do we know we are done
  • Earley
  • Example
  • CFG for Fragment of English
  • Example
  • Example
  • Example
  • Details
  • Converting Earley from Recognizer to Parser
  • Augmenting the chart with structural information
  • Retrieving Parse Trees from Chart
  • Left Recursion vs Right Recursion
  • Slide Number 39
  • Slide Number 40
  • Another Problem Structural ambiguity
  • Slide Number 42
  • Slide Number 43
  • Summing Up

With the addition of a few pointers we have a parserAugment the ldquoCompleterrdquo to point to where we came from

35

S8S9

S10

S11

S13S12

S8

S9S8

All the possible parses for an input are in the table

We just need to read off all the backpointers from every complete S in the last column of the table

Find all the S -gt X [0N+1]

Follow the structural traces from the Completer

Of course this wonrsquot be polynomial time since there could be an exponential number of trees

We can at least represent ambiguity efficiently

37

Depth-first search will never terminate if grammar is left recursive (eg NP --gt NP PP)

38

)( εαα ⎯rarr⎯ΑΒ⎯rarr⎯Α

Solutions Rewrite the grammar (automatically) to a weakly

equivalent one which is not left-recursiveeg The man on the hill with the telescopehellipNP NP PP (wanted Nom plus a sequence of PPs)NP Nom PPNP NomNom Det NhellipbecomeshellipNP Nom NPrsquoNom Det NNPrsquo PP NPrsquo (wanted a sequence of PPs)NPrsquo e

Not so obvious what these rules meanhellip

Harder to detect and eliminate non-immediate left recursion

NP --gt Nom PPNom --gt NP

Fix depth of search explicitly

Rule ordering non-recursive rules firstNP --gt Det NomNP --gt NP PP

40

Multiple legal structures Attachment (eg I saw a man on a hill with a

telescope) Coordination (eg younger cats and dogs) NP bracketing (eg Spanish language teachers)

41

42

NP vs VP Attachment

Solution Return all possible parses and disambiguate

using ldquoother methodsrdquo

43

Parsing is a search problem which may be implemented with many control strategies Top-Down or Bottom-Up approaches each have

problemsCombining the two solves some but not all issues

Left recursion Syntactic ambiguityNext time Making use of statistical information about syntactic constituents Read Ch 14

44

  • Slide Number 1
  • Announcements
  • Homework Questions
  • Evaluation
  • Syntactic Parsing
  • Syntactic Parsing
  • CFG Example
  • Modified CFG
  • Slide Number 9
  • Parsing as a Form of Search
  • Top-Down Parser
  • Rule Expansion
  • Bottom-Up Parsing
  • Slide Number 14
  • Bottom-up parsing
  • Whatrsquos rightwrong withhellip
  • Some Solutions
  • Earley Parsing
  • States
  • StatesLocations
  • Graphically
  • Earley
  • Earley Algorithm
  • Predictor
  • Scanner
  • Completer
  • How do we know we are done
  • Earley
  • Example
  • CFG for Fragment of English
  • Example
  • Example
  • Example
  • Details
  • Converting Earley from Recognizer to Parser
  • Augmenting the chart with structural information
  • Retrieving Parse Trees from Chart
  • Left Recursion vs Right Recursion
  • Slide Number 39
  • Slide Number 40
  • Another Problem Structural ambiguity
  • Slide Number 42
  • Slide Number 43
  • Summing Up

S8S9

S10

S11

S13S12

S8

S9S8

All the possible parses for an input are in the table

We just need to read off all the backpointers from every complete S in the last column of the table

Find all the S -gt X [0N+1]

Follow the structural traces from the Completer

Of course this wonrsquot be polynomial time since there could be an exponential number of trees

We can at least represent ambiguity efficiently

37

Depth-first search will never terminate if grammar is left recursive (eg NP --gt NP PP)

38

)( εαα ⎯rarr⎯ΑΒ⎯rarr⎯Α

Solutions Rewrite the grammar (automatically) to a weakly

equivalent one which is not left-recursiveeg The man on the hill with the telescopehellipNP NP PP (wanted Nom plus a sequence of PPs)NP Nom PPNP NomNom Det NhellipbecomeshellipNP Nom NPrsquoNom Det NNPrsquo PP NPrsquo (wanted a sequence of PPs)NPrsquo e

Not so obvious what these rules meanhellip

Harder to detect and eliminate non-immediate left recursion

NP --gt Nom PPNom --gt NP

Fix depth of search explicitly

Rule ordering non-recursive rules firstNP --gt Det NomNP --gt NP PP

40

Multiple legal structures Attachment (eg I saw a man on a hill with a

telescope) Coordination (eg younger cats and dogs) NP bracketing (eg Spanish language teachers)

41

42

NP vs VP Attachment

Solution Return all possible parses and disambiguate

using ldquoother methodsrdquo

43

Parsing is a search problem which may be implemented with many control strategies Top-Down or Bottom-Up approaches each have

problemsCombining the two solves some but not all issues

Left recursion Syntactic ambiguityNext time Making use of statistical information about syntactic constituents Read Ch 14

44

  • Slide Number 1
  • Announcements
  • Homework Questions
  • Evaluation
  • Syntactic Parsing
  • Syntactic Parsing
  • CFG Example
  • Modified CFG
  • Slide Number 9
  • Parsing as a Form of Search
  • Top-Down Parser
  • Rule Expansion
  • Bottom-Up Parsing
  • Slide Number 14
  • Bottom-up parsing
  • Whatrsquos rightwrong withhellip
  • Some Solutions
  • Earley Parsing
  • States
  • StatesLocations
  • Graphically
  • Earley
  • Earley Algorithm
  • Predictor
  • Scanner
  • Completer
  • How do we know we are done
  • Earley
  • Example
  • CFG for Fragment of English
  • Example
  • Example
  • Example
  • Details
  • Converting Earley from Recognizer to Parser
  • Augmenting the chart with structural information
  • Retrieving Parse Trees from Chart
  • Left Recursion vs Right Recursion
  • Slide Number 39
  • Slide Number 40
  • Another Problem Structural ambiguity
  • Slide Number 42
  • Slide Number 43
  • Summing Up

All the possible parses for an input are in the table

We just need to read off all the backpointers from every complete S in the last column of the table

Find all the S -gt X [0N+1]

Follow the structural traces from the Completer

Of course this wonrsquot be polynomial time since there could be an exponential number of trees

We can at least represent ambiguity efficiently

37

Depth-first search will never terminate if grammar is left recursive (eg NP --gt NP PP)

38

)( εαα ⎯rarr⎯ΑΒ⎯rarr⎯Α

Solutions Rewrite the grammar (automatically) to a weakly

equivalent one which is not left-recursiveeg The man on the hill with the telescopehellipNP NP PP (wanted Nom plus a sequence of PPs)NP Nom PPNP NomNom Det NhellipbecomeshellipNP Nom NPrsquoNom Det NNPrsquo PP NPrsquo (wanted a sequence of PPs)NPrsquo e

Not so obvious what these rules meanhellip

Harder to detect and eliminate non-immediate left recursion

NP --gt Nom PPNom --gt NP

Fix depth of search explicitly

Rule ordering non-recursive rules firstNP --gt Det NomNP --gt NP PP

40

Multiple legal structures Attachment (eg I saw a man on a hill with a

telescope) Coordination (eg younger cats and dogs) NP bracketing (eg Spanish language teachers)

41

42

NP vs VP Attachment

Solution Return all possible parses and disambiguate

using ldquoother methodsrdquo

43

Parsing is a search problem which may be implemented with many control strategies Top-Down or Bottom-Up approaches each have

problemsCombining the two solves some but not all issues

Left recursion Syntactic ambiguityNext time Making use of statistical information about syntactic constituents Read Ch 14

44

  • Slide Number 1
  • Announcements
  • Homework Questions
  • Evaluation
  • Syntactic Parsing
  • Syntactic Parsing
  • CFG Example
  • Modified CFG
  • Slide Number 9
  • Parsing as a Form of Search
  • Top-Down Parser
  • Rule Expansion
  • Bottom-Up Parsing
  • Slide Number 14
  • Bottom-up parsing
  • Whatrsquos rightwrong withhellip
  • Some Solutions
  • Earley Parsing
  • States
  • StatesLocations
  • Graphically
  • Earley
  • Earley Algorithm
  • Predictor
  • Scanner
  • Completer
  • How do we know we are done
  • Earley
  • Example
  • CFG for Fragment of English
  • Example
  • Example
  • Example
  • Details
  • Converting Earley from Recognizer to Parser
  • Augmenting the chart with structural information
  • Retrieving Parse Trees from Chart
  • Left Recursion vs Right Recursion
  • Slide Number 39
  • Slide Number 40
  • Another Problem Structural ambiguity
  • Slide Number 42
  • Slide Number 43
  • Summing Up

Depth-first search will never terminate if grammar is left recursive (eg NP --gt NP PP)

38

)( εαα ⎯rarr⎯ΑΒ⎯rarr⎯Α

Solutions Rewrite the grammar (automatically) to a weakly

equivalent one which is not left-recursiveeg The man on the hill with the telescopehellipNP NP PP (wanted Nom plus a sequence of PPs)NP Nom PPNP NomNom Det NhellipbecomeshellipNP Nom NPrsquoNom Det NNPrsquo PP NPrsquo (wanted a sequence of PPs)NPrsquo e

Not so obvious what these rules meanhellip

Harder to detect and eliminate non-immediate left recursion

NP --gt Nom PPNom --gt NP

Fix depth of search explicitly

Rule ordering non-recursive rules firstNP --gt Det NomNP --gt NP PP

40

Multiple legal structures Attachment (eg I saw a man on a hill with a

telescope) Coordination (eg younger cats and dogs) NP bracketing (eg Spanish language teachers)

41

42

NP vs VP Attachment

Solution Return all possible parses and disambiguate

using ldquoother methodsrdquo

43

Parsing is a search problem which may be implemented with many control strategies Top-Down or Bottom-Up approaches each have

problemsCombining the two solves some but not all issues

Left recursion Syntactic ambiguityNext time Making use of statistical information about syntactic constituents Read Ch 14

44

  • Slide Number 1
  • Announcements
  • Homework Questions
  • Evaluation
  • Syntactic Parsing
  • Syntactic Parsing
  • CFG Example
  • Modified CFG
  • Slide Number 9
  • Parsing as a Form of Search
  • Top-Down Parser
  • Rule Expansion
  • Bottom-Up Parsing
  • Slide Number 14
  • Bottom-up parsing
  • Whatrsquos rightwrong withhellip
  • Some Solutions
  • Earley Parsing
  • States
  • StatesLocations
  • Graphically
  • Earley
  • Earley Algorithm
  • Predictor
  • Scanner
  • Completer
  • How do we know we are done
  • Earley
  • Example
  • CFG for Fragment of English
  • Example
  • Example
  • Example
  • Details
  • Converting Earley from Recognizer to Parser
  • Augmenting the chart with structural information
  • Retrieving Parse Trees from Chart
  • Left Recursion vs Right Recursion
  • Slide Number 39
  • Slide Number 40
  • Another Problem Structural ambiguity
  • Slide Number 42
  • Slide Number 43
  • Summing Up

Solutions Rewrite the grammar (automatically) to a weakly

equivalent one which is not left-recursiveeg The man on the hill with the telescopehellipNP NP PP (wanted Nom plus a sequence of PPs)NP Nom PPNP NomNom Det NhellipbecomeshellipNP Nom NPrsquoNom Det NNPrsquo PP NPrsquo (wanted a sequence of PPs)NPrsquo e

Not so obvious what these rules meanhellip

Harder to detect and eliminate non-immediate left recursion

NP --gt Nom PPNom --gt NP

Fix depth of search explicitly

Rule ordering non-recursive rules firstNP --gt Det NomNP --gt NP PP

40

Multiple legal structures Attachment (eg I saw a man on a hill with a

telescope) Coordination (eg younger cats and dogs) NP bracketing (eg Spanish language teachers)

41

42

NP vs VP Attachment

Solution Return all possible parses and disambiguate

using ldquoother methodsrdquo

43

Parsing is a search problem which may be implemented with many control strategies Top-Down or Bottom-Up approaches each have

problemsCombining the two solves some but not all issues

Left recursion Syntactic ambiguityNext time Making use of statistical information about syntactic constituents Read Ch 14

44

  • Slide Number 1
  • Announcements
  • Homework Questions
  • Evaluation
  • Syntactic Parsing
  • Syntactic Parsing
  • CFG Example
  • Modified CFG
  • Slide Number 9
  • Parsing as a Form of Search
  • Top-Down Parser
  • Rule Expansion
  • Bottom-Up Parsing
  • Slide Number 14
  • Bottom-up parsing
  • Whatrsquos rightwrong withhellip
  • Some Solutions
  • Earley Parsing
  • States
  • StatesLocations
  • Graphically
  • Earley
  • Earley Algorithm
  • Predictor
  • Scanner
  • Completer
  • How do we know we are done
  • Earley
  • Example
  • CFG for Fragment of English
  • Example
  • Example
  • Example
  • Details
  • Converting Earley from Recognizer to Parser
  • Augmenting the chart with structural information
  • Retrieving Parse Trees from Chart
  • Left Recursion vs Right Recursion
  • Slide Number 39
  • Slide Number 40
  • Another Problem Structural ambiguity
  • Slide Number 42
  • Slide Number 43
  • Summing Up

Harder to detect and eliminate non-immediate left recursion

NP --gt Nom PPNom --gt NP

Fix depth of search explicitly

Rule ordering non-recursive rules firstNP --gt Det NomNP --gt NP PP

40

Multiple legal structures Attachment (eg I saw a man on a hill with a

telescope) Coordination (eg younger cats and dogs) NP bracketing (eg Spanish language teachers)

41

42

NP vs VP Attachment

Solution Return all possible parses and disambiguate

using ldquoother methodsrdquo

43

Parsing is a search problem which may be implemented with many control strategies Top-Down or Bottom-Up approaches each have

problemsCombining the two solves some but not all issues

Left recursion Syntactic ambiguityNext time Making use of statistical information about syntactic constituents Read Ch 14

44

  • Slide Number 1
  • Announcements
  • Homework Questions
  • Evaluation
  • Syntactic Parsing
  • Syntactic Parsing
  • CFG Example
  • Modified CFG
  • Slide Number 9
  • Parsing as a Form of Search
  • Top-Down Parser
  • Rule Expansion
  • Bottom-Up Parsing
  • Slide Number 14
  • Bottom-up parsing
  • Whatrsquos rightwrong withhellip
  • Some Solutions
  • Earley Parsing
  • States
  • StatesLocations
  • Graphically
  • Earley
  • Earley Algorithm
  • Predictor
  • Scanner
  • Completer
  • How do we know we are done
  • Earley
  • Example
  • CFG for Fragment of English
  • Example
  • Example
  • Example
  • Details
  • Converting Earley from Recognizer to Parser
  • Augmenting the chart with structural information
  • Retrieving Parse Trees from Chart
  • Left Recursion vs Right Recursion
  • Slide Number 39
  • Slide Number 40
  • Another Problem Structural ambiguity
  • Slide Number 42
  • Slide Number 43
  • Summing Up

Multiple legal structures Attachment (eg I saw a man on a hill with a

telescope) Coordination (eg younger cats and dogs) NP bracketing (eg Spanish language teachers)

41

42

NP vs VP Attachment

Solution Return all possible parses and disambiguate

using ldquoother methodsrdquo

43

Parsing is a search problem which may be implemented with many control strategies Top-Down or Bottom-Up approaches each have

problemsCombining the two solves some but not all issues

Left recursion Syntactic ambiguityNext time Making use of statistical information about syntactic constituents Read Ch 14

44

  • Slide Number 1
  • Announcements
  • Homework Questions
  • Evaluation
  • Syntactic Parsing
  • Syntactic Parsing
  • CFG Example
  • Modified CFG
  • Slide Number 9
  • Parsing as a Form of Search
  • Top-Down Parser
  • Rule Expansion
  • Bottom-Up Parsing
  • Slide Number 14
  • Bottom-up parsing
  • Whatrsquos rightwrong withhellip
  • Some Solutions
  • Earley Parsing
  • States
  • StatesLocations
  • Graphically
  • Earley
  • Earley Algorithm
  • Predictor
  • Scanner
  • Completer
  • How do we know we are done
  • Earley
  • Example
  • CFG for Fragment of English
  • Example
  • Example
  • Example
  • Details
  • Converting Earley from Recognizer to Parser
  • Augmenting the chart with structural information
  • Retrieving Parse Trees from Chart
  • Left Recursion vs Right Recursion
  • Slide Number 39
  • Slide Number 40
  • Another Problem Structural ambiguity
  • Slide Number 42
  • Slide Number 43
  • Summing Up

42

NP vs VP Attachment

Solution Return all possible parses and disambiguate

using ldquoother methodsrdquo

43

Parsing is a search problem which may be implemented with many control strategies Top-Down or Bottom-Up approaches each have

problemsCombining the two solves some but not all issues

Left recursion Syntactic ambiguityNext time Making use of statistical information about syntactic constituents Read Ch 14

44

  • Slide Number 1
  • Announcements
  • Homework Questions
  • Evaluation
  • Syntactic Parsing
  • Syntactic Parsing
  • CFG Example
  • Modified CFG
  • Slide Number 9
  • Parsing as a Form of Search
  • Top-Down Parser
  • Rule Expansion
  • Bottom-Up Parsing
  • Slide Number 14
  • Bottom-up parsing
  • Whatrsquos rightwrong withhellip
  • Some Solutions
  • Earley Parsing
  • States
  • StatesLocations
  • Graphically
  • Earley
  • Earley Algorithm
  • Predictor
  • Scanner
  • Completer
  • How do we know we are done
  • Earley
  • Example
  • CFG for Fragment of English
  • Example
  • Example
  • Example
  • Details
  • Converting Earley from Recognizer to Parser
  • Augmenting the chart with structural information
  • Retrieving Parse Trees from Chart
  • Left Recursion vs Right Recursion
  • Slide Number 39
  • Slide Number 40
  • Another Problem Structural ambiguity
  • Slide Number 42
  • Slide Number 43
  • Summing Up

Solution Return all possible parses and disambiguate

using ldquoother methodsrdquo

43

Parsing is a search problem which may be implemented with many control strategies Top-Down or Bottom-Up approaches each have

problemsCombining the two solves some but not all issues

Left recursion Syntactic ambiguityNext time Making use of statistical information about syntactic constituents Read Ch 14

44

  • Slide Number 1
  • Announcements
  • Homework Questions
  • Evaluation
  • Syntactic Parsing
  • Syntactic Parsing
  • CFG Example
  • Modified CFG
  • Slide Number 9
  • Parsing as a Form of Search
  • Top-Down Parser
  • Rule Expansion
  • Bottom-Up Parsing
  • Slide Number 14
  • Bottom-up parsing
  • Whatrsquos rightwrong withhellip
  • Some Solutions
  • Earley Parsing
  • States
  • StatesLocations
  • Graphically
  • Earley
  • Earley Algorithm
  • Predictor
  • Scanner
  • Completer
  • How do we know we are done
  • Earley
  • Example
  • CFG for Fragment of English
  • Example
  • Example
  • Example
  • Details
  • Converting Earley from Recognizer to Parser
  • Augmenting the chart with structural information
  • Retrieving Parse Trees from Chart
  • Left Recursion vs Right Recursion
  • Slide Number 39
  • Slide Number 40
  • Another Problem Structural ambiguity
  • Slide Number 42
  • Slide Number 43
  • Summing Up

Parsing is a search problem which may be implemented with many control strategies Top-Down or Bottom-Up approaches each have

problemsCombining the two solves some but not all issues

Left recursion Syntactic ambiguityNext time Making use of statistical information about syntactic constituents Read Ch 14

44

  • Slide Number 1
  • Announcements
  • Homework Questions
  • Evaluation
  • Syntactic Parsing
  • Syntactic Parsing
  • CFG Example
  • Modified CFG
  • Slide Number 9
  • Parsing as a Form of Search
  • Top-Down Parser
  • Rule Expansion
  • Bottom-Up Parsing
  • Slide Number 14
  • Bottom-up parsing
  • Whatrsquos rightwrong withhellip
  • Some Solutions
  • Earley Parsing
  • States
  • StatesLocations
  • Graphically
  • Earley
  • Earley Algorithm
  • Predictor
  • Scanner
  • Completer
  • How do we know we are done
  • Earley
  • Example
  • CFG for Fragment of English
  • Example
  • Example
  • Example
  • Details
  • Converting Earley from Recognizer to Parser
  • Augmenting the chart with structural information
  • Retrieving Parse Trees from Chart
  • Left Recursion vs Right Recursion
  • Slide Number 39
  • Slide Number 40
  • Another Problem Structural ambiguity
  • Slide Number 42
  • Slide Number 43
  • Summing Up