Hector Miguel Chavez Western Michigan University.

44
Hector Miguel Chavez Western Michigan University

Transcript of Hector Miguel Chavez Western Michigan University.

Page 1: Hector Miguel Chavez Western Michigan University.

Hector Miguel ChavezWestern Michigan University

Page 2: Hector Miguel Chavez Western Michigan University.

Presentation Outline

2May 27, 2009

• Introduction• Chomsky normal form• Preliminary simplifications• Final steps

• Greibach Normal Form• Algorithm (Example)

• Summary

Page 3: Hector Miguel Chavez Western Michigan University.

Introduction

3May 27, 2009

Grammar: G = (V, T, P, S)

T = { a, b }T = { a, b }Terminals

V = A, B, CV = A, B, CVariables

SSStart Symbol

P = S → AP = S → AProduction

Page 4: Hector Miguel Chavez Western Michigan University.

Introduction

4May 27, 2009

Grammar example

S → aBScS → abcBa → aBBb → bb

L = { anbncn | n ≥ 1 }L = { anbncn | n ≥ 1 }

S aBSc aBabcc aaBbcc aabbcc

Page 5: Hector Miguel Chavez Western Michigan University.

Introduction

5May 27, 2009

Context free grammar

The head of any production contains only one non-terminal symbol

S → PP → aPbP → ε

L = { anbn | n ≥ 0 }L = { anbn | n ≥ 0 }

Page 6: Hector Miguel Chavez Western Michigan University.

Presentation Outline

6May 27, 2009

• Introduction• Chomsky normal form• Preliminary simplifications• Final simplification

• Greibach Normal Form• Algorithm (Example)

• Summary

Page 7: Hector Miguel Chavez Western Michigan University.

Chomsky Normal Form

7May 27, 2009

A → BCA → α

A context free grammar is said to be in Chomsky Normal Form if all productions are in the following form:

• A, B and C are non terminal symbols• α is a terminal symbol

Page 8: Hector Miguel Chavez Western Michigan University.

Presentation Outline

8May 27, 2009

• Introduction• Chomsky normal form• Preliminary simplifications• Final steps

• Greibach Normal Form• Algorithm (Example)

• Summary

Page 9: Hector Miguel Chavez Western Michigan University.

Preliminary Simplifications

9May 27, 2009

Eliminate Useless SymbolsEliminate Useless Symbols1

Eliminate ε productions Eliminate ε productions 2

Eliminate unit productionsEliminate unit productions3

There are three preliminary simplifications

Page 10: Hector Miguel Chavez Western Michigan University.

Preliminary Simplifications

10May 27, 2009

Eliminate Useless Symbols

We need to determine if the symbol is useful by identifying if a symbol is generating and is reachable

• X is generating if X ω for some terminal string ω.• X is reachable if there is a derivation X αXβ for some α and β

**

Page 11: Hector Miguel Chavez Western Michigan University.

Preliminary Simplifications

11May 27, 2009

Example: Removing non-generating symbols

S → AB | aA → bS → AB | aA → b Initial CFL grammar

S → AB | aA → bS → AB | aA → b Identify generating symbols

S → a A → b S → a A → b Remove non-generating

Page 12: Hector Miguel Chavez Western Michigan University.

Preliminary Simplifications

12May 27, 2009

Example: Removing non-reachable symbols

S → a S → a Eliminate non-reachable

S → a A → b S → a A → b Identify reachable symbols

Page 13: Hector Miguel Chavez Western Michigan University.

Preliminary Simplifications

13May 27, 2009

The order is important.

S → AB | aA → bS → AB | aA → b

Looking first for non-reachable symbols and then for non-generating symbols can still leave some useless symbols.

S → a A → b S → a A → b

Page 14: Hector Miguel Chavez Western Michigan University.

Preliminary Simplifications

14May 27, 2009

Finding generating symbols

If there is a production A → α, and every symbol of α is already known to be generating. Then A is generating

S → AB | aA → bS → AB | aA → b

We cannot use S → AB because B has not been established to be generating

Page 15: Hector Miguel Chavez Western Michigan University.

Preliminary Simplifications

15May 27, 2009

Finding reachable symbols

S is surely reachable. All symbols in the body of a production with S in the head are reachable.

S → AB | aA → bS → AB | aA → b

In this example the symbols {S, A, B, a, b} are reachable.

Page 16: Hector Miguel Chavez Western Michigan University.

Preliminary Simplifications

16May 27, 2009

Eliminate Useless SymbolsEliminate Useless Symbols1

Eliminate ε productions Eliminate ε productions 2

Eliminate unit productionsEliminate unit productions3

There are three preliminary simplifications

Page 17: Hector Miguel Chavez Western Michigan University.

Preliminary Simplifications

17May 27, 2009

Eliminate ε Productions

• In a grammar ε productions are convenient but not essential

• If L has a CFG, then L – {ε} has a CFG

Nullable variable

A εA ε*

Page 18: Hector Miguel Chavez Western Michigan University.

Preliminary Simplifications

18May 27, 2009

If A is a nullable variable

• Whenever A appears on the body of a production A might or might not derive ε

S → ASA | aBA → B | SB → b | ε

Nullable: {A, B}

Page 19: Hector Miguel Chavez Western Michigan University.

Preliminary Simplifications

19May 27, 2009

• Create two version of the production, one with the nullable variable and one without it

• Eliminate productions with ε bodies

S → ASA | aBA → B | SB → b | ε

S → ASA | aB | AS | SA | S | aA → B | SB → b

Eliminate ε Productions

Page 20: Hector Miguel Chavez Western Michigan University.

Preliminary Simplifications

20May 27, 2009

• Create two version of the production, one with the nullable variable and one without it

• Eliminate productions with ε bodies

S → ASA | aBA → B | SB → b | ε

S → ASA | aB | AS | SA | S | aA → B | SB → b

Eliminate ε Productions

Page 21: Hector Miguel Chavez Western Michigan University.

Preliminary Simplifications

21May 27, 2009

• Create two version of the production, one with the nullable variable and one without it

• Eliminate productions with ε bodies

S → ASA | aBA → B | SB → b | ε

S → ASA | aB | AS | SA | S | aA → B | SB → b

Eliminate ε Productions

Page 22: Hector Miguel Chavez Western Michigan University.

Preliminary Simplifications

22May 27, 2009

Eliminate Useless SymbolsEliminate Useless Symbols1

Eliminate ε productions Eliminate ε productions 2

Eliminate unit productionsEliminate unit productions3

There are three preliminary simplifications

Page 23: Hector Miguel Chavez Western Michigan University.

Preliminary Simplifications

23May 27, 2009

Eliminate unit productions

A unit production is one of the form A → B where both A and B are variables

A BA B*

A → B, B → ω, then A → ω

Identify unit pairs

Page 24: Hector Miguel Chavez Western Michigan University.

Preliminary Simplifications

24May 27, 2009

Example:

I → a | b | Ia | Ib | I0 | I1F → I | (E)T → F | T * FE → T | E + T

Pairs Productions

( E, E ) E → E + T

( E, T ) E → T * F

( E, F ) E → (E)

( E, I ) E → a | b | Ia | Ib | I0 | I1

( T, T ) T → T * F

( T, F ) T → (E)

( T, I ) T → a | b | Ia |Ib | I0 | I1

( F, F ) F → (E)

( F, I ) F → a | b | Ia | Ib | I0 | I1

( I, I ) I → a | b | Ia | Ib | I0 | I1

Basis: (A, A) is a unit pair of any variable A, if A A by 0 steps.*

T = {*, +, (, ), a, b, 0, 1}

Page 25: Hector Miguel Chavez Western Michigan University.

Preliminary Simplifications

25May 27, 2009

Example:

Pairs Productions

… …

( T, T ) T → T * F

( T, F ) T → (E)

( T, I ) T → a | b | Ia |Ib | I0 | I1

… …

I → a | b | Ia | Ib | I0 | I1E → E + T | T * F | (E ) | a | b | la | lb | l0 | l1T → T * F | (E) | a | b | Ia | Ib | I0 | I1F → (E) | a | b | Ia | Ib | I0 | I1

Page 26: Hector Miguel Chavez Western Michigan University.

Presentation Outline

26May 27, 2009

• Introduction• Chomsky normal form• Preliminary simplifications• Final steps

• Greibach Normal Form• Algorithm (Example)

• Summary

Page 27: Hector Miguel Chavez Western Michigan University.

Final Simplification

27May 27, 2009

Chomsky Normal Form (CNF)

1. Arrange that all bodies of length 2 or more to consists only of variables.

2. Break bodies of length 3 or more into a cascade of productions, each with a body consisting of two variables.

Starting with a CFL grammar with the preliminary simplifications performed

Page 28: Hector Miguel Chavez Western Michigan University.

Final Simplification

28May 27, 2009

Step 1: For every terminal α that appears in a body of length 2 or more create a new variable that has only one production.

E → E + T | T * F | (E ) | a | b | la | lb | l0 | l1T → T * F | (E) | a | b | Ia | Ib | I0 | I1F → (E) | a | b | Ia | Ib | I0 | I1I → a | b | Ia | Ib | I0 | I1

E → EPT | TMF | LER | a | b | lA | lB | lZ | lOT → TMF | LER | a | b | IA | IB | IZ | IOF → LER | a | b | IA | IB | IZ | IOI → a | b | IA | IB | IZ | IOA → a B → b Z → 0 O → 1 P → + M → * L → ( R → )

Page 29: Hector Miguel Chavez Western Michigan University.

Final Simplification

29May 27, 2009

Step 2: Break bodies of length 3 or more adding more variables

E → EPT | TMF | LER | a | b | lA | lB | lZ | lOT → TMF | LER | a | b | IA | IB | IZ | IOF → LER | a | b | IA | IB | IZ | IOI → a | b | IA | IB | IZ | IOA → a B → b Z → 0 O → 1 P → + M → * L → ( R → )

C1 → PTC2 → MFC3 → ER

Page 30: Hector Miguel Chavez Western Michigan University.

Presentation Outline

30May 27, 2009

• Introduction• Chomsky normal form• Preliminary simplifications• Final steps

• Greibach Normal Form• Algorithm (Example)

• Summary

Page 31: Hector Miguel Chavez Western Michigan University.

Greibach Normal Form

31May 27, 2009

A → αX

A context free grammar is said to be in Greibach Normal Form if all productions are in the following form:

• A is a non terminal symbols• α is a terminal symbol• X is a sequence of non terminal symbols. It may be empty.

Page 32: Hector Miguel Chavez Western Michigan University.

Presentation Outline

32May 27, 2009

• Introduction• Chomsky normal form• Preliminary simplifications• Final steps

• Greibach Normal Form• Algorithm (Example)

• Summary

Page 33: Hector Miguel Chavez Western Michigan University.

Greibach Normal Form

33May 27, 2009

Example:

S → XA | BBB → b | SBX → bA → a

CNFCNF

S = A1

X = A2

A = A3

B = A4

New LabelsNew Labels

A1 → A2A3 | A4A4

A4 → b | A1A4

A2 → bA3 → a

Updated CNFUpdated CNF

Page 34: Hector Miguel Chavez Western Michigan University.

Greibach Normal Form

34May 27, 2009

Example:

A1 → A2A3 | A4A4

A4 → b | A1A4

A2 → bA3 → a

First Step

Xk is a string of zero

or more variables

Ai → AjXk j > i Ai → AjXk j > i

A4 → A1A4

Page 35: Hector Miguel Chavez Western Michigan University.

Greibach Normal Form

35May 27, 2009

Example:

A1 → A2A3 | A4A4

A4 → b | A1A4

A2 → bA3 → a

A4 → A1A4

A4 → A2A3A4 | A4A4A4 | b

A4 → bA3A4 | A4A4A4 | b

First Step Ai → AjXk j > i Ai → AjXk j > i

Page 36: Hector Miguel Chavez Western Michigan University.

Greibach Normal Form

36May 27, 2009

Example:

Second Step

Eliminate Left Recursions

Eliminate Left Recursions

A1 → A2A3 | A4A4

A4 → bA3A4 | A4A4A4 | b A2 → bA3 → a

A4 → A4A4A4

Page 37: Hector Miguel Chavez Western Michigan University.

Greibach Normal Form

37May 27, 2009

Example:Second Step

Eliminate Left Recursions

Eliminate Left Recursions

A1 → A2A3 | A4A4

A4 → bA3A4 | A4A4A4 | b A2 → bA3 → a

A4 → bA3A4 | b | bA3A4Z | bZ

Z → A4A4 | A4A4Z

Page 38: Hector Miguel Chavez Western Michigan University.

Greibach Normal Form

38May 27, 2009

Example:

A1 → A2A3 | A4A4

A4 → bA3A4 | b | bA3A4Z | bZZ → A4A4 | A4A4 ZA2 → bA3 → a

A → αXA → αX

GNF

Page 39: Hector Miguel Chavez Western Michigan University.

Greibach Normal Form

39May 27, 2009

Example:A1 → A2A3 | A4A4

A4 → bA3A4 | b | bA3A4Z | bZZ → A4A4 | A4A4 ZA2 → bA3 → a

Z → bA3A4A4 | bA4 | bA3A4ZA4 | bZA4 | bA3A4A4 | bA4 | bA3A4ZA4 | bZA4

A1 → bA3 | bA3A4A4 | bA4 | bA3A4ZA4 | bZA4

Page 40: Hector Miguel Chavez Western Michigan University.

Greibach Normal Form

40May 27, 2009

Example:

A1 → bA3 | bA3A4A4 | bA4 | bA3A4ZA4 | bZA4

A4 → bA3A4 | b | bA3A4Z | bZZ → bA3A4A4 | bA4 | bA3A4ZA4 | bZA4 | bA3A4A4 | bA4 | bA3A4ZA4 | bZA4 A2 → bA3 → a

Grammar in Greibach Normal Form

Page 41: Hector Miguel Chavez Western Michigan University.

Presentation Outline

41May 27, 2009

Summary (Some properties)

• Every CFG that doesn’t generate the empty string can be simplified to the Chomsky Normal Form and Greibach Normal Form

• The derivation tree in a grammar in CNF is a binary tree

• In the GNF, a string of length n has a derivation of exactly n steps

• Grammars in normal form can facilitate proofs• CNF is used as starting point in the algorithm CYK

Page 42: Hector Miguel Chavez Western Michigan University.

Presentation Outline

42May 27, 2009

References

[1] Introduction to Automata Theory, Languages and Computation, John E. Hopcroft, Rajeev Motwani and Jeffrey D. Ullman, 2nd edition, Addison Wesley 2001 (ISBN: 0-201-44124-1)

[2] CS-311 HANDOUT, Greibach Normal Form (GNF), Humaira Kamal, http://suraj.lums.edu.pk/~cs311w02/GNF-handout.pdf

[3] Conversion of a Chomsky Normal Form Grammar to Greibach Normal Form, Arup Guha, http://www.cs.ucf.edu/courses/cot4210/spring05/lectures/Lec14Greibach.ppt

Page 43: Hector Miguel Chavez Western Michigan University.

Test Questions

43May 27, 2009

1. Convert the following grammar to the Chomsky Normal Form.

2. Is the following grammar context-free?

3. Prove that the language L = { anbncn | n ≥ 1 } is not context-free.4. Convert the following grammar to the Greibach Normal Form.

S → PP → aPb | ε

S → aBSc | abcBa → aBBb → bb

S -> a | CD | CSA -> a | b | SSC -> aD -> AS

Page 44: Hector Miguel Chavez Western Michigan University.

Presentation Outline

44May 27, 2009