19 Data Structure Design


  • 8/8/2019 19 Data Structure Design


    20-Nov-10

    BNF Grammar

    Example design of a data structure


    Form of BNF rules

    ::= "::="

    ::= ""

    ::= | | ' " ' ' " '

    ::= |

    ::= | |

    ::= | "|"

    ::= | |

    ::=

    ::= |

    Not defined here (but you know what they are) : , ,

    , (any printable nonalphabetic character

    except double quote)


    Uses of a grammar

    A BNF grammar can be used in two ways:

    To generate strings belonging to the grammar

    To do this, start with a string containing a nonterminal;

    while there are still nonterminals in the string {

        replace a nonterminal with one of its definitions

    }

    To recognize strings belonging to the grammar

    This is the way programs are compiled--a program is a string belonging to the grammar that defines the language

    Recognition is much harder than generation
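
    The generation loop described above can be sketched in Java roughly as follows. This is a minimal sketch, not the code from these slides: the class name SentenceGenerator, the method generate, and the sample grammar are all assumptions, and the grammar is stored as a plain map from each nonterminal to its list of alternatives.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Random;

// Hypothetical helper for illustration; not part of the Grammar class in the slides.
class SentenceGenerator {

    /** Expands start until only terminals remain, choosing alternatives at random. */
    static List<String> generate(Map<String, List<List<String>>> grammar,
                                 String start, Random random) {
        List<String> sentence = new ArrayList<>();
        sentence.add(start);
        int i = 0;
        while (i < sentence.size()) {
            List<List<String>> alternatives = grammar.get(sentence.get(i));
            if (alternatives == null) {
                i++; // not a key in the map, so treat it as a terminal and move on
            } else {
                // a nonterminal: replace it with one of its definitions
                List<String> choice = alternatives.get(random.nextInt(alternatives.size()));
                sentence.remove(i);
                sentence.addAll(i, choice);
            }
        }
        return sentence;
    }

    public static void main(String[] args) {
        // Made-up sample grammar, used only to exercise the loop.
        Map<String, List<List<String>>> grammar = new HashMap<>();
        grammar.put("<sentence>", Arrays.asList(Arrays.asList("<noun>", "<verb>")));
        grammar.put("<noun>", Arrays.asList(Arrays.asList("dogs"), Arrays.asList("cats")));
        grammar.put("<verb>", Arrays.asList(Arrays.asList("sleep"), Arrays.asList("run")));
        System.out.println(String.join(" ", generate(grammar, "<sentence>", new Random())));
    }
}
```

    With a recursive grammar, random choice can keep expanding for a long time, so a real program might want a depth limit.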


    Generating sentences

    I want to write a program that reads in a grammar,

    stores it in some appropriate data structure, then

    generates random sentences belonging to that grammar

    I need to decide:

    How to store the grammar

    What operations to provide on the grammar

    These decisions are intertwined! How I store the grammar determines what operations are easy

    and efficient (and even possible!)


    Development approaches

    Bad approach:

    Design a general representation for grammars and a complete set

    of operations on them

    Actually, this is a good approach if you are writing a general-purpose package for general use--for example, for inclusion in the Java API

    Otherwise, it just makes your program much more complex

    Good approach:

    Decide the operations you need for this program, and design a

    representation for grammars that supports these operations

    It's a nice bonus if the design can later be extended for other purposes

    Remember the Extreme Programming slogan YAGNI: "You ain't gonna need it."


    Requirements and constraints

    We need to read the grammar in

    But we don't need to modify it later

    Any tools for building the grammar structure can be private

    We need to look up the definitions of nonterminals

    We need this because we will need to replace each nonterminal with one of its definitions

    We need to know the top-level element of the grammar

    But we can just assume that we know what it is

    For example, we can insist that the top-level element be


    First cut

    public class Grammar implements Iterable

        List<String> rule; // a single alternative for a nonterminal

        List<List<String>> definition; // all the rules for one nonterminal

        Map<String, List<List<String>>> grammar; // rules for all the nonterminals

        public Grammar() { grammar = new TreeMap<String, List<List<String>>>(); }

        public void addRule(String rule) throws IllegalArgumentException

        public List<List<String>> getDefinition(String nonterminal)

        public List<String> getOneRule(String nonterminal) // random choice

        public Iterator iterator()

        public void print()


    First cut: Evaluation

    Advantages

    Small, easily learned interface

    Just one class

    Can be made to work

    Disadvantages

    As designed, a rule whose right-hand side is bar | baz is two rules, requiring two calls to addRule; hence the caller has to do some of the parsing, to separate out the left-hand side

    Requires some fairly complicated use of generics

    ArrayList implements List (hence is a List), but consider:

    List<List<String>> definition = makeList();

    This statement is legal if makeList() returns an ArrayList<List<String>>

    It is not legal if makeList() returns an ArrayList<ArrayList<String>>
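
    The generics point can be checked with a few lines of Java. This is only a sketch: makeList follows the name used on the slide, while GenericsDemo and makeNestedArrayList are made-up names. The rejected assignment is left as a comment so that the file still compiles.

```java
import java.util.ArrayList;
import java.util.List;

class GenericsDemo {

    // An ArrayList<List<String>> is a List<List<String>>, so callers may assign it to one.
    static ArrayList<List<String>> makeList() {
        return new ArrayList<List<String>>();
    }

    // Hypothetical second factory: the element type is ArrayList<String>, not List<String>.
    static ArrayList<ArrayList<String>> makeNestedArrayList() {
        return new ArrayList<ArrayList<String>>();
    }

    public static void main(String[] args) {
        List<List<String>> definition = makeList();   // compiles
        definition.add(new ArrayList<String>());      // inner lists can still be ArrayLists

        // List<List<String>> bad = makeNestedArrayList();   // rejected: incompatible types

        // A bounded wildcard accepts it, but then no new inner lists can be added through it.
        List<? extends List<String>> view = makeNestedArrayList();
        System.out.println(definition.size() + " " + view.size());
    }
}
```

    Generic types are invariant in their type argument: even though ArrayList<String> is a List<String>, an ArrayList<ArrayList<String>> is not a List<List<String>>.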


    Second cut: Overview

    We can eliminate the compound generics by using more than one class

    public class Grammar implements Iterable

        Map<String, Definition> grammar; // all the rules

    public class Definition

        List<Rule> definition;

    public class Rule

        String lhs; // the definiendum

        List<String> rhs; // the definiens


    Second cut: More detail

    public class Grammar implements Iterable

        Map<String, Definition> grammar; // rules for all the nonterminals

        public Grammar() { grammar = new TreeMap<String, Definition>(); } // constructor

        public void addRule(String rule) throws IllegalArgumentException

        public Definition getDefinition(String nonterminal)

        public Iterator iterator()

        public void print()

    public class Definition

        List<Rule> definition; // all definitions for some unspecified nonterminal

        Definition() // constructor

        void addRule(Rule rule)

        Rule getOneRule()

        public String toString()

    public class Rule

        String lhs; // the definiendum

        List<String> rhs; // the definiens

        Rule(String text) // constructor

        public String getLeftHandSide()

        public List<String> getRightHandSide()

        public String toString()


    Second cut: Evaluation

    Advantages:

    Simplifies use of generics

    Disadvantages:

    Many more methods

    Definitions are unattached from nonterminal being defined

    This makes it easier to parse definitions

    Seems a bit unnatural

    Need to pass the tokenizer around as an additional argument

    Doesn't help with the problem that the caller still has to separate out the definiendum from the definiens


    Fourth cut, not quite as brief

    The class AbstractList provides a skeletal implementation of the List interface...the

    programmer needs only to extend this class and provide

    implementations for the get(int index) and size() methods.

    I tried this, but...

    If I don't know how AbstractList is implemented, how can I

    write these methods?

    No book or API class that I looked at provided any clues

    I may be missing something, but it looks like the only thing

    to do is to look at the source code for some of Java's classes

    (like ArrayList) to see how they do it

    Doable, but too much work!
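
    For what it is worth, one way to supply get(int index) and size() is to delegate to a backing list held in a field. The following is a minimal sketch of that idea, not the fourth-cut code: the class name TokenList and the choice of an ArrayList as backing storage are assumptions.

```java
import java.util.AbstractList;
import java.util.ArrayList;
import java.util.List;

// Hypothetical class for illustration; the slides do not show the fourth-cut code.
class TokenList extends AbstractList<String> {

    // Backing storage (an assumption): every operation simply delegates to it.
    private final List<String> items = new ArrayList<>();

    // The two methods AbstractList requires for a read-only list.
    @Override public String get(int index) { return items.get(index); }
    @Override public int size() { return items.size(); }

    // AbstractList throws UnsupportedOperationException for these unless they are overridden.
    @Override public void add(int index, String element) { items.add(index, element); }
    @Override public String set(int index, String element) { return items.set(index, element); }
    @Override public String remove(int index) { return items.remove(index); }

    public static void main(String[] args) {
        TokenList tokens = new TokenList();
        tokens.add("a");   // inherited add(E) ends up calling add(size(), element)
        tokens.add("b");
        System.out.println(tokens);
    }
}
```

    Iteration, equals, hashCode, and toString are inherited from AbstractList and its superclass, which build them on top of get and size.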


    Letting go of a constraint

    It is good practice to use a more general class or interface if you don't need the services of a more specific class

    In this problem, I want to use lists, but I don't care whether they are ArrayLists, or LinkedLists, or something else

    Hence, I generally prefer declarations like List<String> list = new ArrayList<String>();

    In this case, however, trying to do this just seems to be the cause of many of the problems

    What happens if I just make all lists ArrayLists?


    Fifth (and final) cut

    public class Grammar

    Map<String, Definitions> grammar; // rules for all the nonterminals

    public Grammar() { grammar = new TreeMap<String, Definitions>(); }

    public void addRule(String rule) throws IllegalArgumentException

    public Definitions getDefinitions(String nonterminal)

    public void print()

    private void addToGrammar(String lhs, SingleDefinition definition)

    private static boolean isNonterminal(String s) { return s.startsWith("


    Explanation I of final BNF API

    Example: ::= |

    The above is a rule

    is the definiendum (the thing being defined)

    is a single definition of

    is another single definition of

    So,

    There is a SingleDefinition consisting of the ArrayList [ "" ]

    Another SingleDefinition consists of the ArrayList [ "", "" ]

    A Definitions object is a list of single definitions, in this case: [ [ "" ], [ "", "" ] ]

    A Grammar maps nonterminals onto their definitions; thus, a grammar containing the above rule would include the mapping: "" [ [ "" ], [ "", "" ] ]
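
    The nesting described above can be made concrete with a small sketch. It assumes that SingleDefinition and Definitions are thin subclasses of ArrayList, which is one reading of "just ArrayLists"; the rule <a> ::= <b> | <c> <d> and the class FinalCutSketch are placeholders invented for the illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Map;
import java.util.TreeMap;

// Assumed shapes: each class is an ArrayList with a more meaningful name.
class SingleDefinition extends ArrayList<String> { }

class Definitions extends ArrayList<SingleDefinition> { }

class FinalCutSketch {

    /** Builds the map entry for the placeholder rule <a> ::= <b> | <c> <d>. */
    static Map<String, Definitions> exampleGrammar() {
        SingleDefinition first = new SingleDefinition();
        first.add("<b>");

        SingleDefinition second = new SingleDefinition();
        second.addAll(Arrays.asList("<c>", "<d>"));

        Definitions definitions = new Definitions();
        definitions.add(first);
        definitions.add(second);

        Map<String, Definitions> grammar = new TreeMap<>();
        grammar.put("<a>", definitions);   // nonterminal -> all of its definitions
        return grammar;
    }

    public static void main(String[] args) {
        System.out.println(exampleGrammar());   // prints {<a>=[[<b>], [<c>, <d>]]}
    }
}
```

    Because both classes inherit everything from ArrayList, callers use the ordinary list methods (get, size, iteration) on them.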


    Explanation II of final BNF API

    A Grammar is a set of mappings from definienda (nonterminals)

    to definitions, along with some operations on that set of

    definitions

    You can addRule(String rule) to a Grammar

    The rule is parsed, and an entry made in the map

    Definitions for a nonterminal may be together, as in the above example, or

    separate:

    ::=

    ::=

    You can get a list of all the Definitions for a given nonterminal

    You can print the complete Grammar


    Final version: Evaluation

    Advantages:

    Grammar has one constructor and three public methods

    Definitions and SingleDefinition are just ArrayLists, so there are no new

    methods to learn

    All rule parsing is consolidated into a single public method, addRule(String rule)

    I was able to come up with more meaningful names for classes

    Disadvantages:

    User has to do a bit more list manipulation; in particular, choosing a random element from a list

    This doesn't seem like an appropriate thing to have in a grammar, anyway


    Morals

    Weeks of programming can save you hours of planning.

    The mistake most programmers make is to use the first design that comes to mind

    This usually can be made to work, but it's seldom optimal

    Much as we would like to pretend otherwise, programming is an iterative process--we design, then try to implement, then change the design, then try to implement....

    TDD (Test-Driven Development) is a lightweight (low cost) way to try out a design

    For example, in my first design, I discovered how difficult it was to write tests that used the complex generics

    Consequently, I never even tried to implement this first design

    Morals to take home:

    Be flexible; try out more than one design

    Do TDD


    Aside: Tokenizing the input grammar

    I wrote a BnfTokenizer class that returns every token as a String

    Nonterminals keep their angle brackets, and may be multi-word

    Double-quoted strings are returned as a single token (minus the double quotes)

    ::= and | are returned as single tokens

    BnfTokenizer uses StreamTokenizer

    It provides two constructors, BnfTokenizer() and BnfTokenizer(String text)

    And two methods, void tokenize(String text) and String nextToken()
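
    The token rules listed above can be illustrated with a short sketch. Note that this is not the BnfTokenizer from the slides: it swaps StreamTokenizer for a regular expression to keep the example small, and the class name BnfTokenSketch and its static tokenize method are assumptions.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical regex-based stand-in for illustration; the slides' class uses StreamTokenizer.
class BnfTokenSketch {

    // Alternatives, tried in order: <nonterminal>, "quoted string", ::=, |, any other word.
    private static final Pattern TOKEN =
        Pattern.compile("<[^>]*>|\"([^\"]*)\"|::=|\\||[^\\s|\"<]+");

    /** Splits one line of BNF into tokens; assumes words are separated by whitespace. */
    static List<String> tokenize(String text) {
        List<String> tokens = new ArrayList<>();
        Matcher matcher = TOKEN.matcher(text);
        while (matcher.find()) {
            // Group 1 is non-null only for quoted strings, and it excludes the quotes.
            tokens.add(matcher.group(1) != null ? matcher.group(1) : matcher.group());
        }
        return tokens;
    }

    public static void main(String[] args) {
        System.out.println(tokenize("<noun phrase> ::= \"the\" <noun> | cat"));
    }
}
```

    The sketch silently skips an unmatched < or double quote; a real tokenizer would probably report those as errors.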


    The End