Recursive descent parsing · Option 1: “recursive-descent parser” You’re writing one this...

Page 2:

Recursive descent parsing

Dan S. Wallach and Mack Joyner, Rice University

Copyright © 2016 Dan Wallach, All Rights Reserved

Page 3:

Mac users: don’t upgrade!

IntelliJ crashes. Don’t upgrade! Watch Piazza for instructions once they fix it and we verify it works.

Page 4:

Last time: recursive grammars

Vocabulary reminder: context-free grammars. The LHS of every production is always a single non-terminal.

• Example: matched #’s of a’s and b’s
  • S→A
  • A→a A b
  • A→∅ (empty)
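The matched a’s and b’s grammar above maps almost directly onto a recursive function. The following is a minimal sketch, not code from the course: the class name MatchedAB and the convention of returning an index (or -1 on failure) are assumptions made for illustration.

```java
final class MatchedAB {
    // A -> a A b | (empty).
    // Returns the index just past the A that starts at pos, or -1 on failure.
    static int parseA(String s, int pos) {
        if (pos < s.length() && s.charAt(pos) == 'a') {
            // Production A -> a A b: eat the 'a', parse the inner A, then demand a 'b'.
            int afterInner = parseA(s, pos + 1);
            if (afterInner >= 0 && afterInner < s.length() && s.charAt(afterInner) == 'b') {
                return afterInner + 1;
            }
            return -1;
        }
        // Production A -> (empty): consume nothing.
        return pos;
    }

    // S -> A, and the whole input must be consumed.
    static boolean matches(String s) {
        return parseA(s, 0) == s.length();
    }
}
```

For example, MatchedAB.matches("aabb") holds, while "aab" and "abab" are rejected.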

We’ve seen these before!

Page 5:

Data definitions are also grammars!

A list is: a head value and a tail list; or an empty list
  List→value List
  List→∅

A tree is: a value, a left-tree, and a right-tree; or an empty-tree
  Tree→value Tree Tree
  Tree→∅

Page 6:

What’s a parser?

Given a sentence (list of tokens) and a language (grammar):
• Construct a derivation or parse tree if the sentence is in the language.
• If the sentence is not in the language, return some sort of error.

Last week’s project: constructing the list of tokens. Regular expressions to match specific tokens.

This week’s project: constructing the parse tree. Recursive code to process the tokens.

Page 7:

What’s a parser

Parsers: list of tokens → parse tree

    { "name": "Dan Wallach", "email": "[email protected]", "classes": [ "Comp215", "Comp427" ] }

    JObject
        JKeyValue
            JString name
            JString Dan Wallach
        JKeyValue
            JString email
            JString [email protected]
        JKeyValue
            JString classes
            JArray
                JString Comp215
                JString Comp427

Page 11:

How do you write a parser?

Option 1: “recursive-descent parser” You’re writing one this week for JSON. By hand.

Option 2: “table-driven parser” Tools take a BNF grammar and write your parser for you automatically. You’ll see this more in Comp412 and elsewhere.

Page 12:

Theory stuff

The set of languages you can parse with recursive descent (top-down) and one-token lookahead is called LL(1).

The set of languages you can parse with a table-driven (bottom-up) parser and one-token lookahead is called LR(1).

They’re not equivalent (every LL(1) grammar is also LR(1), but not the other way around), but for Comp215, we don’t really care. It’s at least useful that you’ve heard the terms.

Page 13:

Recursive-descent parsing

Let’s use a simple example: s-expressions. Every LISP-family program is an s-expression.

Example: factorial

    (define (factorial n)
      (if (= n 0)
          1
          (* n (factorial (- n 1)))))

We’ve got only three kinds of tokens:
• open parenthesis
• close parenthesis
• “words” (terminals)
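The three token kinds above can be recognized with a single regular expression. This is a rough sketch using java.util.regex rather than the course’s NamedMatcher; the names SexprTokenizer, Kind, and Token are hypothetical and chosen only for this example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class SexprTokenizer {
    // Hypothetical token kinds; the course's Scanner may name these differently.
    enum Kind { OPEN, CLOSE, WORD }

    static final class Token {
        final Kind kind;
        final String text;

        Token(Kind kind, String text) {
            this.kind = kind;
            this.text = text;
        }
    }

    // Group 1: open paren, group 2: close paren, group 3: a run of anything
    // that is neither whitespace nor a parenthesis (a "word").
    private static final Pattern TOKEN = Pattern.compile("(\\()|(\\))|([^\\s()]+)");

    static List<Token> tokenize(String input) {
        List<Token> tokens = new ArrayList<>();
        Matcher m = TOKEN.matcher(input);
        // find() skips over whitespace, since whitespace matches no alternative.
        while (m.find()) {
            if (m.group(1) != null) {
                tokens.add(new Token(Kind.OPEN, "("));
            } else if (m.group(2) != null) {
                tokens.add(new Token(Kind.CLOSE, ")"));
            } else {
                tokens.add(new Token(Kind.WORD, m.group(3)));
            }
        }
        return tokens;
    }
}
```

Calling SexprTokenizer.tokenize("(a (b))") yields six tokens: OPEN, WORD a, OPEN, WORD b, CLOSE, CLOSE.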

Page 16:

S-expression grammar

S→word    an s-expression can be any word (i.e., any terminal)

S→( L )   an s-expression can be parentheses around a list

L→∅       a list can be nothing

L→S L     or an s-expression followed by another list

Hey, look, it’s the data definition of a list!

Pages 17-21:

The grammar cannot be left-recursive

Why? Because we want our parser to “eat” one token each time. Guarantees recursion will terminate. (By inductive proof!) A left-recursive production such as L→L S would have the parser call itself again before eating anything, so it would never make progress.

S→word    RHS is a terminal. Definitely not left-recursive.

S→( L )   We eat a left-paren, so we’re always making progress.

L→∅

L→S L     Seems scary, but not a big deal, because a non-empty list starts off with an s-expression, so a word or a left-paren.

Pages 22-24:

The grammar cannot be ambiguous

Why? Because we want exactly one valid parse tree. Uniqueness is essential. Otherwise, what does the data “mean”?

S→word

S→( L )   The first token (word or open-paren) tells us unambiguously which production we’re supposed to use.

L→∅

L→S L     If we find a word or open-paren, then we’re dealing with a non-empty list (so the bottom production). If we find a close-paren, then we’re dealing with an empty-list. No ambiguity.
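The one-token decisions described above can be sketched as a small recognizer that only answers yes or no. This is an illustration under simplifying assumptions, not the course’s Parser: tokens are plain strings, "(" and ")" stand for the paren tokens, anything else counts as a word, and the class name SexprRecognizer is made up.

```java
import java.util.Arrays;
import java.util.List;

final class SexprRecognizer {
    // S -> word | ( L ). Returns the index just past S, or -1 on failure.
    static int parseS(List<String> tokens, int pos) {
        if (pos >= tokens.size()) {
            return -1;                            // nothing left to eat
        }
        String lookahead = tokens.get(pos);
        if (lookahead.equals(")")) {
            return -1;                            // no S production starts with ")"
        }
        if (!lookahead.equals("(")) {
            return pos + 1;                       // S -> word: eat the word
        }
        int afterList = parseL(tokens, pos + 1);  // S -> ( L ): we ate the "("
        if (afterList < 0 || afterList >= tokens.size()) {
            return -1;
        }
        return tokens.get(afterList).equals(")") ? afterList + 1 : -1;
    }

    // L -> (empty) | S L. The lookahead token picks the production.
    static int parseL(List<String> tokens, int pos) {
        if (pos >= tokens.size() || tokens.get(pos).equals(")")) {
            return pos;                           // L -> (empty): consume nothing
        }
        int afterHead = parseS(tokens, pos);      // L -> S L: S eats at least one token
        return afterHead < 0 ? -1 : parseL(tokens, afterHead);
    }

    static boolean accepts(String... tokens) {
        List<String> list = Arrays.asList(tokens);
        return parseS(list, 0) == list.size();
    }
}
```

Every recursive call either consumes a token first or is followed by one that does, which is the termination argument from the left-recursion discussion.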

Page 25:

Writing a recursive-descent parser

Example code in this week’s code dump: edu.rice.sexpr

Three main .java files:
• Scanner: uses our NamedMatcher to tokenize an s-expression
• Value: functional data definition for our s-expressions
• Parser: mutually recursive functions that eat tokens, return Values

Pages 26-29:

Data definition

    public interface Value {
        enum ValueType { SEXPR, WORD }

        class Sexpr implements Value {
            private final IList<Value> valueList;
            ...
        }

        class Word implements Value {
            private final String word;
            ...
        }
    }

We’ve got two kinds of values we care about: words and s-expressions.

An Sexpr is a Value that has a list of Values inside.

A Word is a Value with a string inside.

Pages 30-33:

Parser engineering

Externally visible builder / factory method. If the parser fails, you get back Option.none().

    public static Option<Value> parseSexpr(String input) { ... }

Internal helper methods:

    // Given a list of tokens, return an s-expression “result” if you can.
    static Option<Result<Value>> makeSexpr(IList<Token<SexprPatterns>> tokenList) { ... }

    // Given a list of tokens, return a word “result” if you can.
    static Option<Result<Value>> makeWord(IList<Token<SexprPatterns>> tokenList) { ... }

Page 34:

Internal results management

If the parsing function succeeds, it returns Option.some() of:
• a production (Word, Sexpr, etc.)
• a list of remaining tokens

And if it fails? It returns Option.none().

    static class Result<T> {
        public final T production;
        public final IList<NamedMatcher.Token<Scanner.SexprPatterns>> tokens;
        . . .
    }

Pages 35-41:

List parsing is recursive, just like lists

    static Option<Result<Value>> makeSexpr(IList<Token<SexprPatterns>> tokenList) {
        return tokenList.match(
            // If there are no tokens left, we can’t make anything!
            emptyList -> Option.none(),
            // Sexpr must start with an open-paren, and if so…
            (token, remainingTokens) -> (token.type == SexprPatterns.OPEN)
                // We want to get a list of all the Values within.
                ? makeSexprHelper(remainingTokens)
                    .map(result -> new Result<>(new Sexpr(result.production), result.tokens))
                : Option.none());
    }

    private static Option<Result<IList<Value>>> makeSexprHelper(IList<Token<SexprPatterns>> tokenList) {
        return tokenList.match(
            // We require at least one token, a close-paren.
            emptyList -> Option.none(),
            (token, remainingTokens) -> (token.type == SexprPatterns.CLOSE)
                // If we find close-paren, then we return an empty-list.
                ? Option.some(new Result<>(List.makeEmpty(), remainingTokens))
                // Otherwise…
                : makeValue(tokenList)
                    .flatmap(headResult -> makeSexprHelper(headResult.tokens)
                        .map(tailResults ->
                            new Result<>(tailResults.production.add(headResult.production), tailResults.tokens))));
    }

Pages 42-46:

Recursive list parsing

    // tokenList: every token (the argument to the function)
    // token: the first token (from the match(): which we know, at this point, is not a close-paren)
    // remainingTokens: all but the first token (also from the match())
    // (the last close-paren below closes the enclosing tokenList.match call)

    makeValue(tokenList)
        .flatmap(headResult -> makeSexprHelper(headResult.tokens)
            .map(tailResults ->
                new Result<>(tailResults.production.add(headResult.production), tailResults.tokens))));

• makeValue(tokenList): recursively, see if it’s another Sexpr or Word. (We ate the open-paren beforehand; we’re making progress!)
• makeSexprHelper(headResult.tokens): if it succeeded (Option.some), then we’ll recursively call ourselves with the remaining tokens.
• tailResults.production.add(headResult.production): and if that succeeded, we’ll take the IList<Value> from the tail and put our head value on the front.
• tailResults.tokens: and whatever tokens we didn’t eat, after we hit the close-paren, we’ll pass along to our parent for further processing.
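To see the whole shape of this recursion in one runnable piece, here is a rough stand-alone sketch that swaps in java.util.Optional and java.util.List for the course’s Option and IList. Everything in it is an assumption made for illustration: tokens are plain strings, a word is represented as a String, an s-expression as a List of values, and MiniSexprParser, makeList, and parse are made-up names.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Optional;

final class MiniSexprParser {
    // A parsed production plus the tokens that were not consumed.
    static final class Result<T> {
        final T production;
        final List<String> tokens;

        Result(T production, List<String> tokens) {
            this.production = production;
            this.tokens = tokens;
        }
    }

    // Value -> word | ( List ). A word is a String; an s-expression is a List<Object>.
    static Optional<Result<Object>> makeValue(List<String> tokens) {
        if (tokens.isEmpty()) {
            return Optional.empty();
        }
        String first = tokens.get(0);
        List<String> rest = tokens.subList(1, tokens.size());
        if (first.equals("(")) {
            return makeList(rest).map(r -> new Result<Object>(r.production, r.tokens));
        }
        if (first.equals(")")) {
            return Optional.empty();
        }
        return Optional.of(new Result<Object>(first, rest));
    }

    // Parses values up to and including the matching close-paren.
    static Optional<Result<List<Object>>> makeList(List<String> tokens) {
        if (tokens.isEmpty()) {
            return Optional.empty(); // ran out of tokens before the close-paren
        }
        if (tokens.get(0).equals(")")) {
            return Optional.of(new Result<List<Object>>(
                Collections.emptyList(), tokens.subList(1, tokens.size())));
        }
        return makeValue(tokens).flatMap(head ->
            makeList(head.tokens).map(tail -> {
                // Build a new list with the head value in front of the tail values.
                List<Object> values = new ArrayList<>();
                values.add(head.production);
                values.addAll(tail.production);
                return new Result<List<Object>>(
                    Collections.unmodifiableList(values), tail.tokens);
            }));
    }

    // Succeeds only if exactly one value is parsed and no tokens are left over.
    static Optional<Object> parse(String... tokens) {
        return makeValue(Arrays.asList(tokens))
            .filter(r -> r.tokens.isEmpty())
            .map(r -> r.production);
    }
}
```

Because no function mutates its input, a failed attempt leaves the caller’s token list untouched, which is what lets the parent simply carry on with whatever tokens come back in the Result.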

Page 47:

Review: Option.map and Option.flatmap

    default <R> Option<R> map(Function<T, R> func) {
        return flatmap(val -> some(func.apply(val)));
    }

    default <R> Option<R> flatmap(Function<T, Option<R>> func) {
        return match(Option::none, func);
    }
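The same distinction exists in the standard library’s java.util.Optional, which makes it easy to experiment with outside the course code. The sketch below is only an analogy to the course’s Option; the helper names parseInt, percentOf, doubled, and percent are hypothetical.

```java
import java.util.Optional;

final class OptionalDemo {
    // Hypothetical helper: a parse that may fail.
    static Optional<Integer> parseInt(String s) {
        try {
            return Optional.of(Integer.parseInt(s));
        } catch (NumberFormatException e) {
            return Optional.empty();
        }
    }

    // Hypothetical helper: a second step that may also fail.
    static Optional<Integer> percentOf(int n) {
        return n == 0 ? Optional.empty() : Optional.of(100 / n);
    }

    // map: the function returns a plain value, and map wraps it for us.
    static Optional<Integer> doubled(String s) {
        return parseInt(s).map(n -> n * 2);
    }

    // flatMap: the function already returns an Optional, so it is not wrapped again.
    static Optional<Integer> percent(String s) {
        return parseInt(s).flatMap(OptionalDemo::percentOf);
    }
}
```

Using map in percent would instead produce an Optional<Optional<Integer>>, which is exactly the double wrapping that flatmap avoids.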

Pages 48-51:

flatmap, map, etc.

    // tokenList: every token (the argument to the function)
    // token: the first token (from the match(): which we know, at this point, is not a close-paren)
    // remainingTokens: all but the first token (also from the match())
    // (the last close-paren below closes the enclosing tokenList.match call)

    makeValue(tokenList)
        .flatmap(headResult -> makeSexprHelper(headResult.tokens)
            .map(tailResults ->
                new Result<>(tailResults.production.add(headResult.production), tailResults.tokens))));

• makeValue(tokenList) returns Option<Result<Value>>.
• The lambda passed to flatmap returns Option<Result<IList<Value>>> already, so we don’t want to say Option.some() of it. It’s already an Option. Thus flatmap.
• The inner map gets us Result<IList<Value>> from the tail, and then we’re adding on the value from the head. Only runs if makeSexprHelper returned Option.some().

Pages 52-55:

Cool trick: trying multiple productions

    private static final IList<Function<IList<Token<SexprPatterns>>, Option<Result<Value>>>> MAKERS =
        List.of(Parser::makeSexpr, Parser::makeWord);

    static Option<Result<Value>> makeValue(IList<Token<SexprPatterns>> tokenList) {
        return MAKERS.oflatmap(x -> x.apply(tokenList)).ohead();
    }

• MAKERS: read this slowly. A list of lambdas that take lists of tokens and return an optional result. It’s less ugly than it seems.
• oflatmap: the resulting list has all the Option.some() values, and none of the Option.none() values. So no more Option!
• ohead: if the list is non-empty, this returns Option.some() of the head. If the list is empty, this returns Option.none().
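The “try each maker, keep the first success” idea can be approximated with standard Java streams. This is a sketch by analogy only: it uses java.util.Optional and java.util.stream instead of the course’s oflatmap and ohead, and the makers asNumber and asWord are invented for the example.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Optional;
import java.util.function.Function;

final class FirstSuccess {
    // Hypothetical maker: succeeds only on an integer literal.
    static Optional<String> asNumber(String s) {
        return s.matches("-?\\d+") ? Optional.of("number:" + s) : Optional.empty();
    }

    // Hypothetical maker: succeeds only on a run of letters.
    static Optional<String> asWord(String s) {
        return s.matches("[A-Za-z]+") ? Optional.of("word:" + s) : Optional.empty();
    }

    // A list of functions, each of which may or may not produce a result.
    private static final List<Function<String, Optional<String>>> MAKERS =
        Arrays.asList(FirstSuccess::asNumber, FirstSuccess::asWord);

    static Optional<String> classify(String s) {
        return MAKERS.stream()
            .map(maker -> maker.apply(s))   // try each maker on the same input
            .filter(Optional::isPresent)    // drop the failures
            .map(Optional::get)             // unwrap the successes
            .findFirst();                   // first success, or Optional.empty()
    }
}
```

Since the makers have no side effects, trying them in sequence on the same input is safe. Streams are lazy, so in this version makers after the first success are not invoked.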

Page 56:

Functional program engineering

Each maker-method has no side-effects, so we can try them all. No worries that one will crash the other one.

• Mutating parsers tend to “push back” tokens they don’t need

• Testing and debugging them is tedious!

    private static final IList<Function<IList<Token<SexprPatterns>>, Option<Result<Value>>>> MAKERS =
        List.of(Parser::makeSexpr, Parser::makeWord);

    static Option<Result<Value>> makeValue(IList<Token<SexprPatterns>> tokenList) {
        return MAKERS.oflatmap(x -> x.apply(tokenList)).ohead();
    }

Page 58:

Your project this week: Parse JSON

The grammar is pretty simple. We’ve already stubbed out the functions you’ll implement.

Start early. This will take some work!

Page 59:

Lab this week is on paper!

Bring a pencil or pen.

(Time to get you warmed up for the midterm.)

Page 60:

Live coding: writing complex expressions

We’ll walk through how to write and debug something like this:

    // (the last close-paren below closes the enclosing tokenList.match call)
    makeValue(tokenList)
        .flatmap(headResult -> makeSexprHelper(headResult.tokens)
            .map(tailResults ->
                new Result<>(tailResults.production.add(headResult.production), tailResults.tokens))));