Build a compiler using C#, Irony and RunSharp.

Post on 19-Jun-2015


Creating a Compiler with Irony and RunSharp

James M. Curran
NYC CodeCamp, Sept 13, 2014


Brought to you by : MARQUEE SPONSOR

Who the Heck am I?

• Simple Country Programmer
• 20+ Years in Software Development
• Assembler, C, C++, C#
• Worked in Medical, Financial, Online Retailing, and Embedded systems
• Microsoft MVP C++ (1994-2004)
• Currently Itinerant Programmer at One Call Medical
• Write blog: HonestIllusion.com
• Designer: NJTheater.Com
• Resume'n'stuff: NovelTheory.com



Yeah, Comic Sans. Deal with it.

“Naked Came the Null Delegate”

➲ Collaborative blog story telling a tale of sex, betrayal and .NET coding.

➲ Serialized across the blogs of several noted .NET writers including Charles Petzold and Jon Skeet.

➲ Dormant since 2010, but new writers wanted.

➲ http://nakedcamethenulldelegate.wordpress.com


Agenda

• The Language
• The Parser
• The Code Generators (plural)
• The Command-line interface


Shakespeare Programming Language - History

Designed by Jon Åslund and Karl Hasselström in 2001.

The design goal was to make a language with beautiful source code that resembled Shakespeare plays.

Original compiler written in YACC (actually Bison), producing C code.

I wrote the C# compiler in Sept 2013. Apparently I was the first person to touch the code in 12 years.

Since then, JavaScript, Python, Perl, and Java versions have been open-sourced and http://shakespearelang.org/ was launched.

Variables

Must be the name of an actual character from one of Shakespeare's plays.

That’s exactly 152 specific allowable variable names.

Some names are more than one word!

Only data type is an integer, but it can be used for a character.

It's really a stack of integers.
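A rough sketch of what "an integer that is really a stack of integers" could look like in C#. The class name `CharacterSketch` and its members are hypothetical, not taken from the actual compiler's support library. The `Remember`/`Recall` names follow the phrasing used in the example program later in the deck, and their push/pop meaning is assumed here.

```csharp
using System.Collections.Generic;

// Hypothetical model of a Shakespeare variable: one current value
// plus a private stack of saved values.
public class CharacterSketch
{
    private readonly Stack<int> _stack = new Stack<int>();

    public string Name { get; }
    public int Value { get; set; }

    public CharacterSketch(string name)
    {
        Name = name;
    }

    // "Remember ...": push a value onto this character's stack.
    public void Remember(int value)
    {
        _stack.Push(value);
    }

    // "Recall ...": pop the top of the stack into the current value.
    // Throws InvalidOperationException if the stack is empty.
    public void Recall()
    {
        Value = _stack.Pop();
    }
}
```

Usage would be along the lines of `var othello = new CharacterSketch("Othello"); othello.Remember(72); othello.Recall();`, after which `othello.Value` is 72.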

Numbers/Constants

Nouns that are “nice” or neutral equal 1.

Nouns that are “not nice” equal -1.

Adjectives (nice or not) multiply by 2.

Examples:
“big hairy hound” = 2 * 2 * (+1) = 4
“sorry little codpiece” = 2 * 2 * (-1) = -4

Can be combined with “Sum of ___ and ___”, “Difference between”, “Square of”, “Cube of”

41 neutral, 13 positive, and 25 negative nouns
20 neutral, 36 positive, and 32 negative adjectives
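The arithmetic on this slide can be sketched in a few lines of C#. This is only an illustration: `ConstantSketch` is a hypothetical helper, the word lists hold just the four adjectives and two nouns from the examples, and the sketch does not check that adjectives agree with the noun the way the real grammar does.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper: evaluates a constant phrase such as "big hairy hound".
public static class ConstantSketch
{
    // Tiny illustrative word lists; the real language has many more entries.
    private static readonly HashSet<string> Adjectives =
        new HashSet<string> { "big", "hairy", "sorry", "little" };
    private static readonly HashSet<string> NiceOrNeutralNouns =
        new HashSet<string> { "hound" };
    private static readonly HashSet<string> NotNiceNouns =
        new HashSet<string> { "codpiece" };

    public static int Evaluate(string phrase)
    {
        var words = phrase.Split(new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries);
        if (words.Length == 0)
            throw new ArgumentException("Empty phrase.");

        // The last word is the noun: +1 if nice or neutral, -1 if not nice.
        string noun = words[words.Length - 1];
        int value;
        if (NiceOrNeutralNouns.Contains(noun)) value = 1;
        else if (NotNiceNouns.Contains(noun)) value = -1;
        else throw new ArgumentException("Unknown noun: " + noun);

        // Every adjective in front of the noun doubles the value.
        for (int i = 0; i < words.Length - 1; i++)
        {
            if (!Adjectives.Contains(words[i]))
                throw new ArgumentException("Unknown adjective: " + words[i]);
            value *= 2;
        }
        return value;
    }
}
```

With these lists, `ConstantSketch.Evaluate("big hairy hound")` gives 4 and `ConstantSketch.Evaluate("sorry little codpiece")` gives -4, matching the slide.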

Input and Output

• "Open your heart" outputs the variable's value as a number.
• "Speak your mind" outputs the corresponding ASCII character.
• "Listen to your heart" causes the variable to receive input from the user as a number.
• "Open your mind" receives input as a character.
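As a sketch, the four I/O statements map onto ordinary `TextReader`/`TextWriter` calls. The class and method names below are hypothetical (they simply echo the phrases) and are not the real support library's API. Reading one number per line is an assumption.

```csharp
using System.IO;

// Hypothetical mapping of the four I/O phrases onto .NET text streams.
public static class IoSketch
{
    // "Open your heart": write the value as a number.
    public static void OpenYourHeart(TextWriter output, int value)
    {
        output.Write(value);
    }

    // "Speak your mind": write the character with that code.
    public static void SpeakYourMind(TextWriter output, int value)
    {
        output.Write((char)value);
    }

    // "Listen to your heart": read a number (assumed: one number per line).
    // Throws if the line is missing or is not a valid integer.
    public static int ListenToYourHeart(TextReader input)
    {
        return int.Parse(input.ReadLine());
    }

    // "Open your mind": read a single character; returns -1 at end of input.
    public static int OpenYourMind(TextReader input)
    {
        return input.Read();
    }
}
```

Passing `Console.In` and `Console.Out` gives the usual console behavior, and a `StringReader`/`StringWriter` pair makes the same code easy to test.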

Conditional Statements and Gotos

• An if/then statement is phrased as a question posed by a character.
• The words "as [any adjective] as" represent a test for equality; "better" and "worse" correspond to greater than and less than.
• A subsequent line, starting "if so" or "if not," determines what happens in response to the truth or falsehood of the original condition.
• A goto statement begins "Let us," "We shall," or "We must," continues "return to" or "proceed to," and then gives an act or scene.
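One way to picture how a question plus an "if not" goto could translate into C#: the question stores its answer in a flag, and the conditional jump tests that flag. This is a hand-written sketch, not output from the actual compiler; `GotoSketch`, `CountTo`, and the label name are made up, and `limit` is assumed to be at least 1.

```csharp
// Hypothetical translation of a scene that loops until a question is true.
public static class GotoSketch
{
    public static int CountTo(int limit)
    {
        int value = 0;
        bool truth;

    scene2:
        // "You are as good as the sum of yourself and a cat."
        value = value + 1;

        // "Am I as good as ...?" stores the answer to the question.
        truth = value == limit;

        // "If not, let us return to scene II."
        if (!truth) goto scene2;

        return value;
    }
}
```

Scenes become labels, so "return to" and "proceed to" both turn into the same plain `goto`.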

Example

    Outputting Input Reversedly.

    Othello, a stacky man.
    Lady Macbeth, who pushes him around till he pops.

    Act I: The one and only.
    Scene I: In the beginning, there was nothing.

    [Enter Othello and Lady Macbeth]

    Othello:
        You are nothing!

    Scene II: Pushing to the very end.

    Lady Macbeth:
        Open your mind! Remember yourself.

    Othello:
        You are as hard as the sum of yourself and a stone wall.
        Am I as horrid as a flirt-gill?

    Lady Macbeth:
        If not, let us return to scene II. Recall your imminent death!

The original compiler

• Like most languages, specified in Backus–Naur Form (BNF).
• Compiler coded in YACC (“Yet Another Compiler Compiler”); actually, in Bison, GNU’s version.
• Its object code (output) is C source code, which then needs to be compiled.

BNF & YACC

• BNF is made up of “terminals” and “non-terminals”.
• Terminal symbols are the elementary symbols of the language.
• Non-terminals are made by combining terminals and other non-terminals.
• Example:
    digit = '0' | '1' | '2' | '3' | '4' | '5' | '6' | '7' | '8' | '9'
    integer = ['-'] digit+
• YACC is an LALR parser generator.
• YACC source files are BNF with C code attached to some non-terminals.
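To make the `digit`/`integer` example concrete, here is a hand-written recognizer for those two rules. It only illustrates what the grammar accepts; it is not what YACC or Bison would generate, and the class name `IntegerRecognizer` is made up.

```csharp
// Hand-written check for:  digit = '0'..'9'   integer = ['-'] digit+
public static class IntegerRecognizer
{
    // Terminal rule: a single character '0' through '9'.
    private static bool IsDigit(char c)
    {
        return c >= '0' && c <= '9';
    }

    // Non-terminal rule: an optional '-' followed by one or more digits.
    public static bool IsInteger(string text)
    {
        int pos = 0;

        // ['-'] : the minus sign is optional.
        if (pos < text.Length && text[pos] == '-') pos++;

        // digit+ : at least one digit is required.
        int firstDigit = pos;
        while (pos < text.Length && IsDigit(text[pos])) pos++;

        // Accept only if a digit was seen and the whole input was consumed.
        return pos > firstDigit && pos == text.Length;
    }
}
```

So `IsInteger("-42")` is true, while `IsInteger("-")` and `IsInteger("4x")` are false.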

Original Compiler (excerpt)

    Play: Title CharacterDeclarationList Act+
    CharacterDeclarationList: CharacterDeclaration+
    Act: ActHeader Scene+
    ActHeader: ACT_ROMAN COLON Comment EndSymbol

    Constant: ARTICLE UnarticulatedConstant
            | FIRST_PERSON_POSSESSIVE UnarticulatedConstant
            | SECOND_PERSON_POSSESSIVE UnarticulatedConstant
            | THIRD_PERSON_POSSESSIVE UnarticulatedConstant
            | NOTHING

1st tool: Irony

Irony is a development kit for implementing languages on the .NET platform.

Name is a play on IronPython, IronRuby, IronScheme : “Use Irony to create your own Iron* language”

Tries to replicate BNF in C# code via operator overloading.

Project home: https://irony.codeplex.com/
Available on NuGet: “irony”

Irony Example:

    public class ShakespeareGrammar : InterpretedLanguageGrammar
    {
        public ShakespeareGrammar() : base(false)
        {
            string[] endSymbols = { ".", "?", "!" };
            KeyTerm COLON = ToTerm(":", "colon");
            KeyTerm COMMA = ToTerm(",");

            // :  (declarations elided on the slide)
            var Act = new NonTerminal("Act");
            var Acts = new NonTerminal("Acts");
            Acts.Rule = MakePlusRule(Acts, Act);    // Acts: Act+

            var Play = new NonTerminal("Play");
            Play.Rule = Title + CharacterDeclarationList + Acts;

Example, cont:

            var Constant = new NonTerminal("Constant");
            Constant.Rule = Article + UnarticulatedConstant
                          | FirstPersonPossessive + UnarticulatedConstant
                          | SecondPersonPossessive + UnarticulatedConstant
                          | ThirdPersonPossessive + UnarticulatedConstant;

            UnarticulatedConstant.Rule = PositiveConstant | NegativeConstant;

            PositiveConstant.Rule = PositiveNoun
                                  | PositiveAdjective + PositiveConstant
                                  | NeutralAdjective + PositiveConstant;

            Question.Rule = Be + Value + Comparison + Value + QuestionSymbol;

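Multi-Word keywords

    // Turns a keyword that may span several words into one grammar term.
    // (Skip() needs "using System.Linq;".)
    BnfTerm MultiWordTermial(string term)
    {
        var parts = term.Split(' ');
        if (parts.Length == 1)
            return ToTerm(term);

        // Chain the individual words into a sequence: word1 + word2 + ...
        var nonterm = new NonTerminal(term);
        var expr = new BnfExpression(ToTerm(parts[0]));
        foreach (var part in parts.Skip(1))
        {
            expr += ToTerm(part);
        }
        nonterm.Rule = expr;
        return nonterm;
    }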

Equivalent Keywords

    // Builds one non-terminal that matches any keyword listed, one per
    // line, in an embedded resource file.
    NonTerminal BuildTerminal(string name, string filename)
    {
        var termList = new NonTerminal(name);
        var strm = assembly.GetManifestResourceStream(filename);
        var sw = new StreamReader(strm);
        var block = sw.ReadToEnd();
        var lines = block.Split('\n', '\r');

        if (lines.Any())
        {
            // Alternatives: line1 | line2 | ... (blank lines are skipped)
            var expr = new BnfExpression(MultiWordTermial(lines[0]));
            foreach (var line in lines.Skip(1))
            {
                if (!String.IsNullOrWhiteSpace(line))
                    expr |= MultiWordTermial(line);
            }
            termList.Rule = expr;
        }
        return termList;
    }

Irony's GrammarExplorer tool

Code Generation

• Ok, You’ve got a parser. Now what?

• Irony documentation drops off here.

• Here’s what I learned single-stepping with the debugger.

Create AST tree from source.

    var grammar = new ShakespeareGrammar();
    var parser = new Parser(grammar);
    var text = File.ReadAllText(filename);
    var tree = parser.Parse(text, filename);

    var app = new ScriptApp(parser.Language);
    var thread = new ScriptThread(app);
    var output = tree.Root.AstNode.Evaluate(thread);

Walking the Tree

As tree.Root.AstNode.Evaluate(thread) runs, node.DoEvaluate() is called for each node on the tree.

By default, the tree contains only generic AstNode objects.

So we need our own nodes in the tree.
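The idea of "our own nodes" can be shown without Irony at all. The sketch below uses made-up classes (`SketchNode` and friends), not Irony's `AstNode`: a base class exposes `Evaluate()`, each node type overrides `DoEvaluate()`, and evaluating the root walks the tree by having each node evaluate its children.

```csharp
// Stand-in for a tree node base class (hypothetical, not Irony's AstNode).
public abstract class SketchNode
{
    // Public entry point; forwards to the node-specific override.
    public object Evaluate()
    {
        return DoEvaluate();
    }

    protected abstract object DoEvaluate();
}

// Leaf node: a noun is worth 1 (nice or neutral) or -1 (not nice).
public sealed class NounSketchNode : SketchNode
{
    private readonly bool _nice;

    public NounSketchNode(bool nice)
    {
        _nice = nice;
    }

    protected override object DoEvaluate()
    {
        return _nice ? "1" : "(-1)";
    }
}

// Inner node: an adjective doubles whatever its child evaluates to.
public sealed class AdjectiveSketchNode : SketchNode
{
    private readonly SketchNode _inner;

    public AdjectiveSketchNode(SketchNode inner)
    {
        _inner = inner;
    }

    protected override object DoEvaluate()
    {
        // Emits source text rather than a number, like a code generator would.
        return "2*" + _inner.Evaluate();
    }
}
```

A tree of two adjective nodes wrapped around a not-nice noun evaluates to the string `2*2*(-1)`, which is the kind of expression text a source-emitting generator would write out.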

ASTNode (Fixed)

    var Act = new NonTerminal("Act", typeof(ActNode));
    var NegativeConstant = new NonTerminal("NegativeConstant", typeof(NegativeConstantNode));

    public class NegativeConstantNode : AstNode
    {
        protected override object DoEvaluate(ScriptThread thread)
        {
            var tw = thread.tc().Writer;
            if (AstNode1 is NegativeNounNode)
                return "(-1)";                  // a "not nice" noun
            else
                return string.Format("2*{0}", AstNode2.ToString(thread));   // adjective doubles the rest
        }
    }

    public class PlayNode : AstNode
    {
        protected override object DoEvaluate(Irony.Interpreter.ScriptThread thread)
        {
            var sw = thread.tc().Writer;
            sw.WriteLine("using System;");
            sw.WriteLine("using Shakespeare.Support;");
            sw.WriteLine();
            sw.WriteLine("namespace Shakespeare.Program");
            sw.WriteLine("{");
            sw.WriteLine("\tclass Program");
            sw.WriteLine("\t{");
            sw.WriteLine("\t\tstatic void Main(string[] args) {");
            sw.WriteLine("\t\t\tvar script = new Script();");
            sw.WriteLine("\t\t\tscript.Action();");
            sw.WriteLine("\t\t}");
            sw.WriteLine("\t}");
            sw.WriteLine("\t\tclass Script : Dramaturge {");
            sw.WriteLine("\t\tpublic Script()");
            sw.WriteLine("\t\t : base(Console.In, Console.Out)");
            sw.WriteLine("\t\t{ }");
            sw.WriteLine();
            sw.WriteLine("\t\tpublic void Action()");
            sw.WriteLine("\t\t{");
            AstNode1.Evaluate(thread);  // Title
            sw.WriteLine();
            var cdl = AstNode2 as CharacterDeclarationListNode;
            foreach (var ch in cdl.Characters)
                ch.Evaluate(thread);
            sw.WriteLine();
            AstNode3.Evaluate(thread);
            sw.WriteLine("\t\t}");
            sw.WriteLine("\t}");
            sw.WriteLine("}");
            sw.Flush();
            sw.Close();
            return sw;
        }
    }

A second code generator

That works fine if you only need one code generator.

But I wanted to write a C code generator, a C# code generator and an MSIL code generator.

{more digging with debugger}

I’ve got a great tool for switching between several generators. Check the GitHub…

2nd tool: RunSharp

RunSharp is a layer above the standard .NET Reflection.Emit API, allowing you to generate and compile dynamic code at runtime very quickly and efficiently (unlike using CodeDOM and invoking the C# compiler).

Project home : https://code.google.com/p/runsharp/ (but it’s stagnant, so check github for forks)

Not on NuGet.
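For a sense of what RunSharp sits on top of, here is the raw Reflection.Emit version of a tiny constant, 2 * (-1). This sketch uses only the standard `System.Reflection.Emit` API, not RunSharp itself; the class and method names are made up for illustration.

```csharp
using System;
using System.Reflection.Emit;

// Builds a method at runtime that returns 2 * (-1), using raw IL opcodes.
public static class EmitSketch
{
    public static Func<int> BuildConstant()
    {
        // A dynamic method with no parameters that returns an int.
        var method = new DynamicMethod("Constant", typeof(int), Type.EmptyTypes);
        ILGenerator il = method.GetILGenerator();

        il.Emit(OpCodes.Ldc_I4, -1);   // push -1 (a "not nice" noun)
        il.Emit(OpCodes.Ldc_I4_2);     // push 2  (one adjective)
        il.Emit(OpCodes.Mul);          // multiply the two values
        il.Emit(OpCodes.Ret);          // return the product

        return (Func<int>)method.CreateDelegate(typeof(Func<int>));
    }
}
```

Calling `EmitSketch.BuildConstant()()` returns -2. Every operation has to be spelled out as an opcode here, which is the bookkeeping a layer like RunSharp is meant to hide behind ordinary-looking expressions.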

    public class PlayNode : AstNode
    {
        protected override object DoEvaluate(Irony.Interpreter.ScriptThread thread)
        {
            var rs = thread.rs();
            var ag = rs.AssemblyGen;
            using (ag.Namespace("Shakespeare.Program"))
            {
                TypeGen scriptClass = ag.Public.Class("Script", typeof(Shakespeare.Support.Dramaturge));
                {
                    rs.Script = scriptClass;

                    var console = typeof(Console);

                    CodeGen ctor = scriptClass.Public.Constructor();
                    ctor.InvokeBase(Static.Property(console, "In"), Static.Property(console, "Out"));
                    { }

                    CodeGen action = scriptClass.Public.Void("Action");
                    {
                        rs.Action = action;
                        action.At(Span);
                        AstNode1.Evaluate(thread);
                        var cdl = AstNode2 as CharacterDeclarationListNode;
                        foreach (var ch in cdl.Characters)
                            ch.Evaluate(thread);

                        AstNode3.Evaluate(thread);
                    }
                }
                scriptClass.Complete();

                TypeGen MyClass = ag.Public.Class("Program");
                CodeGen Main = MyClass.Public.Static.Void("Main").Parameter<string[]>("args");
                {
                    var script = Main.Local(Exp.New(scriptClass.GetCompletedType()));
                    Main.Invoke(script, "Action");
                }
            }
            ag.Save();

            return ag.GetAssembly().GetName().Name;
        }
    }

    public class NegativeConstantNode : AstNode
    {
        protected override object DoEvaluate(ScriptThread thread)
        {
            if (AstNode1 is NegativeNounNode)
                return ((Operand)0 - 1);
            else
                return (AstNode2.Evaluate(thread) as Operand) * 2;
        }
    }