Post on 28-Dec-2015
Spreadsheets: functional programming for the masses
Simon Peyton Jones
Margaret Burnett
Alan Blackwell
Q1: What should a functional programmer in Microsoft Research do?
A1: Persuade developers to implement stuff in Haskell. No more C#! Haskell is better!
A1: Ask Q2
Q2: What is the world's most widely used functional language, by far?
Violent exothermic reaction
Excel!
Spreadsheets are functional programs
B1 = A1*A1
C1 = A2*A2
D1 = B1-C1
B2 = A1+A2
C2 = A1-A2
D2 = B2*C2
• Just a big bunch of equations
• No side effects
• Order of evaluation controlled by data dependencies
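A minimal sketch of what "order of evaluation controlled by data dependencies" could look like, using the six equations from this slide. The names `formulas` and `recalc` are illustrative, and Python's standard `graphlib` stands in for whatever Excel's recalc engine actually does.

```python
from graphlib import TopologicalSorter

# Each cell maps to (cells it reads, pure function computing its value).
formulas = {
    "B1": (("A1",), lambda a1: a1 * a1),
    "C1": (("A2",), lambda a2: a2 * a2),
    "D1": (("B1", "C1"), lambda b1, c1: b1 - c1),
    "B2": (("A1", "A2"), lambda a1, a2: a1 + a2),
    "C2": (("A1", "A2"), lambda a1, a2: a1 - a2),
    "D2": (("B2", "C2"), lambda b2, c2: b2 * c2),
}

def recalc(inputs):
    """Evaluate every formula once, in an order fixed only by data dependencies."""
    values = dict(inputs)
    graph = {cell: deps for cell, (deps, _) in formulas.items()}
    for cell in TopologicalSorter(graph).static_order():
        if cell in formulas:  # input cells such as A1 have no formula
            deps, fn = formulas[cell]
            values[cell] = fn(*(values[d] for d in deps))
    return values

sheet = recalc({"A1": 5, "A2": 3})
print(sheet["D1"], sheet["D2"])
```

The order in which the formulas are written in the dictionary does not matter: no formula has a side effect, so any order that respects the dependencies gives the same values.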
Q3: What chance does a pointy-headed researcher have of influencing the direction of a Microsoft cash cow?
[Figure: market size. Programmers (can use VB, C++): 2m. End users (can use Excel): 50m.]
2m “classic” programmers (write VB, C++, C#)
50m end-user programmers
• Real job is engineering, teaching, financial; NOT programming
• Use Excel formulae to build "models"
• No need to "sell" functional programming: they are already doing it!
Excel’s market is tall
[Figure: market size. Programmers (can use VB, C++): 2m. End users (can use Excel): 50m.]
Research effort expended: not much
Tall, but narrow
[Figure: market size (programmers 2m, end users 50m) against application requirements]
When the task...
• becomes large or complex
• changes over time
• rewards re-use
• is mission-critical
... cells and formulas are not enough.
Current solution: shift programming paradigm
Use Excel + VB, C#
Our vision
[Figure: market size (programmers 2m, end users 50m) against application requirements; label: New territory to colonise]
Increase Excel’s “reach” by empowering end users to write “programs” without hiring programmers
Research inputs
• Functional programming (Simon Peyton Jones)
• End user & visual software engineering (Margaret Burnett)
• Psychology of programming (Alan Blackwell)
[Diagram: Excel, plus these inputs, gives Excel Plus]
Ruthless design-time focus on usability, based on empirically-grounded research.
Our target end users
Of all Excel users
• Some just use Excel for lists
• Some can type very simple formulae, e.g. =SUM(A1:A10)
• Some use formulae, and understand copy-and-paste of formulae (absolute and relative cell references)
• Some can use Visual Basic (professional programmers)
This is our target audience.
A minority of Excel users, but still extremely numerous
How?
[Figure: market size (programmers 2m, end users 50m) against application requirements]
Two complementary ideas
1. Functions as ordinary spreadsheets
2. First class array values
Functions as ordinary spreadsheets
What's missing?
B1 = A1*A1
C1 = A2*A2
D1 = B1-C1
B2 = A1+A2
C2 = A1-A2
D2 = B2*C2
Scenario
• teacher types formula to compute student grade
• copies and pastes down a column
• (much later) wants to change the formula
Problem
• must alter many cells to implement a single change
• impacts re-use, error-proneness, modularity
Obvious solution (to a programmer)
• Make a named function to encapsulate the formula
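The teacher scenario can be sketched in a few lines. The grading rule, its weights, and the marks are all made up for illustration; the point is only that the formula lives in one named place.

```python
# Hypothetical grading rule; the 40/60 weighting is illustrative only.
def grade(coursework, exam):
    return (40 * coursework + 60 * exam) / 100

# Made-up (coursework, exam) marks, one row per student.
marks = [(70, 80), (55, 65), (90, 85)]

# "Fill down": every row calls the same definition instead of holding its own copy.
grades = [grade(c, e) for c, e in marks]
print(grades)
```

Changing the weighting later means editing `grade` once, rather than altering every cell in the column.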
Functions as ordinary spreadsheets
• User is working on a formula
• User brings up the right click menu
[Menu: Cut, Copy, Paste, …, Make a function, …]
• A new function is automatically created in a sheet and called
[Labels: New function worksheet; Formula replaced by call to function]
• Now, fill down does not lose sharing
[Label: Regular fill down]
• User can see/modify the function definition as desired
Functions as worksheets
• Creating a function is fast
• Understanding a function requires no new skills: no paradigm shift
• Using a function improves quality
Named abstraction is our primary weapon in the war against complexity. Imagine conventional programming with no procedures, only smart copy/paste!
Creating a function from scratch
• Build a worksheet to calculate the distance a ball will travel when thrown at a particular angle and velocity
• Turn it into a function by identifying the input cells (a bit like “scenarios”, only callable)
• Call the function many times to see the distance the ball goes for different throwing angles
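As a rough sketch of the three steps above: the two parameters play the role of the input cells, and the function is then called once per throwing angle. The physics is an assumption (an ideal projectile launched from ground level, no air resistance, g = 9.81 m/s^2); the slides do not say which model the worksheet uses, and the name `ball_distance` is illustrative.

```python
import math

G = 9.81  # gravitational acceleration in m/s^2 (assumed)

def ball_distance(angle_deg, velocity):
    """Horizontal range of an ideal projectile launched from ground level."""
    angle = math.radians(angle_deg)
    return velocity ** 2 * math.sin(2 * angle) / G

# Call the function many times, one call per throwing angle (15, 30, ..., 75 degrees).
distances = {angle: ball_distance(angle, 20.0) for angle in range(15, 90, 15)}
for angle, distance in distances.items():
    print(f"{angle:2d} degrees: {distance:.2f} m")
```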
Debugging
• The “call tree” becomes a tree of linked worksheets, laid out in space, not in time.
• So debugging is particularly easy.
• Need new mechanisms for navigating the plethora of worksheets, via the tree structure.
• First year programming courses will be taught this way!
[Diagram: Main program calls Function CylVol, which calls Function CircArea]
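One way to picture a call tree "laid out in space, not in time" is to keep every call, with its inputs and result, as a node that can be inspected afterwards. This is only a sketch: the `Call` record and the `show` helper are hypothetical, and the function bodies assume CylVol computes a cylinder volume from CircArea.

```python
import math
from dataclasses import dataclass, field

@dataclass
class Call:
    """One function invocation, kept around like a worksheet instance."""
    name: str
    inputs: dict
    result: float = 0.0
    children: list = field(default_factory=list)

def circ_area(radius, parent):
    call = Call("CircArea", {"radius": radius})
    parent.children.append(call)
    call.result = math.pi * radius ** 2
    return call.result

def cyl_vol(radius, height, parent):
    call = Call("CylVol", {"radius": radius, "height": height})
    parent.children.append(call)
    call.result = circ_area(radius, call) * height
    return call.result

def show(call, depth=0):
    """Return the tree as indented lines, one per call."""
    lines = ["  " * depth + f"{call.name} {call.inputs} = {call.result:.2f}"]
    for child in call.children:
        lines.extend(show(child, depth + 1))
    return lines

main = Call("Main program", {})
main.result = cyl_vol(2.0, 5.0, main)
print("\n".join(show(main)))
```

Nothing is lost when a call returns, so every intermediate value stays visible, which is the property the slide credits for easy debugging.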
Domain-specific libraries
• Every domain (physics, electronics, statistics, financial, marketing...) has domain-specific abstractions.
• Excel’s function libraries are an ideal way of packaging those abstractions for Excel users.
• Hence, we want to make it easy for end users to build, encapsulate, and share their own function libraries, without help from professional programmers.
First class data values
• User-defined functions need array arguments, e.g. SUM( A1:B9 )
• Simple but powerful idea: anything a scalar can do, an array can do:
  • be the value of a formula
  • be the value of a cell
  • be the argument or result of a function
• Make Excel’s existing “array formulae” simpler and more powerful.
First class values
• Currency; units in general (unit-aware arithmetic)
• Hyperlink
• Matrix (index, add, multiply…)
• Relation (filter, select, join…)
• XML blob (query, combine)
• Picture (generate picture from numbers, combine pictures)
Each value type comes complete with a repertoire of functions over it
Bulk data operations
A1 = …connect to a database relation…
A2 = EXTEND( A1, [First Name], GetFirst( [Name] ) )
A3 = EXTEND( A2, [Last Name], GetLast( [Name] ) )
A4 = FILTER( A3, AND( [Age] > 30, [Age] < 50 ) )
A5 = SELECT( A4, [First Name], [Last Name], [Age] )
This stuff can be done today, by hand (e.g. Data/AutoFilter), but it can’t be automated robustly
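A small sketch of how such a pipeline could behave if a relation is treated as an ordinary value, here a list of rows. The helpers `extend`, `filter_rows` and `select` are illustrative stand-ins for EXTEND, FILTER and SELECT, the sample rows are made up, and GetFirst/GetLast are assumed to split the name on whitespace. Each step is assumed to build on the previous one, so both name columns reach the final selection.

```python
def extend(relation, column, fn):
    """Return a new relation with one extra computed column."""
    return [{**row, column: fn(row)} for row in relation]

def filter_rows(relation, predicate):
    """Keep only the rows for which the predicate holds."""
    return [row for row in relation if predicate(row)]

def select(relation, *columns):
    """Keep only the named columns, in the given order."""
    return [{c: row[c] for c in columns} for row in relation]

# Made-up rows standing in for the database relation in A1.
a1 = [
    {"Name": "Pat Smith", "Age": 36},
    {"Name": "Sam Jones", "Age": 41},
    {"Name": "Lee Brown", "Age": 85},
]
a2 = extend(a1, "First Name", lambda r: r["Name"].split()[0])
a3 = extend(a2, "Last Name", lambda r: r["Name"].split()[-1])
a4 = filter_rows(a3, lambda r: 30 < r["Age"] < 50)
a5 = select(a4, "First Name", "Last Name", "Age")
print(a5)
```

Every step returns a new relation and leaves its input untouched, so each intermediate result can sit in its own cell.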
Extensible types
It should be easy for a VB or C# programmer to add a new data type. All Excel needs to know about it is:
• How to display it
• How to “drill into” it to display its full value
• Perhaps, how to downcast it to a number/string
The recalc chain and dependency analysis are completely unaffected
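The three requirements above amount to a small interface. The sketch below uses Python in place of VB or C#, and every name in it (`CellValue`, `Money`, `render_cell`) is hypothetical; it is not an Excel API.

```python
from dataclasses import dataclass
from typing import Protocol, Union

class CellValue(Protocol):
    """What the grid would need from any user-defined value type."""
    def display(self) -> str: ...
    def drill_into(self) -> str: ...
    def downcast(self) -> Union[float, str]: ...

@dataclass(frozen=True)
class Money:
    amount: float
    currency: str

    def display(self) -> str:
        # Short form shown in the cell.
        return f"{self.amount:.2f} {self.currency}"

    def drill_into(self) -> str:
        # Full value, shown on request.
        return f"Money(amount={self.amount!r}, currency={self.currency!r})"

    def downcast(self) -> Union[float, str]:
        # Plain number for formulas that only understand scalars.
        return self.amount

def render_cell(value: CellValue) -> str:
    """The grid only ever calls the interface, never the concrete type."""
    return value.display()

price = Money(12.5, "GBP")
print(render_cell(price))
```

Because the grid sees only the interface, the code that decides when to recalculate never needs to look inside the value.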
Back to the supertanker
• Small crew, high-value payload, many customer requests, so systemic changes are not easy
• Excel 2003 is out; the next version is being designed
• We’re talking to the Excel team regularly (weekly)
• Next:
  • higher order functions
  • assertions, test generation
  • static type system?
Summary
Multi-disciplinary inputs
• Functional programming
• End user & visual software engineering
• Psychology of programming
Empower non-programmer end users (accountants, engineers, salesmen...) to do things they could not do before
• Control complexity through building re-usable abstractions
• Succeed in more ambitious applications
• Encapsulate domain-specific expertise in function libraries
• Crush more errors earlier
http://research.microsoft.com/~simonpj/papers/excel