High-performance sheet-defined functions in Excel - Peter Sestoft at Sems 2014
www.itu.dk 1
High-performance sheet-defined functions in spreadsheets
Peter Sestoft
IT University of Copenhagen
SEMS 2014-07-02
With thanks to Thomas S Iversen, Daniel Cortes, Morten Hansen, Poul Serek, Morten Poulsen, Hui Xu, Mainul Liton, Poul Brønnum,
Tim Garbos, Kasper Videbæk, Jens Hamann, Jonas Druedahl Rask, Simon Eikeland Timmermann
The trouble with functions
• One cannot define functions in a spreadsheet
• To define new functions, “experts” use VBA
• Often very poorly, witness newsgroup microsoft.public.excel.programming
• Many (Excel) built-in functions are bad:
  – Week numbers: two kinds, but not ISO standard
• Possible answers to this mess:
  – “People should not use spreadsheets”
  – “Only computer scientists should define functions”
  – “All necessary functions should be built in”
  – Or: Functions within the spreadsheet metaphor (Nuñez 2000, Peyton-Jones et al 2003)
Problem example: Area of triangles
• Area of triangle with sides a, b, c is SQRT(s(s-a)(s-b)(s-c)) where s = (a+b+c)/2
Either (1) compute s in column D: annoying intermediate result
or (2) try to inline s in the area formula: horrible and error-prone
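
The two options can be compared outside a spreadsheet with a minimal Python sketch. The function names and the 3-4-5 test triangle are illustrative assumptions, not part of the talk; the point is only that naming the intermediate value s once is far easier to read and check than repeating it four times.

```python
import math

def triarea(a: float, b: float, c: float) -> float:
    """Area of a triangle with sides a, b, c, with s computed once."""
    s = (a + b + c) / 2  # the intermediate result from option (1)
    return math.sqrt(s * (s - a) * (s - b) * (s - c))

def triarea_inlined(a: float, b: float, c: float) -> float:
    """Same formula with s inlined four times, as in option (2)."""
    return math.sqrt((a + b + c) / 2
                     * ((a + b + c) / 2 - a)
                     * ((a + b + c) / 2 - b)
                     * ((a + b + c) / 2 - c))

print(triarea(3, 4, 5))  # right triangle with legs 3 and 4
```

A sheet-defined function gives the readability of the first version without leaving a helper column in every model that needs the area.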
A solution: Sheet-defined function TRIAREA
[Figure labels: Function sheet; Ordinary sheet; Input cells; Output cell]
How to use sheet-defined functions
• Assumptions
  – End-users understand spreadsheet models
  – End-users do not understand VBA, C#, VB.NET, …
• Sheet-defined functions in the organization
  – Models are developed in ordinary spreadsheets
  – After a while functions are factored out of models
  – Functions can be further developed interactively
  – An organization can develop and share libraries
  – Without preventing further evolution by users
• Works only if
  – Sheet-defined functions are fast enough
Dual implementation
• Ordinary sheets, interpretive evaluation
  – Frequently edited, rarely evaluated (at "recalculation")
• Function sheets, compiled evaluation
  – Rarely edited, frequently evaluated (at function calls)
  – Run-time code generation permits interactive editing
Runtime code generation
• Spreadsheet formula:
    =SQRT(a1*a1+a2*a2)
• .NET bytecode (produced by my compiler):
    ldloc 2
    ldloc 2
    mul
    ldloc 3
    ldloc 3
    mul
    add
    call Math.Sqrt
• x86 machine code (produced by the JIT compiler):
    fldl 0xfffffff0(%ebp)
    fldl 0xfffffff0(%ebp)
    fmulp %st,%st(1)
    fldl 0xffffffe8(%ebp)
    fldl 0xffffffe8(%ebp)
    fmulp %st,%st(1)
    faddp %st,%st(1)
    fsqrt
Result: A very fast, portable spreadsheet implementation
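
The first translation step (formula to stack-machine code) can be illustrated with a small Python sketch. This is not the Funcalc compiler: it borrows Python's own `ast` parser, emits tuples rather than real .NET bytecode, and runs them on a toy stack machine instead of handing them to a JIT. The names `compile_formula`, `run` and `FUNCS` are hypothetical, and only the four arithmetic operators and one-argument functions are handled.

```python
import ast
import math
import operator

BINARY = {"add": operator.add, "sub": operator.sub,
          "mul": operator.mul, "div": operator.truediv}
FUNCS = {"SQRT": math.sqrt}  # hypothetical table of one-argument built-ins

def compile_formula(formula: str) -> list:
    """Translate a formula such as '=SQRT(a1*a1+a2*a2)' to postfix stack code."""
    ops = {ast.Add: "add", ast.Sub: "sub", ast.Mult: "mul", ast.Div: "div"}
    code = []

    def emit(node):
        if isinstance(node, ast.BinOp):
            emit(node.left)            # operands first ...
            emit(node.right)
            code.append((ops[type(node.op)],))  # ... then the operator
        elif isinstance(node, ast.Name):
            code.append(("ldloc", node.id))     # load a cell value
        elif isinstance(node, ast.Constant):
            code.append(("ldc", float(node.value)))
        elif isinstance(node, ast.Call):
            for arg in node.args:
                emit(arg)
            code.append(("call", node.func.id))
        else:
            raise ValueError("unsupported formula construct")

    emit(ast.parse(formula.lstrip("="), mode="eval").body)
    return code

def run(code: list, cells: dict) -> float:
    """Execute the stack code against a mapping from cell names to numbers."""
    stack = []
    for instr in code:
        op = instr[0]
        if op == "ldloc":
            stack.append(cells[instr[1]])
        elif op == "ldc":
            stack.append(instr[1])
        elif op == "call":
            stack.append(FUNCS[instr[1]](stack.pop()))  # one-argument functions only
        else:
            right = stack.pop()
            left = stack.pop()
            stack.append(BINARY[op](left, right))
    return stack.pop()

code = compile_formula("=SQRT(a1*a1+a2*a2)")
print(code)
print(run(code, {"a1": 3.0, "a2": 4.0}))
```

The emitted sequence has the same load, load, mul, load, load, mul, add, call shape as the bytecode on the slide. The speed in the real system comes from compiling once and letting the JIT turn that bytecode into machine code, which this interpreter-based sketch does not attempt.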
New book (next month)
• Spreadsheet Implementation Technology, MIT Press, August 2014
[Book cover: Peter Sestoft, Spreadsheet Implementation Technology: Basics and Extensions. Version 0.99.5 of 2014-05-10. The MIT Press, Cambridge, Massachusetts; London, England]
• A standard spreadsheet implementation
• Sheet-defined functions
• Examples
• Design choices
• Scalability and speed
• Implementation details
• Funcalc user manual
Example function: NORMDISTCDF
• Normal distribution N(0,1) cumulative distribution function
• As accurate as Excel’s built-in NORMSDIST(z), and faster
[Figure labels: Input cell; Output cell]
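
For readers who want reference values to compare against, the N(0,1) cumulative distribution function can be written in a few lines of Python using the complementary error function. This is a different technique from the sheet-defined function in the talk, and the name `norm_cdf` is an assumption made for this sketch.

```python
import math

def norm_cdf(z: float) -> float:
    """Standard normal CDF via the identity Phi(z) = erfc(-z / sqrt(2)) / 2."""
    return 0.5 * math.erfc(-z / math.sqrt(2.0))

for z in (-1.0, 0.0, 1.96):
    print(z, norm_cdf(z))
```

Using erfc rather than 1 + erf avoids cancellation in the lower tail, where the result is close to zero.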
NORMDISTCDF generated code
• Approximately 118 ns/call on 2.66 GHz Intel Core 2
• VBA: 1760 ns; Excel built-in: 1140 ns; C#: 64 ns; C: 54 ns

    0000 ldarg V_0
    0004 call ValueToDoubleOrNan
    0009 stloc.s V_6
    000b ldloc.s V_6
    000d call Math.Abs
    0012 stloc.3
    0013 ldc.r8 -1
    001c ldloc.3
    001d mul
    001e ldloc.3
    001f mul
    0020 ldc.r8 2
    0029 div
    002a call Math.Exp
    002f stloc.s V_4
    0031 ldloc.3
    0032 stloc.0
    0033 ldloc.0
    0034 call Double.IsInfinity
    0039 brtrue IL_01a6
    003e ldloc.0
    003f call Double.IsNaN
    0044 brtrue IL_01a6
    0049 ldloc.0
    004a ldc.r8 37
    0053 ble IL_0066
    0058 ldc.r8 0
    0061 br IL_01a1
    0066 ldloc.3
    0067 stloc.0
    0068 ldloc.0
    0069 call Double.IsInfinity
    006e brtrue IL_01a0
    0073 ldloc.0
    0074 call Double.IsNaN
    0079 brtrue IL_01a0
    007e ldloc.0
    007f ldc.r8 7.071
    0088 bge IL_0144
    008d ldloc.s V_4
    008f ldc.r8 220.206867912376
    0098 ldloc.3
    0099 ldc.r8 221.213596169931
    00a2 ldloc.3
    00a3 ldc.r8 112.07929149787
    00ac ldloc.3
    00ad ldc.r8 33.912866078383
    00b6 ldloc.3
    00b7 ldc.r8 6.37396220353165
    00c0 ldloc.3
    00c1 ldc.r8 0.700383064443688
    00ca ldloc.3
    00cb ldc.r8 0.035262496599891
    00d4 mul
    00d5 add
    00d6 mul
    00d7 add
    00d8 mul
    00d9 add
    ... 61 lines left out ...
    0198 div
    0199 add
    019a div
    019b br IL_01a1
    01a0 ldloc.0
    01a1 br IL_01a7
    01a6 ldloc.0
    01a7 stloc.s V_5
    01a9 ldloc.s V_6
    01ab stloc.0
    01ac ldloc.0
    01ad call Double.IsInfinity
    01b2 brtrue IL_01f3
    01b7 ldloc.0
    01b8 call Double.IsNaN
    01bd brtrue IL_01f3
    01c2 ldloc.0
    01c3 ldc.r8 0
    01cc bge IL_01dd
    01d1 ldloc.s V_5
    01d3 call NumberValue.Make
    01d8 br IL_01ee
    01dd ldc.r8 1
    01e6 ldloc.s V_5
    01e8 sub
    01e9 call NumberValue.Make
    01ee br IL_01f9
    01f3 ldloc.0
    01f4 call NumberValue.Make
    01f9 ret
[Annotations on the listing: “A single unwrapping”; “Wrapping” (three times)]
Examples: Calendrical functions
• Excel’s calendar functions are poor
  – Wrong before 1900, no ISO week numbers, cannot easily find first Monday of month, Easter, …
• Easy to implement as sheet-defined functions
• Example: Easter in a given year (1400 ns/call):
  By MSc students Xu and Liton, following Dershowitz & Reingold (3rd ed, Cambridge UP)
  [Figure labels: Input: year; Output: Easter fixdate]
• Some other functions:
  – Fixdate to/from day-month-year
  – Fixdate to/from ISO week and ISO year
  – Last/nth Monday (etc) before given date
  – First/nth Monday (etc) after given date
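
To give a feel for how little arithmetic the Easter example needs, here is a hedged Python sketch of one well-known formulation of the Gregorian Easter rule (epact, then Paschal full moon, then the following Sunday). Whether the students' sheet-defined function is laid out this way is an assumption. The sketch also assumes that a fixdate counts days with 1 January of year 1 as day 1, which is what Python's `date.toordinal()` returns; the helper names are hypothetical.

```python
from datetime import date

SUNDAY = 0  # with day 1 falling on a Monday, fixdate % 7 == 0 is a Sunday

def kday_after(k: int, fixdate: int) -> int:
    """First day strictly after `fixdate` that falls on weekday k (0 = Sunday)."""
    return fixdate + 7 - ((fixdate - k) % 7)

def easter_fixdate(year: int) -> int:
    """Day number of Gregorian Easter Sunday in `year`."""
    century = year // 100 + 1
    golden = year % 19  # position in the 19-year lunar cycle
    shifted_epact = (14 + 11 * golden
                     - (3 * century) // 4
                     + (5 + 8 * century) // 25) % 30
    if shifted_epact == 0 or (shifted_epact == 1 and golden > 10):
        adjusted_epact = shifted_epact + 1
    else:
        adjusted_epact = shifted_epact
    paschal_moon = date(year, 4, 19).toordinal() - adjusted_epact
    return kday_after(SUNDAY, paschal_moon)

for y in (2014, 2019, 2024):
    print(y, date.fromordinal(easter_fixdate(y)))
```

Every step is integer division, remainder and a comparison, which is why the computation fits comfortably in a handful of spreadsheet cells.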
Higher-order functions: Sheet-defined functions as values
• New built-ins to manipulate functions
  – CLOSURE(“name”, a1, …) evaluates to a closure: a partially applied sheet-defined function
  – APPLY(f, b1, …) applies a function value
• Example function “ndie”, a general n-sided die
• Defining & rolling 6-sided and 20-sided dice
[Figure labels: Input cell; Output cell]
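
The idea of CLOSURE and APPLY can be mimicked in Python with `functools.partial`. This is only an analogy under stated assumptions: `make_closure` and `apply_fn` are hypothetical stand-ins that take a Python function object rather than a function name, and `ndie` here is an ordinary Python function, not the sheet-defined one.

```python
import random
from functools import partial

def ndie(n: int) -> int:
    """Roll a fair n-sided die."""
    return random.randint(1, n)  # randint includes both endpoints

def make_closure(fn, *early_args):
    """Fix some arguments now; the result is a function value."""
    return partial(fn, *early_args)

def apply_fn(f, *late_args):
    """Call a function value with the remaining arguments."""
    return f(*late_args)

d6 = make_closure(ndie, 6)    # a 6-sided die as a value
d20 = make_closure(ndie, 20)  # a 20-sided die as a value
print(apply_fn(d6), apply_fn(d20))
```

Because the closures are plain values, they can be stored, passed to other functions and applied later, which is what makes the sheet-defined functions higher-order.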
Funsheet: Linking Excel and Funcalc
• Sheet-defined functions in Excel!
• Eikeland and Timmermann MSc, June 2014
• Via Excel DNA, an Excel-.NET bridge
• Generated code is as fast as Funcalc
• Call speed Excel -> Funcalc suffers from general Excel slowness, 11 µs/call or so
• Complete Funcalc functionality: DEFINE, CLOSURE, APPLY, SPECIALIZE, BENCHMARK
• Prototype, so still a number of defects
TO DO: Validation
• Improve the Excel <-> Funcalc link
• Demonstrate one application area
• Fix obvious problems
• Perform development experiments
• Perform maintenance experiments
• ...
• But experiments are not my area of expertise