Implementing Legacy Statistical Algorithms in a Spreadsheet … · 2003-09-28 · Implementing...

Implementing Legacy Statistical Algorithms in a Spreadsheet Environment

Stephen W. Liddle
Information Systems Faculty
Rollins eBusiness Center

John S. Lawson
Department of Statistics

Brigham Young University
Provo, UT 84602


Overview

• Introduction
• Fundamentals of VBA in Excel
• Retargeting traditional algorithms to a spreadsheet environment
• Converting FORTRAN to VBA
• Conclusions


Why Convert FORTRAN Programs to Run in a Spreadsheet Environment?

• Useful code is available that is not implemented in standard statistical packages
• FORTRAN compilers are not usually available on a normal Windows workstation
• Many textbooks refer to published FORTRAN algorithms


Sources for Published FORTRAN Algorithms

• STATLIB (http://lib.stat.cmu.edu/)
  - General Archive
  - Applied Statistics Archive
  - Journal of Quality Technology Archive
  - JASA Software Archive
  - JCGS Archive


Advantages of Running Legacy FORTRAN Code in Excel

• Comfortable environment for practitioners
• More user-friendly input from spreadsheet
• Output to spreadsheet allows further graphical and computational analysis of results with Excel functions


VDG Inputs:
  Nickname:                     1-hybrid
  Runs:                         30
  Factors:                      6
  Model Order (1/2):            2
  Design Region (S/C):          s
  Weight by N (Y/N):            y
  Number of Radii (20-99):      20
  Scaling Unit (suggest 1):     1
  Design Radius/Region Radius:  1

Design:
   X1   X2   X3   X4   X5        X6
    0    0    0    0    0    2.3094
   -1   -1   -1   -1   -1   0.57735
    1    1   -1   -1   -1   0.57735
    1   -1    1   -1   -1   0.57735
   -1    1    1   -1   -1   0.57735
    1   -1   -1    1   -1   0.57735
   -1    1   -1    1   -1   0.57735
   -1   -1    1    1   -1   0.57735
    1    1    1    1   -1   0.57735
    1   -1   -1   -1    1   0.57735
   -1    1   -1   -1    1   0.57735
   -1   -1    1   -1    1   0.57735
    1    1    1   -1    1   0.57735
   -1   -1   -1    1    1   0.57735
    1    1   -1    1    1   0.57735
    1   -1    1    1    1   0.57735
   -1    1    1    1    1   0.57735
    2    0    0    0    0   -1.1547
   -2    0    0    0    0   -1.1547
    0    2    0    0    0   -1.1547

[Button: Run VDG]


Proposed Methodology

• Understand original FORTRAN program
• Choose suitable I/O methods
• Convert original FORTRAN code to VBA
• Test and use resulting Excel code


Visual Basic For Applications

• Built on ANSI BASIC
• Language engine of Microsoft Office
• Modern structured programming language
• Has vast array of types, functions, programming helps
• Powerful support environment (Office platform)
• Popular in business contexts


Excel Object Model

• Objects in Excel are addressable in VBA
• Each object has:
  - Properties
  - Methods

[Diagram: object hierarchy]
Application
  Workbooks (Workbook)
    Worksheets (Worksheet)
      Range, Chart
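
As a minimal sketch of what "addressable in VBA" means, the routine below walks from the Application object down to a Range, then uses one property and one method. The routine name, the cell address A1:B2, and the use of the first worksheet are assumptions made for illustration; they do not come from the original slides. It also assumes a workbook is open and overwrites the four cells it touches.

```vba
Option Explicit

' Hypothetical demo: Application -> Workbook -> Worksheet -> Range.
Sub ObjectModelDemo()
    Dim wb As Workbook
    Dim ws As Worksheet
    Dim rng As Range

    Set wb = Application.ActiveWorkbook   ' one member of Application.Workbooks
    Set ws = wb.Worksheets(1)             ' first worksheet in that workbook
    Set rng = ws.Range("A1:B2")           ' assumed address, 2 x 2 block of cells

    rng.Value = 0                         ' property: assigns 0 to every cell in the range
    rng.ClearContents                     ' method: an action carried out by the object

    MsgBox "Sheet " & ws.Name & ": " & rng.Cells.Count & " cells at " & rng.Address
End Sub
```

Holding each level in its own typed variable is a design choice that keeps later statements short and lets the editor offer member lists for each object.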


Input/Output Methods

• Non-interactive
  - Files, databases
  - Worksheet cells
• Interactive
  - Message boxes
  - Input boxes
  - Custom GUI forms

[Screenshot callouts: "Input Region"; "Output Region"; "Clicking these buttons runs the ORPS1 and ORPS2 algorithms."]
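
The two families of I/O listed above might look like this in practice. This is only a sketch: the cell addresses B1 and D1, the prompt text, and the routine name are hypothetical, and a real conversion would map each FORTRAN READ and WRITE to whichever method suits the algorithm.

```vba
Option Explicit

' Hypothetical demo of worksheet-cell I/O and interactive I/O.
Sub IoDemo()
    Dim ws As Worksheet
    Dim runs As Long
    Dim reply As String

    Set ws = ThisWorkbook.Worksheets(1)

    ' Non-interactive: read an input cell, write a result to an output cell.
    If IsNumeric(ws.Range("B1").Value) Then
        runs = Fix(ws.Range("B1").Value)   ' Fix truncates, like FORTRAN
        ws.Range("D1").Value = runs * 2
    End If

    ' Interactive: InputBox returns an empty string if the user cancels.
    reply = InputBox("Scaling unit (suggest 1):", "I/O demo", "1")
    If Len(reply) > 0 Then
        MsgBox "You entered " & reply
    End If
End Sub
```

Worksheet cells tend to suit batch-style legacy code best, because the inputs remain visible and editable next to the results.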


FORTRAN vs. VBA

• VBA:     (-b + Sqr(b ^ 2 - 4*a*c)) / (2*a)
• FORTRAN: (-b + SQRT(b**2 - 4*a*c)) / (2*a)

$$\frac{-b \pm \sqrt{b^2 - 4ac}}{2a}$$
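
To show the expression in context, here is a small sketch that wraps it in a VBA function and handles both signs of the square root. The function name, its ByRef outputs, and the decision to return False when there are no real roots are assumptions made for this illustration, not part of the original slides.

```vba
Option Explicit

' Hypothetical helper: returns True and fills root1/root2 when real roots exist.
Function QuadraticRoots(ByVal a As Double, ByVal b As Double, ByVal c As Double, _
                        ByRef root1 As Double, ByRef root2 As Double) As Boolean
    Dim disc As Double

    If a = 0 Then Exit Function        ' not a quadratic; result stays False
    disc = b ^ 2 - 4 * a * c           ' discriminant
    If disc < 0 Then Exit Function     ' complex roots; result stays False

    root1 = (-b + Sqr(disc)) / (2 * a)
    root2 = (-b - Sqr(disc)) / (2 * a)
    QuadraticRoots = True
End Function

Sub QuadraticDemo()
    Dim r1 As Double, r2 As Double
    If QuadraticRoots(1, -3, 2, r1, r2) Then
        Debug.Print r1, r2             ' roots of x^2 - 3x + 2, i.e. 2 and 1
    End If
End Sub
```

Note the spaces around `^`: they keep the operator from being read as a type-declaration character on 64-bit versions of VBA.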


More Operators

FORTRAN   VBA
.EQ.      =
.NE.      <>
.LT.      <
.LE.      <=
.GT.      >
.GE.      >=

.AND.     And
.OR.      Or
.NOT.     Not

//        &     (string concatenation)
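
A short sketch of the operator mapping in use follows. The FORTRAN lines in the comments and all variable names are invented for illustration. One point worth keeping in mind when translating conditions: VBA's And and Or always evaluate both operands, so a guard such as "index in range And array(index) > 0" still evaluates the second test.

```vba
Option Explicit

' Hypothetical example; names and values are not from the original slides.
Sub OperatorDemo()
    Dim x As Double, tol As Double
    Dim done As Boolean
    Dim label As String

    x = 0.25: tol = 0.001: done = False

    ' FORTRAN: IF (X .GE. TOL .AND. .NOT. DONE) LABEL = 'x = ' // 'large'
    If x >= tol And Not done Then label = "x = " & "large"

    ' FORTRAN: IF (X .NE. 0.0 .OR. DONE) PRINT *, LABEL
    If x <> 0# Or done Then Debug.Print label
End Sub
```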


Data Types

FORTRAN             VBA
INTEGER             Byte, Integer, Long
REAL                Single
DOUBLE PRECISION    Double
COMPLEX             Non-primitive in VBA
LOGICAL             Boolean
CHARACTER           String
CHARACTER*length    String*length

• Other notable VBA types:
  - Currency, Decimal, Date, Variant
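
The declarations below sketch how the type table might be applied, with the FORTRAN counterpart in each comment. The ComplexNumber type and every variable name are hypothetical. Long is used for INTEGER on the assumption that the original compiler used a 32-bit default integer, which is common but should be checked for each program; VBA's Integer is only 16 bits wide.

```vba
Option Explicit

' COMPLEX has no primitive counterpart, so this sketch uses a user-defined type.
Private Type ComplexNumber
    re As Double
    im As Double
End Type

Sub DeclarationDemo()
    Dim n As Long              ' INTEGER N
    Dim x As Single            ' REAL X
    Dim d As Double            ' DOUBLE PRECISION D
    Dim z As ComplexNumber     ' COMPLEX Z
    Dim ok As Boolean          ' LOGICAL OK
    Dim s As String            ' CHARACTER (variable length in VBA)
    Dim code As String * 8     ' CHARACTER*8 CODE

    n = 100000                 ' too large for a 16-bit Integer, fits in Long
    z.re = 1#: z.im = -2#
    code = "VDG"               ' fixed-length strings are padded with spaces
    Debug.Print n, Len(code), z.re, z.im
End Sub
```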


Worksheet Functions

• ChiDist(x, deg_freedom)
  - Returns the one-tailed probability of the χ² distribution.
• Correl(array1, array2)
  - Returns the correlation coefficient of two cell ranges.
• Fisher(x)
  - Returns the Fisher transformation at a given x.
• Pearson(array1, array2)
  - Returns the Pearson product moment correlation coefficient for two sets.
• Quartile(array, quart)
  - Returns the requested quartile of a data set.
• StDev(array)
  - Returns the standard deviation of a data set.
• ZTest(array, x, sigma)
  - Returns the one-tailed P-value of a z-test.
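
These functions are reachable from VBA through Application.WorksheetFunction, which can save re-implementing a FORTRAN helper routine. The sketch below passes VBA arrays instead of cell ranges; the data values and the routine name are made up for illustration.

```vba
Option Explicit

' Hypothetical demo; the sample data are invented.
Sub WorksheetFunctionDemo()
    Dim x As Variant, y As Variant
    Dim wf As WorksheetFunction

    x = Array(2#, 4#, 4#, 4#, 5#, 5#, 7#, 9#)
    y = Array(1#, 3#, 2#, 5#, 4#, 6#, 8#, 7#)
    Set wf = Application.WorksheetFunction

    Debug.Print "StDev:    "; wf.StDev(x)        ' sample standard deviation
    Debug.Print "Correl:   "; wf.Correl(x, y)    ' correlation of x and y
    Debug.Print "Quartile: "; wf.Quartile(x, 1)  ' first quartile
    Debug.Print "ChiDist:  "; wf.ChiDist(3.84, 1)
End Sub
```

When called this way, a function that cannot compute a result raises a run-time error instead of returning a worksheet error value, so converted code may need an error handler around such calls.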


Flow-Control Statements

             FORTRAN                    VBA

Logical if   IF (expr) stmt             If expr Then stmt

Block if     IF (expr1) THEN            If expr1 Then
                 stmt1                      stmt1
             ELSE IF (expr2) THEN       ElseIf expr2 Then
                 stmt2                      stmt2
             …                          …
             ELSE                       Else
                 stmtn                      stmtn
             END IF                     End If


Subtle Differences (“Gotchas”)

• Implicit conversion of real to integer values
  - FORTRAN: truncate
  - VBA: round
  - Solution: use VBA’s Fix(), which truncates
• Both languages allow implicit typing
  - This introduces ambiguity
  - Solution: supply explicit types everywhere
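
Both gotchas can be seen in a few lines. In this sketch (the routine and variable names are hypothetical), Option Explicit forces every variable to be declared, and the assignments contrast VBA's implicit rounding with Fix(). VBA rounds halves to the nearest even integer, and Int() is not a substitute for Fix() because it rounds negative values downward.

```vba
Option Explicit   ' rejects undeclared (implicitly typed) variables at compile time

Sub TruncationDemo()
    Dim i As Long
    Dim x As Double

    x = 2.7
    i = x            ' implicit conversion rounds: i = 3
    i = Fix(x)       ' truncates toward zero: i = 2 (FORTRAN behaviour)

    x = -2.7
    i = Fix(x)       ' i = -2, matches FORTRAN truncation
    i = Int(x)       ' i = -3, rounds down instead of truncating

    x = 2.5
    i = x            ' i = 2, halves round to the even neighbour
    x = 3.5
    i = x            ' i = 4
    Debug.Print i
End Sub
```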


Eliminating Goto Statements

• Computer science accepts the axiom that goto is generally “considered harmful”
• We advocate rewriting algorithms to use structured programming techniques where feasible
  - Sine qua non is “make it work”
  - It’s a good idea for maintainability, understandability to move to structured form


Eliminating Goto Statements

      DO 8 J=1,3
      ...
    6 ...
      IF(OBJFN.GT.BESTFN) GO TO 7
      ...
      GO TO 6
    7 IF(J.EQ.3) GO TO 8
      XK=BESTK-STEP
    8 CONTINUE


Eliminating Goto Statements

      For j=1 To 3
      ...
    6 ...
      IF(OBJFN.GT.BESTFN) GO TO 7
      ...
      GO TO 6
    7 IF(J.EQ.3) GO TO 8
      XK=BESTK-STEP
    8 Next j


Eliminating Goto Statements

      For j=1 To 3
      ...
    6 ...
      IF(OBJFN.GT.BESTFN) GO TO 7
      ...
      GO TO 6
    7 If j <> 3 Then
          xk = bestk - step
      End If
      Next j


Eliminating Goto Statements

      For j=1 To 3
          ...
          Do Until objfn > bestfn
              ...
          Loop
          If j <> 3 Then
              xk = bestk - step
          End If
      Next j
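
In the FORTRAN original the exit test sits in the middle of the inner loop, with statements both before and after it. Where that ordering matters, one possible alternative to Do Until is a plain Do...Loop with Exit Do at the point of the test. The sketch below is a toy stand-in only: the objective function, the starting values, and the name stepSize are all hypothetical, since the slides elide the real loop body.

```vba
Option Explicit

' Toy stand-in for the elided loop body; every formula here is invented.
Sub MidLoopExitDemo()
    Dim j As Long
    Dim objfn As Double, bestfn As Double
    Dim xk As Double, bestk As Double, stepSize As Double

    bestfn = 10#: stepSize = 0.5: xk = 0#

    For j = 1 To 3
        Do
            xk = xk + stepSize                ' statements before the test (label 6)
            objfn = xk * xk
            If objfn > bestfn Then Exit Do    ' replaces IF(...) GO TO 7
            bestk = xk                        ' statements after the test, then GO TO 6
        Loop
        If j <> 3 Then
            xk = bestk - stepSize             ' replaces label 7 and GO TO 8
        End If
    Next j

    Debug.Print bestk
End Sub
```

Exit Do keeps the statement order of the original, at the cost of a loop whose termination condition is not visible on its first line.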


Our Reasoning

• Digital assets are fragile
• FORTRAN is not universally available
• Excel is a ubiquitous, powerful platform
• VBA is a full-featured language capable of handling sophisticated statistical computations


Conclusions

• We recommend creating a Web-based repository of Excel/VBA implementations of classic statistical algorithms
• We can preserve our legacy algorithms in this modern spreadsheet environment
• E-mail us if you want a copy of our manuscript (liddle or [email protected]@byu.edu)