Refactoring Functional Programs
-
Upload
marenda-faunus -
Category
Documents
-
view
24 -
download
0
description
Transcript of Refactoring Functional Programs
Huiqing LiClaus Reinke
Simon Thompson
Computing Lab, University of Kent
www.cs.kent.ac.uk/projects/refactor-fp/
Refactoring Functional Programs
Manchester 04 2
Writing a program
-- format a list of Strings, one per line
table :: [String] -> String
table = concat . format
format :: [String] -> String
format [] = []
format [x] = [x]
format (x:xs) = (x ++ "\t") : fomrat xs
Manchester 04 3
Writing a program
-- format a list of Strings, one per line
table :: [String] -> String
table = concat . format
format :: [String] -> String
format [] = []
format [x] = [x]
format (x:xs) = (x ++ "\t") : fomrat xs
Manchester 04 4
Writing a program
-- format a list of Strings, one per line
table :: [String] -> String
table = concat . format
format :: [String] -> String
format [] = []
format [x] = [x]
format (x:xs) = (x ++ "\t") : format xs
Manchester 04 5
Writing a program
-- format a list of Strings, one per line
table :: [String] -> String
table = concat . format
format :: [String] -> String
format [] = []
format [x] = [x]
format (x:xs) = (x ++ "\t") : format xs
Manchester 04 6
Writing a program
-- format a list of Strings, one per line
table :: [String] -> String
table = concat . format
format :: [String] -> [String]
format [] = []
format [x] = [x]
format (x:xs) = (x ++ "\t") : format xs
Manchester 04 7
Writing a program
-- format a list of Strings, one per line
table :: [String] -> String
table = concat . format
format :: [String] -> [String]
format [] = []
format [x] = [x]
format (x:xs) = (x ++ “\t") : format xs
Manchester 04 8
Writing a program
-- format a list of Strings, one per line
table :: [String] -> String
table = concat . format
format :: [String] -> [String]
format [] = []
format [x] = [x]
format (x:xs) = (x ++ "\n") : format xs
Manchester 04 9
Writing a program
-- format a list of Strings, one per line
table :: [String] -> String
table = concat . format
appNL :: [String] -> [String]
appNL [] = []
appNL [x] = [x]
appNL (x:xs) = (x ++ "\n") : appNL xs
Manchester 04 10
Writing a program
-- format a list of Strings, one per line
table :: [String] -> String
table = concat . format
appNL :: [String] -> [String]
appNL [] = []
appNL [x] = [x]
appNL (x:xs) = (x ++ "\n") : appNL xs
Manchester 04 11
Writing a program
-- appNL a list of Strings, one per line
table :: [String] -> String
table = concat . appNL
appNL :: [String] -> [String]
appNL [] = []
appNL [x] = [x]
appNL (x:xs) = (x ++ "\n") : appNL xs
Manchester 04 12
Writing a program
-- format a list of Strings, one per line
table :: [String] -> String
table = concat . appNL
appNL :: [String] -> [String]
appNL [] = []
appNL [x] = [x]
appNL (x:xs) = (x ++ "\n") : appNL xs
Manchester 04 13
Writing a program
-- format a list of Strings, one per line
table :: [String] -> String
table = concat . appNL
where
appNL :: [String] -> [String]
appNL [] = []
appNL [x] = [x]
appNL (x:xs) = (x ++ "\n") : appNL xs
Manchester 04 14
Refactoring
Refactoring means changing the design of program …
… without changing its behaviour.
Refactoring comes in many forms
• micro refactoring as a part of program development,• major refactoring as a preliminary to revision,• as a part of debugging, …
As programmers, we do it all the time.
Manchester 04 15
Not just programming
Paper or presentationmoving sections about; amalgamate sections; move inline code to a figure; animation; …
Proof introduce lemma; remove, amalgamate hypotheses, …
Programthe topic of the lecture
Manchester 04 16
Overview of the talk
Example refactorings … what do we learn?
Refactoring functional programs
Generalities
Tooling: demo, rationale, design.
What comes next?
Conclusions
Manchester 04 17
Refactoring Functional Programs
• 3-year EPSRC-funded project Explore the prospects of refactoring functional programs
Catalogue useful refactorings
Look into the difference between OO and FP refactoring
A real life refactoring tool for Haskell programming
A formal way to specify refactorings … and a set of proofs that the implemented refactorings are correct.
• Currently mid-project: the second HaRe release is module-aware.
Manchester 04 18
Refactoring functional programs
Semantics: can articulate preconditions and … … verify transformations.
Absence of side effects makes big changes predictable and verifiable … … unlike OO.
Language support: expressive type system, abstraction mechanisms, HOFs, …
Opens up other possibilities … proof …
Manchester 04 19
Rename
f x y = …
Name may be too specific, if the function is a candidate for reuse.
findMaxVolume x y = …
Make the specific purpose of the function clearer.
Needs scope information: just change this f and not all fs (e.g. local definitions or variables).
Needs module information: change f wherever it is imported.
Manchester 04 20
Lift / demote
f x y = … h …
where
h = …
Hide a function which is clearly subsidiary to f; clear up the namespace.
f x y = … (h y) …
h y = …
Makes h accessible to the other functions in the module and beyond.
Needs free variable information: which of the parameters of f is used in the definition of h?
Need h not to be defined at the top level, … , DMR.
Manchester 04 21
Lessons from the first examples
Changes are not limited to a single point or even a single module: diffuse and bureaucratic …
… unlike traditional program transformation.
Many refactorings bidirectional …
… as there is never a unique correct design.
Manchester 04 22
How to apply refactoring?
By hand, in a text editor
Tedious
Error-prone
Depends on extensive testing
With machine support
Reliable
Low cost: easy to make and un-make large changes.
Exploratory … a full part of the programmer’s toolkit.
Manchester 04 23
Machine support invaluable
Current practice: editor + type checker (+ tests).
Our project: automated support for a repertoire of refactorings …
… integrated into the existing development process: Haskell IDEs such as vim and emacs.
Manchester 04 24
Demonstration of HaRe, hosted in vim.
Manchester 04 25
Proof of concept …
To show proof of concept it is enough to:
• build a stand-alone tool,
• work with a subset of the language,
• ‘pretty print’ the refactored source code in a standard format.
Manchester 04 26
… or a useful tool?
To make a tool that will be used we must:
• integrate with existing program development tools: the program editors emacs and vim: only add to their capabilities;
• work with the complete Haskell 98 language;
• preserve the formatting and comments in the refactored source code;
• allow users to extend and script the system.
Manchester 04 27
Refactorings implemented in HaRe
RenameDeleteLift (top level / one level)DemoteIntroduce definitionRemove definitionUnfoldGeneraliseAdd and remove parameters
All these refactorings are module aware.
Manchester 04 28
Implementing HaRe: an example
-- This is an example
module Main where
sumSquares x y = sq x + sq y
where sq :: Int->Int
sq x = x ^ pow
pow = 2 :: Int
main = sumSquares 10 20
Promote the definition of sq to top level
Manchester 04 29
Implementing HaRe: an example
-- This is an example
module Main where
sumSquares x y = sq x + sq y
where sq :: Int->Int
sq x = x ^ pow
pow = 2 :: Int
main = sumSquares 10 20
Identify the definition of sq to be promoted
Manchester 04 30
Implementing HaRe: an example
-- This is an example
module Main where
sumSquares x y = sq x + sq y
where sq :: Int->Int
sq x = x ^ pow
pow = 2 :: Int
main = sumSquares 10 20
Is sq defined at top level, here or in importing modules; is sq imported from elsewhere?
Manchester 04 31
Implementing HaRe: an example
-- This is an example
module Main where
sumSquares x y = sq x + sq y
where sq :: Int->Int
sq x = x ^ pow
pow = 2 :: Int
main = sumSquares 10 20
Does sq use anything defined locally to sumSquares ?
Manchester 04 32
Implementing HaRe: an example
-- This is an example
module Main where
sumSquares x y = sq pow x + sq pow y
where sq :: Int->Int->Int
sq pow x = x ^ pow
pow = 2 :: Int
main = sumSquares 10 20
If so, generalise to add these as parameters, and change type signature.
Manchester 04 33
Implementing HaRe: an example
-- This is an example
module Main where
sumSquares x y = sq pow x + sq pow y
where pow = 2 :: Int
sq :: Int->Int->Int
sq pow x = x ^ pow
main = sumSquares 10 20
Finally, move the definition to top level.
Manchester 04 34
The Implementation of Hare
Informationgathering
Pre-conditionchecking
Programtransformation
Programrendering
Manchester 04 35
Information needed
Syntax: replace the function called sq, not the variable sq …… parse tree.
Static semantics: replace this function sq, not all the sq functions …… scope information.
Module information: what is the traffic between this module and its clients …… call graph.
Type information: replace this identifier when it is used at this type …… type annotations.
Manchester 04 36
Infrastructure
To achieve this we chose to:
• build a tool that can interoperate with emacs, vim, … yet act separately.
• leverage existing libraries for processing Haskell 98, for tree transformation, yet …
… modify them as little as possible.
• be as portable as possible, in the Haskell space.
Manchester 04 37
The Haskell background
Libraries• parser: many• type checker: few• tree transformations: few
Difficulties• Haskell98 vs. Haskell extensions.• Libraries: proof of concept vs. distributable.• Source code regeneration.• Real project
Manchester 04 38
Programatica
Project at OGI to build a Haskell system …
… with integral support for verification at various levels: assertion, testing, proof etc.
The Programatica project has built a Haskell front end in Haskell, supporting syntax, static, type and module analysis …
… freely available under BSD licence.
Manchester 04 39
The Implementation of Hare
Informationgathering
Pre-conditionchecking
Programtransformation
Programrendering
Manchester 04 40
First steps … lifting and friends
Use the Haddock parser … full Haskell given in 500 lines of data type definitions.
Work by hand over the Haskell syntax: 27 cases for expressions …
Code for finding free variables, for instance …
Manchester 04 41
Finding free variables … 100 lines
instance FreeVbls HsExp where
freeVbls (HsVar v) = [v]
freeVbls (HsApp f e)
= freeVbls f ++ freeVbls e
freeVbls (HsLambda ps e)
= freeVbls e \\ concatMap paramNames ps
freeVbls (HsCase exp cases)
= freeVbls exp ++ concatMap freeVbls cases
freeVbls (HsTuple _ es)
= concatMap freeVbls es
… etc.
Manchester 04 42
This approach
Boiler plate code …
… 1000 lines for 100 lines of significant code.
Error prone: significant code lost in the noise.
Want to generate the boiler plate and the tree traversals …
… DriFT: Winstanley, Wallace… Strafunski: Lämmel and Visser
Manchester 04 43
Strafunski
Strafunski allows a user to write general (read generic), type safe, tree traversing programs …
… with ad hoc behaviour at particular points.
Traverse through the tree accumulating free variables from component parts, except in the case of lambda abstraction, local scopes, …
Strafunski allows us to work within Haskell … other options are under development.
Manchester 04 44
Rename an identifier
rename:: (Term t)=>PName->HsName->t->Maybe t rename oldName newName = applyTP worker where worker = full_tdTP (idTP ‘adhocTP‘ idSite) idSite :: PName -> Maybe PName idSite v@(PN name orig) | v == oldName = return (PN newName orig) idSite pn = return pn
Manchester 04 45
The coding effort
Transformations with Strafunski are straightforward …
… the chore is implementing conditions that guarantee that the transformation is meaning- preserving.
This is where much of our code lies.
Manchester 04 46
The Implementation of Hare
Informationgathering
Pre-conditionchecking
Programtransformation
Programrendering
Manchester 04 47
Program rendering example
-- This is an example
module Main where
sumSquares x y = sq x + sq y
where sq :: Int->Int
sq x = x ^ pow
pow = 2 :: Int
main = sumSquares 10 20
Promote the definition of sq to top level
Manchester 04 48
Program rendering example
module Main where
sumSquares x y
= sq pow x + sq pow y where pow = 2 :: Int
sq :: Int->Int->Int
sq pow x = x ^ pow
main = sumSquares 10 20
Using a pretty printer: comments lost and layout quite different.
Manchester 04 49
Program rendering example
-- This is an example
module Main where
sumSquares x y = sq x + sq y
where sq :: Int->Int
sq x = x ^ pow
pow = 2 :: Int
main = sumSquares 10 20
Promote the definition of sq to top level
Manchester 04 50
Program rendering example
-- This is an example
module Main where
sumSquares x y = sq pow x + sq pow y
where pow = 2 :: Int
sq :: Int->Int->Int
sq pow x = x ^ pow
main = sumSquares 10 20
Layout and comments preserved.
Manchester 04 51
Rendering: our approach
White space and comments in the token stream.
2 views of the program: token stream and AST.
Modification of the AST guides the modification of the token stream.
After a refactoring, the program source is extracted from the token stream not the AST.
Use heuristics to associate comments with semantic entities.
Manchester 04 52
Production tool (version 0)
Programaticaparser and
type checker
Refactorusing a
Strafunskiengine
Render codefrom the
token streamand
syntax tree.
Manchester 04 53
Production tool (version 1)
Programaticaparser and
type checker
Refactorusing a
Strafunskiengine
Render codefrom the
token streamand
syntax tree.
Pass lexical information toupdate thesyntax treeand so avoid reparsing
Manchester 04 54
Module awareness: example
Move a top-level definition f from module A to B.
-- Is f defined at the top-level of B? -- Are the free variables in f accessible within module B? -- Will the move require recursive modules?
-- Remove the definition of f from module A. -- Add the definition to module B. -- Modify the import/export lists in module A, B and the client modules of A and B if necessary. -- Change uses of A.f to B.f or f in all affected modules. -- Resolve ambiguity.
Manchester 04 55
What have we learned?
Emerging Haskell libraries make it a practical platform.
Efficiency issues … type checking large systems.
Limitations of IDE interactions in vim and emacs.
Reflections on Haskell itself.
Manchester 04 56
Reflecting on Haskell
Cannot hide items in an export list (though you can on import).
The formal semantics of pattern matching is problematic.
‘Ambiguity’ vs. name clash.
‘Tab’ is a nightmare!
Correspondence principle fails …
Manchester 04 57
Correspondence
Operations on definitions and operations on expressions can be placed in correspondence
(R.D.Tennent, 1980)
Manchester 04 58
Correspondence
Definitions
where
f x y = e
f x
| g1 = e1
| g2 = e2
Expressions
let
\x y -> e
f x = if g1 then e1 else if g2 …
Manchester 04 59
Where do we go next?
• Larger-scale examples: ADTs, monads, …
• An API for do-it-yourself refactorings, or …
• … a language for composing refactorings
• Detecting ‘bad smells’
• Evolving the evidence: GC6.
Manchester 04 60
What do users want?
Find and remove duplicate code.
Argument permutations.
Data refactorings.
More traditional program transformations.
Monadification.
Manchester 04 61
Monadification (cf Erwig)
r = f e1 e2 do v1 <- e1
v2 <- e2
r <- f v1 v2
return r
Manchester 04 62
Larger-scale examples
More complex examples in the functional domain; often link with data types.
Dawning realisation that can some refactorings are pretty powerful.
Bidirectional … no right answer.
Manchester 04 63
Algebraic or abstract type?
data Tr a
= Leaf a |
Node a (Tr a) (Tr a)
Tr
Leaf
Node
flatten :: Tr a -> [a]
flatten (Leaf x) = [x]
flatten (Node s t)
= flatten s ++
flatten t
Manchester 04 64
Algebraic or abstract type?
data Tr a
= Leaf a |
Node a (Tr a) (Tr
a)
isLeaf = …
isNode = …
…
Tr
isLeaf
isNode
leaf
left
right
mkLeaf
mkNode
flatten :: Tr a -> [a]
flatten t
| isleaf t = [leaf t]
| isNode t
= flatten (left t)
++ flatten (right t)
Manchester 04 65
Algebraic or abstract type?
Pattern matching syntax is more direct …
… but can achieve a considerable amount with field names.
Other reasons? Simplicity (due to other refactoring steps?).
Allows changes in the implementation type without affecting the client: e.g. might memoise
Problematic with a primitive type as carrier.
Allows an invariant to be preserved.
Manchester 04 66
Outside or inside?
data Tr a
= Leaf a |
Node a (Tr a) (Tr
a)
isLeaf = …
…
Tr
isLeaf
isNode
leaf
left
right
mkLeaf
mkNode
flatten :: Tr a -> [a]
flatten t
| isleaf t = [leaf t]
| isNode t
= flatten (left t)
++ flatten (right t)
Manchester 04 67
Outside or inside?
data Tr a
= Leaf a |
Node a (Tr a) (Tr
a)
isLeaf = …
…
flatten = …
Tr
isLeaf
isNode
leaf
left
right
mkLeaf
mkNode
flatten
Manchester 04 68
Outside or inside?
If inside and the type is reimplemented, need to reimplement everything in the signature, including flatten.
The more outside the better, therefore.
If inside can modify the implementation to memoise values of flatten, or to give a better implementation using the concrete type.
Layered types possible: put the utilities in a privileged zone.
Manchester 04 69
API
Refactorings
Refactoringutilities
Strafunski
Haskell
Manchester 04 70
DSL
Refactorings
Refactoringutilities
Strafunski
Haskell
Combining forms
Manchester 04 71
Detecting ‘bad smells’
Work byChris Ryder
Manchester 04 72
Evolving the evidence
Dependable System Evolution is the software engineering grand challenge.
Build systems with evidence of their dependability …
… but this begs the question of how to evolve the evidence in line with the system.
Refactoring proofs, test coverage data etc.
Manchester 04 73
Teaching and learning design
Exciting prospect of using a refactoring tool as an integral part of an elementary programming course.
Learning a language: learn how you could modify the programs that you have written …
… appreciate the design space, and
… the features of the language.
Manchester 04 74
Conclusions
Refactoring + functional programming: good fit.
Practical tool … not ‘yet another type tweak’.
Leverage from available libraries … with work.
We have begun to use the tool in building itself!
Much more to do than we have time for.
Martin Fowler’s ‘Rubicon’: ‘extract definition’ … in HaRe version 1 … fp productivity.
www.cs.kent.ac.uk/projects/refactor-fp/