Going totally Bananas - Uni Koblenz-Landaulaemmel/120927-delft.pdf · mapM :: Monad m => (a -> m b)...
Transcript of Going totally Bananas - Uni Koblenz-Landaulaemmel/120927-delft.pdf · mapM :: Monad m => (a -> m b)...
© 2012 Ralf Lämmel
Going totally Bananas(The Art of Traversal)
Ralf Lämmel@reallynotabba
Software Language EngineerUniversity of Koblenz-Landau
Germany
1
© 2012 Ralf Lämmel
Avoiding unemployability
(Mastering basics of map and fold.)
First topic
2
© 2012 Ralf Lämmel
Haskell’s map
map f l applies the function f to each element of the list l and produces the list of results.
> map ((+) 1) [1,2,3][2,3,4]> map ((*) 2) [1,2,3][2,4,6]> map even [1,2,3][False,True,False]> :t mapmap :: (a -> b) -> [a] -> [b]
Increment the numbers.
The type of map
Double the numbers.
Test all numbers to be even.
3
© 2012 Ralf Lämmel
The definition of map
map :: (a -> b) -> [a] -> [b]
map f [] = []
map f (x:xs) = f x : map f xs
Case for non-empty list
Case for empty list
Example: map f [1,2,3,4] = [f 1,f 2,f 3,f 4]
4
© 2012 Ralf Lämmel
Haskell’s foldr
foldr f x l combines all elements of the list l with the binary operation f starting from x.
> foldr (+) 0 [1,2,3]6> foldr (&&) True [True,True,False]False> :t foldrfoldr :: (a -> b -> b) -> b -> [a] -> b
Sum up the numbers.
The type of foldr
‘And‘ the Booleans.
5
© 2012 Ralf Lämmel
The definition of foldr
foldr :: (a -> b -> b) -> b -> [a] -> b
foldr f k [] = k
foldr f k (x:xs) = f x (foldr f k xs)
Case for non-empty list
Case for empty list
Example: foldr f k [1,2,3,4] = 1 `f` (2 `f` (3 `f` (4 `f` k)))
6
© 2012 Ralf Lämmel
Functional Programming Languages and Computer Architecture 1991 (FPCA’91)
Reading this paper causes
serious headache.
This is why folds are called bananas.
The p
aper
!
7
© 2012 Ralf Lämmel
Don’t fear the monoid!
(Looking at traversal algebraically.)
Next topic
8
© 2012 Ralf Lämmel
Typical forms of reduction
sum = foldr (+) 0
product = foldr (*) 1
and = foldr (&&) True
or = foldr (II) False
concat = foldr (++) []
We usemonoids!
9
© 2012 Ralf Lämmel
[http://en.wikipedia.org/wiki/Monoid] 27 Sep 2012
10
© 2012 Ralf Lämmel
Monoid typeclass
class Monoid a where mappend :: a -> a -> a -- associative operation mempty :: a -- identity of mappend
11
© 2012 Ralf Lämmel
Monoid instance for ‘summation’
newtype Sum a = Sum { getSum :: a }
instance Num a => Monoid (Sum a)
where Sum x `mappend` Sum y = Sum (x + y) mempty = Sum 0
Distinguished numbers
Addition on wrapped numbers
12
© 2012 Ralf Lämmel
Monoid instance for ‘multiplication’
newtype Product a = Product { getProduct :: a }
instance Num a => Monoid (Product a)
where Product x `mappend` Product y = Product (x * y) mempty = Product 1
13
© 2012 Ralf Lämmel
Monoid laws
x `mappend` (y `mappend` z) = (x `mappend` y) `mappend` z
x `mappend` mempty = mempty `mappend` x = x
Associativity
Identity element
These required
properties could be validated or
verified.
14
© 2012 Ralf Lämmel
Foldr for monoids
class Monoid a where mappend :: a -> a -> a -- associative operation mempty :: a -- identity of mappend mconcat :: [a] -> a mconcat = foldr mappend mempty
Why is this operation defined within the typeclass?
15
© 2012 Ralf Lämmel
Basic interview knowledge
• Map & friends is ‘foreach’ for intellectuals.
• Foldr is ‘aggregation’ of results from collections.
• Google et al. use map+reduce for ‘big data’.
• Universal algebra is peaceful and lovely.
aka ‘reduce’
16
© 2012 Ralf Lämmel
Lists are everywhere. Ask LINQ.
(Doing traversal on real data.)
Next topic
17
© 2012 Ralf Lämmel
Datatypes for companies
data Company
= Company Name [Department]
data Department
= Department Name Manager [Department] [Employee]
data Employee = Employee Name Address Salary
type Manager = Employee
type Name = String
type Address = String
type Salary = Float
[101implementation:haskell]
Let’s map and foldr
over companies.
Click to enjoy!
18
© 2012 Ralf Lämmel
Company structure in Haskell
company = Company "meganalysis" [ Department "Research" (Employee "Craig" "Redmond" 123456) [] [ Employee "Erik" "Utrecht" 12345, Employee "Ralf" "Koblenz" 1234 ], Department "Development" (Employee "Ray" "Redmond" 234567) [ Department "Dev1" (Employee "Klaus" "Boston" 23456) [ Department "Dev1.1" (Employee "Karl" "Riga" 2345) [] [ Employee "Joe" "Wifi City" 2344 ] ] [] ] [] ]
19
© 2012 Ralf Lämmel
Cutting salaries in Haskell
cut :: Company -> Company
cut (Company n ds) = Company n (map dep ds)
where
dep :: Department -> Department
dep (Department n m ds es)
= Department n (emp m) (map dep ds) (map emp es)
where
emp :: Employee -> Employee
emp (Employee n a s) = Employee n a (s/2)
[101implementation:haskell]
20
© 2012 Ralf Lämmel
Totaling salaries in Haskell
total :: Company -> Float
total (Company n ds) = sum (map dep ds)
where
dep :: Department -> Float
dep (Department _ m ds es)
= sum (emp m : map dep ds ++ map emp es)
where
emp :: Employee -> Float
emp (Employee _ _ s) = s
[101implementation:haskell]
21
© 2012 Ralf Lämmel
101companies potpourri
•Haskell
•Python
•SQL
•Java
•C#
• ...Click to enjoy!
Map and fold or counterparts thereof in various languages
22
© 2012 Ralf Lämmel
Foldr does it all.
(Doing traversal like a geek.)
Next topic
23
© 2012 Ralf Lämmel
Mapping with foldr
> let words = ["abc","bcd","bxy","ayz"]
> map head words
"abba" BTW, this is a “type-changing” map because string elements are
mapped to characters.
Apply f per elementPreserve list shape
map :: (a -> b) -> [a] -> [b]
map f = foldr ((:) . f) []
24
© 2012 Ralf Lämmel
Monadic with foldr> let words = ["abc","bcd","bxy","ayz"]
> runState (mapM head_and_count words) 0
("abba",4)
head_and_count :: String -> State Int Charhead_and_count x = get >>= put . (+1) >> return (head x)
mapM :: Monad m => (a -> m b) -> [a] -> m [b]mapM g = foldr f k where k = return [] f x mys = do y <- g x; ys <- mys; return (y:ys)
Apply ‘head’ to element, but also
increment counter
Define monadic map in terms of foldr.
Perform mapping in State monad used for
counting elements
25
© 2012 Ralf Lämmel
Compute a list’s length with foldr
> length [1,2,3,4]4
length :: [a] -> Intlength = foldr ( (+) . (const 1) ) 0
Ignore the actual element
26
© 2012 Ralf Lämmel
Filtering a list with foldr
> filter odd [1,2,3,4][1,3]
filter :: (a -> Bool) -> [a] -> [a]filter p = foldr f [] where f x = if p x then (:) x else id
Exercise: make the definition less point-
free for clarity.
27
© 2012 Ralf Lämmel
List reversal with foldr
> reverse [1,2,3,4][4,3,2,1]
reverse :: [a] -> [a]reverse = foldr ( flip (++) . (\x->[x]) ) []
Inefficient definition
28
© 2012 Ralf Lämmel
foldr vs. foldl
foldr :: (a -> b -> b) -> b -> [a] -> b -- for comparisonfoldl :: (b -> a -> b) -> b -> [a] -> bfoldl f k [] = kfoldl f k (x:xs) = foldl f (k `f` x) xs
reverse :: [a] -> [a]reverse = foldl (flip (:)) []
foldr f k [1,2,3,4] = 1 `f` (2 `f` (3 `f` (4 `f` k)))
foldl f k [1,2,3,4] = (((k `f` 1) `f` 2) `f` 3) `f` 4
Efficient definition
foldl can also be defined (efficiently) in terms of
foldr; see further reading.
29
© 2012 Ralf Lämmel
Functors: datatypes that can be mapped.
class Functor f where fmap :: (a -> b) -> f a -> f b
instance Functor [ ] where fmap = map
Laws: fmap id = id fmap (p . q) = (fmap p) . (fmap q)
30
© 2012 Ralf Lämmel
Let’s do an exercise:a functor for n-ary, labeled trees.
31
© 2012 Ralf Lämmel
Why will it fail to capture traversal over companies with a functor?
32
© 2012 Ralf Lämmel
Foldables:datatypes that can be mapped & reduced.
class Foldable t where foldMap :: Monoid m => (a -> m) -> t a -> m ...
instance Foldable [ ] where foldMap f = foldr (mappend . f) mempty ...
Can this definition be understood as map followed by reduce?
33
© 2012 Ralf Lämmel
Let’s do an exercise:foldMap for n-ary, labeled trees.
34
© 2012 Ralf Lämmel
Traversables: datatypes that can be traversedwith applicative functors (‘effects’)
class (Functor t, Foldable t) => Traversable t where traverse :: Applicative f => (a -> f b) -> t a -> f (t b) ...
instance Traversable [ ] where traverse = mapA
traverse is like a monadic map--except that applicative functors are used instead of monads.
35
© 2012 Ralf Lämmel
Dealing with large bananas.
(Traversal for systems of datatypes.)
Next topic
Exercise: try to figure out how this topic relates to visitors.
36
© 2012 Ralf Lämmel
Let’s factor folding!
cut :: Company -> Company
cut (Company n ds) = Company n (map dep ds)
where
dep :: Department -> Department
dep (Department n m ds es)
= Department n (emp m) (map dep ds) (map emp es)
where
emp :: Employee -> Employee
emp (Employee n a s) = Employee n a (s/2)
(This is the basic implementation of cut.)
Observe the function applications per compound constructor component.
37
© 2012 Ralf Lämmel
Another way to think of foldr
> sum l1 = [1,2,3,4]
10
sum :: Num a => [a] -> a
sum = foldr (+) 0
1
:
2
:
3
:
4
:
[]
1
+
2
+
3
+
4
+
0
[1,2,3,4] = 1:(2:(3:(4:[])))
Two arguments: i) How to replace ‘:’?
ii) How to replace ‘[]’?
38
© 2012 Ralf Lämmel
data ListAlgebra a b = ListAlgebra { atNil :: b, atCons :: a -> b -> b }
foldList :: ListAlgebra a b -> [a] -> bfoldList f [] = atNil ffoldList f (x:xs) = atCons f x (foldList f xs)
Type parameter for element type
Type parameter for results
Another way to think of foldr
A product with one component per constructor
A fold function using the algebra
39
© 2012 Ralf Lämmel
Fold algebras for companiesmodule CompanyAlgebra where
import Companyimport Data.Monoid
data CompanyAlgebra c d e = CompanyAlgebra { atCompany :: Name -> [d] -> c, atDepartment :: Name -> e -> [d] -> [e] -> d, atEmployee :: Name -> Address -> Salary -> e}
foldCompany :: CompanyAlgebra c d e -> Company-> cfoldCompany f (Company n ds) = atCompany f nds'where ds' = map (foldDepartment f) ds
foldDepartment :: CompanyAlgebra c d e ->Department -> dfoldDepartment f (Department n m ds es) =atDepartment f n m' ds' es'where m' = foldEmployee f m ds' = map (foldDepartment f) ds es' = map (foldEmployee f) es
foldEmployee :: CompanyAlgebra c d e ->Employee -> efoldEmployee f (Employee n w s) = atEmployee fn w s
[101implementation:tabaluga]
This product provides one component per constructor in the datatypes (sum of ...).
Type parameters for result types corresponding to different datatypes
40
© 2012 Ralf Lämmel
A fold algebra for cutting salaries
[101implementation:tabaluga]
module Cut where
import Companyimport CompanyAlgebra
cut' :: Company -> Companycut' = foldCompany cutAlgebra where cutAlgebra = mapCompany { atEmployee = \n a s -> Employee n a (s/2) }
-- For comparison, without using mapCompany.
cut :: Company -> Companycut = foldCompany cutAlgebra where cutAlgebra :: CompanyAlgebra Company Department Employee cutAlgebra = CompanyAlgebra { atCompany = Company, atDepartment = Department, atEmployee = \n a s -> Employee n a (s/2) } Reconstruct companies
and departments, cut salaries for employees.
41
© 2012 Ralf Lämmel
Folds for companies
module CompanyAlgebra where
import Companyimport Data.Monoid
data CompanyAlgebra c d e = CompanyAlgebra { atCompany :: Name -> [d] -> c, atDepartment :: Name -> e -> [d] -> [e] -> d, atEmployee :: Name -> Address -> Salary -> e}
foldCompany :: CompanyAlgebra c d e -> Company -> cfoldCompany f (Company n ds) = atCompany f n ds' where ds' = map (foldDepartment f) ds
foldDepartment :: CompanyAlgebra c d e -> Department -> dfoldDepartment f (Department n m ds es) = atDepartment f n m' ds' es' where m' = foldEmployee f m ds' = map (foldDepartment f) ds es' = map (foldEmployee f) es
foldEmployee :: CompanyAlgebra c d e -> Employee -> efoldEmployee f (Employee n w s) = atEmployee f n w s
mapCompany :: CompanyAlgebra Company Department EmployeemapCompany = CompanyAlgebra { atCompany = Company, atDepartment = Department, atEmployee = Employee}
mconcatCompany :: Monoid m => CompanyAlgebra m m mmconcatCompany = CompanyAlgebra { atCompany = \n ds -> mconcat ds, atDepartment = \n m ds es -> mappend m (mappend (mconcat ds)
[101implementation:tabaluga]
Never mind. This is/should be generated. It captures
‘drill into structure’.
42
© 2012 Ralf Lämmel
Let’s factor constructor defaults!
module Cut where
import Companyimport CompanyAlgebra
cut' :: Company -> Companycut' = foldCompany cutAlgebra where cutAlgebra = mapCompany { atEmployee = \n a s -> Employee n a (s/2) }
-- For comparison, without using mapCompany.
cut :: Company -> Companycut = foldCompany cutAlgebra where cutAlgebra :: CompanyAlgebra Company Department Employee cutAlgebra = CompanyAlgebra { atCompany = Company, atDepartment = Department, atEmployee = \n a s -> Employee n a (s/2) }
Observe the pattern of rewrapping intermediate results with the same constructor.
43
© 2012 Ralf Lämmel
Deep identity map
module CompanyAlgebra where
import Companyimport Data.Monoid
data CompanyAlgebra c d e = CompanyAlgebra { atCompany :: Name -> [d] -> c, atDepartment :: Name -> e -> [d] -> [e] -> d, atEmployee :: Name -> Address -> Salary -> e}
foldCompany :: CompanyAlgebra c d e -> Company -> cfoldCompany f (Company n ds) = atCompany f n ds' where ds' = map (foldDepartment f) ds
foldDepartment :: CompanyAlgebra c d e -> Department -> dfoldDepartment f (Department n m ds es) = atDepartment f n m' ds' es' where m' = foldEmployee f m ds' = map (foldDepartment f) ds es' = map (foldEmployee f) es
foldEmployee :: CompanyAlgebra c d e -> Employee -> efoldEmployee f (Employee n w s) = atEmployee f n w s
mapCompany :: CompanyAlgebra Company Department EmployeemapCompany = CompanyAlgebra { atCompany = Company, atDepartment = Department, atEmployee = Employee }
mconcatCompany :: Monoid m => CompanyAlgebra m m mmconcatCompany = CompanyAlgebra { atCompany = \n ds -> mconcat ds, atDepartment = \n m ds es -> mappend m (mappend
[101implementation:tabaluga]
Think of algebraic datatypes for a language syntax with hundreds of constructors.
Having default matters.
44
© 2012 Ralf Lämmel
Cutting salaries with a fold
[101implementation:tabaluga]
module Cut where
import Companyimport CompanyAlgebra
cut :: Company -> Companycut = foldCompany cutAlgebra where cutAlgebra = mapCompany { atEmployee = \n a s -> Employee n a (s/2) }
Reuse ‘mapCompany’; customize the case for
‘employees’.
45
© 2012 Ralf Lämmel
What’s a good default for ‘reduction’?
total :: Company -> Float
total (Company n ds) = sum (map dep ds)
where
dep :: Department -> Float
dep (Department _ m ds es)
= sum (emp m : map dep ds ++ map emp es)
where
emp :: Employee -> Float
emp (Employee _ _ s) = s
(This is the basic implementation of total.)46
© 2012 Ralf Lämmel
Deep zero reduction
mconcatCompany :: Monoid m => CompanyAlgebra m m mmconcatCompany = CompanyAlgebra { atCompany = \n ds -> mconcat ds, atDepartment = \n m ds es -> mappend m (mappend (mconcat ds) (mconcat es)), atEmployee = \n w s -> mempty}
[101implementation:tabaluga]
Return ‘mempty’ for base cases; ‘mappend’ intermediate results.
47
© 2012 Ralf Lämmel
Totaling salaries with a fold
[101implementation:tabaluga]
module Total where
import Companyimport CompanyAlgebraimport Data.Monoid
total :: Company -> Floattotal = getSum . foldCompany totalAlgebra where totalAlgebra = mconcatCompany { atEmployee = \n a s -> Sum s }
Reuse ‘mconcatCompany’; customize the case for
‘employees’.
48
© 2012 Ralf Lämmel
101companies potpourri
•Java (using visitors)
• ... Click to enjoy!
49
© 2012 Ralf Lämmel
Scrap your boilerplate.
(Doing traversal really concisely.)
Next topic
http://commons.wikimedia.org/wiki/File:Boilerplate_Mercury_Capsule_-_GPN-2000-003008.jpg
50
© 2012 Ralf Lämmel
The visitor pattern is
pretty much brain-dead.Fold algebras are
conceptually clean and simple, but they feel a bit bulky and limited, also.
51
© 2012 Ralf Lämmel
total :: Company -> Float
total = everything (+) (extQ (const 0) id)
cut :: Company -> Company
cut = everywhere (extT id (/(2::Float)))
SYB style generic programming
Traversal scheme to query all subterms
Override polymorphic constant function 0 to
return floats when present.
Traversal scheme to transform all subterms
Override polymorphic identity to divide by 2 when faced with floats.
52
© 2012 Ralf Lämmel
extT g s x = case cast s of Nothing -> g x (Just s') -> s' x
Likewise for extQ.
Generic function construction
cast is a primitive operation.
generic defaulttype-specific
casepolymorphic
argument
53
© 2012 Ralf Lämmel
everywhere f = f . gmapT (everywhere f)
everything k f x = foldl k (f x) (gmapQ (everything k f) x)
type GenericT = ∀ a. Data a => a -> atype GenericQ r = ∀ a. Data a => a -> r
everywhere :: GenericT -> GenericT
everything :: (r -> r -> r) -> GenericQ r -> GenericQ r
Generic traversal schemesGeneric
transformationsand queries
Recursively transform all children, then transform root.
Recursively query all children, then fold over the intermediate results using the root query for the initial value.
54
© 2012 Ralf Lämmel
gmapT :: GenericT -> GenericTgmapT f (C t1 ... tn) = C (f t1) ... (f tn)
gmapQ :: GenericQ r -> GenericQ [r]gmapQ f (C t1 ... tn) = [(f t1), ..., (f tn)]
Pseudo-code forgeneric one-layer traversal
Map f over immediate subterms, preserve
constructor C.
Map f over immediate subterms, collect results in
list.
gmapT and gmapQ are defined in terms of a gfoldl primitive.
55
© 2012 Ralf Lämmel
Summary
• Mastering basics of map and fold.
• Looking at traversal algebraically.
• Doing traversal on real data.
• Doing traversal like a geek.
• Traversal for systems of datatypes.
• Doing traversal really concisely.
Outlook: M
apReduce for
data parallelism and big data.
56
© 2012 Ralf Lämmel
• “Functional Programming with Bananas, Lenses, Envelopes and Barbed Wire” by Erik Meijer, Maarten Fokkinga, and Ross Paterson, Proceedings of FPCA 1991.http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.41.125
• “A fold for all seasons” by Tim Sheard and Leonidas Fegaras, Proceedings of FPCA 1993.http://portal.acm.org/citation.cfm?id=165216
• “A tutorial on the universality and expressiveness of fold” by Graham Hutton, Journal of Functional Programming 1999.http://www.citeulike.org/user/pintman/article/468391
• “Dealing with large bananas” by Ralf Lämmel, Joost Visser, and Jan Kort, Proceedings of WGP 2000.http://homepages.cwi.nl/~ralf/wgp00/
• “Scrap your boilerplate ....” (various papers)by various authors since 2003.http://sourceforge.net/apps/mediawiki/developers/index.php?title=ScrapYourBoilerplate
• “The Essence of Data Access in Comega” by Gavin M. Bierman, Erik Meijer, and Wolfram Schulte, Proceeedings of ECOOP 2005.http://dx.doi.org/10.1007/11531142_13
• “The Essence of the Iterator Pattern” by Jeremy Gibbons and Bruno Oliveira, Proceedings of MSFP 2006.http://lambda-the-ultimate.org/node/1410
Further reading
57