Haskell Monad Tutorial

8/2/2019 Haskell Monad Tutorial

1/27

Monad tutorial

From http://spbhug.folding-maps.org/wiki/MonadsEn

Introduction

The word "monad" is familiar to virtually everyone who has dealt with functional programming.

Many people are scared away by their seeming abstractness and mathematicity and by the need

to use them for the most simple tasks, like console I/O. The Internet is filled with monad tutorials

and there is even a popular belief that one has the right to be considered an FP newbie no soonerthan he writes such a tutorial. These articles present monads from different points of view: from

the one of their underlying mathematical concepts, or like a 'hack' into the imperative world, or

from a purely practical point of view, or extremely broadly and comprehensively. The text you

are reading is meant to illuminate them from another one: monads will be presented as ageneralization of certain common idioms and as a method for abstracting them.

Goal and target audience

It is assumed that the reader is familiar with basics of Haskell and with the methodology of FP in

general. Otherwise, one can get the required knowledge from articles listed at the end of thisarticle. My goal is to explain the concept and give an intuitive understanding of monads with

several examples, some of which are common and familiar and some are not. This text is not a

practical manual on usage of standard monads.

Three familiar examples

Here we consider three common idioms: sequential computations, computations with handling ofabsent values and computations that return a set of possible results. Then we shall point out their

commonality.

Sequential computations (the IO monad)

Consider the ';' operation, which means 'and then':

(print "Hello") ; (print "Goodbye")

This line means: Display "Hello" and then display "Goodbye". The execution of this programproceeds as follows:

At moment t0 the screen is blank At moment t1 the screen is modified by print "Hello"; the screen contains 'Hello'. At moment t2 the screen is modified by print "Goodbye"; the screen contains 'Hello

Goodbye'


2/27

Let us write this sequence as a system of simultaneous equations:

world(t=0) = (empty screen)world(t=1) = world(t=0) + (print "Hello")world(t=2) = world(t=1) + (print "Goodbye")

This way of programming (expressing the solution of a problem via a sequence of actions) is themost well-known and used one in main-stream languages: C, Java etc. One can easily see that inthe world of imperative computations, I/O etc. all such systems of equations look like a 'chain':

world(n+1) = world(n) + action. One can not take an old state of the world out of one'spocket and perform an action on it, because two different states of the world cannot coexist.

Thus, an equation like world(10) = world(3) + (print "Hello") is impossible.

So, a sequence of actions A1; A2; A3; ... can be viewed as

A1 `then` (A2 `then` A3 `then` ...). The then operation suffices for describing a

sequence of actions (it is actually the same as a semicolon, ;, but we are not used to seeing thesemicolon as a binary operation). Note that the operation is associative (t.i.,

((A1 `then` A2) `then` A3) == (A1 `then` (A2 `then` A3))), so the parentheses can beomitted; all languages do so. However, for a better intuitive understanding ('do the first action,

then do the rest') one should interpret the `then` operation as right-associative.

Now, a question arises: what should be considered the value of(A `then` B): the value of A,

the value of B or something else? The traditional and logical way is to take the value of B.Really, if we had to first evaluate A and then evaluate B, that means that correctness of

evaluation of B did depend on evaluation of A: for example, in the sequence

(send query to database) `then` (await answer from database)

the correctness of 'await answer' depends on whether 'send query' was evaluated. Thus, if we areinterested in the correctness of B, we are interested in its value, at least in the cases where we are

evaluating (A `then` B) to get some value and not just for the side-effects.

From a purely functional point of view (where one considers that the only important thing about

a computation is its value) one could say that the `then` operation is unneeded; then in theexample above we could just write 'await answer from database'. But this is obviously false: we

should first completely evaluate 'send query' and only afterwards start evaluating 'await answer'.

This necessity is dictated by the pure fact that both computations have side effects, t.i. interactwith the outer world.

Morale: In the world of computations with side effects (I/O, modifications of variables)computations are bound with a 'and then' operation, which means 'do the first, and, only after it

completes, do the rest'.

Computations with absent values (Maybe monad)


3/27

Consider a synthetic but typical and familiar example: A program opens a file, reads a line from

it and searches a database for the corresponding value. If an error occurs at any stage, null isreturned.

File f = open("keys.txt");if(f == null)

return null;

String key = readLine(f);if(keys == null)

return null;

String value = ourDatabase.get(key);if(value == null)

return null;

return "The value is: " + value;

Now let us substitute the ; ('and then') operation for ;? ('and then, if successful'):

File f = open("keys.txt") ;?String key = readLine(f) ;?String value = ourDatabase.get(key) ;?return "The value is: " + value;

This example can be easily generalized to a more sensible way of handling errors, where instead

ofnull one uses values of an Exception type. However, the essence remains: functions returnvalues of two kinds: 'results' and 'error signals', and error in an intermediate computation causes

the error in the whole computation.

Morale: In the world where computations sometimes fail, computations are bound with the 'andthen, if successful' operation, which means 'do first and, if successful, do rest; if error E occurred,

return error E'. The example shows a particular case where there is only one type of error - null.

Computations with many results (List monad)

Consider computations that return a list of values. They can return them either directly, like 'list

files in a directory' or 'split string into a list of symbols', or indirectly, via a combination of such

multiple-valued computations:

the procedure 'get all orders of all departments of all shops', based on procedures 'listshops', 'list departments of a shop' and 'list orders in a department'

the procedure 'get all words in a file', based on procedures 'list lines of a file' and 'listwords of a line'

the procedure 'list files in a directory and its subdirectories', assembling its result fromresults of recursive calls.

Here is a typical implementation of the first procedure, which computes the total cost of allorders.


4/27

foreach(Shop s : getShops()) {foreach(Department d : getDepartments(s)) {

foreach(Order ord : getOrders(d)) {sum += ord.getCost();

}}

}

Now let's consider the case where the number of levels of iteration is unknown: for example,recursively listing a directory.

listFilesRec(File file) {List res;res.add(file);foreach(File sub : getContents(file)) {

res.addAll(listFilesRec(sub));}return res;

}

Let us now rewrite these examples using a ;* operation, which threads the next operationthrough all results of the previous one.

Shop s = getShops() ;*Department d = getDepartments(s) ;*Order ord = getOrders(d) ;*sum += ord.getCost();listFilesRec(File file) {

contents = getContents(file) ;*rec = listFilesRec(contents) ;*

return [file]++[rec];}

Morale: In the world of iterations and non-determinate computations, computations are bound

by the 'and, for each result' operation, which means, for example, 'for each result ord of

getOrders(d) evaluate sum += ord.getCost(d)'.

Warning

At this point, it is important to know that, although these three monads are very often used for

illustrating the concept, they are the most easy and maybe even degenerate ones, understanding

of just them does not give an understanding of monads in general. In my opinion, that's the

precise reason why many people don't understand monads or consider them too simple, toocomplex or too useless. You will get a good understanding after two other examples of monads,

namely Parser and especially Dist, which are described somewhat below. However, don't

proceed to that section before reading the text in between Now, let us continue.

What's in common?


5/27

An attentive reader might have already noticed that in all cases we chose a specific strategy of

binding two computations, the 'first' one and the 'rest' one; we overloaded the ; operator. One cansay that this is the precise essence of monads. Now, to use this as a means of abstraction, we'll

have to formalize this concept.

Let us formalize the concept of 'binding' two computations, t.i., generalize the ;, ;? and ;*operations to one operation, namely (>>=).

1. This is a polymorphic operation: its logic does not depend on the values and types ofvalues it binds. For example, ;* works in absolutely the same way, whether it binds a list

of shops with a computation depending on a shop or when it binds a list of files with a

computation depending on a file.

2. The type of(>>=) obviously depends on the types of the first and second (rest)computations: it will have at least 2 type variables: a (type of the first computation) and b

(type of the second computation). Note again the polymorphism: the type will be of formforall a, b . (a type expression depending on a and b)

3.

The result of this operation is the result of the second computation.forall a, b . ... -> b4. The second computation depends on the first, otherwise we wouldn't need any shiny

monads and could just perform the second computation.

So far, we get something like a -> (a -> b) -> b, where a is the first computation and (a -

> b) is the second, depending on a 'parameter' computed by the first. This is close to true(although there is surely something wrong with it: there exist only one function with such a type,

and it is flip ($), the reverse application function, and it doesn't deserve such a loud name as 'amonad'). However, there is an important caveat: a value is radically different from the same

value computed inside a monad! For example, computing "Hello" as "He"++"llo" is obviously

radically different from reading "Hello" from keyboard: the first way doesn't have side effects,whereas the second does (mainstream languages and languages without lazy evaluation tend to

ignore this distinction, but we won't). For example, readLine has type IO String instead of

String and a hypothetical lookupUserByName function has type String -> Maybe User

instead ofString -> User. So, the type of a computation in a monad m is not 'a' but 'm a'.This can be interpreted as 'A value computed in a particular way', for example 'A value computed

with side effects'(IO monad) or 'A value that maybe hasn't been computed because of an error'

(Maybe monad) or 'A value with multiple alternatives'(List monad). A monad attaches anadjective to all values.

Now, we get something like m a -> (m a -> m b) -> m b. This is even closer, but the second

computation actually doesn't need to know about the fact that its argument is 'wrapped'. Theprint function doesn't care about how the displayed string was computed, whereas

getContents doesn't care that it is actually called on several files and that its results areconcatenated. These dependencies should be managed not by the computations but by the

implementation of(>>=) in the monad; this is the precise mission of(>>=).

So, we get the type (>>=) :: m a -> (a -> m b) -> m b . This is the final and correct

variant. This type can be read as follows: (>>=) in a monadmbinds monadic computation of a


6/27

parameter (m a) with a monadic computation depending on this parameter (a -> m b),

yielding a monadic result (m b).

Now we are able to bind two monadic computations together, but there is no way to create a

monadic computation 'from nothing' for an arbitrary monad. It may seem that there is no need to

have such an operation, because we will always use a concrete monad (Further we'll see that thisisn't true and there exist useful functions that work for all monads), for instance, the List monad,

and for a concrete case, a way to create a monadic value usually exists: there's nothing difficult

about creating a list of values (a value of type List a).

However, since such an operation must, directly or indirectly, exist in any useful monad, it

makes sense to include it into the definition of a monad. This operation is called 'return' and has

type a -> m a. It takes an arbitrary fixed value and 'lifts' it into the monad, attaching the

monad's adjective to it. return converts a value to what we would get if the value was

computed inside the monad. Things will become more clear soon.

For example, in the IO monad return "Hello" :: IO String and this computation represents'the string "Hello", as if it was computed with side-effects' . In the Maybe monad,

return "Hello" = Just "Hello" and it means 'the string "Hello", as if it were the successfulresult of a computation that potentially could fail', and in the List monad

return "Hello" = ["Hello"] , 'the string "Hello", as if it were the sole result of a multiple-

valued computation'. The reasoning behind such a choice of implementing return are

intuitively clear but now we'll see their theoretical grounding.

Remember that (>>=) in the monad m binds monadic computation of a parameter with a monadic

computation depending on this parameter. If we perform the 'monadic computation of a

parameter' using return, it would make sense to require the effect to be the same as if the

parameter was simply passed to the second computation: (return x >>= f) == (f x). This isthe first of the three monad laws.

It's easy to check that for the monads considered above such implementations

(return "Hello" = ["Hello"] etc.) satisfy this law. There are another two laws, which are

less obvious; their meaning will be clear from their definitions. So, here are the three monad

laws:

Monad laws

Agreement ofreturn and (>>=): (return x >>= f) == (f x) Associativity of(>>=) : ((x >>= f) >>= g) == (x >>= (\y -> (f y >>= g))) .

This law allows to interpret the sequence a ; b ; c ; ... as monolithic and to not care

about placement of parentheses. In terms of operations like ;?, ;*, ; (let us denote them

all with ;;) one can re-formulate this law as follows:

z = ( y = x ;;return (f y ) ;;

return (g z)


7/27

is identical to

y = x ;;( z = (f y) ;;

return (g z) )

Right unit: (x >>= return) = x . One can rewrite this law as (x >>= (\a -> return a)) = x, that is, binding a monadic computation x with a parametrized

computation which simply returns the parameter is the same as just x.

Now we can give a complete definition of a monad:

Definition of a monad

A monad is a triple (m, return, >>=), where:

m is a type with one argument

return :: forall a . a -> m a(>>=) :: forall a, b . m a -> (a -> m b) -> m b

These requirements can be formalized at the type system level, and Prelude contains a classdefinition:

class Monad m wherereturn :: a -> m a(>>=) :: m a -> (a -> m b) -> m b

A monad should satisfy the laws (they can't be enforced or checked by the compiler but their

violation will most probably lead to exotic consequences):

(return x) >>= f = f x(x >>= f) >>= g = x >>= (\y -> f y >>= g)(x >>= return) = x

Actually, the Monad class in Prelude contains one more member: fail :: String -> m a butwe won't talk about it because it has no relation to the concept of monads and in the next version

of Haskell standard it is planned to move this function to a separate class.

Standard monads

Identity

This is the simplest monad, to which corresponds the adjective 'ordinary': Identity String is

'just an ordinary String'. This monad essentially doesn't change neither types of values, nor thebinding strategy.

data Identity a = Identity areturn a = Identity a


8/27

(Identity a) >>= f = f a

It is not clear why one would need such a monad, but it has applications with monadtransformers; however, they are beyond the scope of this article. So, let us consider this monad a

purely illustrative degenerate example.

Maybe (the monad of computations with handling of absent

values)

In this monad there are two kinds of values: 'ordinary' ones and a special value 'the value ismissing'.

data Maybe a = Nothing| Just a

The implementation is trivial: binding a 'just' value with a parametrized computation is simply

passing the parameter to it, whereas binding a missing parameter with a parametrizedcomputation yields a missing result.

return a = Just aNothing >>= f = Nothing(Just a) >>= f = f a

List (the monad of computations with multiple results)

Here each value has several 'alternatives'. If one value depends on another, alternatives of both

are enumerated.

return a = [a]params >>= f = concat [f x | x [b] is an 'indeterminate' function that returns a bunch of possible results, dependingon a parameter

Thus, as expected, params >>= f :: [b] is a list of possible results obtained by applying thefunction to possible arguments. For example, one might have["c:/music", "c:/work"] >>= getDirectoryContents = ["c:/music/Bach", "c:/music/Beethoven", "c:/music/Rammstein", "c:/work/projects", "c:/work/documents"]

State (the monad of computations with a mutable state)

This is a more complex monad not mentioned above. It corresponds to a computation that has an

internal state modified while it proceeds, but the state is not as global as in the IO monad (where


9/27

a computation potentially changes state of the whole world and where the state can't be obtained

directly or stored). Here each computation does two things: it returns a value and modifies thestate (t.i., returns a new state).

So, a computation of type a with a state of type s has type s -> (a,s). That's exactly the type

of the State monad: newtype State s a = State {runState :: s -> (a, s)} (probably itwould better be called StatefulComputation but that's somewhat too long).

Consider, for instance, a program implementing the Monte-Carlo method. It will need a stateful

random number generator. This program will use a function 'generate a random number' thatreturns a new random number and changes the generator state:

rand :: State RndGen Intrand = ... -- Let us omit the implementation for now; it will be shown below.

monteCarlo :: (Int -> Bool) -> Int -> State RndGen IntmonteCarlo experiment 0 = return 0

monteCarlo experiment n = rand >>= \r ->if (experiment r) thenmonteCarlo experiment (n-1) >>= \s ->return (s+1)

elsemonteCarlo experiment (n-1)

Note that the code ofmonteCarlo has no signs of presence of assignment and change of state:

the code is written in a purely functional style and its statefulness is hidden by the State monad

and the rand function.

How would we implement rand? Well, we could do it in a boilerplate fashion:

rand :: State RndGen Intrand = State $ \(RndGen x) -> (x, RndGen ((x*1367823 + 918237) `mod` 32768))

However, it seems more natural to equip the State monad with functions for reading and

changing state; they are called get and put and the State monad in Haskell is actually equippedwith them:

rand = get >>= \(RndGen x) ->put $ RndGen ((x*1367823 + 918237) `mod` 32768) >>=return x

This code is a bit longer but it doesn't look like a mystical lambda abstraction, instead itmanipulates state in an obvious way. It is not beautiful but somewhat later we'll see that Haskellhas syntactic sugar for this kind of code.

The get and put functions are not hacks, they are implemented rather trivially but it is

instructive to have a look at them:

get :: State s s


10/27

get = State $ \s -> (s, s)

put :: s -> State s ()put s' = State $ \s -> ((), s')

It turns out that get is 'a stateful computation that just returns the state' and put is 'a stateful

computation that returns nothing in particular and just modifies the state'

Now look at the implementation of the State monad itself:

instance Monad (State s) wherereturn a = State dontChangeStateAndReturnA

where dontChangeStateAndReturnA s = (a, s)

-- r1 :: State s a = State (s -> (a, s)) is a statefulcomputation of 'a'

-- p :: a -> State s b = a -> State (s -> (b, s)) is a statefulcomputation of 'b' parameterized by r1

(State r1) >>= p = State passStatewhere passState s = (res2, finalState)

where (res1, intermediateState) = r1 s -- Performthe first computation, compute the parameter

(State r2) = p res1 -- *Compute*the second computation, using the parameter

(res2, finalState) = r2 intermediateState -- Performthe second computation

This code can be made clearer if we remove the State constructor (however, it will becomeincorrect):

r1 >>= p = passState

where passState s = (res2, finalState)where (res1, intermediateState) = r1 s -- Perform

the first computation, compute the parameter(res2, finalState) = (p res1) intermediateState -- Compute

and perform the second computation

IO (the monad of computations with side-effects)

This section presents the IO monad in a perhaps intuitively understandable butincorrect way; the correct way is more complex than the intended complexity of this

article. However, it is also much more peculiar; you can look at it in [C.5]

IO's adjective is 'computed with side-effects', and its strategy of binding two computations is

'first perform side effects of the first one, then side effects of the second one' (remember theexample with sending query to a database and waiting for response), whereas pure computations

involved proceed lazily as usual because their order doesn't matter.

Let us take a naive approach to describing IO:


11/27

data IO a = IO a

return a = IO a(IO a) >>= f = f a -- (*) but with side effects of 'a' magically performedbefore applying 'f', see later

Remember that in the beginning of this text, we told that there's no way to restore a previousstate of the world and that the essence of sequential computations is to build a strictly linearchain of world states. The only way to construct a sound model of sequential computation is to

prohibit saving the world's state for later use and breaking the linearity. To do this, it suffices to

hide the IO constructor and thus prohibit pattern matching: let (IO s) = readLine in ....

Having done that, we now have a stunning property: there's no way out of the IO monad! There

is no way to convert IO a to just a and no way to hide the fact that a computation has side

effects. If a function uses a value of type ... -> IO ..., then the type of this function will also

be '... -> IO ...'. Thus, a function that uses side effects gets itself annotated with "warning, side

effects present" and vice versa: if a function does not return an IO value, we can be sure that it

has no side effects, this fact is enforced by the typechecker! This is one of the reasons for lowbug count in Haskell problems because it eliminates one of the most frequent causes of bugs:

indirect interactions because of side effects.

To be honest, there exists a function System.IO.Unsafe.unsafePerformIO with typeIO a -> a, but one should think twice before even considering its usage, because withthis function we also obtain all the imperative problems of hidden effects, which getmuch worse because of Haskell's lazy evaluation mechanism: now it is very hard to say

when exactly will an effectful computation be performed and whether it will be

performed at all. Usage ofunsafePerformIO can be reasonably safe only in cases whereits side-effect does not affect anything (for example, debug output) or when it is always

the same, t.i. the computation is idempotent(for example, reading a global configurationfile). But even in this cases, it's better to avoid it.

It is difficult to explain how exactly (*) is achieved, and, with this approach

(data IO a = IO a) it is completely impossible. However, this approach, together with (*) andhiding the IO data constructor, shows the most important properties of the IO monad: the fact

that pure functional computations get unchanged and that effectful computations get sequenced,and the fact that one can't get out of the IO monad and hide the effectfulness of a computation.

Some ways of implementing the IO monad include presenting it as a State monad which has thewhole world as its state (or some internal runtime state, like processor registers, I/O ports and

process memory, transformation of which leads to the computer performing observable effects),presenting an IO value as a chunk of code that, when ran, performs an effect, and presenting it as

a function from a bunch of (already performed) effects to a bunch of some more effects. This

forces sequencing of effects because the function can't proceed before its 'input' effects are

performed.

Others


12/27

Haskell's standard library defines some more monads: Reader, Writer and Cont. I

recommended to look at them yourself: Reader and Writer are conceptually simple and often

used; Cont is a complicated monad used for programming in 'continuation-passing style'. Itsdiscussion is beyond the scope of this article.

do syntax

Consider again the example with multiple-valued functions (the List monad):

Shop s = getShops() ;*Department d = getDepartments(s) ;*Order ord = getOrders(d) ;*sum += ord.getCost();

Let us rewrite it in Haskell:

let s = getShopsd = getDepartments sord = getOrders d

insum (getCost ord)

This code is of course incorrect: it doesn't typecheck. s has type [Shop], not Shop, so

getDepartments can't be applied to s, similarly for d/getOrders and ord/getCost. Code with

the ;* operator shouldn't be interpreted literally. Actually, when writing

Shop s = getShops ;*

we meant that s is a parameter used by the rest of the computation. So, this code means not 'Lets equal getShops' but instead 'Let parameter s be computed via getShops'. Results of this

computation will be used by ;* that, in case of the List monad, has type [a] -> (a -> [b]) -

> [b] and, in the particular case where a = Shop, [Shop] -> (Shop -> [b]) -> [b]. So, s

has type [Shop] but the rest of the computation depends on Shop. Taking this into account wecan rewrite the first fragment:

getShops() >>= \s ->getDepartments(s) >>= \d ->getOrders(d) >>= \ord ->sum += ord.getCost();

In haskell:

sum (getShops >>= \s ->getDepartments s >>= \d ->getOrders d >>= \ord ->return (getCost ord))

Obviously, such a kind of 'assignment' corresponds to the syntactic idiom

value >>= \variable -> . Haskell has syntactic sugar for that, called 'do syntax':


13/27

sum (do s


14/27

a parser for numbers a parser that always returns the same constant value a parser that always returns the input string a parser that always fails a parser that expects a particular string, returns () if it is found and fails otherwise

One can also implement the combinator: parallel composition of parsers: it tries to apply thefirst parser and, if it fails, applies the second one instead.

() :: Parser a -> Parser a -> Parser aa b = \s -> case (a s) of

Just v -> Just vNothing -> b s

However we can't implement sequential composition (), for example, a parser for two

numbers ("123 456" -> (123, 456)) can't be implemented efficiently in this way. That'sbecause the combinator knows nothing about the implementation of its arguments and it has to

check all possible partitions of the input string into two, call the first parser on the left part andthe second parser on the right part. This is extremely slow and, in case of more than two parsers,

complexity becomes exponential in both length of the input string and parser count.

Actually, we would like to implement sequential composition as follows: let the first parser 'eat'

as much of the input string as it can and pass the rest to the second parser. We have to take this

fact in the Parser type: type Parser a = String -> Maybe (a, String). Now sequentialcomposition is easy:

() :: Parser a -> Parser b -> Parser (a,b)a b = \s -> do

(va, s') (a s) >>= \(va, s') ->(b s') >>= \(vb, s'') ->return (va,vb)

The second variant, which does not use the Maybe monad at all: the (>>=) and return functionsare expanded:

a b = \s ->case (a s) of

Nothing -> NothingJust (va,s') ->


15/27

case (b s') ofNothing -> NothingJust (vb, s'') -> Just ((va,vb), s'')

So, with a type type Parser a = String -> Maybe (a, String) it is easy to implementsequential parser composition and also (left as an exercise for the reader) parallel composition.

Let us now consider a couple more important combinators: start withoneOrMore :: Parser a -> Parser [a]. It will be useful for parsing strings like this:

Napoleon 1769|Emperor of FranceHenry the 8 th 1491|King of England, had 6 wives

A recognizing parser (one that just checks correctness but doesn't return anything) can be

implemented like

kingInfo = (oneOrMore anyChar) space (oneOrMore digit) char '|' (oneOrMore anyChar)`

Let us invent an implementation for oneOrMore:

A greedy variant: oneOrMore p applies p while it is possible. Obviously this won't workwith this example because the very first oneOrMore anyChar will eat the whole string.

A non-deterministic variant: (oneOrMore p1) p2 should find a way to split theinput string into two, where the first can be parsed by several applications ofp1 and the

second can be parsed by p2.

Only the second variant suits our needs. The easiest (or the only) way to implement it is to

transition from the Maybe type to lists. Philip Wadler formulated this as 'Replace failure by a list

of successes' in his work [C.1]. Now the parser will have type type Parser a = String -> [(a, String)]. Implementations of various operations will become even simpler and the

implementation of with the do-syntax won't change at all!

Now let us implement a better kingInfo that will not only check syntax but also return a King:

data King = King {name::String, birth::Int, info::String}:

kingInfo = \s -> do(name, s1)


16/27

newtype Parser a = Parser {runParser :: String -> [(a,String)]}instance Monad Parser where

return a = Parser $ \s -> [(a,s)]pa >>= pb = Parser $ \s -> [(b,s'') | (a,s')


17/27

The goal is to estimate the collision probability between two given types of drivers and average it

over all types.

Let us forget for a moment about the statistical nature of the problem. Then the solution becomes

a trivial deterministic procedure:

bool collide(Driver first, Driver second, Light light) {Action firstAction = first.actionOn(light);Action secondAction = second.actionOn(light);bool result = doActionsLeadToCollision(firstAction, secondAction);return result;

}

The only thing we need to do in order to transform this simple procedure into a computation of

collision probability is to replace every type T with 'a distribution over T': for example, 'bool'

will become 'distribution over bool', 'Driver' will become 'distribution over Driver', etc. Then,if the language provides us with arithmetic over such types, the problem will be solved.

Two questions arise: 1) What is a 'distribution over T' and how to represent it in our program and2) How to implement arithmetic over such values.

The answer to the first question depends on how we shall use these values. It makes sense to

assume that we'll generate values and compute statistics over them. For that we'll need:

Domain of a value (its carrier, or support) Probabilities of each value appearing (in the discrete case) or the distribution function (in

the continuous case)

A procedure of generating a value with such a distribution

A way to compute the dispersion or mean of the value, or, in a more general case, thedispersion or expectation of some function of the value

But dispersion can be expressed via the expectation:Dx = M[(x-Mx)2], also probabilities ofindividual values in the finite case can be expressed as P{x=x0} = M[if(x==x0) then 1 else 0]. In

the infinite case we don't actually need the distribution function. Really, of what use can it be?

One usually integrates it over an interval to obtain the probability of a value belonging to theinterval, but this integral can also be expressed via the mean: 'P{x 0


18/27

Let us consider only discrete distributions with a finite carrier, because for the general case of

infinite distributions we'll be unable to implement the expectation function precisely.

module Dist where

import System.Random

data Dist a = Dist {support :: [a],gen :: StdGen -> (a, StdGen),expect :: (a -> Float) -> Float

}

Now, what about the second question?

Let us compute the sum of two fuzzy values: an integer a and an integer b. The distribution of

their sum can be computed by randomization ofa+b over theparametera (randomization of avalue over a parameter is averaging or obtaining of its distribution with a random value of the

parameter). Addition can be generalized to an arbitrary function; the number of functionarguments can also be generalized to an arbitrary large value.

So, the only operation needed for composing distributions from parts, t.i., for computing

characteristics of expressions of the form f(a1, a2, a3, ...), is randomization.

The randomization operator has type: (distribution of parameter p) -> (distribution depending on

p) -> (randomized distribution), or, if we replace 'distribution over a' with 'Dist a', Dist a -

> (a -> Dist b) -> Dist b. Whoa! This is precisely (>>=) :: m a -> (a -> m b) -

> m b applied to the Dist monad!

That's true: statistical distributions form a monad that has randomization as its binding operator.Obviously, return corresponds to the delta distribution, t.i., the distribution where the randomvalue can have only one particular value with a probability of 1:

instance Monad Dist wherereturn a = Dist {support = [a], gen = \g -> (a, g), expect = \f -> f a}da >>= fdb = Dist {

support = concat [support (fdb a) | a let (a, g') = (gen da g) in (gen (fdb a) g'),expect = \f -> expect da (\a -> expect (fdb a) f)

}

This is how these operations work: return a returns a distribution whose carrier is a singletonset ofa, the generator always generates a and the expectation of function f equals f(a).

(>>=) :: Dist a -> (a -> Dist b) -> Dist b : here da is distribution of the parameter and

fdb is a function depending on the parameter and returning a distribution db. During

randomization, the following occurs:

The carrier is assembled as a union of carriers of 'db' distribution across all possiblevalues of the parameter, t.i. across values of 'support da'.


19/27

The generator generates a parameter, computes the 'db' distribution and uses it to generatethe result

Expectation uses the randomization formula:M,,a,b,,[f(a,b)] = Ma[Mb[f(a,b)]]Now we can define a couple of easy distributions and proceed to solving the original problem.

For illustrative purposes a single distribution, freqs, will suffice, which is determined by pairs(probability, value), which, in turn, can be expressed with the choose combinator: 'a

mixture of two distributions in proportion p : 1-p'.

choose p d1 d2 = Dist {support = s, gen = g, expect = e}where

s = support d1 ++ support d2g sg = let (x,sg') = randomR (0.0,1.0) sg in (if x < p then (gen d1

sg') else (gen d2 sg'))e f = p * expect d1 f + (1-p) * expect d2 f

prob p = choose p (return True) (return False)

freqs [] = error "Empty cases list"freqs [(_,a)] = return afreqs ((w1,a1):as) = choose w1 (return a1) (freqs $ map (\(w,a) -> (w/(1-w1),a)) as)

mean d = expect d iddisp d = expect d (\x -> (x-m)^2) where m = mean dprobability f d = expect d (\x -> if f x then 1 else 0)

This suffices for the original problem.

data Light = Red | Green | Yellowdata Driver = Cautious | Normal | Aggressive

data Action = Drive | DontDrive

drive p = choose p (return Drive) (return DontDrive)

_ àctOn` Green = drive 1.0Cautious àctOn` Yellow = drive 0.1Normal àctOn` Yellow = drive 0.2Aggressive àctOn` Yellow = drive 0.9Cautious àctOn` Red = drive 0.0Normal àctOn` Red = drive 0.1Aggressive àctOn` Red = drive 0.3

Drive `collision` Drive = prob 0.3_ `collision` _ = prob 0.0

driver = freqs [(0.2, Cautious), (0.6, Normal), (0.2, Aggressive)]

simulate d1 d2 light = doa1


20/27

simulateOverDrivers light = dod1 ... m .... It is notobvious that such operations do exist at all and that they can be useful: all the monads we spoke

about were completely different and served different purposes. However, they do.

Simplest monadic combinators

The first situation where we used different monads in the same way is the do-syntax. It's hard tocall it an 'operation' but it nonetheless works the same way for all monads and is equally useful

for all of them.

Also, our programs often contained code fragments like this one:

do a m c

Why not write simply return (f foo bar)? Alas, this won't typecheck. But this pattern occursextremely frequently and it makes sense to abstract it.

liftM :: (a -> b) -> (m a -> m b)

liftM f ma = do a b -> c) -> m a -> m b -> m cliftM2 f ma mb = do a


21/27

These functions liftM, liftM2 etc. are implemented in the standard library, in Control.Monad.Their somewhat obscure names mean 'lift function into monad, so that it operates not on simple

values but on monadic values but does the same thing as before'.

Now we can write things like this: (this code fragment reads a string from the keyboard and

returns it converted to upper case)

readAndUpper :: IO StringreadAndUpper = liftM (map toUpper) getLine

Or like this (in the Dist monad):

twoDrivers :: Dist (Driver,Driver)twoDrivers = liftM2 (,) driver driver

Now let us imagine that we are going to print a list ofvalues: perform putStrLn for each value.

map putStrLn values

Alas, this is not a solution. Look at the types:

values :: [String]putStrLn :: String -> IO ()map putStrLn values :: [IO ()]

So, we get a list of values of type IO (), that is, a list of actions. But Haskell is a lazy language

and elements of the list are not forced and remain unevaluated calls ofputStrLn. Now, to

perform this list of actions we have to perform each of them in turn.

doIOs [] = return []doIOs (a:as) = do a ; doIOs as

doIOs (map putStrLn values)

If we start the interpreter and look at the type of doIOs, we see that it is

doIOs: (Monad m) => [m a] -> m [a]. It has no mention of the IO monad and is monad-

polymorphic! Analogously, the IO word should be removed from the name of the function; weobtain the standard function

sequence :: Monad m => [m a] -> m [a]

and a similar function

mapM :: Monad m => (a -> m b) -> [a] -> m [b]mapM f as = sequence (map f as)

The standard library defines some more counterparts of list functions and control structures formonads:


22/27

when :: Monad m => Bool -> m () -> m ()when b m = if b then m else return ()

replicateM :: Monad m => Int -> m a -> m [a]foldM :: Monad m => (a -> b -> m a) -> a -> [b] -> m a

and some more. For the case where a computation is important solely for its side-effects but notthe return value these operations have counterparts with an underscore:

mapM_ :: Monad m => (a -> m b) -> [a] -> m ()sequence_ :: Monad m => [m a] -> m ()

etc.

Why do we need this genericity?

It is not yet obvious what applications there are for the combinators described above (except for

liftM) apart from structuring actions in the IO monad. Let's look at some examples:

Apart from IO, the monad of 'computations with mutable world' they may be used in thesimilar State monad, the monad of 'computations with mutable state' for the samepurposes. For example, to collect distinct elements of a list we can traverse it with

mapM insert, where insert has type (Ord a) => State (BinaryTree a) () and

BinaryTree is the type of balanced trees:

collectDistinct :: (Ord a) => [a] -> BinaryTree acollectDistinct as = s'

where (s', _) = runState (mapM_ insert) emptyBinaryTreeinsert a = do t readNumber)

The functions that end in an underscore and return m () deserve special attention. They seem tonot perform any computations (if they would, these computations would have a more senseful

result than (), right?); these functions should be used only with monads that have effects. These

effects include not only side-effects of the IO monad; presence of effects should be interpreted as


23/27

sensibility of the >> operator and ofdo without action) -> Tree -

> result. But in a typical SAX implementation the handler doesn't return anything and all thehandling and computation work falls on the shoulders of the handler. We'll follow this way:

walk :: (NodeEvent -> actionWithoutResult) -> Tree -> noResult.

What can be used as an action? Of course, a pure function won't fit because its result won't be

included in the answer anyway, and will be dismissed. It makes sense to use a monadic action

with an effectful monad, like IO, State etc.


24/27

Thus, we get walk :: Monad m => (NodeEvent -> m ()) -> Tree -> m (), which

traverses the tree inside an effectful monad m, performing a monadic action for each entry or exit

event. Let us proceed to the implementation:

walk :: (Monad m) => (NodeEvent -> m a) -> Tree -> m ()walk f (Tree n ts) = do f (Enter n)

mapM_ (walk f) tsf (Leave n)return ()

Now let us use this function to implement indented pretty-printing the tree to XML. Printingindented text has two aspects:

There are side effects, t.i. we'll have to use the IO monad There is a mutable state passed around, namely the current indent level

The first aspect forces us to use the IO monad while the second one makes us think og State, but

neither monad fits by its own. Let us implement our own simple monad, the monad of 'indentedoutput':

newtype IndentIO a = IndentIO { runWithIndent :: Int -> IO (Int, a) }instance Monad IndentIO where

return a = IndentIO $ \i -> return (i, a)(IndentIO r) >>= f = IndentIO r'

where r' i = do (i', a) do action ; return (i,()) -- Performs anaction without changing the indentindentMore = IndentIO $ \i -> return (i+4, ()) -- Increasesindent without performing any actionindentLess = IndentIO $ \i -> return (i-4, ()) -- Decreasesindent without performing any actiongetIndent = IndentIO $ \i -> return (i, i) -- Returns thecurrent indent without performing any action

printIndent s = do i


25/27

-- Generates a string of the form showLeave (Node tag _) = ""

printTree = walk pwhere p (Enter node) = do printIndent (showEnter node)

indentMorep (Leave node) = do indentLess

printIndent (showLeave node)

This trick, namely combining several monads in one, is needed rather often, and it's inconvenient

to write a new monad for each such case. This problem is addressed by monad transformers but

this topic is beyond the scope of this text. However, the example with IndentIO is a goodillustration of how monad transformers work and actually is a special case of them, specialized

for two particular monads. You can get more familiar with monad transformers in [B.7] and[B.11].

The walk function and State monad can also be used to count the number of nodes satisfying apredicate:

countNodes p tree = execState (walk (\e -> when (counts e) (modify (1+)))tree) 0

where counts (Enter n) = p ncounts _ = False

In this code the handler is represented by a function that increases the state variable by 1 if it

enters a node satisfying the predicate. The result of traversal is a value of type State Int (),

t.i., it is not yet the answer but just a function ready to compute it, if passed an initial value. This

is done by calling execState with a second argument of0 (start counting from zero).

ConclusionWe've looked at several 'facets' of monads: we inferred the conception itself as a generalization

of some imperative idioms, we saw how some non-trivial practical problems are formulated andsolved in terms of monads and we also saw the usefulness of the monad abstraction and which

operations may exploit this abstraction.

As we progressed, we saw that monads are most commonly used for two different purposes:

structuring the control flow and describing imperative effectful computations (IO, State,

IndentIO), and structuring data flow (Maybe, List, Dist). Some monads belong to both classes

(Parser). Monad-polymorphic operations are almost always used with monads of the first class.

We've also touched the topic of monad transformers, composite monads that combine propertiesof some simpler ones, but we didn't study them thoroughly.

Also, we've missed another interesting topic: the theoretic grounding for the conception ofmonads and their formulation in terms of category theory. This topis is especially interesting

because monads were initially discovered in category theory and only later Philip Wadler applied

them to programming.


26/27

I hope that the text has succeeded in its intent to form an intuitive understanding of how monads

work and a 'feeling' for them, and that you got interested in studying topics omitted in this article.

Acknowledgements

I would like to thank Denis Moskvin, Ivan Veselov and Oleg Tsarev for valuable comments andoffers to early drafts of this text and to Julia Astakhova for those to later versions, for

countenance and patience. People from the #haskell IRC-channel also gave valuable commentson monad-polymorphic functions; namely, dons, bd_ and Saizan. Ivan Tarasov, Maxim Taldykin

and Artem Shalkhakov helped find some errors in the published version. quicksilver, ski, ddarius

and roconnor pointed me to the fact that my section about the IO monad was incorrect,

completely changed my understanding of how it works and helped develop a more correctversion of the section about it.

References

A. Basics of Haskell

1.http://darcs.haskell.org/yaht/yaht.pdf- Yet Another Haskell Tutorial, one of the easiest yet

largest tutorials

2.http://www.haskell.org/tutorial/- A gentle introduction to Haskell

3.http://www.haskell.org/haskellwiki/Tutorials - A list of Haskell tutorials

B. Other monad tutorials

1.http://citeseer.ist.psu.edu/wadler95monads.html - Philip Wadler's classical article on monads

that fired up interest to them

2.http://darcs.haskell.org/yaht/yaht.pdf- Contains a chapter on monads, mostly ones similar toState

3.http://www.haskell.org/haskellwiki/Monad - An article on haskellwiki

4.http://www.haskell.org/haskellwiki/Tutorials#Using_monads - A list of monad tutorials

5.http://www.haskell.org/haskellwiki/Monad_tutorials_timeline - A timeline of monad tutorials

with good annotations

6.http://www.haskell.org/all_about_monads/html/index.html - A big and thorough tutorial

describing all the standard monads and monad transformers

7.http://book.realworldhaskell.org/beta/monads.html ,http://book.realworldhaskell.org/beta/monadcase.html ,
http://darcs.haskell.org/yaht/yaht.pdfhttp://darcs.haskell.org/yaht/yaht.pdfhttp://darcs.haskell.org/yaht/yaht.pdfhttp://www.haskell.org/tutorial/http://www.haskell.org/tutorial/http://www.haskell.org/tutorial/http://www.haskell.org/haskellwiki/Tutorialshttp://www.haskell.org/haskellwiki/Tutorialshttp://www.haskell.org/haskellwiki/Tutorialshttp://citeseer.ist.psu.edu/wadler95monads.htmlhttp://citeseer.ist.psu.edu/wadler95monads.htmlhttp://citeseer.ist.psu.edu/wadler95monads.htmlhttp://darcs.haskell.org/yaht/yaht.pdfhttp://darcs.haskell.org/yaht/yaht.pdfhttp://darcs.haskell.org/yaht/yaht.pdfhttp://www.haskell.org/haskellwiki/Monadhttp://www.haskell.org/haskellwiki/Monadhttp://www.haskell.org/haskellwiki/Monadhttp://www.haskell.org/haskellwiki/Tutorials#Using_monadshttp://www.haskell.org/haskellwiki/Tutorials#Using_monadshttp://www.haskell.org/haskellwiki/Tutorials#Using_monadshttp://www.haskell.org/haskellwiki/Monad_tutorials_timelinehttp://www.haskell.org/haskellwiki/Monad_tutorials_timelinehttp://www.haskell.org/haskellwiki/Monad_tutorials_timelinehttp://www.haskell.org/all_about_monads/html/index.htmlhttp://www.haskell.org/all_about_monads/html/index.htmlhttp://www.haskell.org/all_about_monads/html/index.htmlhttp://book.realworldhaskell.org/beta/monads.htmlhttp://book.realworldhaskell.org/beta/monads.htmlhttp://book.realworldhaskell.org/beta/monads.htmlhttp://book.realworldhaskell.org/beta/monadcase.htmlhttp://book.realworldhaskell.org/beta/monadcase.htmlhttp://book.realworldhaskell.org/beta/monadcase.htmlhttp://book.realworldhaskell.org/beta/monads.htmlhttp://www.haskell.org/all_about_monads/html/index.htmlhttp://www.haskell.org/haskellwiki/Monad_tutorials_timelinehttp://www.haskell.org/haskellwiki/Tutorials#Using_monadshttp://www.haskell.org/haskellwiki/Monadhttp://darcs.haskell.org/yaht/yaht.pdfhttp://citeseer.ist.psu.edu/wadler95monads.htmlhttp://www.haskell.org/haskellwiki/Tutorialshttp://www.haskell.org/tutorial/http://darcs.haskell.org/yaht/yaht.pdf


27/27

http://book.realworldhaskell.org/beta/monadtrans.html - Chapters of the 'Real World Haskell'

book about monads. Very detailed and with practical examples. Mostly about Maybe, State and

IO.

8.http://www.haskell.org/haskellwiki/Monads_as_containers

9.http://research.microsoft.com/~simonpj/papers/marktoberdorf/- Simon Peython Jones' article

'Tackling The Awkward Squad', brilliantly tells about the IO monad

10.http://members.chello.nl/hjgtuyl/tourdemonad.html - A thorough overview of everything

monad-related in Haskell's standard library

11.http://spbhug.folding-maps.org/wiki/MonadTransformers - Mikhail Mitrofanov's

presentation about monad transformers (in Russian)

C. Scientific articles on monads

1.http://books.google.com/books?hl=en&lr=&id=AiMwYZs-TGkC&oi=fnd&pg=PA113&ots=prga1Pri15&sig=3Z5qGXfwK1AWI6RKZ5Jc2ubyeTg - Philip

Wadler, 'How to replace failure by a list of successes'

2.http://okmij.org/ftp/Computation/monads.html - Oleg Kiselyov's articles on monads, including

the monad of statistical experiments (Monte-Carlo method) and the monad of logical inference

3.http://www.cs.nott.ac.uk/~gmh//monparsing.ps - Graham Hutton, Erik Meijer - Monadic

Parser Combinators

4.http://www.randomhacks.net/darcs/probability-monads/probability-monads.pdf- Eric Kidd,'Build your own probability monads' - an article with several interesting variations on the topic of

probability monads

5.http://luqui.org/blog/archives/2008/03/29/io-monad-the-continuation-presentation/ A blog postby Luke Palmer that explains how the IO monad might be implemented with continuations

D. Other

1.http://www.amazon.com/Expert-F-Experts-Voice-Net/dp/1590598504 - The book 'Expert F#',

also a brilliant introduction to functional programming in general

MonadsEn (last edited 2011-05-20 20:40:17 bymax ulidtko)
http://book.realworldhaskell.org/beta/monadtrans.htmlhttp://book.realworldhaskell.org/beta/monadtrans.htmlhttp://www.haskell.org/haskellwiki/Monads_as_containershttp://www.haskell.org/haskellwiki/Monads_as_containershttp://www.haskell.org/haskellwiki/Monads_as_containershttp://research.microsoft.com/~simonpj/papers/marktoberdorf/http://research.microsoft.com/~simonpj/papers/marktoberdorf/http://research.microsoft.com/~simonpj/papers/marktoberdorf/http://members.chello.nl/hjgtuyl/tourdemonad.htmlhttp://members.chello.nl/hjgtuyl/tourdemonad.htmlhttp://members.chello.nl/hjgtuyl/tourdemonad.htmlhttp://spbhug.folding-maps.org/wiki/MonadTransformershttp://spbhug.folding-maps.org/wiki/MonadTransformershttp://spbhug.folding-maps.org/wiki/MonadTransformershttp://books.google.com/books?hl=en&lr=&id=AiMwYZs-TGkC&oi=fnd&pg=PA113&ots=prga1Pri15&sig=3Z5qGXfwK1AWI6RKZ5Jc2ubyeTghttp://books.google.com/books?hl=en&lr=&id=AiMwYZs-TGkC&oi=fnd&pg=PA113&ots=prga1Pri15&sig=3Z5qGXfwK1AWI6RKZ5Jc2ubyeTghttp://books.google.com/books?hl=en&lr=&id=AiMwYZs-TGkC&oi=fnd&pg=PA113&ots=prga1Pri15&sig=3Z5qGXfwK1AWI6RKZ5Jc2ubyeTghttp://books.google.com/books?hl=en&lr=&id=AiMwYZs-TGkC&oi=fnd&pg=PA113&ots=prga1Pri15&sig=3Z5qGXfwK1AWI6RKZ5Jc2ubyeTghttp://okmij.org/ftp/Computation/monads.htmlhttp://okmij.org/ftp/Computation/monads.htmlhttp://okmij.org/ftp/Computation/monads.htmlhttp://www.cs.nott.ac.uk/~gmh/monparsing.pshttp://www.cs.nott.ac.uk/~gmh/monparsing.pshttp://www.cs.nott.ac.uk/~gmh/monparsing.pshttp://www.randomhacks.net/darcs/probability-monads/probability-monads.pdfhttp://www.randomhacks.net/darcs/probability-monads/probability-monads.pdfhttp://www.randomhacks.net/darcs/probability-monads/probability-monads.pdfhttp://luqui.org/blog/archives/2008/03/29/io-monad-the-continuation-presentation/http://luqui.org/blog/archives/2008/03/29/io-monad-the-continuation-presentation/http://luqui.org/blog/archives/2008/03/29/io-monad-the-continuation-presentation/http://www.amazon.com/Expert-F-Experts-Voice-Net/dp/1590598504http://www.amazon.com/Expert-F-Experts-Voice-Net/dp/1590598504http://www.amazon.com/Expert-F-Experts-Voice-Net/dp/1590598504http://spbhug.folding-maps.org/wiki/max_ulidtkohttp://spbhug.folding-maps.org/wiki/max_ulidtkohttp://spbhug.folding-maps.org/wiki/max_ulidtkohttp://spbhug.folding-maps.org/wiki/max_ulidtkohttp://www.amazon.com/Expert-F-Experts-Voice-Net/dp/1590598504http://luqui.org/blog/archives/2008/03/29/io-monad-the-continuation-presentation/http://www.randomhacks.net/darcs/probability-monads/probability-monads.pdfhttp://www.cs.nott.ac.uk/~gmh/monparsing.pshttp://okmij.org/ftp/Computation/monads.htmlhttp://books.google.com/books?hl=en&lr=&id=AiMwYZs-TGkC&oi=fnd&pg=PA113&ots=prga1Pri15&sig=3Z5qGXfwK1AWI6RKZ5Jc2ubyeTghttp://books.google.com/books?hl=en&lr=&id=AiMwYZs-TGkC&oi=fnd&pg=PA113&ots=prga1Pri15&sig=3Z5qGXfwK1AWI6RKZ5Jc2ubyeTghttp://spbhug.folding-maps.org/wiki/MonadTransformershttp://members.chello.nl/hjgtuyl/tourdemonad.htmlhttp://research.microsoft.com/~simonpj/papers/marktoberdorf/http://www.haskell.org/haskellwiki/Monads_as_containershttp://book.realworldhaskell.org/beta/monadtrans.html

Haskell Monad Tutorial

Documents

Transcript of Haskell Monad Tutorial