“… she would carve on the tree Rose is a Rose is a Rose is a Rose is a Rose until it went all the way around.” – The World is Round, Gertrude Stein

Many a blog and tutorial alike will tell you that a monad is something other than what it is. Perhaps it is a “box”, or a “burrito”. Sometimes it is a “computation” (aren’t all Haskell terms computations?). Certainly there is some merit to this style of explanation-by-analogy. However, such presentations are necessarily limited to approximations of what a monad really is. Analogies may impart some initial intuition, but sooner or later end up holding our mental model back from reaching the full generality of what a monad really is. And it isn’t as complicated as you might expect.

## Type Families

A monad is a special kind of type family, along with a couple of important functions. Let’s clarify some terminology. By a type family, I mean anything with kind `* -> *`. A simple function to and from types. For example, let `t :: * -> *` and `x :: *`. Then `t` is a type family and `x` is a type. `t x :: *` is also a type, and an “instance” of the type family `t`. We may also say `x` is the parameterized type in `t x`.

Some examples of type families include `[]`, `Maybe`, `Tree`, `Map k`, and `Either a` (where `k` and `a` are arbitrary types). Each of these families may generally be understood as describing some structured collection on their parameterized type. But we can also concoct some type families which do not admit a notion of structure. For instance, `((->) a)` describes the family of functions with domain `a`.

## Functors

Rather than jumping right to monads, let’s work our way there incrementally. A monad is an applicative functor is a functor. We’ll start with the prototypical functor, the `[]` type family. There exists a ubiquitous function for lists, called `map`:

``````map :: (a -> b) -> [a] -> [b]
map f (x : xs) = f x :: map f xs
map _ [] = []
``````

`map` applies a function to each element of a list. The result is a new list with the same structure, but whose elements have been “mapped” through the function.

A functor is a generalization of `map` to other type families. Let’s start by generalizing the type signature of `map`. Recall that we may opt out of the list-specific type syntax, writing instead in the regular prefix application style. Written this way, the type is `map :: (a -> b) -> [] a -> [] b`. Now to generalize the type to arbitrary type family `t :: * -> *`, we would write `map :: (a -> b) -> t a -> t b`. This brings us precisely to our definition of a functor:

``````class Functor f where
fmap :: (a -> b) -> f a -> f b
``````

The mapping function for arbitrary functors is called `fmap`, for functor map. The only reason that it isn’t called `map` is because that is taken by our list-specific function2.

Any functor instance is expected to obey certain laws so that it aligns with our intuitive understanding of what a map should do.

1. `fmap id x` = `x` 3
2. `fmap (f . g)` = `(fmap f) . (fmap g)`

In English, the first law says that mapping the identity function over an object should leave it unchanged. The second law says that sequential map operation may be combined in to a single map.

If these laws seem confusing, try to consider concrete examples. Let’s say we are mapping `(+ 1)` over a list, and then we map `(* 2)` over that. Intuitively, wouldn’t this be equivalent to a single mapping of `\x -> (x + 1) * 2`, as enforced by the second law?

Together, these formal laws enforce the informal intuition that a functoral map should preserve the underlying “structure” of the functoral object.

Before moving on, I’d suggest spending some time with the idea of functors. Make sure that you are comfortable with them. All of the type families I mentioned earlier are functors; try to imagine how their respective `fmap` definitions would behave and why they would be useful.

## Applicative Functors

I won’t introduce applicative functors in full, only the part which is directly relevant to our eventual monad definition. An applicative functor is a functor, which adds two additional functions. The one we care about is called `pure`:

``````class Functor f => Applicative f where
pure :: a -> f a
...
``````

The idea behind `pure` is that it “lifts” a value into the most minimal structure. Some examples, given `x :: a`:

1. `pure x :: [a]` = `[x]`
2. `pure x :: Maybe a` = `Just x`
3. `pure x :: Tree a` = `Node x []`

Formally, we expect the following law to hold:

1. `fmap f . pure` = `pure . f`

Again, when such laws are unclear, I encourage you to consider concrete examples. What does this mean for `[]`? How about for `Maybe`?

Finally, we come to the definition of a monad. A Monad is an applicative functor which adds a function called `join`.

``````class Applicative m => Monad m where
join :: m (m a) -> m a
``````

What `join` does is it flattens two layers of “structure” down to one. Let’s again consider some examples with respect to concrete type families.

When we specialize `join` to the monad `[]`, we get `join :: [[a]] -> [a]`. The most “obvious” function of this type is the one which concatenates all the lists together, and that is indeed the correct definition of this `join`.

The specialization of `join` to `Maybe` gets the type `join :: Maybe (Maybe a) -> Maybe a`. This behaves such that `join (Just (Just x))` = `x`, but the `join` of anything else results in `Nothing`.

Our expectation is that `join` doesn’t do anything silly or throw away values. For instance `join` on lists could instead always return the empty list, and that would have the right type. Similarly, `join` on `Maybe` could always return `Nothing`. But both of these definitions would go against the spirit of what we want `join` to do.

Formally, we expect the following behavior4:

1. `join . pure` = `join . fmap pure` = `id`
2. `join . join` = `join . fmap join`
3. `join . fmap (fmap f)` = `fmap f . join`

That’s it! We’ve defined a monad! I bet that wasn’t as complicated as you were expecting.

## Bind

We’ve defined a monad, but we aren’t done yet. Given our definition of a monad, we may define an important infix operator, `>>=`, pronounced “bind”:

``````(>>=) :: Monad m => m a -> (a -> m b) -> m b
m >>= f = join (fmap f m)
``````

We can see that `>>=` is like `fmap`, except the codomain of the function we are mapping is also monadic. Rather than accumulating nested monads, we `join` the result to return to our original level of structure.

You may wonder why this function is useful. Let’s consider an example. Say we have a partial division function `(/) :: Int -> Int -> Maybe Int` which will return `Nothing` when the divisor is zero. Let `a, b, c, d :: Int`. What if we want to divide `a` by `b`, then the result of that by `c`, and finally divide by `d`? Representing this as a series of maps and joins is going to be quite tedious, but it is quite simple to write it in terms of binds: `a/b >>= (/c) >>= (/d)`.

We often wish to chain together multiple monadic operations, such as in the division example above. `>>=` is incredibly useful in its ability to perform such chaining without accumulating nested monad layers.

The monad definition given in the earlier section is valid and straightforward. However it is not the definition we’ll find in Haskell. Rather than defining `join` and then deriving `>>=`, Haskell has you define `>>=` and derives `join`5.

We’ve already seen how to derive `>>=` from `join` and `fmap`. If we instead consider `>>=` as primitive, how do we derive `join`?

``````join = (>>= id)
``````

When stated in terms of `>>=`, a monad is expected to obey the following laws:

1. `pure a >>= f` = `f a`
2. `m >>= pure` = `m`
3. `m >>= (\x -> n x >>= o)` = `(m >>= n) >>= o`

Finally, out of convenience, we introduce the degenerative bind:

``````(>>) :: Monad m => m a -> m b -> m b
m >> n = m >> const n
``````

Looking at the type, it might seem that `>>` ignores the left value, but this is not so. The structure will inform the overall result. For instance, `[1..4] >> ` evaluates to `[3, 3, 3, 3]` (each element of the left list is replaced with ``, and then the sublists are concatenated).

## Notation

Many of the functions I’ve discussed in this post have optional alternative names, notations, and syntaxes.

`fmap` has an infix alias, `<\$>`. So `fmap f x` = `f <\$> x`. This infix operator is more common in practice.

Monads define their own function called `return` which is often used instead of `pure`.

This is purely6 a historical artifact, and it is expected that `return` = `pure`.

Finally, the (in?)famous `do` notation is used to greatly ease the strain of using `>>=`. The notation `do {a <- b ; ...}` is syntactic sugar which corresponds to the expression `b >>= \a -> ...`. Similarly, `do {b ; ...}` desugars to `b >> ...`.

## Conclusion

We have seen that a monad is just a special kind of functor, and that `>>=` allows us to chain together monadic actions without accumulating multiple layers of nested monadic structure. The last piece of the puzzle is an appreciation of monads as a consistently useful abstraction. I think this is where many struggle. The definitions themselves are not terribly complex. But why are monadic interfaces so convenient for dealing with a variety of issues, from IO to error handling? There is not a simple answer to this, and I think one may only come to appreciate the usefulness of monads through real experience using them.

## Footnotes

1. When I talk about a monad in this post, I mean a monad in the Haskell and functional programming sense, as opposed to the category-theoretic definition. The same applies for our discussion of functors. ↩︎

2. Since `map` is just the specialization of `fmap` to the `[]` functor, it is not necessary to have a distinct `map` function. The reason we have one is because the early Haskell developers thought the type of `map` would be more beginner-friendly. ↩︎

3. The equivalent statement `fmap id` = `id` is simpler. However, it may be confusing in that the first instance of `id` takes the type `a -> a`, while the second takes the type `f a -> f a`, where `f` is the functor in question. ↩︎

4. This version of the monad laws was taken from a stack overflow post↩︎

5. Presumably, Haskell’s monads are defined in terms of `>>=` because this function is much more widely used than `join`. I maintain that the definition in terms of `join` is still superior, because it is less of an obligation. `join` is weaker than `>>=`, and the definition of `>>=` will necessarily have overlap with that of the previously defined `fmap`↩︎

6. No pun intended. ↩︎