Chapters 3.5 and 3.6

Igal Tabachnik 2017-09-25 22:34:57 +03:00
parent 1c61b7fa20
commit 4aed6e6df3
4 changed files with 257 additions and 246 deletions

ctfp.pdf (binary file, not shown)

\lettrine[lhang=0.17]{N}{ow that we know} what the monad is for --- it lets us compose
embellished functions --- the really interesting question is why
embellished functions are so important in functional programming. We've
already seen one example, the \code{Writer} monad, where embellishment
identify the embellishment that applies to each problem in turn.
We modify the return type of every function that may not terminate by
turning it into a ``lifted'' type --- a type that contains all values of
the original type plus the special ``bottom'' value \code{\ensuremath{\bot}}. For
instance, the \code{Bool} type, as a set, would contain two elements:
\code{True} and \code{False}. The lifted \code{Bool} contains
three elements. Functions that return the lifted \code{Bool} may
containers --- \code{concat} concatenates a list of lists into a
single list. \code{return} creates a singleton list:
\begin{verbatim}
instance Monad [] where
join = concat
return x = [x]
\end{verbatim}
The bind operator for the list monad is given by the general formula:
\code{fmap} followed by \code{join}, which, in this case, gives:
\begin{verbatim}
as >>= k = concat (fmap k as)
\end{verbatim}
Here, the function \code{k}, which itself produces a list, is applied
to every element of the list \code{as}. The result is a list of lists,
which is flattened using \code{concat}.
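As a quick illustration, bind and the \code{concat}-after-\code{fmap} formula agree (the \code{duplicate} arrow is made up for this sketch):

```haskell
-- A made-up Kleisli arrow for the list monad
duplicate :: Int -> [Int]
duplicate x = [x, x]

main :: IO ()
main = do
  print ([1, 2, 3] >>= duplicate)           -- [1,1,2,2,3,3]
  print (concat (fmap duplicate [1, 2, 3])) -- the same list
```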
My favorite example is the program that generates Pythagorean triples
--- triples of positive integers that can form sides of right triangles.
\begin{verbatim}
triples = do
z <- [1..]
x <- [1..z]
y <- [x..z]
guard (x^2 + y^2 == z^2)
return (x, y, z)
\end{verbatim}
The first line tells us that \code{z} gets an element from an infinite
list of positive numbers \code{{[}1..{]}}. Then \code{x} gets an
element from the (finite) list \code{{[}1..z{]}} of numbers between 1
and \code{z}. Finally \code{y} gets an element from the list of
numbers between \code{x} and \code{z}. We have three numbers
\code{1 <= x <= y <= z} at our
disposal. The function \code{guard} takes a \code{Bool} expression
and returns a list of units:
\begin{verbatim}
guard :: Bool -> [()]
guard True = [()]
guard False = []
\end{verbatim}
This function (which is a member of a larger class called
\code{MonadPlus}) is used here to filter out non-Pythagorean triples.
Indeed, if you look at the implementation of bind (or the related
operator \code{>>}), you'll notice that,
when given an empty list, it produces an empty list. On the other hand,
when given a non-empty list (here, the singleton list containing unit
\code{{[}(){]}}), bind will call the continuation, here
\code{return (x, y, z)}, which produces a singleton list with a
verified Pythagorean triple. All those singleton lists will be
concatenated by the enclosing binds to produce the final (infinite)
result. Of course, the caller of \code{triples} will never be able to
been dramatically simplified with the help of the list monad and the
simplify this code even further using list comprehension:
\begin{verbatim}
triples = [(x, y, z) | z <- [1..]
, x <- [1..z]
, y <- [x..z]
, x^2 + y^2 == z^2]
\end{verbatim}
This is just further syntactic sugar for the list monad (strictly
speaking, \code{MonadPlus}).
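As a sanity check, the enumeration order (increasing \code{z}, then \code{x}, then \code{y}) determines the first few results; a runnable sketch:

```haskell
import Control.Monad (guard)

-- Pythagorean triples via the list monad, as in the text
triples :: [(Int, Int, Int)]
triples = do
  z <- [1 ..]
  x <- [1 .. z]
  y <- [x .. z]
  guard (x ^ 2 + y ^ 2 == z ^ 2)
  return (x, y, z)

main :: IO ()
main = print (take 4 triples)
-- [(3,4,5),(6,8,10),(5,12,13),(9,12,15)]
```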
languages under the guise of generators and coroutines.
A function that has read-only access to some external state, or
environment, can always be replaced by a function that takes that
environment as an additional argument. A pure function
\code{(a, e) -> b} (where \code{e} is the type of
the environment) doesn't look, at first sight, like a Kleisli arrow. But
as soon as we curry it to
\code{a -> (e -> b)} we recognize the
embellishment as our old friend the reader functor:
\begin{verbatim}
newtype Reader e a = Reader (e -> a)
\end{verbatim}
You may interpret a function returning a \code{Reader} as producing a
mini-executable: an action that given an environment produces the
desired result. There is a helper function \code{runReader} to execute
such an action:
\begin{verbatim}
runReader :: Reader e a -> e -> a
runReader (Reader f) e = f e
\end{verbatim}
It may produce different results for different values of the
environment.
produces a \code{b}:
\begin{verbatim}
ra >>= k = Reader (\e -> ...)
\end{verbatim}
Inside the lambda, we can execute the action \code{ra} to produce an
\code{a}:
\begin{verbatim}
ra >>= k = Reader (\e -> let a = runReader ra e
in ...)
\end{verbatim}
We can then pass the \code{a} to the continuation \code{k} to get a
new action \code{rb}:
\begin{verbatim}
ra >>= k = Reader (\e -> let a = runReader ra e
rb = k a
in ...)
\end{verbatim}
Finally, we can run the action \code{rb} with the environment
\code{e}:
\begin{verbatim}
ra >>= k = Reader (\e -> let a = runReader ra e
rb = k a
in runReader rb e)
\end{verbatim}
To implement \code{return} we create an action that ignores the
environment and returns the unchanged value.
Putting it all together, after a few simplifications, we get the
following definition:
\begin{verbatim}
instance Monad (Reader e) where
ra >>= k = Reader (\e -> runReader (k (runReader ra e)) e)
return x = Reader (\e -> x)
\end{verbatim}
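Note that current GHC also requires \code{Functor} and \code{Applicative} instances before a \code{Monad} instance compiles; a runnable sketch of the above (the \code{isBig} action is made up):

```haskell
newtype Reader e a = Reader (e -> a)

runReader :: Reader e a -> e -> a
runReader (Reader f) e = f e

instance Functor (Reader e) where
  fmap f ra = Reader (\e -> f (runReader ra e))

instance Applicative (Reader e) where
  pure x = Reader (\e -> x)
  rf <*> ra = Reader (\e -> runReader rf e (runReader ra e))

instance Monad (Reader e) where
  ra >>= k = Reader (\e -> runReader (k (runReader ra e)) e)

-- a made-up action whose "environment" is an Int threshold
isBig :: Reader Int Bool
isBig = Reader (\limit -> limit > 100)

main :: IO ()
main = print (runReader isBig 200)  -- True
```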
\subsection{Write-Only State}\label{write-only-state}
the \code{Writer} functor:
\begin{verbatim}
newtype Writer w a = Writer (a, w)
\end{verbatim}
For completeness, there's also a trivial helper \code{runWriter} that
unpacks the data constructor:
\begin{verbatim}
runWriter :: Writer w a -> (a, w)
runWriter (Writer (a, w)) = (a, w)
\end{verbatim}
As we've seen before, in order to make \code{Writer} composable,
\code{w} has to be a monoid. Here's the monad instance for
\code{Writer} written in terms of the bind operator:
\begin{verbatim}
instance (Monoid w) => Monad (Writer w) where
(Writer (a, w)) >>= k = let (a', w') = runWriter (k a)
in Writer (a', w `mappend` w')
return a = Writer (a, mempty)
\end{verbatim}
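With the \code{Functor}/\code{Applicative} boilerplate required by current GHC, and \code{<>} in place of \code{mappend}, a runnable sketch with made-up logging actions:

```haskell
newtype Writer w a = Writer (a, w)

runWriter :: Writer w a -> (a, w)
runWriter (Writer (a, w)) = (a, w)

instance Functor (Writer w) where
  fmap f (Writer (a, w)) = Writer (f a, w)

instance Monoid w => Applicative (Writer w) where
  pure a = Writer (a, mempty)
  Writer (f, w) <*> Writer (a, w') = Writer (f a, w <> w')

instance Monoid w => Monad (Writer w) where
  Writer (a, w) >>= k = let (a', w') = runWriter (k a)
                        in Writer (a', w <> w')

-- made-up logging functions
double :: Int -> Writer [String] Int
double x = Writer (x * 2, ["doubled " ++ show x])

inc :: Int -> Writer [String] Int
inc x = Writer (x + 1, ["incremented " ++ show x])

main :: IO ()
main = print (runWriter (double 3 >>= inc))
-- (7,["doubled 3","incremented 6"])
```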
\subsection{State}\label{state}
Functions that have read/write access to state combine the
embellishments of the \code{Reader} and the \code{Writer}. You may
think of them as pure functions that take the state as an extra argument
and produce a pair value/state as a result:
\code{(a, s) -> (b, s)}. After currying, we get them
into the form of Kleisli arrows
\code{a -> (s -> (b, s))}, with the
embellishment abstracted in the \code{State} functor:
\begin{verbatim}
newtype State s a = State (s -> (a, s))
\end{verbatim}
Again, we can look at a Kleisli arrow as returning an action, which can
be executed using the helper function:
\begin{verbatim}
runState :: State s a -> s -> (a, s)
runState (State f) s = f s
\end{verbatim}
Different initial states may not only produce different results, but
also different final states.
to that of the \code{Reader} monad, except that care has to be taken
to pass the correct state at each step:
\begin{verbatim}
sa >>= k = State (\s -> let (a, s') = runState sa s
sb = k a
in runState sb s')
\end{verbatim}
Here's the full instance:
\begin{verbatim}
instance Monad (State s) where
sa >>= k = State (\s -> let (a, s') = runState sa s
in runState (k a) s')
return a = State (\s -> (a, s))
\end{verbatim}
There are also two helper Kleisli arrows that may be used to manipulate
the state. One of them retrieves the state for inspection:
\begin{verbatim}
get :: State s s
get = State (\s -> (s, s))
\end{verbatim}
and the other replaces it with a completely new state:
\begin{verbatim}
put :: s -> State s ()
put s' = State (\s -> ((), s'))
\end{verbatim}
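With the \code{Functor}/\code{Applicative} boilerplate that current GHC requires, the pieces above can be exercised end to end (the \code{tick} counter is a made-up example):

```haskell
newtype State s a = State (s -> (a, s))

runState :: State s a -> s -> (a, s)
runState (State f) s = f s

instance Functor (State s) where
  fmap f sa = State (\s -> let (a, s') = runState sa s in (f a, s'))

instance Applicative (State s) where
  pure a = State (\s -> (a, s))
  sf <*> sa = State (\s -> let (f, s')  = runState sf s
                               (a, s'') = runState sa s'
                           in (f a, s''))

instance Monad (State s) where
  sa >>= k = State (\s -> let (a, s') = runState sa s
                          in runState (k a) s')

get :: State s s
get = State (\s -> (s, s))

put :: s -> State s ()
put s' = State (\s -> ((), s'))

-- a made-up counter action: return the current count, then bump it
tick :: State Int Int
tick = do
  n <- get
  put (n + 1)
  return n

main :: IO ()
main = print (runState (tick >> tick >> tick) 0)  -- (2,3)
```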
\subsection{Exceptions}\label{exceptions}
An imperative function that throws an exception is really a partial
function --- it's a function that's not defined for some values of its
arguments. The simplest implementation of exceptions in terms of pure
total functions uses the \code{Maybe} functor. A partial function is
extended to a total function that returns \code{Just a} whenever it
makes sense, and \code{Nothing} when it doesn't. If we want to also
return some information about the cause of the failure, we can use the
\code{Either} functor instead (with the first type fixed, for
instance, to \code{String}).
Here's the \code{Monad} instance for \code{Maybe}:
\begin{verbatim}
instance Monad Maybe where
Nothing >>= k = Nothing
Just a >>= k = k a
return a = Just a
\end{verbatim}
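A runnable sketch of this instance in action (\code{safeDiv} is a made-up partial function):

```haskell
-- a made-up partial function: division that fails on zero
safeDiv :: Int -> Int -> Maybe Int
safeDiv _ 0 = Nothing
safeDiv x y = Just (x `div` y)

main :: IO ()
main = do
  print (Just 10 >>= safeDiv 100)               -- Just 10
  print (Just 0 >>= safeDiv 100 >>= safeDiv 5)  -- Nothing
```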
Notice that monadic composition for \code{Maybe} correctly
short-circuits the computation (the continuation \code{k} is never
called) when an error is detected. That's the behavior we expect from
which represents ``the rest of the computation'':
\begin{verbatim}
data Cont r a = Cont ((a -> r) -> r)
\end{verbatim}
The handler \code{a -> r}, when it's eventually called,
produces the result of type \code{r}, and this result is returned at
the end. A continuation is parameterized by the result type. (In
practice, this is often some kind of status indicator.)
There is also a helper function for executing the action returned by the
Kleisli arrow. It takes the handler and passes it to the continuation:
\begin{verbatim}
runCont :: Cont r a -> (a -> r) -> r
runCont (Cont k) h = k h
\end{verbatim}
The composition of continuations is notoriously difficult, so its
handling through a monad and, in particular, the \code{do} notation,
is of extreme advantage.
Let's figure out the implementation of bind. First let's look at the
stripped down signature:
\begin{verbatim}
(>>=) :: ((a -> r) -> r) ->
(a -> (b -> r) -> r) ->
((b -> r) -> r)
\end{verbatim}
Our goal is to create a function that takes the handler
\code{(b -> r)} and produces the result \code{r}. So
that's our starting point:
\begin{verbatim}
ka >>= kab = Cont (\hb -> ...)
\end{verbatim}
Inside the lambda, we want to call the function \code{ka} with the
appropriate handler that represents the rest of the computation. We'll
implement this handler as a lambda:
implement this handler as a lambda:
\begin{verbatim}
runCont ka (\a -> ...)
\end{verbatim}
In this case, the rest of the computation involves first calling
\code{kab} with \code{a}, and then passing \code{hb} to the
resulting action \code{kb}:
\begin{verbatim}
runCont ka (\a -> let kb = kab a
in runCont kb hb)
\end{verbatim}
As you can see, continuations are composed inside out. The final handler
\code{hb} is called from the innermost layer of the computation.
Here's the full instance:
\begin{verbatim}
instance Monad (Cont r) where
ka >>= kab = Cont (\hb -> runCont ka (\a -> runCont (kab a) hb))
return a = Cont (\ha -> ha a)
\end{verbatim}
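As with the other monads, current GHC needs \code{Functor} and \code{Applicative} instances before this compiles; a runnable sketch with made-up arrows:

```haskell
newtype Cont r a = Cont ((a -> r) -> r)

runCont :: Cont r a -> (a -> r) -> r
runCont (Cont k) h = k h

instance Functor (Cont r) where
  fmap f ka = Cont (\hb -> runCont ka (hb . f))

instance Applicative (Cont r) where
  pure a = Cont (\ha -> ha a)
  kf <*> ka = Cont (\hb -> runCont kf (\f -> runCont ka (hb . f)))

instance Monad (Cont r) where
  ka >>= kab = Cont (\hb -> runCont ka (\a -> runCont (kab a) hb))

-- made-up Kleisli arrows in the continuation monad
double :: Int -> Cont r Int
double x = return (x * 2)

addThree :: Int -> Cont r Int
addThree x = return (x + 3)

main :: IO ()
main = print (runCont (return 5 >>= double >>= addThree) id)  -- 13
```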
\subsection{Interactive Input}\label{interactive-input}
as a Kleisli arrow:
\begin{verbatim}
getChar :: () -> IO Char
\end{verbatim}
(Actually, since a function from the unit type is equivalent to picking
a value of the return type, the declaration of \code{getChar} is
simplified to \code{getChar :: IO Char}.)
Being a functor, \code{IO} lets you manipulate its contents using
\code{fmap}. And, as a functor, it can store the contents of any type,
signature:
\begin{verbatim}
main :: IO ()
\end{verbatim}
and you are free to think of it as a Kleisli arrow:
\begin{verbatim}
main :: () -> IO ()
\end{verbatim}
From that perspective, a Haskell program is just one big Kleisli arrow
in the \code{IO} monad. You can compose it from smaller Kleisli arrows
using monadic composition. It's up to the runtime system to do something
object of this type. An \code{IO} action is just a function:
\begin{verbatim}
type IO a = RealWorld -> (a, RealWorld)
\end{verbatim}
Or, in terms of the \code{State} monad:
\begin{verbatim}
type IO = State RealWorld
\end{verbatim}
However, \code{>=>} and \code{return} for
the \code{IO} monad have to be built into the language.
that they do nothing. For instance, why do we have:
\begin{verbatim}
putStr :: String -> IO ()
\end{verbatim}
rather than the simpler:
\begin{verbatim}
putStr :: String -> ()
\end{verbatim}
Two reasons: Haskell is lazy, so it would never call a function whose
output --- here, the unit object --- is not used for anything. And, even
if it weren't lazy, it could still freely change the order of such calls
passed between \code{IO} actions enforces sequencing.
Conceptually, in this program:
\begin{verbatim}
main :: IO ()
main = do
putStr "Hello "
putStr "World!"
\end{verbatim}
the action that prints ``World!'' receives, as input, the Universe in
which ``Hello '' is already on the screen. It outputs a new Universe,
with ``Hello World!'' on the screen.
though. The major complaint about monads is that they don't easily
compose with each other. Granted, you can combine most of the basic
monads using the monad transformer library. It's relatively easy to
create a monad stack that combines, say, state with exceptions, but
there is no formula for stacking arbitrary monads together.

\lettrine[lhang=0.17]{I}{f you mention monads} to a programmer, you'll probably end up talking
about effects. To a mathematician, monads are about algebras. We'll talk
about algebras later --- they play an important role in programming ---
but first I'd like to give you a little intuition about their relation
Algebra is about creating, manipulating, and evaluating expressions.
Expressions are built using operators. Consider this simple expression:
\begin{Verbatim}[commandchars=\\\{\}]
x\textsuperscript{2} + 2 x + 1
\end{Verbatim}
This expression is formed using variables like \code{x}, and constants
like 1 or 2, bound together with operators like plus or times. As
programmers, we often think of expressions as trees.
\begin{wrapfigure}[9]{R}{0pt}
\raisebox{0pt}[\dimexpr\height-0\baselineskip\relax]{
\includegraphics[width=40mm]{images/exptree.png}}%
\end{wrapfigure}
\noindent
Trees are containers so, more generally, an expression is a container
for storing variables. In category theory, we represent containers as
endofunctors. If we assign the type \code{a} to the variable
\code{x}, our expression will have the type \code{m a}, where
\code{m} is an endofunctor that builds expression trees. (Nontrivial
branching expressions are usually created using recursively defined
endofunctors.)
What's the most common operation that can be performed on an expression?
It's substitution: replacing variables with expressions. For instance,
in our example, we could replace \code{x} with \code{y - 1} to
get:
\begin{Verbatim}[commandchars=\\\{\}]
(y - 1)\textsuperscript{2} + 2 (y - 1) + 1
\end{Verbatim}
Here's what happened: We took an expression of type \code{m a} and
applied a transformation of type \code{a -> m b}
(\code{b} represents the type of \code{y}). The result is an
expression of type \code{m b}. Let me spell it out:
\begin{verbatim}
m a -> (a -> m b) -> m b
\end{verbatim}
Yes, that's the signature of monadic bind.
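The analogy can be made concrete with a toy expression functor whose bind performs substitution (all names below are made up for this sketch):

```haskell
-- a toy expression tree with variables of type a
data Expr a
  = Var a
  | Con Int
  | Add (Expr a) (Expr a)
  deriving (Show, Eq)

instance Functor Expr where
  fmap f (Var a)   = Var (f a)
  fmap _ (Con n)   = Con n
  fmap f (Add l r) = Add (fmap f l) (fmap f r)

instance Applicative Expr where
  pure = Var
  ef <*> ea = ef >>= \f -> fmap f ea

instance Monad Expr where
  -- bind is substitution: replace every variable by an expression
  Var a   >>= k = k a
  Con n   >>= k = Con n
  Add l r >>= k = Add (l >>= k) (r >>= k)

-- substitute (y - 1) for x, leave other variables alone
subst :: String -> Expr String
subst "x" = Add (Var "y") (Con (-1))
subst v   = Var v

main :: IO ()
main = print (Add (Var "x") (Con 1) >>= subst)
-- Add (Add (Var "y") (Con (-1))) (Con 1)
```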
That was a bit of motivation. Now let's get to the math of the monad.
Therefore, in category theory, a monad is defined as an endofunctor
μ is a natural transformation from the square of the functor \code{T\textsuperscript{2}}
back to \code{T}. The square is simply the functor composed with
itself, \code{T◦T} (we can only do this kind of squaring for
endofunctors).
\begin{Verbatim}[commandchars=\\\{\}]
μ :: T\textsuperscript{2} -> T
\end{Verbatim}
The component of this natural transformation at an object \code{a} is
the morphism:
\begin{Verbatim}[commandchars=\\\{\}]
μ\textsubscript{a} :: T (T a) -> T a
\end{Verbatim}
which, in \newterm{Hask}, translates directly to our definition of
\code{join}.
and \code{T}:
\begin{verbatim}
η :: I -> T
\end{verbatim}
Considering that the action of \code{I} on the object \code{a} is
just \code{a}, the component of η is given by the morphism:
\begin{Verbatim}[commandchars=\\\{\}]
η\textsubscript{a} :: a -> T a
\end{Verbatim}
which translates directly to our definition of \code{return}.
These natural transformations must satisfy some additional laws. One way
of looking at it is that these laws let us define a Kleisli category for
the endofunctor \code{T}. Remember that a Kleisli arrow between
\code{a} and \code{b} is defined as a morphism
\code{a -> T b}. The composition of two such arrows
(I'll write it as a circle with the subscript \code{T}) can be
implemented using μ:
\begin{Verbatim}[commandchars=\\\{\}]
g ◦\textsubscript{T} f = μ\textsubscript{c} ◦ (T g) ◦ f
\end{Verbatim}
where
\begin{verbatim}
f :: a -> T b
g :: b -> T c
\end{verbatim}
Here \code{T}, being a functor, can be applied to the morphism
\code{g}. It might be easier to recognize this formula in Haskell
notation:
notation:
\begin{verbatim}
f >=> g = join . fmap g . f
\end{verbatim}
or, in components:
\begin{verbatim}
(f >=> g) a = join (fmap g (f a))
\end{verbatim}
In terms of the algebraic interpretation, we are just composing two
successive substitutions.
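A quick spot-check, in the list monad, that the two formulations of Kleisli composition agree (\code{f} and \code{g} are made-up arrows):

```haskell
import Control.Monad (join, (>=>))

-- made-up Kleisli arrows for the list monad
f :: Int -> [Int]
f x = [x, x + 1]

g :: Int -> [Int]
g x = [x * 10]

main :: IO ()
main = print ((f >=> g) 1 == join (fmap g (f 1)))  -- True
```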
more like monoid laws. In fact \code{μ} is often called
multiplication, and \code{η} unit.
Roughly speaking, the associativity law states that the two ways of
reducing the cube of \code{T}, \code{T\textsuperscript{3}}, down to \code{T} must
give the same result. Two unit laws (left and right) state that when
\code{η} is applied to \code{T} and then reduced by \code{μ}, we
get back \code{T}.
Things are a little tricky because we are composing natural
transformations and functors. So a little refresher on horizontal
composition is in order. For instance, \code{T\textsuperscript{3}} can be seen as a
composition of \code{T} after \code{T\textsuperscript{2}}. We can apply to it the
horizontal composition of two natural transformations:
\begin{Verbatim}[commandchars=\\\{\}]
I\textsubscript{T} ◦ μ
\end{Verbatim}
\begin{wrapfigure}[8]{R}{0pt}
\raisebox{0pt}[\dimexpr\height-0\baselineskip\relax]{
\includegraphics[width=60mm]{images/assoc1.png}}%
\end{wrapfigure}
\noindent
and get \code{T◦T}; which can be further reduced to \code{T} by
applying \code{μ}. \code{I\textsubscript{T}} is the identity natural transformation
from \code{T} to \code{T}. You will often see the notation for this
type of horizontal composition \code{I\textsubscript{T} ◦ μ} shortened to
\code{Tμ}. This notation is unambiguous because it makes no sense to
compose a functor with a natural transformation, therefore \code{T}
must mean \code{I\textsubscript{T}} in this context.
We can also draw the diagram in the (endo-) functor category\\
\code{{[}C, C{]}}:
\begin{figure}[H]
\centering
\includegraphics[width=50mm]{images/assoc2.png}
\end{figure}
\noindent
Alternatively, we can treat \code{T\textsuperscript{3}} as the composition of
\code{T\textsuperscript{2}◦T} and apply \code{μ◦T} to it. The result is also
\code{T◦T} which, again, can be reduced to \code{T} using μ. We
require that the two paths produce the same result.
\begin{figure}[H]
\centering
\includegraphics[width=2.16667in]{images/assoc.png}
\end{figure}
\noindent
Similarly, we can apply the horizontal composition \code{η◦T} to the
composition of the identity functor \code{I} after \code{T} to
obtain \code{T\textsuperscript{2}}, which can then be reduced using \code{μ}. The
result should be the same as if we applied the identity natural
transformation directly to \code{T}. And, by analogy, the same should
be true for \code{Tη}.
\begin{figure}[H]
\centering
\includegraphics[width=80mm]{images/unitlawcomp-1.png}
\end{figure}
\noindent
You can convince yourself that these laws guarantee that the composition
of Kleisli arrows indeed satisfies the laws of a category.
with a binary operation and a special element called unit. In Haskell,
this can be expressed as a typeclass:
\begin{verbatim}
class Monoid m where
mappend :: m -> m -> m
mempty :: m
\end{verbatim}
The binary operation \code{mappend} must be associative and unital
(i.e., multiplication by the unit \code{mempty} is a no-op).
function:
\begin{verbatim}
mappend :: m -> (m -> m)
\end{verbatim}
It's this interpretation that gives rise to the definition of a monoid
as a single-object category where endomorphisms
\code{(m -> m)} represent the elements of the monoid.
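A minimal instance sketch (the \code{Add} wrapper is made up; the standard library's \code{Data.Monoid.Sum} plays the same role). Note that current GHC routes \code{mappend} through \code{Semigroup}'s \code{<>}:

```haskell
-- a made-up wrapper making Int a monoid under addition
newtype Add = Add Int deriving (Show, Eq)

instance Semigroup Add where
  Add x <> Add y = Add (x + y)

instance Monoid Add where
  mempty = Add 0    -- mappend defaults to (<>)

main :: IO ()
main = print (mconcat [Add 1, Add 2, Add 3])  -- Add 6
```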
But because currying is built into Haskell, we could as well have
started with a different definition of multiplication:
\begin{verbatim}
mu :: (m, m) -> m
\end{verbatim}
Here, the cartesian product \code{(m, m)} becomes the source of pairs
to be multiplied.
This definition suggests a different path to generalization: replacing
there, and define multiplication as a morphism:
\begin{verbatim}
μ :: m × m -> m
\end{verbatim}
We have one problem though: In an arbitrary category we can't peek
inside an object, so how do we pick the unit element? There is a trick
to it. Remember how element selection is equivalent to a function from
the singleton set? In Haskell, we could replace the definition of
\begin{verbatim}
eta :: () -> m
\end{verbatim}
The singleton is the terminal object in \textbf{Set}, so it's natural to
generalize this definition to any category that has a terminal object
\code{t}:
\begin{verbatim}
η :: t -> m
\end{verbatim}
This lets us pick the unit ``element'' without having to talk about
elements.
law is:
\begin{verbatim}
mu x (mu y z) = mu (mu x y) z
\end{verbatim}
Before we can generalize it to other categories, we have to rewrite it
as an equality of functions (morphisms). We have to abstract it away
from its action on individual variables --- in other words, we have to
bifunctor, we can write the left hand side as:
\begin{verbatim}
(mu . bimap id mu)(x, (y, z))
\end{verbatim}
and the right hand side as:
\begin{verbatim}
(mu . bimap mu id)((x, y), z)
\end{verbatim}
This is almost what we want. Unfortunately, the cartesian product is not
strictly associative --- \code{(x, (y, z))} is not the same as
\code{((x, y), z)} --- so we can't just write point-free:
\begin{verbatim}
mu . bimap id mu = mu . bimap mu id
\end{verbatim}
On the other hand, the two nestings of pairs are isomorphic. There is an
invertible function called the associator that converts between them:
\begin{verbatim}
alpha :: ((a, b), c) -> (a, (b, c))
alpha ((x, y), z) = (x, (y, z))
\end{verbatim}
With the help of the associator, we can write the point-free
associativity law for \code{mu}:
\begin{verbatim}
mu . bimap id mu . alpha = mu . bimap mu id
\end{verbatim}
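For a concrete monoid, say integers under addition, the point-free law can be spot-checked; \code{bimap} on pairs comes from \code{Data.Bifunctor}:

```haskell
import Data.Bifunctor (bimap)

-- multiplication for the additive monoid of Int
mu :: (Int, Int) -> Int
mu (x, y) = x + y

-- the associator, as in the text
alpha :: ((a, b), c) -> (a, (b, c))
alpha ((x, y), z) = (x, (y, z))

main :: IO ()
main =
  -- both sides act on a nested pair ((x, y), z)
  print ( (mu . bimap id mu . alpha) ((1, 2), 3)
       == (mu . bimap mu id) ((1, 2), 3) )  -- True
```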
We can apply a similar trick to unit laws which, in the new notation,
take the form:
\begin{verbatim}
mu (eta (), x) = x
mu (x, eta ()) = x
\end{verbatim}
They can be rewritten as:
\begin{verbatim}
(mu . bimap eta id) ((), x) = lambda((), x)
(mu . bimap id eta) (x, ()) = rho (x, ())
\end{verbatim}
The isomorphisms \code{lambda} and \code{rho} are called the left
and right unitor, respectively. They witness the fact that the unit
\code{()} is the identity of the cartesian product up to isomorphism:
\begin{verbatim}
lambda :: ((), a) -> a
lambda ((), x) = x
\end{verbatim}
\begin{verbatim}
rho :: (a, ()) -> a
rho (x, ()) = x
\end{verbatim}
The point-free versions of the unit laws are therefore:
\begin{verbatim}
mu . bimap id eta = rho
mu . bimap eta id = lambda
\end{verbatim}
We have formulated point-free monoidal laws for \code{mu} and
\code{eta} using the fact that the underlying cartesian product itself
acts like a monoidal multiplication in the category of types. Keep in
associative up to isomorphism and the terminal object is the unit, also
up to isomorphism. The associator and the two unitors are natural
isomorphisms. The laws can be represented by commuting diagrams.
\begin{figure}[H]
\centering
\includegraphics[width=80mm]{images/assocmon.png}
\end{figure}
\noindent
Notice that, because the product is a bifunctor, it can lift a pair of
morphisms --- in Haskell this was done using \code{bimap}.
that uses projections. We haven't used any projections in our
formulation of monoidal laws.
A bifunctor that behaves like a product without being a product is
called a tensor product, often denoted by the infix operator \ensuremath{\otimes}. A
definition of a tensor product in general is a bit tricky, but we won't
worry about it. We'll just list its properties --- the most important
being associativity up to isomorphism.
together:
A monoidal category is a category \emph{C} equipped with a bifunctor
called the tensor product:
\begin{Verbatim}[commandchars=\\\{\}]
\ensuremath{\otimes} :: C × C -> C
\end{Verbatim}
and a distinct object \code{i} called the unit object, together with
three natural isomorphisms called, respectively, the associator and the
left and right unitors:
\begin{Verbatim}[commandchars=\\\{\}]
α\textsubscript{a b c} :: (a \ensuremath{\otimes} b) \ensuremath{\otimes} c -> a \ensuremath{\otimes} (b \ensuremath{\otimes} c)
λ\textsubscript{a} :: i \ensuremath{\otimes} a -> a
ρ\textsubscript{a} :: a \ensuremath{\otimes} i -> a
\end{Verbatim}
(There is also a coherence condition for simplifying a quadruple tensor
product.)
\subsection{Monoid in a Monoidal Category}\label{monoid-in-a-monoidal-category}
We are now ready to define a monoid in a more general setting of a
monoidal category. We start by picking an object \code{m}. Using the
tensor product we can form powers of \code{m}. The square of
\code{m} is \code{m \ensuremath{\otimes} m}. There are two ways of forming the cube
of \code{m}, but they are isomorphic through the associator. Similarly
for higher powers of \code{m} (that's where we need the coherence
conditions). To form a monoid we need to pick two morphisms:
\begin{Verbatim}[commandchars=\\\{\}]
μ :: m \ensuremath{\otimes} m -> m
η :: i -> m
\end{Verbatim}
where \code{i} is the unit object for our tensor product.
\begin{figure}[H]
\centering
\includegraphics[width=70mm]{images/monoid-1.jpg}
\end{figure}
\noindent
These morphisms have to satisfy associativity and unit laws, which can
be expressed in terms of the following commuting diagrams:
\begin{figure}[H]
\centering
\includegraphics[width=3.12500in]{images/assoctensor.jpg}
\end{figure}
\begin{figure}[H]
\centering
\includegraphics[width=3.12500in]{images/unitmon.jpg}
\end{figure}
\noindent
Notice that it's essential that the tensor product be a bifunctor
because we need to lift pairs of morphisms to form products such as
\code{μ \ensuremath{\otimes} id} or \code{η \ensuremath{\otimes} id}. These diagrams are just a
straightforward generalization of our previous results for categorical
products.
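As a down-to-earth sanity check, here is a sketch (our own example, not from the text) of a monoid object in this monoidal category of types, taking \code{m} to be \code{Int} with addition:

```haskell
-- μ and η for the additive monoid on Int, viewed as a monoid object
-- in (Hask, (,), ()): the multiplication consumes the tensor product
-- (a pair), and the unit is a morphism out of the unit object ().
mu :: (Int, Int) -> Int
mu (x, y) = x + y

eta :: () -> Int
eta () = 0

-- The associativity diagram, chased on elements:
assocLaw :: Int -> Int -> Int -> Bool
assocLaw x y z = mu (mu (x, y), z) == mu (x, mu (y, z))

-- The two unit diagrams:
unitLaws :: Int -> Bool
unitLaws x = mu (eta (), x) == x && mu (x, eta ()) == x
```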
@@ -448,11 +454,11 @@ composition of morphisms --- and, as we know, functors are indeed
morphisms in the category \textbf{Cat}. But just like endomorphisms
(morphisms that loop back to the same object) are always composable, so
are endofunctors. For any given category \emph{C}, endofunctors from
\emph{C} to \emph{C} form the functor category \code{{[}C, C{]}}. Its
objects are endofunctors, and morphisms are natural transformations
between them. We can take any two objects from this category, say
endofunctors \code{F} and \code{G}, and produce a third object\\
\code{F ◦ G} --- an endofunctor that's their composition.
Is endofunctor composition a good candidate for a tensor product? First,
we have to establish that it's a bifunctor. Can it be used to lift a
@@ -460,21 +466,23 @@ pair of morphisms --- here, natural transformations? The signature of
the analog of \code{bimap} for the tensor product would look something
like this:
\begin{Verbatim}[commandchars=\\\{\}]
bimap :: (a -> b) -> (c -> d) -> (a \ensuremath{\otimes} c -> b \ensuremath{\otimes} d)
\end{Verbatim}
If you replace objects by endofunctors, arrows by natural
transformations, and tensor products by composition, you get:
\begin{verbatim}
(F -> F') -> (G -> G') -> (F ◦ G -> F' ◦ G')
\end{verbatim}
which you may recognize as the special case of horizontal composition.
\begin{figure}[H]
\centering
\includegraphics[width=70mm]{images/horizcomp.png}
\end{figure}
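This special case can be sketched in Haskell using \code{Compose} from \code{Data.Functor.Compose}; the \code{Nat} synonym and the name \code{hcomp} are our own:

```haskell
{-# LANGUAGE RankNTypes #-}
import Data.Functor.Compose (Compose (..))
import Data.Maybe (listToMaybe, maybeToList)

-- A natural transformation between functors f and g.
type Nat f g = forall a. f a -> g a

-- Horizontal composition: the bimap analog when the tensor product
-- is functor composition.
hcomp :: Functor f => Nat f f' -> Nat g g' -> Nat (Compose f g) (Compose f' g')
hcomp alpha beta (Compose fga) = Compose (alpha (fmap beta fga))
```

For instance, \code{hcomp listToMaybe maybeToList} maps a composite \code{[Maybe a]} value to a \code{Maybe [a]} value.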
\noindent
We also have at our disposal the identity endofunctor \code{I}, which
can serve as the identity for endofunctor composition --- our new tensor
product. Moreover, functor composition is associative. In fact
@@ -487,56 +495,61 @@ endofunctor \code{T}; and two morphisms --- that is natural
transformations:
\begin{verbatim}
μ :: T ◦ T -> T
η :: I -> T
\end{verbatim}
Not only that, here are the monoid laws:
\begin{figure}[H]
\centering
\includegraphics[width=1.90625in]{images/assoc.png}
\end{figure}
\begin{figure}[H]
\centering
\includegraphics[width=2.86458in]{images/unitlawcomp.png}
\end{figure}
\noindent
They are exactly the monad laws we've seen before. Now you understand
the famous quote from Saunders Mac Lane:
\begin{quote}
All told, monad is just a monoid in the category of endofunctors.
\end{quote}
You might have seen it emblazoned on some t-shirts at functional
programming conferences.
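For a concrete Haskell monad, say the list monad, μ is \code{join} and η is \code{return}, and the two diagrams translate into the equations \code{join . fmap join = join . join} and \code{join . fmap return = id = join . return}. A quick element-level check:

```haskell
import Control.Monad (join)

-- μ = join, η = return for the list monad.
-- The associativity square: flattening inner layers first equals
-- flattening outer layers first.
assocLaw :: [[[Int]]] -> Bool
assocLaw xsss = join (fmap join xsss) == join (join xsss)

-- The unit triangles: wrapping with return and then flattening is
-- the identity, on either side.
unitLaws :: [Int] -> Bool
unitLaws xs = join (fmap return xs) == xs && join (return xs) == xs
```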
\section{Monads from Adjunctions}\label{monads-from-adjunctions}
An \hyperref[adjunctions]{adjunction},
\code{L \ensuremath{\dashv} R}, is a pair of functors going back and forth between two
categories \emph{C} and \emph{D}. There are two ways of composing them
giving rise to two endofunctors, \code{R ◦ L} and \code{L ◦ R}.
As per an adjunction, these endofunctors are related to identity
functors through two natural transformations called unit and counit:
\begin{Verbatim}[commandchars=\\\{\}]
η :: I\textsubscript{D} -> R ◦ L
ε :: L ◦ R -> I\textsubscript{C}
\end{Verbatim}
Immediately we see that the unit of an adjunction looks just like the
unit of a monad. It turns out that the endofunctor \code{R ◦ L} is
indeed a monad. All we need is to define the appropriate μ to go with
the η. That's a natural transformation between the square of our
endofunctor and the endofunctor itself or, in terms of the adjoint
functors:
\begin{verbatim}
R ◦ L ◦ R ◦ L -> R ◦ L
\end{verbatim}
And, indeed, we can use the counit to collapse the \code{L ◦ R} in
the middle. The exact formula for μ is given by the horizontal
composition:
\begin{verbatim}
μ = R ◦ ε ◦ L
\end{verbatim}
Monadic laws follow from the identities satisfied by the unit and counit
of the adjunction and the interchange law.
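Here is a sketch of this construction in Haskell, using a stripped-down \code{Adjunction} class (the full version lives in the \code{adjunctions} package; the shape here is simplified) and the currying adjunction between \code{(,) s} and \code{(->) s}:

```haskell
{-# LANGUAGE FlexibleInstances #-}
{-# LANGUAGE FunctionalDependencies #-}
{-# LANGUAGE MultiParamTypeClasses #-}

-- A minimal Adjunction class (simplified sketch of the one in the
-- adjunctions package).
class (Functor l, Functor r) => Adjunction l r | l -> r, r -> l where
  unit   :: a -> r (l a)
  counit :: l (r a) -> a

-- The currying adjunction: (,) s is left adjoint to (->) s.
instance Adjunction ((,) s) ((->) s) where
  unit a = \s -> (s, a)
  counit (s, f) = f s

-- Every adjunction induces a monad on the composite r ∘ l.
-- η of the monad is the unit of the adjunction:
returnM :: Adjunction l r => a -> r (l a)
returnM = unit

-- μ = R ∘ ε ∘ L: apply the counit under the outer R layer.
joinM :: Adjunction l r => r (l (r (l a))) -> r (l a)
joinM = fmap counit
```

The induced monad \code{r (l a) = s -> (s, a)} is the state monad again, with the pair components swapped relative to the \code{State} convention used in this chapter.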
@@ -545,41 +558,38 @@ because an adjunction usually involves two categories. However, the
definitions of an exponential, or a function object, is an exception.
Here are the two endofunctors that form this adjunction:
\begin{Verbatim}[commandchars=\\\{\}]
L z = z × s
R b = s \ensuremath{\Rightarrow} b
\end{Verbatim}
You may recognize their composition as the familiar state monad:
\begin{Verbatim}[commandchars=\\\{\}]
R (L z) = s \ensuremath{\Rightarrow} (z × s)
\end{Verbatim}
We've seen this monad before in Haskell:
\begin{verbatim}
newtype State s a = State (s -> (a, s))
\end{verbatim}
Let's also translate the adjunction to Haskell. The left functor is the
product functor:
\begin{verbatim}
newtype Prod s a = Prod (a, s)
\end{verbatim}
and the right functor is the reader functor:
\begin{verbatim}
newtype Reader s a = Reader (s -> a)
\end{verbatim}
They form the adjunction:
\begin{verbatim}
instance Adjunction (Prod s) (Reader s) where
counit (Prod (Reader f, s)) = f s
unit a = Reader (\s -> Prod (a, s))
\end{verbatim}
You can easily convince yourself that the composition of the reader
functor after the product functor is indeed equivalent to the state
functor:
@@ -587,16 +597,15 @@ functor:
\begin{verbatim}
newtype State s a = State (s -> (a, s))
\end{verbatim}
As expected, the \code{unit} of the adjunction is equivalent to the
\code{return} function of the state monad. The \code{counit} acts by
evaluating a function acting on its argument. This is recognizable as
the uncurried version of the function \code{runState}:
\begin{verbatim}
runState :: State s a -> s -> (a, s)
runState (State f) s = f s
\end{verbatim}
(uncurried, because in \code{counit} it acts on a pair).
We can now define \code{join} for the state monad as a component of
@@ -604,9 +613,8 @@ the natural transformation μ. For that we need a horizontal composition
of three natural transformations:
\begin{verbatim}
μ = R ◦ ε ◦ L
\end{verbatim}
In other words, we need to sneak the counit ε across one level of the
reader functor. We can't just call \code{fmap} directly, because the
compiler would pick the one for the \code{State} functor, rather than
@@ -619,17 +627,17 @@ the function inside the \code{State} functor. This is done using
\code{runState}:
\begin{verbatim}
ssa :: State s (State s a)
runState ssa :: s -> (State s a, s)
\end{verbatim}
Then we left-compose it with the counit, which is defined by
\code{uncurry\ runState}. Finally, we clothe it back in the
\code{State} data constructor:
\begin{verbatim}
join :: State s (State s a) -> State s a
join ssa = State (uncurry runState . runState ssa)
\end{verbatim}
This is indeed the implementation of \code{join} for the
\code{State} monad.
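To see it run, here is the whole derivation assembled into a self-contained sketch, with an illustrative \code{bind} built from \code{join} and \code{fmap} (the \code{get} action is our own example, not from the text):

```haskell
newtype State s a = State (s -> (a, s))

runState :: State s a -> s -> (a, s)
runState (State f) s = f s

instance Functor (State s) where
  fmap f (State g) = State (\s -> let (a, s') = g s in (f a, s'))

-- join obtained from the adjunction, as derived above.
join :: State s (State s a) -> State s a
join ssa = State (uncurry runState . runState ssa)

-- bind is the general monadic recipe join . fmap k, specialized to State.
bind :: State s a -> (a -> State s b) -> State s b
bind sa k = join (fmap k sa)

-- A small state action for demonstration: return the current state.
get :: State s s
get = State (\s -> (s, s))
```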
@@ -638,7 +646,5 @@ the converse is also true: every monad can be factorized into a
composition of two adjoint functors. Such factorization is not unique
though.
We'll talk about the other endofunctor \code{L ◦ R} in the next
section.
@@ -124,7 +124,7 @@ PDF compiled by @url{https://github.com/hmemcpy/milewski-ctfp-pdf, Igal Tabachni
\chapter{It's All About Morphisms}
\subfile{content/3.1/It's All About Morphisms}
\chapter{Adjunctions}\label{adjunctions}
\subfile{content/3.2/Adjunctions}
\chapter{Free/Forgetful Adjunctions}\label{free-forgetful-adjunctions}
@@ -133,6 +133,12 @@ PDF compiled by @url{https://github.com/hmemcpy/milewski-ctfp-pdf, Igal Tabachni
\chapter{Monads: Programmer's Definition}\label{monads-programmers-definition}
\subfile{content/3.4/Monads - Programmer's Definition}
\chapter{Monads and Effects}\label{monads-and-effects}
\subfile{content/3.5/Monads and Effects}
\chapter{Monads Categorically}\label{monads-categorically}
\subfile{content/3.6/Monads Categorically}
\backmatter
@unnumbered Acknowledgments