Converting chapters 1 and 2 to use external .tex files

Igal Tabachnik 2017-09-22 18:32:18 +03:00 committed by Igal Tabachnik
parent c6e5ba1e94
commit ae8ca9fe04
6 changed files with 182 additions and 688 deletions

BIN
ctfp.pdf

Binary file not shown.


@ -1,7 +1,3 @@
You may also watch me
\href{https://www.youtube.com/playlist?list=PLbgaMIhjbmEnaH_LTkxLI7FMa2HsnawM_}{teaching
this material} to a live audience.
\begin{quote}
For some time now I've been floating the idea of writing a book about
category theory that would be targeted at programmers. Mind you, not
@ -13,24 +9,26 @@ strong compulsion to explain things. I have tremendous admiration for
Richard Feynman who was the master of simple explanations. I know I'm no
Feynman, but I will try my best. I'm starting by publishing this preface
--- which is supposed to motivate the reader to learn category theory
--- in hopes of starting a discussion and soliciting feedback.
--- in hopes of starting a discussion and soliciting feedback.\footnote{
You may also watch me \href{https://www.youtube.com/playlist?list=PLbgaMIhjbmEnaH_LTkxLI7FMa2HsnawM_}{teaching
this material} to a live audience.}
\end{quote}
I will attempt, in the space of a few paragraphs, to convince you that
this book is written for you, and whatever objections you might have to
learning one of the most abstract branches of mathematics in your
``copious spare~time'' are totally unfounded.
\lettrine[lhang=0.17]{I}{will attempt}, in the space of a few paragraphs,
to convince you that this book is written for you, and whatever
objections you might have to learning one of the most abstract branches
of mathematics in your ``copious spare time'' are totally unfounded.
My optimism is based on several observations. First, category theory is
a treasure trove~of extremely useful programming ideas. Haskell
programmers have been tapping this resource~for a long time, and the
a treasure trove of extremely useful programming ideas. Haskell
programmers have been tapping this resource for a long time, and the
ideas are slowly percolating into other languages, but this process is
too slow. We need to speed it up.
Second, there are many different kinds of math, and they appeal to
different audiences. You might be allergic to calculus or algebra, but
it doesn't mean you won't enjoy category theory. I would go as far as
to~argue~that category theory is the kind of math that is particularly
to argue that category theory is the kind of math that is particularly
well suited for the minds of programmers. That's because category theory
--- rather than dealing with particulars --- deals with structure. It
deals with the kind of structure that makes programs composable.
@ -40,7 +38,7 @@ definition of the category itself. And I will argue strongly that
composition is the essence of programming. We've been composing things
forever, long before some great engineer came up with the idea of a
subroutine. Some time ago the principles of structural programming
revolutionized programming~because they made blocks of code composable.
revolutionized programming because they made blocks of code composable.
Then came object oriented programming, which is all about composing
objects. Functional programming is not only about composing functions
and algebraic data structures --- it makes concurrency composable ---
@ -100,7 +98,10 @@ you keep heating water, it will eventually start boiling. We are now in
the position of a frog that must decide if it should continue swimming
in increasingly hot water, or start looking for some alternatives.
\includegraphics[width=3.12500in]{images/img_1299.jpg}
\begin{figure}
\centering
\fbox{\includegraphics[width=90mm]{images/img_1299.jpg}}
\end{figure}
One of the forces that are driving the big change is the multicore
revolution. The prevailing programming paradigm, object oriented
@ -145,13 +146,9 @@ internet infrastructure. And, frankly, they should, because we've done
all this based on very flimsy theoretical foundations. We have to fix
those foundations if we want to move forward.
\hypertarget{attachment_3472}{}
\includegraphics[width=2.34375in]{images/beauvais_interior_supports.jpg}
Ad hoc measures preventing the Beauvais cathedral from collapsing
Next:
\href{https://bartoszmilewski.com/2014/11/04/category-the-essence-of-composition/}{Category:
The Essence of Composition}.
\href{https://twitter.com/BartoszMilewski}{Follow @BartoszMilewski}
\begin{figure}
\centering
\includegraphics[totalheight=8cm]{images/beauvais_interior_supports.jpg}
\captionsetup{labelformat=empty,font=scriptsize}
\caption{Ad hoc measures preventing the Beauvais cathedral from collapsing.}
\end{figure}


@ -1,47 +1,32 @@
\begin{quote}
I was overwhelmed by the positive response to my previous post, the
\href{https://bartoszmilewski.com/2014/10/28/category-theory-for-programmers-the-preface/}{Preface
to Category Theory for Programmers}. At the same time, it scared the
heck out of me because I realized what high expectations people were
placing in me. I'm afraid that no matter what I'll write, a lot of
readers will be disappointed. Some readers would like the book to be
more practical, others more abstract. Some hate C++ and would like all
examples in Haskell, others hate Haskell and demand examples in Java.
And I know that the pace of exposition will be too slow for some and too
fast for others. This will not be the perfect book. It will be a
compromise. All I can hope is that I'll be able to share some of my aha!
moments with my readers. Let's start with the basics.
\end{quote}
\lettrine[lhang=0.17]{A}{category} is an embarrassingly simple concept.
A category consists of \emph{objects} and \emph{arrows} that go between them. That's
why categories are so easy to represent pictorially. An object can be
drawn as a circle or a point, and an arrow\ldots{} is an arrow. (Just
for variety, I will occasionally draw objects as piggies and arrows as
fireworks.) But the essence of a category is \emph{composition}. Or, if you
prefer, the essence of composition is a category. Arrows compose, so
if you have an arrow from object A to object B, and another arrow from
object B to object C, then there must be an arrow --- their composition
--- that goes from A to C.
A category is an embarrassingly simple concept. A category consists of
objects and arrows that go between them. That's why categories are so
easy to represent pictorially. An object can be drawn as a circle or a
point, and an arrow\ldots{} is an arrow. (Just for variety, I will
occasionally draw objects as piggies and arrows as fireworks.) But the
essence of a category is composition. Or, if you prefer, the essence of
composition is a category. Arrows compose, so if you have an arrow from
object A to object B, and another arrow from object B to object C, then
there must be an arrow --- their composition --- that goes from A to C.
\hypertarget{attachment_3483}{}
\includegraphics[width=5.31250in]{images/img_1330.jpg}
In a category, if there is an arrow going from A to B and an arrow going
from B to C then there must also be a direct arrow from A to C that is
their composition. This diagram is not a full category because it's
missing identity morphisms (see later).
\begin{figure}
\centering
\includegraphics[width=\textwidth]{images/img_1330.jpg}
\captionsetup{labelformat=empty,font=scriptsize}
\caption{In a category, if there is an arrow going from A to B and an arrow going from B to C then there must also be a direct arrow from A to C that is their composition. This diagram is not a full category because it's missing identity morphisms (see later).}
\end{figure}
\subsection{Arrows as Functions}\label{arrows-as-functions}
Is this already too much abstract nonsense? Do not despair. Let's talk
concretes. Think of arrows, which are also called \emph{morphisms}, as
functions. You have a function f that takes an argument of type A and
returns a B. You have another function g that takes a B and returns a C.
You can compose them by passing the result of f to g. You have just
functions. You have a function \code{f} that takes an argument of type A and
returns a B. You have another function \code{g} that takes a B and returns a C.
You can compose them by passing the result of \code{f} to \code{g}. You have just
defined a new function that takes an A and returns a C.
In math, such composition is denoted by a small circle between
functions: g∘f. Notice the right to left order of composition. For some
functions: \ensuremath{g \circ f}. Notice the right to left order of composition. For some
people this is confusing. You may be familiar with the pipe notation in
Unix, as in:
@ -51,27 +36,26 @@ lsof | grep Chrome
or the chevron \texttt{\textgreater{}\textgreater{}} in F\#, which both
go from left to right. But in mathematics and in Haskell functions
compose right to left. It helps if you read g∘f as ``g \emph{after} f.''
compose right to left. It helps if you read \texttt{g◦f} as ``g \emph{after} f.''
Let's make this even more explicit by writing some C code. We have one
function \texttt{f} that takes an argument of type \texttt{A} and
returns a value of type \texttt{B}:
\begin{verbatim}
B f(A a);
\end{verbatim}
\mint{c}|B f(A a);|
and another:
\begin{verbatim}
C g(B b);
\end{verbatim}
\mint{c}|C g(B b);|
Their composition is:
\begin{verbatim}
C g_after_f(A a) { return g(f(a)); }
\end{verbatim}
\begin{minted}{c}
C g_after_f(A a)
{
    return g(f(a));
}
\end{minted}
Here, again, you see right-to-left composition: \texttt{g(f(a))}; this
time in C.
@ -81,36 +65,28 @@ Library that takes two functions and returns their composition, but
there isn't one. So let's try some Haskell for a change. Here's the
declaration of a function from A to B:
\begin{verbatim}
f :: A -> B
\end{verbatim}
\mint{haskell}|f :: A -> B|
Similarly:
\begin{verbatim}
g :: B -> C
\end{verbatim}
\mint{haskell}|g :: B -> C|
Their composition is:
\begin{verbatim}
g . f
\end{verbatim}
\mint{text}|g . f|
Once you see how simple things are in Haskell, the inability to express
straightforward functional concepts in C++ is a little embarrassing. In
fact, Haskell will let you use Unicode characters so you can write
composition as:
\begin{verbatim}
g ∘ f
\end{verbatim}
\mint{text}|g ◦ f|
You can even use Unicode double colons and arrows:
\begin{verbatim}
f A → B
\end{verbatim}
\begin{minted}[escapeinside=||,mathescape=true]{text}
f |\ensuremath{\Colon}| A → B
\end{minted}
So here's the first Haskell lesson: Double colon means ``has the type
of\ldots{}'' A function type is created by inserting an arrow between
@ -122,51 +98,50 @@ two types. You compose two functions by inserting a period between them
There are two extremely important properties that the composition in any
category must satisfy.
1. Composition is associative. If you have three morphisms, f, g, and h,
\begin{enumerate}
\item
Composition is associative. If you have three morphisms, f, g, and h,
that can be composed (that is, their objects match end-to-end), you
don't need parentheses to compose them. In math notation this is
expressed as:
\begin{verbatim}
h∘(g∘f) = (h∘g)∘f = h∘g∘f
\end{verbatim}
\mint{text}|h◦(g◦f) = (h◦g)◦f = h◦g◦f|
In (pseudo) Haskell:
\begin{verbatim}
f :: A -> B g :: B -> C h :: C -> D h . (g . f) == (h . g) . f == h . g . f
\end{verbatim}
\begin{minted}{haskell}
f :: A -> B
g :: B -> C
h :: C -> D
h . (g . f) == (h . g) . f == h . g . f
\end{minted}
(I said ``pseudo,'' because equality is not defined for functions.)
Associativity is pretty obvious when dealing with functions, but it may
be not as obvious in other categories.
2. For every object A there is an arrow which is a unit of composition.
\item
For every object A there is an arrow which is a unit of composition.
This arrow loops from the object to itself. Being a unit of composition
means that, when composed with any arrow that either starts at A or ends
at A, respectively, it gives back the same arrow. The unit arrow for
object A is called id\textsubscript{A} (identity on A). In math
notation, if f goes from A to B then
object A is called \texttt{id\textsubscript{A}} (\emph{identity} on A). In math
notation, if \texttt{f} goes from A to B then
\begin{verbatim}
f∘idA = f
\end{verbatim}
\texttt{f◦id\textsubscript{A} = f}
and
\begin{verbatim}
idB∘f = f
\end{verbatim}
\texttt{id\textsubscript{B}◦f = f}
\end{enumerate}
When dealing with functions, the identity arrow is implemented as the
identity function that just returns back its argument. The
implementation is the same for every type, which means this function is
universally polymorphic. In C++ we could define it as a template:
\begin{verbatim}
template<class T> T id(T x) { return x; }
\end{verbatim}
\mint{c++}|template<class T> T id(T x) { return x; }|
Of course, in C++ nothing is that simple, because you have to take into
account not only what you're passing but also how (that is, by value, by
@ -175,9 +150,10 @@ reference, by const reference, by move, and so on).
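For illustration only, here is a hedged sketch of one way to handle those cases, using a forwarding reference (C++11); this is an assumption, not a definition from the text:

\begin{minted}{c++}
#include <utility>

// Deduces T differently for lvalues and rvalues and forwards the
// argument back unchanged, preserving its value category.
template<class T>
T&& id(T&& x)
{
    return std::forward<T>(x);
}
\end{minted}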
In Haskell, the identity function is part of the standard library
(called Prelude). Here's its declaration and definition:
\begin{verbatim}
id :: a -> a id x = x
\end{verbatim}
\begin{minted}{haskell}
id :: a -> a
id x = x
\end{minted}
As you can see, polymorphic functions in Haskell are a piece of cake. In
the declaration, you just replace the type with a type variable. Here's
@ -203,9 +179,10 @@ This concludes our second Haskell lesson.
The identity conditions can be written (again, in pseudo-Haskell) as:
\begin{verbatim}
f . id == f id . f == f
\end{verbatim}
\begin{minted}{text}
f . id == f
id . f == f
\end{minted}
You might be asking yourself the question: Why would anyone bother with
the identity function --- a function that does nothing? Then again, why
@ -252,7 +229,7 @@ mind. Our brains can only deal with a small number of concepts at a
time. One of the most cited papers in psychology,
\href{http://en.wikipedia.org/wiki/The_Magical_Number_Seven,_Plus_or_Minus_Two}{The
Magical Number Seven, Plus or Minus Two}, postulated that we can only
keep 7 ± 2 ``chunks'' of information in our minds. The details of our
keep \ensuremath{7 \pm 2} ``chunks'' of information in our minds. The details of our
understanding of the human short-term memory might be changing, but we
know for sure that it's limited. The bottom line is that we are unable
to deal with the soup of objects or the spaghetti of code. We need
@ -293,7 +270,6 @@ advantages of your programming paradigm.
\subsection{Challenges}\label{challenges}
\begin{enumerate}
\tightlist
\item
Implement, as best as you can, the identity function in your favorite
language (or the second favorite, if your favorite language happens to
@ -312,9 +288,4 @@ advantages of your programming paradigm.
morphisms?
\item
When is a directed graph a category?
\end{enumerate}
Next:
\href{https://bartoszmilewski.com/2014/11/24/types-and-functions/}{Types
and Functions}.\\
\href{https://twitter.com/BartoszMilewski}{Follow @BartoszMilewski}
\end{enumerate}
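As a hint for challenges 2 and 3, here is a minimal C++14 sketch (the name \texttt{compose} and the test values are illustrative, not from the text):

\begin{minted}{c++}
#include <cassert>

// compose(f, g) returns "f after g", mirroring the right-to-left
// order of the mathematical notation f ◦ g.
template<class F, class G>
auto compose(F f, G g)
{
    return [f, g](auto x) { return f(g(x)); };
}

int main()
{
    auto id  = [](int x) { return x; };
    auto inc = [](int x) { return x + 1; };
    // Composing with identity gives back the same function (pointwise).
    assert(compose(inc, id)(42) == inc(42));
    assert(compose(id, inc)(42) == inc(42));
}
\end{minted}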


@ -1,13 +1,4 @@
\begin{quote}
This is part of the book Category Theory for Programmers. The previous
instalment was
\href{https://bartoszmilewski.com/2014/11/04/category-the-essence-of-composition/}{Category:
The Essence of Composition}. See the
\href{https://bartoszmilewski.com/2014/10/28/category-theory-for-programmers-the-preface/}{Table
of Contents}.
\end{quote}
The category of types and functions plays an important role in
\lettrine[lhang=0.17]{T}{he category of types and functions} plays an important role in
programming, so let's talk about what types are and why we need them.
\subsection{Who Needs Types?}\label{who-needs-types}
@ -18,7 +9,10 @@ a thought experiment. Imagine millions of monkeys at computer keyboards
happily hitting random keys, producing programs, compiling, and running
them.
\includegraphics[width=2.68750in]{images/img_1329.jpg}
\begin{figure}
\centering
\fbox{\includegraphics[width=60mm]{images/img_1329.jpg}}
\end{figure}
With machine language, any combination of bytes produced by monkeys
would be accepted and run. But with higher level languages, we do
@ -114,9 +108,7 @@ synonym for a list of \texttt{Char}, is an example of an infinite set.
When we declare \texttt{x} to be an \texttt{Integer}:
\begin{verbatim}
x :: Integer
\end{verbatim}
\mint{haskell}|x :: Integer|
we are saying that it's an element of the set of integers.
\texttt{Integer} in Haskell is an infinite set, and it can be used to do
@ -154,12 +146,10 @@ undecidable --- the famous halting problem. That's why computer
scientists came up with a brilliant idea, or a major hack, depending on
your point of view, to extend every type by one more special value
called the \emph{bottom} and denoted by \texttt{\_\textbar{}\_}, or
Unicode ⊥. This ``value'' corresponds to a non-terminating computation.
Unicode \ensuremath{\bot}. This ``value'' corresponds to a non-terminating computation.
So a function declared as:
\begin{verbatim}
f :: Bool -> Bool
\end{verbatim}
\mint{haskell}|f :: Bool -> Bool|
may return \texttt{True}, \texttt{False}, or \texttt{\_\textbar{}\_};
the latter meaning that it would never terminate.
@ -169,20 +159,22 @@ is convenient to treat every runtime error as a bottom, and even allow
functions to return the bottom explicitly. The latter is usually done
using the expression \texttt{undefined}, as in:
\begin{verbatim}
f :: Bool -> Bool f x = undefined
\end{verbatim}
\begin{minted}{haskell}
f :: Bool -> Bool
f x = undefined
\end{minted}
This definition type checks because \texttt{undefined} evaluates to
bottom, which is a member of any type, including \texttt{Bool}. You can
even write:
\begin{verbatim}
f :: Bool -> Bool f = undefined
\end{verbatim}
\begin{minted}{haskell}
f :: Bool -> Bool
f = undefined
\end{minted}
(without the \texttt{x}) because the bottom is also a member of the type
\texttt{Bool-\textgreater{}Bool}.
\texttt{Bool->Bool}.
Functions that may return bottom are called partial, as opposed to total
functions, which return valid results for every possible argument.
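To make the distinction concrete in C++ terms, here is a hedged sketch (the name \texttt{safeHead} is illustrative): calling \texttt{front()} on an empty vector plays the role of bottom, while the \texttt{optional}-returning variant is total, because the missing case is encoded in the type:

\begin{minted}{c++}
#include <optional>
#include <vector>

// Total: defined for every possible argument, including the empty vector.
std::optional<int> safeHead(const std::vector<int>& xs)
{
    if (xs.empty())
        return std::nullopt;   // the "no result" case is a proper value
    return xs.front();         // xs.front() alone would be partial
}
\end{minted}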
@ -193,7 +185,10 @@ the theoretical point of view, this is the source of never-ending
complications, so at this point I will use my butcher's knife and
terminate this line of reasoning. From the pragmatic point of view, it's
okay to ignore non-terminating functions and bottoms, and treat
\textbf{Hask} as bona fide \textbf{Set} (see Bibliography at the end).
\textbf{Hask} as bona fide \textbf{Set}\footnote{Nils Anders Danielsson,
John Hughes, Patrik Jansson, Jeremy Gibbons, @url{http://www.cs.ox.ac.uk/jeremy.gibbons/publications/fast+loose.pdf,
Fast and Loose Reasoning is Morally Correct}. This paper provides justification for ignoring bottoms in most contexts.}.
\subsection{Why Do We Need a Mathematical
Model?}\label{why-do-we-need-a-mathematical-model}
@ -245,18 +240,22 @@ usually quite simple, if not trivial.
Consider the definition of a factorial function in Haskell, which is a
language quite amenable to denotational semantics:
\begin{verbatim}
fact n = product [1..n]
\end{verbatim}
\mint{haskell}|fact n = product [1..n]|
The expression \texttt{{[}1..n{]}} is a list of integers from 1 to n.
The expression \texttt{{[}1..n{]}} is a list of integers from \texttt{1} to \texttt{n}.
The function \texttt{product} multiplies all elements of a list. That's
just like a definition of factorial taken from a math text. Compare this
with C:
\begin{verbatim}
int fact(int n) { int i; int result = 1; for (i = 2; i <= n; ++i) result *= i; return result; }
\end{verbatim}
\begin{minted}{c}
int fact(int n) {
    int i;
    int result = 1;
    for (i = 2; i <= n; ++i)
        result *= i;
    return result;
}
\end{minted}
Need I say more?
@ -327,9 +326,7 @@ return, there are no restrictions whatsoever. It can return any type
it's a function that's polymorphic in the return type. Haskellers have a
name for it:
\begin{verbatim}
absurd :: Void -> a
\end{verbatim}
\mint{haskell}|absurd :: Void -> a|
(Remember, \texttt{a} is a type variable that can stand for any type.)
The name is not coincidental. There is a deeper interpretation of types
@ -345,9 +342,7 @@ Think of functions from and to this type. A function from \texttt{void}
can always be called. If it's a pure function, it will always return the
same result. Here's an example of such a function:
\begin{verbatim}
int f44() { return 44; }
\end{verbatim}
\mint{c}|int f44() { return 44; }|
You might think of this function as taking ``nothing'', but as we've
just seen, a function that takes ``nothing'' can never be called because
@ -361,9 +356,10 @@ because of the Haskell's love of terseness, the same symbol \texttt{()}
is used for the type, the constructor, and the only value corresponding
to a singleton set. So here's this function in Haskell:
\begin{verbatim}
f44 :: () -> Integer f44 () = 44
\end{verbatim}
\begin{minted}{haskell}
f44 :: () -> Integer
f44 () = 44
\end{minted}
The first line declares that \texttt{f44} takes the type \texttt{()},
pronounced ``unit,'' into the type \texttt{Integer}. The second line
@ -371,9 +367,7 @@ defines \texttt{f44} by pattern matching the only constructor for unit,
namely \texttt{()}, and producing the number 44. You call this function
by providing the unit value \texttt{()}:
\begin{verbatim}
f44 ()
\end{verbatim}
\mint{haskell}|f44 ()|
Notice that every function of unit is equivalent to picking a single
element from the target type (here, picking the \texttt{Integer} 44). In
@ -394,18 +388,20 @@ element of A to the single element of that singleton set. For every A
there is exactly one such function. Here's this function for
\texttt{Integer}:
\begin{verbatim}
fInt :: Integer -> () fInt x = ()
\end{verbatim}
\begin{minted}{haskell}
fInt :: Integer -> ()
fInt x = ()
\end{minted}
You give it any integer, and it gives you back a unit. In the spirit of
terseness, Haskell lets you use the wildcard pattern, the underscore,
for an argument that is discarded. This way you don't have to invent a
name for it. So the above can be rewritten as:
\begin{verbatim}
fInt :: Integer -> () fInt _ = ()
\end{verbatim}
\begin{minted}{haskell}
fInt :: Integer -> ()
fInt _ = ()
\end{minted}
Notice that the implementation of this function not only doesn't depend
on the value passed to it, but it doesn't even depend on the type of the
@ -417,35 +413,40 @@ such functions with one equation using a type parameter instead of a
concrete type. What should we call a polymorphic function from any type
to unit type? Of course we'll call it \texttt{unit}:
\begin{verbatim}
unit :: a -> () unit _ = ()
\end{verbatim}
\begin{minted}{haskell}
unit :: a -> ()
unit _ = ()
\end{minted}
In C++ you would write this function as:
\begin{verbatim}
template<class T> void unit(T) {}
\end{verbatim}
\begin{minted}{c++}
template<class T>
void unit(T) {}
\end{minted}
Next in the typology of types is a two-element set. In C++ it's called
\texttt{bool} and in Haskell, predictably, \texttt{Bool}. The difference
is that in C++ \texttt{bool} is a built-in type, whereas in Haskell it
can be defined as follows:
\begin{verbatim}
\begin{minted}{haskell}
data Bool = True | False
\end{verbatim}
\end{minted}
(The way to read this definition is that \texttt{Bool} is either
\texttt{True} or \texttt{False}.) In principle, one should also be able
to define a Boolean type in C++ as an enumeration:
\begin{verbatim}
enum bool { true, false };
\end{verbatim}
\begin{minted}{c++}
enum bool {
    true,
    false
};
\end{minted}
but C++ \texttt{enum} is secretly an integer. The C++11
``\texttt{enum\ class}'' could have been used instead, but then you
``\texttt{enum class}'' could have been used instead, but then you
would have to qualify its values with the class name, as in
\texttt{bool::true} and \texttt{bool::false}, not to mention having to
include the appropriate header in every file that uses it.
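For illustration, a hedged sketch of that alternative (\texttt{Boolean} is an illustrative name, since \texttt{bool}, \texttt{true}, and \texttt{false} are reserved words and cannot actually be reused):

\begin{minted}{c++}
// C++11 scoped enumeration: no implicit conversion to int, and the
// values must be qualified with the enumeration name.
enum class Boolean { True, False };

Boolean b = Boolean::True;
\end{minted}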
@ -465,7 +466,6 @@ have the form \texttt{ctype::is(alpha,\ c)},
\subsection{Challenges}\label{challenges}
\begin{enumerate}
\tightlist
\item
Define a higher-order function (or a function object) \texttt{memoize}
in your favorite language (a minimal sketch follows this list). This function takes a pure function
@ -494,17 +494,25 @@ have the form \texttt{ctype::is(alpha,\ c)},
\item
The factorial function from the example in the text.
\item
\begin{verbatim}
\begin{minted}{c++}
std::getchar()
\end{verbatim}
\end{minted}
\item
\begin{verbatim}
bool f() { std::cout << "Hello!" << std::endl; return true; }
\end{verbatim}
\begin{minted}{c++}
bool f() {
    std::cout << "Hello!" << std::endl;
    return true;
}
\end{minted}
\item
\begin{verbatim}
int f(int x) { static int y = 0; y += x; return y; }
\end{verbatim}
\begin{minted}{c++}
int f(int x)
{
    static int y = 0;
    y += x;
    return y;
}
\end{minted}
\end{enumerate}
\item
How many different functions are there from \texttt{Bool} to
@ -514,11 +522,4 @@ int f(int x) { static int y = 0; y += x; return y; }
\texttt{Void}, \texttt{()} (unit), and \texttt{Bool}; with arrows
corresponding to all possible functions between these types. Label the
arrows with the names of the functions.
\end{enumerate}
\subsection{Bibliography}\label{bibliography}
Next:
\href{https://bartoszmilewski.com/2014/12/05/categories-great-and-small/}{Categories
Great and Small}\\
\href{https://twitter.com/BartoszMilewski}{Follow @BartoszMilewski}
\end{enumerate}
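As a hint for the first challenge, here is a minimal C++ sketch, assuming the argument type can serve as a \texttt{std::map} key (all names are illustrative, not from the text):

\begin{minted}{c++}
#include <functional>
#include <map>
#include <memory>

// Wraps a pure function in a lookup table; repeated calls with the
// same argument reuse the cached result instead of recomputing it.
template<class A, class B>
std::function<B(A)> memoize(std::function<B(A)> f)
{
    auto cache = std::make_shared<std::map<A, B>>();
    return [f, cache](A x) {
        auto it = cache->find(x);
        if (it == cache->end())
            it = cache->emplace(x, f(x)).first;
        return it->second;
    };
}
\end{minted}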


@ -49,6 +49,8 @@
% To use Libertine letters and numbers,
% but tx-style operators in math environment:
\usepackage[libertine]{newtxmath}
% Needed to display additional math unicode symbols (like double-colon)
\usepackage{unicode-math}
% Workaround to fix mismatched left and right math delimiters. Taken from:
% http://tex.stackexchange.com/questions/63410/parentheses-differ-xelatex-fontspec-newtxmath-libertine
@ -90,6 +92,16 @@
linkcolor=LinkRed,
urlcolor=LinkRed,
}
\usepackage{subfiles}
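% Redefine \subfile so that \graphicspath points at each subfile's
% directory, letting images be referenced relative to the subfile.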
\makeatletter
\let\org@subfile\subfile
\renewcommand*{\subfile}[1]{%
  \filename@parse{#1}% LaTeX's file name parser
  \expandafter
  \graphicspath\expandafter{\expandafter{\filename@area}}%
  \org@subfile{"#1"}%
}
\makeatother
% Document colors
\definecolor{SchemeLight} {HTML} {686868}


@ -66,7 +66,7 @@ Bartosz Milewski@*
This work is licensed under a Creative Commons@*
Attribution-ShareAlike 4.0 International License@*
(@url{http://creativecommons.org/licenses/by-sa/4.0/, @acronym{CC BY-SA 4.0}}).
Based on a work at @url{https://bartoszmilewski.com/2014/10/28/category-theory-for-programmers-the-preface/, https://bartoszmilewski.com}.
Based on the work at @url{https://bartoszmilewski.com/2014/10/28/category-theory-for-programmers-the-preface/, https://bartoszmilewski.com}.
@sp 1.26
@noindent
@ -81,506 +81,19 @@ based on the work at @url{https://github.com/sarabander/sicp-pdf}.
@unnumbered Preface
@node Preface
@c @sp -0.6
@quotation
For some time now I've been floating the idea of writing a book about category theory that would be targeted at programmers.
Mind you, not computer scientists but programmers -- engineers rather than scientists. I know this sounds crazy and I am properly scared.
I can't deny that there is a huge gap between science and engineering because I have worked on both sides of the divide. But I've always
felt a very strong compulsion to explain things. I have tremendous admiration for Richard Feynman who was the master of simple explanations.
I know I'm no Feynman, but I will try my best. I'm starting by publishing this preface --- which is supposed to motivate the reader to learn
category theory --- in hopes of starting a discussion and soliciting feedback.
@end quotation
@c @sp 0.8
@noindent
\lettrine[lhang=0.17]{I}{will attempt}, in the space of a few paragraphs, to convince you that this book is written for you, and whatever
objections you might have to learning one of the most abstract branches of mathematics in your “copious spare time” are totally unfounded.
My optimism is based on several observations. First, category theory is a treasure trove of extremely useful programming ideas.
Haskell programmers have been tapping this resource for a long time, and the ideas are slowly percolating into other languages, but this
process is too slow. We need to speed it up.
Second, there are many different kinds of math, and they appeal to different audiences. You might be allergic to calculus or algebra,
but it doesn't mean you won't enjoy category theory. I would go as far as to argue that category theory is the kind of math that is
particularly well suited for the minds of programmers. That's because category theory --- rather than dealing with particulars --- deals
with structure. It deals with the kind of structure that makes programs composable.
Composition is at the very root of category theory -- it's part of the definition of the category itself. And I will argue strongly that
composition is the essence of programming. We've been composing things forever, long before some great engineer came up with the idea of
a subroutine. Some time ago the principles of structural programming revolutionized programming because they made blocks of code composable.
Then came object oriented programming, which is all about composing objects. Functional programming is not only about composing functions
and algebraic data structures -- it makes concurrency composable -- something that's virtually impossible with other programming paradigms.
Third, I have a secret weapon, a butcher's knife, with which I will butcher math to make it more palatable to programmers. When you're a
professional mathematician, you have to be very careful to get all your assumptions straight, qualify every statement properly, and construct
all your proofs rigorously. This makes mathematical papers and books extremely hard to read for an outsider. I'm a physicist by training,
and in physics we made amazing advances using informal reasoning. Mathematicians laughed at the Dirac delta function, which was made up on
the spot by the great physicist P. A. M. Dirac to solve some differential equations. They stopped laughing when they discovered a completely
new branch of calculus called distribution theory that formalized Dirac's insights.
Of course when using hand-waving arguments you run the risk of saying something blatantly wrong, so I will try to make sure that there is solid
mathematical theory behind informal arguments in this book. I do have a worn-out copy of Saunders Mac Lane's @emph{Category Theory for the Working
Mathematician} on my nightstand.
Since this is category theory @emph{for programmers}, I will illustrate all major concepts using computer code. You are probably aware that functional
languages are closer to math than the more popular imperative languages. They also offer more abstracting power. So a natural temptation would be
to say: You must learn Haskell before the bounty of category theory becomes available to you. But that would imply that category theory has no
application outside of functional programming and that's simply not true. So I will provide a lot of C++ examples. Granted, you'll have to overcome
some ugly syntax, the patterns might not stand out from the background of verbosity, and you might be forced to do some copy and paste in lieu of higher
abstraction, but that's just the lot of a C++ programmer.
But you're not off the hook as far as Haskell is concerned. You don't have to become a Haskell programmer, but you need it as a language for sketching
and documenting ideas to be implemented in C++. That's exactly how I got started with Haskell. I found its terse syntax and powerful type system a great
help in understanding and implementing C++ templates, data structures, and algorithms. But since I can't expect the readers to already know Haskell, I
will introduce it slowly and explain everything as I go.
If you're an experienced programmer, you might be asking yourself: I've been coding for so long without worrying about category theory or functional methods,
so what's changed? Surely you can't help but notice that there's been a steady stream of new functional features invading imperative languages. Even Java,
the bastion of object-oriented programming, let the lambdas in. C++ has recently been evolving at a frantic pace --- a new standard every few years --- trying
to catch up with the changing world. All this activity is in preparation for a disruptive change or, as we physicists call it, a phase transition. If you keep
heating water, it will eventually start boiling. We are now in the position of a frog that must decide if it should continue swimming in increasingly hot water,
or start looking for some alternatives.
\begin{figure}
\centering
\fbox{\includegraphics[width=90mm]{fig/frog}}
\end{figure}
One of the forces that are driving the big change is the multicore revolution. The prevailing programming paradigm, object oriented programming, doesn't
buy you anything in the realm of concurrency and parallelism, and instead encourages dangerous and buggy design. Data hiding, the basic premise of object
orientation, when combined with sharing and mutation, becomes a recipe for data races. The idea of combining a mutex with the data it protects is nice but,
unfortunately, locks don't compose, and lock hiding makes deadlocks more likely and harder to debug.
But even in the absence of concurrency, the growing complexity of software systems is testing the limits of scalability of the imperative paradigm. To put it simply,
side effects are getting out of hand. Granted, functions that have side effects are often convenient and easy to write. Their effects can in principle be encoded in
their names and in the comments. A function called @code{SetPassword} or @code{WriteFile} is obviously mutating some state and generating side effects, and we are
used to dealing with that. It's only when we start composing functions that have side effects on top of other functions that have side effects, and so on, that things
start getting hairy. It's not that side effects are inherently bad -- it's the fact that they are hidden from view that makes them impossible to manage at larger scales.
Side effects don't scale, and imperative programming is all about side effects.
Changes in hardware and the growing complexity of software are forcing us to rethink the foundations of programming. Just like the builders of Europe's great gothic
cathedrals, we've been honing our craft to the limits of material and structure. There is an unfinished gothic @url{https://en.wikipedia.org/wiki/Beauvais_Cathedral, cathedral in Beauvais},
France, that stands witness to this deeply human struggle with limitations. It was intended to beat all previous records of height and lightness, but it suffered a
series of collapses. Ad hoc measures like iron rods and wooden supports keep it from disintegrating, but obviously a lot of things went wrong. From a modern perspective, it's a miracle
that so many gothic structures had been successfully completed without the help of modern material science, computer modelling, finite element analysis, and general
math and physics. I hope future generations will be as admiring of the programming skills we've been displaying in building complex operating systems, web servers,
and the internet infrastructure. And, frankly, they should, because we've done all this based on very flimsy theoretical foundations. We have to fix those foundations
if we want to move forward.
\begin{figure}
\centering
\includegraphics[totalheight=8cm]{fig/beauvais_interior_supports.jpg}
\captionsetup{labelformat=empty,font=scriptsize}
\caption{Ad hoc measures preventing the Beauvais cathedral from collapsing.}
\end{figure}
\subfile{content/0.0/Preface}
\mainmatter
@chapter Category: The Essence of Composition
@node Category: The Essence of Composition, Types and Functions, Top
@noindent
\lettrine[lhang=0.17]{A}{category} is an embarrassingly simple concept. A category consists of @newterm{objects} and @newterm{arrows}
that go between them. That's why categories are so easy to represent pictorially. An object can be drawn as a circle or a point, and
an arrow\textellipsis\ is an arrow. (Just for variety, I will occasionally draw objects as piggies and arrows as fireworks.) But the essence of
a category is @newterm{composition}. Or, if you prefer, the essence of composition is a category. Arrows compose, so if you have an arrow
from object A to object B, and another arrow from object B to object C, then there must be an arrow --- their composition --- that goes from A to C.
@subsubheading Arrows as Functions
Is this already too much abstract nonsense? Do not despair. Let's talk concretes. Think of arrows, which are also called @newterm{morphisms},
as functions. You have a function @code{f} that takes an argument of type A and returns a B. You have another function @code{g} that takes a B and
returns a C. You can compose them by passing the result of @code{f} to @code{g}. You have just defined a new function that takes an A and returns a C.
\begin{figure}
\centering
\includegraphics[width=\textwidth]{fig/chap1/fig1.1.jpg}
\captionsetup{labelformat=empty,font=scriptsize}
\caption{In a category, if there is an arrow going from A to B and an arrow going from B to C then there must also be a direct arrow from A to C that is their composition. This diagram is not a full category because it's missing identity morphisms (see later).}
\end{figure}
@sp 1.0
In math, such composition is denoted by a small circle between functions: @math{g \circ f}. Notice the right to left order of composition. For some
people this is confusing. You may be familiar with the pipe notation in Unix, as in:
\begin{minted}{sh}
lsof | grep Chrome
\end{minted}
@noindent or the chevron @code{>>} in F\#, which both go from left to right. But in mathematics and in Haskell functions compose right to left. It helps if you read @math{g \circ f} as “g after f.”
\pagebreak
Let's make this even more explicit by writing some C code. We have one function @code{f} that takes an argument of type @code{A} and returns a value of type @code{B}:
\mint{c}|B f(A a);|
@noindent
and another:
\mint{c}|C g(B b);|
@noindent
Their composition is:
\begin{minted}{c}
C g_after_f(A a)
{
    return g(f(a));
}
\end{minted}
@noindent
Here, again, you see right-to-left composition: @code{g(f(a))}; this time in C.
@sp 1.0
@noindent
I wish I could tell you that there is a template in the C++ Standard Library that takes two functions and returns their composition, but there isn't one. So let's try some
Haskell for a change. Here's the declaration of a function from @code{A} to @code{B}:
\mint{haskell}|f :: A -> B|
@noindent
Similarly:
\mint{haskell}|g :: B -> C|
@noindent
Their composition is:
@comment todo fix the highlighting
\mint{text}|g . f|
@noindent
Once you see how simple things are in Haskell, the inability to express straightforward functional concepts in C++ is a little embarrassing. In fact, Haskell will
let you use Unicode characters so you can write composition as:
\mint{text}|g ◦ f|
@noindent
You can even use Unicode double colons and arrows:
\mint{text}|f ∷ A → B|
So here's the first Haskell lesson: Double colon means "has the type of\textellipsis." A function type is created by inserting an arrow between two types. You compose two functions
by inserting a period between them (or a Unicode circle).
@subsubheading Properties of Composition
There are two extremely important properties that the composition in any category must satisfy.
@sp 1.0
@enumerate 1
@item
Composition is associative. If you have three morphisms, @code{f}, @code{g}, and @code{h}, that can be composed (that is, their objects match end-to-end), you don't need parentheses
to compose them. In math notation this is expressed as:
\mint{text}|h◦(g◦f) = (h◦g)◦f = h◦g◦f|
@noindent
In (pseudo) Haskell:
\begin{minted}{haskell}
f :: A -> B
g :: B -> C
h :: C -> D
h . (g . f) == (h . g) . f == h . g . f
\end{minted}
@noindent
(I said "pseudo," because equality is not defined for functions.)
@sp 1.0
@noindent
Associativity is pretty obvious when dealing with functions, but it may be not as obvious in other categories.
@sp 1.0
@item
For every object A there is an arrow which is a unit of composition. This arrow loops from the object to itself. Being a unit of composition means that,
when composed with any arrow that either starts at A or ends at A, respectively, it gives back the same arrow. The unit arrow for object @code{A} is
called @code{id\textsubscript{A}} (@newterm{identity} on A). In math notation, if @code{f} goes from @code{A} to @code{B} then
@sp 1.0
@noindent @code{f◦id\textsubscript{A} = f}
@sp 1.0
@noindent and
@sp 1.0
@noindent @code{id\textsubscript{B}◦f = f}
@sp 1.0
@end enumerate
@noindent
When dealing with functions, the identity arrow is implemented as the identity function that just returns back its argument. The implementation is the same for every type,
which means this function is universally polymorphic. In C++ we could define it as a template:
\mint{c++}|template<class T> T id(T x) { return x; }|
@noindent
Of course, in C++ nothing is that simple, because you have to take into account not only what you're passing but also how (that is, by value, by reference, by const reference,
by move, and so on).
In Haskell, the identity function is part of the standard library (called Prelude). Here's its declaration and definition:
\begin{minted}{haskell}
id :: a -> a
id x = x
\end{minted}
@noindent
As you can see, polymorphic functions in Haskell are a piece of cake. In the declaration, you just replace the type with a type variable. Here's the trick: names of concrete
types always start with a capital letter, names of type variables start with a lowercase letter. So here @code{a} stands for all types.
Haskell function definitions consist of the name of the function followed by formal parameters --- here just one, @code{x}. The body of the function follows the equal sign.
This terseness is often shocking to newcomers but you will quickly see that it makes perfect sense. Function definition and function call are the bread and butter of functional
programming so their syntax is reduced to the bare minimum. Not only are there no parentheses around the argument list but there are no commas between arguments (you'll see that
later, when we define functions of multiple arguments).
The body of a function is always an expression -- there are no statements in functions. The result of a function is this expression -- here, just @code{x}.
This concludes our second Haskell lesson.
\pagebreak
@noindent
The identity conditions can be written (again, in pseudo-Haskell) as:
\begin{minted}{text}
f . id == f
id . f == f
\end{minted}
@noindent
You might be asking yourself the question: Why would anyone bother with the identity function -- a function that does nothing? Then again, why do we bother with the number zero?
Zero is a symbol for nothing. Ancient Romans had a number system without a zero and they were able to build excellent roads and aqueducts, some of which survive to this day.
Neutral values like zero or id are extremely useful when working with symbolic variables. That's why Romans were not very good at algebra, whereas the Arabs and the Persians,
who were familiar with the concept of zero, were. So the identity function becomes very handy as an argument to, or a return from, a higher-order function. Higher order functions
are what make symbolic manipulation of functions possible. They are the algebra of functions.
To summarize: A category consists of objects and arrows (morphisms). Arrows can be composed, and the composition is associative. Every object has an identity arrow that serves as
a unit under composition.
@subsubheading Composition is the Essence of Programming
Functional programmers have a peculiar way of approaching problems. They start by asking very Zen-like questions. For instance, when designing an interactive program, they would ask:
What is interaction? When implementing Conway's Game of Life, they would probably ponder about the meaning of life. In this spirit, I'm going to ask: What is programming? At the most
basic level, programming is about telling the computer what to do. “Take the contents of memory address x and add it to the contents of the register EAX.” But even when we program in
assembly, the instructions we give the computer are an expression of something more meaningful. We are solving a non-trivial problem (if it were trivial, we wouldn't need the help of
the computer). And how do we solve problems? We decompose bigger problems into smaller problems. If the smaller problems are still too big, we decompose them further, and so on.
Finally, we write code that solves all the small problems. And then comes the essence of programming: we compose those pieces of code to create solutions to larger problems.
Decomposition wouldn't make sense if we weren't able to put the pieces back together.
This process of hierarchical decomposition and recomposition is not imposed on us by computers. It reflects the limitations of the human mind. Our brains can only deal with a small
number of concepts at a time. One of the most cited papers in psychology, \href{http://en.wikipedia.org/wiki/The_Magical_Number_Seven,_Plus_or_Minus_Two}{The Magical Number Seven, Plus or Minus Two},
postulated that we can only keep @math{7 \pm 2} “chunks” of information in our minds. The details of our understanding of the human short-term memory might be changing, but we know for
sure that it's limited. The bottom line is that we are unable to deal with the soup of objects or the spaghetti of code. We need structure not because well-structured programs are
pleasant to look at, but because otherwise our brains can't process them efficiently. We often describe some piece of code as elegant or beautiful, but what we really mean is that
it's easy to process by our limited human minds. Elegant code creates chunks that are just the right size and come in just the right number for our mental digestive system to assimilate them.
So what are the right chunks for the composition of programs? Their surface area has to increase slower than their volume. (I like this analogy because of the intuition that the surface
area of a geometric object grows with the square of its size -- slower than the volume, which grows with the cube of its size.) The surface area is the information we need in order to
compose chunks. The volume is the information we need in order to implement them. The idea is that, once a chunk is implemented, we can forget about the details of its implementation
and concentrate on how it interacts with other chunks. In object-oriented programming, the surface is the class declaration of the object, or its abstract interface. In functional programming,
it's the declaration of a function. (I'm simplifying things a bit, but that's the gist of it.)
Category theory is extreme in the sense that it actively discourages us from looking inside the objects. An object in category theory is an abstract nebulous entity. All you can ever know
about it is how it relates to other objects -- how it connects with them using arrows. This is how internet search engines rank web sites by analyzing incoming and outgoing links (except when
they cheat). In object-oriented programming, an idealized object is only visible through its abstract interface (pure surface, no volume), with methods playing the role of arrows. The moment
you have to dig into the implementation of the object in order to understand how to compose it with other objects, you've lost the advantages of your programming paradigm.
@subsubheading Challenges
@enumerate 1
@item Implement, as best as you can, the identity function in your favorite language (or the second favorite, if your favorite language happens to be Haskell).
@item Implement the composition function in your favorite language. It takes two functions as arguments and returns a function that is their composition.
@item Write a program that tries to test that your composition function respects identity.
@item Is the world-wide web a category in any sense? Are links morphisms?
@item Is Facebook a category, with people as objects and friendships as morphisms?
@item When is a directed graph a category?
@end enumerate
\subfile{content/1.1/Category - The Essence of Composition}
@chapter Types and Functions
@node Types and Functions, Categories Great and Small, Top
@noindent
\lettrine[lhang=0.17]{T}{he category of types and functions} plays an important role in programming, so let's talk about what types are and why we need them.
@subsubheading Who Needs Types?
There seems to be some controversy about the advantages of static vs. dynamic and strong vs. weak typing. Let me illustrate these choices with a thought experiment.
Imagine millions of monkeys at computer keyboards happily hitting random keys, producing programs, compiling, and running them.
\begin{center}
\fbox{@image{fig/chap2/monkey_typewriter, 60mm,,,.jpg}}
\end{center}
@noindent
With machine language, any combination of bytes produced by monkeys would be accepted and run. But with higher level languages, we do appreciate the fact that a compiler
is able to detect lexical and grammatical errors. Lots of monkeys will go without bananas, but the remaining programs will have a better chance of being useful. Type checking
provides yet another barrier against nonsensical programs. Moreover, whereas in a dynamically typed language, type mismatches would be discovered at runtime, in strongly typed
statically checked languages type mismatches are discovered at compile time, eliminating lots of incorrect programs before they have a chance to run.
So the question is, do we want to make monkeys happy, or do we want to produce correct programs?
The usual goal in the typing monkeys thought experiment is the production of the complete works of Shakespeare. Having a spell checker and a grammar checker in the loop would
drastically increase the odds. The analog of a type checker would go even further by making sure that, once Romeo is declared a human being, he doesn't sprout leaves or trap
photons in his powerful gravitational field.
@subsubheading Types Are About Composability
Category theory is about composing arrows. But not any two arrows can be composed. The target object of one arrow must be the same as the source object of the next arrow.
In programming we pass the results of one function to another. The program will not work if the target function is not able to correctly interpret the data produced by the source
function. The two ends must fit for the composition to work. The stronger the type system of the language, the better this match can be described and mechanically verified.
The only serious argument I hear against strong static type checking is that it might eliminate some programs that are semantically correct. In practice, this happens extremely
rarely and, in any case, every language provides some kind of a backdoor to bypass the type system when that's really necessary. Even Haskell has @code{unsafeCoerce}. But such
devices should be used judiciously. Franz Kafka's character, Gregor Samsa, breaks the type system when he metamorphoses into a giant bug, and we all know how it ends.
Another argument I hear a lot is that dealing with types imposes too much burden on the programmer. I could sympathize with this sentiment after having to write a few declarations
of iterators in C++ myself, except that there is a technology called @newterm{type inference} that lets the compiler deduce most of the types from the context in which they are used.
In C++, you can now declare a variable @code{auto} and let the compiler figure out its type.
In Haskell, except on rare occasions, type annotations are purely optional. Programmers tend to use them anyway, because they can tell a lot about the semantics of code, and they make
compilation errors easier to understand. It's a common practice in Haskell to start a project by designing the types. Later, type annotations drive the implementation and become
compiler-enforced comments.
Strong static typing is often used as an excuse for not testing the code. You may sometimes hear Haskell programmers saying, “If it compiles, it must be correct.” Of course, there is
no guarantee that a type-correct program is correct in the sense of producing the right output. The result of this cavalier attitude is that in several studies Haskell didn't come as
strongly ahead of the pack in code quality as one would expect. It seems that, in the commercial setting, the pressure to fix bugs is applied only up to a certain quality level, which
has everything to do with the economics of software development and the tolerance of the end user, and very little to do with the programming language or methodology. A better criterion
would be to measure how many projects fall behind schedule or are delivered with drastically reduced functionality.
As for the argument that unit testing can replace strong typing, consider the common refactoring practice in strongly typed languages: changing the type of an argument of a particular
function. In a strongly typed language, it's enough to modify the declaration of that function and then fix all the build breaks. In a weakly typed language, the fact that a function
now expects different data cannot be propagated to call sites. Unit testing may catch some of the mismatches, but testing is almost always a probabilistic rather than a deterministic
process. Testing is a poor substitute for proof.
@subsubheading What Are Types?
The simplest intuition for types is that they are sets of values. The type @code{Bool} (remember, concrete types start with a capital letter in Haskell) is a two-element set of @code{True}
and @code{False}. Type @code{Char} is a set of all Unicode characters like @code{'a'} or @code{'ą'}.
Sets can be finite or infinite. The type of @code{String}, which is a synonym for a list of @code{Char}, is an example of an infinite set.
When we declare @code{x} to be an @code{Integer}:
\mint{haskell}|x :: Integer|
@noindent
we are saying that it's an element of the set of integers. @code{Integer} in Haskell is an infinite set, and it can be used to do arbitrary precision arithmetic. There is also a finite set
@code{Int} that corresponds to the machine type, just like the C++ @code{int}.
There are some subtleties that make this identification of types and sets tricky. There are problems with polymorphic functions that involve circular definitions, and with the fact that you
can't have a set of all sets; but as I promised, I won't be a stickler for math. The great thing is that there is a category of sets, which is called @b{Set}, and we'll just work with it.
In @b{Set}, objects are sets and morphisms (arrows) are functions.
@b{Set} is a very special category, because we can actually peek inside its objects and get a lot of intuitions from doing that. For instance, we know that an empty set has no elements.
We know that there are special one-element sets. We know that functions map elements of one set to elements of another set. They can map two elements to one, but not one element to two.
We know that an identity function maps each element of a set to itself, and so on. The plan is to gradually forget all this information and instead express all those notions in purely
categorical terms, that is in terms of objects and arrows.
In the ideal world we would just say that Haskell types are sets and Haskell functions are mathematical functions between sets. There is just one little problem: A mathematical function
does not execute any code -- it just knows the answer. A Haskell function has to calculate the answer. It's not a problem if the answer can be obtained in a finite number of steps -- however
big that number might be. But there are some calculations that involve recursion, and those might never terminate. We can't just ban non-terminating functions from Haskell because distinguishing
between terminating and non-terminating functions is undecidable -- the famous halting problem. That's why computer scientists came up with a brilliant idea, or a major hack, depending
on your point of view, to extend every type by one more special value called the bottom and denoted by @code{\_|\_}, or Unicode @math{\bot}. This “value” corresponds to a non-terminating
computation. So a function declared as:
\mint{haskell}|f :: Bool -> Bool|
@noindent
may return @code{True}, @code{False}, or @code{\_|\_}; the latter meaning that it would never terminate.
Interestingly, once you accept the bottom as part of the type system, it is convenient to treat every runtime error as a bottom, and even allow functions to return the bottom explicitly.
The latter is usually done using the expression @code{undefined}, as in:
\begin{minted}{haskell}
f :: Bool -> Bool
f x = undefined
\end{minted}
@noindent
This definition type checks because @code{undefined} evaluates to bottom, which is a member of any type, including @code{Bool}. You can even write:
\begin{minted}{haskell}
f :: Bool -> Bool
f = undefined
\end{minted}
@noindent
(without the @code{x}) because the bottom is also a member of the type @code{Bool->Bool}.
Functions that may return bottom are called partial, as opposed to total functions, which return valid results for every possible argument.
Because of the bottom, you'll see the category of Haskell types and functions referred to as @b{Hask} rather than @b{Set}. From the theoretical point of view, this is the source of
never-ending complications, so at this point I will use my butcher's knife and terminate this line of reasoning. From the pragmatic point of view, it's okay to ignore non-terminating
functions and bottoms, and treat @b{Hask} as bona fide @b{Set}\footnote{Nils Anders Danielsson, John Hughes, Patrik Jansson, Jeremy Gibbons, @url{http://www.cs.ox.ac.uk/jeremy.gibbons/publications/fast+loose.pdf, Fast and Loose Reasoning is Morally Correct}. This paper provides justification for ignoring bottoms in most contexts.}.
@subsubheading Why Do We Need a Mathematical Model?
As a programmer you are intimately familiar with the syntax and grammar of your programming language. These aspects of the language are usually described using formal notation at the
very beginning of the language spec. But the meaning, or semantics, of the language is much harder to describe; it takes many more pages, is rarely formal enough, and almost never complete.
Hence the never-ending discussions among language lawyers, and a whole cottage industry of books dedicated to the exegesis of the finer points of language standards.
There are formal tools for describing the semantics of a language but, because of their complexity, they are mostly used with simplified academic languages, not real-life programming
behemoths. One such tool called @newterm{operational semantics} describes the mechanics of program execution. It defines a formalized idealized interpreter. The semantics of industrial
languages, such as C++, is usually described using informal operational reasoning, often in terms of an “abstract machine.”
The problem is that it's very hard to prove things about programs using operational semantics. To show a property of a program you essentially have to “run it” through the idealized interpreter.
It doesn't matter that programmers never perform formal proofs of correctness. We always “think” that we write correct programs. Nobody sits at the keyboard saying, “Oh, I'll just throw a
few lines of code and see what happens.” We think that the code we write will perform certain actions that will produce desired results. We are usually quite surprised when it doesn't.
That means we do reason about programs we write, and we usually do it by running an interpreter in our heads. It's just really hard to keep track of all the variables. Computers are good
at running programs -- humans are not! If we were, we wouldn't need computers.
But there is an alternative. It's called @newterm{denotational semantics} and it's based on math. In denotational semantics every programming construct is given its mathematical interpretation.
With that, if you want to prove a property of a program, you just prove a mathematical theorem. You might think that theorem proving is hard, but the fact is that we humans have been building
up mathematical methods for thousands of years, so there is a wealth of accumulated knowledge to tap into. Also, as compared to the kind of theorems that professional mathematicians prove,
the problems that we encounter in programming are usually quite simple, if not trivial.
Consider the definition of a factorial function in Haskell, which is a language quite amenable to denotational semantics:
\mint{haskell}|fact n = product [1..n]|
@noindent
The expression @code{[1..n]} is a list of integers from @code{1} to @code{n}. The function @code{product} multiplies all elements of a list. That's just like a definition of factorial
taken from a math text. Compare this with C:
\begin{minted}{c}
int fact(int n) {
    int i;
    int result = 1;
    for (i = 2; i <= n; ++i)
        result *= i;
    return result;
}
\end{minted}
@noindent
Need I say more?
Okay, I'll be the first to admit that this was a cheap shot! A factorial function has an obvious mathematical denotation. An astute reader might ask: What's the mathematical model for reading
a character from the keyboard or sending a packet across the network? For the longest time that would have been an awkward question leading to a rather convoluted explanation.
It seemed like denotational semantics wasn't the best fit for a considerable number of important tasks that were essential for writing useful programs, and which could be easily tackled
by operational semantics. The breakthrough came from category theory. Eugenio Moggi discovered that computational effect can be mapped to monads. This turned out to be an important
observation that not only gave denotational semantics a new lease on life and made pure functional programs more usable, but also shed new light on traditional programming. I'll talk
about monads later, when we develop more categorical tools.
One of the important advantages of having a mathematical model for programming is that it's possible to perform formal proofs of correctness of software. This might not seem so important
when you're writing consumer software, but there are areas of programming where the price of failure may be exorbitant, or where human life is at stake. But even when writing web applications
for the health system, you may appreciate the thought that functions and algorithms from the Haskell standard library come with proofs of correctness.
@subsubheading Pure and Dirty Functions
The things we call functions in C++, or any other imperative language, are not the same things mathematicians call functions. A mathematical function is just a mapping of values to values.
\subfile{content/1.2/Types and Functions}