Sets
---
Perhaps unsurprisingly, everything in set theory is defined in terms of sets. A set is a collection of things, where the "things" can be anything you want (individuals, populations, genes, etc.). Consider, for example, these balls.
![Balls](../01_set/elements.svg)
Let's construct a set, call it $G$ (as in gray), that contains *all* of them as elements. There can only be one such set: because a set has no structure (there is no order, no ball comes before or after another, and no members are "special" with respect to their membership in the set), two sets that contain the same elements are just two pictures of the same set.
![The set of all balls](../01_set/all.svg)
This example may look overly simple, but in fact it is just as valid as any other.
Subsets
---
Let's construct one more set: the set of *all balls that have a warm color*. Let's call it $Y$ (because in the diagram it is colored in **y**ellow).
![The set of all balls of warm colors](../01_set/subset.svg)
Notice that $Y$ contains only elements that are also present in $G$. That is, every element of the set $Y$ is also an element of the set $G$. When two sets have this relation, we say that $Y$ is a *subset* of $G$ (or $Y \subseteq G$). When the two are drawn together, a subset resides completely inside its superset.
![Y and G together](../01_set/set_subset.svg)
Singleton Sets
---
The set of all *red balls* contains just one ball. We said above that sets summarize *several* elements into one. Still, sets that contain just one element are perfectly valid - simply put, there are things that are *one of a kind*. The set of queens of England is a singleton set. The set of books written by the American writer Harper Lee and published during her lifetime is a singleton set.
![The singleton set of red balls](../01_set/singleton.svg)
What's the point of the singleton set? Well, it is part of the language of set theory - e.g. if we have a function that expects a set of items, but only one item meets the criteria, we can just wrap that item in a singleton set.
The Empty set
---
Of course, if one is a valid answer, so is zero. If we ask for the set of all *black balls*, $B$, or of all *white balls*, $W$, the answer to both questions is the same - the empty set.
![The empty set](../01_set/void.svg)
Because a set is defined only by the items it contains, the empty set is *unique* - there is no difference between the set that contains zero *balls* and the set that contains zero *numbers*, for instance. Formally, the empty set is marked with the symbol $\varnothing$ (so $B = W = \varnothing$).
Functions
---
Here is a function $f$, which converts each ball from the set $R$ to the ball with the opposite color in another set $G$ (in mathematics, a function's name is often accompanied by the names of its source and target sets, like this: $f: R → G$).
![Opposite colors](../01_set/function_one_one.svg)
This is probably one of the simplest types of functions that exist - one that encodes a *one-to-one relationship* between the sets: *one* element from the source is connected to exactly *one* element from the target (and the other way around).
But functions can also express relationships of the type *many-to-one*, where *many* elements from the source might be connected to *one* element from the target (but not the other way around). For example, a function can express a relationship in which several elements from the source set relate to the same element of the target set.
![Function from a bigger set to a smaller one](../01_set/function_big_small.svg)
Such functions might represent operations such as *categorizing* a given collection of objects by some criteria, or *partitioning* them based on some property that they have.
A function can also express relationships in which some elements from the target set do not play a part.
![Function from a smaller set to a bigger one](../01_set/function_small_big.svg)
An example might be the relationship between some kind of pattern or structure and the emergence of this pattern in some more complicated context.
The Identity Function
---
For every set $G$, no matter what it represents, we can define the function that does nothing, or in other words, a function that maps every element of $G$ to itself. It is called *the identity function* of $G$, written $id_G: G → G$.
![The identity function](../01_set/function_identity.svg)
You can think of $id_G$ as a function that represents the set $G$ in the realm of functions. Its existence allows us to formally prove many theorems that we "know" by intuition.
Functions and Subsets
---
For each set and subset, no matter what they represent, we can define the function that maps each element of the subset to itself:
![Function from a smaller set to a bigger one](../01_set/function_small_big.svg)
Every set is a subset of itself, in which case this function is the same as the identity.
Functions and the Empty Set
---
There is a unique function from the empty set to any other set.
![Function with empty set](../01_set/function_empty.svg)
**Question:** Is this really valid? Why? Check the definition.
Functions and Singleton Sets
---
There is a unique function from any set to any singleton set.
![Function with a singleton set](../01_set/function_singleton.svg)
**Question:** Is this really the only way to connect *any* set to a singleton set in a valid way?
Number functions
---
Each numerical operation is a function between two of these sets. For example, squaring a number is a function from the set of real numbers to the set of non-negative real numbers (because both sets are infinite, we cannot draw them in their entirety, but we can draw a part of them).
![The square function](../01_set/square.svg)
I will use the occasion to reiterate some of the more important characteristics of functions:
Sets and types
---
Sets are not exactly the same thing as types, but all types are (or can be seen as) sets. For example, we can view the `Boolean` type as a set containing two elements - `true` and `false`.
![Set of boolean values](../01_set/boolean.svg)
Another very basic set in programming is the set of keyboard characters, or `Char`. Characters are rarely used by themselves - mostly they appear as parts of sequences (strings).
![Set of characters](../01_set/char.svg)
Most of the types in programming are composite types - they are combinations of the primitive ones listed here. Again, we will cover these later.
Functions and methods/subroutines
---
Some functions in programming (also called methods, subroutines, etc.) kinda resemble mathematical functions - they sometimes take one value of a given type (or in other words, an element that belongs to a given set) and always return exactly one element which belongs to another type (or set). For example, here is a function that takes an argument of type `Char` and returns a `Boolean`, depending on whether the character is a letter.
![A function from Char to Boolean](../01_set/char_boolean.svg)
However, functions in most programming languages can also be quite different from mathematical functions - they can perform various operations that have nothing to do with returning a value, which are sometimes called side effects.
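To make this concrete, here is a minimal sketch in TypeScript (the name `isLetter` and the use of a one-character string in place of a dedicated `Char` type are our own illustration):
```typescript
// A function resembling the mathematical one: Char -> Boolean.
// TypeScript has no separate Char type, so we use a one-character string.
const isLetter = (c: string): boolean => /[a-zA-Z]/.test(c);

console.log(isLetter("a")); // true
console.log(isLetter("1")); // false

// Unlike a mathematical function, a programming "function" may also
// perform side effects (e.g. logging) besides returning a value.
const isLetterNoisy = (c: string): boolean => {
  console.log(`checking ${c}`); // side effect: writes to the console
  return /[a-zA-Z]/.test(c);
};
```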
Russell's paradox
---
In order to understand type theory better, it's useful to see why it was created.
In particular, a set can contain itself.
![A set that contains itself](../01_set/set_contains_itself.svg)
Unlike the set above, most sets that we discussed (like the empty set and singleton sets) do not contain themselves.
![Sets that don't contain themselves](../01_set/sets_dont_contain_themselves.svg)
In order to understand Russell's paradox, we will try to visualize *the set of all sets that do not contain themselves*. In the original set notation, we can define this set to be such that it contains all sets $x$ such that $x$ is not a member of $x$, or $\{x \mid x ∉ x \}$.
![Russell's paradox - option one](../01_set/russells_paradox.svg)
If we look at the definition, we recognize that the set that we just defined does not contain itself, and therefore it must belong to the set as well (i.e. to itself).
![Russell's paradox - option two](../01_set/russells_paradox_2.svg)
Hmm, something is not quite right with this diagram as well - because of the new adjustments that we made, our set now *contains itself*. And removing it from the set would just bring us back to the previous situation. So this is Russell's paradox.
Functional Composition
---
Now we reach the heart of the matter regarding functions: functional composition. Assume that we have two functions, $g: Y → P$ and $f: P → G$, where the target of the first one is the same set as the source of the second one.
![Matching functions](../01_set/functions_matching.svg)
If we apply the first function $g$ to some element from set $Y$, we will get an element of the set $P$. Then, if we apply the second function $f$ to *that* element, we will get an element of the set $G$.
![Applying one function after another](../01_set/functions_one_after_another.svg)
We can define a function that is equivalent to performing the operation described above. Let us call it $h: Y → G$. We may say that $h$ is the *composition* of $g$ and $f$, or $h = f \bullet g$ (notice that the function applied first is on the right, so it's similar to $b = f(g(a))$).
![Functional composition](../01_set/functions_compose.svg)
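In a programming language, composition can be sketched as a higher-order function. Here is a minimal TypeScript version (the helper `compose` and the example functions are our own illustration):
```typescript
// Compose two functions: apply g first, then f (right-to-left order).
const compose = <A, B, C>(f: (b: B) => C, g: (a: A) => B) =>
  (a: A): C => f(g(a));

// h = f . g
const g = (s: string): number => s.length; // g: string -> number
const f = (n: number): boolean => n > 3;   // f: number -> boolean
const h = compose(f, g);                   // h: string -> boolean

console.log(h("ball")); // true, because "ball".length === 4 > 3
```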
Composition is the essence of all things categorical. The key insight is that the sum of two parts is no more complex than the parts themselves.
If we have a function $g: P → Y$ from set $P$ to set $Y$, then for every function $f$ from the set $Y$ to any other set, there is a corresponding function $f \bullet g$ from the set $P$ to the same set. In other words, every time you define a new function from $Y$ to some other set, you gain one function from $P$ to that same set for free.
![Functional composition connect](../01_set/morphism_general.svg)
For example, if we again take the relationship between a person and their mother as a function, with the set of all people in the world as source and the set of all people that have children as target, composing this function with other similar functions would give us all relatives on a person's mother's side.
Representing Composition with Commutative Diagrams
---
In the last diagram, the equivalence between $f \bullet g$ and the new function $h$ is expressed by the fact that if you follow the arrow $h$ for any element of set $Y$, you will get to the same element of the set $G$ as you would if you followed $g$ and then followed $f$. Diagrams that express such equivalence between sequences of function applications are called *commutative diagrams*.
![Functional composition](../01_set/functions_compose.svg)
If we "zoom-out" the view of the last diagram so it does not show the individual set elements, we get a more general view of functional composition.
![Functional composition for sets](../01_set/functions_compose_sets.svg)
In fact, because this diagram commutes (that is, all arrows, starting from a given set element ultimately lead to the same corresponding element from the resulting set), this view is a more appropriate representation of the concept (as enumerating the elements is redundant).
Having this insight allows us to redefine functional composition in a more visual way:
> The composition of two functions $f$ and $g$ is a third function $h$ defined in such a way that this diagram commutes.
![Functional composition - general definition](../01_set/functions_compose_general.svg)
Diagrams that show functions without showing the elements of the sets are called *external diagrams*, as opposed to the ones that we saw before, which are *internal*.
Isomorphisms
---
Let's examine another concept which is very important in category theory (although it is not exclusive to it): the *isomorphism*.
To do that, we go back to the examples of the types of relationships that functions can represent, and to the first and most elementary of them all - the *one-to-one* type of relationship. We know that in any function, each element of the source set points to exactly one element of the target set. But for one-to-one functions *the reverse is also true* - each element of the target set is pointed to by exactly one element of the source.
![Opposite colors](../01_set/function_one_one.svg)
If we have a one-to-one function that connects sets of the same size (as is the case here), then this function has the following property: all elements from the target set have exactly one arrow pointing at them. In this case, the function is *invertible*; that is, if you flip the arrows of the function and swap its source and target, you get another valid function.
![Opposite colors](../01_set/isomorphism_one_one.svg)
Invertible functions are called *isomorphisms*. When there exists an invertible function between two sets, we say that the sets are *isomorphic*. For example, because we have an invertible function that converts temperature measured in *Celsius* to temperature measured in *Fahrenheit*, and vice versa, we can say that temperatures measured in Celsius and Fahrenheit are isomorphic.
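Sketched in TypeScript with the standard conversion formulas, the two functions witness the isomorphism:
```typescript
// Two mutually inverse functions form an isomorphism between
// temperatures in Celsius and temperatures in Fahrenheit.
const toFahrenheit = (c: number): number => c * 9 / 5 + 32;
const toCelsius = (f: number): number => (f - 32) * 5 / 9;

// Round-tripping returns the original value (up to floating point).
console.log(toCelsius(toFahrenheit(100))); // 100
console.log(toFahrenheit(toCelsius(212))); // 212
```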
Isomorphism and identity
---
If you look closely, you will see that the identity function is invertible too (its inverse is itself), so each set is isomorphic to itself in that way.
![The identity function](../01_set/isomorphism_identity.svg)
Therefore, the concept of an isomorphism contains the concept of equality - all equal things are also isomorphic.
Isomorphism and composition
---
An interesting fact about isomorphisms is that if we have functions that convert a member of set $A$ to a member of set $B$ and the other way around, then, because of functional composition, any function from/to $A$ has a corresponding function from/to $B$.
![The architecture of isomorphism](../01_set/isomorphism_general.svg)
For example, if you have a function "is the partner of" that goes from the set of all married people to the same set, then that function is invertible. That is not to say that you are the same person as your significant other, but rather that every statement about you, or every relation you have to some other person or object, is also a relation between them and this person/object, and vice versa.
Composing two isomorphisms into another isomorphism is possible by composing the two pairs of functions that make up the isomorphisms, in each of the two directions.
![Composing isomorphisms](../01_set/isomorphisms_compose.svg)
Informally, we can see that the two composite morphisms are indeed inverse to each other and hence form an isomorphism. If we want to prove that fact formally, we would do something like the following:
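A sketch of how such a proof might go: if $f: A → B$ and $g: B → C$ are isomorphisms with inverses $f^{-1}$ and $g^{-1}$, then $f^{-1} \bullet g^{-1}$ is an inverse of $g \bullet f$, because (using associativity and the identity laws):

$$(g \bullet f) \bullet (f^{-1} \bullet g^{-1}) = g \bullet (f \bullet f^{-1}) \bullet g^{-1} = g \bullet id_B \bullet g^{-1} = g \bullet g^{-1} = id_C$$

and, by the same argument, $(f^{-1} \bullet g^{-1}) \bullet (g \bullet f) = id_A$.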
Isomorphisms Between Singleton Sets
---
Between any two singleton sets, we may define the only possible function.
![The only possible function between singletons](../01_set/singleton_function.svg)
The function is invertible, which means that all singleton sets are isomorphic to one another, and furthermore (which is important) they are isomorphic *in one unique way*.
![Isomorphic singletons](../01_set/singleton_isomorphism.svg)
Following the logic from the last paragraph, each statement about something that is one of a kind can be transferred to a statement about another thing that is one of a kind.
Reflexivity
---
The first idea that defines equivalence is that *everything is equivalent to itself*.
![Reflexivity](../01_set/reflexivity.svg)
This simple principle translates to the equally simple law of *reflexivity*: for all sets $A$, $A=A$.
Transitivity
---
The second idea that defines the concept of equivalence is that things that are equal to a third thing must also be equal to one another.
![Transitivity](../01_set/transitivity.svg)
Mathematically, for all sets $A$, $B$ and $C$, if $A=B$ and $B=C$, then $A=C$.
Symmetry
---
If one thing is equal to another, the reverse is also true (i.e. the other thing is also equal to the first one). This idea is called *symmetry*. Symmetry is probably the most characteristic property of the equivalence relation - one that almost no other relation has.
![symmetry](../01_set/symmetry.svg)
In mathematical terms: if $A=B$ then $B=A$.
We said that the most characteristic property of the equivalence relation is its *symmetry*. And this property is satisfied by isomorphisms, due to the isomorphisms' most characteristic property, namely the fact that they are *invertible*.
![Symmetry of isomorphisms](../01_set/isomorphism_symmetry.svg)
**Task:** One law down, two to go: Go through the previous section and verify that isomorphisms also satisfy the other equivalence relation laws.
Many people would say that the concept of a number is the most basic concept in mathematics.
To understand how, let's think about how you would teach a person what a number is (in particular, here we will concentrate on the *natural*, or counting, numbers). You might start your lesson by showing them a bunch of objects of a given quantity - for example, to demonstrate the number $2$, you might bring them two pencils, two apples, or two of something else.
![Two balls](../01_set/number_two.svg)
When you do that, it is important to highlight that you are not referring only to the left object, or only to the right one, but that we should consider both things at once (i.e. both things as one), so if the person to whom you are explaining happens to know what a set is, this piece of knowledge might come in handy. And also, being good teachers, we might provide them with some more examples of sets of 2 things.
![A set of two balls](../01_set/number_two_sets.svg)
This is a good starting point, but the person may still be staring at the objects instead of the structure - they might ask if this or that set is $2$ as well. At this point you might give up, or, if the person to whom you are explaining happens to know about isomorphisms as well (let's say they lived in a cave with nothing but this book), you can easily formulate your final definition, saying that the number $2$ is represented by those sets and all other sets that are isomorphic to them, or by the *equivalence class* of sets that have two elements, as the formal definition goes (don't worry, we will learn all about equivalence classes later).
![A set of two balls](../01_set/number_two_isomorphism.svg)
At this point there are no more examples that we can add. In fact, because we consider all other sets as well, we might say that this is not just a bunch of examples, but a proper *definition* of the number $2$. And we can extend that to include all other numbers. In fact, the first definition of a natural number (presented by Gottlob Frege in 1884) is roughly based on this very idea.

Products
---
In the previous chapter, there were several places where we needed a way to construct new sets out of existing ones.
The simplest composite type, made of the set $B$ (that contains $b$'s) and the set $Y$ (that contains $y$'s), is the *Cartesian product* of $B$ and $Y$ - the set of *ordered pairs* that contain one element of $Y$ and one element of $B$. Or formally speaking: $Y \times B = \{ (y, b) \mid y ∈ Y, b ∈ B \}$ ($∈$ means "is an element of").
![Product parts](../02_category/product_parts.svg)
It is denoted $B \times Y$ and it comes equipped with two functions for retrieving the $b$ and the $y$ from each $(b, y)$.
![Product](../02_category/product.svg)
**Question**: Why is this called a product? Hint: How many elements does it have?
A Cartesian coordinate system consists of two perpendicular lines, situated on a *Euclidean plane*, and some kind of function-like mapping that connects any point on these two lines to a number, representing the distance between the point being mapped and the lines' point of overlap (which is mapped to the number $0$).
![Cartesian coordinates](../02_category/coordinates_x_y.svg)
Using this construct (as well as the concept of a Cartesian product), we can describe not only the points on the lines, but any point on the Euclidean plane. We do that by measuring the distance between the point and those two lines.
![Cartesian coordinates](../02_category/coordinates.svg)
And since the point is the main primitive of Euclidean geometry, the coordinate system allows us to also describe all kinds of geometric figures, such as this triangle (which is described using products of products).
![Cartesian coordinates](../02_category/coordinates_triangle.svg)
So we can say that the Cartesian coordinate system is some kind of function-like mapping between sets of (products of) *products of numbers* and *geometric figures* that correspond to these numbers, one that lets us derive properties of the figures from the numbers (for example, using the products in the example below, we can compute that the triangle they represent has a base of $6$ units and a height of $5$ units).
![Cartesian coordinates](../02_category/coordinates_isomorphism.svg)
What's even more interesting is that this mapping is one-to-one, which makes the two realms *isomorphic* (traditionally we say that the point is *completely* described by the coordinates, which is the same thing.)
Products can also be used for expressing functions that take more than one argument. For example, how would the function that adds two numbers look as an object of set theory?
Actually, here it is.
![The plus function](../02_category/plus.svg)
Note that there are languages, such as the ones from the ML family, where the *pair* data structure (also called a *tuple*) is a first-level construct, and multi-argument functions are really implemented in this way.
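As a sketch in TypeScript (which also has first-class tuples), a `plus` that takes its two arguments as a single pair could look like this (the name `plus` is our own illustration):
```typescript
// plus takes one argument: an element of the product number × number.
const plus = ([a, b]: [number, number]): number => a + b;

console.log(plus([1, 2])); // 3
```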
Defining products in terms of sets
---
We said that the product is a set of *ordered* pairs (formally speaking, $A \times B ≠ B \times A$), but we didn't define ordered pairs formally. Note that the requirement for order prevents us from encoding the pair as just a set containing the two elements: while some mathematical operations (such as addition) indeed don't care about order, others (such as subtraction) do. (In programming, we have the ability to assign names to each member of an object, which accomplishes the same purpose as ordering does for pairs.)
![A pair](../02_category/pair.svg)
So does that mean that we have to define the ordered pair as a "primitive" type, like we defined sets, in order to use it? That's possible, but there is another approach: if we can define a construct that is *isomorphic* to the ordered pair using only sets, we can use that construct instead. And mathematicians have come up with multiple ingenious ways to do that. Here is the first one, which was discovered by Norbert Wiener in 1914. Note the smart use of the fact that the empty set is unique.
![A pair, represented by sets](../02_category/pair_as_set_2.svg)
The next one was discovered in the same year by Felix Hausdorff. In order to use it, we just have to define $1$ and $2$ first.
![A pair, represented by sets](../02_category/pair_as_set_3.svg)
Discovered in 1921 by Kazimierz Kuratowski, this one uses just the components of the pair.
![A pair, represented by sets](../02_category/pair_as_set_1.svg)
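For reference, here are the three encodings written out in set notation (matching the diagrams above):

$$(a, b)_{\text{Wiener}} = \{\{\{a\}, \varnothing\}, \{\{b\}\}\}$$

$$(a, b)_{\text{Hausdorff}} = \{\{a, 1\}, \{b, 2\}\}$$

$$(a, b)_{\text{Kuratowski}} = \{\{a\}, \{a, b\}\}$$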
Defining products in terms of functions
---
How can we define products in terms of functions? To do that, we must first think about *what functions* are there for a given product, and we have two of those - the functions for retrieving the two elements of the pair (the "getters", so to say). Formally, if we have a set $G$ which is the product of sets $Y$ and $B$, then we should also have functions which give us back the elements of the product, so $G → Y$ and $G → B$.
![Product, external diagram](../02_category/product_external.svg)
This diagram already provides a definition, but not a complete definition, because the product of $Y$ and $B$ is not the only set for which such functions can be defined. For example, a set of triples $Y \times B \times R$, for any set $R$, also qualifies. And if there is a function from $G$ to $B$, then the set $G$ itself meets our condition for being the product, because it is connected to $B$ and to itself. And there can be many other such objects.
![Product, external diagram](../02_category/product_candidates.svg)
However, all such objects would be *more complex* than the pair objects. And for this reason, *they all can be converted to it by a function*, as you can always have a function that converts a more complex structure to a simpler one (we saw an example of this when we covered the functions that convert subsets to their supersets).
Therefore, we can define the product of $B$ and $Y$ as a set that has functions for deriving $B$ and $Y$, and, more importantly, to which all other sets that have such functions can be converted.
![Product, external diagram](../02_category/products_morphisms.svg)
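Here is a minimal TypeScript sketch of that conversion (the names `Pair`, `factor`, `getB` and `getY` are our own illustration): any "impostor" set $I$ that comes with the two getter functions can be converted to the true product.
```typescript
type Pair<B, Y> = { fst: B; snd: Y };

// Given any I with functions I -> B and I -> Y, there is a function
// I -> Pair<B, Y> through which the two getters factor.
const factor = <I, B, Y>(getB: (i: I) => B, getY: (i: I) => Y) =>
  (i: I): Pair<B, Y> => ({ fst: getB(i), snd: getY(i) });

// Example: a triple (an "impostor" product) converted to the true pair.
type Triple = { b: string; y: number; extra: boolean };
const toPair = factor<Triple, string, number>((t) => t.b, (t) => t.y);
console.log(toPair({ b: "b", y: 1, extra: true })); // { fst: "b", snd: 1 }
```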
In category theory, this type of property that a given object might possess (participating in a structure such that all similar objects can be converted to/from it) is called a *universal property*. I don't want to go into more detail, as it is a bit early for that now (after all we haven't even defined what category theory is).
Sums
---
We will now study a construct that is pretty similar to the product, but at the same time is its complete opposite - the *sum* of two sets.
The sum of two sets $B$ and $Y$, denoted $B + Y$, is a set that contains *all elements from the first set combined with all elements from the second one*.
![Sum or coproduct](../02_category/coproduct.svg)
We can immediately see the connection with the *or* logical structure: For example, because a parent is either a mother or a father of a child, the set of all parents is the sum of the set of mothers and the set of fathers, or $P = M + F$.
Defining Sums in Terms of Sets
---
As with the product, representing sums in terms of sets is not so straightforward. The complication comes from the fact that a given object can be an element of both sets, in which case it would appear in the sum twice - and that is not permitted, because a set cannot contain the same element twice (for this reason, this type of sum is also called a *disjoint union*). As with the product, the solution is to add some extra structure.
![A member of a coproduct](../02_category/coproduct_member.svg)
And as with the product, there is a low-level way to express a sum using sets alone. Incidentally, we can use pairs.
![A member of a coproduct, examined](../02_category/coproduct_member_set.svg)
But again, this distinction is relevant only when the two sets have common elements. If they don't, then simply uniting the two sets is sufficient to represent their sum.
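In TypeScript, this "tagging" is exactly what a discriminated union does; a minimal sketch (the names are our own illustration):
```typescript
// Each element of the sum B + Y carries a tag recording its origin,
// so an element occurring in both sets is counted twice.
type Sum<B, Y> = { tag: "left"; value: B } | { tag: "right"; value: Y };

const left = <B, Y>(value: B): Sum<B, Y> => ({ tag: "left", value });
const right = <B, Y>(value: Y): Sum<B, Y> => ({ tag: "right", value });

// 1 taken from B and 1 taken from Y are distinct members of B + Y.
console.log(left<number, number>(1));  // { tag: "left", value: 1 }
console.log(right<number, number>(1)); // { tag: "right", value: 1 }
```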
Defining Sums in Terms of Functions
---
As you might already suspect, the interesting part is expressing the sum of two sets in terms of functions. A property of every *or* relation is that if something is an $A$, then it is also an $A \vee B$, and the same goes for $B$ (the $\vee$ symbol means *or*, by the way). For example, if my hair is *brown*, then my hair is also *either blond or brown*. This is what *or* means, right? This property can be expressed as a function - two functions actually, one for each set that takes part in the sum relation (for example, if parents are either mothers or fathers, then there surely exist functions $mothers → parents$ and $fathers → parents$).
![Coproduct, external diagram](../02_category/coproduct_external.svg)
As you might have already noticed, this definition is pretty similar to the definition of the product from the previous section. And the similarities don't end here. As with products, we have sets that can be thought of as *impostor* sums - ones for which these functions exist, but which also contain additional information.
![Coproduct, external diagram](../02_category/coproduct_candidates.svg)
All these sets express relationships which are more vague than the simple sum, and therefore given such a set (an "impostor" set as we called it earlier), there would exist a function that would distinguish it from the true sum. The only difference is that, unlike with the products, this time this function goes *from the sum* to the impostor.
![Coproduct, external diagram](../02_category/coproduct_morphisms.svg)
Categorical Duality
---
The concepts of product and sum might already look similar, and this similarity becomes apparent when we view the diagrams that define them.
I use "diagrams" in plural, but actually the two concepts are captured *by one and the same diagram*, the only difference between the two being that their arrows are flipped - many-to-one relationships become one-to-many and the other way around.
![Coproduct and product](../02_category/coproduct_product_duality.svg)
The universal properties that define the two constructs are the same as well - if we have a sum $Y + B$, then for each impostor sum, such as $Y + B + R$, there exists a trivial function $Y + B \to Y + B + R$.
This means, among other things, that the concepts of *and* and *or* are also dual to one another.
This duality is subtly encoded in the logical symbols for *and* and *or* ($\land$ and $\lor$) - they are nothing but stylized versions of the diagrams of products and coproducts.
![Coproduct and product](../02_category/demorgan_duality.svg)
To understand the connection, consider the two formulas that are most often associated with De Morgan, known as De Morgan's laws (although De Morgan didn't actually discover them - they were previously formulated by William of Ockham, of "Ockham's razor" fame, among other people).
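The two laws state that negation turns *and* into *or* and vice versa:

$$\neg(A \land B) = \neg A \lor \neg B$$

$$\neg(A \lor B) = \neg A \land \neg B$$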
Category Theory - brief definition
---
Maybe it is about time to see what a category is. We will start with a short definition: a category consists of objects (an example of which are sets) and morphisms that go from one object to another (which can be viewed as functions) and that should be composable. We can say a lot more about categories, and even present a formal definition, but for now it is sufficient for you to remember that sets are one example of a category and that categorical objects are like sets, except that we don't *see* their elements. Or to put it another way, category-theoretic notions are captured by external diagrams, while strictly set-theoretic notions can be captured by internal ones.
![Category theory and set theory compared](../02_category/set_category.svg)
When we are in the realm of sets, we can view a set as a collection of individual elements. In category theory we don't have such a notion, but we saw how taking this notion away allows us to define concepts such as the sum and product sets in a whole different and more general way.
So. A category is a collection of objects (things) where the "things" can be anything you want. Consider, for example, these ~~colorful~~ gray balls:
![Balls](../02_category/elements.svg)
A category consists of a collection of objects as well as some arrows connecting some of them to one another. We call the arrows *morphisms*.
![A category](../02_category/category.svg)
Wait a minute - we said that all sets form a category, but at the same time any one set can be seen as a category in its own right (just one which has no morphisms). This is true, and it is an example of a phenomenon that is very characteristic of category theory - one structure can be examined from many different angles and may play many different roles, often in a recursive fashion.
This particular analogy (a set as a category with no morphisms) is, however, not entirely accurate.
Speaking of which, note that objects in a category can be connected by multiple arrows and that two arrows having the same source and target does not in any way make them equivalent (it does not actually mean that they would produce the same value).
![Two objects connected with multiple arrows](../02_category/arrows.svg)
Why that is true is pretty obvious if we go back to set theory for a second. (OK, maybe we really *have* to do it from time to time.) There are, for example, an infinite number of functions that go from numbers to booleans, and the fact that they have the same input type and the same output type (or the same *type signature*, as we like to say) does not in any way make them equivalent to one another.
![Two sets connected with multiple functions](../02_category/set_arrows.svg)
There are some types of categories in which only one morphism between two objects is allowed (or one in each direction), but we will talk about them later.
Composition
---
One of the few (or maybe even the only) requirements for a structure to be called a category is that *two morphisms can make a third*, or in other words, that morphisms are *composable* - given two successive arrows with appropriate type signatures, we can draw a third one that is equivalent to the consecutive application of the other two.
![Composition of morphisms](../02_category/composition.svg)
Formally, this requirement says that there should exist an operation (denoted with the symbol $•$) such that for each pair of morphisms $g: A → B$ and $f: B → C$, there exists a morphism $(f • g): A → C$ (again, note that there can be many other morphisms with the same type signature, but there must be *exactly one* morphism that fits these criteria).
![Composition of morphisms in the context of additional morphism](../02_category/composition_arrows.svg)
**NB:** Note (if you haven't already) that functional composition is written from right to left. e.g. applying $g$ and then applying $f$ is written $f • g$ and not the other way around. (You can think of it as a shortcut to $f(g(a))$.)
Identity
---
Before the standard Arabic numerals that we use today, there were Roman numerals - a number system that, famously, has no symbol for zero.
The zero of category theory is what we call the "identity morphism" for each object. In short, this is a morphism that doesn't do anything.
![The identity morphism (but can also be any other morphism)](../02_category/identity.svg)
It's important to mark this morphism, because there can be (let's add the very important, and also very boring, reminder) many morphisms that go from one object to the same object, many of which actually do stuff. For example, mathematics deals with a multitude of functions that have the set of numbers as both source and target, such as $negate$, $square$ and $add one$, none of which is the identity morphism.
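A quick TypeScript sketch of the identity and of what makes it special (the generic `id` and the example functions are our own illustration): composing any function with the identity gives back that same function.
```typescript
// The identity function: the "zero" of composition.
const id = <A>(x: A): A => x;

const compose = <A, B, C>(f: (b: B) => C, g: (a: A) => B) =>
  (a: A): C => f(g(a));

// negate and square go from numbers to numbers, yet are not the identity.
const negate = (x: number): number => -x;
const square = (x: number): number => x * x;

// f . id = id . f = f
console.log(compose<number, number, number>(negate, id)(3) === negate(3)); // true
console.log(compose<number, number, number>(id, negate)(3) === negate(3)); // true
```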
The law of associativity
---
Functional composition is special not only because you can take any two morphisms with appropriate signatures and make a third, but because you can do so indefinitely, i.e. given $n$ successive arrows, each of which starts from the object where the previous one finishes, we can draw one (exactly one) arrow that is equivalent to the consecutive application of all $n$ arrows.
![Composition of morphisms with many objects](../02_category/composition_n_objects.svg)
But let's get back to the math. If we carefully review the definition above, we can see that it can be reduced to multiple applications of the following formula: given 4 objects and 3 morphisms between them, $f$, $g$ and $h$, combining $h$ and $g$ and then combining the end result with $f$ should be the same as combining $h$ with the result of combining $g$ and $f$ (or simply $(h • g) • f = h • (g • f)$).
This formula can be expressed using the following diagram, which would only commute if the formula is true (given that all our category-theoretic diagrams commute, we can say, in such cases, that the formula and the diagram are equivalent.)
![Composition of morphisms with many objects](../02_category/composition_associativity.svg)
This formula (and the diagram) is the definition of a property called *associativity*. Being associative is required for functional composition to really be called functional composition (and for a category to really be called a category). It is also required for our diagrams to work, as diagrams can only represent associative structures (imagine if the diagram above did not commute - it would be super weird).
Commuting diagrams
---
The diagrams above use colors to illustrate the fact that the green morphism is equivalent to the other two (and not just some unrelated morphism), but in practice this notation is a little redundant - the only reason to draw diagrams in the first place is to represent paths that are equivalent to each other; all other paths just belong in different diagrams.
![Composition of morphisms - a commuting diagram](../02_category/composition_commuting_diagram.svg)
Diagrams that are like that (ones in which any two paths between two objects are equivalent to one another) are called *commutative diagrams* (or diagrams that *commute*). All diagrams in this book (except the wrong ones) commute.
Isomorphisms
---
Like we said in the previous chapter, an isomorphism between two objects ($A$ and $B$) consists of two morphisms ($f: A → B$ and $g: B → A$) such that $g • f = id_A$ and $f • g = id_B$.
And here is the same thing expressed with a commuting diagram.
![Isomorphism](../02_category/isomorphism.svg)
Like the previous one, this diagram expresses the same (simple) fact as the formula, namely that going from one of the objects ($A$ or $B$) to the other one and then back again is the same as applying the identity morphism, i.e. doing nothing.
Commutativity
---
One way to state the principle of reductionism is to say that *each thing is nothing but a sum of its parts*. Let's try to formalize that: the *things* that we are thinking about will be colorful balls, and let's denote the *sum* with a circle operator. The principle would then mean that a set of objects, combined in whichever way, will always result in the same object.
![Commutativity](../02_category/commutativity_long.svg)
And thanks to the wonders of mathematics, we get all these equalities by specifying the law for just two objects.
![Commutativity](../02_category/commutativity.svg)
Incidentally, this is the definition of a mathematical law called *commutativity*.

Monoids
---
Monoids are simpler than categories. A monoid is defined by a collection/set of elements, together with an operation for combining them.
Let's take our familiar colorful balls.
![Balls](../03_monoid/balls.svg)
We can define a monoid based on this set by defining an operation for "combining" two balls into one. An example of such an operation would be blending the colors of the balls, as if we were mixing paint.
![An operation for combining balls](../03_monoid/balls_operation.svg)
You can probably think of other ways to define such an operation. This will help you realize that there can be many ways to create a monoid from a given set of elements, i.e. the monoid is not the set itself - it is the set *together with the operation*.
Associativity
---
The monoid operation should, like functional composition, be *associative*, i.e. when it is applied to the same sequence of elements, the way the applications are grouped should make no difference.
![Associativity in the color mixing operation](../03_monoid/balls_associativity.svg)
When an operation is associative, we can apply all kinds of algebraic transformations to any sequence of terms (or in other words, we can apply equational reasoning) - for example, we can replace any element with a combination of elements from which it is composed, or cancel a term that is present on both sides of an equation, while retaining the equality.
![Associativity in the color mixing operation](../03_monoid/balls_arithmetic.svg)
The identity element
---
Actually, not every (associative) operation for combining elements makes the balls into a monoid. The operation must also have an *identity element* - an element that, when combined with any other element, leaves that element unchanged.
In the case of our color-mixing monoid the identity element is the white ball (or perhaps a transparent one, if we have one).
![The identity element of the color-mixing monoid](../03_monoid/balls_identity.svg)
As you probably remember from the last chapter, functional composition is also associative and it also has an identity element, so you might start suspecting that it forms a monoid in some way. This is indeed the case, but with one caveat, about which we will talk later.
Monoids from numbers
---
Mathematics is not only about numbers; however, numbers do tend to pop up in most of its areas, and monoids are no exception. The set of natural numbers $\mathbb{N}$ forms a monoid when combined with the all too familiar operation of addition (or *under* addition, as it is traditionally said). This monoid is denoted $\left< \mathbb{N},+ \right>$ (in general, monoids and groups are denoted by specifying the set and the operation, enclosed in angle brackets).
![The monoid of numbers under addition](../03_monoid/numbers_addition.svg)
If you see a $1 + 1 = 2$ in your textbook you know you are either reading something very advanced, or very simple, although I am not really sure which of the two applies in the present case.
Anyways, the natural numbers also form a monoid under multiplication.
![The monoid of numbers under multiplication](../03_monoid/numbers_multiplication.svg)
**Question:** Which are the identity elements of those monoids?
We said that a monoid consists of two things: a set (let's call it $A$) and a monoid operation.
Defining the operation is not hard at all. Actually, we have already done it for the operation $+$ - in chapter 2, we said that *addition* can be represented in set theory as a function that accepts a product of two numbers and returns a number (formally $+: \mathbb{Z} \times \mathbb{Z} \to \mathbb{Z}$).
![The plus operation as a function](../03_monoid/plus_operation.svg)
Every other monoid operation can also be represented in the same way - as a function that takes a pair of elements from the monoid's set and returns another element of that set.
![The color-mixing operation as a function](../03_monoid/color_mixing_operation.svg)
Formally, we can define a monoid from any set $A$ by defining an (associative) function with type signature $A \times A \to A$ (together with an identity element). That's it. Or to be precise, that is *one way* to define the monoid operation. There is another way, which we will see next. Before that, let's examine some more monoids.
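Here is how this way of defining a monoid might be sketched in TypeScript (the `Monoid` interface and the instance names are our own illustration): the operation has the signature $A \times A \to A$, and we also record the identity element.
```typescript
// A monoid: a type A together with an associative operation
// A × A -> A and an identity element.
interface Monoid<A> {
  empty: A;                   // identity element
  combine: (x: A, y: A) => A; // associative operation
}

const addition: Monoid<number> = { empty: 0, combine: (x, y) => x + y };
const multiplication: Monoid<number> = { empty: 1, combine: (x, y) => x * y };

// Any list of elements can be reduced using a monoid.
const fold = <A>(m: Monoid<A>, xs: A[]): A => xs.reduce(m.combine, m.empty);

console.log(fold(addition, [1, 2, 3]));       // 6
console.log(fold(multiplication, [1, 2, 3])); // 6
```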
Commutative (abelian) monoids
---
Looking at the monoid laws and the examples we gave so far, we observe that all of them obey one more rule (law) which we didn't specify - the order in which the elements are combined is irrelevant to the end result.
![Commutative monoid operation](../03_monoid/monoid_commutative.svg)
Such operations (ones for which combining a given set of elements yields the same result no matter which one is first and which one is second) are called *commutative* operations. Monoids with operations that are commutative are called *commutative monoids*.
As we said, addition is commutative as well - it does not matter whether I have given you 1 apple and then 2 more, or 2 first and then 1 more.
![Commutative monoid operation](../03_monoid/addition_commutative.svg)
All monoids that we examined so far are also *commutative*. We will see some non-commutative ones later.
An interesting kind of group/monoid is the group of *symmetries* of a geometric figure. We won't use the balls this time, because in terms of symmetries they have just one position and hence just one action - the identity action (which is its own reverse, by the way). So let's take this triangle, which, for our purposes, is the same as any other triangle (we are not interested in the triangle itself, but in its rotations).
![A triangle](../03_monoid/symmetry_group.svg)
Groups of rotations
---
Let's first review the group of ways in which we can rotate our triangle i.e. its *rotation group*. A geometric figure can be rotated without displacement in positions equal to the number of its sides, so for our triangle there are 3 positions.
![The group of rotations in a triangle](../03_monoid/symmetry_rotation.svg)
Connecting the dots (or the triangles in this case) shows us that there are just two possible rotations that get us from any state of the triangle to any other one - a *120-degree rotation* (i.e. rotating the triangle one step) and a *240-degree rotation* (i.e. rotating it two steps, or equivalently, one step in the opposite direction). Adding the identity action of 0-degree rotation makes for 3 rotations (objects) in total.
![The group of rotations in a triangle](../03_monoid/symmetry_rotation_actions.svg)
The rotations of a triangle form a monoid - the *rotations are objects* (of which the zero-degree rotation is the identity) and the monoid operation which combines two rotations into one is just the operation of performing the first rotation and then performing the second one.
Cyclic groups/monoids
---
The diagram that enumerates all the rotations of a more complex geometrical figure looks quite messy at first.
![The group of rotations in a more complex figure](../03_monoid/symmetry_rotation_square.svg)
But it gets much simpler to grasp if we notice the following: although our group has many rotations, and there are more still for figures with more sides (if I am not mistaken, the number of rotations is equal to the number of sides), *all those rotations can be reduced to the repetitive application of just one rotation* (for example, the 120-degree rotation for triangles and the 45-degree rotation for octagons). Let's make up a symbol for this rotation.
![The group of rotations in a triangle](../03_monoid/symmetry_rotation_cyclic.svg)
Symmetry groups that have such a "main" rotation - and, in general, groups and monoids that have an element capable of generating all other elements by its repeated application - are called *cyclic groups*, and such an element is called the group's *generator*.
All rotation groups are cyclic groups. Another example of a cyclic group is, yes, the integers under addition. Here we can use $+1$ or $-1$ as a generator.
![The group of numbers under addition](../03_monoid/numbers_cyclic.svg)
Wait, how can this be a cyclic group when there are no cycles? This is because the integers are an *infinite* cyclic group.
Cyclic groups with actual cycles arise from *modular arithmetic*, where numbers wrap around a fixed modulus, like the hours on a clock: $13 \pmod{12} = 1$ (as $13/12 = 1$ with remainder $1$), $14 \pmod{12} = 2$, and so on.
In effect, the numbers form a group with as many elements as the modulus. For example, a group representation of modular arithmetic with modulus $3$ has 3 elements.
![The group of numbers under addition](../03_monoid/numbers_modular.svg)
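A small TypeScript sketch of addition modulo $n$ (the function names are our own illustration):
```typescript
// Addition modulo n gives the cyclic group Z_n.
const addMod = (n: number) => (a: number, b: number): number => (a + b) % n;

const z3 = addMod(3);
console.log(z3(2, 2)); // 1 - the numbers "wrap around"
console.log(z3(1, 2)); // 0 - 1 and 2 are each other's inverses
```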
All cyclic groups that have the same number of elements (or that are of the *same order*) are isomorphic to each other (careful readers might notice that we haven't yet defined what a group isomorphism is; even more careful readers might already have an idea about what it is).
For example, the group of rotations of the triangle is isomorphic to the numbers under addition modulo $3$.
![The group of numbers under addition](../03_monoid/symmetry_modular.svg)
All cyclic groups are *commutative* (or "abelian" as they are also called).
Group isomorphisms
---
We already mentioned group isomorphisms, but we didn't define what they are. Let's do that now - an isomorphism between two groups is an isomorphism ($f$) between their respective sets of elements, such that for any $a$ and $b$ we have $f(a \bullet b) = f(a) \bullet f(b)$. Visually, the diagrams of isomorphic groups have the same structure.
![Group isomorphism between different representations of S3](../03_monoid/group_isomorphism.svg)
As in category theory, in group theory isomorphic groups are considered instances of one and the same group. For example, the one above is called $Z_3$.
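We can sketch the check for the rotation representation of $Z_3$ in TypeScript, mapping rotations (taken as multiples of 120 degrees) to remainders modulo 3 (all names here are our own illustration):
```typescript
// The rotations of a triangle: 0, 120 and 240 degrees.
const rotations = [0, 120, 240];
const composeRotations = (a: number, b: number): number => (a + b) % 360;

// Candidate isomorphism to Z_3: 0 -> 0, 120 -> 1, 240 -> 2.
const f = (degrees: number): number => degrees / 120;
const addMod3 = (a: number, b: number): number => (a + b) % 3;

// Verify f(a • b) === f(a) • f(b) for all pairs of rotations.
const ok = rotations.every((a) =>
  rotations.every((b) => f(composeRotations(a, b)) === addMod3(f(a), f(b)))
);
console.log(ok); // true
```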
Like with sets, the concept of an isomorphism in group theory allows us to identify groups that have the same structure.
The smallest group is just the trivial group $Z_1$ that has just one element.
![The smallest group](../03_monoid/trivial_group.svg)
The smallest non-trivial group is the group $Z_2$ that has two elements.
![The smallest non-trivial group](../03_monoid/smallest_group.svg)
$Z_2$ is also known as the *boolean group*, due to the fact that it is isomorphic to the set $\{ True, False \}$.
Given any two groups, we can combine them to create a third group, comprised of all possible pairs of elements from the two groups.
Let's see how the product looks: take the following two groups (which, having just two elements and one operation, are both isomorphic to $Z_2$). To make it easier to imagine them, we can think of the first one as based on the vertical reflection of a figure, and the second on the horizontal reflection.
![Two trivial groups](../03_monoid/groups_product.svg)
We get the set of elements of the new group by taking *the Cartesian product* of the set of elements of the first group and the set of elements of the second one.
![Two trivial groups](../03_monoid/groups_product_four.svg)
And the *actions* of a product group are comprised of the actions of the first group, combined with the actions of the second one, where each action is applied only to the element that is a member of its corresponding group, leaving the other element unchanged.
![Klein four](../03_monoid/klein_four_as_product.svg)
The product of the two groups we presented is called the *Klein four-group* and it is the simplest *abelian non-cyclic* group.
Another way to present the Klein four-group is as the *group of symmetries of a non-square rectangle*.
![Klein four](../03_monoid/klein_four.svg)
**Task:** Show that the two representations are isomorphic.
Color-mixing monoid as a product
---
To see how we can use this theorem, let's revisit the color-mixing monoid that we saw earlier.
![color-mixing group](../03_monoid/balls_rule.svg)
As there doesn't exist a color that, when mixed with itself, can produce all other colors, the color-mixing monoid is *not cyclic*. However, the color mixing monoid *is abelian*. So according to the theorem of finite abelian groups (which is valid for monoids as well), the color-mixing monoid must be (isomorphic to) a product.
And it is not hard to find the monoids that form it - although there isn't one color that can produce all other colors, there are three colors that can do that together - the primary colors. This observation leads us to the conclusion that the color-mixing monoid can be represented as the product of three monoids, corresponding to the three primary colors.
![color-mixing group as a product](../03_monoid/color_mixing_product.svg)
You can think of each color monoid as a boolean monoid, having just two states (colored and not-colored).
![Cyclic groups, forming the color-mixing group](color_mixing_cyclic.svg)
![Cyclic groups, forming the color-mixing group](../03_monoid/color_mixing_cyclic.svg)
Or alternatively, you can view it as having multiple states, representing the different levels of shading.
![Color-shading cyclic group](cyclic_shading.svg)
![Color-shading cyclic group](../03_monoid/cyclic_shading.svg)
In both cases the monoid would be cyclic.
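Here is a small sketch of this decomposition in Python - encoding a color as a triple of boolean flags, one per primary color, is our illustration of the idea, not the book's official notation.

```python
# A sketch of the color-mixing monoid as a product of three boolean
# monoids, one per primary color (red, yellow, blue). A color is a
# triple of flags, and mixing is component-wise "or".
def mix(a, b):
    return tuple(x or y for x, y in zip(a, b))

red      = (True, False, False)
yellow   = (False, True, False)
blue     = (False, False, True)
none_yet = (False, False, False)  # the identity - no primary color mixed in yet

orange = mix(red, yellow)
print(orange)                          # (True, True, False)
print(mix(orange, orange) == orange)   # True - mixing is idempotent
print(mix(orange, none_yet) == orange) # True - the identity leaves colors unchanged
```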
@ -302,11 +302,11 @@ Groups/monoid of rotations and reflections
Now, let's finally examine a non-commutative group - the group of rotations *and reflections* of a given geometrical figure. It is the same as the last one, but here besides the rotation action that we already saw (and its composite actions), we have the action of flipping the figure vertically, an operation which results in its mirror image:
![Reflection of a triangle](reflection.svg)
![Reflection of a triangle](../03_monoid/reflection.svg)
Those two operations and their composites result in a group called $Dih3$ that is not abelian (and is furthermore the *smallest* non-abelian group).
![The group of rotations and reflections in a triangle](symmetry_reflection.svg)
![The group of rotations and reflections in a triangle](../03_monoid/symmetry_reflection.svg)
**Task:** Prove that this group is indeed not abelian.
@ -358,11 +358,11 @@ Monoid elements as functions/permutations
Let's take a step back and examine the groups/monoids that we covered so far in the light of what we learned. We started off by representing the group operation as a function from pairs. For example, the operations of a symmetry group (let's take $Z_3$ as an example) are actions that convert two rotations into another rotation.
![The group of rotations in a triangle - group notation](symmetry_rotation_actions.svg)
![The group of rotations in a triangle - group notation](../03_monoid/symmetry_rotation_actions.svg)
Using currying, we can represent the elements of a given group/monoid as functions, by binding them to the group operation, and the group operation itself - as functional composition. For example, the 3 elements of $Z_3$ can be seen as 3 bijective (invertible) functions from a set of 3 elements to itself (in a group-theoretic context, these kinds of functions are called *permutations*, by the way.)
![The group of rotations in a triangle - set notation](symmetry_rotation_functions.svg)
![The group of rotations in a triangle - set notation](../03_monoid/symmetry_rotation_functions.svg)
We can do the same for the addition monoid - numbers can be seen not as *quantities* (as in two apples, two oranges etc.), but as *operations* (e.g. as the action of adding two to a given quantity).
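Here is the currying trick in a Python sketch - with $Z_3$ encoded as addition modulo 3, an encoding we picked for illustration: each element $a$ becomes the unary function "add $a$", and the group operation becomes function composition.

```python
# A sketch of currying the operation of Z3: each element a becomes
# the unary function "add a (mod 3)" - a permutation of {0, 1, 2}.
def as_permutation(a):
    return lambda x: (x + a) % 3

def compose(f, g):
    return lambda x: f(g(x))

# Composing the permutations for 1 and 2 gives the permutation for 0,
# mirroring the fact that 1 + 2 = 0 (mod 3).
rotate_0 = compose(as_permutation(1), as_permutation(2))
print([rotate_0(x) for x in range(3)])  # [0, 1, 2] - the identity permutation
```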
@ -389,7 +389,7 @@ Monoid operations as functional composition
The functions that represent the monoid elements have the same set as source and target, or the same signature, as we say (formally, they are of the type $A \to A$ for some $A$). Because of that, they can all be composed with one another, using *functional composition*, resulting in functions that *also have the same signature*.
![The group of rotations in a triangle - set notation](symmetry_rotation_cayley.svg)
![The group of rotations in a triangle - set notation](../03_monoid/symmetry_rotation_cayley.svg)
And the same is valid for the addition monoid - number functions can be combined using functional composition.
@ -410,7 +410,7 @@ Once we learn how to represent the elements of any monoid as permutations that a
Formally, if we use $Perm(A)$ to denote the group of permutations that we derive from the group $A$ in this way, then $Perm(A) \cong A$ for any $A$.
![The group of rotations in a triangle - set notation and normal notation](symmetry_rotation_comparison.svg)
![The group of rotations in a triangle - set notation and normal notation](../03_monoid/symmetry_rotation_comparison.svg)
Or in other words, representing the elements of a group as permutations actually yields a representation of the monoid itself (sometimes called its *standard representation*.)
@ -425,15 +425,15 @@ The first thing that you have to know about the symmetric groups is that they ar
So, for example, the group $\mathrm{S}_1$ of permutations of the one-element set has just 1 element (because a 1-element set has no functions to itself other than the identity function.)
![The S1 symmetric group](s1.svg)
![The S1 symmetric group](../03_monoid/s1.svg)
The group $\mathrm{S}_2$ has $1 \times 2 = 2$ elements (by the way, the colors are there to give you some intuition as to why the number of permutations of an $n$-element set is $n!$.)
![The S2 symmetric group](s2.svg)
![The S2 symmetric group](../03_monoid/s2.svg)
And with $\mathrm{S}_3$ we are already feeling the power of the factorial function's faster-than-exponential growth - it has $1\times 2\times 3=6$ elements.
![The S3 symmetric group](s3.svg)
![The S3 symmetric group](../03_monoid/s3.svg)
Each symmetric group $\mathrm{S}_n$ contains all groups of order $n$ - this is so because (as we saw in the previous section) every group with $n$ elements is isomorphic to a group of permutations of a set of $n$ elements, and the group $\mathrm{S}_n$ contains *all such* permutations that exist.
@ -441,7 +441,7 @@ Here are some examples:
- $\mathrm{S}_1$ is isomorphic to $Z_1$, the *trivial group*, and $\mathrm{S}_2$ is isomorphic to $Z_2$, the *boolean group* (but no other symmetric groups are isomorphic to cyclic groups)
- The top three permutations of $\mathrm{S}_3$ form a group that is isomorphic to the group $Z_3$.
![The S3 symmetric group](s3_z3.svg)
![The S3 symmetric group](../03_monoid/s3_z3.svg)
- $\mathrm{S}_3$ is also isomorphic to $Dih3$ (but no other symmetric group is isomorphic to a dihedral group)
@ -460,16 +460,16 @@ Monoids as categories
We saw that converting the monoid's elements to actions/functions yields an accurate representation of the monoid in terms of sets and functions.
![The group of rotations in a triangle - set notation and normal notation](symmetry_rotation_set.svg)
![The group of rotations in a triangle - set notation and normal notation](../03_monoid/symmetry_rotation_set.svg)
However, it seems that the set part of the structure in this representation is kinda redundant - you have the same set everywhere - so it would be nice if we could simplify it. And we can do that by depicting it as an external (categorical) diagram.
![The group of rotations in a triangle - categorical notation](symmetry_rotation_external.svg)
![The group of rotations in a triangle - categorical notation](../03_monoid/symmetry_rotation_external.svg)
But wait, if the monoids' underlying *sets* correspond to *objects* in category theory, then the corresponding category would have just one object. And so the correct representation would involve just one point from which all arrows come and to which they go.
![The group of rotations in a triangle - categorical notation](symmetry_rotation_category.svg)
![The group of rotations in a triangle - categorical notation](../03_monoid/symmetry_rotation_category.svg)
The only difference between different monoids would be the number of morphisms that they have and the relationship between them.
@ -503,31 +503,31 @@ Group/monoid presentations
When we view cyclic groups/monoids as categories, we see that they correspond to categories that (besides having just one object) also have *just one morphism* (which, as we said, is called a *generator*), along with the morphisms that are created when this morphism is composed with itself. In fact, the infinite cyclic monoid (which is isomorphic to the natural numbers) can be completely described by this simple definition.
![Presentation of an infinite cyclic monoid](infinite_cyclic_presentation.svg)
![Presentation of an infinite cyclic monoid](../03_monoid/infinite_cyclic_presentation.svg)
This is so, because applying the generator again and again yields all elements of the infinite cyclic group. Specifically, if we view the generator as the action $+1$ then we get the natural numbers.
![Presentation of an infinite cyclic monoid](infinite_cyclic_presentation_elements.svg)
![Presentation of an infinite cyclic monoid](../03_monoid/infinite_cyclic_presentation_elements.svg)
Finite cyclic groups/monoids are the same, except that their definition contains an additional law, stating that once you compose the generator with itself $n$ times, you get the identity morphism. For the cyclic group $Z_3$ (which can be visualized as the group of triangle rotations) this law states that composing the generator with itself $3$ times yields the identity morphism.
![Presentation of a finite cyclic monoid](finite_cyclic_presentation.svg)
![Presentation of a finite cyclic monoid](../03_monoid/finite_cyclic_presentation.svg)
Composing the group generator with itself, and then applying the law, yields the three morphisms of $Z_3$.
![Presentation of a finite cyclic monoid](finite_cyclic_presentation_elements.svg)
![Presentation of a finite cyclic monoid](../03_monoid/finite_cyclic_presentation_elements.svg)
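In a Python sketch (again with the modulo-3 encoding as an illustration), "following the presentation" looks like starting from the identity and repeatedly composing with the generator, with the law guaranteeing that the process closes up after $n$ steps.

```python
# A sketch of generating the elements of a finite cyclic group from its
# presentation: one generator, plus the law that composing it with
# itself n times yields the identity (here n = 3).
n = 3
generator = lambda x: (x + 1) % n

elements, current = [], (lambda x: x)  # start from the identity morphism
for _ in range(n):
    elements.append(current)
    # compose the current morphism with the generator
    current = (lambda f: (lambda x: generator(f(x))))(current)

print([[f(x) for x in range(n)] for f in elements])
# [[0, 1, 2], [1, 2, 0], [2, 0, 1]] - the three rotations of Z3
```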
We can represent product groups this way too. Let's take the Klein four-group as an example. The Klein four-group has two generators that it inherits from the groups that form it (which we viewed as the vertical and horizontal reflections of a non-square rectangle), each of which comes with one law.
![Presentation of Klein four](klein_four_presentation.svg)
![Presentation of Klein four](../03_monoid/klein_four_presentation.svg)
To make the representation complete, we add the law for combining the two generators.
![Presentation of Klein four - third law](klein_four_presentation_third_law.svg)
![Presentation of Klein four - third law](../03_monoid/klein_four_presentation_third_law.svg)
And then, if we start applying the two generators and follow the laws, we get the four elements.
![The elements of Klein four](klein_four_presentation_elements.svg)
![The elements of Klein four](../03_monoid/klein_four_presentation_elements.svg)
The set of generators and laws that defines a given group is called the *presentation of a group*. Every group has a presentation.
@ -540,15 +540,15 @@ We saw how picking a different selection of laws gives rise to different types o
If you revisit the previous section you will notice that we already saw one such monoid. The free monoid with just one generator is isomorphic to the monoid of natural numbers.
![The free monoid with one generator](infinite_cyclic_presentation_elements.svg)
![The free monoid with one generator](../03_monoid/infinite_cyclic_presentation_elements.svg)
We can make a free monoid from the set of colorful balls - the monoid's elements would be all possible sequences of balls.
![The free monoid with the set of balls as a generators](balls_free.svg)
![The free monoid with the set of balls as a generators](../03_monoid/balls_free.svg)
The free monoid is a special one - each element of the free monoid over a given set can be converted to a corresponding element in any other monoid that uses the same set of generators, by just applying the monoid's laws. For example, here is what the elements above would look like if we apply the laws of the color-mixing monoid.
![Converting the elements of the free monoid to the elements of the color-mixing monoid](balls_free_color_mixing.svg)
![Converting the elements of the free monoid to the elements of the color-mixing monoid](../03_monoid/balls_free_color_mixing.svg)
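Here is a sketch of both monoids in Python - the names and the triple encoding of colors are ours, carried over from the earlier sketch: free-monoid elements are sequences under concatenation, and applying the color-mixing laws amounts to folding a sequence with the mix operation.

```python
from functools import reduce

# A sketch: the free monoid over the set of balls has all possible
# sequences as its elements, with concatenation as its operation.
concat = lambda a, b: a + b
seq = concat(('red',), ('blue', 'red'))
print(seq)  # ('red', 'blue', 'red')

# The canonical conversion to the color-mixing monoid: interpret each
# ball as a triple of primary-color flags and fold the sequence with "mix".
primaries = {'red': (True, False, False),
             'yellow': (False, True, False),
             'blue': (False, False, True)}
mix = lambda a, b: tuple(x or y for x, y in zip(a, b))

def to_color_mixing(sequence):
    return reduce(mix, (primaries[ball] for ball in sequence))

print(to_color_mixing(seq))  # (True, False, True) - red and blue mixed: purple
```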
**Task:** Write up the laws of the color-mixing monoid.
@ -14,11 +14,11 @@ Mathematically, the order as a construct is represented (much like a monoid) by
One is a *set of things* (e.g. colorful balls) which we sometimes call the order's *underlying set*.
![Balls](balls.svg)
![Balls](../04_order/balls.svg)
And the other is a *binary relation* between these things, which is often represented as a bunch of arrows.
![Binary relation](binary_relation.svg)
![Binary relation](../04_order/binary_relation.svg)
Not all binary relationships are orders - only ones that fit certain criteria that we are going to examine as we review the different types of order.
@ -27,11 +27,11 @@ Linear order
Let's start with an example - the most straightforward type of order that you can think of is a *linear order*, i.e. one in which every object has its place depending on every other object. In this case the ordering criterion is completely deterministic and leaves no room for ambiguity in terms of which element comes before which. For example, the order of colors, sorted by the length of their light-waves (or by how they appear in the rainbow).
![Linear order](linear_order.svg)
![Linear order](../04_order/linear_order.svg)
Using set theory, we can represent this order, as well as any other order, as a subset of the *Cartesian product* of the order's underlying set with itself.
![Binary relation as a product](binary_relation_product.svg)
![Binary relation as a product](../04_order/binary_relation_product.svg)
And in programming, orders are defined by providing a function which, given two objects, tells us which one of them is "bigger" (comes first) and which one is "smaller". It isn't hard to see that such a function is just another way of defining that subset of the Cartesian product.
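A small Python sketch of this point (the color names are just an example): the comparison function and the set of pairs it accepts are two views of the same relation.

```python
from itertools import product

# A sketch: a comparison function carves an order out of the cartesian
# product of a set with itself - the order is exactly the set of pairs
# for which the function returns True.
colors = ['red', 'orange', 'yellow', 'green', 'blue']  # sorted by wavelength
leq = lambda a, b: colors.index(a) <= colors.index(b)

relation = {(a, b) for a, b in product(colors, colors) if leq(a, b)}
print(('red', 'blue') in relation)  # True
print(('blue', 'red') in relation)  # False
```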
@ -54,7 +54,7 @@ Reflexivity
Let's get the most boring law out of the way - each object has to be bigger or equal to itself, or $a ≤ a$ for all $a$ (the relationship between elements in an order is commonly denoted as $≤$ in formulas, but it can also be represented with an arrow from the first object to the second.)
![Reflexivity](reflexivity.svg)
![Reflexivity](../04_order/reflexivity.svg)
There is no special reason for this law to exist, except that the "base case" should be covered somehow.
@ -65,7 +65,7 @@ Transitivity
The second law is maybe the least obvious (but probably the most essential) - it states that if object $a$ is bigger than object $b$, it is automatically bigger than all objects that are smaller than $b$, or $a ≤ b \land b ≤ c \to a ≤ c$.
![Transitivity](transitivity.svg)
![Transitivity](../04_order/transitivity.svg)
This is the law that to a large extent defines what an order is: if I am better at playing soccer than my grandmother, then I would also be better at it than my grandmother's friend, whom she beats - otherwise I wouldn't really be better than her.
@ -74,7 +74,7 @@ Antisymmetry
The third law is called antisymmetry. It states that the function that defines the order should not give contradictory results (or in other words you have $x ≤ y$ and $y ≤ x$ only if $x = y$).
![antisymmetry](antisymmetry.svg)
![antisymmetry](../04_order/antisymmetry.svg)
It also means that no ties are permitted - either I am better than my grandmother at soccer or she is better at it than me.
@ -85,7 +85,7 @@ The last law is called *totality* (or *connexity*) and it mandates that all elem
By the way, this law makes the reflexivity law redundant, as reflexivity is just a special case of totality when $a$ and $b$ are one and the same object, but I still want to present it for reasons that will become apparent soon.
![connexity](connexity.svg)
![connexity](../04_order/connexity.svg)
Actually, here are the reasons: this law does not look as "set in stone" as the rest of them, i.e. we can probably think of some situations in which it does not apply. For example, if we aim to order all people based on soccer skills, there are many ways in which we can rank a person compared to their friends, their friends' friends etc., but there isn't a way to order groups of people who never played with one another.
@ -102,19 +102,19 @@ The order of natural numbers
Natural numbers form a linear order under the operation *bigger or equal to* (the symbol of which we have been using in our formulas.)
![numbers](numbers.svg)
![numbers](../04_order/numbers.svg)
In many ways, numbers are the quintessential order - every finite linear order of objects is isomorphic to a subset of the order of numbers, as we can map the first element of any such order to the number $1$, the second one to the number $2$ etc. (and we can do the opposite operation as well).
If we think about it, this isomorphism is actually closer to the everyday notion of a linear order than the one defined by the laws - when most people think of order, they aren't thinking of a *transitive, antisymmetric* and *total* relation, but are rather thinking about criteria based on which they can decide which object comes first, which comes second etc. So it's important to notice that the two are equivalent.
![Linear order isomorphisms](linear_order_isomorphism.svg)
![Linear order isomorphisms](../04_order/linear_order_isomorphism.svg)
From the fact that any finite linear order of objects is isomorphic to a subset of the natural numbers, it also follows that all linear orders of the same magnitude are isomorphic to one another.
So, the linear order is simple, but it is also (and I think that this isomorphism proves it) the most *boring* order ever, especially when looked at from a category-theoretic viewpoint - all finite linear orders (and most infinite ones) are just isomorphic to the natural numbers and so all of their diagrams look the same way.
![Linear order (general)](general_linear_order.svg)
![Linear order (general)](../04_order/general_linear_order.svg)
However, this is not the case with partial orders that we will look into next.
@ -129,11 +129,11 @@ Partial orders are also related to the concept of an *equivalence relations* tha
If we revisit the example of the soccer players rank list, we can see that the first version that includes just **m**yself, my **g**randmother and her **f**riend is a linear order.
![Linear soccer player order](player_order_linear.svg)
![Linear soccer player order](../04_order/player_order_linear.svg)
However, including this **o**ther person, whom none of us has played yet, makes the hierarchy non-linear, i.e. a partial order.
![Soccer player order - leftover element](player_order_leftover.svg)
![Soccer player order - leftover element](../04_order/player_order_leftover.svg)
This is the main difference between partial and total orders - partial orders cannot provide us with a definite answer to the question of who is better than whom. But sometimes this is what we need - in sports, as well as in other domains, there isn't always an appropriate way to rate people linearly.
@ -142,13 +142,13 @@ Chains
Earlier we said that all linear orders can be represented by the same chain-like diagram. We can reverse this statement and say that any diagram which looks different from that one represents a partial order. An example of this is a partial order that contains a bunch of linearly-ordered subsets, e.g. in our soccer example we can have separate groups of friends who play together and are ranked with each other, but not with anyone from other groups.
![Soccer order - two hierarchies](player_order_two.svg)
![Soccer order - two hierarchies](../04_order/player_order_two.svg)
The different linear orders that make up the partial order are called *chains*. There are two chains in this diagram: $m \to g \to f$ and $d \to o$.
The chains in an order don't have to be completely disconnected from each other in order for it to be partial. They can be connected, as long as the connections are not all *one-to-one*, i.e. ones where the last element from one chain is connected to the first element of the other one (as this would effectively unite them into one chain.)
![Soccer order - two hierarchies and a join](player_order_two_join.svg)
![Soccer order - two hierarchies and a join](../04_order/player_order_two_join.svg)
The above set is not linearly-ordered. This is because, although we know that $d ≤ g$ and that $f ≤ g$, the relationship between $d$ and $f$ is *not* known - either one could be bigger than the other.
@ -159,11 +159,11 @@ Although partial orders don't give us a definitive answer to "Who is better than
We call such an element the *greatest element*. Some (but not all) partial orders have such an element - in our last diagram $m$ is the greatest element; in the following diagram, the green element is the greatest.
![Join diagram with one more element](join_additional_element.svg)
![Join diagram with one more element](../04_order/join_additional_element.svg)
Sometimes there is more than one element that has no elements bigger than it; in this case none of them is the greatest.
![A diagram with no greatest element](non_maximal_element.svg)
![A diagram with no greatest element](../04_order/non_maximal_element.svg)
In addition to the greatest element, a partial order may also have a least (smallest) element, which is defined in the same way.
@ -172,11 +172,11 @@ Joins
The *least upper bound* of two elements that are connected as part of an order is called the *join* of these elements, e.g. the green element is the join of the other two.
![Join](join.svg)
![Join](../04_order/join.svg)
There can be multiple elements bigger than $a$ and $b$ (all elements that are bigger than $c$ are also bigger than $a$ and $b$), but only one of them is a join. Formally, the join of $a$ and $b$ is defined as the smallest element that is bigger than both $a$ and $b$ (i.e. smallest $c$ for which $a ≤ c$, and $b ≤ c$.)
![Join with other elements](join_other_elements.svg)
![Join with other elements](../04_order/join_other_elements.svg)
Given any two elements in which one is bigger than the other (e.g. $a ≤ b$), the join is the bigger element (in this case $b$).
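Here is a sketch of this definition in Python - assuming the order is given explicitly as a set of $a ≤ b$ pairs, with reflexivity and transitivity already applied: we collect the common upper bounds and look for the one that is below all the others.

```python
# A sketch of computing the join (least upper bound) of two elements
# in a finite order, given the relation as a set of (smaller, bigger) pairs.
def join(relation, elements, a, b):
    upper = [c for c in elements
             if (a, c) in relation and (b, c) in relation]
    # the join is the upper bound that is below every other upper bound
    candidates = [c for c in upper
                  if all((c, d) in relation for d in upper)]
    return candidates[0] if len(candidates) == 1 else None  # None: no join

elements = ['a', 'b', 'c']
relation = {('a', 'a'), ('b', 'b'), ('c', 'c'), ('a', 'c'), ('b', 'c')}
print(join(relation, elements, 'a', 'b'))  # 'c'
```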
@ -184,11 +184,11 @@ In a linear orders, the *join* of any two elements is just the bigger element.
Like with the greatest element, if two elements have several upper bounds that are equally big, then none of them is a *join* (a join must be unique).
![A non-join diagram](non_join.svg)
![A non-join diagram](../04_order/non_join.svg)
If, however, one of those elements is established as smaller than the rest of them, it immediately qualifies as the join.
![A join diagram](non_join_fix.svg)
![A join diagram](../04_order/non_join_fix.svg)
**Question:** Which concept in category theory reminds you of joins?
@ -197,7 +197,7 @@ Meets
Given two elements, the biggest element that is smaller than both of them is called the *meet* of these elements.
![Meet](meet.svg)
![Meet](../04_order/meet.svg)
The same rules as for the joins apply.
@ -208,7 +208,7 @@ The diagrams that we use in this section are called "Hasse diagrams" and they wo
In terms of arrows, the rule means that if you add an arrow to a point, the point *to* which the arrow points must always be above the one *from* which it points.
![A join diagram](hasse.svg)
![A join diagram](../04_order/hasse.svg)
This arrangement allows us to compare any two points by just seeing which one is above the other, e.g. we can determine the *join* of two elements by identifying the elements that they both connect to and seeing which one is lowest.
@ -220,45 +220,45 @@ We all know many examples of total orders (any form of chart or ranking is a tot
To stay true to our form, let's revisit our color-mixing monoid and create a *color-mixing partial order* in which all colors point to colors that contain them.
![A color mixing poset](color_mixing_poset.svg)
![A color mixing poset](../04_order/color_mixing_poset.svg)
If you go through it, you will notice that the join of any two colors is the color that they make up when mixed. Nice, right?
![Join in a color mixing poset](color_mixing_poset_join.svg)
![Join in a color mixing poset](../04_order/color_mixing_poset_join.svg)
Numbers by division
---
We saw that when we order numbers by "bigger or equal to", they form a linear order (*the* linear order even.) But numbers can also form a partial order, for example they form a partial order if we order them by which divides which, i.e. if $a$ divides $b$, then $a$ is before $b$ e.g. because $2 \times 5 = 10$, $2$ and $5$ come before $10$ (but $3$, for example, does not come before $10$.)
![Divides poset](divides_poset.svg)
![Divides poset](../04_order/divides_poset.svg)
And it so happens (actually for very good reason) that the join operation again corresponds to an operation that is relevant in the context of the objects - the join of two numbers in this partial order is their *least common multiple*.
And the *meet* (the opposite of join) of two numbers is their *greatest common divisor*.
![Divides poset](divides_poset_meet.svg)
![Divides poset](../04_order/divides_poset_meet.svg)
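In Python these two operations are a short sketch away (we derive lcm from gcd here, so the example also runs on versions that predate `math.lcm`):

```python
from math import gcd

# A sketch: in the divisibility order, the join of two numbers is their
# least common multiple, and their meet is their greatest common divisor.
def lcm(a, b):
    return a * b // gcd(a, b)

print(lcm(2, 5))    # 10 - the join of 2 and 5
print(gcd(10, 15))  # 5  - the meet of 10 and 15
```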
Inclusion order
---
Given the collection of all possible sets that contain a combination of a given set of elements...
![A color mixing poset, ordered by inclusion](color_mixing_poset_inclusion_subsets.svg)
![A color mixing poset, ordered by inclusion](../04_order/color_mixing_poset_inclusion_subsets.svg)
...we can define what is called the *inclusion order* of those sets, in which $a$ comes before $b$ if $a$ *includes* $b$, or in other words if $b$ is a *subset* of $a$.
![A color mixing poset, ordered by inclusion](color_mixing_poset_inclusion.svg)
![A color mixing poset, ordered by inclusion](../04_order/color_mixing_poset_inclusion.svg)
In this case the *join* operation of two sets is their *union*, and the *meet* operation is their set *intersection*.
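A Python sketch of the same, using frozensets of color names (our example elements):

```python
# A sketch of the inclusion order on sets: the join of two sets is
# their union, and their meet is their intersection.
a = frozenset({'red', 'yellow'})
b = frozenset({'yellow', 'blue'})

print(a <= b)  # False - neither set includes the other
print(a | b)   # their join: frozenset({'red', 'yellow', 'blue'})
print(a & b)   # their meet: frozenset({'yellow'})
```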
This diagram might remind you of something - if we take the colors that are contained in each set and mix them into one color, we get the color-mixing partial order that we saw earlier.
![A color mixing poset, ordered by inclusion](color_mixing_poset_blend.svg)
![A color mixing poset, ordered by inclusion](../04_order/color_mixing_poset_blend.svg)
The divisibility order example is also isomorphic to an inclusion order, namely the inclusion order of all possible sets of *prime* numbers, including repeating ones (or alternatively, the set of all *prime powers*). This is confirmed by the fundamental theorem of arithmetic, which states that every number can be written as a product of primes in exactly one way.
![Divides poset](divides_poset_inclusion.svg)
![Divides poset](../04_order/divides_poset_inclusion.svg)
Order isomorphisms
---
@ -268,7 +268,7 @@ We mentioned order isomorphisms several times already so this is about time to e
- One function from the prime inclusion order to the number order (which in this case is just the *multiplication* of all the elements in the set)
- One function from the number order to the prime inclusion order (which is an operation called *prime factorization* of a number, consisting of finding the set of prime numbers that result in that number when multiplied with one another).
![Divides poset](divides_poset_isomorphism.svg)
![Divides poset](../04_order/divides_poset_isomorphism.svg)
When we talk about sets, an isomorphism means just a reversible function. But as orders, besides having their underlying sets, also have the arrows that connect them, there is an additional requirement for a pair of functions to form an isomorphism - to be an isomorphism, a function has to *respect those arrows*, in other words it should be *order preserving*. More specifically, applying the function (let's call it $F$) to any two elements in one set ($a$ and $b$) should result in two elements that have the same corresponding order in the other set (so $a ≤ b$ if and only if $F(a) ≤ F(b)$).
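Here is a sketch of the two functions from the list above in Python (the factorization algorithm is naive trial division, just for illustration):

```python
from collections import Counter
from math import prod  # Python 3.8+

# A sketch of the pair of functions that forms the isomorphism between
# the divisibility order and the inclusion order of multisets of primes.
def factorize(n):
    """Prime factorization as a multiset of primes (naive trial division)."""
    factors, d = Counter(), 2
    while n > 1:
        while n % d == 0:
            factors[d] += 1
            n //= d
        d += 1
    return factors

def multiply(factors):
    """The inverse function: multiply the primes back together."""
    return prod(p ** k for p, k in factors.items())

print(factorize(12))            # Counter({2: 2, 3: 1})
print(multiply(factorize(12)))  # 12 - the two functions cancel each other out
```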
@ -294,7 +294,7 @@ We will now review the orders for which Birkhoff's theorem applies i.e. the *lat
Most partial orders that are created based on some sort of rule are distributive lattices - the partial orders from the previous section, for example, are also distributive lattices when they are drawn in full, as with the color-mixing order below.
![A color mixing lattice](color_mixing_lattice.svg)
![A color mixing lattice](../04_order/color_mixing_lattice.svg)
Notice that we added the black ball at the top and the white one at the bottom. We did that because otherwise the top three elements wouldn't have a *join* element, and the bottom three wouldn't have a *meet*.
@ -312,15 +312,15 @@ Interlude - semilattices and trees
Lattices are partial orders that have both *join* *and* *meet* for each pair of elements. Partial orders that just have *join* (and no *meet*), or just have *meet* and no *join* are called *semilattices*. More specifically, partial orders that have *meet* for every pair of elements are called *meet-semilattices*.
![Semilattice](semilattice.svg)
![Semilattice](../04_order/semilattice.svg)
A structure that is similar to a semilattice (and is probably more famous than it) is the *tree*.
![Tree](tree.svg)
![Tree](../04_order/tree.svg)
The difference between the two is small but crucial: in a tree, each element can have multiple elements connected *to* it, but can itself only be connected to just one other element. If we represent a tree as an inclusion order, each set would "belong" in only one superset, whereas with semilattices there would be no such restrictions.
![Tree and semilattice compared](semilattice_tree.svg)
![Tree and semilattice compared](../04_order/semilattice_tree.svg)
<!-- TODO add a similar diagram for posets and total orders -->
@ -353,11 +353,11 @@ In the previous section, we saw how removing the law of *totality* from the laws
The result is a structure called a *preorder*, which is not exactly an order - it can have arrows coming from any point to any other: if a partial order can be used to model who is better than whom at soccer, then a preorder can be used to model who has beaten whom, either directly (by playing them) or indirectly.
![preorder](preorder.svg)
![preorder](../04_order/preorder.svg)
Preorders have just one law - *transitivity*: $a ≤ b \land b ≤ c \to a ≤ c$ (two, if we count *reflexivity*). The part about the indirect wins is a result of this law. Due to it, all indirect wins (ones that are not against the player directly, but against someone who has beaten them) are added as a direct result of its application, as seen here (we show indirect wins in a lighter tone).
![preorder in sport](preorder_sports.svg)
![preorder in sport](../04_order/preorder_sports.svg)
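Computing these indirect wins is a tiny fixpoint algorithm - here is a Python sketch (the player names are just an illustration):

```python
# A sketch of how the transitivity law generates the indirect wins:
# the transitive closure of the "has beaten" relation.
def transitive_closure(relation):
    closure = set(relation)
    while True:
        new = {(a, d) for (a, b) in closure for (c, d) in closure if b == c}
        if new <= closure:       # nothing new was derived - we are done
            return closure
        closure |= new

direct_wins = {('me', 'grandma'), ('grandma', 'friend')}
print(transitive_closure(direct_wins))
# {('me', 'grandma'), ('grandma', 'friend'), ('me', 'friend')}
```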
And as a result of that, all "circle" relationships (e.g. where you have a weaker player beating a stronger one) result in just a bunch of objects that are all connected to one another.
@ -376,11 +376,11 @@ Preorders may be viewed as a middle-ground between *partial orders* and *equival
In particular, any subset of objects that are connected with one another both ways (like in the example above) follows the *symmetry* requirement. So if we group all elements that are connected in this way, we would get a bunch of sets, each of which defines an *equivalence relation* based on the preorder; these sets are called the preorder's *equivalence classes*.
![preorder](preorder_equivalence.svg)
![preorder](../04_order/preorder_equivalence.svg)
And, even more interestingly, if we transfer the preorder connections between the elements of these sets to connections between the sets themselves, these connections would follow the *antisymmetry* requirement, which means that they would form a *partial order*.
![preorder](preorder_partial_order.svg)
![preorder](../04_order/preorder_partial_order.svg)
In short, for every preorder, we can define the *partial order of the equivalence classes of this preorder*.
@ -391,11 +391,11 @@ Maps as preorders
We use maps to get around all the time, often without thinking about the fact that they are actually diagrams. More specifically, some of them are preorders - the objects represent cities or intersections, and the relations represent the roads.
![A map as a preorder](preorder_map.svg)
![A map as a preorder](../04_order/preorder_map.svg)
Transitivity reflects the fact that if you have a route allowing you to get from point $a$ to point $b$ and one that allows you to go from $b$ to $c$, then you can go from $a$ to $c$ as well. Two-way roads may be represented by two arrows that form an isomorphism between objects. Objects that are such that you can always get from one object to the other form equivalence classes (ideally, all intersections would be in one equivalence class).
![preorder](preorder_map_equivalence.svg)
![preorder](../04_order/preorder_map_equivalence.svg)
However, maps that contain more than one road (and even more than one *route*) connecting two intersections cannot be represented using preorders. For that we would need categories (don't worry, we are almost there.)
@ -404,7 +404,7 @@ State machines as preorders
Let's now reformat the preorder that we used in the previous two examples as a Hasse diagram that goes from left to right. Now it (hopefully) doesn't look so much like a hierarchy, nor like a map, but like a description of a process (which, if you think about it, is also a map - just one that is temporal rather than spatial.) This is actually a very good way to describe a model of computation known as a *finite state machine*.
![A state machine as a preorder](preorder_state_machine.svg)
![A state machine as a preorder](../04_order/preorder_state_machine.svg)
A specification of a finite state machine consists of a set of states that the machine can have, which, as the name suggests, must be finite, and a bunch of transition functions that specify which state we transition to (often expressed as tables.)
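Here is a minimal sketch of such a specification in Python - the states and inputs (a coin-operated turnstile) are a classic illustration, not an example from this book:

```python
# A sketch of a finite state machine: a finite set of states and a
# transition table, here as a dictionary from (state, input) to state.
transitions = {
    ('locked', 'coin'): 'unlocked',
    ('locked', 'push'): 'locked',
    ('unlocked', 'push'): 'locked',
    ('unlocked', 'coin'): 'unlocked',
}

def run(state, inputs):
    for symbol in inputs:
        state = transitions[(state, symbol)]
    return state

print(run('locked', ['coin', 'push', 'push']))  # locked
```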
@ -446,11 +446,11 @@ Orders as categories
We saw that preorders are a powerful concept, so let's take a deeper look at the law that governs them - the transitivity law. What this law tells us is that if we have two relationships $a ≤ b$ and $b ≤ c$, then we automatically have a third one, $a ≤ c$.
![Transitivity](transitivity.svg)
![Transitivity](../04_order/transitivity.svg)
In other words, the transitivity law tells us that the $≤$ relationship composes i.e. if we view the "bigger than" relationship as a morphism we would see that the law of transitivity is actually the categorical definition of *composition*.
![Transitivity as functional composition](transitivity_composition.svg)
![Transitivity as functional composition](../04_order/transitivity_composition.svg)
(we also have to verify that this composition is associative, but that's easy)
@ -462,7 +462,7 @@ So let's review the definition of a category again.
Looks like we have law number 2 covered. What about that other one - the identity law? We have it too, under the name *reflexivity*.
![Reflexivity](reflexivity.svg)
![Reflexivity](../04_order/reflexivity.svg)
So it's official - preorders are categories (sounds kinda obvious, especially after we also saw that orders can be reduced to sets and functions using the inclusion order, and sets and functions form a category in their own right.)
@ -470,11 +470,11 @@ And since partial orders and total orders are preorders too, they are categories
When we compare the categories of orders to other categories, like the quintessential category of sets, we see one thing that immediately sets them apart: in other categories there can be *many different morphisms (arrows)* between two objects, while orders can have *at most one morphism* - that is, we either have $a ≤ b$ or we do not.
![Orders compared to other categories](arrows_one_arrow.svg)
![Orders compared to other categories](../04_order/arrows_one_arrow.svg)
In contrast, in the category of sets there is a potentially infinite number of functions from, say, the set of integers to the set of boolean values, as well as a lot of functions that go the other way around, and the existence of either of these functions does not imply that one set is "bigger" than the other one.
![Orders compared to other categories](order_category.svg)
![Orders compared to other categories](../04_order/order_category.svg)
Note that, although two objects in an order might be directly connected by just one arrow, they might still be *indirectly* connected by more than one arrow. So when we define an order in a categorical way, it's crucial to specify that *these ways are equivalent*, i.e. that all diagrams that show orders commute.
@ -483,11 +483,11 @@ Products and sums
While we are rehashing diagrams from the previous chapters, let's look at the diagram defining the *coproduct* of two objects in a category, from chapter 2.
![Joins as coproduct](coproduct_join.svg)
![Joins as coproduct](../04_order/coproduct_join.svg)
If you recall, this is an operation that corresponds to *set inclusion* in the category of sets.
![Joins as coproduct](coproduct_inclusion.svg)
![Joins as coproduct](../04_order/coproduct_inclusion.svg)
But wait, wasn't there something else that corresponded to set inclusion - oh yes, the *join* operation in orders. And not merely that, but joins are defined in the exact same way as categorical coproducts.
@ -496,7 +496,7 @@ In category theory, an object $G$ is the coproduct of objects $Y$ and $B$ if the
1. We have a morphism from any of the elements of the coproduct to the coproduct, so $Y → G$ and $B → G$.
2. For any other object $P$ that also has those morphisms (so $Y → P$ and $B → P$) we would have a morphism $G → P$.
![Joins as coproduct](coproduct_morphisms.svg)
![Joins as coproduct](../04_order/coproduct_morphisms.svg)
In the realm of orders, we say that $G$ is the *join* of objects $Y$ and $B$ if:
@ -504,7 +504,7 @@ In the realm of orders, we say that $G$ is the *join* of objects $Y$ and $B$ if:
2. It is smaller than any other object that is bigger than them, so for any other object $P$ such that $Y ≤ P$ and $B ≤ P$, we should also have $G ≤ P$.
![Joins as coproduct](coproduct_join_morphisms.svg)
![Joins as coproduct](../04_order/coproduct_join_morphisms.svg)
We can see that the two definitions and their diagrams are the same. So, speaking in category theoretic terms, we can say that the *categorical coproduct* in the category of orders is the *join* operation. Which of course means that *products* correspond to *meets*.
@ -29,7 +29,7 @@ Primary propositions
A consequence of logic being the science of the possible is that in order to do anything at all in it, we should have an initial set of propositions that we accept as true or false. These are also called "premises", "primary propositions" or "atomic propositions" as Wittgenstein dubbed them.
![Balls](balls.svg)
![Balls](../05_logic/balls.svg)
In the context of logic itself, these propositions are abstracted away (i.e. we are not concerned about them directly) and so they can be represented with the colorful balls that you are familiar with.
@ -38,11 +38,11 @@ Composing propositions
At the heart of logic, as in category theory, is the concept of *composition* - if we have two or more propositions that are somehow related to one another, we can combine them into one using logical operators, like "and", "or", "follows" etc. The result would be a new proposition, not unlike the way in which two monoid objects are combined into one using the monoid operation. And actually, some logical operations do form monoids, like for example the operation *and*, with the proposition $true$ serving as the identity element.
![Logical operations that form monoids](logic_monoid.svg)
![Logical operations that form monoids](../05_logic/logic_monoid.svg)
However, unlike monoids/groups, logics have not one but *many* logical operations, and logic studies *the ways in which they relate to one another* - for example, in logic we might be interested in the law of distributivity of the *and* and *or* operations and what it entails.
![The distributivity operation of "and" and "or"](logic_distributivity.svg)
![The distributivity operation of "and" and "or"](../05_logic/logic_distributivity.svg)
It is important to note that $∧$ is the symbol for *and* and $∨$ is the symbol for *or* (although the law above is actually valid even if *and* and *or* are flipped).
@ -51,7 +51,7 @@ The equivalence of primary and composite propositions
When looking at the last diagram, it is important to stress that, although in the leftmost proposition the green ball is wrapped in a gray ball to make the diagram prettier, propositions that are composed of several premises (symbolized by gray balls, containing some other balls) are not in any way different from "primary" propositions (single-color balls) and that they compose in the same way.
![Balls as propositions](balls_propositions.svg)
![Balls as propositions](../05_logic/balls_propositions.svg)
Modus ponens
---
@ -60,7 +60,7 @@ As an example of a proposition that contains multiple levels of nesting (and als
Modus ponens is a proposition that states that if proposition $A$ is true and also if proposition $(A → B)$ is true (that is if $A$ implies $B$), then $B$ is true as well. For example, if we know that "Socrates is a human" and that "humans are mortal" (or "being human implies being mortal"), we also know that "Socrates is mortal."
![Modus ponens](modus_ponens.svg)
![Modus ponens](../05_logic/modus_ponens.svg)
Let's dive into it. The proposition is composed of two other propositions in a $follows$ relation, where the proposition that follows ($B$) is primary, but the proposition from which $B$ follows is not primary (let's call that one $C$ - so the whole proposition becomes $C → B$.)
@ -73,17 +73,17 @@ We often cannot tell whether a given composite proposition is true or false with
Our previous example, for instance, will not stop being true if we substitute "Socrates" with any other name, nor if we substitute "mortal" for any other quality that humans possess.
![Variation of modus ponens](modus_ponens_variations.svg)
![Variation of modus ponens](../05_logic/modus_ponens_variations.svg)
Propositions that are always true are called *tautologies*. And their more-famous counterparts that are always false are called *contradictions*. You can turn each tautology into a contradiction, or the other way around, by adding a "not".
The simplest tautology is the statement that each proposition implies itself (e.g. "All bachelors are unmarried"). It may remind you of something.
![Identity tautology](tautology_identity.svg)
![Identity tautology](../05_logic/tautology_identity.svg)
Here are some more complex (less boring) tautologies (the symbol $¬$ means "not"/negation.)
![Tautologies](tautology_list.svg)
![Tautologies](../05_logic/tautology_list.svg)
We will learn how to determine which propositions are tautologies shortly, but first let's see why this is important at all, i.e. what tautologies are good for: tautologies are useful because they are the basis of *axiom schemas*/*rules of inference*. And *axiom schemas* or *rules of inference* serve as a starting point from which we can generate other true logical statements by means of substitution.
@ -92,13 +92,13 @@ Axiom schemas/Rules of inference
Realizing that the colors of the balls in modus ponens are superficial, we may want to represent the general structure of modus ponens that all of its variations share.
![Modus ponens](modus_ponens_schema.svg)
![Modus ponens](../05_logic/modus_ponens_schema.svg)
This structure (the one that looks like a coloring book in our example) is called an *axiom schema*. And the propositions that are produced by it are called *axioms*.
Note that the propositions that we plug into the schema don't have to be primary. For example, having the proposition $a$ (that is symbolized below by the orange ball) and the proposition stating that $a$ implies $a \lor b$ (which is one of the tautologies that we saw above), we can plug those propositions into the *modus ponens* and prove that $a \lor b$ is true.
![Using modus ponens for rule of inference](modus_ponens_composite.svg)
![Using modus ponens for rule of inference](../05_logic/modus_ponens_composite.svg)
*Axiom schemas* and *rules of inference* are almost the same thing, except that rules of inference allow us to actually distill the conclusion from the premises. For example, in the case above, we can use modus ponens as a rule of inference to prove that $a \lor b$ is true.
@ -111,7 +111,7 @@ Knowing that we can use axiom schemas/rules of inference to generate new proposi
Here is one such collection, which consists of the following five axiom schemas *in addition to the inference rule modus ponens* (these are axiom schemas, even though we use colors).
![A minimal collection of Hilbert axioms](min_hilbert.svg)
![A minimal collection of Hilbert axioms](../05_logic/min_hilbert.svg)
Proving that this and other similar logical systems are complete (can really generate all other propositions) is due to Gödel and is known as "Gödel's completeness theorem" (Gödel is so important that I specifically searched for the "ö" letter so I can spell his name right.)
@ -131,7 +131,7 @@ The above is a summary of a worldview that is due to the Greek philosopher Plato
The existence of the world of forms implies that, even if there are many things that we, people, don't know and would not ever know, at least *somewhere out there* there exists an answer to every question. In logic, this translates to *the principle of bivalence* that states that *each proposition is either true or false*. And, due to this principle, propositions in classical logic can be aptly represented in set theory by the boolean set, which contains those two values.
![The set of boolean values](boolean_set.svg)
![The set of boolean values](../05_logic/boolean_set.svg)
According to the classical interpretation, you can think of *primary propositions* as just a bunch of boolean values, *logical operators* as functions that take one or several boolean values and return another boolean value (and *composite propositions* as just the results of the application of these functions).
@ -142,7 +142,7 @@ The *negation* operation
Let's begin with the negation operation. Negation is a unary operation, which means that it is a function that takes just *one* argument and (like all other logical operators) returns one value, where both the argument and the return value are boolean values.
![negation](negation.svg)
![negation](../05_logic/negation.svg)
The same function can also be expressed in a slightly less-fancy way by this table.
@ -158,17 +158,17 @@ Interlude: Proving results by truth tables
Having defined the negation operator, we are in a position to prove the first of the axioms of the logical system we saw, namely *double negation elimination*. In natural language, this axiom is equivalent to the observation that saying "I am *not unable* to do X" is the same as saying "I am *able* to do it".
![Double negation elimination formula](double_negation_formula.svg)
![Double negation elimination formula](../05_logic/double_negation_formula.svg)
(Despite its triviality, the double negation axiom is probably the most controversial result in logic - we will see why later.)
If we view logical operators as functions from and to the set of boolean values, then proving axioms involves composing several of those functions into one function and observing its output. More specifically, the proof of the formula above involves just composing the negation function with itself and verifying that it leaves us in the same place from which we started.
![Double negation elimination](double_negation_proof.svg)
![Double negation elimination](../05_logic/double_negation_proof.svg)
If we want to be formal about it, we might say that applying negation two times is equivalent to applying the *identity* function.
![The identity function for boolean values](boolean_identity.svg)
![The identity function for boolean values](../05_logic/boolean_identity.svg)
If we are tired of diagrams, we can represent the composition diagram above as a table as well.
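The same "proof by truth table" fits in a few lines of Python (a sketch - Python's booleans stand in for the set of truth values):

```python
# A sketch: negation as a function on boolean values, and an exhaustive
# check that double negation is the same as the identity function.
negate = lambda p: not p
identity = lambda p: p

for p in (True, False):
    assert negate(negate(p)) == identity(p)
print("double negation elimination holds")
```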
@ -186,7 +186,7 @@ OK, *you* know what *and* means and *I* know what it means, but what about those
Because *and* is a *binary* operator, instead of a single value the function would accept a *pair* of boolean values.
![And](and.svg)
![And](../05_logic/and.svg)
Here is the equivalent truth-table (in which $∧$ is the symbol for *and*.)
@ -265,7 +265,7 @@ Proving results by axioms/rules of inference
Let's examine the above formula, stating that $p → q$ is the same as $¬p ∨ q$.
![Hilbert formula](hilbert_formula.svg)
![Hilbert formula](../05_logic/hilbert_formula.svg)
We can easily prove this by using truth tables.
@ -280,7 +280,7 @@ But it would be much more intuitive if we do it using axioms and rules of infere
Here is one way to do it. The formulas that are used at each step are specified on the right-hand side; the rule of inference is modus ponens.
![Hilbert proof](hilbert_proof.svg)
![Hilbert proof](../05_logic/hilbert_proof.svg)
Note that to really prove that the two formulas are equivalent, we also have to do it the other way around (start with $¬p ∨ q$ and derive $p → q$).
@ -297,11 +297,11 @@ Classical and intuitionistic logic diverge from one another right from the start
So, intuitionistic logic is not bivalent, i.e. we cannot have all propositions reduced to true and false.
![The True/False dichotomy](true_false.svg)
![The True/False dichotomy](../05_logic/true_false.svg)
One thing that we still do have are propositions that are "true" in the sense that a proof for them is given - the primary propositions. So, with some caveats (which we will see later), the bivalence between true and false propositions might be thought of as similar to the bivalence between the existence or absence of a proof for a given proposition - there either is a proof of it or there isn't.
![The proved/unproved dichotomy](proved_unproved.svg)
![The proved/unproved dichotomy](../05_logic/proved_unproved.svg)
This bivalence is at the heart of what is called the Brouwer-Heyting-Kolmogorov (BHK) interpretation of logic, something that we will look into next.
@ -312,7 +312,7 @@ The *and* and *or* operations
As the existence of a proof of a proposition is taken to mean that the proposition is true, the definition of *and* is rather simple - the proof of $A ∧ B$ is just *a pair* containing a proof of $A$ and a proof of $B$, i.e. *a set-theoretic product* of the two (see chapter 2). The principle for determining whether the proposition is true or false is similar to that for primary propositions - if the pair of proofs of $A$ and $B$ exists (i.e. if both proofs exist), then the proof of $A \land B$ can be constructed (and so $A \land B$ is "true").
![And in the BHK interpretation](bhk_and.svg)
![And in the BHK interpretation](../05_logic/bhk_and.svg)
**Question:** what would be the **or** operation in this case?
@ -321,7 +321,7 @@ The *implies* operation
Now for the punchline: in the BHK interpretation, the *implies* operation is just a *function* between proofs. Saying that $A$ implies $B$ ($A \to B$) would just mean that there exists a function which can convert a proof of $A$ to a proof of $B$.
![Implies in the BHK interpretation](bhk_implies.svg)
![Implies in the BHK interpretation](../05_logic/bhk_implies.svg)
And the *modus ponens* rule of inference is nothing more than *function application*, i.e. if we have a proof of $A$ and a function $A \to B$, we can call this function to obtain a proof of $B$.
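For a programming-flavored sketch of this (everything here - the names and the use of plain strings as stand-ins for proofs - is just an illustration):

```python
# A sketch of the BHK reading of "implies": a proof of A → B is a
# function from proofs of A to proofs of B.
proof_of_socrates_is_human = "Socrates is a human"

def humans_are_mortal(proof_of_human):  # a "proof" of A → B
    return proof_of_human + ", hence mortal"

# Modus ponens is just function application:
proof_of_socrates_is_mortal = humans_are_mortal(proof_of_socrates_is_human)
print(proof_of_socrates_is_mortal)
```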
@ -332,7 +332,7 @@ The *if and only if* operation
In the section on classical logic, we proved that two propositions $A$ and $B$ are equivalent if $A$ implies $B$ and $B$ implies $A$. But if the *implies* operation is just a function, then propositions are equivalent precisely when there are two functions converting each of them to the other, i.e. when the sets containing the propositions are *isomorphic*.
![Implies in the BHK interpretation](bhk_iff.svg)
![Implies in the BHK interpretation](../05_logic/bhk_iff.svg)
(Perhaps we should note that *not all functions are proofs* - only a designated set of them. We say this because in set theory you can construct functions and isomorphisms between any pair of singleton sets, but that won't mean that all proofs are equivalent.)
@ -346,15 +346,15 @@ To express this, intuitionistic logic defines the constant $⊥$ which plays the
In set theory, the $⊥$ constant is expressed by the empty set.
![False in the BHK interpretation](bhk_false.svg)
![False in the BHK interpretation](../05_logic/bhk_false.svg)
And the observation that propositions that are connected to the bottom value are false is expressed by the fact that if a proposition is true, i.e. there exists a proof of it, then there can be no function from it to the empty set.
![False in the BHK interpretation](bhk_false_function.svg)
![False in the BHK interpretation](../05_logic/bhk_false_function.svg)
The only way for there to be such a function is if the set of proofs of the proposition is empty as well.
![False in the BHK interpretation](bhk_false_function_2.svg)
![False in the BHK interpretation](../05_logic/bhk_false_function_2.svg)
**Task:** Look up the definition of a function and verify that there cannot exist a function from any non-empty set *to the empty set*.
@ -365,7 +365,7 @@ Classical VS intuitionistic logic
Although at first glance intuitionistic logic seems to differ a lot from classical logic, it actually doesn't - if we try to deduce the axiom schemas/rules of inference that correspond to the definitions of the structures outlined above, we would see that they are virtually the same as the ones that define classical logic. With one exception concerning the *double negation elimination axiom* that we saw earlier, a version of which is known as *the law of excluded middle*.
![The formula of the principle of the excluded middle](excluded_middle_formula.svg)
![The formula of the principle of the excluded middle](../05_logic/excluded_middle_formula.svg)
This law is valid in classical logic and is true when we look at it in terms of truth tables, but there is no justification for it in the BHK interpretation - a fact that spawned a heated debate between the champion of classical logic David Hilbert and the inventor of intuitionistic logic L.E.J. Brouwer, known as *the Brouwer-Hilbert controversy*.
@ -381,7 +381,7 @@ The Curry-Howard isomorphism
Programmers might find the definition of the BHK interpretation interesting for another reason - it is very similar to the definition of a programming language: propositions are *types*, the *implies* operations are *functions*, *and* operations are composite types (objects), and *or* operations are *sum types* (which are currently not supported in most programming languages, but that's a separate topic.) Finally, a proof of a given proposition is represented by a value of the corresponding type.
![Logic as a programming language](logic_curry.svg)
![Logic as a programming language](../05_logic/logic_curry.svg)
This similarity is known as the *Curry-Howard isomorphism*.
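Here is a sketch of that dictionary using Python's type hints (the choice of `int` and `str` as stand-ins for propositions is arbitrary):

```python
from typing import Callable, Tuple, Union

# A sketch of the Curry-Howard dictionary: propositions are types,
# ∧ is a pair type, ∨ is a union (sum) type, → is a function type.
A, B = int, str  # two hypothetical "propositions"

AndAB = Tuple[A, B]          # a proof of A ∧ B is a pair of proofs
OrAB = Union[A, B]           # a proof of A ∨ B is a proof of either one
Implies = Callable[[A], B]   # a proof of A → B is a function

a_implies_b: Implies = lambda a: str(a)  # a value of the type = a proof
proof_of_b: B = a_implies_b(42)          # modus ponens, again as application
print(proof_of_b)
```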
@ -394,13 +394,13 @@ Knowing about the Curry-Howard isomorphism and knowing also that programming lan
Let's examine this isomorphism (without being too formal about it). Like all other isomorphisms, it comes in two parts. The first part is finding a way to convert a logical system into a category - this would not be hard for us, as sets form a category and the flavor of the BHK interpretation that we saw is based on sets.
![Logic as a category](category_curry_logic.svg)
![Logic as a category](../05_logic/category_curry_logic.svg)
**Task:** See whether you can prove that logical propositions and entailments form a category. What is missing?
The second part involves converting a category into a logical system - this is much harder. To do it, we have to enumerate the criteria that a given category has to adhere to, in order for it to be "logical". These criteria have to guarantee that the category has objects that correspond to all valid logical propositions and no objects that correspond to invalid ones.
![Logic as a category](logic_curry_category.svg)
![Logic as a category](../05_logic/logic_curry_category.svg)
Categories that adhere to these criteria are called *cartesian closed categories*. We won't describe them here directly; instead, we will start with similar but simpler structures that are instances of them and that we have already examined - orders.
@ -411,19 +411,19 @@ We will now do something that is quite characteristic of category theory - exami
So we already saw that a logical system along with a set of primary propositions forms a category.
![Logic as a preorder](logic_category.svg)
![Logic as a preorder](../05_logic/logic_category.svg)
If we assume that there is only one way to go from proposition $A$ to proposition $B$ (or that there are many ways, but we are not interested in the difference between them), then logic is not only a category, but a *preorder* in which the relationship "bigger than" is taken to mean "implies".
![Logic as a preorder](logic_preorder.svg)
![Logic as a preorder](../05_logic/logic_preorder.svg)
Furthermore, if we count propositions that follow from each other (or sets of propositions that are proven by the same proof) as equivalent, then logic is a proper *partial order*.
![Logic as an order](logic_order.svg)
![Logic as an order](../05_logic/logic_order.svg)
And so it can be represented by a Hasse diagram, yay.
![Logic as an order](logic_hasse.svg)
![Logic as an order](../05_logic/logic_hasse.svg)
Now let's examine the question that we asked before - exactly which ~~categories~~ orders represent logic, and what laws does an order have to obey in order to be isomorphic to a logical system? We will attempt to answer this question as we examine the elements of logic again, this time in the context of orders.
The and and or operations
---
By now you probably realized that the *and* and *or* operations are the bread and butter of logic (although it's not clear which is which). As we saw, in the BHK interpretation those are represented by set *products* and *sums*. The equivalent constructs in the realm of order theory are *meets* and *joins* (in category-theoretic terms *products* and *coproducts*.)
![Order meet and join](../05_logic/lattice_meet_join.svg)
Here comes the first criterion for an order to represent a logical system accurately - *it has to have $meet$ and $join$ operations for all elements*. Having two elements without a meet would mean having a logical system with propositions for which you cannot say that one or the other is true. And this is not how logic works, so our order has to have meets and joins for all elements. Incidentally, we already know what such orders are called - *lattices*.
One important law of the *and* and *or* operations that is not always present in *meets* and *joins* concerns the connection between the two, i.e. the way that they distribute over one another.
![The distributivity operation of "and" and "or"](../05_logic/logic_distributivity.svg)
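Stated as formulas (the rendering here is mine), the law says that each of the two operations distributes over the other:

$A ∧ (B ∨ C) = (A ∧ B) ∨ (A ∧ C)$

$A ∨ (B ∧ C) = (A ∨ B) ∧ (A ∨ C)$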
Lattices that obey this law are called *distributive lattices*.
Wait, where have we heard about distributive lattices before? In the previous chapter we said that they are isomorphic to *inclusion orders* i.e. orders which contain all combinations of sets of a given number of elements. The fact that they popped up again is not coincidental - "logical" orders are isomorphic to inclusion orders. To understand why, you only need to think about the BHK interpretation - the elements which participate in the inclusion are our prime propositions. And the inclusions are all combinations of these elements, in an $or$ relationship (for simplicity's sake, we are ignoring the *and* operation.)
![A color mixing poset, ordered by inclusion](../05_logic/logic_poset_inclusion.svg)
**NB:** For historical reasons, the symbols for the *and* and *or* logical operations are flipped when compared to the arrows in the diagrams - $∧$ is *and* and $∨$ is *or*.
In order for a distributive lattice to represent a logical system, it also has to have objects that correspond to the values $True$ and $False$.
A well-known result in logic, called *the principle of explosion*, states that if we have a proof of $False$ (or, in the terminology of classical logic, if "$False$ is true"), then any and every other statement can be proven. And we also know that no true statement implies $False$ (in fact, in intuitionistic logic this is the definition of a true statement). Based on these criteria, we know that the $False$ object would look like this when compared to other objects:
![False, represented as a Hasse diagram](../05_logic/lattice_false.svg)
Circling back to the BHK interpretation, we see that the empty set fits both conditions.
![False, represented as a Hasse diagram](../05_logic/lattice_false_bhk.svg)
Conversely, the proof of $True$ (or the statement that "$True$ is true") is trivial and doesn't say anything, so *nothing follows from it*, but at the same time it follows from every other statement.
![True, represented as a Hasse diagram](../05_logic/lattice_true.svg)
So $True$ and $False$ are just the *greatest* and *least* objects of our order (in category-theoretic terms, the *terminal* and *initial* objects.)
![The whole logical system, represented as a Hasse diagram](../05_logic/lattice_true_false.svg)
This is another example of the categorical concept of duality - $True$ and $False$ are dual to each other (which makes a lot of sense if you think about it.)
Finally, if a lattice really represents a logical system (that is, if it is isomorphic to one), it also has to have an object that corresponds to the result of the *implies* operation, $(A → B)$.
How would this object be described? You guessed it - using categorical language, i.e. by recognizing a structure that consists of a set of relations between objects in which $(A → B)$ plays a part.
![Implies operation](../05_logic/implies.svg)
This structure is actually a categorical reincarnation of our favorite rule of inference, *modus ponens* ($A ∧ (A → B) → B$). This rule is the essence of the *implies* operation and, because we already know how the operations that it contains (*and* and *implies*) are represented in our lattice, we can directly "categorize" it and use it as a definition, saying that $(A → B)$ is the object which is related to objects $A$ and $B$ in such a way that $A ∧ (A → B) → B$.
![Implies operation with impostors](../05_logic/implies_modus_ponens.svg)
This definition is not complete, however, because $(A → B)$ is *not the only object* that fits in this formula. For example, the set $A → B ∧ C$ is also one such object, as is $A → B ∧ C ∧ D$. So how do we set apart the real formula from all those "impostor" formulas? If you remember the definition of the *categorical product* (or of its equivalent for orders, the *meet* operation), you already know where this is going: we define the function object using a *universal property*, by recognizing that all other formulas that can be in the place of $X$ in $A ∧ X → B$ point to $(A → B)$, i.e. they are below $(A → B)$ in a Hasse diagram.
![Implies operation with universal property](../05_logic/implies_universal_property.svg)
Or, using logical terminology, we say that $A → B ∧ C$ and $A → B ∧ C ∧ D$ etc. are all "stronger" results than $(A → B)$, and so $(A → B)$ is the weakest result that fits the formula (stronger results lie lower in the diagram).
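Written as a formula (again, my rendering), the universal property says that $(A → B)$ is the *greatest* object whose meet with $A$ is below $B$:

$(A → B) = \max \lbrace X \mid A ∧ X ≤ B \rbrace$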
Without being too formal, let's try to test if this definition captures the concept of implication correctly.
For example, let's take $A$ and $B$ to be the same object. In this case, $(A → B)$ (or $(A → A)$ if you want to be pedantic) would be the topmost object $X$ for which the criterion given by the formula $A ∧ X → A$ is satisfied. But in this case the formula is *always satisfied*, as the *meet* of $A$ and any other object is always below $A$. So the formula holds for all $X$. The topmost object that fits it is, then, the topmost object out there, i.e. $True$.
![Implies identity](../05_logic/implies_identity.svg)
This corresponds to the identity axiom in logic, that states that everything follows from itself.
And by similar logic, we can easily see that if we take $A$ to be any object that is below $B$, then $(A → B)$ will also correspond to the $True$ object.
![Implies when A follows from B](../05_logic/implies_b_follows.svg)
So if $A$ implies $B$ (i.e. if we have $A → B$), then $(A → B)$ is always true.
And what if $B$ is lower than $A$? In this case the topmost object that fits the formula $A ∧ X → B$ is $B$ itself: $B$ fits the formula because the meet of two objects is always below both of them, so $A ∧ B → B$ for all $A$ and $B$. And $B$ is definitely the topmost object that can possibly fit it, as it literally sets its upper bound.
![Implies when B follows from A](../05_logic/implies_a_follows.svg)
Translated to logical language, this says that if $B → A$, then the proof of $(A → B)$ is just the proof of $B$.
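To summarize the two cases as formulas (my rendering):

$A ≤ B \implies (A → B) = True$

$B ≤ A \implies (A → B) = B$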

Functors
===
The category of sets
---
We began by reviewing the mother of all categories - *the category of sets*.
![The category of sets](../06_functors/category_sets.svg)
We also saw that it contains within itself many other categories, such as the category of types in programming languages.
Special types of categories
---
We also learned about other algebraic objects that turned out to be just *special types of categories*, like categories that have just one *object* (monoids, groups) and categories that have only one *morphism* between any two objects (preorders, partial orders.)
![Types of categories](../06_functors/category_types.svg)
Other categories
---
We also defined a lot of *categories based on different concepts*, like the ones based on logic/programming languages, but also some "less serious" ones, as for example the color-mixing partial order/category.
![Category of colors](../06_functors/category_color_mixing.svg)
Finite categories
---
And most importantly, we saw some categories that are *completely made up*, such as my soccer player hierarchy. Those are formally called *finite categories*.
![Finite categories](../06_functors/finite_categories.svg)
Although they are not useful by themselves, the idea behind them is important - we can draw any combination of points and arrows and call it a category, in the same way that we can construct a set out of every combination of objects.
For future reference, let's see some important finite categories.
The simplest category is $0$ (enjoy the minimalism of this diagram.)
![The finite category 0](../06_functors/finite_zero.svg)
The next simplest category is $1$ - it is comprised of one object and no morphisms besides its identity morphism (which, as usual, we don't draw.)
![the finite category 1](../06_functors/finite_one.svg)
If we increment the number of objects to two, we see a couple of more interesting categories, for example the category $2$, containing two objects and one morphism.
![the finite category 2](../06_functors/finite_two.svg)
**Task:** There are just two more categories that have 2 objects and at most one morphism between any two objects - draw them.
And finally the category $3$ has 3 objects and also 3 morphisms (one of which is the composition of the other two.)
![the finite category 3](../06_functors/finite_three.svg)
Categorical isomorphisms
===
Set isomorphisms
---
In chapter 1 we talked about *set isomorphisms*, which establish an equivalence between two sets. In case you have forgotten, a set isomorphism is a *two-way function* between two sets.
![Set isomorphism](../06_functors/set_isomorphism.svg)
It can alternatively be viewed as two "twin" functions, each of which equals identity when composed with the other.
Order isomorphisms
---
Then, in chapter 4, we encountered *order isomorphisms*, and we saw that they are like set isomorphisms, but with one extra condition - aside from just being there, the functions that define the isomorphism have to preserve the order of the objects, e.g. the greatest object of one order should be connected to the greatest object of the other one, the least object of one order should be connected to the least object of the other one, and the same for all objects in between.
![Order isomorphism](../06_functors/order_isomorphism.svg)
Or, more formally put: for any $a$ and $b$, if we have $a ≤ b$, we should also have $F(a) ≤ F(b)$ (and vice versa.)
Now, we will generalize the definition of an order isomorphism so that it also applies to all categories:
> Given two categories, an isomorphism between them is an invertible mapping between the underlying sets of objects, *and* an invertible mapping between the morphisms that connect them, which maps each morphism from one category to a morphism *with the same signature*.
![Category isomorphism](../06_functors/category_isomorphism.svg)
After examining this definition closely, we realize that, although it *sounds* a bit more complex (and *looks* a bit messier) than the one we have for orders, *it is actually the same thing*. It is just that the so-called "morphism mappings" between categories that have just one morphism for any two objects are trivial, and so we can omit them.
![Order isomorphism](../06_functors/category_order_isomorphism_2.svg)
**Question:** What are the morphism functions for orders?
We always map the single morphism of the source category to the single morphism of the target category.
However, when we can have more than one morphism between two given objects, we need to make sure that each morphism in the source category has a corresponding morphism in the target one, and for this reason we need not only a mapping between the categories' objects, but one between their morphisms.
![Category isomorphism](../06_functors/category_order_isomorphism.svg)
By the way, what we just did (taking a concept that is defined for a narrower structure (orders) and redefining it for a broader one (categories)) is called *generalizing* the concept.
The logician Rudolf Carnap coined the term "functor" as part of his project to formalize natural language.
In other words, a functor is a phrase that *acts as a function*, only not a function between sets, but one between *linguistic concepts* (such as time and temperature.)
![Functor, as envisioned by Rudolf Carnap.](../06_functors/functor_carnap.svg)
Later, one of the inventors of category theory, Saunders Mac Lane, borrowed the word to describe something that *acts as a function between categories*, which he defined in the following way:
> A functor between two categories (let's call them $A$ and $B$) consists of two mappings - a mapping that maps each *object* in $A$ to an object in $B$ and a mapping that maps each *morphism* between any objects in $A$ to a morphism between objects in $B$, in a way that *preserves the structure* of the category.
![Functor](../06_functors/functor.svg)
Now let's unpack this definition by going through each of its components.
Object mapping
---
In the definition above we use the word "mapping" to avoid misusing the word "function" for something that isn't exactly a function. But in this particular case, calling the mapping a function would barely be a misuse - if we forget about morphisms and treat the source and target categories as sets, the object mapping is nothing but a regular old function.
![Functor for objects](../06_functors/functor_objects.svg)
A more formal definition of object mapping involves the concept of an *underlying set* of a category: Given a category $A$, the underlying set of $A$ is a set that has the objects of $A$ as elements. Utilizing this concept, we say that the object mapping of a functor between two categories is *a function between their underlying sets*. The definition of a function is still the same:
Morphism mapping
---
The second mapping that forms the functor is a mapping between the categories' morphisms. This mapping resembles a function as well, but with the added requirement that each morphism in $A$ with a given source and target must be mapped to a morphism with the corresponding source and target in $B$, as per the object mapping.
![Functor for morphisms](../06_functors/functor_morphisms.svg)
A more formal definition of a morphism mapping involves the concept of the *homomorphism set*: this is a set that contains all morphisms that go between two given objects in a given category. When utilizing this concept, we say that a mapping between the morphisms of two categories consists of a *set of functions between their respective homomorphism sets*.
![Functor for morphisms](../06_functors/functor_morphisms_formal.svg)
Note how the concepts of *homomorphism set* and of *underlying set* allowed us to "escape" to set theory when defining categorical concepts and define everything using functions.
So these are the two mappings (one between objects and one between morphisms) that comprise a functor.
So this definition translates to the following two *functor laws*
1. Functions between morphisms should *preserve identities*, i.e. each identity morphism should be mapped to the identity morphism of the corresponding object.
![Functor](../06_functors/functor_laws_identity.svg)
2. Functors should also *preserve composition*, i.e. for any two morphisms $f$ and $g$, the image of their composition should be equal to the composition of their images in the target category, so $F(g•f) = F(g)•F(f)$.
![Functor](../06_functors/functor_laws_composition.svg)
And these laws conclude the definition of functors - a simple but, as we will see shortly, very powerful concept.
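As a quick sanity check, here is how the two laws look for the list functor in Haskell - a spot check on sample data (of my choosing), not a proof:

```haskell
main :: IO ()
main = do
  let xs = [1, 2, 3] :: [Int]
  -- law 1: identities are preserved
  print (fmap id xs == id xs)
  -- law 2: composition is preserved, F(g ∘ f) = F(g) ∘ F(f)
  print (fmap ((* 2) . (+ 1)) xs == (fmap (* 2) . fmap (+ 1)) xs)
```

Both checks print `True`.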
For example, in chapter 1 we presented the following definition of functional composition:
> The composition of two functions $f$ and $g$ is a third function $h$ defined in such a way that this diagram commutes.
![Functional composition - general definition](../06_functors/functions_compose_general.svg)
We all see the benefit of defining stuff by means of diagrams as opposed to writing lengthy definitions like
However, it (defining stuff by means of diagrams) presents a problem - definitions by diagrams are not formal, so we need a way to formalize them.
So how can we do that? One key observation is that diagrams look like finite categories; for example, the above definition looks the same as the category $3$.
![the finite category 3](../06_functors/finite_three.svg)
However, this is only part of the story, as finite categories are just structures, whereas diagrams are *signs*. They are "something by knowing which we know something more", as Peirce famously put it (or "...which can be used in order to tell a lie", in the words of Umberto Eco.)
For this reason, aside from a finite category that encodes the diagram's structure, the definition of a diagram must also include a way of "interpreting" this category in some other context, i.e. it must include a *functor*.
![diagram as a functor](../06_functors/diagram_functor.svg)
This is how the concept of functors allows us to formalize the notion of diagrams:
Maps are functors
---
Functors are sometimes called "maps" for a good reason - maps, like all other diagrams, are functors. If we consider some space, containing cities and roads that we travel by, as a category in which the cities are objects and the roads between them are morphisms, then a road map can be viewed as a category that represents some region of that space, together with a functor that maps the objects in the map to real-world objects.
![A map and a preorder of city pathways](../06_functors/preorder_map_functor.svg)
In maps, morphisms that are a result of composition are often not displayed, but we use them all the time - they are called *routes*. And the law of preserving composition tells us that every route that we create on a map corresponds to a real-world route.
![A map and a preorder of city pathways](../06_functors/preorder_map_functor_route.svg)
Notice that in order to be a functor, a map does not have to list *all* roads that exist in real life, nor *all* traveling options ("the map is not the territory"); the only requirement is that *the roads that it lists should be actual* - this is a characteristic shared by all many-to-one relationships (i.e. functions.)
We saw that, aside from being a category-theoretic concept, functors are connected to the way we use maps and diagrams.
My thesis is that to perceive the world around us, we are going through a bunch of functors that go from more raw "low-level" mental models to more abstract "high-level" ones. For example, our brain creates a functor between the category of raw sensory data that we receive from our senses and a category containing some basic model of how the world works (one that tells us where we are in space, how many objects we are seeing, etc.) Then we connect this model to another, more abstract model, which provides us with a higher-level view of the situation that we are in, and so on.
![Perception is functorial](../06_functors/chain.svg)
You can view this as a progression from simpler to more abstract - from categories with fewer morphisms to categories with more morphisms. We start with the category of pieces of sensory data that have no connections between one another, and proceed to another category where some of these pieces of data are connected. Then, we transfer this structure into another category with even more connections.
![Perception is functorial](../06_functors/logic_thought.svg)
All this is, of course, just speculation, but you might convince yourself that there is some basis for it, especially after seeing how significant functors are for the mathematical structures that we saw before.
Hey, do you know that in group theory, there is this cool thing called *group homomorphisms*?
So, for example, if the time of day right now is 00:00 o'clock (or 12 AM), then what would the time be after $n$ hours? The answer to this question can be expressed as a function with the set of integers as source and target.
![Group homomorphism as a function](../06_functors/group_homomorphism_function.svg)
This function is interesting - it preserves the operation of (modular) addition: if 13 hours from now the time will be 1 o'clock, and 14 hours from now it will be 2 o'clock, then the time after (13 + 14) hours will be (1 + 2) o'clock.
![Group homomorphism](../06_functors/group_homomorphism.svg)
Or, to put it formally, if we call it (the function) $F$, then we have the following equation: $F(a + b) = F(a) + F(b)$ (where the $+$ on the right-hand side of the equation means modular addition.) Because this equation holds, the $F$ function is a *group homomorphism* between the group of integers under addition and the group of modular arithmetic with base 12 under modular addition (where you can replace 12 with any other number.)
The groups don't have to be so similar for there to be a homomorphism between them. Take, for example, the function that maps any number $n$ to $2$ to the power of $n$, so $n \to 2ⁿ$ (here, again, you can replace 2 with any other number.) This function gives rise to a group homomorphism between the group of integers under addition and the integers under multiplication, or $F(a + b) = F(a) \times F(b)$.
![Group homomorphism between different groups](../06_functors/group_homomorphism_addition_multiplication.svg)
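Both homomorphisms can be spot-checked in Haskell (a sketch of mine; the 12-hour clock provides the modulus, and the sample inputs are arbitrary):

```haskell
-- F(n) tells us what the clock shows n hours after 00:00
toClock :: Integer -> Integer
toClock n = n `mod` 12

-- G(n) = 2^n maps addition to multiplication
toPower :: Integer -> Integer
toPower n = 2 ^ n

main :: IO ()
main = do
  -- F(a + b) = F(a) + F(b), where + on the right is modular addition
  print (toClock (13 + 14) == (toClock 13 + toClock 14) `mod` 12)
  -- G(a + b) = G(a) * G(b)
  print (toPower (3 + 4) == toPower 3 * toPower 4)
```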
Wait, what were we talking about, again? Oh yeah - group homomorphisms are functors. To see why, we switch to the category-theoretic representation of groups and revisit our first example (to make the diagram simpler, we use $mod_2$ instead of $mod_{12}$.)
![Group homomorphism as a functor](../06_functors/group_homomorphism_functor.svg)
It seems that when we view groups/monoids as one-object categories, a group/monoid homomorphism is just a functor between these categories. Let's see if that is the case.
And now let's talk about a concept that is completely unrelated to functors, nudge nudge - *monotonic functions*.
For example, the function that maps the current time to the distance traveled by some object is monotonic because the distance traveled increases (or stays the same) as time increases.
![A monotonic function](../06_functors/monotone_map.svg)
If we plot this or any other monotonic function on a line graph, we see that it goes in just one direction (i.e. just up or just down.)
![A monotonic function, represented as a line-graph](../06_functors/monotone_map_plot.svg)
Now we are about to prove that monotonic functions are functors too, ready?
In calculus, there is this concept of *linear functions* (also called "degree one polynomials") - functions of the form $f(x) = x * a$.
But if we start plotting some such functions we will realize that there is another way to describe them - their graphs are always comprised of straight lines.
![Linear functions](../06_functors/linear_functions.svg)
**Question:** Why is that?
Another interesting property of these functions is that most of them *preserve* addition, that is, for any $x$ and $y$, you have $f(x) + f(y) = f(x + y)$. We already know that this equation is equivalent to the second functor law. So linear functions are just *functors between the monoid of natural numbers under addition and itself.* As we will see later, they are examples of functors in the *category of vector spaces*.
![Linear functions](../06_functors/linear_function_functor.svg)
**Question:** Are the two formulas we presented to define linear functions completely equivalent?
And if we view the natural numbers as an order, linear functions are also functors between orders, as they are monotonic.
Note, however, that not all functions whose plots are straight lines preserve addition - functions of the form $f(x) = x * a + b$ in which $b$ is non-zero are also straight lines (and are also called linear), but they don't preserve addition.
![Linear functions](../06_functors/linear_function_non_functor.svg)
For those, the above formula becomes $f(x) + b + f(y) + b = f(x + y) + b$, which only holds when $b = 0$.
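Here is a quick check of both claims in Haskell (the coefficients are arbitrary choices of mine):

```haskell
f, g :: Integer -> Integer
f x = 3 * x        -- preserves addition
g x = 3 * x + 5    -- a non-zero constant term breaks the law

main :: IO ()
main = do
  print (f (2 + 4) == f 2 + f 4)  -- True
  print (g (2 + 4) == g 2 + g 4)  -- False: the left side picks up the 5
                                  -- twice, the right side only once
```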
Functors in programming. The list functor
===
Types in a programming language form a category, and associated with that category are some functors that programmers use every day, such as the list functor, which we will use as an example. The list functor maps from the realm of simple (primitive) types and functions to the realm of more complex (generic) types and functions.
![A functor in programming](../06_functors/functor_programming.svg)
But let's start with the basics - defining the concept of a functor in a programming context is as simple as changing the terms we use, according to the table in chapter 2 (the one that compares category theory with programming languages), and (perhaps more importantly) changing the font we use in our formulas from "modern" to "monospaced".
Type mapping
---
The first component of a functor is a mapping that converts one type (let's call it `A`) to another type (`B`). So it is *like a function, but between types*. Such constructions are supported by almost all programming languages that have static type checking in the first place - they go by the name of *generic types*. A generic type is nothing but a function that maps one (concrete) type to another (this is why generic types are sometimes called *type-level functions*.)
![A functor in programming - type mapping](../06_functors/functor_programming_objects.svg)
Note that although the diagrams look similar, a *type-level* function is completely different from a *value-level* function. A value-level function from `String` to `List<String>` (or, in mathy Haskell/ML-inspired notation, $string \to List\ string$) converts a *value* of type `String` (such as `"foo"`) to a value of type `List<String>`. There are even (as we will see later) value-level functions with the signature $a \to List\ a$ that can convert any value to a list containing that value, but this is different from the *type-level* function `List<A>`, as that one converts a *type* $a$ to a *type* $List\ a$ (e.g. the type `string` to the type $List\ string$, $number$ to $List\ number$, etc.)
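Here is the distinction, sketched in Haskell (the alias `ListOf` and the helper `singleton` are names of mine):

```haskell
-- A *type-level* function: sends the type a to the type [a].
type ListOf a = [a]

-- A *value-level* function: sends a value to a list containing it.
singleton :: a -> [a]
singleton x = [x]

-- ListOf String is the type [String]; singleton "foo" is the value ["foo"].
```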
Function mapping
---
So the type mapping of a functor is simply a generic type in a programming language (we can also have functors between two generic types, but we will review those later.) So what is the *function mapping*? That is a mapping that converts any function operating on simple types, like $string \to number$, to a function between their more complex counterparts, e.g. $List\ string \to List\ number$.
![A functor in programming - function mapping](../06_functors/functor_programming_morphisms.svg)
In programming languages, this mapping is represented by a higher-order function called `map`, with the signature (using Haskell notation) $(a \to b) \to (Fa \to Fb)$, where $F$ represents the generic type.
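For example, here is how `map` lifts the ordinary `length` function to a function between list types (a small illustration of mine):

```haskell
-- length :: String -> Int is lifted to [String] -> [Int]
lengths :: [String] -> [Int]
lengths = map length

main :: IO ()
main = print (lengths ["foo", "barbaz"])  -- [3, 6]
```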
Endofunctors
---
To understand what pointed endofunctors are, we first have to understand what *endofunctors* are, and we already saw some examples of those in the last section. Let me explain: from the way the diagrams there looked, we might get the impression that different type families belong to different categories.
![A functor in programming](../06_functors/functor_programming.svg)
But that is not the case - all type families from a given programming language are actually part of one and the same category - the category of *types*.
![A functor in programming](../06_functors/functor_programming_endo.svg)
Wait, so this is permitted? Yes, these are exactly what we call *endofunctors* i.e. ones that have one and the same category as source and target.
The identity functor
---
So, what are some examples of endofunctors? I want to focus on one that will probably look familiar to you - it is the *identity functor* of each category, the one that maps each object and morphism to itself.
![Identity functor](../06_functors/identity_functor.svg)
And it might be familiar, because an identity functor is similar to an identity morphism - it allows us to talk about value-related stuff without actually involving values.
Pointed functors
---
Finally, the identity functor, together with all other functors to which the identity functor can be *naturally transformed*, are called *pointed functors* (i.e. a functor is pointed if there exists a morphism from the identity functor to it.) As we will see shortly, the list functor is a pointed functor.
![Pointed functor](../06_functors/pointed_functor.svg)
We still haven't discussed what it means for one functor to be naturally transformed into another (although the commuting diagram above can give you some idea.) This is a complex concept and we have a whole chapter about it (the next one).
However, if we concentrate solely on the category of types in programming languages, then *a natural transformation is just a function* that translates each value of what we called the "simple types" to a value of the functor's generic type (i.e. $a \to F\ a$), in a way that makes this diagram commute.
![Pointed functor in Set](../06_functors/pointed_functor_set.svg)
What does it take for this diagram to commute? It means that you have two equivalent routes from the top-left corner to the bottom-right corner, i.e. that applying any function between any two types ($a \to b$), followed by the lifting function ($b \to F\ b$), is equivalent to applying the lifting function first ($a \to F\ a$), and then the mapped version of the original function second ($F\ a \to F\ b$.)
The list functor is pointed, because such a function exists for it - the function $a \to [\ a\ ]$ that puts every value in a "singleton" list. So, for every function between simple types, such as the function $length:\ string \to number$, we have a square like this one.
![Pointed functor in Set](../06_functors/pointed_functor_set_internal.svg)
And the fact that the square commutes is expressed by the following equality:
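$[\ length(a)\ ] = map\ (length)\ [\ a\ ]$ - or, spelled out as a runnable check in Haskell (a rendering of mine; both routes produce the same singleton list):

```haskell
-- the lifting function of the list functor puts a value in a singleton list
pureList :: a -> [a]
pureList x = [x]

main :: IO ()
main =
  -- "apply length, then lift" equals "lift, then map length"
  print (pureList (length "foo") == fmap length (pureList "foo"))
  -- both routes yield [3]
```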
The category of small categories
===
Ha, I got you this time (or at least I *hope* I did) - you probably thought that I wouldn't introduce another category in this chapter, but this is exactly what I am going to do now. And (surprise again) the new category won't be the category of functors (don't worry, we will introduce that in the next chapter.) Instead, we will examine the category of (small) categories, which has all the categories that we saw so far as objects and functors as its morphisms - $Set$, the category of sets, $Mon$, the category of monoids, $Ord$, the category of orders, etc.
![The category of categories](../06_functors/category_of_categories.svg)
We haven't yet mentioned the fact that functors compose (and in an associative way at that), but since a functor is just a bunch of functions, it is no wonder that they do.
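One way to see this in code: because functors compose, mapping through two generic types at once is just a composition of two `fmap`s (the sample values are mine):

```haskell
main :: IO ()
main = print ((fmap . fmap) (+ 1) [Just 1, Nothing, Just 3])
-- [Just 2, Nothing, Just 4]: the list functor composed with the Maybe functor
```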

---
layout: default
---
<img src="../cover.svg" height="700px">
<div style="break-after:page"></div>
Praise
===
> "The range of applications for category theory is immense, and visually conveying meaning through illustration is an indispensable skill for organizational and technical work. Unfortunately, the foundations of category theory, despite much of their utility and simplicity being on par with Venn Diagrams, are locked behind resources that assume far too much academic background.
>
> Should category theory be considered for this academic purpose or any work wherein clear thinking and explanations are valued, beginner-appropriate resources are essential. There is no book on category theory that makes its abstractions so tangible as "Category Theory Illustrated" does. I recommend it for programmers, managers, organizers, designers, or anyone else who values the structure and clarity of information, processes, and relationships."
[Evan Burchard](https://www.oreilly.com/pub/au/7124), Author of "The Web Game Developer's Cookbook" and "Refactoring JavaScript"
> "The clarity, consistency and elegance of diagrams in 'Category Theory Illustrated' has helped us demystify and explain in simple terms a topic often feared."
[Gonzalo Casas](https://gnz.io/), Software developer and lecturer at ETH Zurich
{% for chapter in site.chapters%}
<div style="break-after:page"></div>
{{ chapter.content }}
{% endfor %}
<script>
window.print()
</script>