Boris Marinov 2021-08-18 18:28:14 +03:00
parent 15f3448874
commit 545ce22d5f
4 changed files with 82 additions and 92 deletions

.gitignore vendored
View File

@ -1,4 +1,4 @@
errors.md
*.swp
*.un~
_site

View File

@ -6,23 +6,21 @@ title: Logic
Logic
===
Now let's talk about one more *seemingly* unrelated structure, just so we can "surprise" ourselves at the end (this time there will be another surprise in addition to logic being a category, so don't fall asleep). Also, in this chapter I will not merely transport you to a different branch of mathematics, but an entirely different discipline - *logic*.
Now let's talk about one more *seemingly* unrelated topic, just so we can "surprise" ourselves when we realize it's category theory (this time there will be another surprise in addition to that, so don't fall asleep). Also, in this chapter I will not merely transport you to a different branch of mathematics, but to an entirely different discipline - *logic*.
What is logic
===
Logic is the science of the *possible*. As such, it is at the root of all other sciences, all of which are sciences of the *actual*, i.e. of that which really exists. For example, if science explains how our universe works, logic is the part of the description which is also applicable to any other universe that is possible to exist.
Logic is the science of the *possible*. As such, it is at the root of all other sciences, all of which are sciences of the *actual*, i.e. that which really exists. For example, if science explains how our universe works, then logic is the part of the description which is also applicable to any other universe that could *possibly* exist.
It does this by studying the *rules* by which knowing one thing leads you to conclude or (*prove*) that some other thing is also true, regardless of the things's domain (e.g. scientific discipline) i.e. by only referring to it's form i.e. *in a formal way*.
Logic studies the *rules* by which knowing one thing leads you to conclude (or *prove*) that some other thing is also true, regardless of the things' domain (e.g. scientific discipline), referring only to their form.
On top of that, it (logic) tries to organize those rules in what are called *logical systems* (or *formal systems* as they are also called) - these are collections of rules for manipulating proposition that are *complete* in their description of those propositions (we will see what that means shortly).
On top of that, logic tries to organize those rules into *logical systems* (or *formal systems*, as they are also called).
Logic and mathematics
---
Seeing this description, we might think think that the subject of logic is quite similar to the subject of set theory and category theory, as we described it in the first chapter. Only, there instead of the word "formal" we used another similar word, namely "abstract" and instead of "formal system" we said "theory". This observation would be quite correct - today most people agree that every mathematical theory is actually logic plus some additional definitions added to it. For example, part of the reason why *set theory* is so popular as a theory for the foundations of mathematics is that set theory adds just a single primitive to the standard axioms of logic which we will see shortly - the binary relation that indicates *set membership*.
So set theory is very close to logic, although exactly how they relate is probably outside the scope of this book. And category theory is close to logic too, but in a quite different way (we will examine the connection later.)
Seeing this description, we might think that the subject of logic is quite similar to the subjects of set theory and category theory, as we described them in the first chapter. Only, there, instead of the word "formal" we used another similar word, namely "abstract", and instead of "logical system" we said "theory". This observation would be quite correct - today most people agree that every mathematical theory is actually logic plus some additional definitions added to it. For example, part of the reason why *set theory* is so popular as a theory for the foundations of mathematics is that it adds just a single primitive to the standard axioms of logic (which we will see shortly) - the binary relation that indicates *set membership*. Category theory is close to logic too, but in a quite different way.
Primary propositions
---
@ -36,11 +34,11 @@ In the context of logic itself, these propositions are abstracted away (i.e. we
Composing propositions
---
At the heart of logic, as in category theory is the concept of *composition* - if we have two or more propositions that are somehow related to one another, we can combine them into one using a logical operator, like "and", "or" "follows" etc. The result would be a new proposition, not unlike the way in which monoid objects are combined into a new monoid object using the monoid operation. And actually, some logical operations do form monoids, like the operation **and** with the proposition **true** serving as the identity element.
At the heart of logic, as in category theory, is the concept of *composition* - if we have two or more propositions that are somehow related to one another, we can combine them into one using a logical operator, like "and", "or", "follows", etc. The result would be a new proposition, not unlike the way in which two monoid objects are combined into one using the monoid operation. And actually, some logical operations do form monoids - for example the operation **and**, with the proposition **true** serving as the identity element.
![Logical operations that form monoids](logic_monoid.svg)
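As a quick sanity check of the monoid claim, here is a minimal Haskell sketch (mine, not the book's); Haskell's standard library happens to package exactly these two monoids as `All` ("and", with `True` as identity) and `Any` ("or", with `False` as identity):
```haskell
import Data.Monoid (All (..), Any (..))

main :: IO ()
main = do
  print (getAll (All True <> All False))           -- False: True "and" False
  print (getAll (mempty :: All))                   -- True: the identity element of "and"
  print (getAll (foldMap All [True, True, True]))  -- True: folding a list with the monoid
  print (getAny (foldMap Any [False, False]))      -- False: the identity element of "or"
```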
However, unlike monoids/groups, logics have not one but *many* logical operations and logic studies *the ways in which they relate*, for example, in logic we might be interested in the law of distributivity of the **and** and **or** operations and what it entails.
However, unlike monoids/groups, a logic has not one but *many* logical operations, and logic studies *the ways in which they relate to one another* - for example, we might be interested in the law of distributivity of the **and** and **or** operations and what it entails.
![The distributivity operation of "and" and "or"](logic_distributivity.svg)
@ -49,37 +47,35 @@ Important to note that **∧** is the symbol for **and** and **∨** is the symb
The equivalence of primary and composite propositions
---
Important to stress that although in the leftmost proposition the green ball is wrapped in a gray ball to make the diagram prettier propositions that are composed of several premises (symbolized by gray balls, containing some other balls) are not in any way different from "primary" propositions (single-color balls).
As a result, (again similar to what we saw with category theory) we can compose propositions with multiple levels of nesting (*recursively* as the computer science people say.)
It is important to stress that, although in the leftmost proposition the green ball is wrapped in a gray ball to make the diagram prettier, propositions that are composed of several premises (symbolized by gray balls containing some other balls) are not in any way different from "primary" propositions (single-color balls), and they compose in the same way.
![Balls as propositions](balls_propositions.svg)
Modus ponens
---
As an example of a proposition that contains multiple levels of nesting (and a great introduction of the subject in its own right), consider one of the oldest (it was alredy known by Stoics at 3rd century B.C.) and most famous propositions ever, namely the *modus ponens*.
As an example of a proposition that contains multiple levels of nesting (and also a great introduction to the subject of logic in its own right), consider one of the oldest (it was already known to the Stoics in the 3rd century B.C.) and most famous propositions ever, namely the *modus ponens*.
Modus ponens is a proposition that states that if **A** is true and if also **A → B** is true (that is, if **A** implies **B**), then **B** is true as well. For example, if we know that "Socrates is a human" and that "humans are mortal" (or "being human implies being mortal"), we also know that "Socrates is mortal".
Modus ponens is a proposition that states that if proposition **A** is true and also if proposition **(A → B)** is true (that is, if **A** implies **B**), then **B** is true as well. For example, if we know that "Socrates is a human" and that "humans are mortal" (or "being human implies being mortal"), we also know that "Socrates is mortal".
![Modus ponens](modus_ponens.svg)
Let's dive a little deeper. The proposition is composed of two other propositions in a **follows** relation where the proposition that follows (**B**) is primary (or at least could be, again, pretty diagrams before all), but the proposition from which **B** follows is not. Let's call that one **C** - so the whole proposition becomes **C → B**.
Let's dive into it. The proposition is composed of two other propositions in a **follows** relation where the proposition that follows (**B**) is primary, but the proposition from which **B** follows is not (let's call that one **C** - so the whole proposition becomes **C → B**).
Going one more level down, we notice that the **C** propositions is itself composed of two propositions in an **and**, relationship - **A** and let's call the other one **D** (so **A ∧ D**), where **D** is itself composed of two propositions, this time in a **follows** relationship - **A → B**.
Going one more level down, we notice that the **C** proposition is itself composed of two propositions in an **and** relationship - **A** and another one, which we will call **D** (so **A ∧ D**), where **D** is itself composed of two propositions, this time in a **follows** relationship - **A → B**. But all of this is better visualized in the diagram.
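Alongside the diagram, the nesting can also be spelled out in code. Here is a minimal Haskell sketch (the names `Prop`, `Var`, `And` and `Implies` are mine, not the book's) that writes modus ponens as a recursive tree of propositions:
```haskell
-- propositions as a recursive data structure, so the nesting becomes explicit
data Prop
  = Var Char            -- a primary proposition, e.g. A or B
  | And Prop Prop       -- A ∧ B
  | Implies Prop Prop   -- A → B
  deriving (Show)

-- (A ∧ (A → B)) → B, i.e. C → B, where C = A ∧ D and D = A → B
modusPonens :: Prop
modusPonens =
  Implies
    (And (Var 'A') (Implies (Var 'A') (Var 'B')))  -- C = A ∧ D
    (Var 'B')

main :: IO ()
main = print modusPonens
```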
Tautologies
---
Because the content of propositions in logic is abstracted away, we often cannot tell whether a given proposition is true or false. However, with propositions such as *modus ponens* we can: modus ponens is *always true*, regardless of whether the propositions which form it are true. If we want to be fancy, we can also say that it is *true in all models of the system* (a model being a set of real-world premises are taken to be signified by our propositions).
Because the content of propositions in logic is abstracted away, we often cannot tell whether a given proposition is true or false without knowing the values of its components. However, with propositions such as *modus ponens* we can: modus ponens is *always true*, regardless of which propositions form it and whether they are true or false. If we want to be fancy, we can also say that it is *true in all models of the logical system*, a model being a set of real-world premises that are taken to be signified by our propositions.
For example, our previous proposition would not stop being true if we substituted "Socrates" with any other name, nor if we substituted "mortal" with any other quality that humans possess.
![Variation of modus ponens](modus_ponens_variations.svg)
Propositions that are always true are called *tautologies*. And their more-famous counterparts that are always false are the *contradictions*. You can turn each tautology into contradiction or the other way around by adding a "not".
Propositions that are always true are called *tautologies*. And their more-famous counterparts that are always false are called *contradictions*. You can turn each tautology into a contradiction, or the other way around, by adding a "not".
The simplest tautology, is the one which states that a proposition implies itself (e.g. "All bachelors are unmarried"). It may remind you of something.
The simplest tautology is the statement that each proposition implies itself (e.g. "All bachelors are unmarried"). It may remind you of something.
![Identity tautology](tautology_identity.svg)
@ -92,60 +88,56 @@ We will learn how to determine which propositions are a tautologies, but first l
Logical systems
===
Tautologies are useful because they are the basis of *axiom schemas* or *rules of inference* (which is almost the same thing). And *axiom schemas* or *rules of inference* serve as starting point from which we can generate other true logical statements by means of substitution.
Tautologies are useful because they are the basis of *axiom schemas*/*rules of inference*. And *axiom schemas* or *rules of inference* serve as a starting point from which we can generate other true logical statements by means of substitution.
Axiom schemas
---
Realizing that the colors of the balls in modus ponens are superficial, we may want to represent the general structure of modus ponnes that all of its variations have.
Realizing that the colors of the balls in modus ponens are superficial, we may want to represent the general structure of modus ponens that all of its variations have.
![Modus ponens](modus_ponens_schema.svg)
This structure (the one that looks like a coloring book in our example) is called an *axiom schema*. And the propositions that are produced by it are *axioms*.
Note that the propositions that we plug into the schema don't have to be primary. For example, having the proposition **a** (that is symbolized below by the orange ball) and the proposition stating that **a** implies **a or b** (which is one of the tautologies that we saw above), we can plug those propositions into the *modus ponens* and realize that **a or b** is true.
Note that the propositions that we plug into the schema don't have to be primary. For example, having the proposition **a** (that is symbolized below by the orange ball) and the proposition stating that **a** implies **a or b** (which is one of the tautologies that we saw above), we can plug those propositions into the *modus ponens* and prove that **a or b** is true.
![Using modus ponens for rule of inference](modus_ponens_composite.svg)
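To make "plugging propositions into a schema" concrete, here is a small Haskell sketch (my own encoding, reusing a toy `Prop` type like the one above; none of these names come from the book): the schema's variables are treated as placeholders, and producing an axiom means substituting propositions for them, here with **A := a** and **B := a ∨ b** as in the example.
```haskell
import qualified Data.Map as M

data Prop = Var Char | And Prop Prop | Or Prop Prop | Implies Prop Prop
  deriving (Show)

-- replace every placeholder variable by the proposition assigned to it
substitute :: M.Map Char Prop -> Prop -> Prop
substitute s (Var v)       = M.findWithDefault (Var v) v s
substitute s (And a b)     = And (substitute s a) (substitute s b)
substitute s (Or a b)      = Or (substitute s a) (substitute s b)
substitute s (Implies a b) = Implies (substitute s a) (substitute s b)

main :: IO ()
main = print (substitute subst modusPonensSchema)
  where
    modusPonensSchema = Implies (And (Var 'A') (Implies (Var 'A') (Var 'B'))) (Var 'B')
    subst = M.fromList [('A', Var 'a'), ('B', Or (Var 'a') (Var 'b'))]  -- A := a, B := a ∨ b
```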
Rules of inference
---
Rules of inference are procedures for declaring that propositions that follow from true propositions as also true. Axiom schemas can be easily applied as rules of inference (and the other way around) e.g. in the case above, we can use modus ponens as a rule of inference to proves that **a or b** is true.
Rules of inference specify how axiom schemas are applied. Axiom schemas and rules of inference are almost the same thing: all axiom schemas can easily be applied as rules of inference and the other way around, e.g. in the case above, we can use modus ponens as a rule of inference to prove that **a or b** is true.
Completeness of logical systems
---
Knowing that we can use axiom schemas/rules of inference to generate new propositions, we might ask whether it is possible to create a small collection of such schemas/rules that is curated in such a way that it enables us to generating *all* other propositions. You would be happy (although a little annoyed, I imagine) to learn that there exist not only one, many such collections. And yes, collections such as the one above are what we call *logical systems* (technically, they should be called *complete* logical systems and a collections that are not capable of generating all other propositions would be *incomplete logical systems*, but who has time for incomplete logical systems?)
Knowing that we can use axiom schemas/rules of inference to generate new propositions, we might ask whether it is possible to create a small collection of such schemas/rules that is curated in such a way that it enables us to generate *all* other propositions. You would be happy (although a little annoyed, I imagine) to learn that there exists not only one, but many such collections. And yes, collections such as the one above are what we call *logical systems*.
Here is one such collection, which consists of the following five axiom schemes **in addition to the inference rule modus ponens** (These are axiom schemes, even though we use colors).
Here is one such collection, which consists of the following five axiom schemas **in addition to the inference rule modus ponens** (these are axiom schemas, even though we use colors).
![A minimal collection of Hilbert axioms](min_hilbert.svg)
Proving that this and other similar logical systems can really generate all other propositions are complete is due to Godel and is known as "Godel's completeness theorem".
The proof that this and other similar logical systems are complete (i.e. can really generate all other propositions) is due to Gödel and is known as "Gödel's completeness theorem".
Conclusion
---
We now have an idea about how do some of the main logical constructs (axioms, rules of inference) work. But in order to prove that they are true, and to understand *what they are*, we need to do so through a specific *interpretation* of those constructs. We will now look into two interpretations - one very old and the other, relatively recent. This would be a slight detour from our usual subject matter of points and arrows, but it would be worth it. So let's start.
We now have an idea about how some of the main logical constructs (axioms, rules of inference) work. But in order to prove that they are true, and to understand *what they are*, we need to do so through a specific *interpretation* of those constructs. We will now look into two interpretations - one very old and the other relatively recent. This would be a slight detour from our usual subject matter of points and arrows, but I assure you that it would be worth it. So let's start.
Classical logic. The truth-functional interpretation
===
> Beyond the world that we inhabit and perceive every day, there exists the *world of forms*, where all ideas and concepts that manifest themselves in the objects that we perceive reside - e.g. beyond all the people that have ever lived lies the prototypical person, and we are people only insofar as we resemble that person; beyond all the things in the world that are strong lies the ultimate concept of strength, from which all of them borrow, etc. And although, as mere mortals, we live in the world of appearances and cannot perceive the world of forms, we can, through philosophy, "recollect" with it and know some of its features.
The above is a summary of a worldview that is due to the Greek philosopher Plato and is sometimes called Plato's *theory of forms*. Originally, the discipline of logic represents an effort to think and structure our thoughts in a way that they apply to this world of forms i.e. in a "formal" way. Today, this original paradigm of logic is known as "classical logic". Although, it all started with Plato, most of it is due to the 20th century mathematician David Hilbert.
The above is a summary of a worldview that is due to the Greek philosopher Plato and is sometimes called Plato's *theory of forms*. Originally, the discipline of logic represents an effort to think and structure our thoughts in a way that they apply to this world of forms i.e. in a "formal" way. Today, this original paradigm of logic is known as "classical logic". Although it all started with Plato, most of it is due to the 20th century mathematician David Hilbert.
The existence of the world of forms implies that even if there are many things that we, people, don't know, at least *somewhere out there* there exists an answer to every question. In logic, this translates to *the principle of bivalence*, which states that *each proposition is either true or false*.
Due to it, propositions in classical logic can be aptly expressed using set theory as functions that return a value from the boolean set (so either true or false).
Due to this principle, propositions in classical logic can be aptly expressed with set theory, using the boolean set (so either true or false).
![The set of boolean values](boolean_set.svg)
We can view *primary propositions* as simple functions that return a boolean value and don't take any arguments.
We can view *logical operators* as functions that take a one or a pair of boolean values and return another boolean value.
*Composite propositions* are, in this case, just the results of the invocation of these functions.
*Primary propositions* are just a bunch of boolean values, and *logical operators* are functions that take one or several boolean values and return another boolean value. *Composite propositions* are, in this case, just the results of the invocation of these functions.
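A minimal Haskell sketch of this truth-functional view (my own encoding, not code from the book):
```haskell
-- primary propositions are boolean values; operators are boolean functions
notP :: Bool -> Bool
notP True  = False
notP False = True

andP, orP :: Bool -> Bool -> Bool
andP = (&&)
orP  = (||)

-- a composite proposition is just the result of invoking the operators:
composite :: Bool
composite = (True `andP` False) `orP` notP False   -- evaluates to True
```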
Let's review all logical operators in this context.
@ -163,14 +155,13 @@ The same function can also be expressed in a slightly less-fancy way by this tab
| True | False |
| False | True |
Such tables are called *truth tables* and they are ubiquitous in classical logic not only for defining operators but for proving results as well.
Such tables are called *truth tables* and they are ubiquitous in classical logic. They can be used not only for defining operators, but for proving results as well.
Interlude: Proving results by truth tables
---
Having defined the negation operator as we did above, we are in a position to prove the first of the axioms of the logical system we saw, namely the *double negation elimination*. In natural language, this axiom is equivalent to the observation that saying "I am *not unable* to do X" is the same as saying "I am *able* to do it".
![Double negation elimination formula](double_negation_formula.svg)
(Despite its triviality, the double negation axiom is probably the most controversial result in logic; we will see why later.)
@ -179,7 +170,7 @@ If we view logical operators as functions, from and to the set of boolean values
![Double negation elimination](double_negation_proof.svg)
If we want to be formal about it, we might say that applying negation two times is equivalent to the identity function.
If we want to be formal about it, we might say that applying negation two times is equivalent to applying the identity function.
![The identity function for boolean values](boolean_identity.svg)
@ -190,17 +181,16 @@ If we are tired of diagrams, we can represent the composition diagram above as t
| True | False | True |
| False | True | False |
Each proposition in classical logic can be proved with these diagrams (or by truth tables.)
Each proposition in classical logic can be proved with such diagrams/tables.
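In code, checking a truth table amounts to evaluating the proposition for every possible assignment of values. A minimal Haskell sketch (the helper names are mine, not the book's):
```haskell
-- a one-variable proposition is a tautology if it is True for every assignment
isTautology :: (Bool -> Bool) -> Bool
isTautology f = all f [True, False]

-- double negation elimination, ¬¬p ↔ p, read as a boolean equality
doubleNegation :: Bool -> Bool
doubleNegation p = not (not p) == p

main :: IO ()
main = print (isTautology doubleNegation)   -- True
```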
The **and** and **or** operations
---
OK, you know what **and** means and I know what it means, but what about those annoying people that want everything to be formally specified (nudge, nudge). Well we already know how we can satisfy them - we just have to construct the boolean function that represents **and**. And is a *binary* operator so instead of a single value it accepts a *pair* of boolean values.
OK, you know what **and** means and I know what it means, but what about those annoying people who want everything to be formally specified (nudge, nudge)? Well, we already know how we can satisfy them - we just have to construct the boolean function that represents **and**. Because **and** is a *binary* operator, instead of a single value the function would accept a *pair* of boolean values.
![And](and.svg)
And the equivalent truth-table (in which **∧** is the symbol for **and**.)
Here is the equivalent truth-table (in which **∧** is the symbol for **and**.)
| p | q | p ∧ q |
|---| --- | --- |
@ -209,7 +199,7 @@ And the equivalent truth-table (in which **∧** is the symbol for **and**.)
| False | True | False |
| False | False | False |
We can do the same for **or**, just the table.
We can do the same for **or**; here is the table.
| p | q | p ∨ q |
|---| --- | --- |
@ -220,7 +210,7 @@ We can do the same for **or**, just the table.
**Task:** Draw the diagram for **or**.
Using those we can also prove some axiom schemas we can later use:
Using those tables, we can also prove some axiom schemas which we will use later (all of them are checked by brute force in the sketch after the list):
- For **and**: **p ∧ q → p** and **p ∧ q → q** - "If I am tired and hungry, this means that I am hungry".
- For **or**: **p → p ∨ q** and **q → p ∨ q** - "If I have a pen, this means that I have either a pen or a ruler".
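Here is the promised brute-force check: enumerating every combination of boolean values is the programmatic version of writing out a truth table. A minimal Haskell sketch (the helpers `impliesB` and `tautology2` are my own names, not notation from the book):
```haskell
impliesB :: Bool -> Bool -> Bool
impliesB p q = not p || q   -- "implies" as a boolean function

-- a two-variable proposition is a tautology if it is True for every assignment
tautology2 :: (Bool -> Bool -> Bool) -> Bool
tautology2 f = and [f p q | p <- [True, False], q <- [True, False]]

main :: IO ()
main = do
  print (tautology2 (\p q -> (p && q) `impliesB` p))   -- p ∧ q → p
  print (tautology2 (\p q -> (p && q) `impliesB` q))   -- p ∧ q → q
  print (tautology2 (\p q -> p `impliesB` (p || q)))   -- p → p ∨ q
  print (tautology2 (\p q -> q `impliesB` (p || q)))   -- q → p ∨ q
```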
@ -242,10 +232,8 @@ Here is the truth table.
Now there are some aspects of this which are non-obvious, so let's go through every case.
1. If *p* is true and *q* is also true, then **p** does imply **q** - obviously.
2. If *p* is true but *q* is false then **q** does not follow from **p**, cause it would have been true if it did.
3. If *p* is false but *q* is true, then **p** still does imply **q**. What the hell? Consider two factors:
- By saying that *p* implies *q* we don't say that the two are 100% interdependent e.g. the claim that "drinking causes headache" does not mean that drinking is the only source of headaches, and it won't be refuted the fact that you can get headache without drinking.
- The conclusion that *p* implies *q* is reached only if all four cases are satisfied, so this events means nothing by itself.
2. If *p* is true but *q* is false, then **q** does not follow from **p** - because *q* would have been true if it did.
3. If *p* is false but *q* is true, then **p** still does imply **q**. What the hell? Consider that by saying that *p* implies *q* we don't say that the two are 100% interdependent, e.g. the claim that "drinking alcohol causes headaches" does not mean that drinking is the only source of headaches.
4. And finally, if **p** is false and **q** is false too, then **p** still does imply **q** (just some other day).
It might help you to remember that **p → q** (**p** implies **q**) is true when **¬p ∨ q** (either **p** is false or **q** is true.)
@ -257,7 +245,7 @@ Let's examine the above formula, stating that **p → q** is the same as **-p
![Hilbert formula](hilbert_formula.svg)
We can prove this by using truth tables.
We can easily prove this by using truth tables.
| p | q | p → q | ¬p | q | ¬p ∨ q |
|---| --- | --- | --- | --- | --- |
@ -266,41 +254,38 @@ We can prove this by using truth tables.
| False | True | **True** | True | True | **True** |
| False | False | **True** | True | False | **True** |
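The same table-based argument can be checked mechanically. A small Haskell sketch (my own encoding, not from the book) confirming that the two columns agree on every input:
```haskell
impliesB, disjForm :: Bool -> Bool -> Bool
impliesB p q = not (p && not q)   -- "implies", as defined by its truth table
disjForm p q = not p || q         -- ¬p ∨ q

main :: IO ()
main =
  print (and [impliesB p q == disjForm p q | p <- [True, False], q <- [True, False]])
  -- prints True: the two columns of the truth table are identical
```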
But we can also prove it using axioms and rules of inference (axiom schemas that are used at each step are specified at the right-hand side, the rule of inference is modus ponens.
But it would be much more intuitive if we do it using axioms and rules of inference. To do so, we have to start with the formula we have (**p → q**) plus the axiom schemas and arrive at the formula we want to prove (**¬p ∨ q**). The formulas that are used at each step are specified at the right-hand side; the rule of inference is modus ponens.
![Hilbert proof](hilbert_proof.svg)
To really prove that the two formulas are equivalent, we have to also do it the other way around (start with **¬p ∨ q** and arrive at **p → q**).
Intuitionistic logic. The BHK interpretation
===
Althought the classical truth-functional interpretation of logic is correct in its own right, it doesn't fit well the categorical framework that we have created: It is too "low-level", it relies on manipulating the values of the propositions, it doesn't in any way show the connection between **and** and **or** that we already saw. For these and other reasons (mostly other, probably), in the 20th century a whole new school of logic was founded, called *intuinistic logic*.
If classical logic is based on *set theory*, intuinistic logic is based on *category theory* and related formalisms.
If *classical logic* is based on Plato's theory of forms, then intuinism began with a philosophical idea originating from Kant and Schopenhauer: the idea that the world as we experience it is largely predetermined of out perceptions of it. As the mathematician L.E.J. Brouwer puts it.
Although the classical truth-functional interpretation of logic works and is correct in its own right, it doesn't fit well with the categorical framework that we are using here: it is too "low-level", it relies on manipulating the values of the propositions, and it doesn't in any way show the connection between **and** and **or** that we already saw. For these and other reasons (mostly other, probably), in the 20th century a whole new school of logic was founded, called *intuitionistic logic*.
If classical logic is based on *set theory*, intuitionistic logic is based on *category theory* and related theories. If *classical logic* is based on Plato's theory of forms, then intuitionism began with a philosophical idea originating from Kant and Schopenhauer: the idea that the world as we experience it is largely predetermined by our perceptions of it. As the mathematician L.E.J. Brouwer puts it:
> [...] logic is life in the human brain; it may accompany life outside the brain but it can never guide it by virtue of its own power.
There is a strong connection between category theory and the philosophy of Kant (from whom the term "category" was borrowed, although it had been used by Aristotle before that), but I won't go into detail about it here.
Classical and intuinistic logic diverge from one another right from the start - because intuinistic logic deals with *constructing* a proof rather than *discovering* or *unveiling* a universal truth, it is *off with the principle of bivalence*, that is, we have no basis to claim that each statements is necessarily *true or false*. For example, there might be a statements that might not be provable not because they are false, but simply because they fall outside of the domain of a given logical system (the twin-prime conjecture is often given as an example for this.)
Classical and intuitionistic logic diverge from one another right from the start, as far as the philosophical foundations are concerned - because intuitionistic logic deals with *constructing* a proof rather than *discovering* or unveiling a universal truth, we have to part with *the principle of bivalence*, that is, we have no basis to claim that each statement is necessarily *true or false*. For example, there might be statements that are not provable not because they are false, but simply because they fall outside of the domain of a given logical system (the twin-prime conjecture is often given as an example.)
Due to the reasons outlined above, intuinistic logic is not bivalent, we cannot have all propositions reduced to true and false.
So, intuitionistic logic is not bivalent, i.e. we cannot have all propositions reduced to true and false.
![The True/False dichotomy](true_false.svg)
But one thing that we still do have are propositions that are "true" in the sense that a proof for them is given - the primary propositions. So with some caveats (which we will see later) we can think of the bivalence between true and false proposition might be thought out as similar to the bivalence between the existence or non-existence of a proof (there either is a proof of a given proposition or there isn't).
One thing that we still do have is propositions that are "true" in the sense that a proof for them is given - the primary propositions. So, with some caveats (which we will see later), the bivalence between true and false propositions might be thought of as similar to the bivalence between the existence or non-existence of a proof for a given proposition - there either is a proof of it or there isn't.
![The proved/unproved dichotomy](proved_unproved.svg)
This is known as the as the BrouwerHeytingKolmogorov (BHK) interpretations of intuinistic logic.
This bivalence is at the heart of what is called the Brouwer-Heyting-Kolmogorov (BHK) interpretation of logic, something that we will look into next.
The original formulation of the BHK interpretation does not depend on any particular mathematical theory. Here we illustrate it using the language of set theory (just so we can abandon it a little later).
The original formulation of the BHK interpretation is not based on any particular mathematical theory. Here we will first illustrate it using the language of set theory (just so we can abandon it a little later).
The **and** and **or** operations
---
As the existence of a proof of a proposition is taken to mean that the proposition is true, the definitions of **and** is rather simple - the proof of **A ∧ B** is just *a pair* containing a proof of **A**, and a proof of **B** e.g. *a set-theoretic product* of the two (see chapter 2). The principle is similar as with primary proposition - if the pair of proofs of **A** and **B** exist (i.e. if both proofs exist) then the proof of **A ∧ B** can be constructed.
As the existence of a proof of a proposition is taken to mean that the proposition is true, the definition of **and** is rather simple - the proof of **A ∧ B** is just *a pair* containing a proof of **A** and a proof of **B**, i.e. *a set-theoretic product* of the two (see chapter 2). The principle for determining whether the proposition is true or false is similar to that of primary propositions - if the pair of proofs of **A** and **B** exists (i.e. if both proofs exist) then the proof of **A ∧ B** can be constructed (and so **A ∧ B** is "true").
![And in the BHK interpretation](bhk_and.svg)
@ -315,7 +300,7 @@ Saying that **A** implies **B** (**A → B**) would just mean that there exist a
And the *modus ponens* rule of inference is just the fact that if we have a proof of **A** we can call this function (**A → B**) to obtain a proof of **B**.
(Note that in order for this to work, we also need to define the functions in terms of sets i.e. we need to have a set representing **A → B** for each **A** and **B**, which is possible, but we will not describe how here.)
(Note that in order for this to work, we also need to define the functions in terms of sets i.e. we need to have a set representing **A → B** for each **A** and **B**. We will come back to this later.)
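Since the BHK reading treats proofs as mathematical objects, it translates almost verbatim into types in a functional language. Here is a minimal Haskell sketch of that correspondence (the type aliases are my own names; this is the Curry-Howard view that the text works toward, not code from the book):
```haskell
-- a proof of A ∧ B is a pair of proofs; a proof of A → B is a function
type And a b = (a, b)
type Or  a b = Either a b   -- "or" as a sum, as mentioned later in the text

andIntro :: a -> b -> And a b
andIntro a b = (a, b)

-- modus ponens: given a proof of A and a proof of A → B, produce a proof of B;
-- it is literally just function application
modusPonens :: a -> (a -> b) -> b
modusPonens a f = f a
```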
The *negation* operation
---
@ -343,7 +328,7 @@ The only way for there to be such function is if the set of proof of the proposi
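In code, this reading of negation can be sketched as a function into a type that has no values at all. A hedged Haskell illustration, assuming the standard `Data.Void` module (the alias `Not` is my own, not the book's):
```haskell
import Data.Void (Void, absurd)

-- a proof of ¬A is a function from proofs of A into the empty type Void
type Not a = a -> Void

-- from a proof of A and a proof of ¬A, anything follows (ex falso quodlibet)
contradiction :: a -> Not a -> b
contradiction a notA = absurd (notA a)
```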
Classical vs. intuitionistic logic
---
Although intuinistic logic seems to differ a lot from classical logic, it actually doesn't - if we try to deduce the axiom schemas/rules of inference that correspond to the definition of the structures outlined above, we would see that they are virtually the same as the ones that define classical logic. With one exception concerning the *double negation elimination axiom* that we saw earlier, a version of which is known as *the law of excluded middle*.
Although intuitionistic logic seems to differ a lot from classical logic, it actually doesn't - if we try to deduce the axiom schemas/rules of inference that correspond to the definitions of the structures outlined above, we will see that they are virtually the same as the ones that define classical logic, with one exception concerning the *double negation elimination axiom* that we saw earlier, a version of which is known as *the law of excluded middle*.
![The formula of the principle of the excluded middle](excluded_middle_formula.svg)
@ -352,9 +337,9 @@ This law is valid in classical logic and is true when we look at it in terms of
Logics as categories
===
Leaving the differences between intuinistic and classical logics aside, the BHK interpretation is interesting because it provides that higher-level view of logic, that we need in order to represent it in terms of category theory.
Leaving the differences between intuitionistic and classical logic aside, the BHK interpretation is interesting because it provides the higher-level view of logic that we need in order to construct an interpretation of it based on category theory.
Such higher-level interpretations of logic are sometimes called an *algebraic* interpretations, *algebraic* being an umbrella term describing all structures that can be represented using category theory, like groups and orders.
Such higher-level interpretations of logic are sometimes called *algebraic* interpretations, *algebraic* being an umbrella term describing all structures that can be represented using category theory, like groups and orders.
The Curry-Howard isomorphism
---
@ -370,11 +355,11 @@ This similarity is known as the *Curry-Howard isomorphism*.
Cartesian closed categories
---
Knowing about the Curry-Howard isomorphism and knowing also that programming languages can be described by category theory may lead us to think that *category theory is part of this isomorphism as well*. And we would be quite correct, this is why it's sometimes called the Curry-Howard-*Lambek* isomorphism, Lambek being the person who discovered the the categorical side.
Knowing about the Curry-Howard isomorphism and knowing also that programming languages can be described by category theory may lead us to think that *category theory is part of this isomorphism as well*. And we would be quite correct - this is why it's sometimes known as the Curry-Howard-*Lambek* isomorphism, Lambek being the person who discovered the categorical side.
Let's examine this isomorphism (without being too formal about it). Like all other isomorphisms, it comes in two parts.
The first part is finding a way to convert a logical system into a category - this would not be hard for us, as sets form a category and the flavour of the BHK interpretation that we saw is based on sets.
The first part is finding a way to convert a logical system into a category - this would not be hard for us, as sets form a category and the flavor of the BHK interpretation that we saw is based on sets.
![Logic as a category](category_curry_logic.svg)
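To see the proofs-as-programs direction concretely: under Curry-Howard, a well-typed term is a proof of the proposition obtained by reading its type logically. As a hedged illustration (the labels are mine, and I am not claiming these are exactly the schemas pictured earlier), here are two axiom schemas common to Hilbert-style systems, written as Haskell terms:
```haskell
-- the type of each term, read logically, is the proposition it proves
axiomK :: a -> (b -> a)                                -- A → (B → A)
axiomK a _ = a

axiomS :: (a -> (b -> c)) -> ((a -> b) -> (a -> c))    -- (A → (B → C)) → ((A → B) → (A → C))
axiomS f g a = f a (g a)
```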
@ -391,7 +376,7 @@ Categories that adhere to these criteria are called *cartesian closed categories
Logics as orders
---
We will now do something that is quite characteristic of category theory - examininig a concept in a more limited version of the theory, in order to make things simpler for ourselves.
We will now do something that is quite characteristic of category theory - examining a concept in a more limited version of the theory, in order to make things simpler for ourselves.
So we already saw that a logical system, along with the set of primary propositions, forms a category.
@ -414,7 +399,7 @@ Now let's examine the question that we asked before - exactly which ~~categories
The **and** and **or** operations
---
By now you probably realized that the **and** and **or** operations are the bread and butter of logic (although it's not clear which is which). As we saw, in the BHK interpretation those were represented by set *products* and *sums*. And the equivalent constructs in the realm of order theory are *meets* and *joins* (in category-theorethic terms *products* and *coproducts*.)
By now you probably realized that the **and** and **or** operations are the bread and butter of logic (although it's not clear which is which). As we saw, in the BHK interpretation those were represented by set *products* and *sums*. And the equivalent constructs in the realm of order theory are *meets* and *joins* (in category-theoretic terms *products* and *coproducts*.)
Here comes the first criterion for an order to represent logic accurately - it has to have **meet** and **join** operations for all elements. Having two elements without a meet would mean that you would have a logical system where there are propositions for which you cannot say that one or the other is true. And this is not how logic works, so our order has to have meets and joins for all elements. Incidentally, we already know what such orders are called - they are called *lattices*.
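To make meets and joins concrete, here is a small Haskell sketch (mine, not code from the book) of two familiar lattices: the booleans, where meet is **and** and join is **or**, and the positive integers ordered by divisibility, where meet is gcd and join is lcm.
```haskell
meetB, joinB :: Bool -> Bool -> Bool
meetB = (&&)   -- meet of booleans is "and"
joinB = (||)   -- join of booleans is "or"

meetD, joinD :: Integer -> Integer -> Integer
meetD = gcd    -- meet under divisibility: greatest common divisor
joinD = lcm    -- join under divisibility: least common multiple

main :: IO ()
main = do
  print (meetB True False, joinB True False)   -- (False, True)
  print (meetD 12 18, joinD 12 18)             -- (6, 36)
```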
@ -451,7 +436,7 @@ Conversely, the proof of **True** (or the statement that "**True** is true") is
![True, represented as a Hasse diagram](lattice_true.svg)
So **True** and **False** are just the *greatest* and *least* objects of our order (in category-theorethic terms *terminal* and *initial* object.) This is another example of the categorical concept of duality - **True** and **False** are dual to each other, just like **and** and **or**.
So **True** and **False** are just the *greatest* and *least* objects of our order (in category-theoretic terms *terminal* and *initial* object.) This is another example of the categorical concept of duality - **True** and **False** are dual to each other, just like **and** and **or**.
![The whole logical system, represented as a Hasse diagram](lattice_true_false.svg)
@ -466,7 +451,7 @@ How would this object be described? You guessed it, using categorical language i
![Implies operation](implies.svg)
This structure is actually a categorical reincarnation our favourite rule of inference, the *modus ponens* (**A ∧ (A → B) → B**). This rule is the essence of the **implies** operation and, because we already know how the operations that it contains (**and** and **implies**) are represented in our lattice, we can directly "categorize" it and use it as a definition, saying that **(A → B)** is the object which is related to objects **A** and **B** in such a way that such that **A ∧ (A → B) → B**.
This structure is actually a categorical reincarnation of our favorite rule of inference, the *modus ponens* (**A ∧ (A → B) → B**). This rule is the essence of the **implies** operation and, because we already know how the operations that it contains (**and** and **implies**) are represented in our lattice, we can directly "categorize" it and use it as a definition, saying that **(A → B)** is the object which is related to objects **A** and **B** in such a way that **A ∧ (A → B) → B**.
![Implies operation with impostors](implies_modus_ponens.svg)

View File

@ -65,3 +65,11 @@ bivalence
Zermelo
Fraenkel
distributivity
CTI
monads
monad
hom-set
Curry-Howard-Lambek
Curry-Howard
Lambek
cartesian

View File

@ -14,42 +14,42 @@ Category theory has a variety of applications:
- It is used in other scientific disciplines e.g. quantum mechanics:
> [...] the first paper about diagrammatic reasoning for novel quantum features (Coecke, 2003) was presented at the first QPL. The categorical formalisation of this result (Abramsky and Coecke, 2004), now referred to as categorical quantum mechanics, became a hit within the computer science semantics community, and ultimately allowed for several young people to establish research careers in this area. **Top computer science conferences (e.g. LiCS and ICALP) indeed regularly accept papers on categorical quantum mechanics, and more recently leading physics journals (e.g. PRL and NJP) have started to do so too**.
> [...] The categorical formalization of this result (Abramsky and Coecke, 2004), now referred to as categorical quantum mechanics, became a hit within the computer science semantics community, and ultimately allowed for several young people to establish research careers in this area. **Top computer science conferences (e.g. LiCS and ICALP) indeed regularly accept papers on categorical quantum mechanics, and more recently leading physics journals (e.g. PRL and NJP) have started to do so too**.
Source: Picturing Quantum Processes by Bob Coecke (Cambridge University Press - 2017)
- It may also be used as an axiomatic foundation for mathematics, as an alternative to set theory and other proposed foundations.
- Last but not least (perhaps even most importantly) category theory has a very good potential as a *teaching tool* for a variety of other concepts.
- Last but not least (perhaps even most importantly), category theory has very good potential as a *teaching tool* for a variety of other subjects.
About the book
---
I am writing a primer on category theory and various related concepts in "higher" mathematics that is *really* accessible to people with no prior exposure to the subject, without being dumbed down, by utilizing visual explanations.
My book serves as chapter 0 going through the gist of the material that any other similar introductory book covers, but covering it in a way that would enable non-mathematicians to swift through with ease.
My book serves as chapter 0, going through the gist of the material covered by other similar introductory books, but doing so in a way that would enable non-mathematicians to go through it swiftly and with ease.
Reading it would enable my readers to effortlessly go through any academic introduction to category theory, as well as to start tackling resources that use category theory as a tool to treat other subjects.
About the author
---
I am a web programmer with background in design and technical communication. I have been studying category theory for many years and applying what I learned in my work.
I am a nerd who works in web programming, with a background in design and technical communication. I first discovered category theory through programming, but then started pursuing it in its own right, finding math more and more fascinating.
Currently living in Sofia with my wife and kid.
Similar resources
===
To my knowledge, Category Theory Illustrated (CTI) does not have any *direct* competition, as the materials I have seen usually have a much higher barrier to entry.
Here is [a good list of category theory introductions](https://github.com/prathyvsh/category-theory-resources)
In this document, I will concentrate on my favourite introductions, target the two biggest target audiences for CTI - programmers (nerds) and scientists (academics).
"Category Theory Illustrated" (which I will call CTI from now on), however possesses some unique features that make it different from the ones that I looked into. Here I will examine those features of CTI when compared specifically to my two favorite introductions, which also target the two biggest target audiences for CTI - programmers (nerds) and scientists (academics).
- Category Theory for Programmers by Bartosz Milewski (self-published- 2018) - [Full text](https://github.com/hmemcpy/milewski-ctfp-pdf/)
- Category Theory for the Sciences by David I. Spivak (MIT press - 2014) - [Free version](http://math.mit.edu/~dspivak/CT4S.pdf)
For comparison, I am attaching excerpts from both of them, as well as one from CTI, that explain the concept of an *order*
I am attaching excerpts from both of them, as well as one from CTI, that explain a similar concept - the concept of an *order*.
Category Theory for Programmers
---
@ -90,7 +90,7 @@ have
[Source](http://math.mit.edu/~dspivak/CT4S.pdf) (page 93)
Orders also discussed on page 132 in the same link
The categorical aspect of orders is described on page 132 in the same link.
Category Theory Illustrated
---
@ -154,20 +154,17 @@ Category Theory Illustrated
[Source](/category-theory-illustrated/04_order)
Features
===
I will now continue by examining the points which make my work unique compared to those and other similar books.
Diagrams
---
Category theory is very *visual* in its nature. Category-theoretic diagrams are not merely supplemental illustrations for the concepts, but are often the very language that is used to define those concepts e.g. defining a more advanced concept such as *natural transformation* without diagrams is practically impossible. So having a lot of diagrams is essential for people who are inexperienced to understand the concepts.
Category theory is very *visual* in its nature. Category-theoretic diagrams are not merely supplemental illustrations for its concepts, but are often the very language that is used to define those concepts, e.g. defining a more advanced concept such as *natural transformation* without diagrams is practically impossible. So having a lot of diagrams is essential for inexperienced readers to understand the concepts.
However, books on category theory typically have only as many diagrams as other math textbooks. I consider this a huge missed opportunity for making the subject more approachable. This was the original motivation for the creation of CTI.
Besides being more in quantity, the diagrams of CTI combine many different prioms from traditional communication design, such as the use of color for example, in order to illuminate the different subjects and abstractions that I am covering.
Besides being more numerous, the diagrams of CTI use many techniques from traditional communication design (such as the use of color, for example) in order to illuminate the different subjects and abstractions that I am covering.
Both books that I am reviewing have fewer diagrams than CTI. From the resources that I have examined, the only author who stresses diagrams as much as I do is Tai-Danae Bradley (her blog is [https://www.math3ma.com/](https://www.math3ma.com/).)
@ -176,14 +173,14 @@ The diagrams in CTI had received universal praise from many audiences. The unive
Verbosity of language
---
Leaving the diagrams aside, the descriptions in CTI are much more verbose than equivalent descriptions in other books. This is clearly visible in the excerpts attached: although the authors of the books that I am reviewing have quite different styles of presentation, they both move much faster with the exposition, which would be OK for readers who have prior experience with math and computer science, but would be very hard to follow by lay people (I know that from my own experience).
Leaving the diagrams aside, we can see that the descriptions in CTI are much more verbose than equivalent descriptions in other books. This is clearly visible in the excerpts attached: although the authors of the books that I am reviewing have quite different styles of presentation, they both move much faster with the exposition, which would be OK for readers who have prior experience with math and computer science, but would be very hard for lay people to follow (I know that from my own experience).
In CTI, I try above all to keep a really slow pace and be gentle in my exposition, taking the time to stress out the important parts of the descriptions and the places where misunderstanding might occur, such as the fact that the term "object" has entirely different meanings in computer science and in category theory.
In CTI, I try above all to be gentle in my exposition, taking the time to stress the important parts of the descriptions and the places where misunderstanding might occur, such as the fact that the term "object" has entirely different meanings in computer science and in category theory.
Chapter on logic
---
Category theory is deeply bound with mathematical logic. Although the connection is mentioned in many books, it isn't fully examined by any introductory texts on category theory that I have seen.
Category theory is deeply bound up with mathematical logic. But although the connection is mentioned in many books, it isn't fully examined in any introductory text on category theory that I have seen.
CTI has a dedicated chapter on logic (the longest one in the book), which introduces the connection and makes use of it to give context to some categorical phenomena, such as limits.
@ -197,7 +194,7 @@ Programmers
### who are curious about category theory, because they are into functional programming.
Functional programming is on the rise with both new languages and technologies that are gaining traction (Elixir, Clojure), and mainstream languages, trying to catch up (Java, JavaScript). There are many programmers who are interested in functional programming and category theory, but lack the math background to tackle the existing resources. CTI is a way for those people to understand "what the fuss is about" before diving to such technologies.
Functional programming is on the rise, with both new languages and technologies gaining traction (Elixir, Clojure) and mainstream languages trying to catch up (Java, JavaScript), and so there are many programmers who are interested in functional programming and category theory but lack the math background to tackle the existing resources. CTI is a way for those people to understand "what the fuss is about" before diving into such technologies.
Students
---