Merge branch 'test' of https://github.com/urbit/urbit into test

2024-12-17 11:51:32 +03:00 · 2014-08-22 18:38:33 -07:00 · 2014-08-22 18:38:33 -07:00 · b7270db63a
commit b7270db63a
parent b28000dd01 ef682f2aca
1 changed files with 191 additions and 19 deletions
--- a/main/pub/src/doc/ref/hoon/morphology.md
+++ b/main/pub/src/doc/ref/hoon/morphology.md
@ -1,40 +1,212 @@
 Morphology
 ==========

-Hoon is a statically typed language that compiles to Nock. 
+Hoon is a statically typed language that compiles to Nock.  You can test out
+most of the compilation routines in the REPL, so we'll point out some of those
+as we go along.  At the highest level, there is `++make`, which turns text into
+nock.

-A type is a function whose domain is the set of all nouns and whose range is the set of all nouns that are members of that type.
+    ~hoclur-bicrel/try=> `*`(make '|=  [@ @ud]  +<')
+    [8 [1 0 0] [1 0 6] 0 1]

-The compilation process is as follows:
+If you want to know how Hoon code is compiled, start your investigation in
+`++make`.  Another way to do this, of course, is with `!=`.

-First, a runic expression is parsed into an abstact syntax-tree, called a `twig`
+    ~hoclur-bicrel/try=> !=(|=([@ @ud] +<))
+    [8 [1 0 0] [1 0 6] 0 1]

-    expression => twig
+The difference here is of course that `!=` requires you to put in the code
+literally and syntactically correctly.  So, for example, you can't use tall
+form runes within a wide form usage of `!=`.  `++make` may be called on literal
+text, which may even be programmatically generated.  Essentially, `++make` is
+more general, but `!=` is convenient for learning and debugging.

-A subject type is generated from the twig. This type describes the subject
-of the Nock formula that the twig compiles to.
+At a high level, the compilation process is as follows:

-    twig => [subject-type twig]
+First, a runic expression is parsed into an abstact syntax tree, called a
+`twig`:

-The twig is then compiled into nock formula, and the type of the product of
-the formula is inferred.
+    text => twig
+
+This can be tested at the command line with `++ream`:
+
+    ~hoclur-bicrel/try=> (ream '|=  [@ @ud]  +<')
+    [ %brts
+      p=[p=[%axil p=[%atom p=~.]] q=[%axil p=[%atom p=~.ud]]]
+      q=[%cnts p=~[[%.y p=6]] q=~]
+    ]
+
+This is simply a parser -- no compilation happens here.  The details of the
+parser can be found in the syntax section of this doc.
+
+Next, we get into the real meat of the compiler.  In the compiler proper, which
+is `++mint` in `++ut`, we take a twig and a subject type and turn it into a
+product type and a blob of nock.

    [subject-type twig] => [product-type nock-formula]

-As long as subject-type is a correct description of some subject, you can
-take any twig and compile it against subject-type, producing a formula such
-that 
+For example, we can call `++mint` on the twig we produced earlier with a
+subject type of any noun (pardon the necessity for splitting the input line).

-    *[subject formula] 
+    ~hoclur-bicrel/try=> (~(mint ut %noun) %noun
+        [%brts [[%axil [%atom ~.]] [%axil [%atom ~.ud]]] [%cnts ~[[%.y 6]] ~]])

-is a product correctly described by product-type.
+    [   p
+      [ %core
+        p=[%cell p=[%cell p=[%atom p=%$] q=[%atom p=%ud]] q=%noun]
+          q
+        [ p=%gold
+          q=[%cell p=[%cell p=[%atom p=%$] q=[%atom p=%ud]] q=%noun]
+          r=[p=[0 6] q={[p=%$ q=[%ash p=[%cnts p=~[[%.y p=6]] q=~]]]}]
+        ]
+      ]
+      q=[%8 p=[%1 p=[0 0]] q=[p=[%1 p=[0 6]] q=[%0 p=1]]]
+    ]
+
+Note that in the result we get first a type, which is a gold core with an ash
+gate.  The second part, though, is exactly the nock that was produced by
+`++make`.  It looks a little different because it has some faces on it, but it
+is in fact the same.
+
+The astute reader will notice that we casually starting referring to types
+without defining them.  We must rectify this heinous atrocity.
+
+A "type" is simply a set of possible values, combined with a set of semantics
+for operating on these values.  These are defined in Hoon as one of the ten
+kinds of type found in `++type`.  At its most general, the `noun` type contains
+all Hoon nouns.  Everything in Hoon is a noun.  The `%void` type contains no
+values at all.  The `%atom` type contains only atoms.  The `%cube` type is
+parameterized by a value, and it contains only that single value.  The other
+types are defined in the lexicon.
+
+Our type inference algorithms work well enough that in Hoon there is no direct
+syntax for defining or declaring a type. There is only a syntax for
+constructing twigs.  Types are always produced by inference.  Of course, since
+we often need to refer to types, there are a number of runes that are specfically
+used for referring to types.  These runes act on `tile`s.
+
+What are the usual things we want to do with types?  We sometimes want to test
+if some value is of a particular type.  We sometimes want to declare that a
+gate takes a particular type of argument, or that it produces a particular type
+of result.  We sometimes want to coerce an untyped value into a type so that we
+always know what type we are operating on.  We could write twigs to do each of
+these things for every type we wish to use, and our type inference algorithms
+will figure out what type we're talking about each time.
+
+It would work, but it would be miserable.  Thankfully, there's an easier way.
+It is possible to generate twigs for each of the above use cases from two
+pieces of information:  (1) a tile describing a type and (2) a reference to
+which case we want to generate.  This eliminates much of the tedious
+boilerplate.
+
+The first and most important thing to remember about tiles is that tiles are
+not types.  The second and nearly as important thing to remember about tiles
+is that every tile has a unique "icon", which is an associated type.  A tile
+is not a type, but every tile corresponds to some type.  But what is a tile?
+A tile is a nice little representation of a type that may be acted upon and
+reduced in several ways.  Remember, types never show up in our code -- they are
+always produced by inference.  But sometimes it's convenient to have little
+macros to do all the little things we usually want to do with types without
+having to rewrite them for every type.
+
+Formalizing the operations on a tile, there are exactly four.  We will briefy
+describe them here, but they are documented more thoroughly elsewhere.
+
+Bunting a tile is simply creating a blank default example of the tile's icon.
+This may seem to have limited usefulness, but this is actually the most common
+use of tiles.  This is due to the the way in which we create, for example,
+gates.  The sample of a gate is usually defined in terms of a tile, as in `|=
+[p=@ q=@]  (add p q)`.  The `|=` rune takes a tile (in this case `[p=@ q=@]`
+and bunts it to create a default sample of `[0 0]` with the type of a cell of
+two atoms.  Note that this is different from `|=  [p=%0 q=%0]  (add p q)`,
+which still bunts to create a default sample of `[0 0]`, but whose type is a
+cell of two constant `0`s.  In the first case, when we call our gate, the type
+checker will make sure that we are only ever replacing the sample with a cell
+of two atoms, while in the second case the type checker will enforce that we
+are only ever replacing the sample with a cell of two `0`s.
+
+Clamming a tile is creating a gate whose sample is a noun and whose product is
+a member of the icon.  This allows us to coerce any noun into any type.  If the
+noun is not a member of that type, then we produce some member of the type that
+is either in some sense "similar" to the given noun or else we simply give a
+default value, usually the bunt of this tile.
+
+Fishing a tile is testing whether a particular value is of that type.  This
+returns a loobean.  This obviously has many uses.
+
+Whipping a tile is something you'll never have to do directly.  Whipping is
+never directly used in any Hoon source code.  Whipping is used internally by
+clamming.
+
+In summary, a tile is simply a convenient syntax for creating well-typed nouns.
+Say that again, *a tile is simply a convenient syntax for creating well-typed
+nouns.*  A tile is not a tiwg, but tiles always are reduced statically in one
+of four ways to a twig.  `++tile` is a sort of intermediate representation
+between text and twigs that is often used when we're referring to types.
+
+Returning from our digression about types and tiles to our discussion of the
+compilation process, recall that `++mint` takes a subject-type and a twig and
+compiles it into a product-type and a blob of nock.
+
+    [subject-type twig] => [product-type nock-formula]
+
+As we compile each twig, we compile it against a subject type.  During the
+compilation of a twig, we obviously don't have access to the value of the
+subject (else compilation would include running the code).  We do, however,
+have some guarantees about its value.  We have its type.
+
+Most runes don't change the subject type, but some do.  In the most simple
+example, consider `=>  p  q`, which means to evaluate `q` with a subject of
+`p`.  Here, the subject type of `q` is simply the type of `p`.
+
+In a slightly more complicated example, consider `?:  p  q  r`, which means
+simply to evaluate `q` if `p` is true, else evaluate `r`.  When compiling `q`,
+we get to assume in the subject type that `p` is true while when compiling `r`,
+we get to assume in the subject type that `p` is false.  This is used to great
+practical purpose, for example, when handling lists.  If `p` is a test whether
+the list is empty (which would be `?=(~ l)`), then in `q` we can assume that
+the list is empty while in `r` we know that the list has a head and a tail.
+Without that test, if you attempt to refer to the head of the list, the
+compiler will complain that it cannot verify there is even a head to which to
+refer.  Thus, the compiler will generate `find-limb` and `find-fork` errors.
+The `find-limb` refers to its inability to find the head of the list while the
+`find-fork` refers to the fact that it's looking in a fork type -- that is, it
+believes that the values is of one of multiple types, and in at least one of
+the constituent types there is no head.  The error messages are described in
+detail in the lexicon.
+
+Recapping, when compiling Hoon code, we compile each individual twig against
+the subject with which it will eventually be called.  The product is a nock
+formula and the type of value that it may produce.  Thus, both the nock formula
+and the product type may depend on the both the subject type and the twig in
+question.
+
+It's obvious that the product type will usually depend on the subject type, but
+it's less obvious when the nock formula will depend on the type.  It does,
+however, happen at times.  When resolving a face, for example, the axis that
+ends up in the nock formula depends on the where the face is in the subject.
+We only know this because faces are in the subject type.  Thus, in `=>  [p=42
+q=1.337]  p`, the `p` twig compiles to nock `[0 2]` while in `=> [q=42 p=1.337]
+p`, the `p` twig compiles to nock `[0 3]`.  This is true even though the actual
+nock produced by `[p=42 q=1.337]` is the same as that produced by `[q=42
+p=1.337]`.  Thus, the nock formula may depend on the subject type.  It is for
+this reason that we say that a type defines not only a set of values, but also
+the semantics for operating on those values.
+
+As long as some value is in the subject type, you can run it against the
+produced nock formula as `*[subject formula]` and get a value in the product
+type.
+
+We've ignored one question thus far:  it's all well and good once we've started
+compiling, for we know the subject type.  But what subject type do we start
+with?  We could, of course, put some restrictions on the subject type of
+compiled Hoon code, but (1) there's no reason to do that, and (2) since this
+will be returned as nock, and nock is untyped, the compiler cannot actually
+make any guarantees about what subject it will be called with.  Thus, we start
+the compilation with a subject type of all nouns.

-This works well enough that in Hoon there is no direct syntax for defining or
-declaring a type. There is only a syntax for constructing twigs.  Types are
-always produced by inference.


-Let's look at a simple example of the above proc


 Hoon has 120 [XX count] digraph runes. The choice of glyph is not random. The