mirror of
https://github.com/ilyakooo0/urbit.git
synced 2024-12-17 11:51:32 +03:00
Merge branch 'test' of https://github.com/urbit/urbit into test
This commit is contained in:
commit
b7270db63a
@ -1,40 +1,212 @@
|
||||
Morphology
|
||||
==========
|
||||
|
||||
Hoon is a statically typed language that compiles to Nock.
|
||||
Hoon is a statically typed language that compiles to Nock. You can test out
|
||||
most of the compilation routines in the REPL, so we'll point out some of those
|
||||
as we go along. At the highest level, there is `++make`, which turns text into
|
||||
nock.
|
||||
|
||||
A type is a function whose domain is the set of all nouns and whose range is the set of all nouns that are members of that type.
|
||||
~hoclur-bicrel/try=> `*`(make '|= [@ @ud] +<')
|
||||
[8 [1 0 0] [1 0 6] 0 1]
|
||||
|
||||
The compilation process is as follows:
|
||||
If you want to know how Hoon code is compiled, start your investigation in
|
||||
`++make`. Another way to do this, of course, is with `!=`.
|
||||
|
||||
First, a runic expression is parsed into an abstact syntax-tree, called a `twig`
|
||||
~hoclur-bicrel/try=> !=(|=([@ @ud] +<))
|
||||
[8 [1 0 0] [1 0 6] 0 1]
|
||||
|
||||
expression => twig
|
||||
The difference here is of course that `!=` requires you to put in the code
|
||||
literally and syntactically correctly. So, for example, you can't use tall
|
||||
form runes within a wide form usage of `!=`. `++make` may be called on literal
|
||||
text, which may even be programmatically generated. Essentially, `++make` is
|
||||
more general, but `!=` is convenient for learning and debugging.
|
||||
|
||||
A subject type is generated from the twig. This type describes the subject
|
||||
of the Nock formula that the twig compiles to.
|
||||
At a high level, the compilation process is as follows:
|
||||
|
||||
twig => [subject-type twig]
|
||||
First, a runic expression is parsed into an abstact syntax tree, called a
|
||||
`twig`:
|
||||
|
||||
The twig is then compiled into nock formula, and the type of the product of
|
||||
the formula is inferred.
|
||||
text => twig
|
||||
|
||||
This can be tested at the command line with `++ream`:
|
||||
|
||||
~hoclur-bicrel/try=> (ream '|= [@ @ud] +<')
|
||||
[ %brts
|
||||
p=[p=[%axil p=[%atom p=~.]] q=[%axil p=[%atom p=~.ud]]]
|
||||
q=[%cnts p=~[[%.y p=6]] q=~]
|
||||
]
|
||||
|
||||
This is simply a parser -- no compilation happens here. The details of the
|
||||
parser can be found in the syntax section of this doc.
|
||||
|
||||
Next, we get into the real meat of the compiler. In the compiler proper, which
|
||||
is `++mint` in `++ut`, we take a twig and a subject type and turn it into a
|
||||
product type and a blob of nock.
|
||||
|
||||
[subject-type twig] => [product-type nock-formula]
|
||||
|
||||
As long as subject-type is a correct description of some subject, you can
|
||||
take any twig and compile it against subject-type, producing a formula such
|
||||
that
|
||||
For example, we can call `++mint` on the twig we produced earlier with a
|
||||
subject type of any noun (pardon the necessity for splitting the input line).
|
||||
|
||||
*[subject formula]
|
||||
~hoclur-bicrel/try=> (~(mint ut %noun) %noun
|
||||
[%brts [[%axil [%atom ~.]] [%axil [%atom ~.ud]]] [%cnts ~[[%.y 6]] ~]])
|
||||
|
||||
is a product correctly described by product-type.
|
||||
[ p
|
||||
[ %core
|
||||
p=[%cell p=[%cell p=[%atom p=%$] q=[%atom p=%ud]] q=%noun]
|
||||
q
|
||||
[ p=%gold
|
||||
q=[%cell p=[%cell p=[%atom p=%$] q=[%atom p=%ud]] q=%noun]
|
||||
r=[p=[0 6] q={[p=%$ q=[%ash p=[%cnts p=~[[%.y p=6]] q=~]]]}]
|
||||
]
|
||||
]
|
||||
q=[%8 p=[%1 p=[0 0]] q=[p=[%1 p=[0 6]] q=[%0 p=1]]]
|
||||
]
|
||||
|
||||
Note that in the result we get first a type, which is a gold core with an ash
|
||||
gate. The second part, though, is exactly the nock that was produced by
|
||||
`++make`. It looks a little different because it has some faces on it, but it
|
||||
is in fact the same.
|
||||
|
||||
The astute reader will notice that we casually starting referring to types
|
||||
without defining them. We must rectify this heinous atrocity.
|
||||
|
||||
A "type" is simply a set of possible values, combined with a set of semantics
|
||||
for operating on these values. These are defined in Hoon as one of the ten
|
||||
kinds of type found in `++type`. At its most general, the `noun` type contains
|
||||
all Hoon nouns. Everything in Hoon is a noun. The `%void` type contains no
|
||||
values at all. The `%atom` type contains only atoms. The `%cube` type is
|
||||
parameterized by a value, and it contains only that single value. The other
|
||||
types are defined in the lexicon.
|
||||
|
||||
Our type inference algorithms work well enough that in Hoon there is no direct
|
||||
syntax for defining or declaring a type. There is only a syntax for
|
||||
constructing twigs. Types are always produced by inference. Of course, since
|
||||
we often need to refer to types, there are a number of runes that are specfically
|
||||
used for referring to types. These runes act on `tile`s.
|
||||
|
||||
What are the usual things we want to do with types? We sometimes want to test
|
||||
if some value is of a particular type. We sometimes want to declare that a
|
||||
gate takes a particular type of argument, or that it produces a particular type
|
||||
of result. We sometimes want to coerce an untyped value into a type so that we
|
||||
always know what type we are operating on. We could write twigs to do each of
|
||||
these things for every type we wish to use, and our type inference algorithms
|
||||
will figure out what type we're talking about each time.
|
||||
|
||||
It would work, but it would be miserable. Thankfully, there's an easier way.
|
||||
It is possible to generate twigs for each of the above use cases from two
|
||||
pieces of information: (1) a tile describing a type and (2) a reference to
|
||||
which case we want to generate. This eliminates much of the tedious
|
||||
boilerplate.
|
||||
|
||||
The first and most important thing to remember about tiles is that tiles are
|
||||
not types. The second and nearly as important thing to remember about tiles
|
||||
is that every tile has a unique "icon", which is an associated type. A tile
|
||||
is not a type, but every tile corresponds to some type. But what is a tile?
|
||||
A tile is a nice little representation of a type that may be acted upon and
|
||||
reduced in several ways. Remember, types never show up in our code -- they are
|
||||
always produced by inference. But sometimes it's convenient to have little
|
||||
macros to do all the little things we usually want to do with types without
|
||||
having to rewrite them for every type.
|
||||
|
||||
Formalizing the operations on a tile, there are exactly four. We will briefy
|
||||
describe them here, but they are documented more thoroughly elsewhere.
|
||||
|
||||
Bunting a tile is simply creating a blank default example of the tile's icon.
|
||||
This may seem to have limited usefulness, but this is actually the most common
|
||||
use of tiles. This is due to the the way in which we create, for example,
|
||||
gates. The sample of a gate is usually defined in terms of a tile, as in `|=
|
||||
[p=@ q=@] (add p q)`. The `|=` rune takes a tile (in this case `[p=@ q=@]`
|
||||
and bunts it to create a default sample of `[0 0]` with the type of a cell of
|
||||
two atoms. Note that this is different from `|= [p=%0 q=%0] (add p q)`,
|
||||
which still bunts to create a default sample of `[0 0]`, but whose type is a
|
||||
cell of two constant `0`s. In the first case, when we call our gate, the type
|
||||
checker will make sure that we are only ever replacing the sample with a cell
|
||||
of two atoms, while in the second case the type checker will enforce that we
|
||||
are only ever replacing the sample with a cell of two `0`s.
|
||||
|
||||
Clamming a tile is creating a gate whose sample is a noun and whose product is
|
||||
a member of the icon. This allows us to coerce any noun into any type. If the
|
||||
noun is not a member of that type, then we produce some member of the type that
|
||||
is either in some sense "similar" to the given noun or else we simply give a
|
||||
default value, usually the bunt of this tile.
|
||||
|
||||
Fishing a tile is testing whether a particular value is of that type. This
|
||||
returns a loobean. This obviously has many uses.
|
||||
|
||||
Whipping a tile is something you'll never have to do directly. Whipping is
|
||||
never directly used in any Hoon source code. Whipping is used internally by
|
||||
clamming.
|
||||
|
||||
In summary, a tile is simply a convenient syntax for creating well-typed nouns.
|
||||
Say that again, *a tile is simply a convenient syntax for creating well-typed
|
||||
nouns.* A tile is not a tiwg, but tiles always are reduced statically in one
|
||||
of four ways to a twig. `++tile` is a sort of intermediate representation
|
||||
between text and twigs that is often used when we're referring to types.
|
||||
|
||||
Returning from our digression about types and tiles to our discussion of the
|
||||
compilation process, recall that `++mint` takes a subject-type and a twig and
|
||||
compiles it into a product-type and a blob of nock.
|
||||
|
||||
[subject-type twig] => [product-type nock-formula]
|
||||
|
||||
As we compile each twig, we compile it against a subject type. During the
|
||||
compilation of a twig, we obviously don't have access to the value of the
|
||||
subject (else compilation would include running the code). We do, however,
|
||||
have some guarantees about its value. We have its type.
|
||||
|
||||
Most runes don't change the subject type, but some do. In the most simple
|
||||
example, consider `=> p q`, which means to evaluate `q` with a subject of
|
||||
`p`. Here, the subject type of `q` is simply the type of `p`.
|
||||
|
||||
In a slightly more complicated example, consider `?: p q r`, which means
|
||||
simply to evaluate `q` if `p` is true, else evaluate `r`. When compiling `q`,
|
||||
we get to assume in the subject type that `p` is true while when compiling `r`,
|
||||
we get to assume in the subject type that `p` is false. This is used to great
|
||||
practical purpose, for example, when handling lists. If `p` is a test whether
|
||||
the list is empty (which would be `?=(~ l)`), then in `q` we can assume that
|
||||
the list is empty while in `r` we know that the list has a head and a tail.
|
||||
Without that test, if you attempt to refer to the head of the list, the
|
||||
compiler will complain that it cannot verify there is even a head to which to
|
||||
refer. Thus, the compiler will generate `find-limb` and `find-fork` errors.
|
||||
The `find-limb` refers to its inability to find the head of the list while the
|
||||
`find-fork` refers to the fact that it's looking in a fork type -- that is, it
|
||||
believes that the values is of one of multiple types, and in at least one of
|
||||
the constituent types there is no head. The error messages are described in
|
||||
detail in the lexicon.
|
||||
|
||||
Recapping, when compiling Hoon code, we compile each individual twig against
|
||||
the subject with which it will eventually be called. The product is a nock
|
||||
formula and the type of value that it may produce. Thus, both the nock formula
|
||||
and the product type may depend on the both the subject type and the twig in
|
||||
question.
|
||||
|
||||
It's obvious that the product type will usually depend on the subject type, but
|
||||
it's less obvious when the nock formula will depend on the type. It does,
|
||||
however, happen at times. When resolving a face, for example, the axis that
|
||||
ends up in the nock formula depends on the where the face is in the subject.
|
||||
We only know this because faces are in the subject type. Thus, in `=> [p=42
|
||||
q=1.337] p`, the `p` twig compiles to nock `[0 2]` while in `=> [q=42 p=1.337]
|
||||
p`, the `p` twig compiles to nock `[0 3]`. This is true even though the actual
|
||||
nock produced by `[p=42 q=1.337]` is the same as that produced by `[q=42
|
||||
p=1.337]`. Thus, the nock formula may depend on the subject type. It is for
|
||||
this reason that we say that a type defines not only a set of values, but also
|
||||
the semantics for operating on those values.
|
||||
|
||||
As long as some value is in the subject type, you can run it against the
|
||||
produced nock formula as `*[subject formula]` and get a value in the product
|
||||
type.
|
||||
|
||||
We've ignored one question thus far: it's all well and good once we've started
|
||||
compiling, for we know the subject type. But what subject type do we start
|
||||
with? We could, of course, put some restrictions on the subject type of
|
||||
compiled Hoon code, but (1) there's no reason to do that, and (2) since this
|
||||
will be returned as nock, and nock is untyped, the compiler cannot actually
|
||||
make any guarantees about what subject it will be called with. Thus, we start
|
||||
the compilation with a subject type of all nouns.
|
||||
|
||||
This works well enough that in Hoon there is no direct syntax for defining or
|
||||
declaring a type. There is only a syntax for constructing twigs. Types are
|
||||
always produced by inference.
|
||||
|
||||
|
||||
Let's look at a simple example of the above proc
|
||||
|
||||
|
||||
Hoon has 120 [XX count] digraph runes. The choice of glyph is not random. The
|
||||
|
Loading…
Reference in New Issue
Block a user