---
title: Hoon 101.0: nouns, spans, and molds
sort: 0
spam: true
next: true
---

# Hoon 101.0: nouns, spans, and molds

Hoon is a strict, higher-order typed pure-functional language.

Why Hoon? Typed functional languages are known for a pleasant
phenomenon: once your code compiles, it's quite likely to work.
But most typed functional languages are conceptually dependent on
abstract advanced math, and difficult to understand without it.

Hoon is a typed FP language for the common street programmer.
Well-written Hoon is as concrete and data-oriented as possible.
The less functional magic you use, the better. But the magic is
there, mostly, if you need it.

The main disadvantage of Hoon is that its syntax and semantics
are unfamiliar. The syntax will remind too many people of Perl,
but like most human languages (and unlike Perl) it combines a
regular core structure with irregular variations. And Hoon's
semantic complexity is bounded by the size of the compiler: type
inference plus code generation are 2000 lines of Hoon. Most
people's experience is that the language is much easier to learn
than it looks.

> The name "Hoon" is from the Wallace Stevens poem, _Tea at the
Palaz of Hoon_. It also means "hooligan" in Australian.

## How to use this tutorial

Ideally, you've installed an Urbit planet (if you have a ticket)
or comet (if you don't). See the [user doc](../../../user). We
recommend opening up the dojo and just typing the examples; you
don't know a language until you know it in your fingers.

## Nouns: data made boring

A noun is an atom or a cell. An atom is any unsigned integer. A
cell is an ordered pair of nouns.

Nouns are trees; they have no cycles. Noun comparison is always
by value (the programmer can't test pointer equality). Nouns are
strict; there is no such thing as an infinite noun. And nouns are
immutable. There's just no way to have any real fun with nouns.

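
For readers coming from other languages, the definition above can be modeled in a few lines of Python. This is only a sketch, not Hoon: it assumes an atom is a non-negative `int` and a cell is a 2-tuple, and the name `is_noun` is made up for illustration.

```python
def is_noun(x):
    """Return True if x is an atom or a cell of nouns (Python model only)."""
    if isinstance(x, bool):
        # bool is a subclass of int in Python; we don't count it as an atom.
        return False
    if isinstance(x, int):
        # An atom is any unsigned integer, of any size.
        return x >= 0
    # A cell is an ordered pair of nouns.
    return (
        isinstance(x, tuple)
        and len(x) == 2
        and is_noun(x[0])
        and is_noun(x[1])
    )


# Tuples are immutable and compare by value, like nouns.
left = (42, (1, 2))
right = (0x2A, (1, 2))
print(is_noun(left), left == right)
```

Because tuples cannot contain themselves and cannot be modified in place, the model also has no cycles and no mutation.
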
> Nouns are Lisp's S-expressions, minus a lot of hacks, tricks,
and features that made sense 50 years ago. In particular,
because atoms are not tagged (an atom can encode a string, for
instance), nouns depend on a static type system at a higher
layer. How do you print an atom if you don't know whether it's a
string or a number?

## A type system for nouns

One obstacle to learning Hoon is that it has two quite distinct
concepts that might equally be called a "type." Worse, most
other typed functional languages are mathy and share a basically
mathematical concept of "type." Hoon does not have this concept
at all. We can't avoid using the T-word occasionally, but it has
no precise meaning in Hoon and can be extremely confusing.

Hoon's two kinds of "type" are `span` and `mold`. A span is both
a constructively defined set of nouns, and a semantic convention
for interpreting the nouns in that set. A `mold` is a function
whose range is some useful span. A mold is always idempotent (for
any noun `x`, `f(x)` equals `f(f(x))`), and its domain is any noun.

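
To make the idempotence property concrete, here is a small Python model of two molds. It is a sketch under assumptions, not Hoon's implementation: the names `atom_mold` and `pair_mold` are hypothetical, and replacing a non-matching noun with a default of `0` is an assumed normalization policy chosen for illustration.

```python
def atom_mold(noun):
    """Hypothetical mold whose range is 'any atom'.

    Accepts any noun; anything that is not an atom becomes the default 0.
    """
    if isinstance(noun, int) and not isinstance(noun, bool) and noun >= 0:
        return noun
    return 0


def pair_mold(noun):
    """Hypothetical mold whose range is 'a cell of two atoms'."""
    if isinstance(noun, tuple) and len(noun) == 2:
        return (atom_mold(noun[0]), atom_mold(noun[1]))
    return (0, 0)


# The domain is any noun; applying the mold twice changes nothing more.
for sample in [42, (1, 2), (1, (2, 3)), ((1, 2), 3)]:
    once = pair_mold(sample)
    assert pair_mold(once) == once
    print(sample, "->", once)
```

The range of `pair_mold` is exactly the set of pairs of atoms, which is the sense in which a mold defines a span.
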
One way to explain this is that while a span is what most
languages call a "type," Hoon has no syntax for the programmer to
define a span directly. Instead, we use inference to define it
as the range of a mold function. This mold can also be used to
validate or normalize untrusted, untyped data -- a common problem
in modern programming, because networks.

> Sending a noun over the network is a good example of why Hoon
is different. In a normal modern language, you serialize and
deserialize a data type by extending your type to implement a
serialization interface. A really good language can process your
type definition and automatically generate this code. In Hoon,
we have one function `jam` that converts any noun to an atom,
and another `cue` that inverts `jam`. To validate, the receiver
applies its own mold to the cued noun, and we've sent typed data
over the network without any attack surface (except `jam` and
`cue`, which fit on a page). No custom serialization code,
manual or generated, is required. The mold itself is not sent;
protocol agreement is out of band.

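
The shape of that round trip can be sketched in Python. To be clear about assumptions: `toy_jam` and `toy_cue` are hypothetical names, and the bit format below is a made-up toy encoding, not Urbit's real `jam` format. The sketch only shows that one generic pair of functions can turn any noun into a single integer and back, with no per-type serialization code.

```python
def _encode(noun):
    """Encode a noun (int or 2-tuple) as a string of '0'/'1' characters."""
    if isinstance(noun, int):
        body = format(noun, "b")  # binary digits, "0" for zero
        # Tag 0 = atom, then the body length in unary, a 0 terminator, the body.
        return "0" + "1" * len(body) + "0" + body
    head, tail = noun
    # Tag 1 = cell, followed by the head and then the tail.
    return "1" + _encode(head) + _encode(tail)


def _decode(bits, i):
    """Decode one noun starting at index i; return (noun, next index)."""
    if bits[i] == "0":
        i += 1
        length = 0
        while bits[i] == "1":
            length += 1
            i += 1
        i += 1  # skip the 0 that terminates the unary length
        return int(bits[i:i + length], 2), i + length
    head, i = _decode(bits, i + 1)
    tail, i = _decode(bits, i)
    return (head, tail), i


def toy_jam(noun):
    """Convert any noun to one atom (toy format, NOT real jam)."""
    return int("1" + _encode(noun), 2)  # leading 1 protects leading zeros


def toy_cue(atom):
    """Invert toy_jam."""
    bits = format(atom, "b")[1:]  # drop the guard bit
    noun, end = _decode(bits, 0)
    if end != len(bits):
        raise ValueError("trailing bits after noun")
    return noun


noun = (42, (0x2A, 420))
assert toy_cue(toy_jam(noun)) == noun
```

This toy is not hardened against malformed input. The point of the real design is that only the two generic functions ever touch raw bytes, and the receiver then applies its own mold to the result.
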
Hoon's inference algorithm is dumber than the unification
algorithms (Hindley-Milner) used in most typed functional
languages. Hoon thinks only forward, not backward. E.g., Haskell
can infer the result type of a function from its argument
(forward), or the argument type from the result (backward).
Hoon can do the first but not the second.

So Hoon needs more manual annotations, which you usually want
anyway for prosaic software-engineering reasons. Otherwise its
type system does more or less the same job, including
pattern-matching, genericity / typeclasses, etc.

> A good test of a static higher-order type system is whether it
can infer the product type of a grammar defined as a combinator
parser. The Hoon parser passes this test; when it typechecks,
the parser's range nests within the span of the expression mold.

## Let's make some nouns

Let's make a noun:

```
~tasfyn-partyv:dojo> 42
```

You'll see the expression you entered, then the result:

```
> 42
42
```

Let's try a different noun. Or is it different?

```
~tasfyn-partyv:dojo> 0x2a
```

You'll see:

```
> 0x2a
0x2a
```

`42` and `0x2a` are actually *the same noun*, because they're the
same number. But we don't just have the noun to print -- we have
a `[span noun]` cell (sometimes called a `vase`).

As you recall, a span defines a set of nouns and a semantic
interpretation. As sets, both spans here are "any number". But
semantically, `42` has a decimal span and `0x2a` hexadecimal, so
they print differently.

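
The idea that the same atom prints differently depending on its span can be modeled in Python. This is a rough sketch with assumptions: the vase is reduced to a hypothetical `(aura, atom)` pair where the aura is just the string `"ud"` or `"ux"`, and `render` is an invented name, not a dojo or Hoon function.

```python
def render(vase):
    """Print an atom according to the span half of a (span, noun) pair."""
    aura, atom = vase
    if aura == "ud":
        # Decimal, with dots between groups of three digits.
        return f"{atom:,}".replace(",", ".")
    if aura == "ux":
        # Hexadecimal, with dots between groups of four digits.
        return "0x" + format(atom, "_x").replace("_", ".")
    raise ValueError(f"unknown aura: {aura}")


decimal_vase = ("ud", 42)
hex_vase = ("ux", 0x2A)

# Same noun, different span, different printed form.
assert decimal_vase[1] == hex_vase[1]
print(render(decimal_vase))
print(render(hex_vase))
```

Without the span half of the pair, there would be no way to choose between the two printed forms.
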
> It's important to remember that Hoon is a statically typed language.
We don't work with vases unless we're dynamically compiling code,
which is of course what we're doing here in the dojo. In Hoon,
dynamic type equals static type plus runtime compilation.

Let's make some cells. Try these on your own urbit:

```
~tasfyn-partyv:dojo> [42 0x2a]
~tasfyn-partyv:dojo> [42 [0x2a 420]]
~tasfyn-partyv:dojo> [42 0x2a 420]
```

We observe that cells associate right: `[a b c]` is just another
way of writing `[a [b c]]`.

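
Right association is easy to state as code. The following Python sketch models cells as 2-tuples; the helper name `cell` is hypothetical and only illustrates how the bracket syntax nests.

```python
def cell(*items):
    """Build right-nested pairs, so cell(a, b, c) == (a, (b, c))."""
    if len(items) < 2:
        raise ValueError("a cell needs at least two nouns")
    *init, result = items
    # Fold from the right: the last item is the innermost tail.
    for item in reversed(init):
        result = (item, result)
    return result


# [42 0x2a 420] and [42 [0x2a 420]] are the same noun.
assert cell(42, 0x2A, 420) == cell(42, cell(0x2A, 420))
print(cell(42, 0x2A, 420))
```
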
Lisp masters beware: Hoon `[a b]` is Lisp `(a . b)`, and Lisp
`(a b)` is Hoon `[a b ~]`. `~` means nil, with value zero. Lisp
and Hoon are both pair-oriented languages down below, but Lisp
has a layer of sugar that makes it look list-oriented. Hoon
loves its "improper lists," i.e., tuples.

## Looking at spans

What are these mysterious spans? We can see them with the `?`
prefix, which prints the span along with the result. Moving to
a more compact example format:

```
~tasfyn-partyv:dojo> ? 42
  @ud
  42
~tasfyn-partyv:dojo> ? 0x2a
  @ux
  0x2a
```

`@ud` and `@ux` stand for "unsigned decimal" and "unsigned hex,"
obviously.

> What is this span syntax? We only derive spans through
inference. So there's no parsing grammar for a span. We have to
be able to print spans, if only for debugging and diagnostics,
but the syntax is output-only. As in this case, it often looks
like the `mold` syntax, but the two are at opposite ends of the
type food chain.

## Looking at spans, part 2

Usually, good style in Hoon is concrete style. When a Hoon
programmer defines an abstract semantic value in terms of a noun,
we rarely put a conceptual layer of abstraction between value and
noun. We think of the value as an interpretation of the noun.
We don't think of the noun as an implementation of the value.

But rules are made to be broken. With the `?` command, we *do*
use an abstract layer, by printing our span noun in that custom
syntax. But we can also look at the span noun directly, with the
`??` command. As we'll see, `??` is mainly for newbies, but
newbies love it.

```
~tasfyn-partyv:dojo> ?? 42
  [%atom %ud]
  42
~tasfyn-partyv:dojo> ?? [42 0x2a]
  [%cell [%atom %ud] [%atom %ux]]
  [42 0x2a]
```

What is this `%atom` syntax? Is it a real noun? Can anyone
make one?

```
~tasfyn-partyv:dojo> %atom
%atom
~tasfyn-partyv:dojo> %foo
%foo
~tasfyn-partyv:dojo> [%foo %bar]
[%foo %bar]
```

What's the span of one of these symbols?

```
~tasfyn-partyv:dojo> ? %foo
  %foo
  %foo
~tasfyn-partyv:dojo> ?? %foo
  [%cube 7.303.014 [%atom %tas]]
  %foo
```

This takes a little bit of explaining. `7.303.014` is just the
Urbit (and German) way of writing the English number `7,303,014`,
or the Urbit hex number `0x6f.6f66`, or the string "foo" as an
unsigned integer with least-significant byte first.

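
The arithmetic is easy to check outside Urbit. This Python sketch converts between a short string and the corresponding unsigned integer, least-significant byte first; the function names `cord_to_atom` and `atom_to_cord` are invented for this example.

```python
def cord_to_atom(text):
    """Read the bytes of a string as an unsigned integer, low byte first."""
    return int.from_bytes(text.encode("utf-8"), "little")


def atom_to_cord(atom):
    """Invert cord_to_atom for atoms that hold valid UTF-8 text."""
    length = (atom.bit_length() + 7) // 8  # number of bytes needed
    return atom.to_bytes(length, "little").decode("utf-8")


foo = cord_to_atom("foo")
assert foo == 7303014       # written 7.303.014 in Urbit
assert hex(foo) == "0x6f6f66"  # written 0x6f.6f66 in Urbit
assert atom_to_cord(foo) == "foo"
```

In hex the bytes read `6f 6f 66`, which is "o", "o", "f": the first character of the string sits in the lowest byte.
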
A `%cube` span is a constant -- a set of one noun, the atom
`7.303.014`. But we still need to know how to print that noun.
In this case, it's an `[%atom %tas]`, i.e., a text symbol.

Cubes don't have to be symbols -- in fact, we can take the
numbers we've just been using, and make them constants:

```
~tasfyn-partyv:dojo> %42
%42
~tasfyn-partyv:dojo> ? %42
  %42
  %42
~tasfyn-partyv:dojo> ?? %42
  [%cube 42 [%atom %ud]]
  %42
```

> Why `??`? Spans are an exception to concrete style, because they
use "manual laziness" to define logically recursive structures.
A recursive span contains Hoon code which is evaluated to apply
it. In practice, this noun often contains the entire Urbit
kernel, so you wouldn't want to try to print it in the dojo. If
you find `??` taking a weirdly long time, this may have happened;
press ^C.

## Our first mold

After seeing a few span examples, are we ready to describe the
set of all spans with a Hoon mold? Well, no, but let's try it
anyway. Ignore the syntax (which we'll explain later; this is a
tutorial, not a reference manual), and you'll get the idea:

```
++  span
  $%  [%atom p=@tas]
      [%cell p=span q=span]
      [%cube p=* q=span]
  ==
```

This mold is not the entire definition of `span`, just the cases
we've seen so far. In English, a valid span is either:

- a cell with head `%atom`, and tail some symbol.
- a cell with head `%cell`, and tail some pair of spans.
- a cell with head `%cube`, and tail a noun-span pair.

The head of a span is essentially the tag in a variant record,
a pattern every programming language has. To use the span, we
look at the head and then decide what to do with the tail.

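
The "look at the head, then decide" pattern can be sketched in Python as a validity check over the three cases listed above. This is only a model: tags are Python strings instead of atoms, spans are flat tuples rather than nested cells, and `is_span` is a hypothetical name. It is a checker, not a normalizing mold.

```python
def is_span(x):
    """Check x against the three span cases seen so far (Python model)."""
    if not isinstance(x, tuple) or len(x) < 2:
        return False
    tag = x[0]  # the head is the variant tag
    if tag == "atom":
        # [%atom p=@tas]: the tail is some symbol.
        return len(x) == 2 and isinstance(x[1], str)
    if tag == "cell":
        # [%cell p=span q=span]: the tail is a pair of spans.
        return len(x) == 3 and is_span(x[1]) and is_span(x[2])
    if tag == "cube":
        # [%cube p=* q=span]: the tail is any noun plus a span.
        return len(x) == 3 and is_span(x[2])
    return False


# The span nouns printed earlier by `??`, transcribed into this model.
print(is_span(("atom", "ud")))
print(is_span(("cell", ("atom", "ud"), ("atom", "ux"))))
print(is_span(("cube", 7303014, ("atom", "tas"))))
```

Each branch corresponds to one line of the mold, which is why the tagged-union form is convenient to consume.
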
> A conventional naming strategy for simple, self-explaining
structures is to name the legs of a tuple `p`, `q`, `r`, `s` and
`t`. If you get all the way to `t`, your noun is probably not
simple or self-explaining; meaningful names are recommended.

## Progress

Believe it or not, at this point we understand nouns completely.
We don't understand spans and molds completely, but we get the
basics. In the [next chapter](1-twigs), we'll see how Hoon
expressions (twigs) turn one noun into another.