tutorial drafts

Galen Wolfe-Pauly 2015-09-29 14:36:58 -07:00
parent 0cc523538a
commit e97bec0bb5
6 changed files with 1434 additions and 0 deletions

pub/doc/hoon/tutorial.md Normal file

@ -0,0 +1,4 @@
Tutorials
=========
<list dataPreview="true" titlesOnly="true"></list>


@ -0,0 +1,232 @@
# Hoon 0: introduction
Hoon is a strict, higher-order typed pure-functional language.
Why Hoon? On the one hand, typed functional languages are known
for a particularly pleasant phenomenon: once your code compiles,
it's quite likely to work. On the other hand, most typed
functional languages are influenced by advanced mathematics.
As Barbie once put it, math class is hard.
Hoon is a typed FP language for the common street programmer.
Well-written Hoon is as concrete and data-oriented as possible.
The less functional magic you use, the better. One Haskell
hacker described Hoon as "imperative programming in a functional
language." He didn't mean this as a compliment, but we choose to
take it as one.
Moreover, one task of a type system in network computing is
marshalling typed data on the sender, and validating untrusted
data on the receiver. Hoon is very good at this task, which in
most typed languages is an afterthought at best.
The main disadvantage of Hoon is that its syntax and semantics
are unfamiliar. The syntax will remind too many of Perl, but
like most human languages (and unlike Perl) it combines a regular
core structure with irregular variations. Its semantic
complexity is bounded by the fact that the compiler is only 2000
lines of Hoon (admittedly an expressive language). Most people's
experience is that Hoon is much easier to learn than it looks.
## Nouns: data made boring
A noun is an atom or a cell. An atom is any unsigned integer. A
cell is an ordered pair of nouns.
The noun is an intentionally boring data model. Nouns (at least,
nouns in Urbit) don't have cycles (although a noun implementation
should take advantage of acyclic graph structure). Noun
comparison is always by value (there is no way for the programmer
to test pointer equality). Nouns are strict; there is no such
thing as an infinite noun. And, of course, nouns are immutable.
So there's basically no way to have any real fun with nouns.
For language historians, nouns are Lisp's S-expressions, minus a
lot of hacks, tricks, and features that made sense 50 years ago.
In particular, because atoms are not tagged (an atom can encode a
string, for instance), nouns work best with a static type system.
How do you print an atom if you don't know whether it's a string
or a number? You can guess, but...
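The noun definition is small enough to check mechanically. Here is a minimal sketch in Python (used only as neutral notation, not Hoon), assuming we model an atom as a non-negative `int` and a cell as a 2-tuple. The names `is_atom` and `is_noun` are invented for this illustration.

```python
def is_atom(x):
    """An atom is any unsigned (non-negative) integer."""
    # bool is a subclass of int in Python, so exclude it explicitly.
    return isinstance(x, int) and not isinstance(x, bool) and x >= 0


def is_noun(x):
    """A noun is an atom or a cell; a cell is an ordered pair of nouns."""
    if is_atom(x):
        return True
    return (
        isinstance(x, tuple)
        and len(x) == 2
        and is_noun(x[0])
        and is_noun(x[1])
    )


# Comparison is by value: two separately built cells are equal.
left = (42, (7, 8))
right = (42, (7, 8))
same = left == right
```

Python tuples happen to share the properties listed above: they are immutable, compared by value, and can't be made cyclic using tuples and integers alone.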
## A type system for nouns
So learning nouns in practice involves learning them with a type
system that makes them usable. Fortunately, we have that.
One obstacle to learning Hoon is that it has two quite distinct
concepts that might equally be called a "type." Worse, most
other typed functional languages are mathy and share a basically
mathematical concept of "type." We can't avoid using the T-word
occasionally, but it has no precise meaning in Hoon and can be
extremely confusing.
Hoon's two kinds of "type" are `span` and `mold`. A span is both
a constructively defined set of nouns, and a semantic convention
for users in that set. A `mold` is a function whose range is
some useful span. A mold is always idempotent (for any noun x,
`f(x)` equals `f(f(x))`), and its domain is any noun.
(One way to explain this is that while a span is what most
languages call a "type," Hoon has no way for the programmer to
express a span directly. Instead, we use inference to define it
as the range of a function. This same function, the mold, can
also be used to validate or normalize untrusted, untyped data --
a common problem in modern programming.)
(Hoon's inference algorithm is somewhat dumber than the
unification algorithms (Hindley-Milner) used in most typed
functional languages. Hoon reasons only forward, not backward.
It needs more manual annotations, which you usually want anyway.
Otherwise, it gets more or less the same job done.)
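To make the idempotence property concrete, here is a rough Python sketch of two mold-like functions. It assumes nouns are modeled as non-negative ints and 2-tuples. The names `atom_mold` and `pair_mold`, and the choice of `0` as the default for bad input, are assumptions of the sketch rather than anything Hoon specifies here.

```python
def atom_mold(noun):
    """Hypothetical mold whose range is 'any atom'.

    Accepts any noun; a cell is replaced by the default atom 0.
    """
    return noun if isinstance(noun, int) else 0


def pair_mold(noun):
    """Hypothetical mold whose range is 'a cell of two atoms'."""
    if isinstance(noun, int):
        # An atom is outside the range, so produce a default value.
        return (0, 0)
    head, tail = noun
    return (atom_mold(head), atom_mold(tail))


samples = [7, (4, 5), ((1, 2), 3), (1, (2, 3))]
normalized = [pair_mold(x) for x in samples]
```

Running `pair_mold` over its own output changes nothing. That is the property that makes a function like this usable as a validator for untrusted data.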
## Let's make some nouns
This stuff isn't even slightly hard. Let's make a noun:
```
~tasfyn-partyv:dojo> 42
```
You'll see the expression you entered, then the resulting value:
```
> 42
42
```
Let's try a different value:
```
~tasfyn-partyv:dojo> 0x2a
```
You'll see:
```
> 0x2a
0x2a
```
`42` and `0x2a` are actually *the same noun*, because they're the
same number. But we don't just have the noun to print - we have
a `[span noun]` cell (sometimes called a `vase`).
As you recall, a span defines a set of nouns and a semantic
interpretation. As sets, both spans here are "any number". But
semantically, `42` has a decimal span and `0x2a` hexadecimal, so
they print differently.
(It's important to note that Hoon is a statically typed language.
We don't work with vases unless we're dynamically compiling code,
which is of course what we're doing here in the shell. Dynamic
type is static type compiled at runtime.)
Finally, let's make some cells. Try these on your own ship:
```
~tasfyn-partyv:dojo> [42 0x2a]
~tasfyn-partyv:dojo> [42 [0x2a 420]]
~tasfyn-partyv:dojo> [42 0x2a 420]
```
We observe that cells associate right: `[a b c]` is just another
way of writing `[a [b c]]`.
Also, Lisp veterans beware: Hoon `[a b]` is Lisp `(a . b)`, and Lisp
`(a b)` is Hoon `[a b ~]` (`~` represents nil, with a value of atom
`0`). Lisp and Hoon are both pair-oriented languages down below, but
Lisp has a layer of sugar that makes it look list-oriented. Hoon
loves its "improper lists," ie, tuples.
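The right-association rule can be sketched in a few lines of Python, again modeling cells as 2-tuples. The helper names `cell` and `lisp_list` are made up for this illustration.

```python
def cell(*items):
    """Build a right-nested cell: cell(a, b, c) is (a, (b, c))."""
    if len(items) < 2:
        raise ValueError("a cell needs at least two nouns")
    if len(items) == 2:
        return (items[0], items[1])
    return (items[0], cell(*items[1:]))


def lisp_list(*items):
    """A Lisp-style list: a tuple terminated by nil, the atom 0."""
    return cell(*items, 0)


tuple_style = cell(42, 0x2A, 420)   # like [42 0x2a 420]
list_style = lisp_list(42, 0x2A)    # like [42 0x2a ~]
```

The hex literal is just another spelling of the atom 42 here, since the sketch carries no span information.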
## Looking at spans
What are these mysterious spans? We can see them with the `?`
prefix, which prints the span along with the result. Moving to
a more compact example format:
```
~tasfyn-partyv:dojo> ? 42
@ud
42
~tasfyn-partyv:dojo> ? 0x2a
@ux
0x2a
```
`@ud` and `@ux` stand for "unsigned decimal" and "unsigned hex,"
obviously. But what is this syntax?
We only derive spans through inference. So there's no language
syntax for a span. We have to be able to print spans, though, if
only for debugging and diagnostics. `@ud` is a print-only
syntax. (In this case it happens to be the same as the `mold`
syntax, but that's just a coincidence.)
## Looking at spans, part 2
A good way to teach yourself to think in nouns is to look not at
the prettyprinted span, but at the actual noun it's made of.
Since everything in Hoon is a noun, a span is a noun too. When
we use `??` rather than `?` as a prefix, we see the noun:
```
~tasfyn-partyv:dojo> ?? 42
[%atom %ud]
42
~tasfyn-partyv:dojo> ?? [42 0x2a]
[%cell [%atom %ud] [%atom %ux]]
[42 0x2a]
```
What is this `%atom` notation? Is it a real noun? Can anyone
make one?
```
~tasfyn-partyv:dojo> %atom
%atom
~tasfyn-partyv:dojo> %foo
%foo
~tasfyn-partyv:dojo> [%foo %bar]
[%foo %bar]
```
What if we look at the span?
```
~tasfyn-partyv:dojo> ? %foo
%foo
%foo
~tasfyn-partyv:dojo> ?? %foo
[%cube 7.303.014 %atom %tas]
%foo
```
This takes a little bit of explaining. First of all, `7.303.014`
is just the German (and Urbit) way of writing `7,303,014`, or the
hexadecimal number `0x6f.6f66`, or the string "foo" as an
unsigned integer. (It's much easier to work with large integers
when the digits are grouped.) Second, remembering that cells
nest right, `[%cube 7.303.014 %atom %tas]` is really `[%cube
7.303.014 [%atom %tas]]`.
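The arithmetic is easy to verify. The sketch below (Python, with the invented helper names `cord`, `uncord` and `dotted`) assumes the text is ASCII and that the first character lands in the lowest byte of the atom, which is what the `0x6f.6f66` value implies.

```python
def cord(text):
    """Encode ASCII text as an atom, first character in the low byte."""
    return int.from_bytes(text.encode("ascii"), "little")


def uncord(atom):
    """Decode an atom back into ASCII text."""
    length = (atom.bit_length() + 7) // 8
    return atom.to_bytes(length, "little").decode("ascii")


def dotted(atom):
    """Group decimal digits in threes with dots, as in 7.303.014."""
    return f"{atom:,}".replace(",", ".")


foo = cord("foo")
foo_hex = hex(foo)
foo_dotted = dotted(foo)
```

Here `foo` comes out as the integer 7303014, `foo_hex` as `0x6f6f66`, and `foo_dotted` as `7.303.014`.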
A `%cube` span is a constant -- a set of one noun, the atom
`7.303.014`. But we still need to know how to print that noun.
In this case, it's an `[%atom %tas]`, ie, a text symbol.
Cubes don't have to be symbols -- in fact, we can take the
numbers we've just been using, and make them constants:
```
~tasfyn-partyv:dojo> %42
%42
~tasfyn-partyv:dojo> ? %42
%42
%42
~tasfyn-partyv:dojo> ?? %42
[%cube 42 %atom %ud]
%42
```
## Our first mold
After seeing a few span examples, are we ready to describe the
set of all spans with a Hoon mold? Well, no, but let's try it
anyway. Ignore the syntax (which we'll explain later; this is a
tutorial, not a reference manual), and you'll get the idea:
```
++ span
$% [%atom @tas]
[%cell span span]
[%cube * span]
==
```
This mold is not the entire definition of `span`, just the cases
we've seen so far. In English, a valid span is either:
- a cell with head `%atom`, and tail some symbol.
- a cell with head `%cell`, and tail some pair of spans.
- a cell with head `%cube`, and tail a noun-span pair.
The head of a span is essentially the tag in a variant record,
a pattern every programming language has. To use the noun, we
look at the head and then decide what to do with the tail.
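As a sketch of that look-at-the-head pattern, here is a small Python function that renders the three span cases seen so far, roughly in the style of the `?` output. Spans are written as flat Python tuples with string tags, which is a convenience of the sketch and not how the nouns are really laid out. `render_span` is a hypothetical name.

```python
def render_span(span):
    """Render a span by dispatching on its head, variant-record style."""
    tag = span[0]
    if tag == "atom":
        # Tail is a symbol such as "ud" or "ux".
        return "@" + span[1]
    if tag == "cell":
        # Tail is a pair of spans.
        return "[" + render_span(span[1]) + " " + render_span(span[2]) + "]"
    if tag == "cube":
        # Tail is a constant noun plus the span that says how to print it.
        value, inner = span[1], span[2]
        if inner == ("atom", "tas"):
            size = (value.bit_length() + 7) // 8
            return "%" + value.to_bytes(size, "little").decode("ascii")
        return "%" + str(value)
    raise ValueError(f"unknown span tag: {tag!r}")


pair = render_span(("cell", ("atom", "ud"), ("atom", "ux")))
symbol = render_span(("cube", 7303014, ("atom", "tas")))
```

Each branch only makes sense of the tail because it has already inspected the head.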


@ -0,0 +1,229 @@
# Hoon 1: twigs and legs
In the last chapter we learned how to make nouns. In this
chapter we'll start programming a little.
## Nock for Hoon programmers
Hoon compiles itself to a pico-interpreter called Nock. This
isn't the place to explain Nock (which is to Hoon much as
assembly language is to C), but Nock is just a way to express a
function as a noun.
Specifically, you can think of Nock as a (Turing-complete)
interpreter shaped like (pseudocode):
```
Nock(subject formula) => product
```
Your function is the noun `formula`. The input to the function
is the noun `subject`. The output is `product`. If something
about this seems complicated or even interesting, you may be
misunderstanding it.
## From Hoon to Nock
The Hoon parser turns a source expression (even one as simple as
`42` from the last chapter) into a noun called a `twig`. If you
know what an AST is, a twig is an AST. (If you don't know what
an AST is, it's not worth the student loans.)
To simplify slightly, the Hoon compiler is shaped like:
```
Hoon(subject-span function-twig) => [product-span formula-nock]
```
Hoon, like Nock, is a *subject-oriented* language - your twig is
always executed against one input noun, the subject. For any
subject noun in `subject-span`, the compiler produces a Nock
formula that computes `function-twig` on that subject, and a
`product-span` that is the span of the product.
(Pretty much no other language works this way. In a normal
language, your code is executed against a scope, stack, or other
variable context, which may not even be a regular user-level
value. This change is one of the hardest things to understand
about Hoon, mostly because it's hard to stay convinced that
subject-oriented programming is as straightforward as it is.)
## From constants to twigs
In the last chapter we were entering degenerate twigs like `42`.
Obviously this doesn't use the subject at all.
Let's use the dojo variable facility (this is *not* Hoon syntax,
just a dojo command) to make a test subject:
```
~tasfyn-partyv:dojo> =test [[[8 9] 5] [6 7]]
```
We can evaluate twigs against this subject with the Hoon `:`
syntax (`a:b` uses the product of `b` as the subject of `a`).
```
~tasfyn-partyv:dojo> 42:test
42
```
## Tree addressing
The simplest twigs produce a subtree, or "leg", of the subject.
A cell, of course, is a binary tree. The very simplest twig is
`.`, which produces the root of the tree - the whole subject:
```
~tasfyn-partyv:dojo> .:test
[[[8 9] 5] 6 7]
```
(If you're wondering why `[6 7]` got printed as `6 7`, remember
that `[]` associates to the right.)
Hoon has a simple tree addressing scheme (inherited from Nock):
the root is `1`, the head of `n` is `2n`, the tail is `2n+1`.
The twig syntax is `+n`. Hence:
```
~tasfyn-partyv:dojo> +1:test
[[[8 9] 5] 6 7]
```
Our example is a sort of Hoon joke, not very funny:
```
~tasfyn-partyv:dojo> +2:test
[[8 9] 5]
~tasfyn-partyv:dojo> +3:test
[6 7]
~tasfyn-partyv:dojo> +4:test
[8 9]
~tasfyn-partyv:dojo> +5:test
5
~tasfyn-partyv:dojo> +6:test
6
~tasfyn-partyv:dojo> +7:test
7
```
And so on. An instinct for binary tree geometry develops over
time as you use the system, rather the way most programmers
learn to do binary math.
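If the binary geometry hasn't sunk in yet, a short Python sketch may help. It models cells as 2-tuples and walks the bits of the address; the function name `leg` is borrowed from the text, but the implementation is only an illustration.

```python
def leg(noun, axis):
    """Return the subtree of `noun` at tree address `axis`."""
    if axis < 1:
        raise ValueError("tree addresses start at 1")
    # After the leading 1, each bit of the address is one step down
    # from the root: 0 means head, 1 means tail.
    for bit in bin(axis)[3:]:
        if not isinstance(noun, tuple):
            raise ValueError("address runs past an atom")
        noun = noun[int(bit)]
    return noun


test = (((8, 9), 5), (6, 7))
atoms = [leg(test, n) for n in (5, 6, 7, 8, 9)]
```

In this particular noun every atom sits at the address equal to its own value, so `atoms` is `[5, 6, 7, 8, 9]`.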
## Femur syntax
A "femur" is an alternative syntax for a tree address. The femur
syntax creates a recognizable geometric shape by alternating
between two head/tail pairs, read left to right: `-` and `+`,
`<` and `>`.
Thus `-` is `+2`, `+` is `+3`, `+<` is `+6`, `->` is `+5`, `-<+`
is `+9`, etc. The decimal numbers are distracting, whereas the
glyph string binds directly to the tree geometry as you learn it.
We actually almost never use the decimal tree geometry syntax.
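The alternation rule is mechanical enough to write down. This Python sketch (the name `femur_to_axis` is invented here) converts a femur string into the equivalent decimal address.

```python
def femur_to_axis(femur):
    """Convert a femur such as '-<+' into a decimal tree address."""
    axis = 1
    for position, glyph in enumerate(femur):
        # Even positions use the -/+ pair, odd positions use </>.
        head, tail = ("-", "+") if position % 2 == 0 else ("<", ">")
        if glyph == head:
            axis = 2 * axis          # step to the head
        elif glyph == tail:
            axis = 2 * axis + 1      # step to the tail
        else:
            raise ValueError(
                f"unexpected glyph {glyph!r} at position {position}"
            )
    return axis


examples = {f: femur_to_axis(f) for f in ("-", "+", "+<", "->", "-<+")}
```

The results agree with the addresses given above: 2, 3, 6, 5 and 9 respectively.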
## Simple faces
But it would be pretty tough to program in Hoon if explicit
geometry was the only way of getting data out of a subject.
Let's introduce some new syntax:
```
~tasfyn-partyv:dojo> foo=42
foo=42
~tasfyn-partyv:dojo> ? foo=42
foo=@ud
foo=42
~tasfyn-partyv:dojo> ?? foo=42
[%face %foo %atom %ud]
foo=42
```
To extend our `++span` mold:
```
++ span
$% [%atom @tas]
[%cell span span]
[%cube * span]
[%face @tas span]
==
```
The `%face` span wraps a label around a noun. Then we can
get a leg by name. Let's make a new dojo variable:
```
~tasfyn-partyv:dojo> =test [[[8 9] 5] foo=[6 7]]
```
The syntax is what you might expect:
```
~tasfyn-partyv:dojo> foo:test
[6 7]
```
Does this do what you expect it to do?
```
~tasfyn-partyv:dojo> +3:test
foo=[6 7]
~tasfyn-partyv:dojo> ? +3:test
foo=[@ud @ud]
foo=[6 7]
~tasfyn-partyv:dojo> ?? +3:test
[%face %foo %cell [%atom %ud] %atom %ud]
foo=[6 7]
```
## Interesting faces; wings
Again, you're probably used to name resolution in variable scopes
and flat records, but not in trees. (Partly this is because the
tradition in language design is to eschew semantics that make it
hard to build simple symbol tables, because linear search of a
big tree is a bad idea on '80s hardware.)
Let's look at a few more interesting face cases. First, suppose
we have two cases of `foo`?
```
~tasfyn-partyv:dojo> =test [[foo=[8 9] 5] foo=[6 7]]
~tasfyn-partyv:dojo> foo:test
[8 9]
```
In the tree search, the head wins. We can overcome this with a
`^` prefix, which tells the search to skip its first hit:
```
~tasfyn-partyv:dojo> =test [[foo=[8 9] 5] foo=[6 7]]
~tasfyn-partyv:dojo> ^foo:test
[6 7]
```
`^^foo` will skip two foos, `^^^foo` three, up to `n`.
But what about nested labels?
```
~tasfyn-partyv:dojo> =test [[[8 9] 5] foo=[6 bar=7]]
~tasfyn-partyv:dojo> bar:test
/~tasfyn-partyv/home/~2015.9.16..21.40.21..1aec:<[1 1].[1 9]>
-find-limb.bar
find-none
```
It didn't seem to like that. We'll need a nested search:
```
~tasfyn-partyv:dojo> bar.foo:test
7
```
`bar.foo` here is a `wing`, a search path in a noun. Note that
the wing is written innermost limb first, ie, the opposite of most
languages: `bar.foo` means "bar inside foo."
Each step in a wing is a `limb`. A limb can be a tree address,
like `+3` or `.`, or a label like `foo`. We can combine them in
one wing:
```
~tasfyn-partyv:dojo> bar.foo.+3:test
7
```
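The search rules in this section can be modeled in Python. This is a sketch under stated assumptions: faces are a small wrapper class, the search goes head first and never looks inside a face with a different name, and a skipped hit is not searched any further. The names `Face`, `find`, `leg` and `resolve` are invented, and the `.` limb is left out because it clashes with the separator.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Face:
    """A label wrapped around a noun, as in foo=[6 7]."""
    name: str
    value: object


def find(noun, name, skip=0):
    """Head-first search; returns (value, skips_left), value None if absent."""
    if isinstance(noun, Face):
        if noun.name != name:
            return None, skip       # don't look inside other faces
        if skip == 0:
            return noun.value, 0
        return None, skip - 1       # a hit, but we were told to skip it
    if isinstance(noun, tuple):
        value, skip = find(noun[0], name, skip)
        if value is not None:
            return value, skip
        return find(noun[1], name, skip)
    return None, skip


def leg(noun, axis):
    """Tree addressing; faces are transparent on the way down."""
    for bit in bin(axis)[3:]:
        if isinstance(noun, Face):
            noun = noun.value
        noun = noun[int(bit)]
    return noun


def resolve(wing, subject):
    """Resolve a wing such as 'bar.foo.+3'; limbs apply right to left."""
    for limb in reversed(wing.split(".")):
        if limb.startswith("+"):
            subject = leg(subject, int(limb[1:]))
            continue
        name = limb.lstrip("^")
        subject, _ = find(subject, name, len(limb) - len(name))
        if subject is None:
            raise LookupError(f"find-none: {limb}")
    return subject


dupes = ((Face("foo", (8, 9)), 5), Face("foo", (6, 7)))
nested = (((8, 9), 5), Face("foo", (6, Face("bar", 7))))
```

With these definitions `resolve("^foo", dupes)` skips the head's `foo`, and `resolve("bar", nested)` fails, because `bar` is only reachable through `foo`.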
## Mutation
Well, not really. We can't modify nouns; the concept doesn't
even make sense in Hoon. Rather, we build new nouns which are
(logical -- the pointers are actually shared) copies of old ones,
with changes.
Let's build a "mutated" copy of our test noun:
```
~tasfyn-partyv:dojo> test
[[[8 9] 5] foo=[6 bar=7]]
~tasfyn-partyv:dojo> test(foo 42)
[[[8 9] 5] foo=42]
~tasfyn-partyv:dojo> test(+8 %eight, bar.foo [%hello %world])
[[[%eight 9] 5] foo=[6 bar=[%hello %world]]]
```
As we see, there's no obvious need for the mutant noun to be
shaped anything like the old noun. They're different nouns.
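Here is a rough Python model of this kind of "mutation", assuming cells are 2-tuples and ignoring faces. The function name `edit` is made up. It rebuilds only the cells along the path to the changed leg and reuses everything else.

```python
def edit(noun, axis, new):
    """Return a copy of `noun` with the leg at `axis` replaced by `new`."""
    if axis == 1:
        return new
    head, tail = noun
    # Drop the leading 1 bit: the next bit says which side the path
    # enters, and the remaining bits are the address within that side.
    bits = bin(axis)[3:]
    rest = int("1" + bits[1:], 2)
    if bits[0] == "0":
        return (edit(head, rest, new), tail)   # tail is reused, not copied
    return (head, edit(tail, rest, new))       # head is reused, not copied


old = (((8, 9), 5), (6, 7))
new = edit(old, 8, 88)
shared = new[1] is old[1]
```

After the call `old` is untouched, `new` differs only at address 8, and `shared` is true: the untouched tail is the same object in both nouns.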
At this point, you have a simplified but basically sound idea of
how Hoon builds and manages nouns. Next, it's time to do some
programming.


@ -0,0 +1,392 @@
# Hoon 2: serious syntax
We've done a bunch of fun stuff on the command line. We know our
nouns. It's time to actually write some serious code -- in a
real source file.
## Building a simple generator
In Urbit there's a variety of source file roles, distinguished by
the magic paths they're loaded from: `/gen` for generators,
`/ape` for appliances, `/fab` for renderers, etc.
We'll start with a generator, the simplest kind of Urbit program.
### Create a sandbox desk
A desk is the Urbit equivalent of a `git` branch. We're just
playing around here and don't intend to soil our `%home` desk with
test files, so let's make a sandbox:
```
|merge %sandbox our %home
```
### Mount the sandbox
Your Urbit pier is in `~/tasfyn-partyv`, or at least mine is.
To get our code into Urbit, run the command
```
~tasfyn-partyv:dojo> |mount /=sandbox=/gen %gen
```
This mounts the `/gen` folder from the `%sandbox` desk in your Unix
directory `~/tasfyn-partyv/gen`. The mount is a two-way sync,
like your Dropbox. When you edit a Unix file and save, your edit
is automatically committed as a change to `%sandbox`.
### Execute from the sandbox
The `%sandbox` desk obviously is merged from `%home`, so it
contains all the default facilities you'd expect there.
Bear in mind, we didn't set it to auto-update when `%home`
is updated (that would be `|sync` instead of `|merge`).
So we're not roughing it when we set the dojo to load from
`%sandbox`:
```
[switch to %sandbox]
```
### Write your builder
Let's build the simplest possible kind of generator, a builder.
With your favorite Unix text editor (there are Hoon modes for vim
and emacs), create the file `~/tasfyn-partyv/gen/test.hoon`.
Edit it into this:
```
:- %say |= * :- %noun
[%hello %world]
```
Get the spaces exactly right, please. Hoon is not in general a
whitespace-sensitive language, but the difference between one
space and two-or-more matters. And for the moment, think of
```
:- %say |= * :- %noun
```
as gibberish boilerplate at the start of a file, like `#include
"stdio.h"` at the start of a C program. Any of our old Hoon
constants would work in place of `[%hello %world]`.
Now, run your builder:
```
~tasfyn-partyv:dojo/sandbox> +test
[%hello %world]
```
Obviously this is your first Hoon *program* per se.
## Hoon syntax 101
But what's up with this syntax?
### A syntactic apology
The relationship between ASCII and human programming languages
is like the relationship between the electric guitar and
rock-and-roll. If it doesn't have a guitar, it's not rock.
Some great rockers play three chords, like Johnny Ramone; some
shred it up, like Jimmy Page.
The two major families of ASCII-shredding languages are Perl and
the even more spectacular APL. (Using non-ASCII characters is
just a fail, but APL successors like J fixed this.) No one
has any right to rag on Larry Wall or Ken Iverson, but Hoon,
though it shreds, shreds very differently.
The philosophical case for a "metalhead" language is threefold.
One, human beings are much better at associating meaning with
symbols than they think they are. Two, a programming language is
a professional tool and not a plastic shovel for three-year-olds.
And three, the alternative to heavy metal is keywords. When you
use a keyword language, not only are you forcing the programmer
to tiptoe around a ridiculous maze of restricted words used and
reserved, you're expressing your program through two translation
steps: symbol->English and English->computation. When you shred,
you are going direct: symbol->computation. Especially in a pure
language, this creates a sense of "seeing the function" which no
keyword language can quite duplicate.
But any metalhead language you don't yet know is line noise.
Let's get you up to speed as fast as possible.
### A glyphic bestiary
A programming language needs to be not just read but said. But
no one wants to say "ampersand." Therefore, we've taken the
liberty of assigning three-letter names to all ASCII glyphs.
Some of these bindings are obvious and some aren't. You'll be
genuinely surprised at how easy they are to remember:
```
ace [1 space] dot . pan ]
bar | fas / pel )
bis \ gap [>1 space, nl] pid }
buc $ hax # ran >
cab _ ket ^ rep '
cen % lep ( sac ;
col : lit < tar *
com , lus + tec `
das - mat @ tis =
den " med & wut ?
dip { nap [ zap !
```
It's fun to confuse people by using these outside Urbit. A few
digraphs also have irregular sounds:
```
== stet
-- shed
++ slus
-> dart
-< dusk
+> lark
+< lush
```
### The shape of a twig
A twig, of course, is a noun. As usual, the easiest way to
explain both the syntax that compiles into that noun, and the
semantic meaning of the noun, is the noun's physical structure.
#### Autocons
A twig is always a cell, and any cell of twigs is a twig
producing a cell. As an homage to Lisp, we call this
"autocons." Where you'd write `(cons a b)` in Lisp, you write
`[a b]` in Hoon, and the shape of the twig follows.
The `???` prefix prints a twig as a noun instead of running it.
Let's see autocons in action:
```
~tasfyn-partyv:dojo/sandbox> ??? 42
[%dtzy %ud 42]
~tasfyn-partyv:dojo/sandbox> ??? 0x2a
[%dtzy %ux 42]
~tasfyn-partyv:dojo/sandbox> ??? [42 0x2a]
[[%dtzy %ud 42] %dtzy %ux 42]
```
(As always, it may confuse *you* that this is the same noun as
`[[%dtzy %ud 42] [%dtzy %ux 42]]`, but it doesn't confuse Hoon.)
#### The stem-bulb pattern
If the head of your twig is a cell, it's an autocons. If the
head is an atom, it's an unpronounceable four-letter symbol like
the `%dtzy` above.
This is the same pattern as we see in the `span` mold -- a
variant record, essentially, in nouns. The head of one of these
cells is called the "stem." The tail is the "bulb." The shape
of the bulb is totally dependent on the value of the stem.
#### Runes and stems
A "rune" (a word intentionally chosen to annoy Go programmers) is
a digraph - a sequence of two ASCII glyphs. If you know C, you
know digraphs like `->` and `?:` and are used to reading them as
single characters.
In Hoon you can *say* them as words: "dasran" and "wutcol"
respectively. In a metalhead language, if we had to say
"minus greater-than" and "question-colon", we'd just die.
Most twig stems are made from runes, by concatenating the glyph
names and removing the vowels. For example, the rune `=+`,
pronounced "tislus," becomes the stem `%tsls`. (Note that in
many noun implementations, this is a 31-bit direct value.)
(Some stems (like `%dtzy`) are not runes, simply because they
don't have regular-form syntax and don't need to use precious
ASCII real estate. They are otherwise no different.)
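The naming rule for stems (concatenate the glyph names, drop the vowels) can be sketched in Python. The table below copies the three-letter names from the bestiary earlier in this chapter; the function names `pronounce` and `stem` are invented for the illustration.

```python
# Glyph names as listed in the bestiary (ace and gap are whitespace
# and never appear inside a rune, so they are omitted).
GLYPH_NAMES = {
    "|": "bar", "\\": "bis", "$": "buc", "_": "cab", "%": "cen",
    ":": "col", ",": "com", "-": "das", '"': "den", "{": "dip",
    ".": "dot", "/": "fas", "#": "hax", "^": "ket", "(": "lep",
    "<": "lit", "+": "lus", "@": "mat", "&": "med", "[": "nap",
    "]": "pan", ")": "pel", "}": "pid", ">": "ran", "'": "rep",
    ";": "sac", "*": "tar", "`": "tec", "=": "tis", "?": "wut",
    "!": "zap",
}


def pronounce(rune):
    """Say a rune by concatenating its glyph names."""
    return "".join(GLYPH_NAMES[glyph] for glyph in rune)


def stem(rune):
    """Derive the stem symbol: the pronunciation minus its vowels."""
    return "".join(ch for ch in pronounce(rune) if ch not in "aeiou")


tislus = pronounce("=+")
tsls = stem("=+")
```

Since every glyph name is consonant-vowel-consonant, every stem built this way is exactly four letters long.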
An important point to note about runes: they're organized. The
first glyph in the rune defines a category. For instance, runes
starting with `.` compute intrinsics; runes starting with `|`
produce cores; etc.
Another important point about runes: they come in two flavors,
"natural" (stems interpreted directly by the compiler) and
"synthetic" (macros, essentially).
(Language food fight warning: one advantage of Hoon over Lisp is
that all Hoon macros are inherently hygienic. Another advantage
is that Hoon has no (user-level) macros. In Hoon terms, nobody
gets to invent their own runes. A DSL is always and everywhere
a write-only language. Hoon shreds its ASCII pretty hard, but
the same squiggles mean the same things in everyone's code.)
#### Wide and tall regular forms
A good rune example is the simple rune `=+`, pronounced "tislus",
which becomes the stem `%tsls`. A `%tsls` twig has the shape
`[%tsls twig twig]`.
The very elegance of functional languages creates a visual
problem that imperative languages lack. An imperative language
has distinct statements (with side effects) and (usually pure)
expressions; it's natural that in most well-formatted code,
statements flow vertically down the screen, and expressions grow
horizontally across it. This interplay creates a natural and
relaxing shape on your screen.
In a functional language, there's no difference. The trivial
functional syntax is Lisp's, which has two major problems. One:
piles of expression terminators build up at the bottom of complex
functions. Two: the natural shape of code is diagonal. The more
complex a function, the more it wants to besiege the right
margin. The children of a node have to start to the right of its
parent, so the right margin bounds the tree depth.
Hoon does not completely solve these problems, but alleviates
them. In Hoon, there are actually two regular syntax forms for
most twig cases: "tall" and "wide" form. Tall twigs can contain
wide twigs, but not vice versa, so the visual shape of a program
is very like that of a statements-and-expressions language.
Also, in tall mode, most runes don't need terminators. Take
`=+`, for example. Since the parser knows to expect exactly
two twigs after the `=+` rune, it doesn't need any extra syntax
to tell it that it's done.
Let's try a wide `=+` in the dojo:
```
~tasfyn-partyv:dojo/sandbox> =+(planet=%world [%hello planet])
[%hello %world]
```
(`=+` seems to be some sort of variable declaration? Let's not
worry about it right now. We're on syntax.)
The wide syntax for a `=+` twig, or any binary rune: the rune, `(`,
the first subtwig, one space, the second subtwig, and `)`. To read
this twig out loud, you'd say:
```
tislus lep planet is cen world ace nap cen hello ace planet pan
pel
```
("tis" not in a rune gets contracted to "is".)
Let's try a tall `=+` in `test.hoon`:
```
:- %say |= * :- %noun
=+ planet=%world
[%hello planet]
```
The tall syntax for a `=+` twig, or any binary rune: the rune, at
least two spaces or one newline, the first subtwig, at least two
spaces or one newline, the second subtwig. Again, tall subtwigs
can be tall or wide; wide subtwigs have to be wide.
(Note that our boilerplate line is a bunch of tall runes on one
line, with two-space gaps. This is unusual but quite legal, and
not to be confused with the actual wide form.)
To read this twig out loud, you'd say:
```
tislus gap planet is cen world gap nap cen hello ace planet pan
```
#### Layout conventions
Should you use wide twigs or tall twigs? When? How? What
should your code look like? You're the artist. Except for the
difference between one space (`ace`) and more space (`gap`), the
parser doesn't care how you format your code. Hoon is not Go --
there are no fixed rules for doing it right.
However, the universal convention is to keep lines under 80
characters. Also, hard tab characters are illegal. And when in
doubt, make your code look like the kernel code.
##### Backstep indentation
Note that the "variable declaration" concept of `=+` (which is no
more a variable declaration than a Tasmanian tiger is a tiger)
works perfectly here, because `[%hello planet]` -- despite being
a subtree of the `=+` twig -- is at the same indent level.
So our code flows down the screen, not down and to the right, and
of course there are no superfluous terminators. It looks good,
and creates fewer hard-to-find syntax errors than you'd think.
This is called "backstep" indentation. Another example, using a
ternary rune that has a strange resemblance to C:
```
:- %say |= * :- %noun
=+ planet=%world
?: =(%world planet)
[%hello planet]
[%goodbye planet]
```
It's not always the case when backstepping that the largest
subtwig is at the bottom and loses no margin, but it often is.
And not all runes have tuple structure; some are n-ary, and use
the `==` terminator (again, pronounced "stet"):
```
:- %say |= * :- %noun
=+ planet=%world
?+ planet
[%unknown planet]
%world [%hello planet]
%ocean [%goodbye planet]
==
```
So we occasionally lose right-margin as we descend a deep twig.
But we can keep this lossage low with good layout design. The
goal is to keep the heavy twigs on the right, and Hoon tries as
hard as possible to help you with this.
For instance, `=+` ("tislus") is a binary rune: `=+(a b)`. In
most cases of `=+` the heavy twig is `b`, but sometimes it's `a`.
So we can use its friend the `=-` rune ("tisdas") to get the same
semantics with the right shape: `=-(b a)`.
#### Irregular forms
There are more regular forms than we've shown above, but not a
lot more. Hoon would be quite easy to learn if it was only its
regular forms. It wouldn't be as easy to read or use, though.
The learning curve is important, but not all-important.
Some stems (like the `%dtzy` constants above) obviously don't and
can't have any kind of regular form (which is why `%dtzy` is not
a real digraph rune). Many of the true runes have only regular
forms. But some have irregular forms. Irregular forms are
always wide, but there is no other constraint on their syntax.
We've already encountered one of the irregular forms: `foo=42`
from the last chapter, and `planet=%world` here. Let's unpack
this twig:
```
~tasfyn-partyv:dojo/sandbox> ?? %world
[%cube 431.316.168.567 %atom %tas]
%world
~tasfyn-partyv:dojo/sandbox> ??? %world
[%dtzz %tas 431.316.168.567]
```
Clearly, `%dtzz` is one of our non-regulars. But we can wrap it
with our irregular form:
```
~tasfyn-partyv:dojo/sandbox> ?? planet=%world
[%face %planet [%cube 431.316.168.567 %atom %tas]]
planet=%world
~tasfyn-partyv:dojo/sandbox> ??? planet=%world
[%ktts %planet %dtzz %tas 431.316.168.567]
```
Since `%ktts` is "kettis", ie, `^=`, this has to be the irregular
form of
```
~tasfyn-partyv:dojo/sandbox> ^=(planet %world)
planet=%world
```
So if we wrote our example without this irregular form, it'd be
```
:- %say |= * :- %noun
=+ ^=(planet %world)
[%hello planet]
```
Or with a gratuitous use of tall form:
```
:- %say |= * :- %noun
=+ ^= planet %world
[%hello planet]
```
Now you know how to read Hoon! For fun, try to pronounce more of
the code on this page. Please don't laugh too hard at yourself.


@ -0,0 +1,316 @@
# Hoon 3: our first program
It's time for us to do some actual programming. In this section,
we'll work through that classic Urbit pons asinorum, decrement.
If you learned Nock before Hoon, you've already done decrement.
If not, all you need to know is that the only arithmetic
intrinsic in Nock is increment -- in Hoon, the unary `.+` rune.
So an actual decrement function is required.
In chapter 3, we write a decrement builder: more or less the
simplest nontrivial Urbit program. We should be able to run this
example:
```
~tasfyn-partyv:dojo/sandbox> +test 42
41
```
## What's in that subject?
As we've seen, Hoon works by running a twig against a subject.
We've been cheerfully running twigs through three chapters while
avoiding the question: what's in the subject? To avoid the issue
we've built a lot of constants, etc.
Of course your twig's subject comes from whoever runs it. There
is no one true subject. Our twigs on the command line are not
run against the same subject as our generator code, even though
they are both run by the same `:dojo` appliance.
But the short answer is that both command-line and builder get
*basically* the same subject: some ginormous noun containing all
kinds of bells and whistles and slicers and dicers, including a
kernel library which can, needless to say, decrement in its sleep.
As yet you have only faced human-sized nouns. We need not yet
acquaint you with this mighty Yggdrasil, Mother of Trees. First
we need to figure out what she could even be made of.
## Clearing the subject
We'll start by clearing the subject:
```
:- %say |= * :- %noun
=> ~
[%hello %world]
```
The `=>` rune ("tisran"), written `=>(p q)`, executes `p` against
the subject, then uses that product as the subject of `q`.
(We've already used an irregular form of `=>`, or to be more
precise its mirror `=<` ("tislit"). In chapter 1, when we wrote
`+3:test`, we meant `=>(test +3)` or `=<(+3 test)`.)
What is this `~`? It's Hoon `nil`, a zero atom with this span:
```
~tasfyn-partyv:dojo/sandbox> ?? ~
[%cube 0 %atom %n]
~
```
We use it for list terminators and the like. Obviously, since
our old test code is just a constant, a null subject works fine:
```
~tasfyn-partyv:dojo/sandbox> +test
[%hello %world]
```
## Getting an argument
Obviously, if we want to write a decrement builder, we'll have to
get an argument from the command line. This involves changing
the `test.hoon` boilerplate a little:
```
:- %say |= [* [[arg=@ud ~] ~]] :- %noun
=> arg=arg
[%hello arg]
~tasfyn-partyv:dojo/sandbox> +test 42
[%hello 42]
```
`=> arg=arg` looks a little odd. We wouldn't ordinarily do
this. We're just replacing a very interesting subject that
contains `arg` with a very boring one that contains only `arg`,
for the same reason we cleared the subject with `~`.
In case there's any doubt about the subject (`.` is limb syntax
for `+1`, ie, the whole noun):
```
:- %say |= [* [[arg=@ud ~] ~]] :- %noun
=> arg=arg
.
~tasfyn-partyv:dojo/sandbox> +test 42
arg=42
```
We can even write a trivial increment function using `.+`:
```
:- %say |= [* [[arg=@ud ~] ~]] :- %noun
=> arg=arg
+(arg)
~tasfyn-partyv:dojo/sandbox> +test 42
43
```
Below we'll skip both boilerplate lines in our examples.
## A core is a code-data cell
But how do we actually, like, code? The algorithm for decrement
is clear. We need to count up to 41. (How do we run useful
programs on a computer with O(n) decrement? That's an
implementation detail.)
We'll need another kind of noun: the *core*. Briefly, the core
is always a cell `[battery payload]`. The payload is data, the
battery is code -- one or more Nock formulas, to be exact.
Consider a simple core with a one-formula battery. Remember, we
create Nock formulas by compiling a twig against a subject. The
subject is dynamic data, but its span is static. What span do we
give the compiler, and what noun do we give the formula?
A core formula always has the core as its subject. The formula
is essentially a computed attribute on the payload. But if the
subject were just the payload, the formula couldn't recurse: it
would have no way to reach the battery, and so no way to find
itself.
Of course, there is no need to restrict ourselves to one computed
attribute. We can just stick a bunch of formulas together and
call them a battery. The source twigs in this core are called
"arms," which have labels just like the faces we saw earlier.
Hoon overloads computed attributes (arms) and literal attributes
(legs) in the same namespace. A label in a wing may refer to
either. To extend the name-resolution tree search described in
chapter 1, when searching a core, we look for a matching arm.
If we find it we're done. If we don't, or if a `^` mark makes us
skip, we search into the payload.
If a name resolves to a core arm, but it's not the last limb in the
wing, the arm produces the core itself. Similarly, when the
wing is not an access but a mutation, the arm refers to the core.
This demands an example: if `foo` produces some core `c`, and
`bar` is an arm in that `c` (which may be `foo` itself, or some
leg within `foo`), `bar.foo` runs the arm formula with `c` as the
subject. You might think that `moo.bar.foo` would compute
`bar.foo`, then search for `moo` within that result. Instead, it
searches for `moo` within `c`. (You can get the other result
with `moo:bar.foo`.)
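The `foo`/`bar`/`moo` example can be sketched in Python (not Hoon). Everything here is a toy under stated assumptions: the `Core` class, `find`, and `resolve` are hypothetical names, legs are modelled as dictionary entries, and arms are Python functions of the whole core. The sketch only illustrates the rule that an arm which is not the last limb lands on the core instead of being computed:

```python
class Core:
    """Toy core: arms are functions of the whole core, payload is a dict of legs."""
    def __init__(self, arms, payload):
        self.arms = arms
        self.payload = payload

def find(name, subject):
    """Return ('arm', core) or ('leg', value) for `name`; arms win over legs."""
    if isinstance(subject, Core):
        if name in subject.arms:
            return ("arm", subject)
        return find(name, subject.payload)
    if isinstance(subject, dict) and name in subject:
        return ("leg", subject[name])
    raise KeyError(name)

def resolve(wing, subject):
    """Resolve a wing such as 'moo.bar.foo', reading limbs right to left."""
    limbs = wing.split(".")[::-1]
    for i, limb in enumerate(limbs):
        kind, target = find(limb, subject)
        if kind == "leg":
            subject = target
        elif i == len(limbs) - 1:
            subject = target.arms[limb](target)  # last limb: run the arm
        else:
            subject = target                     # not last: land on the core
    return subject

c = Core(arms={"bar": lambda core: {"moo": "moo in bar's product"}},
         payload={"moo": "moo in c's payload"})
subject = {"foo": c}

assert resolve("bar.foo", subject) == {"moo": "moo in bar's product"}
assert resolve("moo.bar.foo", subject) == "moo in c's payload"
# moo:bar.foo computes bar.foo first, then looks up moo in that product.
assert resolve("moo", resolve("bar.foo", subject)) == "moo in bar's product"
```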
Does this sound too tricky? It should - it's about the most
complicated feature of Hoon. It's all downhill once you
understand cores.
Let's again extend our `++span` mold:
```
++ span
$% [%atom @tas]
[%cell span span]
[%core span (map ,@tas twig)]
[%cube * span]
[%face @tas span]
==
```
This definition of `%core` is somewhat simplified from the
reality, but basically conveys it. (Moreover, this version of
`span` describes every kind of noun we build.) In our `%core` we
see a payload span and a name-to-twig arm table, as expected.
Is a core an object? Not quite, because an arm is not a method.
Methods in an OO language have arguments. Arms are functions
only of the payload. (A method in Hoon is an arm that produces a
gate, which is another core -- but we're getting too far ahead.)
However, the battery does look a lot like a classic "vtable."
## Increment with a core
Let's increment with a core:
```
=< inc
|%
++ inc
+(arg)
--
~tasfyn-partyv:dojo/sandbox> +test 42
43
```
What's going on? We used the `|%` rune ("barcen") to produce a
core. (There are a lot of runes which create cores; they all
start with `|`, and are basically macros that turn into `|%`.)
The payload of a core produced with `|%` is the subject with
which `|%` is compiled. We might say that `|%` wraps a core
around its subject. In this case, the subject of the `|%`,
and thus payload, is our `arg=@ud` argument.
Then we used this core as the subject of the simple wing `inc`.
(Remember that `=<(a b)` is just `=>(b a)`.)
We can actually print out a core. Take out the `=< inc`:
```
|%
++ inc
+(arg)
--
~tasfyn-partyv:dojo/sandbox> +test 42
!!!
~tasfyn-partyv:dojo/sandbox> ? +test 42
!!!
```
Cores can be large and complex, and we obviously can't render all
the data in them, either when printing a type or a value. At
some point, you'll probably make the mistake of printing a big
core, maybe even the whole kernel, as an untyped noun. Just
press ^C.
## Adding a counter
To decrement, we need to count up to the argument. So we need a
counter in our subject, because where else would it go? Let's
change the subject to add a counter, `pre`:
```
=> [pre=0 .]
=< inc
|%
++ inc
+(arg)
--
~tasfyn-partyv:dojo/sandbox> +test 42
43
```
Once again, `.` is the whole subject, so we're wrapping it in a
cell whose head is `pre=0`. Through the magic of labels, this
doesn't change the way we use `arg`, even though it's one level
deeper in the subject tree. Let's look at the subject again:
```
=> [pre=0 .]
.
~tasfyn-partyv:dojo/sandbox> +test 42
[pre=0 arg=42]
~tasfyn-partyv:dojo/sandbox> ? +test 42
[pre=@ud arg=@ud]
[pre=0 arg=42]
```
There's actually a simpler way to write this. We've seen it
already. It's not exactly a variable declaration:
```
=+ pre=0
.
~tasfyn-partyv:dojo/sandbox> +test 42
[pre=0 arg=42]
```
## We actually decrement
Now we can write our actual decrement program:
```
=+ pre=0
=< dec
|%
++ dec
?: =(arg +(pre))
pre
dec(pre +(pre))
--
~tasfyn-partyv:dojo/sandbox> +test 42
41
```
`=(a b)` is an irregular form of `.=(a b)`, ie, "dottis" or the
noun `[%dtts a b]`. Likewise, `+(a)` is `.+(a)`, ie, "dotlus"
or `[%dtls a]`.
`?:` is a regular rune which does exactly what you think it does.
Bear in mind, though, that in Hoon, 0 (`&`, "rob") is true and 1
(`|`, "bar") is false.
The real action is in `dec(pre +(pre))`. This is obviously an
irregular form -- it's the same mutation form we saw before.
Writing it out in full regular form:
```
=+ pre=0
=< dec
|%
++ dec
?: =(arg +(pre))
pre
%= dec
pre +(pre)
==
--
~tasfyn-partyv:dojo/sandbox> +test 42
41
```
`%=`, "centis", is the rune which almost every use of a wing
resolves to. It might be called "evaluate with changes."
When we evaluate with changes, we take a wing (here, `dec`) and
evaluate it as described above. Searching in the subject, which
is of course our core, we find an arm called `dec` and run it.
The changes (replacing `pre` with `+(pre)`) are always applied
relative to the core we landed on (or the leg we landed on).
The change wing is relative to this target; the subject of the
replacement (`+(pre)`) is the original subject.
So, in English, we compute the `dec` arm again, against a new
core with a new payload that contains an incremented `pre`.
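The same recursion can be modelled in Python (not Hoon). This is a sketch under assumptions: a core is a `(battery, payload)` pair, the battery is a dict of Python functions standing in for Nock formulas, the payload is a dict of legs, and `centis` is a hypothetical helper named after the rune. It shows the one idea that matters: the arm receives the whole core, so it can rebuild the core with a changed payload and run itself again.

```python
def centis(core, arm, changes):
    """Model of %=: copy the core with changed legs, then run the arm on the copy."""
    battery, payload = core
    return battery[arm]((battery, {**payload, **changes}))

def dec(core):
    """The ++dec arm: its subject is the whole core, so it can reach itself."""
    _, payload = core
    pre, arg = payload["pre"], payload["arg"]
    if arg == pre + 1:
        return pre
    # dec(pre +(pre)): the new value is computed against the original subject.
    return centis(core, "dec", {"pre": pre + 1})

def decrement(arg):
    core = ({"dec": dec}, {"pre": 0, "arg": arg})   # =+ pre=0, then |%
    return centis(core, "dec", {})                  # =< dec

assert decrement(42) == 41
```

As in the Hoon version, an argument of 0 never satisfies the exit test; this Python sketch would eventually hit its recursion limit.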
And thus, we decrement. Doesn't seem so hard, does it?


@ -0,0 +1,261 @@
# Hoon 4: toward actual functions
Okay, we've programmed. We've achieved decrement. We've written
what is in some sense a loop. What next?
Well... we're still feeling vaguely disappointed. Because we're
supposed to be doing *functional programming*. And we haven't
yet written any *functions*.
After all, in Hoon we don't really write a command-line utility
to decrement `42`. We write `(dec 42)`. You probably realize
that on the inside, this is not the same thing as a function in a
normal functional language. The Tasmanian tiger is not a tiger.
On the other hand, it certainly *looks* like a function call.
So how do we write the function?
In this chapter, we'll modify `+test` to extend the subject so
that we can write our result as `(dec arg)`. Or rather, `(duck
arg)`, because we want to get out of training wheels and stop
clearing the subject soon.
## Form of the solution
```
=< :- %say |= [* [[arg=@ud ~] ~]] :- %noun
(duck arg)
!! :: some interesting core
```
`!!`, or "zapzap" or `[%zpzp ~]`, can go anywhere a twig can and
always crashes. Because its span is the empty set (`%void`), it
doesn't cause type inference problems.
In place of the `!!`, we'll put a core, effectively a library,
that provides our new, improved decrement function `duck`. We'll
then call it with the irregular form, `(duck arg)`, which looks
like a function call but is in fact some mysterious macro.
## Some interesting core
Translated into imperative programming, what we did in chapter 3
was more like computing a function of a global variable. Now,
we have to actually pass an argument to a function.
Here's our first try:
```
=< :- %say |= [* [[arg=@ud ~] ~]] :- %noun
=+ gat=duck
=<(run gat(sam arg))
=> ~
|%
++ duck
=+ sam=0
=+ pre=0
|%
++ run
?: =(sam +(pre))
pre
run(pre +(pre))
--
--
~tasfyn-partyv:dojo/sandbox> +test 42
41
```
We step back and contemplate our handiwork. Is it good? Well...
it works. Reading programs written without syntactic sugar is
about as fun as eating raw chocolate nibs.
What did we do? In the `duck` arm (we often write `++duck`, for
obvious reasons) we produce a core whose payload is `[pre=0 sam=0
~]`, and whose battery contains `++run`.
In the result twig, we first use `++duck` to extend our subject
with a core named `gat`. We then use `run` on that core. Why do
we need this `gat`? Why can't we just write `=<(run duck(sam
arg))`?
Because the arm is computed *after* the mutation, whereas here we
need to mutate the *result* of `++duck`. What `duck(sam arg)`
actually does is try to mutate `sam` within the core that
contains `++duck`, where `sam` doesn't exist, so the code won't
compile.
And note that with `=<`, we've placed our library structurally
between the original subject and the program we're writing,
but lexically at the bottom with zero left margin. We also
clear the subject to keep things simple.
## A more regular structure
It actually gets worse. To make this code look simpler, we need
to make it more complex. While "function calls" actually fit
quite well into the Hoon architecture, they're also a nontrivial
synthetic construction. We'll build the desugared form the hard
way, then show you where we put the sugar in.
The desugared canonical decrement:
```
=< :- %say |= [* [[arg=@ud ~] ~]] :- %noun
=+ gat=duck
=<(run gat(sam arg))
=> ~
|%
++ duck
=+ sam=0
|%
++ run
=+ pre=0
=< loop
|%
++ loop
?: =(sam +(pre))
pre
loop(pre +(pre))
--
--
--
~tasfyn-partyv:dojo/sandbox> +test 42
41
```
Yuck. Okay, let's fix this.
## Art of the loop
First, look at our little `++loop`. It works just like our old
`++run` loop. We notice that there's actually something nice
about it: we don't use the symbol `loop` anywhere outside these 7
lines of code. It's not exported at all.
Actually, the name `loop` is useless and redundant.
Making up names is one of the hard problems in computer science,
so why solve it? For just this reason, Hoon has an *empty name*,
which as a constant is a zero-length symbol (`%$` instead of
`%foo`), and as a limb is the `buc` symbol (`$`). With `$`,
our loop becomes:
```
=< $
|%
++ $
?: =(sam +(pre))
pre
$(pre +(pre))
--
```
This may not seem like a huge improvement. It's not. But it's
exactly equivalent to the synthetic rune `|-`, "bardas":
```
|- ?: =(sam +(pre))
pre
$(pre +(pre))
```
This is obviously the canonical Hoon loop. It leaves us with
```
=< :- %say |= [* [[arg=@ud ~] ~]] :- %noun
=+ gat=duck
=<(run gat(sam arg))
=> ~
|%
++ duck
=+ sam=0
|%
++ run
=+ pre=0
|- ?: =(sam +(pre))
pre
$(pre +(pre))
--
--
~tasfyn-partyv:dojo/sandbox> +test 42
41
```
## Is this a lambda?
Could we use `$` for `++run`? It certainly sounds like the same
kind of thing as `++loop` -- just a word we invented to mean "do
it." Should the programmer have to invent these kinds of words?
```
=< :- %say |= [* [[arg=@ud ~] ~]] :- %noun
=+ gat=duck
=<($ gat(sam arg))
=> ~
|%
++ duck
=| sam=@ud
|%
++ $
=+ pre=0
|- ?: =(sam +(pre))
pre
$(pre +(pre))
--
--
~tasfyn-partyv:dojo/sandbox> +test 42
41
```
(Besides `run` to `$`, we changed `=+ sam=0` to `=| sam=@ud`.
Let's just remember that there's some magic here. We'll come
back and explain it later.)
This is still kind of ugly -- but it's exactly equivalent to
```
=< :- %say |= [* [[arg=@ud ~] ~]] :- %noun
=+ gat=duck
=<($ gat(sam arg))
=> ~
|%
++ duck
|= sam=@ud
=+ pre=0
|- ?: =(sam +(pre))
pre
$(pre +(pre))
--
~tasfyn-partyv:dojo/sandbox> +test 42
41
```
Doesn't that look like a function? Indeed, we're done with
`++duck` -- that's what a Hoon decrement should look like.
If you squint a little, `|=` ("bartis") might even be a strange,
deformed lambda rune.
Since it's doing something simple, we might well even compress
the whole body of the function into one wide-form line:
```
=+(pre=0 |-(?:(=(sam +(pre)) pre $(pre +(pre)))))
```
(According, of course, to taste -- this is a bit tight for some.)
## Gates and how to call them
Our call site remains a disaster, though. We'll need moar sugar.
But first, let's look at this lambda-thing we've made. What is
the noun produced by `++duck`? Our term for it is a "gate," but
nobody will hate you for saying "function." And while we "slam"
our gates, you can feel free to just "call" them.
A gate is a core, of course, but a special kind of core. All
cores are shaped like `[battery payload]`. A gate is shaped like
`[formula [sample context]]`. A gate has one arm, `$`, so its
battery is just a formula. To slam a gate, you replace its
sample (`+6` or `+<`, "luslit" or "lust") with your own noun,
and apply the formula to the mutated gate.
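Slamming can be sketched in Python (not Hoon) using tree addresses: axis 1 is the whole noun, and the head and tail of axis n are 2n and 2n+1, so `+2` is the formula and `+6` is the sample. The helper names `slot`, `edit`, and `slam` are hypothetical, cells are modelled as 2-tuples, and the formula is a Python function standing in for a Nock formula.

```python
def slot(axis, noun):
    """Fetch the subtree at `axis` (1 = whole noun, 2n = head, 2n+1 = tail of n)."""
    if axis == 1:
        return noun
    return slot(axis // 2, noun)[axis % 2]

def edit(axis, value, noun):
    """Return a copy of `noun` with the subtree at `axis` replaced by `value`."""
    if axis == 1:
        return value
    head, tail = slot(axis // 2, noun)
    new_parent = (value, tail) if axis % 2 == 0 else (head, value)
    # Splice the rebuilt parent back in, one level closer to the root.
    return edit(axis // 2, new_parent, noun)

def slam(gate, sample):
    """Replace the sample at +6, then run the formula at +2 on the mutated gate."""
    new_gate = edit(6, sample, gate)
    return slot(2, new_gate)(new_gate)

def duck_formula(gate):
    """Stand-in for the compiled `$` arm: count up until pre + 1 equals the sample."""
    sam = slot(6, gate)
    pre = 0
    while sam != pre + 1:
        pre += 1
    return pre

# [formula [sample context]] with a default sample of 0 and an empty context.
duck = (duck_formula, (0, None))

assert slam(duck, 42) == 41
assert slot(6, duck) == 0   # the original gate is left unchanged
```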
As we explained earlier, `duck(sam arg)` is not the right way to
mutate the gate we make with `duck`, because it's actually
trying to mutate the core we used to make `duck`. But there has
to be some sugar to do this, and there is: `%*`, "centar". We
can replace our call site with `%*($ duck sam arg)`.
This is also not quite orthodox, because the whole point of a
gate is the canonical shape that defines a calling convention.
We can and should say: `%*($ duck +< arg)`.
Unsurprisingly, this in turn is `%-(duck arg)` in regular form,
or `(duck arg)`