From e97bec0bb58a35299bd6ab6d52f2ae65177b9cbb Mon Sep 17 00:00:00 2001 From: Galen Wolfe-Pauly Date: Tue, 29 Sep 2015 14:36:58 -0700 Subject: [PATCH] tutorial drafts --- pub/doc/hoon/tutorial.md | 4 + pub/doc/hoon/tutorial/0-nouns.md | 232 ++++++++++++++++ pub/doc/hoon/tutorial/1-twigs.md | 229 ++++++++++++++++ pub/doc/hoon/tutorial/2-syntax.md | 392 +++++++++++++++++++++++++++ pub/doc/hoon/tutorial/3-program.md | 316 +++++++++++++++++++++ pub/doc/hoon/tutorial/4-functions.md | 261 ++++++++++++++++++ 6 files changed, 1434 insertions(+) create mode 100644 pub/doc/hoon/tutorial.md create mode 100644 pub/doc/hoon/tutorial/0-nouns.md create mode 100644 pub/doc/hoon/tutorial/1-twigs.md create mode 100644 pub/doc/hoon/tutorial/2-syntax.md create mode 100644 pub/doc/hoon/tutorial/3-program.md create mode 100644 pub/doc/hoon/tutorial/4-functions.md diff --git a/pub/doc/hoon/tutorial.md b/pub/doc/hoon/tutorial.md new file mode 100644 index 000000000..1048ccb96 --- /dev/null +++ b/pub/doc/hoon/tutorial.md @@ -0,0 +1,4 @@ +Tutorials +========= + + \ No newline at end of file diff --git a/pub/doc/hoon/tutorial/0-nouns.md b/pub/doc/hoon/tutorial/0-nouns.md new file mode 100644 index 000000000..3dec5abd4 --- /dev/null +++ b/pub/doc/hoon/tutorial/0-nouns.md @@ -0,0 +1,232 @@ +# Hoon 0: introduction + +Hoon is a strict, higher-order typed pure-functional language. + +Why Hoon? On the one hand, typed functional languages are known +for a particularly pleasant phenomenon: once your code compiles, +it's quite likely to work. On the other hand, most typed +functional languages are influenced by advanced mathematics. +As Barbie once put it, math class is hard. + +Hoon is a typed FP language for the common street programmer. +Well-written Hoon is as concrete and data-oriented as possible. +The less functional magic you use, the better. One Haskell +hacker described Hoon as "imperative programming in a functional +language." 
He didn't mean this as a compliment, but we choose to +take it as one. + +Moreover, one task of a type system in network computing is +marshalling typed data on the sender, and validating untrusted +data on the receiver. Hoon is very good at this task, which in +most typed languages is an afterthought at best. + +The main disadvantage of Hoon is that its syntax and semantics +are unfamiliar. The syntax will remind too many of Perl, but +like most human languages (and unlike Perl) it combines a regular +core structure with irregular variations. Its semantic +complexity is bounded by the fact that the compiler is only 2000 +lines of Hoon (admittedly an expressive language). Most people's +experience is that Hoon is much easier to learn than it looks. + +## Nouns: data made boring + +A noun is an atom or a cell. An atom is any unsigned integer. A +cell is an ordered pair of nouns. + +The noun is an intentionally boring data model. Nouns (at least, +nouns in Urbit) don't have cycles (although a noun implementation +should take advantage of acyclic graph structure). Noun +comparison is always by value (there is no way for the programmer +to test pointer equality). Nouns are strict; there is no such +thing as an infinite noun. And, of course, nouns are immutable. +So there's basically no way to have any real fun with nouns. + +For language historians, nouns are Lisp's S-expressions, minus a +lot of hacks, tricks, and features that made sense 50 years ago. +In particular, because atoms are not tagged (an atom can encode a +string, for instance), nouns work best with a static type system. +How do you print an atom if you don't know whether it's a string +or a number? You can guess, but... + +## A type system for nouns + +So learning nouns in practice involves learning them with a type +system that makes them usable. Fortunately, we have that. + +One obstacle to learning Hoon is that it has two quite distinct +concepts that might equally be called a "type." 
Worse, most +other typed functional languages are mathy and share a basically +mathematical concept of "type." We can't avoid using the T-word +occasionally, but it has no precise meaning in Hoon and can be +extremely confusing. + +Hoon's two kinds of "type" are `span` and `mold`. A span is both +a constructively defined set of nouns, and a semantic convention +for users in that set. A `mold` is a function whose range is +some useful span. A mold is always idempotent (for any noun x, +`f(x)` equals `f(f(x))`), and its domain is any noun. + +(One way to explain this is that while a span is what most +languages call a "type," Hoon has no way for the programmer to +express a span directly. Instead, we use inference to define it +as the range of a function. This same function, the mold, can +also be used to validate or normalize untrusted, untyped data -- +a common problem in modern programming.) + +(Hoon's inference algorithm is somewhat dumber than the +unification algorithms (Hindley-Milner) used in most typed +functional languages. Hoon reasons only forward, not backward. +It needs more manual annotations, which you usually want anyway. +Otherwise, it gets more or less the same job done.) + +## Let's make some nouns + +This stuff isn't even slightly hard. Let's make a noun: +``` +~tasfyn-partyv:dojo> 42 +``` +You'll see the expression you entered, then the resulting value: +``` +> 42 +42 +``` +Let's try a different value: +``` +~tasfyn-partyv:dojo> 0x2a +``` +You'll see: +``` +> 0x2a +0x2a +``` +`42` and `0x2a` are actually *the same noun*, because they're the +same number. But we don't just have the noun to print - we have +a `[span noun]` cell (sometimes called a `vase`). + +As you recall, a span defines a set of nouns and a semantic +interpretation. As sets, both spans here are "any number". But +semantically, `42` has a decimal span and `0x2a` hexadecimal, so +they print differently. + +(It's important to note that Hoon is a statically typed language. 
+We don't work with vases unless we're dynamically compiling code, +which is of course what we're doing here in the shell. Dynamic +type is static type compiled at runtime.) + +Finally, let's make some cells. Try these on your own ship: +``` +~tasfyn-partyv:dojo> [42 0x2a] +~tasfyn-partyv:dojo> [42 [0x2a 420]] +~tasfyn-partyv:dojo> [42 0x2a 420] +``` +We observe that cells associate right: `[a b c]` is just another +way of writing `[a [b c]]`. + +Also, Lisp veterans beware: Hoon `[a b]` is Lisp `(a . b)`, Lisp +`(a b)` is Hoon `[a b ~]` (`~` represents nil, with a value of atom `0`). Lisp and Hoon are both pair-oriented +languages down below, but Lisp has a layer of sugar that makes it +look list-oriented. Hoon loves its "improper lists," ie, tuples. + +## Looking at spans + +What are these mysterious spans? We can see them with the `?` +prefix, which prints the span along with the result. Moving to +a more compact example format: +``` +~tasfyn-partyv:dojo> ? 42 + @ud +42 +~tasfyn-partyv:dojo> ? 0x2a + @ux +0x2a +``` +`@ud` and `@ux` stand for "unsigned decimal" and "unsigned hex," +obviously. But what is this syntax? + +We only derive spans through inference. So there's no language +syntax for a span. We have to be able to print spans, though, if +only for debugging and diagnostics. `@ud` is a print-only +syntax. (In this case it happens to be the same as the `mold` +syntax, but that's just a coincidence.) + +## Looking at spans, part 2 + +A good way to teach yourself to think in nouns is to look not at +the prettyprinted span, but at the actual noun it's made of. +Since everything in Hoon is a noun, a span is a noun too. When +we use `??` rather than `?` as a prefix, we see the noun: +``` +~tasfyn-partyv:dojo> ?? 42 + [%atom %ud] +42 +~tasfyn-partyv:dojo> ?? [42 0x2a] + [%cell [%atom %ud] [%atom %ux]] +[42 0x2a] +``` +What is this `%atom` notation? Is it a real noun? Can anyone +make one? 
+``` +~tasfyn-partyv:dojo> %atom +%atom +~tasfyn-partyv:dojo> %foo +%foo +~tasfyn-partyv:dojo> [%foo %bar] +[%foo %bar] +``` +What if we look at the span? +``` +~tasfyn-partyv:dojo> ? %foo + %foo +%foo +~tasfyn-partyv:dojo> ?? %foo + [%cube 7.303.014 %atom %tas] +%foo +``` +This takes a little bit of explaining. First of all, `7.303.014` +is just the German (and Urbit) way of writing `7,303,014`, or the +hexadecimal number `0x6f.6f66`, or the string "foo" as an +unsigned integer. (It's much easier to work with large integers +when the digits are grouped.) Second, remembering that cells +nest right, `[%cube 7.303.014 %atom %tas]` is really `[%cube +7.303.014 [%atom %tas]]`. + +A `%cube` span is a constant -- a set of one noun, the atom +`7.303.014`. But we still need to know how to print that noun. +In this case, it's an `[%atom %tas]`, ie, a text symbol. + +Cubes don't have to be symbols -- in fact, we can take the +numbers we've just been using, and make them constants: +``` +~tasfyn-partyv:dojo> %42 +%42 +~tasfyn-partyv:dojo> ? %42 + %42 +%42 +~tasfyn-partyv:dojo> ?? %42 + [%cube 42 %atom %ud] +%42 +``` + +## Our first mold + +After seeing a few span examples, are we ready to describe the +set of all spans with a Hoon mold? Well, no, but let's try it +anyway. Ignore the syntax (which we'll explain later; this is a +tutorial, not a reference manual), and you'll get the idea: +``` +++ span + $% [%atom @tas] + [%cell span span] + [%cube * span] + == +``` +This mold is not the entire definition of `span`, just the cases +we've seen so far. In English, a valid span is either: + +- a cell with head `%atom`, and tail some symbol. +- a cell with head `%cell`, and tail some pair of spans. +- a cell with head `%cube`, and tail a noun-span pair. + +The head of a span is essentially the tag in a variant record, +a pattern every programming language has. To use the noun, we +look at the head and then decide what to do with the tail. 
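As an aside, the "look at the head, then decide what to do with the tail" pattern can be sketched outside Hoon. The Python below is only an illustration under stated assumptions: the function names (`text_to_atom`, `atom_to_text`, `show_span`) are made up for this sketch, spans are modelled as flat Python tuples rather than right-nested cells, and the printed forms merely imitate the dojo output shown above. It is not how the Hoon compiler represents or prints spans.

```python
# Hypothetical Python model of the span nouns shown above. This is not Hoon,
# and not how the real compiler stores or prints spans.

def text_to_atom(text: str) -> int:
    """Encode ASCII text as an unsigned integer, first character in the low byte."""
    return int.from_bytes(text.encode("ascii"), "little")


def atom_to_text(atom: int) -> str:
    """Inverse of text_to_atom: read the bytes back out, low byte first."""
    length = (atom.bit_length() + 7) // 8
    return atom.to_bytes(length, "little").decode("ascii")


def show_span(span) -> str:
    """Dispatch on the head (the tag), then interpret the tail accordingly."""
    tag = span[0]
    if tag == "atom":
        # tail is a symbol such as "ud" or "ux"
        return "@" + span[1]
    if tag == "cell":
        # tail is a pair of spans
        return "[" + show_span(span[1]) + " " + show_span(span[2]) + "]"
    if tag == "cube":
        # tail is a constant noun plus the span that says how to print it
        value, inner = span[1], span[2]
        if inner == ("atom", "tas"):
            return "%" + atom_to_text(value)
        return "%" + str(value)
    raise ValueError(f"unknown span tag: {tag!r}")
```

In this sketch `text_to_atom("foo")` gives 7303014, the `7.303.014` from the `%cube` example, and `show_span(("cube", 42, ("atom", "ud")))` gives `%42`.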
diff --git a/pub/doc/hoon/tutorial/1-twigs.md b/pub/doc/hoon/tutorial/1-twigs.md new file mode 100644 index 000000000..9ae966536 --- /dev/null +++ b/pub/doc/hoon/tutorial/1-twigs.md @@ -0,0 +1,229 @@ +# Hoon 1: twigs and legs + +In the last chapter we learned how to make nouns. In this +chapter we'll start programming a little. + +## Nock for Hoon programmers + +Hoon compiles itself to a pico-interpreter called Nock. This +isn't the place to explain Nock (which is to Hoon much as +assembly language is to C), but Nock is just a way to express a +function as a noun. + +Specifically, you can think of Nock as a (Turing-complete) +interpreter shaped like (pseudocode): +``` +Nock(subject formula) => product +``` +Your function is the noun `formula`. The input to the function +is the noun `subject`. The output is `product`. If something +about this seems complicated or even interesting, you may be +misunderstanding it. + +## From Hoon to Nock + +The Hoon parser turns a source expression (even one as simple as +`42` from the last chapter) into a noun called a `twig`. If you +know what an AST is, a twig is an AST. (If you don't know what +an AST is, it's not worth the student loans.) + +To simplify slightly, the Hoon compiler is shaped like: +``` +Hoon(subject-span function-twig) => [product-span formula-nock] +``` +Hoon, like Nock, is a *subject-oriented* language - your twig is +always executed against one input noun, the subject. For any +subject noun in `subject-span`, the compiler produces a Nock +formula that computes `function-twig` on that subject, and a +`product-span` that is the span of the product. + +(Pretty much no other language works this way. In a normal +language, your code is executed against a scope, stack, or other +variable context, which may not even be a regular user-level +value. 
This change is one of the hardest things to understand +about Hoon, mostly because it's hard to stay convinced that +subject-oriented programming is as straightforward as it is.) + +## From constants to twigs + +In the last chapter we were entering degenerate twigs like `42`. +Obviously this doesn't use the subject at all. + +Let's use the dojo variable facility (this is *not* Hoon syntax, +just a dojo command) to make a test subject: +``` +~tasfyn-partyv:dojo> =test [[[8 9] 5] [6 7]] +``` +We can evaluate twigs against this subject with the Hoon `:` +syntax (`a:b` uses the product of `b` as the subject of `a`). +``` +~tasfyn-partyv:dojo> 42:test +42 +``` + +## Tree addressing + +The simplest twigs produce a subtree, or "leg", of the subject. +A cell, of course, is a binary tree. The very simplest twig is +`.`, which produces the root of the tree - the whole subject: +``` +~tasfyn-partyv:dojo> .:test +[[[8 9] 5] 6 7] +``` +(If you're wondering why `[6 7]` got printed as `6 7`, remember +that `[]` associates to the right.) + +Hoon has a simple tree addressing scheme (inherited from Nock): +the root is `1`, the head of `n` is `2n`, the tail is `2n+1`. +The twig syntax is `+n`. Hence: +``` +~tasfyn-partyv:dojo> +1:test +[[[8 9] 5] 6 7] +``` +Our example is a sort of Hoon joke, not very funny: +``` +~tasfyn-partyv:dojo> +2:test +[[8 9] 5] +~tasfyn-partyv:dojo> +3:test +[6 7] +~tasfyn-partyv:dojo> +4:test +[8 9] +~tasfyn-partyv:dojo> +5:test +5 +~tasfyn-partyv:dojo> +6:test +6 +~tasfyn-partyv:dojo> +7:test +7 +``` +And so on. An instinct for binary tree geometry develops over +time as you use the system, rather the way most programmers +learn to do binary math. + +## Femur syntax + +A "femur" is an alternative syntax for a tree address. The femur +syntax creates a recognizable geometric shape by alternating +between two head/tail pairs, read left to right: `-` and `+`, +`<` and `>`. + +Thus `-` is `+2`, `+` is `+3`, `+<` is `+6`, `->` is `+5`, `-<+` +is `+9`, etc. 
The decimal numbers are distracting, whereas the +glyph string binds directly to the tree geometry as you learn it. +We actually almost never use the decimal tree geometry syntax. + +## Simple faces + +But it would be pretty tough to program in Hoon if explicit +geometry was the only way of getting data out of a subject. +Let's introduce some new syntax: +``` +~tasfyn-partyv:dojo> foo=42 +foo=42 +~tasfyn-partyv:dojo> ? foo=42 + foo=@ud +foo=42 +~tasfyn-partyv:dojo> ?? foo=42 + [%face %foo %atom %ud] +foo=42 +``` +To extend our `++span` mold: +``` +++ span + $% [%atom @tas] + [%cell span span] + [%cube * span] + [%face @tas span] + == +``` +The `%face` span wraps a label around a noun. Then we can +get a leg by name. Let's make a new dojo variable: +``` +~tasfyn-partyv:dojo> =test [[[8 9] 5] foo=[6 7]] +``` +The syntax is what you might expect: +``` +~tasfyn-partyv:dojo> foo:test +[6 7] +``` +Does this do what you expect it to do? +``` +~tasfyn-partyv:dojo> +3:test +foo=[6 7] +~tasfyn-partyv:dojo> ? +3:test + foo=[@ud @ud] +foo=[6 7] +~tasfyn-partyv:dojo> ?? +3:test + [%face %foo %cell [%atom %ud] %atom %ud] +foo=[6 7] +``` + +## Interesting faces; wings + +Again, you're probably used to name resolution in variable scopes +and flat records, but not in trees. (Partly this is because the +tradition in language design is to eschew semantics that make it +hard to build simple symbol tables, because linear search of a +big tree is a bad idea on '80s hardware.) + +Let's look at a few more interesting face cases. First, suppose +we have two cases of `foo`? +``` +~tasfyn-partyv:dojo> =test [[foo=[8 9] 5] foo=[6 7]] +~tasfyn-partyv:dojo> foo:test +[8 9] +``` +In the tree search, the head wins. We can overcome this with a +`^` prefix, which tells the search to skip its first hit: +``` +~tasfyn-partyv:dojo> =test [[foo=[8 9] 5] foo=[6 7]] +~tasfyn-partyv:dojo> ^foo:test +[6 7] +``` +`^^foo` will skip two foos, `^^^foo` three, up to `n`. +But what about nested labels? 
+``` +~tasfyn-partyv:dojo> =test [[[8 9] 5] foo=[6 bar=7]] +~tasfyn-partyv:dojo> bar:test +/~tasfyn-partyv/home/~2015.9.16..21.40.21..1aec:<[1 1].[1 9]> +-find-limb.bar +find-none +``` +It didn't seem to like that. We'll need a nested search: +``` +~tasfyn-partyv:dojo> bar.foo:test +7 +``` +`bar.foo` here is a `wing`, a search path in a noun. Note that +the wing runs from left to right, ie, the opposite of most +languages: `bar.foo` means "bar inside foo." + +Each step in a wing is a `limb`. A limb can be a tree address, +like `+3` or `.`, or a label like `foo`. We can combine them in +one wing: +``` +~tasfyn-partyv:dojo> bar.foo.+3:test +7 +``` + +## Mutation + +Well, not really. We can't modify nouns; the concept doesn't +even make sense in Hoon. Rather, we build new nouns which are +(logical -- the pointers are actually shared) copies of old ones, +with changes. + +Let's build a "mutated" copy of our test noun: +``` +~tasfyn-partyv:dojo> test +[[[8 9] 5] foo=[6 bar=7]] +~tasfyn-partyv:dojo> test(foo 42) +[[[8 9] 5] foo=42] +~tasfyn-partyv:dojo> test(+8 %eight, bar.foo [%hello %world]) +[[[%eight 9] 5] foo=[6 [%hello %world]]] +``` +As we see, there's no obvious need for the mutant noun to be +shaped anything like the old noun. They're different nouns. + +At this point, you have a simplified but basically sound idea of +how Hoon builds and manages nouns. Next, it's time to do some +programming. diff --git a/pub/doc/hoon/tutorial/2-syntax.md b/pub/doc/hoon/tutorial/2-syntax.md new file mode 100644 index 000000000..3864ba48c --- /dev/null +++ b/pub/doc/hoon/tutorial/2-syntax.md @@ -0,0 +1,392 @@ +# Hoon 2: serious syntax + +We've done a bunch of fun stuff on the command line. We know our +nouns. It's time to actually write some serious code -- in a +real source file. 
+ +## Building a simple generator + +In Urbit there's a variety of source file roles, distinguished by +the magic paths they're loaded from: `/gen` for generators, +`/ape` for appliances, `/fab` for renderers, etc. + +We'll start with a generator, the simplest kind of Urbit program. + +### Create a sandbox desk + +A desk is the Urbit equivalent of a `git` branch. We're just +playing around here and don't intend to soil our `%home` desk with +test files, so let's make a sandbox: +``` +|merge %sandbox our %home +``` +### Mount the sandbox + +Your Urbit pier is in `~/tasfyn-partyv`, or at least mine is. +To get our code into Urbit, run the command +``` +~tasfyn-partyv:dojo> |mount /=sandbox=/gen %gen +``` +which mounts the `/gen` folder from the `%sandbox` desk in your Unix +directory `~/tasfyn-partyv/gen`. The mount is a two-way sync, +like your Dropbox. When you edit a Unix file and save, your edit +is automatically committed as a change to `%sandbox`. + +### Execute from the sandbox + +The `%sandbox` desk obviously is merged from `%home`, so it +contains all the default facilities you'd expect there. +Bear in mind, we didn't set it to auto-update when `%home` +is updated (that would be `|sync` instead of `|merge`). + +So we're not roughing it when we set the dojo to load from +`%sandbox`: +``` +[switch to %home] +``` + +### Write your builder + +Let's build the simplest possible kind of generator, a builder. +With your favorite Unix text editor (there are Hoon modes for vim +and emacs), create the file `~/tasfyn-partyv/gen/test.hoon`. +Edit it into this: +``` +:-  %say  |=  *  :-  %noun +[%hello %world] +``` +Get the spaces exactly right, please. Hoon is not in general a +whitespace-sensitive language, but the difference between one +space and two-or-more matters. And for the moment, think of +``` +:-  %say  |=  *  :-  %noun +``` +as gibberish boilerplate at the start of a file, like `#include +"stdio.h"` at the start of a C program. 
Any of our old Hoon +constants would work in place of `[%hello %world]`. + +Now, run your builder: +``` +~tasfyn-partyv:dojo/sandbox> +test +[%hello %world] +``` +Obviously this is your first Hoon *program* per se. + +## Hoon syntax 101 + +But what's up with this syntax? + +### A syntactic apology + +The relationship between ASCII and human programming languages +is like the relationship between the electric guitar and +rock-and-roll. If it doesn't have a guitar, it's not rock. +Some great rockers play three chords, like Johnny Ramone; some +shred it up, like Jimmy Page. + +The two major families of ASCII-shredding languages are Perl and +the even more spectacular APL. (Using non-ASCII characters is +just a fail, but APL successors like J fixed this.) No one +has any right to rag on Larry Wall or Ken Iverson, but Hoon, +though it shreds, shreds very differently. + +The philosophical case for a "metalhead" language is threefold. +One, human beings are much better at associating meaning with +symbols than they think they are. Two, a programming language is +a professional tool and not a plastic shovel for three-year-olds. + +And three, the alternative to heavy metal is keywords. When you +use a keyword language, not only are you forcing the programmer +to tiptoe around a ridiculous maze of restricted words used and +reserved, you're expressing your program through two translation +steps: symbol->English and English->computation. When you shred, +you are going direct: symbol->computation. Especially in a pure +language, this creates a sense of "seeing the function" which no +keyword language can quite duplicate. + +But any metalhead language you don't yet know is line noise. +Let's get you up to speed as fast as possible. + +### A glyphic bestiary + +A programming language needs to be not just read but said. But +no one wants to say "ampersand." Therefore, we've taken the +liberty of assigning three-letter names to all ASCII glyphs. 
+ +Some of these bindings are obvious and some aren't. You'll be +genuinely surprised at how easy they are to remember: +``` + ace [1 space] dot . pan ] + bar | fas / pel ) + bis \ gap [>1 space, nl] pid } + buc $ hax # ran > + cab _ ket ^ rep ' + cen % lep ( sac ; + col : lit < tar * + com , lus + tec ` + das - mat @ tis = + den " med & wut ? + dip { nap [ zap ! +``` +It's fun to confuse people by using these outside Urbit. A few +digraphs also have irregular sounds: +``` +== stet +-- shed +++ slus +-> dart +-< dusk ++> lark ++< lush +``` + +### The shape of a twig + +A twig, of course, is a noun. As usual, the easiest way to +explain both the syntax that compiles into that noun, and the +semantic meaning of the noun, is the noun's physical structure. + +#### Autocons + +A twig is always a cell, and any cell of twigs is a twig +producing a cell. As an homage to Lisp, we call this +"autocons." Where you'd write `(cons a b)` in Lisp, you write +`[a b]` in Hoon, and the shape of the twig follows. + +The `???` prefix prints a twig as a noun instead of running it. +Let's see autocons in action: +``` +~tasfyn-partyv:dojo/sandbox> ??? 42 +[%dtzy %ud 42] +~tasfyn-partyv:dojo/sandbox> ??? 0x2a +[%dtzy %ux 42] +~tasfyn-partyv:dojo/sandbox> ??? [42 0x2a] +[[%dtzy %ud 42] %dtzy %ux 42] +``` +(As always, it may confuse *you* that this is the same noun as +`[[%dtzy %ud 42] [%dtzy %ux 42]]`, but it doesn't confuse Hoon.) + +#### The stem-bulb pattern + +If the head of your twig is a cell, it's an autocons. If the +head is an atom, it's an unpronounceable four-letter symbol like +the `%dtzy` above. + +This is the same pattern as we see in the `span` mold -- a +variant record, essentially, in nouns. The head of one of these +cells is called the "stem." The tail is the "bulb." The shape +of the bulb is totally dependent on the value of the stem. + +#### Runes and stems + +A "rune" (a word intentionally chosen to annoy Go programmers) is +a digraph - a sequence of two ASCII glyphs. 
If you know C, you +know digraphs like `->` and `?:` and are used to reading them as +single characters. + +In Hoon you can *say* them as words: "dasran" and "wattis" +respectively. In a metalhead language, if we had to say +"minus greater-than" and "question-colon", we'd just die. + +Most twig stems are made from runes, by concatenating the glyph +names and removing the vowels. For example, the rune `=+`, +pronounced "tislus," becomes the stem `%tsls`. (Note that in +many noun implementations, this is a 31-bit direct value.) + +(Some stems (like `%dtzy`) are not runes, simply because they +don't have regular-form syntax and don't need to use precious +ASCII real estate. They are otherwise no different.) + +An important point to note about runes: they're organized. The +first glyph in the rune defines a category. For instance, runes +starting with `.` compute intrinsics; runes starting with `|` +produce cores; etc. + +Another important point about runes: they come in two flavors, +"natural" (stems interpreted directly by the compiler) and +"synthetic" (macros, essentially). + +(Language food fight warning: one advantage of Hoon over Lisp is +that all Hoon macros are inherently hygienic. Another advantage +is that Hoon has no (user-level) macros. In Hoon terms, nobody +gets to invent their own runes. A DSL is always and everywhere +a write-only language. Hoon shreds its ASCII pretty hard, but +the same squiggles mean the same things in everyone's code.) + +#### Wide and tall regular forms + +A good rune example is the simple rune `=+`, pronounced "tislus", +which becomes the stem `%tsls`. A `%tsls` twig has the shape +`[%tsls twig twig]`. + +The very elegance of functional languages creates a visual +problem that imperative languages lack. 
An imperative language +has distinct statements (with side effects) and (usually pure) +expressions; it's natural that in most well-formatted code, +statements flow vertically down the screen, and expressions grow +horizontally across it. This interplay creates a natural and +relaxing shape on your screen. + +In a functional language, there's no difference. The trivial +functional syntax is Lisp's, which has two major problems. One: +piles of expression terminators build up at the bottom of complex +functions. Two: the natural shape of code is diagonal. The more +complex a function, the more it wants to besiege the right +margin. The children of a node have to start to the right of its +parent, so the right margin bounds the tree depth. + +Hoon does not completely solve these problems, but alleviates +them. In Hoon, there are actually two regular syntax forms for +most twig cases: "tall" and "wide" form. Tall twigs can contain +wide twigs, but not vice versa, so the visual shape of a program +is very like that of a statements-and-expressions language. + +Also, in tall mode, most runes don't need terminators. Take +`=+`, for example. Since the parser knows to expect exactly +two twigs after the `=+` rune, it doesn't need any extra syntax +to tell it that it's done. + +Let's try a wide `=+` in the dojo: +``` +~tasfyn-partyv:dojo/sandbox> =+(planet=%world [%hello planet]) +[%hello %world] +``` +(`=+` seems to be some sort of variable declaration? Let's not +worry about it right now. We're on syntax.) + +The wide syntax for a `=+` twig, or any binary rune: the rune, `(`, the +first subtwig, one space, the second subtwig, and `)`. To read +this twig out loud, you'd say: +``` +tislus lap planet is cen world ace nep cen hello ace planet pen +pal +``` +("tis" not in a rune gets contracted to "is".) 
+ +Let's try a tall `=+` in `test.hoon`: +``` +:-  %say  |=  *  :-  %noun +=+  planet=%world +[%hello planet] +``` +The tall syntax for a `=+` twig, or any binary rune: the rune, at +least two spaces or one newline, the first subtwig, at least two +spaces or one newline, the second subtwig. Again, the subtwigs of +a tall twig can be tall or wide; the subtwigs of a wide twig have to be wide. + +(Note that our boilerplate line is a bunch of tall runes on one +line, with two-space gaps. This is unusual but quite legal, and +not to be confused with the actual wide form.) + +To read this twig out loud, you'd say: +``` +tislus gap planet is cen world gap nep cen hello ace planet pen +``` +#### Layout conventions + +Should you use wide twigs or tall twigs? When? How? What +should your code look like? You're the artist. Except for the +difference between one space (`ace`) and more space (`gap`), the +parser doesn't care how you format your code. Hoon is not Go -- +there are no fixed rules for doing it right. + +However, the universal convention is to keep lines under 80 +characters. Also, hard tab characters are illegal. And when in +doubt, make your code look like the kernel code. + +##### Backstep indentation + +Note that the "variable declaration" concept of `=+` (which is no +more a variable declaration than a Tasmanian tiger is a tiger) +works perfectly here, because `[%hello planet]` -- despite being +a subtree of the `=+` twig -- is at the same indent level. +So our code flows down the screen, not down and to the right, and +of course there are no superfluous terminators. It looks good, +and creates fewer hard-to-find syntax errors than you'd think. + +This is called "backstep" indentation. Another example, using a +ternary rune that has a strange resemblance to C: +``` +:-  %say  |=  *  :-  %noun +=+  planet=%world +?:  =(%world planet) +  [%hello planet] +[%goodbye planet] +``` +It's not always the case when backstepping that the largest +subtwig is at the bottom and loses no margin, but it often is. 
+And not all runes have tuple structure; some are n-ary, and use +the `==` terminator (again, pronounced "stet"): +``` +:- %say |= * :- %noun +=+ planet=%world +?+ planet + [%unknown planet] + %world [%hello planet] + %ocean [%goodbye planet] +== +``` +So we occasionally lose right-margin as we descend a deep twig. +But we can keep this lossage low with good layout design. The +goal is to keep the heavy twigs on the right, and Hoon tries as +hard as possible to help you with this. + +For instance, `=+` ("tislus") is a binary rune: `=+(a b)`. In +most cases of `=+` the heavy twig is `b`, but sometimes it's `a`. +So we can use its friend the `=-` rune ("tisdas") to get the same +semantics with the right shape: `=-(b a)`. + +#### Irregular forms + +There are more regular forms than we've shown above, but not a +lot more. Hoon would be quite easy to learn if it was only its +regular forms. It wouldn't be as easy to read or use, though. +The learning curve is important, but not all-important. + +Some stems (like the `%dtzy` constants above) obviously don't and +can't have any kind of regular form (which is why `%dtzy` is not +a real digraph rune). Many of the true runes have only regular +forms. But some have irregular forms. Irregular forms are +always wide, but there is no other constraint on their syntax. + +We've already encountered one of the irregular forms: `foo=42` +from the last chapter, and `planet=%world` here. Let's unpack +this twig: +``` +~tasfyn-partyv:dojo/sandbox> ?? %world + [%cube 431.316.168.567 %atom %tas] +%world + +~tasfyn-partyv:dojo/sandbox> ??? %world +[%dtzz %tas 431.316.168.567] +``` +Clearly, `%dtzz` is one of our non-regulars. But we can wrap it +with our irregular form: +``` +~tasfyn-partyv:dojo/sandbox> ?? planet=%world + [%face %planet [%cube 431.316.168.567 %atom %tas]] +planet=%world + +~tasfyn-partyv:dojo/sandbox> ??? 
planet=%world +[%ktts %planet %dtzz %tas 431.316.168.567] +``` +Since `%ktts` is "kettis", ie, `^=`, this has to be the irregular +form of +``` +~tasfyn-partyv:dojo/sandbox> ^=(planet %world) +planet=%world +``` +So if we wrote our example without this irregular form, it'd be +``` +:-  %say  |=  *  :-  %noun +=+  ^=(planet %world) +[%hello planet] +``` +Or with a gratuitous use of tall form: +``` +:-  %say  |=  *  :-  %noun +=+  ^=  planet  %world +[%hello planet] +``` +Now you know how to read Hoon! For fun, try to pronounce more of +the code on this page. Please don't laugh too hard at yourself. diff --git a/pub/doc/hoon/tutorial/3-program.md b/pub/doc/hoon/tutorial/3-program.md new file mode 100644 index 000000000..964e62f16 --- /dev/null +++ b/pub/doc/hoon/tutorial/3-program.md @@ -0,0 +1,316 @@ +# Hoon 3: our first program + +It's time for us to do some actual programming. In this section, +we'll work through that classic Urbit pons asinorum, decrement. + +If you learned Nock before Hoon, you've already done decrement. +If not, all you need to know is that the only arithmetic +intrinsic in Nock is increment -- in Hoon, the unary `.+` rune. +So an actual decrement function is required. + +In this chapter, we write a decrement builder: more or less the +simplest nontrivial Urbit program. We should be able to run this +example: +``` +~tasfyn-partyv:dojo/sandbox> +test 42 +41 +``` + +## What's in that subject? + +As we've seen, Hoon works by running a twig against a subject. +We've been cheerfully running twigs through three chapters while +avoiding the question: what's in the subject? To avoid the issue +we've built a lot of constants, etc. + +Of course your twig's subject comes from whoever runs it. There +is no one true subject. Our twigs on the command line are not +run against the same subject as our generator code, even though +they are both run by the same `:dojo` appliance. 
+ +But the short answer is that both command-line and builder get +*basically* the same subject: some ginormous noun containing all +kinds of bells and whistles and slicers and dicers, including a +kernel library which can needless to say decrement in its sleep. + +As yet you have only faced human-sized nouns. We need not yet +acquaint you with this mighty Yggdrasil, Mother of Trees. First +we need to figure out what she could even be made of. + +## Clearing the subject + +We'll start by clearing the subject: +``` +:- %say |= * :- %noun +=> ~ +[%hello %world] +``` +The `=>` rune ("tisran"), for `=>(p q)` executes `p` against +the subject, then uses that product as the subject of `q`. + +(We've already used an irregular form of `=>`, or to be more +precise its mirror `=<` ("tislit"). In chapter 1, when we wrote +`+3:test`, we meant `=>(test +3)` or `=<(+3 test)`.) + +What is this `~`? It's Hoon `nil`, a zero atom with this span: +``` +~tasfyn-partyv:dojo/sandbox> ?? ~ + [%cube 0 %atom %n] +~ +``` +We use it for list terminators and the like. Obviously, since +our old test code is just a constant, a null subject works fine: +``` +~tasfyn-partyv:dojo/sandbox> +test +[%hello %world] +``` + +## Getting an argument + +Obviously, if we want to write a decrement builder, we'll have to +get an argument from the command line. This involves changing +the `test.hoon` boilerplate a little: +``` +:- %say |= [* [[arg=@ud ~] ~]] :- %noun +=> arg=arg +[%hello arg] + +~tasfyn-partyv:dojo/sandbox> +test 42 +[%hello 42] +``` +`=> arg=arg` looks a little odd. We wouldn't ordinarily do +this. We're just replacing a very interesting subject that +contains `arg` with a very boring one that contains only `arg`, +for the same reason we cleared the subject with `~`. + +In case there's any doubt about the subject (`.` is limb syntax +for `+1`, ie, the whole noun): +``` +:- %say |= [* [[arg=@ud ~] ~]] :- %noun +=> arg=arg +. 
+ +~tasfyn-partyv:dojo/sandbox> +test 42 +arg=42 +``` + +We can even write a trivial increment function using `.+`: +``` +:- %say |= [* [[arg=@ud ~] ~]] :- %noun +=> arg=arg ++(arg) + +~tasfyn-partyv:dojo/sandbox> +test 42 +43 +``` +Below we'll skip both boilerplate lines in our examples. + +## A core is a code-data cell + +But how do we actually, like, code? The algorithm for decrement +is clear. We need to count up to 41. (How do we run useful +programs on a computer with O(n) decrement? That's an +implementation detail.) + +We'll need another kind of noun: the *core*. Briefly, the core +is always a cell `[battery payload]`. The payload is data, the +battery is code -- one or more Nock formulas, to be exact. + +Consider a simple core with a one-formula battery. Remember, we +create Nock formulas by compiling a twig against a subject. The +subject is dynamic data, but its span is static. What span do we +give the compiler, and what noun do we give the formula? + +A core formula always has the core as its subject. The formula +is essentially a computed attribute on the payload. But if the +subject was just the payload, the formula couldn't recurse. + +Of course, there is no need to restrict ourselves to one computed +attribute. We can just stick a bunch of formulas together and +call them a battery. The source twigs in this core are called +"arms," which have labels just like the faces we saw earlier. + +Hoon overloads computed attributes (arms) and literal attributes +(legs) in the same namespace. A label in a wing may refer to +either. To extend the name-resolution tree search described in +chapter 1, when searching a core, we look for a matching arm. +If we find it we're done. If we don't, or if a `^` mark makes us +skip, we search into the payload. + +If a name resolves to a core arm, but it's not the last limb in the +wing, the arm produces the core itself. Similarly, when the +wing is not an access but a mutation, the arm refers to the core. 
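If it helps to see that search order outside Hoon, here is a rough Python model of it. This is a sketch under loud assumptions, not how the Hoon compiler works: `Core`, `resolve`, and the dict-of-legs payload are hypothetical names invented for illustration, faces are flattened into a plain dict, and the `^` skip and multi-limb wings are left out entirely.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict


@dataclass
class Core:
    """Hypothetical stand-in for a core: a [battery payload] pair."""
    battery: Dict[str, Callable[["Core"], Any]]  # arm name -> formula run against the core
    payload: Any                                 # labeled legs (a dict) or another Core


def resolve(name: str, subject: Any) -> Any:
    """Resolve one label: try the arms of a core first, then search its payload."""
    if isinstance(subject, Core):
        arm = subject.battery.get(name)
        if arm is not None:
            # An arm is a computed attribute: run it with the whole core as subject.
            return arm(subject)
        # No matching arm, so keep searching inside the payload.
        return resolve(name, subject.payload)
    if isinstance(subject, dict) and name in subject:
        # A leg is a literal attribute: just return the stored value.
        return subject[name]
    raise KeyError(name)


# An inner core with one arm, wrapped around a payload with one leg.
inner = Core(battery={"double": lambda core: 2 * resolve("arg", core)},
             payload={"arg": 21})
# An outer core whose payload is the inner core.
outer = Core(battery={"answer": lambda core: resolve("double", core)},
             payload=inner)

arm_result = resolve("double", inner)     # found as an arm of `inner`
leg_result = resolve("arg", outer)        # no arm matches; found as a leg two levels down
nested_result = resolve("answer", outer)  # an arm that itself searches into the payload
```

The one design point worth noticing is that the arm receives the whole core, not just the payload, which is what lets an arm refer to its sibling arms (or to itself).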
+ +This demands an example: if `foo` produces some core `c`, and +`bar` is an arm in that `c` (which may be `foo` itself, or some +leg within `foo`), `bar.foo` runs the arm formula with `c` as the +subject. You might think that `moo.bar.foo` would compute +`bar.foo`, then search for `moo` within that result. Instead, it +searches for `moo` within `c`. (You can get the other result +with `moo:bar.foo`.) + +Does this sound too tricky? It should - it's about the most +complicated feature of Hoon. It's all downhill once you +understand cores. + +Let's again extend our `++span` mold: +``` +++ span + $% [%atom @tas] + [%cell span span] + [%core span (map ,@tas twig)] + [%cube * span] + [%face @tas span] + == +``` +This definition of `%core` is somewhat simplified from the +reality, but basically conveys it. (Moreover, this version of +`span` describes every kind of noun we build.) In our `%core` we +see a payload span and a name-to-twig arm table, as expected. + +Is a core an object? Not quite, because an arm is not a method. +Methods in an OO language have arguments. Arms are functions +only of the payload. (A method in Hoon is an arm that produces a +gate, which is another core -- but we're getting too far ahead.) +However, the battery does look a lot like a classic "vtable." + +## Increment with a core + +Let's increment with a core: +``` +=< inc +|% +++ inc + +(arg) +-- + +~tasfyn-partyv:dojo/sandbox> +test 42 +43 +``` +What's going on? We used the `|%` rune ("barcen") to produce a +core. (There are a lot of runes which create cores; they all +start with `|`, and are basically macros that turn into `|%`.) + +The payload of a core produced with `|%` is the subject with +which `|%` is compiled. We might say that `|%` wraps a core +around its subject. In this case, the subject of the `|%`, +and thus payload, is our `arg=@ud` argument. + +Then we used this core as the subject of the simple wing `inc`. +(Remember that `=<(a b)` is just `=>(b a)`.) 
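The plumbing here can be modeled in a few lines of Python, treating a twig as a function from subject to product. Everything below is an assumption made for illustration: `tisran`, `tislit`, `barcen`, and `limb` are hypothetical helper names borrowed from the rune pronunciations, and the dict standing in for a core is not how nouns are actually laid out.

```python
def tisran(p, q):
    """Model of =>(p q): run p against the subject, then run q against p's product."""
    return lambda subject: q(p(subject))


def tislit(a, b):
    """Model of =<(a b): the same thing as =>(b a), with the arguments flipped."""
    return tisran(b, a)


def barcen(arms):
    """Model of |%: wrap a core around whatever subject it is run against."""
    return lambda subject: {"battery": arms, "payload": subject}


def limb(name):
    """Model of a one-limb wing that names an arm: run that arm against the core."""
    return lambda core: core["battery"][name](core)


def inc_arm(core):
    """Model of the ++inc arm: +(arg), with arg found in the payload."""
    return core["payload"]["arg"] + 1


# =<  inc  |%  ++  inc  +(arg)  --
program = tislit(limb("inc"), barcen({"inc": inc_arm}))
# The same program written with => instead of =<.
flipped = tisran(barcen({"inc": inc_arm}), limb("inc"))

result = program({"arg": 42})
flipped_result = flipped({"arg": 42})
```

Both spellings build the core first and only then look up `inc` in it; `=<` just lets the short twig come first on the page.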
+ +We can actually print out a core. Take out the `=< inc`: +``` +|% +++ inc + +(arg) +-- + +~tasfyn-partyv:dojo/sandbox> +test 42 +!!! + +~tasfyn-partyv:dojo/sandbox> ? +test 42 +!!! +``` +Cores can be large and complex, and we obviously can't render all +the data in them, either when printing a type or a value. At +some point, you'll probably make the mistake of printing a big +core, maybe even the whole kernel, as an untyped noun. Just +press ^C. + +## Adding a counter + +To decrement, we need to count up to the argument. So we need a +counter in our subject, because where else would it go? Let's +change the subject to add a counter, `pre`: +``` +=> [pre=0 .] +=< inc +|% +++ inc + +(arg) +-- + +~tasfyn-partyv:dojo/sandbox> +test 42 +43 +``` +Once again, `.` is the whole subject, so we're wrapping it in a +cell whose head is `pre=0`. Through the magic of labels, this +doesn't change the way we use `arg`, even though it's one level +deeper in the subject tree. Let's look at the subject again: +``` +=> [pre=0 .] +. + +~tasfyn-partyv:dojo/sandbox> +test 42 +[pre=0 arg=42] +~tasfyn-partyv:dojo/sandbox> ? +test 42 + [pre=@ud arg=@ud] +[pre=0 arg=42] +``` +There's actually a simpler way to write this. We've seen it +already. It's not exactly a variable declaration: +``` +=+ pre=0 +. + +~tasfyn-partyv:dojo/sandbox> +test 42 +[pre=0 arg=42] +``` + +## We actually decrement + +Now we can write our actual decrement program: +``` +=+ pre=0 +=< dec +|% +++ dec + ?: =(arg +(pre)) + pre + dec(pre +(pre)) +-- + +~tasfyn-partyv:dojo/sandbox> +test 42 +41 +``` +`=(a b)` is an irregular form of `.=(a b)`, ie, "dottis" or the +noun `[%dtts a b]`. Likewise, `+(a)` is `.+(a)`, ie, "dotlus" +or `[%dtls a]`. + +`?:` is a regular rune which does exactly what you think it does. +Bear in mind, though, that in Hoon 0 (`&`, "rob") is true and 1 +(`|`, "bar") is false. + +The real action is in `dec(pre +(pre))`. 
This is obviously an +irregular form -- it's the same mutation form we saw before. +Writing it out in full regular form: +``` +=+ pre=0 +=< dec +|% +++ dec + ?: =(arg +(pre)) + pre + %= dec + pre +(pre) + == +-- +~tasfyn-partyv:dojo/sandbox> +test 42 +41 +``` +`%=`, "centis", is the rune which almost every use of a wing +resolves to. It might be called "evaluate with changes." + +When we evaluate with changes, we take a wing (`dec`) here and +evaluate it as described above. Searching in the subject, which +is of course our core, we find an arm called `dec` and run it. + +The changes (replacing `pre` with `+(pre)`) are always applied +relative to the core we landed on (or the leg we landed on). +The change wing is relative to this target; the subject of the +replacement (`+(pre)`) is the original subject. + +So, in English, we compute the `dec` arm again, against a new +core with a new payload that contains an incremented `pre`. +And thus, we decrement. Doesn't seem so hard, does it? diff --git a/pub/doc/hoon/tutorial/4-functions.md b/pub/doc/hoon/tutorial/4-functions.md new file mode 100644 index 000000000..c8429e3d6 --- /dev/null +++ b/pub/doc/hoon/tutorial/4-functions.md @@ -0,0 +1,261 @@ +# Hoon 4: toward actual functions + +Okay, we've programmed. We've achieved decrement. We've written +what is in some sense a loop. What next? + +Well... we're still feeling vaguely disappointed. Because we're +supposed to be doing *functional programming*. And we haven't +yet written any *functions*. + +After all, in Hoon we don't really write a command-line utility +to decrement `42`. We write `(dec 42)`. You probably realize +that on the inside, this is not the same thing as a function in a +normal functional language. The Tasmanian tiger is not a tiger. +On the other hand, it certainly *looks* like a function call. + +So how do we write the function? + +In this chapter, we'll modify `+test` to extend the subject so +that we can write our result as `(dec arg)`. 
Or rather, `(duck +arg)`, because we want to get out of training wheels and stop +clearing the subject soon. + +## Form of the solution + +``` +=< :- %say |= [* [[arg=@ud ~] ~]] :- %noun + (duck arg) +!! :: some interesting core +``` +`!!`, or "zapzap" or `[%zpzp ~]`, can go anywhere a twig can and +always crashes. Because its span is the empty set (`%void`), it +doesn't cause type inference problems. + +In place of the `!!`, we'll put a core, effectively a library, +that provides our new, improved decrement function `duck`. We'll +then call it with the irregular form, `(duck arg)`, which looks +like a function call but is in fact some mysterious macro. + +## Some interesting core + +Translated into imperative programming, what we did in chapter 3 +was more like computing a function of a global variable. Now, +we have to actually pass an argument to a function. + +Here's our first try: +``` +=< :- %say |= [* [[arg=@ud ~] ~]] :- %noun + =+ gat=duck + =<(run gat(sam arg)) +=> ~ +|% +++ duck + =+ sam=0 + =+ pre=0 + |% + ++ run + ?: =(sam +(pre)) + pre + run(pre +(pre)) + -- +-- + +~tasfyn-partyv:dojo/sandbox> +test 42 +41 +``` +We step back and contemplate our handiwork. Is it good? Well... +it works. Reading programs written without syntactic sugar is +about as fun as eating raw chocolate nibs. + +What did we do? In the `duck` arm (we often write `++duck`, for +obvious reasons) we produce a core whose payload is `[pre=0 sam=0 +~]`, and whose battery contains `++run`. + +In the result twig, we first use `++duck` to extend our subject +with a core named `gat`. We then use `run` on that gate. Why do +we need this `gat`? Why can't we just write `=<(run duck(sam +arg))`? + +Because the arm is computed *after* the mutation. But here we +need the mutated *result* of `++duck`. Instead, `duck(sam arg)` +tries to mutate `sam` within the core that contains `++duck`, +where `sam` doesn't exist, so your code won't compile.
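The difference between the two call sites can be sketched in Python. This is a loose model under stated assumptions, not Hoon's actual machinery: `make_duck`, `mutate`, and `library` are hypothetical names, cores are plain dicts, the payload is flattened to a dict of labels, and the recursion in `++run` is written as a loop so that Python's recursion limit doesn't get in the way.

```python
def mutate(core, **changes):
    """Model of core(leg value): a copy of the core with some payload legs replaced."""
    payload = core["payload"] or {}
    missing = [name for name in changes if name not in payload]
    if missing:
        # Stand-in for the compile error: the leg isn't in this core's payload.
        raise KeyError(f"no leg named {missing[0]!r} in this core's payload")
    return {"battery": core["battery"], "payload": {**payload, **changes}}


def make_duck():
    """Model of the ++duck arm: produce a core with payload sam=0, pre=0 and a run arm."""
    def run(core):
        # Count pre upward until sam equals pre + 1, re-running against a mutated core.
        while core["payload"]["sam"] != core["payload"]["pre"] + 1:
            core = mutate(core, pre=core["payload"]["pre"] + 1)
        return core["payload"]["pre"]
    return {"battery": {"run": run}, "payload": {"sam": 0, "pre": 0}}


# The library core: its only arm is duck, and its payload is the cleared subject.
library = {"battery": {"duck": lambda lib: make_duck()}, "payload": None}

# gat=duck, then run gat(sam arg): mutate the core that ++duck *produces*.
gat = library["battery"]["duck"](library)
result = gat["battery"]["run"](mutate(gat, sam=42))

# duck(sam arg): the mutation lands on the core that *contains* ++duck,
# which has no sam, so it fails before the arm is ever run.
try:
    mutate(library, sam=42)
    wrong_site_fails = False
except KeyError:
    wrong_site_fails = True
```

In this model the failure shows up at run time as a `KeyError`; in Hoon, as the text says, the equivalent mistake is caught when the code is compiled.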
+ +And note that with `=<`, we've placed our library structurally +between the original subject and the program we're writing, +but lexically at the bottom with zero left margin. We also +clear the subject to keep things simple. + +## A more regular structure + +It actually gets worse. To make this code look simpler, we need +to make it more complex. While "function calls" actually fit +quite well into the Hoon architecture, they're also a nontrivial +synthetic construction. We'll build the desugared form the hard +way, then show you where we put the sugar in. + +The desugared canonical decrement: +``` +=< :- %say |= [* [[arg=@ud ~] ~]] :- %noun + =+ gat=duck + =<(run gat(sam arg)) +=> ~ +|% +++ duck + =+ sam=0 + |% + ++ run + =+ pre=0 + =< loop + |% + ++ loop + ?: =(sam +(pre)) + pre + loop(pre +(pre)) + -- + -- +-- + +~tasfyn-partyv:dojo/sandbox> +test 42 +41 +``` +Yuck. Okay, let's fix this. + +## Art of the loop + +First, look at our little `++loop`. It works just like our old +`++run` loop. We notice that there's actually something nice +about it: we don't use the symbol `loop` anywhere outside these 7 +lines of code. It's not exported at all. + +Actually, the name `loop` is useless and redundant. +Making up names is one of the hard problems in computer science, +so why solve it? For just this reason, Hoon has an *empty name*, +which as a constant is a zero-length symbol (`%$` instead of +`%foo`), and as a limb is the `buc` symbol (`$`). With `$`, +our loop becomes: +``` +=< $ +|% +++ $ + ?: =(sam +(pre)) + pre + $(pre +(pre)) +-- +``` +This may not seem like a huge improvement. It's not. But it's +exactly equivalent to the synthetic rune `|-`, "bardas": +``` +|- ?: =(sam +(pre)) + pre + $(pre +(pre)) +``` +This is obviously the canonical Hoon loop.
It leaves us with +``` +=< :- %say |= [* [[arg=@ud ~] ~]] :- %noun + =+ gat=duck + =<(run gat(sam arg)) +=> ~ +|% +++ duck + =+ sam=0 + |% + ++ run + =+ pre=0 + |- ?: =(sam +(pre)) + pre + $(pre +(pre)) + -- +-- + +~tasfyn-partyv:dojo/sandbox> +test 42 +41 +``` + +## Is this a lambda? + +Could we use `$` for `++run`? It certainly sounds like the same +kind of thing as `++loop` -- just a word we invented to mean "do +it." Should the programmer have to invent these kinds of words? +``` +=< :- %say |= [* [[arg=@ud ~] ~]] :- %noun + =+ gat=duck + =<($ gat(sam arg)) +=> ~ +|% +++ duck + =| sam=@ud + |% + ++ $ + =+ pre=0 + |- ?: =(sam +(pre)) + pre + $(pre +(pre)) + -- +-- + +~tasfyn-partyv:dojo/sandbox> +test 42 +41 +``` +(Besides renaming `run` to `$`, we changed `=+ sam=0` to `=| sam=@ud`. +Let's just remember that there's some magic here. We'll come +back and explain it later.) + +This is still kind of ugly -- but it's exactly equivalent to +``` +=< :- %say |= [* [[arg=@ud ~] ~]] :- %noun + =+ gat=duck + =<($ gat(sam arg)) +=> ~ +|% +++ duck + |= sam=@ud + =+ pre=0 + |- ?: =(sam +(pre)) + pre + $(pre +(pre)) +-- + +~tasfyn-partyv:dojo/sandbox> +test 42 +41 +``` +Doesn't that look like a function? Indeed, we're done with +`++duck` -- that's what a Hoon decrement should look like. +If you squint a little, `|=` ("bartis") might even be a strange, +deformed lambda rune. + +Since it's doing something simple, we might well even compress +the whole body of the function into one wide-form line: +``` +=+(pre=0 |-(?:(=(sam +(pre)) pre $(pre +(pre))))) +``` +(According, of course, to taste -- this is a bit tight for some.) + +## Gates and how to call them + +Our call site remains a disaster, though. We'll need moar sugar. + +But first, let's look at this lambda-thing we've made. What is +the noun produced by `++duck`? Our term for it is a "gate," but +nobody will hate you for saying "function." And while we "slam" +our gates, you can feel free to just "call" them.
+ +A gate is a core, of course, but a special kind of core. All +cores are shaped like `[battery payload]`. A gate is shaped like +`[formula [sample context]]`. A gate has one arm, `$`, so its +battery is just a formula. To slam a gate, you replace its +sample (`+6` or `+<`, "luslit" or "lust") with your own noun, +and apply the formula to the mutated gate. + +As we explained earlier, `duck(sam arg)` is not the right way to +mutate the gate we make with `duck`, because it's actually +trying to mutate the core we used to make `duck`. But there has +to be some sugar to do this, and there is: `%*`, "centar". We +can replace our call site with `%*($ duck sam arg)`. + +This is also not quite orthodox, because the whole point of a +gate is the canonical shape that defines a calling convention. +We can and should say: `%*($ duck +< arg)`. + +Unsurprisingly, this in turn is `%-(duck arg)` in regular form, +or `(duck arg)`