mirror of
https://github.com/urbit/shrub.git
synced 2024-12-14 20:02:51 +03:00
added some extremely rough hoon reference
parent
87f8bc901f
commit
5222da66ad
2546
main/pub/src/doc/ref/hoon/lexicon.md
Normal file
File diff suppressed because it is too large
54
main/pub/src/doc/ref/hoon/morphology.md
Normal file
@@ -0,0 +1,54 @@

Morphology
==========

Hoon is a statically typed language that compiles to Nock.

A type is a function whose domain is the set of all nouns and whose range is the set of all nouns that are members of that type.

The compilation process is as follows:

First, a runic expression is parsed into an abstract syntax tree, called a `twig`:

    expression => twig

A subject type is generated from the twig. This type describes the subject
of the Nock formula that the twig compiles to.

    twig => [subject-type twig]

The twig is then compiled into a Nock formula, and the type of the product of
the formula is inferred.

    [subject-type twig] => [product-type nock-formula]

As long as subject-type is a correct description of some subject, you can
take any twig and compile it against subject-type, producing a formula such
that

    *[subject formula]

is a product correctly described by product-type.

This works well enough that in Hoon there is no direct syntax for defining or
declaring a type. There is only a syntax for constructing twigs. Types are
always produced by inference.


Let's look at a simple example of the above process.


Hoon has 120 [XX count] digraph runes. The choice of glyph is not random. The
first glyph defines a semantic category (with some exceptions). These categories are:

    |  bar  core construction
    $  buc  tiles and tiling
    %  cen  invocations
    :  col  tuples
    .  dot  Nock operators
    ^  ket  type conversions
    ;  sem  miscellaneous macros
    ~  sig  hints
    =  tis  compositions
    ?  wut  conditionals, booleans, tests
    !  zap  special operations

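The category table above can be read as a lookup on a rune's first glyph. The following is a minimal sketch in Python rather than Hoon; the function name `rune_category` is hypothetical, and the table is copied from the list above. Runes whose first glyph is not listed (the "some exceptions") simply return `None`.

```python
from typing import Optional

# First glyph -> (glyph name, semantic category), copied from the table above.
RUNE_CATEGORIES = {
    "|": ("bar", "core construction"),
    "$": ("buc", "tiles and tiling"),
    "%": ("cen", "invocations"),
    ":": ("col", "tuples"),
    ".": ("dot", "Nock operators"),
    "^": ("ket", "type conversions"),
    ";": ("sem", "miscellaneous macros"),
    "~": ("sig", "hints"),
    "=": ("tis", "compositions"),
    "?": ("wut", "conditionals, booleans, tests"),
    "!": ("zap", "special operations"),
}


def rune_category(rune: str) -> Optional[str]:
    """Return the category implied by a digraph rune's first glyph, if any."""
    if len(rune) != 2:
        raise ValueError("a rune is exactly two glyphs")
    entry = RUNE_CATEGORIES.get(rune[0])
    # Unlisted first glyphs (for example '+' or '-') have no category here.
    return entry[1] if entry else None
```

For instance, `rune_category("|=")` gives the category for `bar`, and `rune_category("++")` gives `None` because `+` is not in the table.
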
100
main/pub/src/doc/ref/hoon/orthography.md
Normal file
@@ -0,0 +1,100 @@

Orthography: Consensus Aesthetic
==========

The Hoon compiler enforces the syntactic correctness of the language; it does
not, with some exceptions, enforce aesthetic standards. Many different styles
of Hoon are possible. However, given Hoon's runic syntax, it is remarkably easy
for the novice programmer to generate idiosyncratic, illegible code. Many other
languages that make heavy use of ASCII have a similar problem. Furthermore,
collaborative programming is made vastly easier by using a standard style
convention.

The Urbit source is written in a style of Hoon called the Consensus Aesthetic.
No patches to the Urbit source will be accepted unless they follow the Consensus Aesthetic.

The general rules of the Consensus Aesthetic are the following:

Character Restriction
---------------------

The horizontal tab character, \ht, ASCII 0x9, must never occur. This is
enforced by the compiler.

Lines and Comments
------------------

Lines must not exceed 80 columns in width and should not exceed 55 columns.

Blank lines (lines consisting entirely of whitespace) should not occur. For
visual separation of code, use empty comments.

Comments may appear on column 0, column 57 or inline at the same level of
indentation as the code.

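The character and line rules above are mechanical enough to check automatically. Here is a rough sketch of such a check in Python; it is not an existing Urbit tool, and the function name, message strings, and the split into "error" (must) versus "warning" (should) are assumptions made for illustration.

```python
from typing import List, Tuple

MAX_COLUMNS = 80        # "must not exceed 80 columns"
PREFERRED_COLUMNS = 55  # "should not exceed 55 columns"


def check_style(source: str) -> List[Tuple[int, str]]:
    """Return (line_number, message) pairs for each rule violation found."""
    problems = []
    for number, line in enumerate(source.splitlines(), start=1):
        if "\t" in line:
            problems.append((number, "error: horizontal tab"))
        if len(line) > MAX_COLUMNS:
            problems.append((number, "error: wider than 80 columns"))
        elif len(line) > PREFERRED_COLUMNS:
            problems.append((number, "warning: wider than 55 columns"))
        if line.strip() == "":
            # Whitespace-only lines count as blank under the rule above.
            problems.append((number, "warning: blank line; use an empty comment"))
    return problems
```

The comment-column rule is left out on purpose: deciding whether an inline comment sits "at the same level of indentation as the code" needs more context than a line-by-line scan has.
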
Indentation
-----------

Aesthetically, the act of programming is the act of formatting a big wall of
text. This canvas has a curious but essential property - it is indefinitely
tall, but finitely wide. The programmer's task as a visual designer is to
persuade code to flow down, not across.

The first law of Hoon indentation style is that all tall indentation is in
two-space increments. Single spaces are for wide forms only.

The second law of Hoon indentation is that everything in the kernel is good
indentation style. Or at least if it's not, it needs to be changed.

The third and most important law of Hoon indentation is that large twigs should
flow down and not across. Longer twigs should occur below shorter ones. Hoon
has several runes designed specifically to aid this task.

The right margin is a precious resource not to be wasted. It's this law, when
properly applied, that makes casual readers wonder if Hoon is a functional
language at all. It doesn't have a program counter, but it looks like it does -
at least when written right.

Naming Convention
-----------------

Names must follow one of the following naming conventions: Austere, Lapidary,
or Freehand.

In Austere Hoon, variables and arguments are named alphabetically with one
letter, a, b, c etc, in strict order of appearance in the text. This scheme is
only useful in the case of extremely regular and straightforward namespaces:
very short functions, for instance.

Austere arms must be gates or trays. Gate arms are three letters and try to
carry some mnemonic significance - for instance, ++dec. Tray arms are two
letters and try to resemble pronouns - for instance, ++by.

Austere structures must be short tuples, no wider than 5. The legs are named p,
q, r, s and/or t.

Conventional recursive structures use other standard names. The head of a list
is always i, the tail is always t. In a binary tree of nodes, the node is n,
the children l and r.

When in doubt, do not use Austere Hoon. In an ordinary context - not least
because Austere gates are easily mistaken for Lapidary variables - there should
be as few Austere arms as possible. And always remind yourself that Austere
Hoon makes it as hard as possible to refactor your code.

Lapidary Hoon is the ordinary style of most of Hoon and Arvo. In Lapidary mode,
variables, arguments, attributes, etc, are three-letter strings, usually
consonant-vowel-consonant, generally meaningless. If the same string is used
more than once in the same file, it should be used for the same concept in some
sense, as often happens spontaneously in cutting and pasting. It would be nice
to have an editor with a macro that generated random unique TLV strings
automatically.

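The editor macro wished for above is easy to approximate. The sketch below, in Python, generates random unused consonant-vowel-consonant names; the function name, the choice of `aeiou` as the vowel set, and the `taken` parameter are all assumptions for illustration, not part of any Urbit tooling.

```python
import random
from typing import Iterable, List, Optional

VOWELS = "aeiou"
CONSONANTS = "bcdfghjklmnpqrstvwxyz"


def lapidary_names(count: int,
                   taken: Iterable[str] = (),
                   rng: Optional[random.Random] = None) -> List[str]:
    """Return `count` distinct consonant-vowel-consonant names not in `taken`."""
    rng = rng or random.Random()
    used = set(taken)
    candidates = [
        first + vowel + last
        for first in CONSONANTS
        for vowel in VOWELS
        for last in CONSONANTS
        if first + vowel + last not in used
    ]
    if count > len(candidates):
        raise ValueError("not enough unused three-letter names")
    # sample() draws without replacement, so the names are unique.
    return rng.sample(candidates, count)
```

Passing the names already used in a file as `taken` keeps the rule that one string stands for one concept; passing a seeded `random.Random` makes the output repeatable.
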
Lapidary arms are always four letters. They may or may not be English words,
which may or may not mean anything relevant.

In Freehand Hoon, do whatever you want. Note that while uppercase is not
permitted in a symbol, `-` is, suggesting a generally Lisp-like state of gross
hyphenated disorder. F-mode is best used for top-layer software which nothing
else is based on; prototyping and casual coding; etc. Freehand Hoon is not an
acceptable style for any code in the Urbit source proper, and is discouraged
for production applications.

85
main/pub/src/doc/ref/hoon/philosophy.md
Normal file
@@ -0,0 +1,85 @@

Philosophy
==========

Hoon is a higher-order typed functional language that compiles itself to
Nock in 3400 lines of Hoon. If this number is accurate (it is), Hoon is very
expressive, or very simple, or both. (It's both.) The bad news is that it
really has nothing at all in common, either syntactically or semantically, with
anything you've used before.

By understanding the Nock tutorial, you've actually come closer than you realize to
knowing Hoon. Hoon is actually not much more than a fancy wrapper around Nock.
People who know C can think of Hoon as the C to Urbit's Nock - just a
sprinkling of syntax, wrapped around machine code and memory.

For instance, it's easy to imagine how instead of calculating tree axes by
hand, we could actually assign names to different parts of the tree - and those
names would stay the same as we pushed more data on the subject.

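To make the idea of named axes concrete, here is a small Python sketch. It models nouns as Python integers (atoms) and 2-tuples (cells), and keeps a separate tree of names with the same shape as the subject. This is an illustration only: `slot`, `axis_of`, and the name tree are hypothetical and are not how the Hoon compiler represents types.

```python
def slot(axis, noun):
    """Fetch the sub-noun at a tree axis: 1 is the whole noun,
    2n is the head of axis n, and 2n+1 is the tail of axis n."""
    if axis < 1:
        raise ValueError("axes start at 1")
    if axis == 1:
        return noun
    parent = slot(axis // 2, noun)
    if not isinstance(parent, tuple):
        raise ValueError("axis reaches below an atom")
    head, tail = parent
    return head if axis % 2 == 0 else tail


def axis_of(name, names, axis=1):
    """Find the axis at which `name` sits in a tree of names, or None."""
    if names == name:
        return axis
    if isinstance(names, tuple):
        head, tail = names
        return (axis_of(name, head, 2 * axis)
                or axis_of(name, tail, 2 * axis + 1))
    return None


# A subject [3 4] whose legs are named a and b.
subject, names = (3, 4), ("a", "b")
# Push a new value named c onto the subject: [5 [3 4]].
pushed_subject, pushed_names = (5, subject), ("c", names)
```

Before the push, `a` sits at axis 2; afterwards it sits at axis 6. Code that asks for `a` by name keeps working, which is the point of the paragraph above: only the bookkeeping changes, not the program.
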
The way we're going to do this is by associating something called a type with
the subject. You may have heard of types before. Technically, Hoon is a
statically typed language, which just means that the type isn't a part of your
program: it's just a piece of data the compiler keeps around as it turns your
Hoon into Nock.

A lot of other languages use dynamic types, in which the type of a value is
carried along with the data as you use it. Even languages like Lisp, which are
nominally typeless, look rather typed from the Hoon perspective. For example, a
Lisp atom knows dynamically whether it's a symbol or an integer. A Hoon atom is
just a Nock atom, which is just a number. So without a static type, Hoon
doesn't even know how to print an atom properly.

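The printing problem is easy to demonstrate outside Hoon. In the Python sketch below, one and the same number is rendered three ways; nothing in the number says which rendering is the intended one. The function names are hypothetical, the hex output uses Python's formatting rather than Hoon's, and the text rendering assumes characters are packed least-significant byte first.

```python
def as_decimal(atom: int) -> str:
    """Render the atom as an ordinary decimal number."""
    return str(atom)


def as_hex(atom: int) -> str:
    """Render the atom in hexadecimal (Python's notation, not Hoon's)."""
    return hex(atom)


def as_text(atom: int) -> str:
    """Render the atom as ASCII text, least-significant byte first."""
    length = (atom.bit_length() + 7) // 8
    return atom.to_bytes(length, "little").decode("ascii")


# The same atom, built from the bytes of "foo".
atom = int.from_bytes(b"foo", "little")
```

Here `as_decimal(atom)`, `as_hex(atom)`, and `as_text(atom)` all succeed and all disagree about what the value "looks like". Picking among them is the job the static type does at compile time.
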
Most higher-order typed languages, Haskell and ML being prominent examples, use
something called the Hindley-Milner unification algorithm. Hoon uses its own
special sauce instead.

Why? There are two obvious problems with Hindley-Milner as a functional type
system, the main one being the wall of heavy mathematics that greets you
instantly when you google it. We have heard some claims that Hindley-Milner is
actually quite simple. We urge all such claimants to hie themselves to its
Wikipedia page, which they'll surely be able to relieve of its present alarming
resemblance to some string-theory paper in Physical Review D.

Nor is this in any way an anti-academic stance. Quite the contrary.
Frankly, OS guys really quite seldom find themselves in the math-department
lounge, cadging stray grants by shamelessly misrepresenting the CAP theorem as
a result in mathematics. It doesn't seem too much to expect the mathematicians
to reciprocate this basic academic courtesy.

Furthermore, besides the drawback that Hindley-Milner reeks of math and
programmers who love math are about as common as cats who love a bath - a
problem, but really only a marketing problem - Hindley-Milner has a genuine
product problem as well. It's too powerful.

Specifically, Hindley-Milner reasons both forward with evaluation, and backward
from constraints. Pretty unavoidable in any sort of unification algorithm,
obviously. But since the compiler has to think both forward and backward, and
the programmer has to predict what the compiler will do, the programmer has to
think backward as well.

Hoon's philosophy is that a language is a UI for programmers, and the basic
test of a UI is to be easy to use. It is impossible (for most programmers) to
learn a language properly unless they know what the compiler is doing, which in
practice means mentally stepping through the algorithms it uses (with the
exception of semantically neutral optimizations). Haskell is a hard language to
learn (for most programmers) because it's hard (for most programmers) to follow
what the Haskell compiler is thinking.

It's true that some programmers have an effective mathematical intuition that
lets them "see" algorithms without working through them step by step. But this
is a rare talent, we feel. And even those who have a talent don't always enjoy
exercising it.

If a thorough understanding of any language demands high-grade mathematical
intuition in its programmers, the language as a UI is like a doorway that makes
you duck if you're over 6 feet tall. The only reason to build such a doorway in
your castle is if you and all your friends are short, and only your enemies are
tall. Is this really the case here?

Although an inference algorithm that reasons only forward must and does require
a few more annotations from the programmer, the small extra burden on her
fingers is more than offset by the lighter load on her hippocampus.
Furthermore, programs also exist to be read. The modern code monkey is above
all things a replaceable part, and some of these annotations (which a smarter
algorithm might infer by steam) may annoy the actual author of the code but be
a lifesaver for her replacement.

226
main/pub/src/doc/ref/hoon/phonology.md
Normal file
@@ -0,0 +1,226 @@

Phonology
=========

Glyphs
------

Hoon is a keyword-free language - any alphanumeric text in the program is part
of the program. Where other languages have reserved words, Hoon syntax uses
ASCII symbols, or glyphs. In normal English, many of these glyphs have
cumbersome multisyllabic names. As Hoon uses these glyphs heavily, it has its
own, more concise, naming scheme for them:

    ace  space      gal  <          per  )
    bar  |          gar  >          sel  [
    bas  \          hax  #          sem  ;
    buc  $          hep  -          ser  ]
    cab  _          kel  {          sig  ~
    cen  %          ker  }          soq  '
    col  :          ket  ^          tar  *
    com  ,          lus  +          tec  `
    doq  "          pam  &          tis  =
    dot  .          pat  @          wut  ?
    fas  /          pel  (          zap  !


A language is meant to be spoken. Even a programming language. Studies have
shown that even when we read silently, we activate the motor cortex that
controls our vocal cords. Even if we never speak these symbols, they're easier
to think if bound to simple sounds.

Mnemonic aids for memorizing the above glyphs can be found in the comments of section 2eF of the Urbit source, which is reprinted here:

```
::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
::                section 2eF, parsing (ascii)          ::
::
++  ace  (just ' ')                                     ::  spACE
++  bar  (just '|')                                     ::  vertical BAR
++  bas  (just '\\')                                    ::  Back Slash (escaped)
++  buc  (just '$')                                     ::  dollars BUCks
++  cab  (just '_')                                     ::  CABoose
++  cen  (just '%')                                     ::  perCENt
++  col  (just ':')                                     ::  COLon
++  com  (just ',')                                     ::  COMma
++  doq  (just '"')                                     ::  Double Quote
++  dot  (just '.')                                     ::  dot dot dot ...
++  fas  (just '/')                                     ::  Forward Slash
++  gal  (just '<')                                     ::  Greater Left
++  gar  (just '>')                                     ::  Greater Right
++  hax  (just '#')                                     ::  Hash
++  kel  (just '{')                                     ::  Curly Left
++  ker  (just '}')                                     ::  Curly Right
++  ket  (just '^')                                     ::  CareT
++  lus  (just '+')                                     ::  pLUS
++  hep  (just '-')                                     ::  HyPhen
++  pel  (just '(')                                     ::  Paren Left
++  pam  (just '&')                                     ::  AMPersand pampersand
++  per  (just ')')                                     ::  Paren Right
++  pat  (just '@')                                     ::  AT pat
++  sel  (just '[')                                     ::  Square Left
++  sem  (just ';')                                     ::  SEMicolon
++  ser  (just ']')                                     ::  Square Right
++  sig  (just '~')                                     ::  SIGnature squiggle
++  soq  (just '\'')                                    ::  Single Quote
++  tar  (just '*')                                     ::  sTAR
++  tec  (just '`')                                     ::  backTiCk
++  tis  (just '=')                                     ::  'tis tis, it is
++  wut  (just '?')                                     ::  wut, what?
++  zap  (just '!')                                     ::  zap! bang! crash!!
::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
```

Digraph Glyphs: Runes
--------------------

The fundamental building block of Hoon is the digraph glyph, or rune. The choice of glyph is not random. The first glyph defines a semantic category. That is, all runes whose first glyph is `|` (`bar`) are conceptually related. See Morphology for details.

To pronounce a rune, concatenate the glyph names, stressing the first syllable
and softening the second vowel into a "schwa." Hence, to say `~&`, say
"sigpam." To say `|=`, say "bartis."

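The concatenation rule is simple enough to write down as code. This Python sketch copies the glyph names from the table above and joins them; the function name `pronounce` is hypothetical, and stress and vowel softening are left to the speaker.

```python
# Glyph -> name, copied from the glyph table above.
GLYPH_NAMES = {
    " ": "ace", "|": "bar", "\\": "bas", "$": "buc", "_": "cab",
    "%": "cen", ":": "col", ",": "com", '"': "doq", ".": "dot",
    "/": "fas", "<": "gal", ">": "gar", "#": "hax", "-": "hep",
    "{": "kel", "}": "ker", "^": "ket", "+": "lus", "&": "pam",
    "@": "pat", "(": "pel", ")": "per", "[": "sel", ";": "sem",
    "]": "ser", "~": "sig", "'": "soq", "*": "tar", "`": "tec",
    "=": "tis", "?": "wut", "!": "zap",
}


def pronounce(rune: str) -> str:
    """Spell out a rune by concatenating the names of its glyphs."""
    return "".join(GLYPH_NAMES[glyph] for glyph in rune)
```

For example, `pronounce("~&")` yields `sigpam` and `pronounce("|=")` yields `bartis`, matching the two cases given in the text.
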
Punctuation Runes
----------------

The following runes are used as punctuation in Tall Form Hoon (See Syntax for details) and have mandatory special pronunciation:

    --    hephep    phep
    +-    lushep    slep
    ++    luslus    slus
    ==    tistis    stet

Wing Runes
---------

The following runes are used to access specific axes or wings in a noun. See Morphology. They have optional alternate phonology.

    +<    lusgal    glus
    +>    lusgar    gras
    -<    hepgal    gelp
    ->    hepgar    garp

Tile Runes
---------

The following runes comprise the set of "Tile Runes" and are used to generate
complex types (See Morphology for details). They have an optional alternate
phonology, which describes the tile they generate:

    $%    buccen    kelp
    $^    bucket    herb
    $:    buccol    tile
    $=    buctis    bark
    $&    bucpam    bush
    $?    bucwut    fern
    $|    bucbar    reed

The following glyphs are not runes, but are commonly used with tile runes to specify basic types. (See Morphology for details). In context, they have an optional alternate phonology:

    @     "atom"
    ^     "cell"
    *     "noun"
    ?     "bean"
    ~     "null"

Irregular Runes
--------------

The following glyphs have optional special pronunciation when they appear as
the irregular form of certain digraph runes. It is perfectly acceptable to
pronounce the characters, but some may find the alternate phonology useful,
especially in cases where multiple irregular forms occur in sequence.

    Irregular    Regular       Pronunciation

    ,p           $,(p)         "clam p"
    _p           $_(p)         "bunt p"
    p@q          $@(p q)       "whip p into q"
    !p           ?!(p)         "NOT p"
    &(p q)       ?&(p q)       "AND p q"
    |(p q)       ?|(p q)       "OR p q"
    ?(p q)       $?(p q)       "fern p q"
    `p`q         ^-(p q)       "cast p q"
    p=q          ^=(p q)       "p is q"
    ~[p q]       :~(p q)       "list p q"
    `[p]         [~ p]         "unit p"
    p^q          [p q]         "cell p q"
    [p q]        :*(p q)       "cell p q"
    +(p)         .+(p)         "bump p"
    =(p q)       .=(p q)       "equals p q"
    p:q          =<(p q)       "p of q"
    p(q r)       %=(p q r)     "toss p q r"
    (p q)        %-(p q)       "slam p q"
    ~(p q r)     %~(p q r)     "slug p q r"

Nouns
-----

Some nouns also have an alternate phonology:


    &        "yes"
    %&       "yes"
    %.y      "yes"

    |        "no"
    %|       "no"
    %.n      "no"

    42       "forty-two"
    0i42     "dec four two"
    0x2e     "hex two e"
    0b10     "bin one zero"
    0v3t     "base thirty two three t"
    0wA4     "base sixty-four big a four"

    'foo'    "cord foo"
    "foo"    "tape foo"


Example
-------

Take the following snippet of Hoon:

    ++  dec                                                 ::  decrement
      ~/  %dec
      |=  a=@
      ~|  %decrement-underflow
      ?<  =(0 a)
      =+  b=0
      |-  ^-  @
      ?:  =(a +(b))
        b
      $(b +(b))

Omitting the spaces and comments (which only a real purist would include), the
above is pronounced:

    slus dec
      sigfas cen-dec
      bartis A tis pat
      sigbar cen-decrement-underflow
      wutgal tis zero A
      tislus B tis zero
      barhep kethep pat
      wutcol tis A lus B
        B
      buc B lus B

Or using the alternate phonology:

    slus dec
      sigfas cen-dec
      bartis A is atom
      sigbar cen-decrement-underflow
      wutgal equals zero A
      tislus B is zero
      barhep kethep atom
      wutcol equals A lus B
        B
      buc B lus B

Which is very similar. The alternate phonology exists as a result of common
speech patterns observed amongst Hoon programmers in the wild. In any language
actually spoken by actual humans, laziness soon rounds off any rough edges.

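For readers who can pronounce `++dec` but not yet read it, here is a loose Python transliteration of the snippet in the Example section. It is a sketch of the same counting-up algorithm, not a translation of Hoon semantics: the hint line has no counterpart, and the exception type is an assumption.

```python
def dec(a: int) -> int:
    """Decrement a natural number by counting up from zero."""
    if a < 0:
        raise ValueError("atoms are non-negative")
    if a == 0:
        # Plays the role of the assertion labelled %decrement-underflow.
        raise ValueError("decrement-underflow")
    b = 0                  # =+  b=0
    while a != b + 1:      # ?:  =(a +(b))
        b = b + 1          # $(b +(b)): loop again with b bumped
    return b               # b
```

The loop exists because the only arithmetic the snippet uses is the increment `+(b)`; decrement has to be found by searching upward for the number whose successor is `a`.
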
130
main/pub/src/doc/ref/hoon/syntax.md
Normal file
@@ -0,0 +1,130 @@

Syntax
======


Syntax: Twigs
------------

A twig is an abstract syntax tree (or AST). Everything the Hoon programmer types is parsed into a twig.

##Noun

All constant data in Hoon is
%dtzy
%dtzz


##Runes




Tall forms

When all is said and done, the programmer is formatting a big wall of text. This canvas has a curious but essential property - it is indefinitely tall, but finitely wide. We strongly encourage an 80-column standard.

So the programmer's task as a visual designer is to persuade her code to flow down, not across. The usual way to lay out a tree which does not fit on one line is to indent the subtrees and enclose them in parens, brackets or braces, which might look like this (not Hoon syntax):

    ?: {
      &
      47
      52
    }

Hoon, like other functional languages, has very deep expression trees. In this simple, classic syntax model, a functional language develops huge piles of closing parens at the end of large blocks, which is manageable but ugly. Less manageably, as each subtree is indented to the right, the width of the text window bounds the depth of the expression tree.

Other languages skip the braces and parse whitespace, using indentation to express tree depth. This is actually valid (but ugly) Hoon:

    ?:
      &
      47
      52

This gets rid of the terminator problem, but keeps the width problem. And parsing whitespace is horrible. Indentation depth is not significant in Hoon, though the presence or absence of whitespace is. (Note also that hard TAB characters are strictly forbidden.)

Hoon notices a couple of things about this problem. First, most Hoon twigs have small constant fanout. A parser shouldn't need either significant whitespace or a terminator to figure out how many twigs follow ?: - the answer is always 3.

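The constant-fanout observation can be shown with a toy parser. The Python sketch below is not the Hoon parser: it knows only three runes, takes their fanout from the examples in this document, and splits on any whitespace, so it ignores the one-space versus two-space distinction that real Hoon cares about. All names are hypothetical.

```python
from typing import List, Tuple, Union

Twig = Union[str, tuple]

# Fanout per rune, as seen in this document's examples only.
ARITY = {"?:": 3, "?.": 3, "=+": 2}


def _parse(tokens: List[str]) -> Tuple[Twig, List[str]]:
    """Parse one twig off the front of `tokens`; return it and the rest."""
    if not tokens:
        raise ValueError("unexpected end of input")
    head, rest = tokens[0], tokens[1:]
    if head not in ARITY:
        return head, rest          # anything else is treated as a leaf
    children = []
    for _ in range(ARITY[head]):   # the rune alone says how many follow
        child, rest = _parse(rest)
        children.append(child)
    return (head, *children), rest


def parse(text: str) -> Twig:
    """Parse whitespace-separated tokens into a nested tuple twig."""
    twig, rest = _parse(text.split())
    if rest:
        raise ValueError("trailing tokens after a complete twig")
    return twig
```

Because the rune fixes the number of children, `parse("?: & 47 52")` and the same tokens spread over several indented lines produce the same tree, with no closing bracket and no attention paid to indentation.
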
Second, our goal is to descend into a deep tree without losing right margin. With the backstep pattern

    ?:  &
      47
    52

or

    =+  a=3
    b

we step two spaces backward at each subtwig, till the last one is at the same indentation as its parent.

This preserves your right margin in one and only one case - where the bottom twig is the heaviest. For example, if we write

    ?:  &
      47
    ?:  |
      52
    ?:  &
      97
    =+  35
    b

we see a tree that flows neatly down the screen. It's obviously much nicer than, say (not Hoon syntax):

    ?: {
      &
      47
      ?: {
        |
        52
        ?: {
          &
          97
          =+ {
            35
            b
          }
        }
      }
    }

or any similar abortion. But its downward flow depends on the coincidence of the bottom twig being the heavy one:

    ?:  &
      ?:  |
        52
      ?:  &
        97
      =+  35
      b
    47

To handle this, Hoon has a reasonable selection of reverse hoons, which have the same semantics but inverse order. For instance, if ?: is "if," ?. (wutdot) is "unless":

    ?.  &
      47
    ?:  |
      52
    ?:  &
      97
    =+  35
    b

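The swap from `?:` to `?.` can be described as a small tree rewrite: if the middle branch is heavier than the last one, exchange the two branches and flip the rune. The Python sketch below works on nested tuples such as `("?:", test, yes, no)`; the representation, the node-count notion of "weight", and the function names are assumptions for illustration.

```python
def weight(twig) -> int:
    """Count the nodes in a twig; leaves weigh 1."""
    if isinstance(twig, tuple):
        return 1 + sum(weight(child) for child in twig[1:])
    return 1


def flip(twig):
    """Turn ?: into ?. (or back) by swapping its two branches."""
    rune, test, second, third = twig
    other = {"?:": "?.", "?.": "?:"}[rune]
    return (other, test, third, second)


def heaviest_last(twig):
    """Flip a conditional when its middle branch outweighs its last one."""
    is_conditional = isinstance(twig, tuple) and twig[0] in ("?:", "?.")
    if is_conditional and weight(twig[2]) > weight(twig[3]):
        return flip(twig)
    return twig
```

Applied to the top-heavy example above, where the nested conditional sits in the middle and `47` trails at the bottom, `heaviest_last` produces the `?.` form with `47` first and the heavy twig last, so the backstep layout flows down again.
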
Wide forms

Observe that in the tall syntax, there are always at least two spaces (or one newline) between tokens. Other than this, nothing requires anything to be tall. For instance, it is normal and only slightly aggressive to write:

    ?.  &  47
    ?:  |  52
    ?:  &  97
    =+  35
    b

But we could even go so far as:

    ?.  &  47  ?:  |  52  ?:  &  97  =+  35  b

Few would find this readable, which is why Hoon also has a wide syntax:

    ?.(& 47 ?:(| 52 ?:(& 97 =+(35 b))))

On a single line, the parentheses - while a parser could get away with skipping them - are needed to actually read the expression. The hoon attaches directly to the left paren (pel), and a double space or a newline inside a wide form is a syntax error.

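Wide form is regular enough that printing it from a tree takes a few lines. This Python sketch renders a nested-tuple twig, for example `("=+", "35", "b")`, in the wide syntax described above; the tuple representation and the name `to_wide` are assumptions, and irregular forms are not handled.

```python
def to_wide(twig) -> str:
    """Render a nested-tuple twig in wide form: rune(child child ...)."""
    if isinstance(twig, tuple):
        rune, *children = twig
        # The rune attaches directly to the left paren; children are
        # separated by exactly one space.
        return rune + "(" + " ".join(to_wide(child) for child in children) + ")"
    return twig
```

Going the other way, from a tree to well laid out tall form, is the harder direction, since it involves choosing where to break lines and when to backstep.
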
The semantics of tall and wide syntax are identical, of course. The choice is entirely up to the programmer. Some languages can be formatted automatically, but turning an abstract syntax tree into a tall, handsome Hoon file is an art form. We won't say a program could never do it - but it'd be work.

Wide forms are also nice because our immature and incomplete command-line shell can't process multi-line input.

Irregular forms

For a very large set of primitives, neither tall nor wide form is tight enough. If you go to ++scat in hoon.hoon, you can see them all, organized by initial character.

This isn't the place to go over the irregular forms directly - we'll introduce them when we talk about individual runes, or when we run into them and we can't go around.