Post content thus far.

Sigilante 2022-06-23 16:14:32 -05:00
parent 4ef41dbd4c
commit 8c9934692c
128 changed files with 33265 additions and 0 deletions


@ -0,0 +1,21 @@
+++
title = "Guides"
weight = 20
sort_by = "weight"
template = "sections/docs/chapters.html"
insert_anchor_links = "right"
+++
## [Writing Aqua Tests](/docs/hoon/guides/aqua)
## [CLI Apps](/docs/hoon/guides/cli-tutorial)
## [Parsing](/docs/hoon/guides/parsing)
## [Strings](/docs/hoon/guides/strings)
## [JSON](/docs/hoon/guides/json-guide)
## [Sail](/docs/hoon/guides/sail)
## [Unit tests](/docs/hoon/guides/unit-tests)


@ -0,0 +1,80 @@
+++
title = "Writing Aqua Tests"
weight = 37
template = "doc.html"
+++
# Concepts
Aqua (short for "aquarium", alluding to the idea that you're running
multiple ships in a safe, artificial environment and watching them
carefully) is an app that lets you run one or more virtual ships from
within a single host.
pH is a library of functions designed to make it easy to write
integration tests using Aqua.
# First test
To run your first pH test, run the following commands:
```
|start %aqua
:aqua +solid
-ph-add
```
This will start Aqua, compile a new kernel for it, and then compile and
run /ted/ph/add.hoon. Here are the contents of that file:
```
/-  spider
/+  *ph-io
=,  strand=strand:spider
^-  thread:spider
|=  args=vase
=/  m  (strand ,vase)
;<  ~  bind:m  start-simple
;<  ~  bind:m  (raw-ship ~bud ~)
;<  ~  bind:m  (dojo ~bud "[%test-result (add 2 3)]")
;<  ~  bind:m  (wait-for-output ~bud "[%test-result 5]")
;<  ~  bind:m  end-simple
(pure:m *vase)
```
There are a few lines of boilerplate, with three important lines defining
the test.
```
;<  ~  bind:m  (raw-ship ~bud ~)
;<  ~  bind:m  (dojo ~bud "[%test-result (add 2 3)]")
;<  ~  bind:m  (wait-for-output ~bud "[%test-result 5]")
```
We boot a ship with `+raw-ship`. In this case the ship we are booting will be `~bud`. These ships exist in a virtual environment, so you could use any valid `@p`.
Next we enter some commands with `+dojo`, and then we wait until we get a line that includes some expected output. For each of these commands, we need to specify the ship we want it to run on.
Many tests can be created with nothing more than these simple tools.
Try starting two ships and having one send a `|hi` to the other, and
check that it arrives.
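
One way to approach that exercise is sketched below, reusing only the `+raw-ship`, `+dojo` and `+wait-for-output` functions shown above. The second ship name `~dev` is an arbitrary choice, and the success string passed to `+wait-for-output` is an assumption about what the dojo prints, so adjust it to match the output you actually see.

```hoon
/-  spider
/+  *ph-io
=,  strand=strand:spider
^-  thread:spider
|=  args=vase
=/  m  (strand ,vase)
;<  ~  bind:m  start-simple
::  boot two virtual ships
;<  ~  bind:m  (raw-ship ~bud ~)
;<  ~  bind:m  (raw-ship ~dev ~)
::  send the |hi from ~bud to ~dev
;<  ~  bind:m  (dojo ~bud "|hi ~dev")
::  assumed success message; change it if your dojo prints something else
;<  ~  bind:m  (wait-for-output ~bud "hi ~dev successful")
;<  ~  bind:m  end-simple
(pure:m *vase)
```
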
Many more complex tests can be created, including file changes, personal
breaches, mock http clients or servers, or anything you can imagine.
Check out `/lib/ph/io.hoon` for other available functions, and look at
other tests in `/ted/ph/` for inspiration.
# Reference
Aqua has the following commands:
`:aqua +solid` Compiles a "pill" (kernel) for the guest ships and loads it into Aqua.
`:aqua [%swap-files ~]` Modifies the pill to use the files you have in
your filesystem without rebuilding the whole pill. For example, if you
change an app and you want to test the new version, you must install it
in the pill. This command will do that.
`:aqua [%swap-vanes ~[%a]]` Modifies the pill to load a new version of a
vane (`%a` == Ames in this example, but it can be any list of vanes).
This is faster than running `:aqua +solid`.


@ -0,0 +1,440 @@
+++
title = "CLI apps"
weight = 2
template = "doc.html"
+++
## Introduction
In this walkthrough we will go in-depth on how to build command line interface (CLI)
applications in Urbit using the `shoe` library.
There are three CLI apps that currently ship with Urbit - `%dojo`, `%chat-cli`,
and `%shoe`. You should be familiar with the first two; the last is an
example app that shows off how the `shoe` library works, and we will be looking
at it closely. These are all Gall apps, and their source can be found in the `app/`
folder of your `%base` desk.
In [the `shoe` library](#the-shoe-library) we take a closer look at the `shoe` library and its
cores and how they are utilized in CLI apps. Then in [the `sole`
library](#the-sole-library) we look at what `shoe` effects ultimately break down
into. Finally in [`%shoe` app walkthrough](#shoe-app-walkthrough) we explore
the functionality of the `%shoe` app and then go through the code line-by-line.
This tutorial can be
considered to be an application equivalent of the [Hoon school
lesson](/docs/hoon/hoon-school/generators#ask) on `sole` and `%ask`
generators, which only covers the bare minimum necessary to write generators
that take user input.
## The `shoe` library {% #the-shoe-library %}
Here we describe how sessions are identified, the specialized `card`s that Gall agents
with the `shoe` library are able to utilize, and the different cores of `/lib/shoe.hoon` and their purpose.
### Session identifiers
An app using the `shoe` library will automatically track sessions by their `sole-id=@ta`.
These are opaque identifiers generated by the connecting client. An app using the `shoe`
library may be connected to by a local or remote ship in order to send commands,
and each of these connections is assigned a unique `@ta` that identifies the
ship and which session on that ship if there are multiple.
### `%shoe` `card`s
Gall agents with the `shoe` library are able to utilize `%shoe` `card`s. These
additions to the standard set of `card`s have the following shape:
```hoon
[%shoe sole-ids=(list @ta) effect=shoe-effect]
```
`sole-ids` is the `list` of session ids that the following `effect` is
emitted to. An empty `sole-ids` sends the effect to all connected sessions.
`shoe-effect`s, for now, are always of the shape `[%sole effect=sole-effect]`, where
`sole-effect`s are basic console events such as displaying text, changing the
prompt, beeping, etc. These are described in the section on the [`sole` library](#the-sole-library).
For example, a `%shoe` `card` that causes all connected sessions to beep would
be `[%shoe ~ %sole %bel ~]`.
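
As a further sketch of the same shape, the two cards below print a line of text, first to every session and then to a single one. The `%txt` effect is described in the `sole` library section; the face `sole-id` is assumed to be a `@ta` session identifier already in scope (for example, the argument of one of the `shoe` arms), and the message text is purely illustrative.

```hoon
::  print a line of text in every connected session
[%shoe ~ %sole %txt "hello, everyone"]
::  print a line of text only in the session identified by sole-id
::  (sole-id is assumed to be a @ta that is already in scope)
[%shoe ~[sole-id] %sole %txt "hello, you"]
```
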
### `shoe` core
An iron (contravariant) door that defines an interface for Gall agents utilizing
the `shoe` library. Use this core whenever you want to receive input from the user and run a command. The input is
put through the parser (`+command-parser`), producing a noun of
`command-type` that the underlying application specifies, which `shoe` then feeds back into the underlying app as an `+on-command` callback.
In addition to the ten arms that all Gall agent cores possess, `+shoe` defines and expects a few
more, tailored to common CLI logic. Thus you will need to wrap the `shoe:shoe` core using the `agent:shoe` function to obtain a standard
10-arm Gall agent core. See the [`%shoe` app
walkthrough](#shoe-app-walkthrough) for how to do this.
The additional arms are described below. The Hoon code shows their expected type signature. As we'll see [later](#shoe-app-walkthrough), the `command-type` can differ per application. Note also that most of these take a session identifier as an argument. This lets applications provide different users (at potentially different "places" within the application) with different affordances.
#### `+command-parser`
```hoon
++  command-parser
  |~  sole-id=@ta
  |~(nail *(like [? command-type]))
```
Input parser for a specific command-line session. Will be run on whatever the user tries to input into the command prompt, and won't let them type anything that doesn't parse. If the head of the result is true,
instantly run the command. If it's false, require the user to press return.
#### `+tab-list`
```hoon
++  tab-list
  |~  sole-id=@ta
  *(list (option:auto tank))
```
Autocomplete options for the command-line session (to match `+command-parser`).
#### `+on-command`
```hoon
++  on-command
  |~  [sole-id=@ta command=command-type]
  *(quip card _this)
```
Called when a valid command is run.
#### `+can-connect`
```hoon
++  can-connect
  |~  sole-id=@ta
  *?
```
Called to determine whether a session may be opened or connected to. For example, you
may only want the local ship to be able to connect.
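
A minimal sketch of the "local ship only" case mentioned above, assuming the arm lives in an agent door whose sample is `bowl` (as in the `%shoe` example app later in this guide):

```hoon
++  can-connect
  |=  sole-id=@ta
  ^-  ?
  ::  allow the session only if the request comes from our own ship
  =(our.bowl src.bowl)
```
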
#### `+on-connect`
```hoon
++  on-connect
  |~  sole-id=@ta
  *(quip card _^|(..on-init))
```
Called when a session is opened or connected to.
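
For illustration, here is a hedged sketch of an `+on-connect` that greets the newly connected session. It assumes `card` and `this` are defined as in the `%shoe` example app later in this guide; the greeting text is made up.

```hoon
++  on-connect
  |=  sole-id=@ta
  ^-  (quip card _this)
  ::  emit one %shoe card addressed only to the session that just connected
  :_  this
  [%shoe ~[sole-id] %sole %txt "welcome"]~
```
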
#### `+on-disconnect`
```hoon
++  on-disconnect
  |~  sole-id=@ta
  *(quip card _^|(..on-init))
```
Called when a previously made session gets disconnected from.
### `default` core
This core contains the bare minimum implementation of the additional `shoe` arms
beyond the 10 standard Gall app arms. It is used
analogously to how the `default-agent` core is used for regular Gall apps.
### `agent` core
This is a function for wrapping a `shoe` core, which has too many
arms to be a valid Gall agent core. This turns it into a standard Gall agent core by
integrating the additional arms into the standard ones.
## The `sole` library {% #the-sole-library %}
`shoe` apps may create specialized `card`s of the `[%shoe (list @ta) shoe-effect]` shape, where `shoe-effect`s currently just wrap `sole-effect`s, i.e. instructions for displaying text and producing other effects in the console.
The list of possible `sole-effect`s can be found in `/sur/sole.hoon`. A few
commonly used ones are as follows.
- `[%txt tape]` is used to display a line of text.
- `[%bel ~]` is used to emit a beep.
- `[%pro sole-prompt]` is used to set the prompt.
- `[%mor (list sole-effect)]` is used to emit multiple effects.
For example, a `sole-effect` that beeps and displays `This is some text.` would
be structured as
```hoon
[%mor [%txt "This is some text."] [%bel ~] ~]
```
## `%shoe` app walkthrough {% #shoe-app-walkthrough %}
Here we explore the capabilities of the `%shoe` example app and then go through
the code, explaining what each line does.
### Playing with `%shoe`
First let's test the functionality of `%shoe` so we know what we're getting
into.
Start two fake ships, one named `~zod` and the other with any name you like - we will
go with `~nus`. Fake ships run locally are able to see each other, and our
intention is to connect their `%shoe` apps.
On each fake ship start `%shoe` by entering `|start %shoe` into dojo. This will
automatically
change the prompt to `~zod:shoe>` and `~nus:shoe>`. Type `demo` and watch the following appear:
```
~zod ran the command
~zod:shoe>
```
`~zod ran the command` should be displayed in bold green text, signifying that
the command originated locally.
Now we will connect the sessions. Switch `~zod` back to dojo with `Ctrl-X` and enter `|link ~nus %shoe`. If this succeeds you will see the following.
```
>=
; ~nus is your neighbor
[linked to [p=~nus q=%shoe]]
```
Now `~zod` will have two `%shoe` sessions running - one local one on `~zod` and
one remote one on `~nus`, which you can access by pressing `Ctrl-X` until you see
`~nus:shoe>` from `~zod`'s console. On the other hand, you should not see
`~zod:shoe>` on `~nus`'s side, since you have not connected `~nus` to `~zod`'s
`%shoe` app. When you enter `demo` from `~nus:shoe>` on
`~zod`'s console you will again see `~zod ran the command`, but this time it
should be in the ordinary font used by the console, signifying that the command
is originating from a remote session. Contrast this with entering `demo` from
`~nus:shoe>` in `~nus`'s console, which will display `~nus ran the command` in
bold green text.
Now try to link to `~zod`'s `%shoe` session from `~nus` by switching to the dojo
on `~nus` and entering `|link ~zod %shoe`. You should see
```
>=
[unlinked from [p=~zod q=%shoe]]
```
and if you press `Ctrl-X` you will not get a `~zod:shoe>` prompt. This is
because the example app is set up to always allow `~zod` to connect (as well as
subject moons if the ship happens to be a planet) but not `~nus`, so this
message means that `~nus` failed to connect to `~zod`'s `%shoe` session.
### `%shoe`'s code
```hoon
:: shoe: example usage of /lib/shoe
::
:: the app supports one command: "demo".
:: running this command renders some text on all sole clients.
::
/+  shoe, verb, dbug, default-agent
```
`/+` is the Ford rune which imports libraries from the `/lib` directory into
the subject.
- `shoe` is the `shoe` library.
- `verb` is a library used to print what a Gall agent is doing.
- `dbug` is a library of debugging tools.
- `default-agent` contains a Gall agent core with minimal implementations of
required Gall arms.
```hoon
|%
+$  state-0  [%0 ~]
+$  command  ~
::
+$  card  card:shoe
--
```
The types used by the app.
`state-0` stores the state of the app, which is null as there is no state to
keep track of. It is good practice to include a version number anyways,
in case the app is made stateful at a later time.
`command` is typically a set of tagged union types that represent the possible
commands that can be entered by the user. Since this app only supports one
command, it is unnecessary for it to have any associated data, thus the command
is represented by `~`.
In a non-trivial context, a `command` is commonly given by `[%name data]`, where `%name` is the identifier for the type of command and `data` is
a type or list of types that contain data needed to execute the command. See
`app/chat-cli.hoon` for examples of commands, such as `[%say letter:store]` and
`[%delete path]`. This is not required though, and you could use something like
`[chat-room=@t =action]`.
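
As a sketch of what such a tagged union might look like, the type below defines two commands. The names `%say` and `%quit` and the face `msg` are hypothetical, chosen only for illustration; they are not taken from `%shoe` or `%chat-cli`.

```hoon
::  hypothetical command type: each command is a cell of a tag and its data
+$  command
  $%  [%say msg=@t]   ::  carries the text to send
      [%quit ~]       ::  needs no data
  ==
```

An `+on-command` arm would then typically switch on the head of the command with `?-` to decide what to do.
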
`card` is either an ordinary Gall agent `card` or a `%shoe` `card`, which takes
the shape `[%shoe sole-ids=(list @ta) effect=shoe-effect]`. A `%shoe` `card` is
sent to all `sole`s listed in `sole-ids`, making them run the `sole-effect`
specified by `effect` (i.e. printing some text). Here we can
reference `card:shoe` because of `/+ shoe` at the beginning of the app.
```hoon
=|  state-0
=*  state  -
::
```
Add the bunt value of `state-0` to the head of the subject, then give it the
macro `state`. The `-` here is a lark expression referring to the head of the
subject. This allows us to use `state` to refer to the state elsewhere in the
code no matter what version we're using, while also getting direct access to the
contents of `state` (if it had any).
```hoon
%+  verb  |
%-  agent:dbug
^-  agent:gall
%-  (agent:shoe command)
^-  (shoe:shoe command)
```
The casts here are just reminders of what is being produced. So let's focus on
what the `%` runes are doing, from bottom to top. We call `(agent:shoe command)`
on what follows (i.e. the rest of the app),
producing a standard Gall agent core. Then we wrap the Gall agent core with
`agent:dbug`, endowing it with additional arms useful for debugging, and then
wrap again with `verb`.
```hoon
|_  =bowl:gall
+*  this  .
    def   ~(. (default-agent this %|) bowl)
    des   ~(. (default:shoe this command) bowl)
```
This is boilerplate Gall agent core code. We set `this` to be a macro for the
subject, which is the Gall agent core itself. We set `def` and `des` to be
macros for initialized `default-agent` and `default:shoe` doors respectively.
Next we implement all of the arms required for a `shoe` agent. Starting with
the standard Gall arms:
```hoon
++  on-init   on-init:def
++  on-save   !>(state)
++  on-load
  |=  old=vase
  ^-  (quip card _this)
  [~ this]
::
++  on-poke   on-poke:def
++  on-watch  on-watch:def
++  on-leave  on-leave:def
++  on-peek   on-peek:def
++  on-agent  on-agent:def
++  on-arvo   on-arvo:def
++  on-fail   on-fail:def
```
These are minimalist Gall app arm implementations using the default behavior found in `def`.
Here begins the implementation of the additional arms required by the
`(shoe:shoe command)` interface.
```hoon
++  command-parser
  |=  sole-id=@ta
  ^+  |~(nail *(like [? command]))
  (cold [& ~] (jest 'demo'))
```
`+command-parser` is of central importance - it is what is used to parse user
input and transform it into `command`s for the app to execute. Writing a proper
command parser requires understanding of the Hoon parsing functions found in the
standard library. How to do so may be found in the [parsing tutorial](/docs/hoon/guides/parsing). For now, it is sufficient to know that this arm matches the text "demo" and
produces a `[? command]`-shaped noun in response. Note how the `&` signifies that the command will be run as soon as it has been entered, without waiting for the user to press return.
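
To illustrate how this might extend to more than one command, here is a hedged sketch using the standard-library parser combinators `pose`, `cold`, `jest` and `stag`. The `command` type with `%demo` and `%quit` is hypothetical, and the head is set to `|` so that the user must press return before the command runs. In a real app the type would live in the types core and the arm in the agent door; they are shown together here only for brevity.

```hoon
::  hypothetical command type with two commands
+$  command  ?(%demo %quit)
::
++  command-parser
  |=  sole-id=@ta
  ^+  |~(nail *(like [? command]))
  ::  | in the head: wait for the user to press return
  %+  stag  |
  ::  try each alternative in order
  ;~  pose
    (cold %demo (jest 'demo'))
    (cold %quit (jest 'quit'))
  ==
```
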
```hoon
++  tab-list
  |=  sole-id=@ta
  ^-  (list [@t tank])
  :~  ['demo' leaf+"run example command"]
  ==
```
`+tab-list` is pretty much plug-n-play. For each command you want to be tab
completed, add an entry to the `list` begun by `:~` of the form `['command' leaf+"description"]`. Now whenever the user types a partial command and presses
tab, the console will display the list of commands that match the partial
command as well as the descriptions given here.
Thus here we have that starting to type `demo` and pressing tab will result in
the following output in the console:
```
demo run example command
~zod:shoe> demo
```
with the remainder of `demo` now added to the input line.
Next we have `+on-command`, which is called whenever `+command-parser`
recognizes that `demo` has been entered by a user.
```hoon
++  on-command
  |=  [sole-id=@ta =command]
  ^-  (quip card _this)
```
This is a gate that takes in the `sole-id` corresponding to the session and the
`command` noun parsed by `+command-parser` and returns a `list` of `card`s and
`_this`, which is our shoe agent core including its state.
```hoon
  =-  [[%shoe ~ %sole -]~ this]
```
This creates a cell of a `%shoe` card that triggers a `sole-effect` given by the head of
the subject `-`, then the Gall agent core `this` - i.e. the return result of
this gate. The use of the `=-` rune means that what follows this
expression is actually run first, which puts the desired `sole-effect` into the
head of the subject.
```hoon
  =/  =tape  "{(scow %p src.bowl)} ran the command"
```
We define the `tape` that we want to be printed.
```hoon
  ?.  =(src our):bowl
    [%txt tape]
  [%klr [[`%br ~ `%g] [(crip tape)]~]~]
```
We cannot just produce the `tape` we want printed - it needs to fit the
`sole-effect` type. This expression says that if the
origin of the command is not our ship, we just print it normally with the `%txt`
`sole-effect`. Otherwise we use `%klr`, which prints styled text (here it
makes the text green and bold).
The following allows either `~zod`, or the host ship and its moons, to connect to
this app's command line interface using `|link`.
```hoon
++  can-connect
  |=  sole-id=@ta
  ^-  ?
  ?|  =(~zod src.bowl)
      (team:title [our src]:bowl)
  ==
```
We use the minimal implementations for the final two `shoe` arms, since we don't
want to do anything special when users connect or disconnect.
```hoon
++  on-connect     on-connect:des
++  on-disconnect  on-disconnect:des
--
```
This concludes our review of the code of the `%shoe` app. To continue learning
how to build your own CLI app, we recommend checking out `/app/chat-cli.hoon`.


@ -0,0 +1,513 @@
+++
title = "JSON"
weight = 4
template = "doc.html"
+++
If you are working on a Gall agent with any kind of web interface, it's likely you will encounter the problem of converting Hoon data structures to JSON and vice versa. This is what we'll look at in this document.
Urbit represents JSON data with the `$json` structure (defined in `lull.hoon`). You can refer to the [json type](#the-json-type) section below for details.
JSON data on the web is encoded in text, so Urbit has two functions in `zuse.hoon` for dealing with this:
- [`+en-json:html`](/docs/hoon/reference/zuse/2e_2-3#en-jsonhtml) - For printing `$json` to a text-encoded form.
- [`+de-json:html`](/docs/hoon/reference/zuse/2e_2-3#de-jsonhtml) - For parsing text-encoded JSON to a `$json` structure.
You typically want `$json` data converted to some other `noun` structure or vice versa, so Urbit has three collections of functions for this purpose, also in `zuse.hoon`:
- [`+enjs:format`](/docs/hoon/reference/zuse/2d_1-5#enjsformat) - Functions for converting various atoms and structures to `$json`.
- [`+dejs:format`](/docs/hoon/reference/zuse/2d_6#dejsformat) - Many "reparsers" for converting `$json` data to atoms and other structures.
- [`+dejs-soft:format`](/docs/hoon/reference/zuse/2d_7#dejs-softformat) - Largely the same as `+dejs:format` except its reparsers produce `unit`s which are null upon failure rather than simply crashing.
The relationship between these types and functions looks like this:
![json diagram](https://media.urbit.org/docs/json-diagram.svg)
Note this diagram is a simplification - the `+dejs:format` and `+enjs:format` collections in particular are tools to be used in writing conversion functions rather than simply being used by themselves, but it demonstrates the basic relationships. Additionally, it is less common to do printing/parsing manually - this would typically be handled automatically by Eyre, though it may be necessary if you're retrieving JSON data via the web client vane Iris.
### In practice
A typical Gall agent will have a number of structures defined in a file in the `/sur` directory. These will define the type of data it expects to be `%poke`ed with, the type of data it will `%give` to subscribers, and the type of data its scry endpoints produce.
If the agent only interacts with other agents inside Urbit, it may just use a `%noun` `mark`. If, however, it needs to talk to a web interface of some kind, it usually must handle `$json` data with a `%json` mark.
Sometimes an agent's interactions with a web interface are totally distinct from its interactions with other agents. If so, the agent could just have separate scry endpoints, poke handlers, etc, that directly deal with `$json` data with a `%json` mark. In such a case, you can include `$json` encoding/decoding functions directly in the agent or associated libraries, using the general techniques demonstrated in the [$json encoding and decoding example](#json-encoding-and-decoding-example) section below.
If, on the other hand, you want a unified interface (whether interacting with a web client or within Urbit), a different approach is necessary. Rather than taking or producing either `%noun` or `%json` marked data, custom `mark` files can be created which specify conversion methods for both `%noun` and `%json` marked data.
With this approach, an agent would take and/or produce data with some `mark` like `%my-custom-mark`. Then, when the agent must interact with a web client, the webserver vane Eyre can automatically convert `%my-custom-mark` to `%json` or vice versa. This way the agent only ever has to handle the `%my-custom-mark` data. This approach is used by `%graph-store` with its `%graph-update-2` mark, for example, and a number of other agents.
For details of creating a `mark` file for this purpose, the [mark file example](#mark-file-example) section below walks through a practical example.
## The `$json` type
Urbit represents JSON data with the `$json` structure (defined in `/sys/lull.hoon`):
```hoon
+$  json                                                ::  normal json value
  $@  ~                                                 ::  null
  $%  [%a p=(list json)]                                ::  array
      [%b p=?]                                          ::  boolean
      [%o p=(map @t json)]                              ::  object
      [%n p=@ta]                                        ::  number
      [%s p=@t]                                         ::  string
  ==                                                    ::
```
The correspondence of `$json` to JSON types is fairly self-evident, but here's a table comparing the two for additional clarity:
| JSON Type | `$json` Type | JSON Example | `$json` Example |
| :-------- | :--------------------- | ------------------------- | :----------------------------------------------------------- |
| Null | `~` | `null` | `~` |
| Boolean | `[%b p=?]` | `true` | `[%b p=%.y]` |
| Number | `[%n p=@ta]` | `123` | `[%n p=~.123]` |
| String | `[%s p=@t]` | `"foo"` | `[%s p='foo']` |
| Array | `[%a p=(list json)]` | `["foo",123]` | `[%a p=~[[%s p='foo'] [%n p=~.123]]]` |
| Object | `[%o p=(map @t json)]` | `{"foo":"xyz","bar":123}` | `[%o p={[p='bar' q=[%n p=~.123]] [p='foo' q=[%s p='xyz']]}]` |
Since the `$json` `%o` object and `%a` array types may themselves contain any `$json`, you can see how JSON structures of arbitrary complexity can be represented. Note the `%n` number type is a `@ta` rather than something like a `@ud` that you might expect. This is because JSON's number type may be either an integer or floating point, so it's left as a `knot` which can then be parsed to a `@ud` or `@rd` with the appropriate [`+dejs:format`](/docs/hoon/reference/zuse/2d_6) function.
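
As a small illustration of that parsing step, the expression below (which can be entered in the dojo) uses `+ni:dejs:format` to decode a `%n` value into a `@ud`. The literal `~.123` is just sample input.

```hoon
::  decode a $json number to an unsigned integer (@ud); produces 123
(ni:dejs:format n+~.123)
```
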
## `$json` encoding and decoding example
Let's have a look at a practical example. Here's a core with three arms. It has the structure arm `$user`, and then two more: `+to-js` converts a `$user` structure to `$json`, and `+from-js` does the opposite. Usually we'd define the structure in a separate `/sur` file, but for simplicity it's all in the one core.
#### `json-test.hoon`
```hoon
|%
+$  user
  $:  username=@t
      name=[first=@t mid=@t last=@t]
      joined=@da
      email=@t
  ==
++  to-js
  |=  usr=user
  |^  ^-  json
      %-  pairs:enjs:format
      :~
        ['username' s+username.usr]
        ['name' name]
        ['joined' (sect:enjs:format joined.usr)]
        ['email' s+email.usr]
      ==
  ++  name
    :-  %a
    :~
      [%s first.name.usr]
      [%s mid.name.usr]
      [%s last.name.usr]
    ==
  --
++  from-js
  =,  dejs:format
  ^-  $-(json user)
  %-  ot
  :~
    [%username so]
    [%name (at ~[so so so])]
    [%joined du]
    [%email so]
  ==
--
```
**Note**: This example (and a couple of others in this guide) sometimes uses a syntax of `foo+bar`. This is just syntactic sugar to tag the head of `bar` with the `term` constant `%foo`, and is equivalent to `[%foo bar]`. Since `json` data is a union with head tags of `%b`, `%n`, `%s`, `%a`, or `%o`, it's sometimes convenient to do `s+'some string'`, `b+&`, etc.
### Try it out
First we'll try using our `$json` encoding/decoding library, and afterwards we'll take a closer look at its construction. To begin, save the code above in `/lib/json-test.hoon` of the `%base` desk on a fake ship and `|commit` it:
```
> |commit %base
>=
+ /~zod/base/5/lib/json-test/hoon
```
Then we need to build it so we can use it. We'll give it a face of `user-lib`:
```
> =user-lib -build-file %/lib/json-test/hoon
```
Let's now create an example of a `$user` structure:
```
> =usr `user:user-lib`['john456' ['John' 'William' 'Smith'] now 'john.smith@example.com']
> usr
[ username='john456'
name=[first='John' mid='William' last='Smith']
joined=~2021.9.12..09.47.58..1b65
email='john.smith@example.com'
]
```
Now we can try calling the `+to-js` function with our data to convert it to `$json`:
```
> =usr-json (to-js:user-lib usr)
> usr-json
[ %o
p
{ [p='email' q=[%s p='john.smith@example.com']]
[p='name' q=[%a p=~[[%s p='John'] [%s p='William'] [%s p='Smith']]]]
[p='username' q=[%s p='john456']]
[p='joined' q=[%n p=~.1631440078]]
}
]
```
Let's also see how that `$json` would look as real JSON encoded in text. We can do that with `+en-json:html`:
```
> (crip (en-json:html (to-js:user-lib usr)))
'{"joined":1631440078,"username":"john456","name":["John","William","Smith"],"email":"john.smith@example.com"}'
```
Finally, let's try converting the `$json` back to a `$user` again with our `+from-js` arm:
```
> (from-js:user-lib usr-json)
[ username='john456'
name=[first='John' mid='William' last='Smith']
joined=~2021.9.12..09.47.58
email='john.smith@example.com'
]
```
### Analysis
#### Converting to `$json`
Here's our arm that converts a `$user` structure to `$json`:
```hoon
++  to-js
  |=  usr=user
  |^  ^-  json
      %-  pairs:enjs:format
      :~
        ['username' s+username.usr]
        ['name' name]
        ['joined' (sect:enjs:format joined.usr)]
        ['email' s+email.usr]
      ==
  ++  name
    :-  %a
    :~
      [%s first.name.usr]
      [%s mid.name.usr]
      [%s last.name.usr]
    ==
  --
```
There are different ways we could represent our `$user` structure as JSON, but in this case we've opted to encapsulate it in an object and have the `name` as an array (since JSON arrays preserve order).
[`+enjs:format`](/docs/hoon/reference/zuse/2d_1-5#enjsformat) includes the convenient [`+pairs`](/docs/hoon/reference/zuse/2d_1-5#pairsenjsformat) function, which converts a list of `[@t json]` to an object containing those key-value pairs. We've used this to assemble the final object. Note that if you happen to have only a single key-value pair rather than a list, you can use [`+frond`](/docs/hoon/reference/zuse/2d_1-5#frondenjsformat) instead of `+pairs`.
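
To make the relationship between `+frond` and `+pairs` concrete, the two dojo expressions below should build the same single-key object, corresponding to the JSON `{"username":"john456"}`. The key and value are just sample data borrowed from the example above.

```hoon
::  a single key-value pair as a $json object
(frond:enjs:format 'username' s+'john456')
::  the same object built from a one-item list
(pairs:enjs:format ~[['username' s+'john456']])
```
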
For the `joined` field, we've used the [`+sect`](/docs/hoon/reference/zuse/2d_1-5#sectenjsformat) function from `+enjs` to convert the `@da` to a Unix seconds timestamp in a `$json` number. The `+sect` function, like others in `+enjs`, takes in a noun (in this case a `@da`) and produces `$json` (in this case a `[%n @ta]` number). `+enjs` contains a handful of useful functions like this, but for the rest we've just hand-made the `$json` structure. This is fairly typical when encoding `$json`; it's usually [decoding](#converting-from-json) that makes more extensive use of the `$json` utility functions in `+format`.
For the `name` field we've just formed a cell of `%a` and a list of `$json` strings, since a `$json` array is `[%a p=(list json)]`. Note we've separated this part into its own arm and wrapped the whole thing in a `|^` - a core with a `$` arm that's computed immediately. This is simply for readability - our structure here is quite simple but when dealing with deeply-nested `$json` structures or complex logic, having a single giant function can quickly become unwieldy.
#### Converting from `$json`
Here's our arm that converts `$json` to our `$user` structure:
```hoon
++  from-js
  =,  dejs:format
  ^-  $-(json user)
  %-  ot
  :~
    [%username so]
    [%name (at ~[so so so])]
    [%joined du]
    [%email so]
  ==
```
This is the inverse of the [encoding](#converting-to-json) function described in the previous section.
We make extensive use of [`+dejs:format`](/docs/hoon/reference/zuse/2d_6) functions here, so we've used `=,` to expose the namespace and allow succinct `+dejs` function calls.
We use the [`+ot`](/docs/hoon/reference/zuse/2d_6#otdejsformat) function from `+dejs:format` to decode the `$json` object to an n-tuple. It's a wet gate that takes a list of pairs of keys and other `+dejs` functions and produces a new gate that takes the `$json` to be decoded. That new gate is the product of `+from-js`, as the `$-(json user)` cast indicates.
The [`+so`](/docs/hoon/reference/zuse/2d_6#sodejsformat) functions just decode `$json` strings to `cord`s. The [`+at`](/docs/hoon/reference/zuse/2d_6#atdejsformat) function converts a `$json` array to a tuple, decoding each element with the respective function given in its argument list. Like `+ot`, `+at` is also a wet gate that produces a gate that takes `$json`. In our case we've used `+so` for each element, since they're all strings.
For `joined`, we've used the [`+du`](/docs/hoon/reference/zuse/2d_6#dudejsformat) function, which converts a Unix seconds timestamp in a `$json` number to a `@da` (it's basically the inverse of the `+sect:enjs:format` we used earlier).
Notice how `+ot` takes in other `+dejs` functions in its argument. One of its arguments includes the `+at` function which itself takes in other `+dejs` functions. There are several `+dejs` functions like this that allow complex nested JSON structures to be decoded. For other examples of common `+dejs` functions like this, see the [More `+dejs`](#more-dejs) section below.
There are dozens of different functions in [`+dejs:format`](/docs/hoon/reference/zuse/2d_6) that will cover a great many use cases. If there isn't a `+dejs` function for a particular case, you can also just write a custom function - it just has to take `$json`. Note there's also the [`+dejs-soft:format`](/docs/hoon/reference/zuse/2d_7) functions - these are similar to `+dejs` functions except they produce `unit`s rather than simply crashing if decoding fails.
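
The difference in failure behavior can be seen with a quick dojo sketch, assuming `+so` is available in `+dejs-soft:format` as it is in `+dejs:format`:

```hoon
::  a $json string decodes to a non-null unit holding the cord
(so:dejs-soft:format s+'hello')
::  a $json boolean is not a string, so this produces ~ rather than crashing
(so:dejs-soft:format b+&)
```
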
## More `+dejs`
We looked at the commonly used `+ot` function in the [first example](#converting-from-json), now let's look at a couple more common `+dejs` functions.
### `+of`
The [`+of`](/docs/hoon/reference/zuse/2d_6#ofdejsformat) function takes an object containing a single key-value pair, decodes the value with the corresponding `+dejs` function in a key-function list, and produces a key-value tuple. This is useful when there are multiple possible objects you might receive, since the result maps naturally onto a tagged union, a common data structure in Hoon.
Let's look at an example. Here's a gate that takes in some `$json`, decodes it with an `+of` function that can handle three possible objects, casts the result to a tagged union, switches against its head with `?-`, performs some transformation and finally returns the result. You can save it as `gen/of-test.hoon` in the `%base` desk of a fake ship and `|commit %base`.
#### `of-test.hoon`
```hoon
|=  jon=json
|^  ^-  @t
    =/  =fbb
      (to-fbb jon)
    ?-  -.fbb
      %foo  (cat 3 +.fbb '!!!')
      %bar  ?:(+.fbb 'Yes' 'No')
      %baz  :((cury cat 3) p.fbb q.fbb r.fbb)
    ==
+$  fbb
  $%  [%foo @t]
      [%bar ?]
      [%baz p=@t q=@t r=@t]
  ==
++  to-fbb
  =,  dejs:format
  %-  of
  :~  foo+so
      bar+bo
      baz+(at ~[so so so])
  ==
--
```
Let's try it:
```
> +of-test (need (de-json:html '{"foo":"Hello"}'))
'Hello!!!'
> +of-test (need (de-json:html '{"bar":true}'))
'Yes'
> +of-test (need (de-json:html '{"baz":["a","b","c"]}'))
'abc'
```
### `+ou`
The [`+ou`](/docs/hoon/reference/zuse/2d_6#oudejsformat) function decodes a `$json` object to an n-tuple using the matching functions in a key-function list. Additionally, it lets you set some key-value pairs in an object as optional and others as mandatory. The mandatory ones crash if they're missing and the optional ones are replaced with a given noun.
`+ou` is different to other `+dejs` functions - the functions it takes are `$-((unit json) grub)` rather than the usual `$-(json grub)` of most `+dejs` functions. There are only two `+dejs` functions that fit this - [`+un`](/docs/hoon/reference/zuse/2d_6#undejsformat) and [`+uf`](/docs/hoon/reference/zuse/2d_6#ufdejsformat). These are intended to be used with `+ou` - you would wrap each function in the key-function list of `+ou` with either `+un` or `+uf`.
`+un` crashes if its argument is `~`. `+ou` gives functions a `~` if the matching key-value pair is missing in the `$json` object, so `+un` crashes if the key-value pair is missing. Therefore, `+un` lets you set key-value pairs as mandatory.
`+uf` takes two arguments - a noun and a `+dejs` function. If the `(unit json)` it's given by `+ou` is `~`, it produces the noun it was given rather than the product of the `+dejs` function. This lets you specify key-value pairs as optional, replacing missing ones with whatever you want.
Let's look at a practical example. Here's a generator you can save in the `%base` desk of a fake ship in `gen/ou-test.hoon` and `|commit %base`. It takes in a `$json` object and produces a triple. The `+ou` in `+decode` has three key-function pairs - the first two are mandatory and the last is optional, producing the bunt of a set if the `%baz` key is missing.
#### `ou-test.hoon`
```hoon
|=  jon=json
|^  ^-  [@t ? (set @ud)]
    (decode jon)
++  decode
  =,  dejs:format
  %-  ou
  :~  foo+(un so)
      bar+(un bo)
      baz+(uf *(set @ud) (as ni))
  ==
--
```
Let's try it:
```
> +ou-test (need (de-json:html '{"foo":"hello","bar":true,"baz":[1,2,3,4]}'))
['hello' %.y {1 2 3 4}]
> +ou-test (need (de-json:html '{"foo":"hello","bar":true}'))
['hello' %.y {}]
> +ou-test (need (de-json:html '{"foo":"hello"}'))
[%key 'bar']
dojo: hoon expression failed
```
### `+su`
The [`+su`](/docs/hoon/reference/zuse/2d_6#sudejsformat) function parses a string with the given parsing `rule`. Hoon's functional parsing library is very powerful and lets you create arbitrarily complex parsers. JSON will often have data types encoded in strings, so this function can be very useful. The writing of parsers is outside the scope of this guide, but you can see the [Parsing Guide](/docs/hoon/guides/parsing) and sections 4e to 4j of the standard library documentation for details.
Here are some simple examples of using `+su` to parse strings:
```
> `@ux`((su:dejs:format hex) s+'deadbeef1337f00D')
0xdead.beef.1337.f00d
> `(list @)`((su:dejs:format (most lus dem)) s+'1+2+3+4')
~[1 2 3 4]
> `@ub`((su:dejs:format ven) s+'+>-<->+<+')
0b11.1000.1101
```
Here's a more complex parser that will parse a GUID like `824e7749-4eac-9c00-db16-4cb816cd6f19` to a `@ux`:
#### `su-test.hoon`
```hoon
|=  jon=json
^-  @ux
%.  jon
%-  su:dejs:format
%+  cook
  |=  parts=(list [step @])
  ^-  @ux
  (can 3 (flop parts))
;~  plug
  (stag 4 ;~(sfix (bass 16 (stun 8^8 six:ab)) hep))
  (stag 2 ;~(sfix qix:ab hep))
  (stag 2 ;~(sfix qix:ab hep))
  (stag 2 ;~(sfix qix:ab hep))
  (stag 6 (bass 16 (stun 12^12 six:ab)))
  (easy ~)
==
```
Save it in the `/gen` directory of the `%base` desk and `|commit` it. We can then try it with:
```
> +su-test s+'5323a61d-0c26-d8fa-2b73-18cdca805fd8'
0x5323.a61d.0c26.d8fa.2b73.18cd.ca80.5fd8
```
If we delete the last character it'll no longer be a valid GUID and the parsing will fail:
```
> +su-test s+'5323a61d-0c26-d8fa-2b73-18cdca805fd'
/gen/su-test/hoon:<[2 1].[16 3]>
/gen/su-test/hoon:<[3 1].[16 3]>
{1 36}
syntax error
dojo: naked generator failure
```
## `mark` file example
Here's a simple `mark` file for the `$user` structure we created in the [first example](#json-encoding-and-decoding-example). It imports the [json-test.hoon](#json-testhoon) library we created and saved in our `%base` desk's `/lib` directory.
#### `user.hoon`
```hoon
/+  *json-test
|_  usr=user
++  grab
  |%
  ++  noun  user
  ++  json  from-js
  --
++  grow
  |%
  ++  noun  usr
  ++  json  (to-js usr)
  --
++  grad  %noun
--
```
The [Marks section](/docs/arvo/clay/marks/marks) of the Clay documentation covers `mark` files comprehensively and is worth reading through if you want to write a mark file.
In brief, a mark file contains a `door` with three arms. The door's sample type is the type of the data in question - in our case the `$user` structure. The `+grab` arm contains methods for converting _to_ our mark, and the `+grow` arm contains methods for converting _from_ our mark. The `+noun` arms are mandatory, and then we've added `+json` arms which respectively call the `+from-js` and `+to-js` functions from our `json-test.hoon` library. The final `+grad` arm defines various revision control functions; in our case we've delegated these to the `%noun` mark.
From this mark file, Clay can build mark conversion gates between the `%json` mark and our `%user` mark, allowing the conversion of `$json` data to a `$user` structure and vice versa.
### Try it out
First, we'll save the code above as `user.hoon` in the `/mar` directory of our `%base` desk and commit it:
```
> |commit %base
>=
+ /~zod/base/9/mar/user/hoon
```
Let's quickly create a `$json` object to work with:
```
> =jon (need (de-json:html '{"joined":1631440078,"username":"john456","name":["John","William","Smith"],"email":"john.smith@example.com"}'))
> jon
[ %o
p
{ [p='email' q=[%s p='john.smith@example.com']]
[p='name' q=[%a p=~[[%s p='John'] [%s p='William'] [%s p='Smith']]]]
[p='username' q=[%s p='john456']]
[p='joined' q=[%n p=~.1631440078]]
}
]
```
We'll also build our library so we can use its types from the dojo:
```
> =user-lib -build-file %/lib/json-test/hoon
```
Now we can ask Clay to build a mark conversion gate from a `%json` mark to our `%user` mark. We'll use a scry with a `%f` `care` which produces a static mark conversion gate:
```
> =json-to-user .^($-(json user:user-lib) %cf /===/json/user)
```
Let's try converting our `$json` to a `$user` structure with our new mark conversion gate:
```
> =usr (json-to-user jon)
> usr
[ username='john456'
name=[first='John' mid='William' last='Smith']
joined=~2021.9.12..09.47.58
email='john.smith@example.com'
]
```
Now let's try the other direction. We'll again scry Clay to build a static mark conversion gate, this time _from_ `%user` _to_ `%json` rather than the reverse:
```
> =user-to-json .^($-(user:user-lib json) %cf /===/user/json)
```
Let's test it out by giving it our `$user` data:
```
> (user-to-json usr)
[ %o
p
{ [p='email' q=[%s p='john.smith@example.com']]
[p='name' q=[%a p=~[[%s p='John'] [%s p='William'] [%s p='Smith']]]]
[p='username' q=[%s p='john456']]
[p='joined' q=[%n p=~.1631440078]]
}
]
```
Finally, let's see how that looks as JSON encoded in text:
```
> (crip (en-json:html (user-to-json usr)))
'{"joined":1631440078,"username":"john456","name":["John","William","Smith"],"email":"john.smith@example.com"}'
```
Usually (though not in all cases) these mark conversions will be performed implicitly by Gall or Eyre and you'd not deal with the mark conversion gates directly, but it's still informative to see them work explicitly.
## Further reading
[The Zuse library reference](/docs/hoon/reference/zuse/table-of-contents) - This includes documentation of the JSON parsing, printing, encoding and decoding functions.
[The Marks section of the Clay documentation](/docs/arvo/clay/marks/marks) - Comprehensive documentation of `mark`s.
[The External API Reference section of the Eyre documentation](/docs/arvo/eyre/external-api-ref) - Details of the webserver vane Eyre's external API.
[The Iris documentation](/docs/arvo/iris/iris) - Details of the web client vane Iris, which may be used to fetch external JSON data among other things.
[Strings Guide](/docs/hoon/guides/strings) - Atom printing functions like `+scot` will often be useful for JSON encoding - see the [Encoding in Text](/docs/hoon/guides/strings#encoding-in-text) section for usage.
[Parsing Guide](/docs/hoon/guides/parsing) - Learn how to write functional parsers in hoon which can be used with `+su`.


@ -0,0 +1,709 @@
+++
title = "Parsing"
weight = 3
template = "doc.html"
+++
This document serves as an introduction to parsing text with Hoon. No prior
knowledge of parsing is required, and we will explain the basic structure of how
parsing works in a purely functional language such as Hoon before moving on to
how it is implemented in Hoon.
**Note:** For JSON printing/parsing and encoding/decoding, see the [JSON Guide](/docs/hoon/guides/json-guide).
## What is parsing? {% #what-is-parsing %}
A program which takes a raw sequence of characters as an input and produces a data
structure as an output is known as a _parser_. The data structure produced
depends on the use case, but often it may be represented as a tree and the
output is thought of as a structural representation of the input. Parsers are ubiquitous in
computing, commonly used to perform tasks such as reading files, compiling
source code, or understanding commands input in a command line interface.
Parsing a string is rarely done all at once. Instead, it is usually done
character-by-character, and the return contains the data structure representing
what has been parsed thus far as well as the remainder of the string to be
parsed. Parsers also need to be able to fail in case the input is improperly
formed. We will see each of these standard practices implemented in [Hoon below](#parsing-in-hoon).
## Functional parsers
How parsers are built varies substantially depending on what sort of programming
language they are written in. As Hoon is a functional programming language, we will
be focused on understanding _functional parsers_, also known as _combinator
parsers_. In this section we will make light use of pseudocode, as introducing
the Hoon to describe functional parsers here creates a chicken-and-egg problem.
Complex functional parsers are built piece by piece from simpler parsers
that are plugged into one another in various ways to perform the desired task.
The basic building blocks, or primitives, are parsers that read only a
single character. There are frequently a few types of possible input characters,
such as letters, numbers, and symbols. For example, `(parse "1" integer)` calls
the parsing routine on the string `"1"` and looks for an integer, and so it
returns the integer `1`. However, taking into account what was said above about
parsers returning the unparsed portion of the string as well, we should
represent this return as a tuple. So we should expect something like this:
```
> (parse "1" integer)
[1 ""]
> (parse "123" integer)
[1 "23"]
```
What if we wish to parse the rest of the string? We would need to apply the
`parse` function again:
```
> (parse (parse "123" integer) integer)
[12 "3"]
> (parse (parse (parse "123" integer) integer) integer)
[123 ""]
```
So we see that we can parse strings larger than one character by stringing
together parsing functions for single characters. Thus in addition to parsing
functions for single input characters, we want _parser combinators_ that
allow you to combine two or more parsers to form a more complex one.
Combinators come in a few shapes and sizes, and typical operations they may
perform would be to repeat the same parsing operation until the string is
consumed, try a few different parsing operations until one of them works,
or perform a sequence of parsing operations. We will see how all of this is done
with Hoon in the next section.
# Parsing in Hoon
In this section we will cover the basic types, parser functions, and parser
combinators that are utilized for parsing in Hoon. This is not a complete guide
to every parsing-related function in the standard library (of which there
are quite a few), but it ought to be
sufficient to get started with parsing in Hoon and to equip you to discover the
remainder yourself.
## Basic types
In this section we discuss the types most commonly used for Hoon parsers. In short:
- A `hair` is the position in the text the parser is at,
- A `nail` is parser input,
- An `edge` is parser output,
- A `rule` is a parser.
### `hair`
```hoon
++  hair  [p=@ud q=@ud]
```
A `hair` is a pair of `@ud` used to keep track of what has already been parsed
for stack tracing purposes. This allows the parser to reveal where the problem
is in case it hits something unexpected during parsing.
`p` represents the line and `q` represents the column.
### `nail`
```hoon
++  nail  [p=hair q=tape]
```
We recall from our [discussion above](#what-is-parsing) that parsing functions must keep
track of both what has been parsed and what has yet to be parsed. Thus a `nail`
consists of both a `hair`, giving the line and column up to which the input
sequence has already been parsed, and a `tape`, consisting of what remains of
the original input string (i.e. everything after the location indicated by the
`hair`, including the character at that `hair`).
For example, if you wish to feed the entire `tape` `"abc"` into a parser, you
would pass it as the `nail` `[[1 1] "abc"]`. If the parser successfully parses the first
character, the `nail` it returns will be `[[1 2] "bc"]` (though we note that
parser outputs are actually `edge`s which contain a `nail`, see the following).
The `hair` only matters for book-keeping reasons - it could be any value here
since it doesn't refer to a specific portion of the string being input, but only
what has theoretically already been parsed up to that point.
### `edge`
```hoon
++  edge  [p=hair q=(unit [p=* q=nail])]
```
An `edge` is the output of a parser. If parsing succeeded, `p` is the location
in the original input `tape` up to which the text has been parsed. If parsing
failed, `p` will be the first `hair` at which parsing failed.
`q` may be `~`, indicating that parsing has failed.
If parsing did not fail, `p.q` is the data structure that is the result of the
parse up to this point, while `q.q` is the `nail` which contains the remainder
of what is to be parsed. If `q` is not null, `p` and `p.q.q` are identical.
Note that these wing paths are shorthand: since `q` is a `unit`, the full paths
go through its `u` face (e.g. `p.u.q` for the result).
### `rule`
```hoon
++  rule  _|:($:nail $:edge)
```
A `rule` is a gate which takes in a `nail` and returns an `edge` - in other
words, a parser.
## Parser builders
These functions are used to build `rule`s (i.e. parsers), and thus are often
called rule-builders. For a complete list of parser builders, see [4f: Parsing
(Rule-Builders)](/docs/hoon/reference/stdlib/4f), but also the more specific
functions in [4h: Parsing (ASCII
Glyphs)](/docs/hoon/reference/stdlib/4h), [4i: Parsing (Useful
Idioms)](/docs/hoon/reference/stdlib/4i), [4j: Parsing (Bases and Base
Digits)](/docs/hoon/reference/stdlib/4j), and [4l: Atom Parsing](/docs/hoon/reference/stdlib/4l).
### [`+just`](/docs/hoon/reference/stdlib/4f/#just)
The most basic rule builder, `+just` takes in a single `char` and produces a
`rule` that attempts to match that `char` to the first character in the `tape`
of the input `nail`.
```
> =edg ((just 'a') [[1 1] "abc"])
> edg
[p=[p=1 q=2] q=[~ [p='a' q=[p=[p=1 q=2] q="bc"]]]]
```
We note that `p.edg` is `[p=1 q=2]`, indicating that the next character to be
parsed is in line 1, column 2. `q.edg` is not null, indicating that parsing
succeeded. `p.q.edg` is `'a'`, which is the result of the parse. `p.q.q.edg` is the same as `p.edg`, which is always the case for
`rule`s built using standard library functions when parsing succeeds. Lastly,
`q.q.q.edg` is `"bc"`, which is the part of the input `tape` that has yet to be parsed.
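The wing paths used in this discussion skip over the `u` face of the `unit` in `q.edg`, so they are shorthand rather than something to type verbatim. A minimal sketch of reading the same values in the dojo, assuming we unwrap the `unit` with `+need` (which crashes if it is `~`):

```
> =edg ((just 'a') [[1 1] "abc"])
> p:(need q.edg)
'a'
> q.q:(need q.edg)
"bc"
```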
Now let's see what happens when parsing fails.
```
> =edg ((just 'b') [[1 1] "abc"])
> edg
[p=[p=1 q=1] q=~]
```
Now we have that `p.edg` is the same as the input `hair`, `[1 1]`, meaning the
parser has not advanced since parsing failed. `q.edg` is null, indicating that
parsing has failed.
Later we will use [+star](#star) to string together a sequence of `+just`s in
order to parse multiple characters at once.
### [`+jest`](/docs/hoon/reference/stdlib/4f/#jest)
`+jest` is a `rule` builder used to match a `cord`. It takes an input `cord` and
produces a `rule` that attempts to match that `cord` against the beginning of
the input.
Let's see what happens when we successfully parse the entire input `tape`.
```
> =edg ((jest 'abc') [[1 1] "abc"])
> edg
[p=[p=1 q=4] q=[~ [p='abc' q=[p=[p=1 q=4] q=""]]]]
```
`p.edg` is `[p=1 q=4]`, indicating that the next character to be parsed is at
line 1, column 4. Of course, this does not exist since the input `tape` was only
3 characters long, so this actually indicates that the entire `tape` has been
successfully parsed (since the `hair` does not advance in the case of failure).
`p.q.edg` is `'abc'`, as expected. `q.q.q.edg` is `""`, indicating that nothing
remains to be parsed.
What happens if we only match some of the input `tape`?
```
> =edg ((jest 'ab') [[1 1] "abc"])
> edg
[p=[p=1 q=3] q=[~ [p='ab' q=[p=[p=1 q=3] q="c"]]]]
```
Now we have that the result, `p.q.edg`, is `'ab'`, while the remainder `q.q.q.edg`
is `"c"`. So `+jest` has successfully parsed the first two characters, while the
last character remains. Furthermore, we still have the information that the
remaining character was in line 1 column 3 from `p.edg` and `p.q.q.edg`.
What happens when `+jest` fails?
```
> ((jest 'bc') [[1 1] "abc"])
[p=[p=1 q=1] q=~]
```
Despite the fact that `'bc'` appears in `"abc"`, because it was not at the
beginning the parse failed. We will see in [parser
combinators](#parser-combinators) how to modify this `rule` so that it
finds `bc` successfully.
### [`+shim`](/docs/hoon/reference/stdlib/4f/#shim)
`+shim` is used to parse characters within a given range. It takes in two atoms
and returns a `rule`.
```
> ((shim 'a' 'z') [[1 1] "abc"])
[p=[p=1 q=2] q=[~ [p='a' q=[p=[p=1 q=2] q="bc"]]]]
```
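For contrast, here is a sketch of the failing case. The capital `'A'` falls outside the range `'a'` to `'z'`, so the `rule` fails without advancing, just as `+just` did above:

```
> ((shim 'a' 'z') [[1 1] "Abc"])
[p=[p=1 q=1] q=~]
```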
### [`+next`](/docs/hoon/reference/stdlib/4f/#next)
`+next` is a simple `rule` that takes in the next character and returns it as the
parsing result.
```
> (next [[1 1] "abc"])
[p=[p=1 q=2] q=[~ [p='a' q=[p=[p=1 q=2] q="bc"]]]]
```
### [`+cold`](/docs/hoon/reference/stdlib/4f/#cold)
`+cold` is a `rule` builder that takes in a constant noun we'll call `cus` and a
`rule` we'll call `sef`. It returns a `rule` identical to the `sef` except it
replaces the parsing result with `cus`.
Here we see that `p.q` of the `edge` returned by the `rule` created with `+cold`
is `%foo`.
```
> ((cold %foo (just 'a')) [[1 1] "abc"])
[p=[p=1 q=2] q=[~ u=[p=%foo q=[p=[p=1 q=2] q="bc"]]]]
```
One common scenario where `+cold` sees play is when writing [command line
interface (CLI) apps](/docs/hoon/guides/cli-tutorial). We refer the
reader there for an example where `+cold` is used.
### [`+knee`](/docs/hoon/reference/stdlib/4f/#knee)
Another important function in the parser builder library is `+knee`, used for building
recursive parsers. We delay discussion of `+knee` to the
[section below](#recursive-parsers) as more context is needed to explain it
properly.
## Outside callers
Since `hair`s, `nail`s, etc. are only utilized within the context of writing
parsers, we'd like to hide them from the rest of the code of a program that
utilizes parsers. That is to say, we'd like the programmer to only worry about
passing `tape`s to the parser, and not have to dress up the `tape` as a `nail`
themselves. Thus we have several functions for exactly this purpose.
These functions take in either a `tape` or a `cord`,
alongside a `rule`, and attempt to parse the input with the `rule`. If the
parse succeeds, it returns the result. There are crashing and unitized versions
of each caller, corresponding to what happens when a parse fails.
For additional information including examples see [4g: Parsing (Outside Caller)](/docs/hoon/reference/stdlib/4g).
### Parsing `tape`s
[`+scan`](/docs/hoon/reference/stdlib/4g/#scan) takes in a `tape` and a `rule` and attempts to parse the `tape` with the
`rule`.
```
> (scan "hello" (jest 'hello'))
'hello'
> (scan "hello zod" (jest 'hello'))
{1 6}
'syntax-error'
```
[`+rust`](/docs/hoon/reference/stdlib/4g/#rust) is the unitized version of `+scan`.
```
> (rust "a" (just 'a'))
[~ 'a']
> (rust "a" (just 'b'))
~
```
For the remainder of this tutorial we will make use of `+scan` so that we do not
need to deal directly with `nail`s except where it is illustrative to do so.
### Parsing atoms
[Recall](/docs/hoon/hoon-school/lists) that `cord`s are atoms with the aura
`@t` and are typically used to represent strings internally as data, as atoms
are faster for the computer to work with than `tape`s, which are `list`s of
`@tD` atoms. [`+rash`](/docs/hoon/reference/stdlib/4g/#rash) and [`+rush`](/docs/hoon/reference/stdlib/4g/#rush) are for parsing atoms, with `+rash` being
analogous to `+scan` and `+rush` being analogous to `+rust`. Under the hood, `+rash`
calls `+scan` after converting the input atom to a `tape`, and `+rush` does
similarly for `+rust`.
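A brief sketch of the two atom callers, reusing `rule`s from earlier in this guide. `+rash` crashes on failure like `+scan`, while `+rush` produces a `unit` like `+rust`, so a failed parse yields `~`:

```
> (rash 'hello' (jest 'hello'))
'hello'
> (rush 'a' (just 'b'))
~
```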
## Parser modifiers
The standard library provides a number of gates that take a `rule` and produce a
new modified `rule` according to some process. We call these _parser modifiers_.
These are documented among the [parser
builders](/docs/hoon/reference/stdlib/4f).
### [`+ifix`](/docs/hoon/reference/stdlib/4f/#ifix)
`+ifix` modifies a `rule` so that it matches that `rule` only when it is
surrounded on both sides by text that matches a pair of `rule`s, which is discarded.
```
> (scan "(42)" (ifix [pal par] (jest '42')))
'42'
```
`+pal` and `+par` are shorthand for `(just '(')` and `(just ')')`, respectively. All
ASCII glyphs have counterparts of this sort, documented
[here](/docs/hoon/reference/stdlib/4h).
### [`+star`](/docs/hoon/reference/stdlib/4f/#star) {% #star %}
`+star` is used to apply a `rule` repeatedly. Recall that `+just` only parses
the first character in the input `tape`.
```
> ((just 'a') [[1 1] "aaaa"])
[p=[p=1 q=2] q=[~ [p='a' q=[p=[p=1 q=2] q="aaa"]]]]
```
We can use `+star` to get the rest of the `tape`:
```
> ((star (just 'a')) [[1 1] "aaa"])
[p=[p=1 q=4] q=[~ [p=[i='a' t=<|a a|>] q=[p=[p=1 q=4] q=""]]]]
```
and we note that the parsing ceases when it fails.
```
> ((star (just 'a')) [[1 1] "aaab"])
[p=[p=1 q=4] q=[~ [p=[i='a' t=<|a a|>] q=[p=[p=1 q=4] q="b"]]]]
```
We can combine `+star` with `+next` to just return the whole input:
```
> ((star next) [[1 1] "aaabc"])
[p=[p=1 q=6] q=[~ [p=[i='a' t=<|a a b c|>] q=[p=[p=1 q=6] q=""]]]]
```
### [`+cook`](/docs/hoon/reference/stdlib/4f/#cook)
`+cook` takes a `rule` and a gate and produces a modified version of the `rule`
that passes the result of a successful parse through the given gate.
Let's modify the rule `(just 'a')` so that when it successfully parses `a`,
it returns the following letter `b` as the result.
```
> ((cook |=(a=@ `@t`+(a)) (just 'a')) [[1 1] "abc"])
[p=[p=1 q=2] q=[~ u=[p='b' q=[p=[p=1 q=2] q="bc"]]]]
```
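A common use of `+cook` is converting the shape of a result. As a small sketch, `(star (just 'a'))` produces a `tape`, and passing that result through the standard library gate `+crip` turns it into a `cord`:

```
> (scan "aaa" (star (just 'a')))
"aaa"
> (scan "aaa" (cook crip (star (just 'a'))))
'aaa'
```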
## Parser combinators
Building complex parsers from simpler parsers is accomplished in Hoon with the
use of two tools: the monadic applicator rune
[`;~`](/docs/hoon/reference/rune/mic/#-micsig) and [parsing
combinators](/docs/hoon/reference/stdlib/4e). First we introduce a few
combinators, then we examine more closely how `;~` is used to chain them together.
The syntax to combine `rule`s is
```hoon
;~(combinator rule1 rule2 ... ruleN)
```
The `rule`s are composed together using the combinator as an
intermediate function, which takes the product of a `rule` (an `edge`) and a `rule` and turns
it into a sample (a `nail`) for the next `rule` to handle. We elaborate on this
behavior [below](#micsig).
### [`+plug`](/docs/hoon/reference/stdlib/4e/#plug)
`+plug` simply takes the `nail` in the `edge` produced by one rule and passes it
to the next `rule`, forming a cell of the results as it proceeds.
```
> (scan "starship" ;~(plug (jest 'star') (jest 'ship')))
['star' 'ship']
```
### [`+pose`](/docs/hoon/reference/stdlib/4e/#pose)
`+pose` tries each `rule` you hand it successively until it finds one that
works.
```
> (scan "a" ;~(pose (just 'a') (just 'b')))
'a'
> (scan "b" ;~(pose (just 'a') (just 'b')))
'b'
```
### [`+glue`](/docs/hoon/reference/stdlib/4e/#glue)
`+glue` parses a delimiter in between each `rule` and forms a cell of the
results of each `rule`.
```
> (scan "a,b" ;~((glue com) (just 'a') (just 'b')))
['a' 'b']
> (scan "a,b,a" ;~((glue com) (just 'a') (just 'b')))
{1 4}
syntax error
> (scan "a,b,a" ;~((glue com) (just 'a') (just 'b') (just 'a')))
['a' 'b' 'a']
```
### [`;~`](/docs/hoon/reference/rune/mic/#-micsig) {% #micsig %}
Understanding the rune `;~` is essential to building parsers with Hoon. Let's
take this opportunity to think about it carefully.
The `rule` created by `;~(combinator (list rule))` may be understood
inductively. To do this, let's consider the base case where our `(list rule)` has only a
single entry.
```
> (scan "star" ;~(plug (jest 'star')))
'star'
```
Our output is identical to that given by `(scan "star" (jest 'star'))`. This is to
be expected. The combinator `+plug` is specifically used for chaining together
`rule`s in the `(list rule)`, but if there is only one `rule`, there is nothing
to chain. Thus, swapping out `+plug` for another combinator makes no difference here:
```
> (scan "star" ;~(pose (jest 'star')))
'star'
> (scan "star" ;~((glue com) (jest 'star')))
'star'
```
`;~` and the combinator only begin to play a role once the `(list rule)` has at
least two elements. So let's look at an example done with `+plug`, the simplest
combinator.
```
> (scan "star" ;~(plug (jest 'st') (jest 'ar')))
['st' 'ar']
```
Our return suggests that we first parsed `"star"` with the `rule` `(jest 'st')` and passed
the resulting `edge` to `(jest 'ar')` - in other words, we called `+plug` on the `edge` returned by `(jest 'st')` once it had been used to parse `"star"`, together with the next `rule`, `(jest 'ar')`. Thus
`+plug` was the glue that allowed us to join the two `rule`s, and `;~` performed
the gluing operation. And so, swapping `+plug` for `+pose` results in a crash,
which clues us in to the fact that the combinator now has an effect since there
is more than one `rule`.
```
> (scan "star" ;~(pose (jest 'st') (jest 'ar')))
{1 3}
syntax error
```
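As a rough sketch of what `;~` is doing in the two-`rule` case, the `+plug` example above behaves approximately like the hand-written gate below. This elides details of the real expansion (see the rune reference for the precise form), and the face `tub` is just an illustrative name:

```hoon
::  approximately ;~(plug (jest 'st') (jest 'ar'))
::
|=  tub=nail
(plug ((jest 'st') tub) (jest 'ar'))
```

The first `rule` is run on the input, and the combinator receives the resulting `edge` along with the next `rule`, which it may or may not go on to call. This is why `+pose` stopped after `(jest 'st')` succeeded, leaving `"ar"` unparsed.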
## Parsing numbers
Functions for parsing numbers are documented in [4j: Parsing (Bases and Base
Digits)](/docs/hoon/reference/stdlib/4j). In particular,
[`dem`](/docs/hoon/reference/stdlib/4i/#dem) is a `rule` for parsing decimal
numbers.
```
> (scan "42" dem)
42
> (add 1 (scan "42" dem))
43
```
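Other bases work the same way. A short sketch using the standard library `rule`s `hex` and `bin`; the results are plain atoms, so a cast is needed if you want them printed in a particular aura:

```
> (scan "ff" hex)
255
> `@ux`(scan "ff" hex)
0xff
> (scan "101" bin)
5
```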
## Recursive parsers
Naively attempting to write a recursive `rule`, i.e. like
```
> |-(;~(plug prn ;~(pose $ (easy ~))))
```
results in an error:
```
-find.,.+6
-find.,.+6
rest-loop
```
Here, [`+prn`](/docs/hoon/reference/stdlib/4i/#prn) is a `rule` used
to parse any printable character, and
[`+easy`](/docs/hoon/reference/stdlib/4f/#easy) builds a `rule` that always returns a
constant (`~` in this case) regardless of the input.
Thus some special sauce is required, the
[`+knee`](/docs/hoon/reference/stdlib/4f/#knee) function.
`+knee` takes in a noun that is the default value of the parser, typically given
as the bunt value of the type that the `rule` produces, as well as a gate that
accepts a `rule`. `+knee` produces a `rule` that implements any recursive calls
in the `rule` in a manner acceptable to the compiler. Thus the preferred manner
to write the above `rule` is as follows:
```hoon
|-(;~(plug prn ;~(pose (knee *tape |.(^$)) (easy ~))))
```
You may want to utilize the `~+` rune when writing recursive parsers to cache
the parser to improve performance. In the following section, we will be writing a recursive
parser making use of `+knee` and `~+` throughout.
# Parsing arithmetic expressions
In this section we will be applying what we have learned to write a parser for
arithmetic expressions in Hoon. That is, we will make a `rule` that takes in
`tape`s of the form `"(2+3)*4"` and returns `20` as a `@ud`.
We call a `tape` consisting of some consistent arithmetic string of numbers,
`+`, `*`, `(`, and `)` an _expression_. We wish to build a `rule` that takes in an
expression and returns the result of the arithmetic computation described by the
expression as a `@ud`.
To build a parser it is a helpful exercise to first describe its
[grammar](https://en.wikipedia.org/wiki/Parsing_expression_grammar). This has a
formal mathematical definition, but we will manage to get by here describing the grammar
for arithmetic expressions informally.
First let's look at the code we're going to use, and then dive into explaining
it. If you'd like to follow along, save the following as `expr-parse.hoon` in
your `gen/` folder.
```hoon
::  expr-parse: parse arithmetic expressions
::
|=  math=tape
|^  (scan math expr)
++  factor
  %+  knee  *@ud
  |.  ~+
  ;~  pose
    dem
    (ifix [pal par] expr)
  ==
++  term
  %+  knee  *@ud
  |.  ~+
  ;~  pose
    ((slug mul) tar ;~(pose factor term))
    factor
  ==
++  expr
  %+  knee  *@ud
  |.  ~+
  ;~  pose
    ((slug add) lus ;~(pose term expr))
    term
  ==
--
```
Informally, the grammar here is:
- A factor is either an integer or an expression surrounded by parentheses.
- A term is either a factor or a factor times a term.
- An expression is either a term or a term plus an expression.
### Factors, terms, and expressions
Our grammar consists of three `rule`s: one for factors, one for terms, and one
for expressions.
#### Factors
```hoon
++  factor
  %+  knee  *@ud
  |.  ~+
  ;~  pose
    dem
    (ifix [pal par] expr)
  ==
```
A _factor_ is either a decimal number or an expression surrounded by parentheses. Put
into Hoon terms, a decimal number is parsed by the `rule` `+dem` and an
expression is parsed by removing the surrounding parentheses and then passing
the result to the expression parser arm `+expr`, given by the `rule` `(ifix [pal par] expr)`. Since we want to parse our expression with one or the other, we
chain these two `rule`s together using the monadic applicator rune `;~` along
with `+pose`, which says to try each rule in succession until one of them works.
Since expressions ultimately reduce to factors, we are actually building a
recursive rule. Thus we need to make use of `+knee`. The first argument for
`+knee` is `*@ud`, since our final answer should be a `@ud`.
Then follows the definition of the gate utilized by `+knee`:
```hoon
|.  ~+
;~  pose
  dem
  (ifix [pal par] expr)
==
```
`~+` is used to cache the parser, so that it does not need to be computed over
and over again. Then follows the `rule` we described above.
#### Parsing expressions
An _expression_ is either a term plus an expression or a term.
In the case of a term plus an expression, we actually must compute what that equals. Thus we will
make use of [`+slug`](/docs/hoon/reference/stdlib/4f#slug), which parses a
delimited list into `tape`s separated by a given delimiter and then composes
them by folding with a binary gate. In this case, our delimiter is `+` and our
binary gate is `+add`. That is to say, we will split the input string into terms
and expressions separated by luses, parse each term and expression until they
reduce to a `@ud`, and then add them together. This is accomplished with the
`rule` `((slug add) lus ;~(pose term expr))`.
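To see `+slug` in isolation, here is a minimal sketch that uses plain `dem` in place of the recursive `term` and `expr` arms, so it only handles flat sums and products of decimal numbers:

```
> (scan "1+2+3" ((slug add) lus dem))
6
> (scan "2*3*4" ((slug mul) tar dem))
24
```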
If the above `rule` does not parse the expression, it must be a `term`, so the
`tape` is automatically passed to `+term` to be evaluated. Again we use `;~` and `pose` to
accomplish this:
```hoon
;~  pose
  ((slug add) lus ;~(pose term expr))
  term
==
```
The rest of the `+expr` arm is structured just like how `+factor` is, and for
the same reasons.
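The folding behaviour of `+slug` can be tried on its own in the dojo. This sketch uses plain `dem` as the element rule instead of the recursive `term` and `expr` arms, purely to keep it self-contained:

```
> (scan "1+2+3" ((slug add) lus dem))
6
> (scan "2*3*4" ((slug mul) tar dem))
24
```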
#### Parsing terms
A _term_ is either a factor times a term or a factor. This is handled the same
way as expressions: we just need to swap `lus` for `tar`, `add` for `mul`, and
`;~(pose term expr)` for `;~(pose factor term)`.
```hoon
++  term
  %+  knee  *@ud
  |.  ~+
  ;~  pose
    ((slug mul) tar ;~(pose factor term))
    factor
  ==
```
### Try it out
Let's feed some expressions to `+expr-parse` and see how it does.
```
> +expr-parse "3"
3
> +expr-parse "3+3"
6
> +expr-parse "3+3+(2*3)+(4+2)*(4+1)"
42
> +expr-parse "3+3+2*3"
12
```
As an exercise, add exponentiation (e.g. `2^3 = 8`) to `+expr-parse`.


@ -0,0 +1,566 @@
+++
title = "Sail (HTML)"
weight = 6
template = "doc.html"
+++
Sail is a domain-specific language for composing HTML (and XML) structures in
Hoon. Like everything else in Hoon, a Sail document is a noun, just one produced
by a specialized markup language within Hoon.
Front-ends for Urbit apps are often created and uploaded separately to the
rest of the code in the desk. Sail provides an alternative approach, where
front-ends can be composed directly inside agents.
This document will walk through the basics of Sail and its syntax.
## Basic example
It's easy to see how Sail can directly translate to HTML:
{% table %}
- Sail
- HTML
---
- ```
  ;html
    ;head
      ;title: My page
      ;meta(charset "utf-8");
    ==
    ;body
      ;h1: Welcome!
      ;p
        ; Hello, world!
        ; Welcome to my page.
        ; Here is an image:
        ;br;
        ;img@"/foo.png";
      ==
    ==
  ==
  ```
- ```
  <html>
    <head>
      <title>My page</title>
      <meta charset="utf-8" />
    </head>
    <body>
      <h1>Welcome!</h1>
      <p>Hello, world! Welcome to my
      page. Here is an image:
      <br />
      <img src="/foo.png" />
      </p>
    </body>
  </html>
  ```
{% /table %}
## Tags and Closing
In Sail, tag heads are written with the tag name prepended by `;`. Unlike in
HTML, there are different ways of closing tags, depending on the needs of the
tag. One of the nice things about Hoon is that you don't have to constantly
close expressions; Sail inherits this convenience.
### Empty
Empty tags are closed with a `;` following the tag. For example, `;div;` will be
rendered as `<div></div>`. Non-container tags `;br;` and `;img@"some-url";` in
particular will be rendered as a single tag like `<br />` and `<img src="some-url" />`.
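A quick way to check this is to render an empty tag in the dojo with `++en-xml:html` from `zuse.hoon`. The face names `d` and `b` below are arbitrary:

```
> =d ;div;
> (en-xml:html d)
"<div></div>"
> =b ;br;
> (en-xml:html b)
"<br />"
```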
### Filled
Filled tags are closed via line-break. To fill text inside, add `:` after the
tag name, then insert your plain text following a space. Example:
| Sail | HTML |
| ---------------- | -------------------- |
| `;h1: The title` | `<h1>The title</h1>` |
### Nested
To nest tags, simply create a new line. Nested tags need to be closed with `==`,
because they expect a list of sub-tags.
If we nest lines of plain text with no tag, the text will be wrapped in a
`<p>` tag. Additionally, any text with atom auras or `++arm:syntax` in such
plain text lines will be wrapped in `<code>` tags.
Example:
{% table %}
- Sail
- HTML
---
- ```
  ;body
    ;h1: Blog title
    This is some good content.
  ==
  ```
- ```
  <body>
    <h1>Blog title</h1>
    <p>This is some good content.</p>
  </body>
  ```
{% /table %}
If we want to write a string with no tag at all, then we can prepend
those untagged lines with `;` and then a space:
{% table %}
- Sail
- HTML
---
- ```
  ;body
    ;h1: Welcome!
    ; Hello, world!
    ; We're on the web.
  ==
  ```
- ```
  <body>
    <h1>Welcome!</h1>
    Hello, world!
    We're on the web.
  </body>
  ```
{% /table %}
## Attributes
Adding attributes is simple: just add the desired attribute between parentheses,
right after the tag name without a space. We separate different attributes of
the same node by using `,`.
Attributes can also be specified in tall form, with each key prefixed by `=`,
followed by two spaces, and then a tape with its value. These two styles are
shown below.
### Generic
{% table %}
- Form
- Example
---
- Wide
- ```
  ;div(title "a tooltip", style "color:red")
    ;h1: Foo
    foo bar baz
  ==
  ```
---
- Tall
- ```
  ;div
    =title  "a tooltip"
    =style  "color:red"
    ;h1: Foo
    foo bar baz
  ==
  ```
---
- HTML
- ```
  <div title="a tooltip" style="color:red">
    <h1>Foo</h1>
    <p>foo bar baz </p>
  </div>
  ```
{% /table %}
### IDs
Add `#` after tag name to add an ID:
| Sail | HTML |
| ------------------- | ----------------------------- |
| `;nav#header: Menu` | `<nav id="header">Menu</nav>` |
### Classes
Add `.` after tag name to add a class:
| Sail | HTML |
| ---------------------- | ---------------------------------- |
| `;h1.text-blue: Title` | `<h1 class="text-blue">Title</h1>` |
For class values containing spaces, you can add additional `.`s like so:
| Sail | HTML |
| ------------------- | --------------------------------- |
| `;div.foo.bar.baz;` | `<div class="foo bar baz"></div>` |
Otherwise, if your class value does not conform to the allowed `@tas`
characters, you must use the generic attribute syntax:
| Sail | HTML |
| ------------------------ | ----------------------------- |
| `;div(class "!!! !!!");` | `<div class="!!! !!!"></div>` |
### Images
Add `@` after the tag name to link your source:
| Sail | HTML |
| --------------------- | -------------------------- |
| `;img@"example.png";` | `<img src="example.png"/>` |
To add attributes to the image, like size specifications, add the desired
attribute after the `"` of the image name and before the final `;` of the `img`
tag like `;img@"example.png"(width "100%");`.
### Links
Add `/` after tag name to start an `href`.
{% table %}
- Sail
- HTML
---
- ```
  ;a/"urbit.org": A link to Urbit.org
  ```
- ```
  <a href="urbit.org">A link to Urbit.org</a>
  ```
{% /table %}
## Interpolation
The textual content of a tag, despite not being enclosed in double-quotes, is
actually a tape. This means it supports interpolated Hoon expressions in the
usual manner. For example:
{% table %}
- Sail
- HTML
---
- ```
  =|  =time
  ;p: foo {<time>} bar
  ```
- ```
  <p>foo ~2000.1.1 bar</p>
  ```
{% /table %}
Likewise:
{% table .w-full %}
- Sail
---
- ```
  =/  txt=tape  " bananas"
  ;article
    ;b: {(a-co:co (mul 42 789))}
    ; {txt}
    {<our>} {<now>} {<`@ux`(end 6 eny)>}
  ==
  ```
{% /table %}
{% table .w-full %}
- HTML
---
- ```
  <article>
    <b>33138</b> bananas
    <p>~zod ~2022.2.21..09.54.21..5b63 0x9827.99c7.06f4.8ef9</p>
  </article>
  ```
{% /table %}
## A note on CSS
The CSS for a page is usually quite large. The typical approach is to include a
separate arm in your agent (`++style` or the like) and write out the CSS in a
fenced cord block. You can then call `++trip` on it and include it in a style
tag. For example:
```hoon
++  style
  ^~
  %-  trip
  '''
  main {
    width: 100%;
    color: red;
  }
  header {
    color: blue;
    font-family: monospace;
  }
  '''
```
And then your style tag might look like:
```hoon
;style: {style}
```
A cord is used rather than a tape so you don't need to escape braces. The
[ketsig](/docs/hoon/reference/rune/ket#-ketsig) (`^~`) rune means `++trip` will
be run at compile time rather than call time.
## Types and marks
So far we've shown rendered HTML for demonstrative purposes, but Sail syntax
doesn't directly produce HTML text. Instead, it produces a
[$manx](/docs/hoon/reference/stdlib/5e#manx). This is a Hoon type used to
represent an XML hierarchical structure with a single root node. There are six
XML-related types defined in the standard library:
```hoon
+$  mane  $@(@tas [@tas @tas])                    ::  XML name+space
+$  manx  $~([[%$ ~] ~] [g=marx c=marl])          ::  dynamic XML node
+$  marl  (list manx)                             ::  XML node list
+$  mars  [t=[n=%$ a=[i=[n=%$ v=tape] t=~]] c=~]  ::  XML cdata
+$  mart  (list [n=mane v=tape])                  ::  XML attributes
+$  marx  $~([%$ ~] [n=mane a=mart])              ::  dynamic XML tag
```
More information about these can be found in [section 5e of the standard library
reference](/docs/hoon/reference/stdlib/5e).
You don't need to understand these types in order to write Sail. The main thing
to note is that a `$manx` is a node (a single tag) and its contents are a
[$marl](/docs/hoon/reference/stdlib/5e#marl), which is just a `(list manx)`.
### Rendering
A `$manx` can be rendered as HTML in a tape with the `++en-xml:html` function in
`zuse.hoon`. For example:
```
> ;p: foobar
[[%p ~] [[%$ [%$ "foobar"] ~] ~] ~]
> =x ;p: foobar
> (en-xml:html x)
"<p>foobar</p>"
> (crip (en-xml:html x))
'<p>foobar</p>'
```
### Sanitization
The `++en-xml:html` function will sanitize the contents of both attributes and
elements, converting characters such as `>` to HTML entities. For example:
```
> =z ;p(class "\"><script src=\"example.com/xxx.js"): <h1>FOO</h1>
> (crip (en-xml:html z))
'<p class="&quot;&gt;&lt;script src=&quot;example.com/xxx.js">&lt;h1&gt;FOO&lt;/h1&gt;</p>'
```
### Marks
There are a few different HTML and XML related marks, so it can be a bit
confusing. We'll look at the ones you're most likely to use.
#### `%html`
- Type: `@t`
This mark is used for HTML that has been printed as text in a cord. You may wish
to return this mark when serving pages to the web. To do so, you must run the
`$manx` produced by your Sail expressions through `++en-xml:html`, and then run
the resulting `tape` through `++crip`.
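As a minimal sketch of that conversion, the gate below takes a `$manx` and produces a cord suitable for the `%html` mark. The gate and its sample name `page` are illustrative and not part of any library:

```hoon
::  hypothetical helper: render a $manx as an HTML cord
|=  page=manx
^-  @t
(crip (en-xml:html page))
```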
#### `%hymn`
- Type: `$manx`
The `%hymn` mark is intended to be used for complete HTML documents - having an
`<html>` root element, `<head>`, `<body>`, etc. This isn't enforced on
the type level but it is assumed in certain mark conversion pathways. Eyre can
automatically convert a `%hymn` to printed `%html` if it was requested through
Eyre's scry interface.
#### `%elem`
- Type: `$manx`
The type of the `%elem` mark is a `$manx`, just like a `%hymn`. While `%hymn`s
are intended for complete HTML documents, `%elem`s are intended for more general
XML structures. You may wish to use an `%elem` mark if you're producing smaller
fragments of XML or HTML rather than whole documents. Like a `%hymn`, Eyre can
automatically convert it to `%html` if requested through its scry interface.
#### Summary
In general, if you're going to be composing web pages and serving them to web
clients, running the result of your Sail through `++en-xml:html`, `++crip`ping
it and producing `%html` is the most straightforward approach. If you might
want to pass around a `$manx` to other agents or ships which may wish to
manipulate it further, a `%hymn` or `%elem` is better.
## Sail Runes
In addition to the syntax so far described, there are also a few Sail-specific
runes:
### `;+` Miclus
The [miclus rune](/docs/hoon/reference/rune/mic#-miclus) makes a `$marl` from a
complex hoon expression that produces a single `$manx`. Its main use is nesting
tall-form hoon logic in another Sail element. For example:
```hoon
;p
  ;b: {(a-co:co number)}
  ; is an
  ;+  ?:  =(0 (mod number 2))
        ;b: even
      ;b: odd
  ; number.
==
```
Produces one of these depending on the value of `number`:
```
<p><b>2 </b>is an <b>even </b>number.</p>
```
```
<p><b>12345 </b>is an <b>odd </b>number.</p>
```
### `;*` Mictar
The [mictar rune](/docs/hoon/reference/rune/mic#-mictar) makes a `$marl` (a list
of XML nodes) from a complex hoon expression. This rune lets you add many
elements inside another Sail element. For example:
{% table %}
- Sail
- HTML
---
- ```
  =/  nums=(list @ud)  (gulf 1 9)
  ;p
    ;*  %+  turn  nums
        |=  n=@ud
        ?:  =(0 (mod n 2))
          ;sup: {(a-co:co n)}
        ;sub: {(a-co:co n)}
  ==
  ```
- ```
  <p>
    <sub>1</sub><sup>2</sup>
    <sub>3</sub><sup>4</sup>
    <sub>5</sub><sup>6</sup>
    <sub>7</sub><sup>8</sup>
    <sub>9</sub>
  </p>
  ```
{% /table %}
### `;=` Mictis
The [mictis rune](/docs/hoon/reference/rune/mic#-mictis) makes a `$marl` (a list
of XML nodes) from a series of `$manx`es. This is mostly useful if you want to
make the list outside of an element and then be able to insert it afterwards.
For example:
{% table %}
- Sail
- HTML
---
- ```
  =/  paras=marl
    ;=  ;p: First node.
        ;p: Second node.
        ;p: Third node.
    ==
  ;main
    ;*  paras
  ==
  ```
- ```
  <main>
    <p>First node.</p>
    <p>Second node.</p>
    <p>Third node.</p>
  </main>
  ```
{% /table %}
### `;/` Micfas
The [micfas rune](/docs/hoon/reference/rune/mic#-micfas) turns an ordinary tape
into a `$manx`. For example:
```
> %-  en-xml:html  ;/  "foobar"
"foobar"
```
In order to nest it inside another Sail element, it must be preceded by a
`;+` rune or similar; it cannot be used directly. For example:
```hoon
;p
  ;+  ;/  ?:  =(0 (mod eny 2))
            "even"
          "odd"
==
```
## Good examples
Here's a couple of agents that make use of Sail, which you can use as a
reference:
- [Pals by ~palfun-foslup][pals]
- [Gora by ~rabsef-bicrym][gora]
[pals]: https://github.com/Fang-/suite/blob/master/app/pals/webui/index.hoon
[gora]: https://github.com/dalten-collective/gora/blob/master/sail/app/gora/goraui/index.hoon


@ -0,0 +1,531 @@
+++
title = "Strings"
weight = 4
template = "doc.html"
+++
This document discusses hoon's two main string types: `cord`s (as well as their
subsets `knot` and `term`) and `tape`s. The focus of this
document is on their basic properties, syntax and the most common text-related
functions you'll regularly encounter. In particular, it discusses conversions
and the encoding/decoding of atom auras in strings.
Hoon has a system for writing more elaborate functional parsers, but that is not
touched on here. Instead, see the [Parsing](/docs/hoon/guides/parsing) guide.
Hoon also has a type for UTF-32 strings, but those are rarely used and not
discussed in this document.
There are a good deal more text manipulation functions than are discussed here.
See the [Further Reading](#further-reading) section for details.
## `tape`s vs. text atoms
As mentioned, urbit mainly deals with two kinds of strings: `tape`s and
`cord`/`knot`/`term`s. The former is a list of individual UTF-8 characters.
The latter three encode UTF-8 strings in a single atom.
Cords may contain any UTF-8 characters, while `knot`s and `term`s only allow a
smaller subset. Each of these are discussed below in the [Text
atoms](#text-atoms) section.
Text atoms like `cord`s are more efficient to store and move around. They are
also more efficient to manipulate with simple bitwise operations. Their downside
is that UTF-8 characters vary in their byte-length. ASCII characters are all
8-bit, but others can occupy up to four bytes. Accounting for this variation in
character size can complicate otherwise simple functions. Tapes, on the other
hand, don't have this problem because each character is a separate item in the
list, regardless of its byte-length. This fact makes it much easier to process
tapes in non-trivial ways with simple list functions.
In light of this, a general rule of thumb is to use cords for simple things like
storing chat messages or exchanging them over the network. If text requires
complex processing on the other hand, it is generally easier with tapes. Note
there _are_ cord manipulation functions in the standard library, so you needn't
always convert cords to tapes for processing, it just depends on the case.
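As a small illustration of the convert, process, convert back pattern, the dojo lines below turn a cord into a tape with `+trip`, apply an ordinary tape or list function (`+flop` to reverse, `+cuss` to upper-case), and then pack the result back into a cord with `+crip`:

```
> (crip (flop (trip 'foobar')))
'raboof'
> (crip (cuss (trip 'foobar')))
'FOOBAR'
```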
Next we'll look at these different types of strings in more detail.
## Text atoms
### `cord`
A [`cord`](/docs/hoon/reference/stdlib/2q#cord) has an aura of `@t`. It denotes
UTF-8 text encoded in an atom, little-endian. That is, the first character in
the text is the least-significant byte. A cord may contain any UTF-8 characters,
there are no restrictions.
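The byte order can be seen by casting a cord to hexadecimal in the dojo. Here `a` is `0x61`, `b` is `0x62` and `c` is `0x63`, and the first character ends up in the lowest byte:

```
> `@ux`'abc'
0x63.6261
> `@t`0x63.6261
'abc'
```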
The `hoon` syntax for a cord is some text wrapped in single-quotes like:
```hoon
'This is a cord!'
```
Single-quotes and backslashes must be escaped with a backslash like:
```hoon
'\'quotes\' \\backslashes\\'
```
Characters can also be entered as hex, they just have to be escaped by a
backslash. For example, `'\21\21\21'` will render as `'!!!'`. This is useful for
entering special characters such as line breaks like `'foo\0abar'`.
Cords divided over multiple lines are allowed. There are two ways to do this.
The first is to start and end with three single-quotes like:
```hoon
'''
foo
bar
baz
'''
```
The line endings will be encoded Unix-style as line feed characters like:
```hoon
'foo\0abar\0abaz'
```
The second is to begin with a single-quote like usual, then break the line by
ending it with a backslash and start the next line with a forward-slash like:
```hoon
'foo\
/bar\
/baz'
```
This will be parsed to:
```hoon
'foobarbaz'
```
### `knot`
A [`knot`](/docs/hoon/reference/stdlib/2q#knot) has an aura of `@ta`, and is a
subset of a [`cord`](#cord). It allows lower-case letters, numbers, and four
special characters: Hyphen, tilde, underscore and period. Its restricted set of
characters is intended to be URL-safe.
The `hoon` syntax for a knot is a string containing any of the aforementioned
characters prepended with `~.` like:
```hoon
~.abc-123.def_456~ghi
```
### `term`
A [`term`](/docs/hoon/reference/stdlib/2q#term) has an aura of `@tas`, and is a
subset of a [`knot`](#knot). It only allows lower-case letters, numbers, and
hyphens. Additionally, the first character cannot be a hyphen or number. This is
a very restricted text atom, and is intended for naming data structures and the
like.
The `hoon` syntax for a term is a string conforming to the prior description,
prepended with a `%` like:
```hoon
%foo-123
```
#### A note about `term` type inference
There is actually an even more restricted text atom form with the same `%foo`
syntax as a term, where the type of the text is the text itself. For example, in
the dojo:
```
> `%foo`%foo
%foo
```
The hoon parser will, by default, infer the type of `%foo`-style syntax this
way. If we try with the dojo type printer:
```
> ? %foo
%foo
%foo
```
This type-as-itself is used for many things, such as unions like:
```hoon
?(%foo %bar %baz)
```
In order to give `%foo` the more generic `@tas` aura, it must be explicitly
upcast like:
```
> ? `@tas`%foo
@tas
%foo
```
This is something to be wary of. For example, if you wanted to form a `(set @tas)` you might think to do:
```hoon
(silt (limo ~[%foo %bar %baz]))
```
However, this will actually form a set of the union `?(%foo %bar %baz)` due to
the specificity of type inference:
{% customFence %}
\> ? (silt (limo ~[%foo %bar %baz]))
?(%~ [?(n=%bar n=%baz n=%foo) l=nlr(?(%bar %baz %foo)) r=nlr(?(%bar %baz %foo))])
[n=%baz l&#x7B;&#x25;bar} r=&#x7B;&#x25;foo}]
{% /customFence %}
One further note about the type-as-itself form: Occasionally you may wish to
form a union of strings which contain characters disallowed in `term`s. To get
around this, you can enclose the text after the `%` with single-quotes like
`%'HELLO!'`.
### Aura type validity
The hoon parser will balk at `cord`s, `knot`s and `term`s containing invalid
characters. However, because they're merely auras, any atom can be cast to them.
When cast (or clammed), they will **not** be validated in terms of whether the
characters are allowed in the specified aura.
For example, you can do this:
```
> `@tas`'!%* $@&'
%!%* $@&
```
This means you cannot rely on mere aura-casting if you need the text to conform
to the specified aura's restrictions. Instead, there are a couple of functions in
the standard library to check text aura validity:
[`+sane`](/docs/hoon/reference/stdlib/4b#sane) and
[`+sand`](/docs/hoon/reference/stdlib/4b#sand).
The `+sane` function takes an argument of either `%ta` or `%tas` to validate
`@ta` and `@tas` respectively (you can technically give it `%t` for `@t` too but
there's no real point). It will return `%.y` if the given atom is valid for the
given aura, and `%.n` if it isn't. For example:
```
> ((sane %tas) 'foo')
%.y
> ((sane %tas) 'foo!')
%.n
```
The `+sand` function does the same thing, but rather than returning a `?` it
returns a `unit` of the given atom, or `~` if validation failed. For example:
```
> `(unit @tas)`((sand %tas) 'foo')
[~ %foo]
> `(unit @tas)`((sand %tas) 'foo!')
~
```
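One way this might be used in practice is a small gate that only yields a `@tas` when the text actually validates. This is a sketch under the assumption that the input arrives as a tape; the gate and the sample name `txt` are illustrative:

```hoon
::  hypothetical helper: tape -> (unit @tas), null if not a valid term
|=  txt=tape
^-  (unit @tas)
((sand %tas) (crip txt))
```

The caller can then branch on the `unit` rather than trusting a bare aura cast.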
## `tape`
A [`tape`](/docs/hoon/reference/stdlib/2q#tape) is the other
main string type in hoon. Rather than a single atom, it's instead a list of
individual `@tD` characters (the `D` specifies a bit-length of 8, see the
[Auras](/docs/hoon/reference/auras#bitwidth) documentation for
details). The head of the list is the first character in the string.
The `hoon` syntax for a tape is some text wrapped in double-quotes like:
```hoon
"This is a tape!"
```
Double-quotes, backslashes and left-braces must be escaped by a backslash
character:
```hoon
"\"double-quotes\" \\backslash\\ left-brace:\{"
```
Like with `cord`s, characters can also be entered as hex escaped by a backslash
so `"\21\21\21"` renders as `"!!!"`.
Tapes divided over multiple lines are allowed. Unlike [`cord`](#cord)s, there is
only one way to do this, which is by starting and ending with three
double-quotes like:
```hoon
"""
foo
bar
baz
"""
```
The line endings will be encoded Unix-style as line feed characters like:
```hoon
"foo\0abar\0abaz"
```
As mentioned earlier, tapes are lists of single characters:
```
> `tape`~['f' 'o' 'o']
"foo"
```
This means they can be manipulated with ordinary list functions:
```
> `tape`(turn "foobar" succ)
"gppcbs"
```
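A few more list functions from the standard library applied to tapes, as a quick dojo sketch: `+lent` for length, `+snag` for indexing, and `+weld` for concatenation (cast back to `tape` so it prints as a string):

```
> (lent "foobar")
6
> (snag 0 "foobar")
'f'
> `tape`(weld "foo" "bar")
"foobar"
```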
### Interpolation
Tapes, unlike cords, allow string interpolation. Arbitrary `hoon` may be
embedded in the tape syntax and its product will be included in the resulting
tape. There are two ways to do it:
#### Manual
In the first case, the code to be evaluated is enclosed in braces. The type of
the product of the code must itself be a tape. For example, if the `@p` of our
ship is stored in `our`, simply doing `"{our}"` will fail because its type will
be `@p` rather than `tape`. Instead, we must explicitly use the
[`+scow`](/docs/hoon/reference/stdlib/4m#scow) function to
render `our` as a tape:
```
> "{(scow %p our)}"
"~zod"
```
Another example:
```
> "[{(scow %p our)} {(scow %da now)}]"
"[~zod ~2021.10.3..08.59.10..2335]"
```
#### Automatic
Rather than having to manually render data as a `tape`, angle brackets _inside_
the braces tell the interpreter to automatically pretty-print the product of the
expression as a tape. This way we needn't use functions like `+scow` and can
just reference things like `our` directly:
```
> "{<our>}"
"~zod"
```
Another example:
```
> "{<(add 1 2)>}"
"3"
```
And another:
```
> "{<our now>}"
"[~zod ~2021.10.3..09.01.14..1654]"
```
## Conversions
Tapes can easily be converted to cords and vice versa. There are two stdlib
functions for this purpose: [`+crip`](/docs/hoon/reference/stdlib/4b#crip) and
[`+trip`](/docs/hoon/reference/stdlib/4b#trip). The former converts a `tape` to
a `cord` and the latter does the opposite. For example:
```
> (crip "foobar")
'foobar'
> (trip 'foobar')
"foobar"
```
Knots and terms can also be converted to tapes with `+trip`:
```
> (trip %foobar)
"foobar"
> (trip ~.foobar)
"foobar"
```
Likewise, the output of `+crip` can be cast to a knot or term:
```
> `@tas`(crip "foobar")
%foobar
> `@ta`(crip "foobar")
~.foobar
> `@tas`(need ((sand %tas) (crip "foobar")))
%foobar
```
## Encoding in text
It's common to encode atoms in cords or knots, particularly when constructing a
[scry](/docs/arvo/concepts/scry) [`path`](/docs/hoon/reference/stdlib/2q#path)
or just a `path` in general. There are two main functions for this purpose:
[`+scot`](/docs/hoon/reference/stdlib/4m#scot) and
[`+scow`](/docs/hoon/reference/stdlib/4m#scow). The former produces a `knot`,
and the latter produces a `tape`. Additionally, there are two more functions for
encoding `path`s in cords and tapes respectively:
[`+spat`](/docs/hoon/reference/stdlib/4m#spat) and
[`+spud`](/docs/hoon/reference/stdlib/4m#spud).
### `+scot` and `+spat`
`+scot` encodes atoms of various auras in a `knot` (or `cord`/`term` with
casting). It takes two arguments: the aura in a `@tas` and the atom to be
encoded. For example:
```
> (scot %p ~zod)
~.~zod
> (scot %da now)
~.~2021.10.4..07.35.54..6d41
> (scot %ux 0xaa.bbbb)
~.0xaa.bbbb
```
Note the aura of the atom needn't actually match the specified aura:
```
> (scot %ud ~zod)
~.0
```
Hoon can of course be evaluated in its arguments as well:
```
> (scot %ud (add 1 1))
~.2
```
You'll most commonly see this used in constructing a `path` like:
```
> /(scot %p our)/garden/(scot %da now)/foo/(scot %ud 123.456)
[~.~zod %garden ~.~2021.10.4..07.43.14..a556 %foo ~.123.456 ~]
> `path`/(scot %p our)/garden/(scot %da now)/foo/(scot %ud 123.456)
/~zod/garden/~2021.10.4..07.43.23..9a0f/foo/123.456
```
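The same pattern can be wrapped in a gate. The sketch below builds a `/ship/desk/case` style prefix; the gate and its sample names are assumptions for illustration only:

```hoon
::  hypothetical helper: build a /ship/desk/case prefix as a path
|=  [our=@p desk=@tas now=@da]
^-  path
/(scot %p our)/[desk]/(scot %da now)
```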
`+spat` simply encodes a `path` in a cord like:
```
> (spat /foo/bar/baz)
'/foo/bar/baz'
```
### `+scow` and `+spud`
`+scow` is the same as [`+scot`](#scot-and-spat) except it produces a tape
rather than a knot. For example:
```
> (scow %p ~zod)
"~zod"
> (scow %da now)
"~2021.10.4..07.45.25..b720"
> (scow %ux 0xaa.bbbb)
"0xaa.bbbb"
```
`+spud` simply encodes a `path` in a tape:
```
> (spud /foo/bar/baz)
"/foo/bar/baz"
```
## Decoding from text
For decoding atoms of particular auras encoded in cords, there are three
functions: [`+slat`](/docs/hoon/reference/stdlib/4m#slat),
[`+slav`](/docs/hoon/reference/stdlib/4m#slav), and
[`+slaw`](/docs/hoon/reference/stdlib/4m#slaw). Additionally, there is
[`+stab`](/docs/hoon/reference/stdlib/4m#stab) for decoding a cord to a path.
`+slav` parses the given cord with the aura specified as a `@tas`, crashing if
the parsing failed. For example:
```
> `@da`(slav %da '~2021.10.4..11.26.54')
~2021.10.4..11.26.54
> `@p`(slav %p '~zod')
~zod
> (slav %p 'foo')
dojo: hoon expression failed
```
`+slaw` is like `+slav` except it produces a `unit` which is null if parsing
failed, rather than crashing. For example:
```
> `(unit @da)`(slaw %da '~2021.10.4..11.26.54')
[~ ~2021.10.4..11.26.54]
> `(unit @p)`(slaw %p '~zod')
[~ ~zod]
> (slaw %p 'foo')
~
```
`+slat` is a curried version of `+slaw`, meaning it's given the aura and
produces a new gate which takes the actual cord. For example:
```
> `(unit @da)`((slat %da) '~2021.10.4..11.26.54')
[~ ~2021.10.4..11.26.54]
> `(unit @p)`((slat %p) '~zod')
[~ ~zod]
> ((slat %p) 'foo')
~
```
Finally, `+stab` parses a cord containing a path to a `path`. For example:
```
> (stab '/foo/bar/baz')
/foo/bar/baz
```
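Encoding and decoding are inverses of each other, which can be checked with a quick round trip in the dojo:

```
> (slav %ud (scot %ud 123))
123
> (stab (spat /foo/bar/baz))
/foo/bar/baz
```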
## Further reading
- [Parsing](/docs/hoon/guides/parsing) - A guide to writing fully-fledged
functional parsers in hoon.
- [Auras](/docs/hoon/reference/auras) - Details of auras in hoon.
- [stdlib 2b: List logic](/docs/hoon/reference/stdlib/2b) - Standard library
functions for manipulating lists, which are useful for dealing with tapes.
- [stdlib 2q: Molds and Mold-builders](/docs/hoon/reference/stdlib/2q) - Several
text types are defined in this section of the standard library.
- [stdlib 4b: Text processing](/docs/hoon/reference/stdlib/4b) - Standard
library functions for manipulating and converting tapes and strings.
- [stdlib 4m: Formatting functions](/docs/hoon/reference/stdlib/4m) - Standard
library functions for encoding and decoding atom auras in strings.


@ -0,0 +1,115 @@
+++
title = "Unit tests"
weight = 10
template = "doc.html"
+++
## Structure
The `%base` desk includes a `-test` thread which can run unit tests you've
written. A test is a Hoon file which produces a `core`. The `-test` thread will
look for any arms in the core whose names begin with `test-`, e.g.:
```hoon
|%
++  test-foo
  ...
++  test-bar
  ...
++  test-foo-bar
  ...
--
```
Any arms that don't begin with `test-` will be ignored. Each `test-` arm must
produce a `tang` (a `(list tank)`). If the `tang` is empty (`~`), it indicates
success. If the `tang` is non-empty, it indicates failure, and the contents of
the `tang` is the error message.
To make test-writing easier, the `%base` desk includes the `/lib/test.hoon`
library which you can import into your test file. The library contains four
functions which all produce `tang`s:
- `expect-eq` - test whether an expression produces the expected value. This
function takes `[expected=vase actual=vase]`, comparing `.expected` to
`.actual`.
- `expect` - test whether an expression produces `%.y`. This function takes a
`vase` containing the result to check.
- `expect-fail` - tests whether the given `trap` crashes, failing if it succeeds.
- `category` - this is a utility that prepends an error message to a failed test
(non-null `tang`), passing through an empty `tang` (successful test)
unchanged.
The most commonly used function is `expect-eq`, which is used like:
```hoon
++  test-foo
  %+  expect-eq
    !>  'the result I expect'
  !>  (function-i-want-to-test 'some argument')
```
Of course, you'll want to test something else you've written rather than just
expressions in the test file itself. To do that, you'd just import the file with
`/=` or a similar Ford rune, and then call its functions in the test arms.
You're free to do any compositions, import types, etc, as long as the file
ultimately produces a `core` with `test-*` arms.
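Putting these pieces together, a complete test file might look like the following minimal sketch. It assumes `/lib/test.hoon` is available on the desk, and the arm names are made up for illustration; it tests standard library functions rather than an imported file:

```hoon
/+  *test
|%
::  passes: (add 2 3) produces the expected 5
++  test-add
  %+  expect-eq
    !>  5
  !>  (add 2 3)
::  passes: decrementing 0 crashes, which is what +expect-fail checks for
++  test-dec-crash
  %-  expect-fail
  |.  (dec 0)
--
```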
## Running
The `-test` thread takes a `(list path)` in the Dojo, where each path is a path
to a test file. The `path` _must_ include the full path prefix
(`/[ship]/[desk]/[case]`). The `path` _may_ omit the mark, since a `.hoon` file
is assumed. The `path` _may_ include the name of a test arm after the filename.
In that case, only the specified test arm will be run.
The conventional location for tests is a `/tests`
directory in the root of a desk.
The output of the `-test` thread will note which arms were tested and whether
they succeeded. It will also include:
- The number of microseconds it took to execute each test arm.
- A `?` specifying whether all tests succeeded.
- A message confirming the file was built successfully.
Here's an example of running the tests for the `naive.hoon` library:
```
> -test %/tests/lib/naive ~
built /tests/lib/naive/hoon
> test-zod-spawn-to-zero: took 81359µs
OK /lib/naive/test-zod-spawn-to-zero
> test-zod-spawn-proxy: took 128125µs
OK /lib/naive/test-zod-spawn-proxy
.............................
....truncated for brevity....
.............................
> test-approval-for-all: took 647403µs
OK /lib/naive/test-approval-for-all
> test-address-padding: took 75104µs
OK /lib/naive/test-address-padding
ok=%.y
```
Here's an example of running just a single test for `naive.hoon`, the
`++test-deposit` arm:
```
> -test %/tests/lib/naive/test-deposit ~
built /tests/lib/naive/hoon
> test-deposit: took ms/45.542
OK /lib/naive/test-deposit
ok=%.y
```
## More info
A good reference example is the test file for the `/lib/number-to-words.hoon`
library, located in `/tests/lib/number-to-words.hoon`. Note that the `/tests`
directory is not typically included in standard pills. If you want to have a
look at existing tests as a reference, you may need to clone the `urbit/urbit`
repo on [GitHub](https://github.com/urbit/urbit).
If you write tests for some of your code, you may wish to exclude the `/tests`
directory from the production version of your desk.


@ -0,0 +1,27 @@
+++
title = "Threads"
weight = 50
sort_by = "weight"
template = "sections/docs/chapters.html"
insert_anchor_links = "right"
+++
## [Overview](/docs/userspace/threads/overview)
Overview of threads.
## [Basics](/docs/userspace/threads/basics/)
Tutorial and explanation of the basics of writing threads.
## [Gall](/docs/userspace/threads/gall/)
Tutorial and explanation of interacting with threads from agents.
## [Examples](/docs/userspace/threads/examples/)
A collection of example threads and how-tos.
## [Reference](/docs/userspace/threads/reference)
Basic reference for interacting with threads via the Gall agent Spider.


@ -0,0 +1,29 @@
+++
title = "Basics"
weight = 10
sort_by = "weight"
template = "sections/docs/chapters.html"
insert_anchor_links = "right"
+++
Tutorial and explanation of the basics of writing threads.
## [Fundamentals](/docs/userspace/threads/basics/fundamentals)
Basic explanation of threads and usage of the strand arms `form` and `pure`.
## [Bind](/docs/userspace/threads/basics/bind)
Using micgal (`;<`) and the strand arm `bind` to chain strands together.
## [Input](/docs/userspace/threads/basics/input)
Information on what a strand takes.
## [Output](/docs/userspace/threads/basics/output)
Information on what a strand produces.
## [Summary](/docs/userspace/threads/basics/summary)
Summary of this information.


@ -0,0 +1,97 @@
+++
title = "Bind"
weight = 2
template = "doc.html"
+++
Having looked at `form` and `pure`, we'll now look at the last `strand` arm `bind`. Bind is typically used in combination with micgal (`;<`).
## Micgal
Micgal takes four arguments like `spec hoon hoon hoon`. Given `;< a b c d`, it composes them like `((b ,a) c |=(a d))`. So, for example, these two expressions are equivalent:
```hoon
;<  ~  bind:m  (sleep:strandio ~s2)
(pure:m !>(~))
```
and
```hoon
((bind:m ,~) (sleep:strandio ~s2) |=(~ (pure:m !>(~))))
```
Micgal exists simply for readability. The above isn't too bad, but consider this:
```hoon
;<  a  b  c
;<  d  e  f
;<  g  h  i
j
```
...as opposed to this monstrosity: `((b ,a) c |=(a ((e ,d) f |=(d ((h ,g) i |=(g j))))))`
## bind
Bind by itself must be specialised like `(bind:m ,<type>)` and it takes two arguments:
- The first argument is a function that returns the `form` of a strand which produces `<type>`.
- The second argument is a gate whose sample is `<type>` and which returns a `form`.
Since you'll invariably use it in conjunction with micgal, the `<type>` in `;< <type> bind:m ...` will both specialise `bind` and specify the gate's sample.
Bind calls the first function then, if it succeeded, calls the second gate with the result of the first as its sample. If the first function failed, it will instead just return an error message and not bother calling the next gate. So it's essentially "strand A then strand B".
Since the second gate may itself contain another `;< <type> bind:m ...`, you can see how this allows you to glue together an arbitrarily large pipeline, where subsequent gates depend on the previous ones.
## strandio
`/lib/strandio/hoon` contains a large collection of useful, ready-made functions for use in threads. For example:
- `sleep` waits for the specified time.
- `get-time` gets the current time.
- `poke` pokes an agent.
- `watch` subscribes to an agent.
- `fetch-json` produces the JSON at a particular URL.
- `retry` tries a strand repeatedly with exponential backoff until it succeeds.
- `start-thread` starts another thread.
- `send-raw-card` sends any card.
...and many more.
## Putting it together
Here's a simple thread with a couple of `strandio` functions:
```hoon
/-  spider
/+  strandio
=,  strand=strand:spider
^-  thread:spider
|=  arg=vase
=/  m  (strand ,vase)
^-  form:m
;<  t=@da   bind:m  get-time:strandio
;<  s=ship  bind:m  get-our:strandio
(pure:m !>([s t]))
```
Save it as `/ted/mythread.hoon` in the `%base` desk, `|commit %base`, and run it with `-mythread`. You should see something like:
```
> -mythread
[~zod ~2021.3.8..14.52.15..bdfe]
```
## Analysis
To use `strandio` functions we've imported the library with `/+ strandio`.
`get-time` and `get-our` get the current time & ship from the bowl in `strand-input`. We'll discuss `strand-input` in more detail later.
Note how we've specified the face and return type of each strand like `t=@da`, etc.
You can see how `pure` has access to the results of previous strands in the pipeline. Note how we've wrapped `pure`'s argument in a `!>` because the thread must produce a `vase`.
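To make the connection between micgal and `bind` concrete, here is a sketch of the same thread with each `;<` written out as an explicit call to `bind`. It assumes the desugaring described in the Micgal section; the faces `t` and `s` are attached in the gate samples rather than in the mold given to `bind`, so treat it as roughly equivalent rather than the exact expansion the compiler produces.

```hoon
/-  spider
/+  strandio
=,  strand=strand:spider
^-  thread:spider
|=  arg=vase
=/  m  (strand ,vase)
^-  form:m
::  same idea as:  ;<  t=@da  bind:m  get-time:strandio
%+  (bind:m ,@da)  get-time:strandio
|=  t=@da
::  same idea as:  ;<  s=ship  bind:m  get-our:strandio
%+  (bind:m ,ship)  get-our:strandio
|=  s=ship
(pure:m !>([s t]))
```

Each gate closes over the results of the strands before it, which is why `pure` can see both `s` and `t`.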
Next we'll look at `strand-input` in more detail.

View File

@ -0,0 +1,168 @@
+++
title = "Fundamentals"
weight = 1
template = "doc.html"
+++
## Introduction
A thread is like a transient gall agent. Unlike an agent, it can end and it can fail. The primary uses for threads are:
1. Complex IO, like making a bunch of external API calls where each call depends on the last. Doing this in an agent significantly increases its complexity and the risk of a mishandled intermediary state corrupting permanent state. If you spin the IO out into a thread, your agent only has to make one call to the thread and receive one response.
2. Testing - threads are very useful for writing complex tests for your agents.
Threads are managed by the gall agent called `spider`.
## Thread location
Threads live in the `ted` directory of each desk. For example, in a desk named `%sandbox`:
```
%sandbox
├──app
├──gen
├──lib
├──mar
├──sur
└──ted <-
   ├──foo
   │  └──bar.hoon
   └──baz.hoon
```
From the dojo, `ted/baz.hoon` can be run with `-sandbox!baz`, and `ted/foo/bar.hoon` with `-sandbox!foo-bar`. Threads in the `%base` desk can just be run like `-foo`, but all others must have the format `-desk!thread`.
**NOTE:** When the dojo sees the `-` prefix it automatically handles creating a thread ID, composing the argument, poking the `spider` gall agent and subscribing for the result. Running a thread from another context (eg. a gall agent) requires doing these things explicitly and is outside the scope of this particular tutorial.
## Libraries and Structures
There are three files that matter:
- `/sur/spider/hoon` - this contains a few simple structures used by spider. It's not terribly useful except it imports libstrand, so you'll typically get `strand` from `spider`.
- `/lib/strand/hoon` - this contains all the main functions and structures for strands (a thread is a running strand), and you'll refer to this fairly frequently.
- `/lib/strandio/hoon` - this contains a large collection of ready-made functions for use in threads. You'll likely use many of these when you write threads, so it's very useful.
## Thread definition
`/sur/spider/hoon` defines a thread as:
```hoon
+$  thread  $-(vase _*form:(strand ,vase))
```
That is, a gate which takes a `vase` and returns the `form` of a `strand` that produces a `vase`. This is a little confusing and we'll look at each part in detail later. For now, note that the thread doesn't just produce a result, it actually produces a strand that takes input and produces output from which a result can be extracted. It works something like this:
![thread diagram](https://storage.googleapis.com/media.urbit.org/site/thread-diagram.png "diagram of a thread")
This is because a thread typically does a bunch of I/O, so it can't just produce a result immediately and end. Instead, the strand gets some input, produces output, gets new input, produces new output, and so forth, until it eventually produces a `%done` with the actual final result.
## Strands
Strands are the building blocks of threads. A thread will typically compose multiple strands.
A strand is a function of `strand-input:strand -> output:strand` and is defined in `/lib/strand/hoon`. You can see the details of `strand-input` [here](https://github.com/urbit/urbit/blob/master/pkg/arvo/lib/strand.hoon#L2-L21) and `output:strand` [here](https://github.com/urbit/urbit/blob/master/pkg/arvo/lib/strand.hoon#L23-L48). At this stage you don't need to know the nitty-gritty but it's helpful to have a quick look through. We'll discuss these things in more detail later.
A strand is a core that has three important arms:
- `form` - the mold of the strand
- `pure` - produces a strand that does nothing except return a value
- `bind` - monadic bind, like `then` in JavaScript promises
We'll discuss each of these arms later.
A strand must be specialised to produce a particular type like `(strand ,<type>)`. As previously mentioned, a `thread` produces a `vase` so is specialised like `(strand ,vase)`. Within your thread you'll likely compose multiple strands which produce different types like `(strand ,@ud)`, `(strand ,[path cage])`, etc, but the thread itself will always come back to a `(strand ,vase)`.
Strands are conventionally given the face `m` like:
```hoon
=/  m  (strand ,vase)
...
```
**NOTE:** a comma prefix as in `,vase` is the irregular form of `^:`, which produces a gate that returns its sample if it's of the correct type, but crashes otherwise.
## Form and Pure
### `form`
The `form` arm is the mold of the strand, suitable for casting. The two other arms produce `form`s so you'll cast everything to this like:
```hoon
=/  m  (strand ,@ud)
^-  form:m
...
```
### `pure`
Pure produces a strand that does nothing except return a value. So, `(pure:(strand ,@tas) %foo)` is a strand that produces `%foo` without doing any IO.
We'll cover `bind` later.
## A trivial thread
```hoon
/-  spider
=,  strand=strand:spider
^-  thread:spider
|=  arg=vase
=/  m  (strand ,vase)
^-  form:m
(pure:m arg)
```
The above code is a simple thread that just returns its argument, and it's a good boilerplate to start from.
Save the above code as `ted/mythread.hoon` in the `%base` desk and `|commit %base`. Run it with `-mythread 'foo'` and you should see the following:
```
> -mythread 'foo'
[~ 'foo']
```
**NOTE:** The dojo wraps arguments in a unit, which is why the result is `[~ 'foo']` rather than just `'foo'`.
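If you would rather work with the bare argument, one option is to extract the unit from the vase and unwrap it before producing a result. The following is a minimal sketch, assuming the argument is a cord (`@t`); the `'no argument'` fallback is purely illustrative.

```hoon
/-  spider
=,  strand=strand:spider
^-  thread:spider
|=  arg=vase
=/  m  (strand ,vase)
^-  form:m
::  the dojo hands us a (unit @t): ~ when no argument was given
=/  uarg  !<((unit @t) arg)
?~  uarg
  (pure:m !>('no argument'))
::  unwrap the unit and return just the cord
(pure:m !>(u.uarg))
```

Saved over `ted/mythread.hoon`, running `-mythread 'foo'` should then print the cord on its own rather than the unit.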
## Analysis
We'll go through the trivial thread line by line.
```hoon
/-  spider
=,  strand=strand:spider
```
First we import `/sur/spider/hoon`, which includes `/lib/strand/hoon`, and give the latter the face `strand` for convenience.
```hoon
^-  thread:spider
```
We make it a thread by casting it to `thread:spider`.
```hoon
|=  arg=vase
```
We create a gate that takes a vase, the first part of the previously mentioned thread definition.
```hoon
=/  m  (strand ,vase)
```
Inside the gate we create our `strand` specialised to produce a `vase` and give it the canonical face `m`.
```hoon
^-  form:m
```
We cast the output to `form` - the mold of the strand we created.
```hoon
(pure:m arg)
```
Finally we call `pure` with the gate input `arg` as its argument. Since `arg` is a `vase` it will return the `form` of a `strand` which produces a `vase`. Thus we've created a thread in accordance with its type definition.
Next we'll look at the third arm of a strand: `bind`.

View File

@ -0,0 +1,145 @@
+++
title = "Input"
weight = 3
template = "doc.html"
+++
The input to a `strand` is defined in `/lib/strand/hoon` as:
```hoon
+$  strand-input  [=bowl in=(unit input)]
```
When a thread is first started, spider will populate the `bowl` and provide it along with an `input` of `~`. If/when new input comes in (such as a poke, sign or watch) it will provide a new updated bowl along with the new input.
For example, here's a thread that gets the time from the bowl, runs an IO-less function that takes one or two seconds to compute, and then gets the time again:
```hoon
/-  spider
/+  *strandio
=,  strand=strand:spider
|%
++  ackermann
  |=  [m=@ n=@]
  ?:  =(m 0)  +(n)
  ?:  =(n 0)  $(m (dec m), n 1)
  $(m (dec m), n $(n (dec n)))
--
^-  thread:spider
|=  arg=vase
=/  m  (strand ,vase)
^-  form:m
;<  t1=@da  bind:m  get-time
=/  ack  (ackermann 3 8)
;<  t2=@da  bind:m  get-time
(pure:m !>([t1 t2]))
```
Since it never does any IO, `t1` and `t2` are the same: `[~2021.3.17..07.47.39..e186 ~2021.3.17..07.47.39..e186]`. However, if we replace the ackermann function with a 2 second `sleep` from strandio:
```hoon
/-  spider
/+  *strandio
=,  strand=strand:spider
^-  thread:spider
|=  arg=vase
=/  m  (strand ,vase)
^-  form:m
;<  t1=@da  bind:m  get-time
;<  ~       bind:m  (sleep ~s2)
;<  t2=@da  bind:m  get-time
(pure:m !>([t1 t2]))
```
...and run it again we get different values for `t1` and `t2`: `[~2021.3.17..07.50.28..8a5d ~2021.3.17..07.50.30..8a66]`. This is because `sleep` gets a `%wake` sign back from `behn`, so spider updates the time in the bowl along with it.
Now let's look at the contents of `bowl` and `input` in detail:
## bowl
`bowl` is the following:
```hoon
+$  bowl
  $:  our=ship
      src=ship
      tid=tid
      mom=(unit tid)
      wex=boat:gall
      sup=bitt:gall
      eny=@uvJ
      now=@da
      byk=beak
  ==
```
- `our` - our ship
- `src` - ship where input is coming from
- `tid` - ID of this thread
- `mom` - parent thread if this is a child thread
- `wex` - outgoing subscriptions
- `sup` - incoming subscriptions
- `eny` - entropy
- `now` - current datetime
- `byk` - `[p=ship q=desk r=case]` path prefix
There are a number of functions in `strandio` to access the `bowl` contents like `get-bowl`, `get-beak`, `get-time`, `get-our` and `get-entropy`.
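As a small illustration of these accessors, the sketch below fetches the whole bowl once with `get-bowl` and then reads a few fields from it. Which fields to return is an arbitrary choice for the example.

```hoon
/-  spider
/+  strandio
=,  strand=strand:spider
^-  thread:spider
|=  arg=vase
=/  m  (strand ,vase)
^-  form:m
::  get the current bowl, then pick out individual fields
;<  =bowl:spider  bind:m  get-bowl:strandio
(pure:m !>([our.bowl now.bowl byk.bowl]))
```

When you only need one field, the dedicated functions such as `get-our` or `get-time` keep the thread shorter.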
You can also write a function with a gate whose sample is `strand-input:strand` and access the bowl that way like:
```hoon
/-  spider
/+  strandio
=,  strand=strand:spider
=>
|%
++  bowl-stuff
  =/  m  (strand ,[boat:gall bitt:gall])
  ^-  form:m
  |=  tin=strand-input:strand
  `[%done [wex.bowl.tin sup.bowl.tin]]
--
^-  thread:spider
|=  arg=vase
=/  m  (strand ,vase)
^-  form:m
;<  res=[boat:gall bitt:gall]  bind:m  bowl-stuff
(pure:m !>(res))
```
## input
`input` is defined in libstrand as:
```hoon
+$  input
  $%  [%poke =cage]
      [%sign =wire =sign-arvo]
      [%agent =wire =sign:agent:gall]
      [%watch =path]
  ==
```
- `%poke` incoming poke
- `%sign` incoming sign from arvo
- `%agent` incoming sign from a gall agent
- `%watch` incoming subscription
Various functions in `strandio` will check `input` and conditionally do things based on its contents. For example, `sleep` sets a `behn` timer and then calls `take-wake` to wait for a `%wake` sign from behn:
```hoon
++  take-wake
  |=  until=(unit @da)
  =/  m  (strand ,~)
  ^-  form:m
  |=  tin=strand-input:strand
  ?+  in.tin  `[%skip ~]
      ~  `[%wait ~]
      [~ %sign [%wait @ ~] %behn %wake *]
    ?.  |(?=(~ until) =(`u.until (slaw %da i.t.wire.u.in.tin)))
      `[%skip ~]
    ?~  error.sign-arvo.u.in.tin
      `[%done ~]
    `[%fail %timer-error u.error.sign-arvo.u.in.tin]
  ==
```
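The same pattern works for your own input-handling strands. The sketch below defines a hypothetical `take-any-poke` arm (it is not part of `strandio`) that waits while there is no input, skips anything that isn't a poke, and finishes with the poke's `cage` once one arrives.

```hoon
/-  spider
=,  strand=strand:spider
=>
|%
::  hypothetical helper for illustration, not a strandio function
++  take-any-poke
  =/  m  (strand ,cage)
  ^-  form:m
  |=  tin=strand-input:strand
  ?+  in.tin  `[%skip ~]
    ~            `[%wait ~]               ::  no input yet, keep waiting
    [~ %poke *]  `[%done cage.u.in.tin]   ::  got a poke, produce its cage
  ==
--
^-  thread:spider
|=  arg=vase
=/  m  (strand ,vase)
^-  form:m
;<  =cage  bind:m  take-any-poke
(pure:m q.cage)
```

The thread will sit idle until something pokes it through spider, so it is only useful alongside another thread or agent that sends the poke.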

View File

@ -0,0 +1,119 @@
+++
title = "Output"
weight = 4
template = "doc.html"
+++
A strand produces a `[(list card) <response>]`. The first part is a list of cards to be sent off immediately, and `<response>` is one of:
- `[%wait ~]`
- `[%skip ~]`
- `[%cont self=(strand-form-raw a)]`
- `[%fail err=(pair term tang)]`
- `[%done value=a]`
So, for example, if you feed `2 2` into the following function:
```hoon
|=  [a=@ud b=@ud]
=/  m  (strand ,vase)
^-  form:m
=/  res  !>(`@ud`(add a b))
(pure:m res)
```
The resulting strand won't just produce `[#t/@ud q=4]`, but rather `[~ %done [#t/@ud q=4]]`.
**Note:** spider doesn't actually return these response codes to thread subscribers; they're only used internally to manage the flow of the thread.
Since a strand is a function from the previously discussed `strand-input` to the output discussed here, you can compose a valid strand like:
```hoon
|=  strand-input:strand
[~ %done 'foo']
```
So this is a valid thread:
```hoon
/-  spider
=,  strand=strand:spider
^-  thread:spider
|=  arg=vase
=/  m  (strand ,vase)
^-  form:m
|=  strand-input:strand
[~ %done arg]
```
As is this:
```hoon
/-  spider
=,  strand=strand:spider
|%
++  my-function
  =/  m  (strand ,@t)
  ^-  form:m
  |=  strand-input:strand
  [~ %done 'foo']
--
^-  thread:spider
|=  arg=vase
=/  m  (strand ,vase)
^-  form:m
;<  msg=@t  bind:m  my-function
(pure:m !>(msg))
```
As is this:
```hoon
/-  spider
=,  strand=strand:spider
^-  thread:spider
|=  arg=vase
=/  m  (strand ,vase)
^-  form:m
|=  strand-input:strand
=/  umsg  !<  (unit @tas)  arg
?~  umsg
  [~ %fail %no-arg ~]
=/  msg=@tas  u.umsg
?.  =(msg %foo)
  [~ %fail %not-foo ~]
[~ %done arg]
```
Which works like:
```
> -mythread
thread failed: %no-arg
> -mythread %bar
thread failed: %not-foo
> -mythread %foo
[~ %foo]
```
Now let's look at the meaning of each of the response codes.
### wait
Wait tells spider not to move on from the current strand, and to wait for some new input. For example, `sleep:strandio` will return a `[%wait ~]` along with a card to start a behn timer. Spider passes the card to behn, and when behn sends a wake back to spider, the new input will be given back to `sleep` as a `%sign`. Sleep will then issue `[~ %done ~]` and (assuming it's in a `bind`) `bind` will proceed to the next strand.
### skip
Spider will normally treat a `%skip` the same as a `%wait` and just wait for some new input. When used inside a `main-loop:strandio`, however, it will instead tell `main-loop` to skip this function and try the next one with the same input. This is very useful when you want to call different functions depending on the mark of a poke or some other condition.
### cont
Cont means continue computation. When a `%cont` is issued, the issuing gate will be called again with the new value provided. Therefore `%cont` essentially creates a loop.
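As a rough illustration, the sketch below uses `%cont` to build a loop that does no IO at all. The `countdown` arm is hypothetical: each time it runs it either finishes with `%done` or hands back a new copy of itself with a smaller counter.

```hoon
/-  spider
=,  strand=strand:spider
=>
|%
::  hypothetical example arm: loops n times using %cont
++  countdown
  |=  n=@ud
  =/  m  (strand ,~)
  ^-  form:m
  |=  tin=strand-input:strand
  ?:  =(0 n)
    `[%done ~]                       ::  counter exhausted, finish
  `[%cont (countdown (dec n))]       ::  continue with a smaller counter
--
^-  thread:spider
|=  arg=vase
=/  m  (strand ,vase)
^-  form:m
;<  ~  bind:m  (countdown 3)
(pure:m !>(%finished))
```

In practice you will rarely issue `%cont` by hand; helpers such as `main-loop` in `strandio` wrap this kind of looping for you.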
### fail
Fail says to end the thread here and not call any subsequent strands. It includes an error message and optional traceback. When spider gets a `%fail` it will send a fact with mark `%thread-fail` containing the error and traceback to its subscribers, and then end the thread.
### done
Done means the computation was completed successfully and includes the result. When `spider` receives a `%done` it will send the result it contains in a fact with a mark of `%thread-done` to subscribers and end the thread. When `bind` receives a `%done` it will extract the result and call the next gate with it.

View File

@ -0,0 +1,81 @@
+++
title = "Summary"
weight = 5
template = "doc.html"
+++
That's basically all you need to know to write threads. The best way to get a good handle on them is just to experiment with some `strandio` functions. For information on running threads from gall agents, see [here](/docs/userspace/threads/gall) and for some examples see [here](/docs/userspace/threads/examples).
Now here's a quick recap of the main points covered:
## Spider
- is the gall agent that manages threads.
- Details of interacting with threads via spider can be seen [here](/docs/userspace/threads/reference).
## Threads
- are like transient gall agents
- are used mostly to chain a series of IO operations
- can be used by gall agents to spin out IO operations
- live in the `ted` directory
- are managed by the gall agent `spider`
- take a `vase` and produce a `strand` which produces a `vase`
#### Example
```hoon
/-  spider
=,  strand=strand:spider
^-  thread:spider
|=  arg=vase
=/  m  (strand ,vase)
^-  form:m
(pure:m arg)
```
## Strands
- are the building blocks of threads
- take [this](https://github.com/urbit/urbit/blob/master/pkg/arvo/lib/strand.hoon#L2-L21) input and produce [this](https://github.com/urbit/urbit/blob/master/pkg/arvo/lib/strand.hoon#L23-L48) output.
- must be specialised to produce a particular type like `(strand ,@ud)`.
- are conventionally given the face `m`.
- are a core that has three main arms - `form`, `pure` and `bind`:
### form
- is the mold of the strand suitable for casting
- is the type returned by the other arms
### pure
- simply returns the `form` of a `strand` that produces pure's argument without doing any IO
### bind
- is used to chain strands together like JavaScript promises
- is used in conjunction with micgal (`;<`)
- must be specialised to a type like `;< <type> bind:m ...`
- takes two arguments. The first is a function that returns the `form` of a `strand` that produces `<type>`. The second is a gate whose sample is `<type>` and which returns a `form`.
- calls the first and then, if it succeeded, calls the second with the result of the first as its sample.
## Strand input
- looks like `[=bowl in=(unit input)]`
- `bowl` has things like `our`, `now`, `eny` and so forth
- `bowl` is populated once when the thread is first called and then every time it receives new input
- `input` contains any incoming pokes, signs and watches.
## Strand output
- contains `[cards=(list card:agent:gall) <response>]`
- `cards` are any cards to be sent immediately
- `<response>` is something like `[%done value]`, `[%fail err]`, etc.
- `%done` will contain the result
- responses are only used internally to manage the flow of the thread and are not returned to subscribers.
## Strandio
- is located in `/lib/strandio/hoon`
- contains a collection of ready-made functions for use in threads
- eg. `sleep`, `get-bowl`, `take-watch`, `poke`, `fetch-json`, etc.

View File

@ -0,0 +1,33 @@
+++
title = "Examples"
weight = 30
sort_by = "weight"
template = "sections/docs/chapters.html"
insert_anchor_links = "right"
+++
Collection of how-tos and examples for threads.
## [Fetch JSON](/docs/userspace/threads/examples/get-json)
Get some JSON from an external website.
## [Child Thread](/docs/userspace/threads/examples/child-thread)
Spawn and manage threads from within threads.
## [Main-loop](/docs/userspace/threads/examples/main-loop)
Create a loop - useful for long-running threads and for trying the same input against multiple functions.
## [Poke Agent](/docs/userspace/threads/examples/poke-agent)
Poke an agent from a thread.
## [Scry](/docs/userspace/threads/examples/scry)
Scry example.
## [Take Fact](/docs/userspace/threads/examples/take-fact)
Take a fact from arvo or an agent.

View File

@ -0,0 +1,241 @@
+++
title = "Child Thread"
weight = 2
template = "doc.html"
+++
Here's a simple example of a thread that starts another thread:
#### `parent.hoon`
```hoon
/-  spider
/+  *strandio
=,  strand=strand:spider
^-  thread:spider
|=  arg=vase
=/  m  (strand ,vase)
^-  form:m
;<  tid=tid:spider  bind:m  (start-thread %child)
(pure:m !>(~))
```
#### `child.hoon`
```hoon
/-  spider
/+  *strandio
=,  strand=strand:spider
^-  thread:spider
|=  arg=vase
=/  m  (strand ,vase)
^-  form:m
%-  (slog leaf+"foo" ~)
(pure:m !>(~))
```
Save `parent.hoon` and `child.hoon` in `/ted` of the `%base` desk, `|commit %base` and run `-parent`. You should see something like:
```
foo
> -parent
~
```
`parent.hoon` just uses the `strandio` function `start-thread` to start `child.hoon`, and `child.hoon` just prints `foo` to the dojo. Since we got `foo` we can tell the second thread did, in fact, run.
```hoon
;<  tid=tid:spider  bind:m  (start-thread %child)
```
See here how we gave `start-thread` the name of the thread to run. It returns the `tid` of the thread, which we could then use to poke it or otherwise interact with it.
`start-thread` handles creating the `tid` for the thread so is quite convenient.
Note that threads we start this way will be a child of the thread that started them, and so will be killed when the parent thread ends.
## Start thread and get its result
If we want to actually get the result of the thread we started, it's slightly more complicated.
We note that this is mostly the same as `await-thread:strandio`.
#### `parent.hoon`
```hoon
/-  spider
/+  *strandio
=,  strand=strand:spider
^-  thread:spider
|=  arg=vase
=/  m  (strand ,vase)
^-  form:m
;<  =bowl:spider  bind:m  get-bowl
=/  tid  `@ta`(cat 3 'strand_' (scot %uv (sham %child eny.bowl)))
;<  ~             bind:m  (watch-our /awaiting/[tid] %spider /thread-result/[tid])
;<  ~             bind:m  %-  poke-our
                          :*  %spider
                              %spider-start
                              !>([`tid.bowl `tid byk.bowl(r da+now.bowl) %child !>(~)])
                          ==
;<  =cage         bind:m  (take-fact /awaiting/[tid])
;<  ~             bind:m  (take-kick /awaiting/[tid])
?+  p.cage  ~|([%strange-thread-result p.cage %child tid] !!)
  %thread-done  (pure:m q.cage)
  %thread-fail  (strand-fail !<([term tang] q.cage))
==
```
#### `child.hoon`
```hoon
/-  spider
/+  *strandio
=,  strand=strand:spider
=>
|%
++  url  "https://www.whatsthelatestbasehash.com/"
--
^-  thread:spider
|=  arg=vase
=/  m  (strand ,vase)
^-  form:m
;<  =cord  bind:m  (fetch-cord url)
=/  hash-as-cord  `@t`(end [3 (sub (met 3 cord) 1)] cord)
=/  hash  `@uv`(slav %uv hash-as-cord)
(pure:m !>(hash))
```
`child.hoon` simply grabs the latest base hash from https://www.whatsthelatestbasehash.com/ and returns it.
`parent.hoon` is a bit more complicated, so we'll look at it line by line.
```hoon
;<  =bowl:spider  bind:m  get-bowl
```
First we grab the bowl.
```hoon
=/  tid  `@ta`(cat 3 'strand_' (scot %uv (sham %child eny.bowl)))
```
Then we generate a `tid` (thread ID) for the thread we're going to start.
```hoon
;<  ~  bind:m  (watch-our /awaiting/[tid] %spider /thread-result/[tid])
```
We pre-emptively subscribe for the result. Spider sends the result at `/thread-result/<tid>` so that's where we subscribe.
```hoon
;<  ~  bind:m  %-  poke-our
               :*  %spider
                   %spider-start
                   !>([`tid.bowl `tid byk.bowl(r da+now.bowl) %child !>(~)])
               ==
```
Spider takes a poke with a mark `%spider-start` and a vase containing `[parent=(unit tid) use=(unit tid) =beak file=term =vase]` to start a thread, where:
- `parent` is an optional parent thread. In this case we say the parent is our tid. Specifying a parent means the child will be killed if the parent ends.
- `use` is the thread ID for the thread we're creating
- `beak` is a `[p=ship q=desk r=case]` triple which specifies the desk and
revision containing the thread we want to run. In this case we just use
`byk.bowl`, but with the revision `r` changed to the date `now.bowl`.
- `file` is the filename of the thread we want to start
- `vase` is the vase it will be given as an argument when it's started
```hoon
;<  =cage  bind:m  (take-fact /awaiting/[tid])
```
We wait for a fact which will be the result of the thread.
```hoon
;<  ~  bind:m  (take-kick /awaiting/[tid])
```
Spider will kick us from the subscription when it ends the thread so we also take that kick.
```hoon
?+  p.cage  ~|([%strange-thread-result p.cage %child tid] !!)
  %thread-done  (pure:m q.cage)
  %thread-fail  (strand-fail !<([term tang] q.cage))
==
```
Finally we test whether the thread produced a `%thread-done` or a `%thread-fail`. These are the two possible marks produced by spider when it returns the results of a thread. A `%thread-done` will contain a vase with the result, and a `%thread-fail` will contain an error message and traceback, so we see which it is and then either produce the result with `pure` or trigger a `%thread-fail` with the error we got from the child.
## Stop a thread
#### `parent.hoon`
```hoon
/-  spider
/+  *strandio
=,  strand=strand:spider
^-  thread:spider
|=  arg=vase
=/  m  (strand ,vase)
^-  form:m
;<  =bowl:spider  bind:m  get-bowl
=/  tid  `@ta`(cat 3 'strand_' (scot %uv (sham %child eny.bowl)))
%-  (slog leaf+"Starting child thread..." ~)
;<  ~             bind:m  %-  poke-our
                          :*  %spider
                              %spider-start
                              !>([`tid.bowl `tid byk.bowl(r da+now.bowl) %child !>(~)])
                          ==
;<  ~             bind:m  (sleep ~s5)
%-  (slog leaf+"Stopping child thread..." ~)
;<  ~             bind:m  %-  poke-our
                          :*  %spider
                              %spider-stop
                              !>([tid %.y])
                          ==
;<  ~             bind:m  (sleep ~s2)
(pure:m !>("Done"))
```
#### `child.hoon`
```hoon
/-  spider
/+  *strandio
=,  strand=strand:spider
=>
|%
++  looper
  =/  m  (strand ,~)
  ^-  form:m
  %-  (main-loop ,~)
  :~  |=  ~
      ^-  form:m
      ;<  ~  bind:m  (sleep `@dr`(div ~s1 2))
      %-  (slog leaf+"child thread" ~)
      (pure:m ~)
  ==
--
^-  thread:spider
|=  arg=vase
=/  m  (strand ,vase)
^-  form:m
;<  ~  bind:m  looper
(pure:m !>(~))
```
`child.hoon` just prints to the dojo in a loop.
`parent.hoon` starts `child.hoon`, and then pokes spider like:
```hoon
;<  ~  bind:m  %-  poke-our
               :*  %spider
                   %spider-stop
                   !>([tid %.y])
               ==
```
- `%spider-stop` is the mark that tells spider to kill a thread.
- `tid` is the tid of the thread to kill
- `%.y` tells spider to suppress traceback in the result of the killed thread. If you give it `%.n` it will include the traceback.

View File

@ -0,0 +1,114 @@
+++
title = "Fetch JSON"
weight = 1
template = "doc.html"
+++
Grabbing JSON from a URL is very easy.
`strandio` includes the `fetch-json` function which will handle the HTTP request, response, and parsing, producing `json`.
The following thread fetches the current Bitcoin price from the [CoinGecko
API](https://www.coingecko.com/en/api) in the specified currency and prints it
to the terminal.
#### `btc-price.hoon`
```hoon
/-  spider
/+  *strandio
=,  strand=strand:spider
=,  dejs-soft:format
=,  strand-fail=strand-fail:libstrand:spider
^-  thread:spider
|=  arg=vase
=/  m  (strand ,vase)
^-  form:m
=/  url
  "https://api.coingecko.com/api/v3/simple/price?ids=bitcoin&vs_currencies="
=/  cur  !<((unit @tas) arg)
?~  cur  (strand-fail %no-arg ~)
=.  u.cur  (crip (cass (trip u.cur)))
?.  ((sane %tas) u.cur)  (strand-fail %bad-currency-format ~)
;<  =json  bind:m  (fetch-json (weld url (trip u.cur)))
=/  price=(unit @ta)  ((ot ~[bitcoin+(ot [u.cur no]~)]) json)
?~  price  ((slog 'Currency not found.' ~) (pure:m !>(~)))
%-  (slog leaf+"{(trip u.price)} {(cuss (trip u.cur))}" ~)
(pure:m !>(~))
```
Save it as `/ted/btc-price.hoon` in the `%base` desk of a fake ship, `|commit %base` and run it with `-btc-price %usd`. You should see something like:
```
> -btc-price %usd
49168 USD
```
You can try with other currencies as well:
```
> -btc-price %nzd
72455 NZD
> -btc-price %aud
68866 AUD
> -btc-price %gbp
37319 GBP
```
### Analysis
The thread takes an `@tas` as its argument, which the dojo wraps in a `unit`. We extract the `vase` and check it's not empty:
```hoon
=/  cur  !<((unit @tas) arg)
?~  cur  (strand-fail %no-arg ~)
```
We then convert it to lowercase and check it's a valid `@tas`:
```hoon
=.  u.cur  (crip (cass (trip u.cur)))
?.  ((sane %tas) u.cur)  (strand-fail %bad-currency-format ~)
```
Next, we use the `fetch-json` function in `strandio` like so:
```hoon
;<  =json  bind:m  (fetch-json (weld url (trip u.cur)))
```
We convert the currency to a `tape` and `weld` it to the end of the `url`, which
we give as an argument to `fetch-json`. The `fetch-json` function will make the
request to the URL, receive the result, parse the JSON and produce the result as
a `json` structure.
The JSON the API produces looks like:
```json
{
  "bitcoin": {
    "usd": 49477
  }
}
```
Since it's an object in an object, we decode them using nested
[`ot:dejs-soft:format`](/docs/hoon/reference/zuse/2d_7#otdejs-softformat)
functions, and the price itself using
[`no:dejs-soft:format`](/docs/hoon/reference/zuse/2d_7#nodejs-softformat) to
produce a `(unit @ta)`:
```hoon
=/  price=(unit @ta)  ((ot ~[bitcoin+(ot [u.cur no]~)]) json)
```
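To try the decoding step without making a network request, you could build the same JSON shape by hand and run the decoder over it. The sketch below is a standalone thread that does this, assuming the standard `pairs:enjs:format` and `numb:enjs:format` encoders from zuse; the hard-coded `usd` key and price are illustrative values only.

```hoon
/-  spider
=,  strand=strand:spider
=,  dejs-soft:format
^-  thread:spider
|=  arg=vase
=/  m  (strand ,vase)
^-  form:m
::  hand-built equivalent of {"bitcoin": {"usd": 49477}}
=/  jon=json
  %-  pairs:enjs:format
  :~  ['bitcoin' (pairs:enjs:format ~[['usd' (numb:enjs:format 49.477)]])]
  ==
::  same nested decoder as btc-price.hoon, with the currency fixed to usd
=/  price=(unit @ta)  ((ot ~[bitcoin+(ot ~[usd+no])]) jon)
(pure:m !>(price))
```

If the key names in the decoder don't match the object, `price` should come back as `~`, which is the case the real thread reports as "Currency not found."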
Finally, we check if the `unit` is null and either print an error or print the price to the terminal with `slog` functions (the thread itself produces `~`):
```hoon
?~  price  ((slog 'Currency not found.' ~) (pure:m !>(~)))
%-  (slog leaf+"{(trip u.price)} {(cuss (trip u.cur))}" ~)
(pure:m !>(~))
```
For more information about working with `json`, see the [JSON
Guide](/docs/hoon/guides/json-guide).

View File

@ -0,0 +1,178 @@
+++
title = "Main-loop"
weight = 3
template = "doc.html"
+++
`main-loop` is a useful function included in `strandio` that:
1. lets you create a loop
2. lets you try the same input against multiple functions
3. queues input on `%skip` and then dequeues from the beginning on `%done`
`main-loop` takes a list of functions as its argument but only moves to the next item in the list on a `[%fail %ignore ~]` (whose usage we'll describe in the second example). In other cases it restarts from the top, so providing multiple functions is only useful for trying the same input against multiple functions.
## Create a loop
This is useful if you want to (for example) take an arbitrary number of facts.
Here's an example of a thread that subscribes to `graph-store` for updates and nicely prints the messages (an extremely basic chat reader):
#### `chat-watch.hoon`
```hoon
/-  spider
/+  *strandio, *graph-store
=,  strand=strand:spider
=>
|%
++  watcher
  =/  m  (strand ,~)
  ^-  form:m
  %-  (main-loop ,~)
  :~  |=  ~
      ^-  form:m
      ;<  =cage  bind:m  (take-fact /graph-store)
      =/  up=update  !<  update  q.cage
      ?.  ?=(%add-nodes -.q.up)
        (pure:m ~)
      =/  res=tape  "{(scow %p entity.resource.q.up)}/{(scow %tas name.resource.q.up)}"
      =/  node-list  `(list (pair index node))`~(tap by nodes.q.up)
      ?~  node-list
        (pure:m ~)
      ?:  (gth (lent node-list) 1)
        %-  (slog leaf+"{res}: <multi-node update skipped>" ~)
        (pure:m ~)
      =/  from=tape  (scow %p author.post.q.i.node-list)
      =/  conts  `(list content)`contents.post.q.i.node-list
      ?~  conts
        (pure:m ~)
      ?:  (gth (lent conts) 1)
        %-  (slog leaf+"{res}: [{from}] <mixed-type message skipped>" ~)
        (pure:m ~)
      ?.  ?=(%text -.i.conts)
        %-  (slog leaf+"{res}: [{from}] <non-text message skipped>" ~)
        (pure:m ~)
      =/  msg=tape  (trip text.i.conts)
      %-  (slog leaf+"{res}: [{from}] {msg}" ~)
      (pure:m ~)
  ==
--
^-  thread:spider
|=  arg=vase
=/  m  (strand ,vase)
^-  form:m
;<  now=@da  bind:m  get-time
;<  ~        bind:m  (watch-our /graph-store %graph-store /updates)
;<  ~        bind:m  watcher
(pure:m !>(~))
```
Save it in `/ted`, `|commit %base`, and run it with `-chat-watch`. Now try typing some messages in the chat and you should see them printed like:
```
~zod/test-8488: [~zod] x
~zod/test-8488: [~zod] blah blah blah
~zod/test-8488: [~zod] foo
~zod/test-8488: [~zod] some text
~zod/test-8488: [~zod] .
```
To stop this hit backspace and then run `:spider|kill`.
First we subscribe to graph-store for updates with `watch-our`, then we call the watcher arm of the core we have added. Watcher just calls `main-loop`:
```hoon
=/  m  (strand ,~)
^-  form:m
%-  (main-loop ,~)
...
```
...with a list of functions. In this case we've just given it one. Our function first calls `take-fact`:
```hoon
;<  =cage  bind:m  (take-fact /graph-store)
```
...to receive the fact and then the rest is just processing & printing logic which isn't too important.
Once this is done, main-loop will just call the same function again which will again wait for a fact and so on. So you see how it creates a loop. The only way to exit the loop is with a `%fail` or else by poking spider with a `%spider-stop` and the thread's `tid`.
## Try input against multiple functions
To try the same input against multiple functions you must use another `strandio` function, `handle`. Handle converts a `%skip` into a `[%fail %ignore ~]`. When `main-loop` sees a `[%fail %ignore ~]` it tries the next function in its list with the same input.
Here are two files: `tester.hoon` and `tested.hoon`. Save them both to `/ted` in the `%base` desk, `|commit %base` and run `-tester`. You should see:
```
> -tester
baz
~
```
#### `tester.hoon`
```hoon
/-  spider
/+  *strandio
=,  strand=strand:spider
^-  thread:spider
|=  arg=vase
=/  m  (strand ,vase)
^-  form:m
;<  tid=tid:spider  bind:m  (start-thread %tested)
;<  our=ship        bind:m  get-our
;<  ~               bind:m  %-  poke
                            :-  [our %spider]
                            [%spider-input !>([tid `cage`[%baz !>("baz")]])]
(pure:m !>(~))
```
#### `tested.hoon`
```hoon
/-  spider
/+  *strandio
=,  strand=strand:spider
=>
|%
++  looper
  =/  m  (strand ,~)
  ^-  form:m
  %-  (main-loop ,~)
  :~  |=  ~
      ^-  form:m
      ;<  =vase  bind:m  ((handle ,vase) (take-poke %foo))
      =/  msg=tape  !<(tape vase)
      %-  (slog leaf+"{msg}" ~)
      (pure:m ~)
  ::
      |=  ~
      ^-  form:m
      ;<  =vase  bind:m  ((handle ,vase) (take-poke %bar))
      =/  msg=tape  !<(tape vase)
      %-  (slog leaf+"{msg}" ~)
      (pure:m ~)
  ::
      |=  ~
      ^-  form:m
      ;<  =vase  bind:m  ((handle ,vase) (take-poke %baz))
      =/  msg=tape  !<(tape vase)
      %-  (slog leaf+"{msg}" ~)
      (pure:m ~)
  ==
--
^-  thread:spider
|=  arg=vase
=/  m  (strand ,vase)
^-  form:m
;<  ~  bind:m  looper
(pure:m !>(~))
```
The first thread (tester.hoon) just starts the second thread and pokes it with a vase containing `"baz"` and the mark `%baz`.
The second thread (tested.hoon) has a `main-loop` with a list of three `take-poke` strands. Only the third one expects a mark of `%baz`, yet the message is still printed successfully. This is because `main-loop` first tried the previous two, each of which saw the wrong mark and said `%skip`.
Notice we've wrapped the `take-poke`s in `handle` to convert the `%skip`s into `[%fail %ignore ~]`s, which `main-loop` takes to mean it should try the next function with the same input.
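The same pattern also gives a way out of the loop. Here's a minimal sketch, assuming a cut-down variant of `tested.hoon` with one printing function and one extra function; the `%stop` mark and the `%stopped` error term are made up for illustration and are not part of `strandio`. A poke marked `%stop` makes the second function fail with something other than `%ignore`, which ends `main-loop` and the thread with it:

```hoon
/-  spider
/+  *strandio
=,  strand=strand:spider
=>
|%
++  looper
  =/  m  (strand ,~)
  ^-  form:m
  %-  (main-loop ,~)
  :~  |=  ~
      ^-  form:m
      ::  print any poke marked %baz, as in tested.hoon
      ;<  =vase  bind:m  ((handle ,vase) (take-poke %baz))
      =/  msg=tape  !<(tape vase)
      %-  (slog leaf+"{msg}" ~)
      (pure:m ~)
  ::
      |=  ~
      ^-  form:m
      ::  %stop is a hypothetical mark; other marks fall through as %ignore
      ;<  =vase  bind:m  ((handle ,vase) (take-poke %stop))
      ::  a %fail that is not %ignore exits the loop
      (strand-fail %stopped ~)
  ==
--
^-  thread:spider
|=  arg=vase
=/  m  (strand ,vase)
^-  form:m
;<  ~  bind:m  looper
(pure:m !>(~))
```

Since the loop can only end by failing, the final `(pure:m !>(~))` is never reached here; anything waiting on the thread's result should expect a failure rather than a value.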

View File

@ -0,0 +1,60 @@
+++
title = "Poke Agent"
weight = 4
template = "doc.html"
+++
Here's a thread that lets you post a message to a chat in graph-store:
#### `post-msg.hoon`
```hoon
/-  spider
/+  *strandio, *graph-store, *resource
=,  strand=strand:spider
=>
|%
++  make-post
  |=  [our=ship now=@da res=resource msg=@t]
  ^-  cage
  ::
  =/  =post  *post
  =:  author.post     our
      index.post      ~[now]
      time-sent.post  now
      contents.post   ~[[%text msg]]
  ==
  ::
  :-  %graph-update
  !>  ^-  update
  :+  %0  now
  :+  %add-nodes  res
  %-  ~(gas by *(map index node))
  ~[[~[now] [post ~[%empty]]]]
--
^-  thread:spider
|=  arg=vase
=/  m  (strand ,vase)
=/  uarg  !<  (unit (pair resource @t))  arg
?~  uarg
  (strand-fail %no-arg ~)
=/  res  p.u.uarg
=/  msg  q.u.uarg
^-  form:m
;<  our=@p   bind:m  get-our
;<  now=@da  bind:m  get-time
;<  ~        bind:m  (poke [our %graph-push-hook] (make-post our now res msg))
(pure:m !>(~))
```
Save it in `/ted` of the `%base` desk, `|commit %base`, and run it like:
```
-post-msg [~zod %foo-9955] 'some message'
```
(obviously change the channel name to whatever you have)
### Analysis
Pretty simple: just use the `strandio` function `poke` with an argument of `[ship term] cage`, where `term` is the agent and `cage` is whatever the particular agent expects.
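For a smaller payload than a graph-store update, here's a minimal sketch of the same call. It assumes the local `%hood` agent accepts a poke with the `%helm-hi` mark carrying a cord; both names are assumptions for illustration, so swap in whatever agent and cage you actually need:

```hoon
/-  spider
/+  *strandio
=,  strand=strand:spider
^-  thread:spider
|=  arg=vase
=/  m  (strand ,vase)
^-  form:m
;<  our=@p  bind:m  get-our
::  first argument is [ship agent], second is a cage of [mark vase]
::  assumption: %hood takes a %helm-hi poke containing a cord
;<  ~       bind:m  (poke [our %hood] [%helm-hi !>('hello from a thread')])
(pure:m !>(~))
```

The shape is the same as in `post-msg.hoon`; only the target agent and the cage differ.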

View File

@ -0,0 +1,70 @@
+++
title = "Scry"
weight = 5
template = "doc.html"
+++
Here's an example of a thread that scries ames for the IP address & port of a ship and nicely prints it:
#### `get-ip.hoon`
```hoon
/-  spider
/+  strandio
=,  strand=strand:spider
=,  strand-fail=strand-fail:libstrand:spider
=>
|%
++  process-lanes
  |=  [target=@p lanes=(list lane:ames)]
  =/  m  (strand ,~)
  ^-  form:m
  ?~  `(list lane:ames)`lanes
    %-  (slog leaf+"No route for {(scow %p target)}." ~)
    (pure:m ~)
  =/  lroute  (skip lanes |=(a=lane:ames -.a))
  ?~  lroute
    %-  (slog leaf+"No direct route for {(scow %p target)}." ~)
    (pure:m ~)
  =/  ip  +:(scow %if p.i.lroute)
  =/  port  (skip (scow %ud (cut 5 [1 1] p.i.lroute)) |=(a=@tD =(a '.')))
  %-  (slog leaf+"{ip}:{port}" ~)
  (pure:m ~)
--
^-  thread:spider
|=  arg=vase
=/  m  (strand ,vase)
^-  form:m
=/  utarget  !<  (unit @p)  arg
?~  utarget
  (strand-fail %no-arg ~)
=/  target  u.utarget
;<  lanes=(list lane:ames)  bind:m  (scry:strandio (list lane:ames) /ax//peers/(scot %p target)/forward-lane)
;<  ~                       bind:m  (process-lanes target lanes)
(pure:m !>(~))
```
**Note:** Pretty useless on a fake ship.
Save as `ted/get-ip.hoon` in the `%base` desk, `|commit %base`, and run it with `-get-ip ~bitbet-bolbel`. You should see something like:
```
34.83.113.220:60659
```
### Analysis
Here we use the `strandio` function `scry` which takes an argument of `[mold path]` where:
- `mold` is the return type of the scry
- `path` is the scry path formatted like:
1. vane letter and care
2. desk if scrying arvo or agent if scrying a gall agent
3. rest of path
In our case the mold is `(list lane:ames)` and the path is `/ax//peers/(scot %p target)/forward-lane` like:
```hoon
;<  lanes=(list lane:ames)  bind:m  (scry:strandio (list lane:ames) /ax//peers/(scot %p target)/forward-lane)
```
After that we just process the result in `++ process-lanes` and print it.
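The same three-part path works for other vanes too. As a minimal sketch, assuming the `%base` desk exists and that Clay's `%w` care produces a `cass:clay` (a revision number and date) for it, a thread that scries Clay might look like:

```hoon
/-  spider
/+  strandio
=,  strand=strand:spider
^-  thread:spider
|=  arg=vase
=/  m  (strand ,vase)
^-  form:m
::  /cw is the vane letter (c) plus the care (w), base is the desk,
::  and there is no further path after the desk
;<  =cass:clay  bind:m  (scry:strandio cass:clay /cw/base)
(pure:m !>(cass))
```

Here the second path element is a desk rather than empty (as with ames) or an agent name (as with gall).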

View File

@ -0,0 +1,65 @@
+++
title = "Take Fact"
weight = 6
template = "doc.html"
+++
Taking a fact from an agent, arvo or whatever is easy. First you subscribe using `watch:strandio` or `watch-our:strandio`, then you use `take-fact:strandio` to receive the fact. Here's an example that takes an update from `graph-store` and prints the message to the dojo:
#### `print-msg.hoon`
```hoon
/-  spider
/+  *strandio, *graph-store
=,  strand=strand:spider
=>
|%
++  take-update
  =/  m  (strand ,~)
  ^-  form:m
  ;<  =cage  bind:m  (take-fact /graph-store)
  =/  =update  !<  update  q.cage
  ?.  ?=(%add-nodes -.q.update)
    (pure:m ~)
  =/  nodes=(list [=index =node])  ~(tap by nodes.q.update)
  ?~  nodes
    (pure:m ~)
  =/  contents=(list content)  contents.post.node.i.nodes
  ?~  contents
    (pure:m ~)
  ?.  ?=(%text -.i.contents)
    (pure:m ~)
  =/  msg  (trip text.i.contents)
  %-  (slog leaf+msg ~)
  (pure:m ~)
--
^-  thread:spider
|=  arg=vase
=/  m  (strand ,vase)
^-  form:m
;<  ~  bind:m  (watch-our /graph-store %graph-store /updates)
;<  ~  bind:m  take-update
(pure:m !>(~))
```
Create a chat on your fake zod if you don't have one already, then save the thread in `/ted` on the `%base` desk, `|commit %base`, and run `-print-msg`. Next, type some message in your chat and you'll see it printed in the dojo.
### Analysis
First we call `watch-our` to subscribe:
```hoon
;<  ~  bind:m  (watch-our /graph-store %graph-store /updates)
```
We've spun the next part out into its own core, but it's just a `take-fact` to receive the update:
```hoon
;<  =cage  bind:m  (take-fact /graph-store)
```
The rest of the code is just to pull the message out of the complicated data structure returned by graph-store and isn't important.
Spider will automatically leave the subscription once the thread finishes.
Note that `take-fact` only takes a single fact, so you'd need one for each message you're expecting. Alternatively you can use `main-loop` to take an arbitrary number of facts.
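For a fixed number of facts, the simplest option is to repeat the strand. A sketch, assuming the rest of `print-msg.hoon` is unchanged and only the end of the thread is replaced:

```hoon
::  replaces the last three lines of print-msg.hoon
::  each take-update consumes exactly one fact from the subscription
;<  ~  bind:m  (watch-our /graph-store %graph-store /updates)
;<  ~  bind:m  take-update
;<  ~  bind:m  take-update
(pure:m !>(~))
```

Note this counts facts, not printed messages: `take-update` returns without printing for any update that isn't a text `%add-nodes`, and that still uses up one of the two.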

View File

@ -0,0 +1,29 @@
+++
title = "Gall"
weight = 20
sort_by = "weight"
template = "sections/docs/chapters.html"
insert_anchor_links = "right"
+++
Tutorial and explanation of interacting with threads from a Gall agent.
## [Start Thread](/docs/userspace/threads/gall/start-thread)
Start a thread from an agent.
## [Take Result](/docs/userspace/threads/gall/take-result)
Subscribe for thread result from an agent.
## [Take Facts](/docs/userspace/threads/gall/take-facts)
Subscribe for facts sent from a running thread.
## [Stop Thread](/docs/userspace/threads/gall/stop-thread)
Stop a thread from an agent.
## [Poke Thread](/docs/userspace/threads/gall/poke-thread)
Poke a running thread from an agent.

View File

@ -0,0 +1,115 @@
+++
title = "Poke Thread"
weight = 5
template = "doc.html"
+++
Here's a modified agent that pokes our thread. I've replaced some of the previous stuff because it was getting a little unwieldy.
#### `thread-starter.hoon`
```hoon
/+  default-agent, dbug
=*  card  card:agent:gall
%-  agent:dbug
^-  agent:gall
|_  =bowl:gall
+*  this  .
    def   ~(. (default-agent this %|) bowl)
::
++  on-init  on-init:def
++  on-save  on-save:def
++  on-load  on-load:def
++  on-poke
  |=  [=mark =vase]
  ^-  (quip card _this)
  ?+    mark  (on-poke:def mark vase)
      %noun
    ?+    q.vase  (on-poke:def mark vase)
        (pair term term)
      =/  tid  `@ta`(cat 3 'thread_' (scot %uv (sham eny.bowl)))
      =/  ta-now  `@ta`(scot %da now.bowl)
      =/  start-args  [~ `tid byk.bowl(r da+now.bowl) p.q.vase !>(q.q.vase)]
      :_  this
      :~
        [%pass /thread/[ta-now] %agent [our.bowl %spider] %watch /thread-result/[tid]]
        [%pass /thread/[ta-now] %agent [our.bowl %spider] %poke %spider-start !>(start-args)]
        [%pass /thread/[ta-now] %agent [our.bowl %spider] %poke %spider-input !>([tid %foo !>(q.q.vase)])]
      ==
    ==
  ==
++  on-watch  on-watch:def
++  on-leave  on-leave:def
++  on-peek   on-peek:def
++  on-agent
  |=  [=wire =sign:agent:gall]
  ^-  (quip card _this)
  ?+    -.wire  (on-agent:def wire sign)
      %thread
    ?+    -.sign  (on-agent:def wire sign)
        %fact
      ?+    p.cage.sign  (on-agent:def wire sign)
          %thread-fail
        =/  err  !<  (pair term tang)  q.cage.sign
        %-  (slog leaf+"Thread failed: {(trip p.err)}" q.err)
        `this
          %thread-done
        ?:  =(q.cage.sign *vase)
          %-  (slog leaf+"Thread cancelled nicely" ~)
          `this
        =/  res  (trip !<(term q.cage.sign))
        %-  (slog leaf+"Result: {res}" ~)
        `this
      ==
    ==
  ==
++  on-arvo  on-arvo:def
++  on-fail  on-fail:def
--
```
And here we've modified the thread to take the poke and return it as the result:
#### `test-thread.hoon`
```hoon
/-  spider
/+  *strandio
=,  strand=strand:spider
^-  thread:spider
|=  arg=vase
=/  m  (strand ,vase)
^-  form:m
;<  vmsg=vase  bind:m  (take-poke %foo)
(pure:m vmsg)
```
Save them, `|commit` and run it like `:thread-starter [%test-thread %blah]`. You should see:
```
Result: blah
> :thread-starter [%test-thread %blah]
```
### Analysis
In our agent we've added this card:
```hoon
[%pass /thread/[ta-now] %agent [our.bowl %spider] %poke %spider-input !>([tid %foo !>(q.q.vase)])]
```
To poke a particular thread you poke `%spider` with a mark of `%spider-input` and a vase of `[tid cage]` where:
- `tid` is the thread you want to poke
- `cage` has whatever mark and vase of data you want to give the thread
In our case we've given it a mark of `%foo` and a vase of whatever `term` we poked our agent with.
In our thread we've added:
```hoon
;<  vmsg=vase  bind:m  (take-poke %foo)
```
`take-poke` is a `strandio` function that just waits for a poke with the given mark and skips everything else. In this case we've specified a mark of `%foo`. Once our thread gets a poke with this mark it returns it as a result with `(pure:m vmsg)`. When our agent gets that it just prints it.

View File

@ -0,0 +1,145 @@
+++
title = "Start Thread"
weight = 1
template = "doc.html"
+++
Here's an example of a barebones gall agent that just starts a thread:
#### `thread-starter.hoon`
```hoon
/+  default-agent, dbug
=*  card  card:agent:gall
%-  agent:dbug
^-  agent:gall
|_  =bowl:gall
+*  this  .
    def   ~(. (default-agent this %|) bowl)
::
++  on-init  on-init:def
++  on-save  on-save:def
++  on-load  on-load:def
++  on-poke
  |=  [=mark =vase]
  ^-  (quip card _this)
  ?+    mark  (on-poke:def mark vase)
      %noun
    ?+    q.vase  (on-poke:def mark vase)
        (pair term term)
      =/  tid  `@ta`(cat 3 'thread_' (scot %uv (sham eny.bowl)))
      =/  ta-now  `@ta`(scot %da now.bowl)
      =/  start-args  [~ `tid byk.bowl(r da+now.bowl) p.q.vase !>(q.q.vase)]
      :_  this
      :~
        [%pass /thread/[ta-now] %agent [our.bowl %spider] %poke %spider-start !>(start-args)]
      ==
    ==
  ==
++  on-watch  on-watch:def
++  on-leave  on-leave:def
++  on-peek   on-peek:def
++  on-agent
  |=  [=wire =sign:agent:gall]
  ^-  (quip card _this)
  ?+    -.wire  (on-agent:def wire sign)
      %thread
    ?+    -.sign  (on-agent:def wire sign)
        %poke-ack
      ?~  p.sign
        %-  (slog leaf+"Thread started successfully" ~)
        `this
      %-  (slog leaf+"Thread failed to start" u.p.sign)
      `this
    ==
  ==
++  on-arvo  on-arvo:def
++  on-fail  on-fail:def
--
```
And here's a minimal thread to test it with:
#### `test-thread.hoon`
```hoon
/-  spider
=,  strand=strand:spider
^-  thread:spider
|=  arg=vase
=/  m  (strand ,vase)
^-  form:m
|=  strand-input:strand
?+    q.arg  [~ %fail %not-foo ~]
    %foo
  [~ %done arg]
==
```
Save them as `/app/thread-starter.hoon` and `/ted/test-thread.hoon` respectively in the `%base` desk, `|commit %base`, and start the app with `|rein %base [& %thread-starter]`.
Now you can poke it with a pair of thread name and argument like:
```
:thread-starter [%test-thread %foo]
```
You should see `Thread started successfully`.
Now try poking it with `[%fake-thread %foo]`. You should see something like:
```
Thread failed to start
/app/spider/hoon:<[355 5].[355 60]>
[%no-file-for-thread %fake-thread]
/app/spider/hoon:<[354 5].[355 60]>
/app/spider/hoon:<[353 3].[359 19]>
/app/spider/hoon:<[350 3].[359 19]>
/app/spider/hoon:<[346 3].[359 19]>
/app/spider/hoon:<[343 3].[359 19]>
/app/spider/hoon:<[341 3].[359 19]>
/app/spider/hoon:<[340 3].[359 19]>
/app/spider/hoon:<[336 3].[359 19]>
/app/spider/hoon:<[335 3].[359 19]>
/app/spider/hoon:<[202 24].[202 68]>
/app/spider/hoon:<[200 7].[207 9]>
/app/spider/hoon:<[199 5].[208 17]>
/app/spider/hoon:<[197 5].[208 17]>
/app/spider/hoon:<[196 5].[208 17]>
/sys/vane/gall/hoon:<[1.370 9].[1.370 37]>
```
### Analysis
We can ignore the input logic; here's the important part:
```hoon
=/  tid  `@ta`(cat 3 'thread_' (scot %uv (sham eny.bowl)))
=/  ta-now  `@ta`(scot %da now.bowl)
=/  start-args  [~ `tid byk.bowl(r da+now.bowl) p.q.vase !>(q.q.vase)]
:_  this
:~
  [%pass /thread/[ta-now] %agent [our.bowl %spider] %poke %spider-start !>(start-args)]
==
```
You can generate a tid any way you like, just make sure it's unique. Here we just use the hash of some entropy prefixed with `thread_`.
Then it's just a poke to `%spider` with the mark `%spider-start` and a vase containing [start-args](/docs/userspace/threads/reference#start-thread). Spider will then respond with a `%poke-ack` with a `(unit tang)` which will be `~` if it started successfully or else contain an error and a traceback if it failed. Here we test for this and print the result:
```hoon
++  on-agent
  |=  [=wire =sign:agent:gall]
  ^-  (quip card _this)
  ?+    -.wire  (on-agent:def wire sign)
      %thread
    ?+    -.sign  (on-agent:def wire sign)
        %poke-ack
      ?~  p.sign
        %-  (slog leaf+"Thread started successfully" ~)
        `this
      %-  (slog leaf+"Thread failed to start" u.p.sign)
      `this
    ==
  ==
```
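To make the shape of `start-args` easier to read, the tuple built in `on-poke` can be written out one field per line. This is only a sketch of an equivalent layout; the field names in the comments follow the `start-args` type given in the threads reference:

```hoon
::  same value as the single-line start-args in on-poke
=/  start-args
  :*  ~                        ::  parent: none, this is not a child thread
      `tid                     ::  use: the tid we generated
      byk.bowl(r da+now.bowl)  ::  beak: our ship, the agent's desk, case set to now
      p.q.vase                 ::  file: name of the thread in /ted
      !>(q.q.vase)             ::  vase: argument passed to the thread
  ==
```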

View File

@ -0,0 +1,152 @@
+++
title = "Stop Thread"
weight = 4
template = "doc.html"
+++
Here we've added one last card to `on-poke` to stop the thread and a little extra to `on-agent` to print things for demonstrative purposes.
#### `thread-starter.hoon`
```hoon
/+  default-agent, dbug
=*  card  card:agent:gall
%-  agent:dbug
^-  agent:gall
|_  =bowl:gall
+*  this  .
    def   ~(. (default-agent this %|) bowl)
::
++  on-init  on-init:def
++  on-save  on-save:def
++  on-load  on-load:def
++  on-poke
  |=  [=mark =vase]
  ^-  (quip card _this)
  ?+    mark  (on-poke:def mark vase)
      %noun
    ?+    q.vase  (on-poke:def mark vase)
        (pair term term)
      =/  tid  `@ta`(cat 3 'thread_' (scot %uv (sham eny.bowl)))
      =/  ta-now  `@ta`(scot %da now.bowl)
      =/  start-args  [~ `tid byk.bowl(r da+now.bowl) p.q.vase !>(q.q.vase)]
      :_  this
      :~
        [%pass /thread/[ta-now] %agent [our.bowl %spider] %watch /thread-result/[tid]]
        [%pass /thread/[ta-now] %agent [our.bowl %spider] %poke %spider-start !>(start-args)]
        [%pass /thread/updates/[ta-now] %agent [our.bowl %spider] %watch /thread/[tid]/updates]
        [%pass /thread-stop/[ta-now] %agent [our.bowl %spider] %poke %spider-stop !>([tid %.y])]
      ==
    ==
  ==
++  on-watch  on-watch:def
++  on-leave  on-leave:def
++  on-peek   on-peek:def
++  on-agent
  |=  [=wire =sign:agent:gall]
  ^-  (quip card _this)
  ?+    -.wire  (on-agent:def wire sign)
      %thread
    ?+    -.sign  (on-agent:def wire sign)
        %poke-ack
      ?~  p.sign
        %-  (slog leaf+"Thread started successfully" ~)
        `this
      %-  (slog leaf+"Thread failed to start" u.p.sign)
      `this
    ::
        %fact
      ?+    p.cage.sign  (on-agent:def wire sign)
          %thread-fail
        =/  err  !<  (pair term tang)  q.cage.sign
        %-  (slog leaf+"Thread failed: {(trip p.err)}" q.err)
        `this
          %thread-done
        ?:  =(q.cage.sign *vase)
          %-  (slog leaf+"Thread cancelled nicely" ~)
          `this
        =/  res  (trip !<(term q.cage.sign))
        %-  (slog leaf+"Result: {res}" ~)
        `this
          %update
        =/  msg  !<  tape  q.cage.sign
        %-  (slog leaf+msg ~)
        `this
      ==
    ==
      %thread-stop
    ?+    -.sign  (on-agent:def wire sign)
        %poke-ack
      ?~  p.sign
        %-  (slog leaf+"Thread cancelled successfully" ~)
        `this
      %-  (slog leaf+"Thread failed to stop" u.p.sign)
      `this
    ==
  ==
++  on-arvo  on-arvo:def
++  on-fail  on-fail:def
--
```
We've also added a `sleep` to the thread to keep it running for demonstration.
#### `test-thread.hoon`
```hoon
/-  spider
/+  *strandio
=,  strand=strand:spider
^-  thread:spider
|=  arg=vase
=/  m  (strand ,vase)
^-  form:m
;<  =path  bind:m  take-watch
;<  ~      bind:m  (send-raw-card [%give %fact ~[path] %update !>("message 1")])
;<  ~      bind:m  %-  send-raw-cards
                   :~  [%give %fact ~[path] %update !>("message 2")]
                       [%give %fact ~[path] %update !>("message 3")]
                       [%give %fact ~[path] %update !>("message 4")]
                   ==
;<  ~      bind:m  (send-raw-card [%give %kick ~[path] ~])
;<  ~      bind:m  (sleep ~m1)
|=  strand-input:strand
?+    q.arg  [~ %fail %not-foo ~]
    %foo
  [~ %done arg]
==
```
Save these, `|commit`, and run with `:thread-starter [%test-thread %foo]`. You should see:
```
Thread started successfully
message 1
message 2
message 3
message 4
Thread cancelled successfully
Thread cancelled nicely
```
Now, try changing the vase in our new card from `!>([tid %.y])` to `!>([tid %.n])`. Save, `|commit`, and run again. You should see:
```
Thread started successfully
message 1
message 2
message 3
message 4
Thread cancelled successfully
Thread failed: cancelled
```
### Analysis
The card we've added to our agent:
```hoon
[%pass /thread-stop/[ta-now] %agent [our.bowl %spider] %poke %spider-stop !>([tid %.y])]
```
...pokes spider with mark `%spider-stop` and a vase containing the tid of the thread we want to stop and a `?`. The `?` specifies whether to end it nicely or not. If `%.y` it will end with `%thread-done` and a `*vase` bunted vase. If `%.n` it will end with `%thread-fail` and a vase containing `[term tang]` where `term` is `%cancelled` and `tang` is `~`. You can see the difference in our tests above.
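If you stop threads from more than one place, the card can be wrapped in a small gate. This is a sketch only: `stop-card` is a hypothetical helper, not something spider or `default-agent` provides, and it would live in a helper core alongside the agent rather than inside the agent door itself:

```hoon
::  hypothetical helper arm, to be placed in a helper core
::  nice=%.y ends the thread with %thread-done, %.n with %thread-fail
++  stop-card
  |=  [our=ship =wire tid=@ta nice=?]
  ^-  card:agent:gall
  [%pass wire %agent [our %spider] %poke %spider-stop !>([tid nice])]
```

With that in scope, the card in `on-poke` could be produced with `(stop-card our.bowl /thread-stop/[ta-now] tid %.y)`.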

View File

@ -0,0 +1,184 @@
+++
title = "Take Facts"
weight = 3
template = "doc.html"
+++
Most of the time you'll just want the final result like how we did previously. Sometimes, though, you might want to send out facts while the thread runs rather than just at the end.
Here we've added another card to subscribe for any facts sent by the thread and some small changes to `on-agent`:
#### `thread-starter.hoon`
```hoon
/+  default-agent, dbug
=*  card  card:agent:gall
%-  agent:dbug
^-  agent:gall
|_  =bowl:gall
+*  this  .
    def   ~(. (default-agent this %|) bowl)
::
++  on-init  on-init:def
++  on-save  on-save:def
++  on-load  on-load:def
++  on-poke
  |=  [=mark =vase]
  ^-  (quip card _this)
  ?+    mark  (on-poke:def mark vase)
      %noun
    ?+    q.vase  (on-poke:def mark vase)
        (pair term term)
      =/  tid  `@ta`(cat 3 'thread_' (scot %uv (sham eny.bowl)))
      =/  ta-now  `@ta`(scot %da now.bowl)
      =/  start-args  [~ `tid byk.bowl(r da+now.bowl) p.q.vase !>(q.q.vase)]
      :_  this
      :~
        [%pass /thread/[ta-now] %agent [our.bowl %spider] %watch /thread-result/[tid]]
        [%pass /thread/[ta-now] %agent [our.bowl %spider] %poke %spider-start !>(start-args)]
        [%pass /thread/updates/[ta-now] %agent [our.bowl %spider] %watch /thread/[tid]/updates]
      ==
    ==
  ==
++  on-watch  on-watch:def
++  on-leave  on-leave:def
++  on-peek   on-peek:def
++  on-agent
  |=  [=wire =sign:agent:gall]
  ^-  (quip card _this)
  ?+    -.wire  (on-agent:def wire sign)
      %thread
    ?+    -.sign  (on-agent:def wire sign)
        %poke-ack
      ?~  p.sign
        %-  (slog leaf+"Thread started successfully" ~)
        `this
      %-  (slog leaf+"Thread failed to start" u.p.sign)
      `this
    ::
        %fact
      ?+    p.cage.sign  (on-agent:def wire sign)
          %thread-fail
        =/  err  !<  (pair term tang)  q.cage.sign
        %-  (slog leaf+"Thread failed: {(trip p.err)}" q.err)
        `this
          %thread-done
        =/  res  (trip !<(term q.cage.sign))
        %-  (slog leaf+"Result: {res}" ~)
        `this
          %update
        =/  msg  !<  tape  q.cage.sign
        %-  (slog leaf+msg ~)
        `this
      ==
    ==
  ==
++  on-arvo  on-arvo:def
++  on-fail  on-fail:def
--
```
We've also made some changes to the thread:
#### `test-thread.hoon`
```hoon
/-  spider
/+  *strandio
=,  strand=strand:spider
^-  thread:spider
|=  arg=vase
=/  m  (strand ,vase)
^-  form:m
;<  =path  bind:m  take-watch
;<  ~      bind:m  (send-raw-card [%give %fact ~[path] %update !>("message 1")])
;<  ~      bind:m  %-  send-raw-cards
                   :~  [%give %fact ~[path] %update !>("message 2")]
                       [%give %fact ~[path] %update !>("message 3")]
                       [%give %fact ~[path] %update !>("message 4")]
                   ==
;<  ~      bind:m  (send-raw-card [%give %kick ~[path] ~])
|=  strand-input:strand
?+    q.arg  [~ %fail %not-foo ~]
    %foo
  [~ %done arg]
==
```
Save & `|commit`, then run `:thread-starter [%test-thread %foo]`. You should see:
```
Thread started successfully
message 1
message 2
message 3
message 4
Result: foo
```
Now try `:thread-starter [%test-thread %bar]`. You should see:
```
Thread started successfully
message 1
message 2
message 3
message 4
Thread failed: not-foo
```
### Analysis
In our agent's `on-poke` arm we've added another card to subscribe to `/thread/[tid]/updates`:
```hoon
[%pass /thread/updates/[ta-now] %agent [our.bowl %spider] %watch /thread/[tid]/updates]
```
**Note:** In practice you'll want to include some kind of tag in the wire so you can differentiate particular threads and subscriptions and test for it in `on-agent`.
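One way to do that is to branch on the whole wire rather than just its head. The arm below is a sketch under the assumption that the agent uses the same two wires as above, `/thread/[ta-now]` and `/thread/updates/[ta-now]`; it only prints which kind of sign arrived on which wire, where a real agent would dispatch to its own handling:

```hoon
::  sketch of an on-agent arm that tells the two wires apart
++  on-agent
  |=  [=wire =sign:agent:gall]
  ^-  (quip card _this)
  ?+    wire  (on-agent:def wire sign)
      [%thread %updates @ ~]
    ::  facts subscription, opened on /thread/updates/[ta-now]
    %-  (slog leaf+"updates {(trip i.t.t.wire)}: {<-.sign>}" ~)
    `this
  ::
      [%thread @ ~]
    ::  start poke and result subscription, both on /thread/[ta-now]
    %-  (slog leaf+"thread {(trip i.t.wire)}: {<-.sign>}" ~)
    `this
  ==
```

Putting the `tid` in the wire instead of the timestamp would work the same way and lets you tell individual threads apart.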
Threads always send facts on `/thread/[tid]/some-path`. The thread itself will see the incoming subscription on `/some-path` though, not the full thing.
In the thread we've first added:
```hoon
;<  =path  bind:m  take-watch
```
...to take the subscription. Without something like this to handle the incoming `%watch`, spider will reject the subscription.
Then we've added:
```hoon
;<  ~  bind:m  (send-raw-card [%give %fact ~[path] %update !>("message 1")])
;<  ~  bind:m  %-  send-raw-cards
               :~  [%give %fact ~[path] %update !>("message 2")]
                   [%give %fact ~[path] %update !>("message 3")]
                   [%give %fact ~[path] %update !>("message 4")]
               ==
```
...to send some facts out to subscribers. Here we've used both `send-raw-card` and `send-raw-cards` to demonstrate both ways.
**Note:** in practice you'd probably want to send facts on a predetermined path and just test the path of the incoming subscription rather than just accepting anything.
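A sketch of that stricter version is below. It assumes `/updates` as the agreed path and uses a made-up error term, `%unexpected-path`; the thread fails if the subscription arrives on any other path:

```hoon
/-  spider
/+  *strandio
=,  strand=strand:spider
^-  thread:spider
|=  arg=vase
=/  m  (strand ,vase)
^-  form:m
;<  =path  bind:m  take-watch
::  the thread sees /updates, not the full /thread/[tid]/updates
?.  =(/updates path)
  (strand-fail %unexpected-path ~)
;<  ~  bind:m  (send-raw-card [%give %fact ~[/updates] %update !>("message 1")])
;<  ~  bind:m  (send-raw-card [%give %kick ~[/updates] ~])
(pure:m arg)
```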
Finally we've added:
```hoon
;<  ~  bind:m  (send-raw-card [%give %kick ~[path] ~])
```
...to kick subscribers. This is important because, unlike on `/thread-result`, spider will not automatically kick subscribers when the thread ends. You have to do it explicitly so your agent doesn't accumulate wires with repeated executions.
Back in our agent: In the `on-agent` arm we've added:
```hoon
    %update
  =/  msg  !<  tape  q.cage.sign
  %-  (slog leaf+msg ~)
  `this
```
... to take the facts and print them.
One last thing: notice that when we gave the thread an argument of `%bar` and made it fail, we still got the facts we sent. This is because such cards are sent immediately as the thread runs; they don't depend on the final result.

View File

@ -0,0 +1,132 @@
+++
title = "Take Result"
weight = 2
template = "doc.html"
+++
Here we've added an extra card to subscribe for the result and a couple of lines in on-agent to test if it succeeded:
#### `thread-starter.hoon`
```hoon
/+  default-agent, dbug
=*  card  card:agent:gall
%-  agent:dbug
^-  agent:gall
|_  =bowl:gall
+*  this  .
    def   ~(. (default-agent this %|) bowl)
::
++  on-init  on-init:def
++  on-save  on-save:def
++  on-load  on-load:def
++  on-poke
  |=  [=mark =vase]
  ^-  (quip card _this)
  ?+    mark  (on-poke:def mark vase)
      %noun
    ?+    q.vase  (on-poke:def mark vase)
        (pair term term)
      =/  tid  `@ta`(cat 3 'thread_' (scot %uv (sham eny.bowl)))
      =/  ta-now  `@ta`(scot %da now.bowl)
      =/  start-args  [~ `tid byk.bowl(r da+now.bowl) p.q.vase !>(q.q.vase)]
      :_  this
      :~
        [%pass /thread/[ta-now] %agent [our.bowl %spider] %watch /thread-result/[tid]]
        [%pass /thread/[ta-now] %agent [our.bowl %spider] %poke %spider-start !>(start-args)]
      ==
    ==
  ==
++  on-watch  on-watch:def
++  on-leave  on-leave:def
++  on-peek   on-peek:def
++  on-agent
  |=  [=wire =sign:agent:gall]
  ^-  (quip card _this)
  ?+    -.wire  (on-agent:def wire sign)
      %thread
    ?+    -.sign  (on-agent:def wire sign)
        %poke-ack
      ?~  p.sign
        %-  (slog leaf+"Thread started successfully" ~)
        `this
      %-  (slog leaf+"Thread failed to start" u.p.sign)
      `this
    ::
        %fact
      ?+    p.cage.sign  (on-agent:def wire sign)
          %thread-fail
        =/  err  !<  (pair term tang)  q.cage.sign
        %-  (slog leaf+"Thread failed: {(trip p.err)}" q.err)
        `this
          %thread-done
        =/  res  (trip !<(term q.cage.sign))
        %-  (slog leaf+"Result: {res}" ~)
        `this
      ==
    ==
  ==
++  on-arvo  on-arvo:def
++  on-fail  on-fail:def
--
```
#### `test-thread.hoon`
```hoon
/-  spider
=,  strand=strand:spider
^-  thread:spider
|=  arg=vase
=/  m  (strand ,vase)
^-  form:m
|=  strand-input:strand
?+    q.arg  [~ %fail %not-foo ~]
    %foo
  [~ %done arg]
==
```
Save these, `|commit` and then poke the app with `:thread-starter [%test-thread %foo]`. You should see:
```
Thread started successfully
Result: foo
```
Now try `:thread-starter [%test-thread %bar]`. You should see:
```
Thread started successfully
Thread failed: not-foo
```
### Analysis
In `on-poke` we've added an extra card _before_ the `%spider-start` poke to subscribe for the result:
```hoon
[%pass /thread/[ta-now] %agent [our.bowl %spider] %watch /thread-result/[tid]]
```
If successful the thread will return a cage with a mark of `%thread-done` and a vase containing the result.
If the thread failed it will return a cage with a mark of `%thread-fail` and a vase containing `[term tang]` where `term` is an error message and `tang` is a traceback. In our case our thread fails with error `%not-foo` when its argument is not `%foo`.
Note that spider will automatically `%kick` us from the subscription after ending the thread and returning the result.
```hoon
    %fact
  ?+    p.cage.sign  (on-agent:def wire sign)
      %thread-fail
    =/  err  !<  (pair term tang)  q.cage.sign
    %-  (slog leaf+"Thread failed: {(trip p.err)}" q.err)
    `this
      %thread-done
    =/  res  (trip !<(term q.cage.sign))
    %-  (slog leaf+"Result: {res}" ~)
    `this
  ==
```
Here in on-agent we've added a test for `%thread-fail` or `%thread-done` and print the appropriate result.

View File

@ -0,0 +1,73 @@
+++
title = "HTTP API"
weight = 2
template = "doc.html"
+++
Spider has an Eyre binding which allows threads to be run externally via [authenticated](/docs/arvo/eyre/external-api-ref#authentication) HTTP POST requests.
Spider is bound to the `/spider` URL path, and expects the requested URL to look like:
```
http{s}://{host}/spider/{desk}/{inputMark}/{threadName}/{outputMark}
```
The `desk` is the desk in which the thread resides. The `inputMark` is the `mark` the thread takes. The `threadName` is the name of the thread, e.g. `foo` for `/ted/foo/hoon`. The `outputMark` is the `mark` the thread produces. You may also include a file extension though it doesn't have an effect.
When Spider receives an HTTP request, the following steps happen:
1. It converts the raw body of the message to `json` using `de-json:html`
2. It creates a `tube:clay` (`mark` conversion gate) from `json` to whatever input `mark` you've specified and does the conversion.
3. It runs the specified thread and provides a `vase` of `(unit inputMark)` as the argument.
4. The thread does its thing and finally produces its result as a `vase` of `outputMark`.
5. Spider creates another `tube:clay` from the output `mark` to `json` and converts it.
6. It converts the `json` back into raw data suitable for the HTTP response body using `en-json:html`.
7. Finally, it composes the HTTP response and passes it back to Eyre which passes it on to the client.
Thus, it's important to understand that the original HTTP request and final HTTP response must contain JSON data, and therefore the input & output `mark`s you specify must each have a `mark` file in `/mar` that includes a conversion method for `json -> inputMark` and `outputMark -> json` respectively.
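As a rough sketch of what such a mark file can look like, here is a hypothetical `/mar/note.hoon` whose data is a single cord. The `%note` mark is invented for illustration. Its `grab` arm supplies the `json -> note` conversion Spider needs for the input, and its `grow` arm supplies `note -> json` for the output:

```hoon
::  /mar/note.hoon (hypothetical): a mark whose data is a cord
|_  txt=@t
++  grab
  |%
  ++  noun  @t
  ::  json -> note: accept only a JSON string
  ++  json  so:dejs:format
  --
++  grow
  |%
  ++  noun  txt
  ::  note -> json: emit a JSON string
  ++  json  s+txt
  --
++  grad  %noun
--
```

With a mark like this committed, a thread that takes a `vase` of `(unit @t)` and produces a `vase` of `@t` could be requested at `/spider/{desk}/note/{threadName}/note`, with a JSON string as the request body.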
## Example
Here we'll look at running Spider threads through Eyre.
Here's an extremely simple thread that takes a `vase` of `(unit json)` and just returns the `json` in a new `vase`. You can save it in `/ted` and `|commit %base`:
#### `eyre-thread.hoon`
```hoon
/-  spider
=,  strand=strand:spider
^-  thread:spider
|=  arg=vase
=/  m  (strand ,vase)
^-  form:m
=/  =json
  (need !<((unit json) arg))
(pure:m !>(json))
```
First we must obtain a session cookie by [authenticating](/docs/arvo/eyre/guide#authenticating).
Now we can try to run our thread. Spider is bound to the `/spider` URL path, and expects the rest of the path to be `/{desk}/{inputMark}/{thread}/{outputMark}`. Our `{thread}` is called `eyre-thread` and is in the `%base` `{desk}`. Both its `{inputMark}` and `{outputMark}` are `json`. Therefore, our URL path will be `/spider/base/json/eyre-thread/json`. Our request will be an HTTP POST request and the body will be some `json`, in this case `[{"foo": "bar"}]`:
```
curl -i --header "Content-Type: application/json" \
--cookie "urbauth-~zod=0v6.h6t4q.2tkui.oeaqu.nihh9.i0qv6" \
--request POST \
--data '[{"foo": "bar"}]' \
http://localhost:8080/spider/base/json/eyre-thread/json
```
Spider will run the thread and the result will be returned through Eyre in the body of an HTTP response with a 200 status code:
```
HTTP/1.1 200 ok
Date: Sun, 06 Jun 2021 05:32:45 GMT
Connection: keep-alive
Server: urbit/vere-1.5
set-cookie: urbauth-~zod=0v6.h6t4q.2tkui.oeaqu.nihh9.i0qv6; Path=/; Max-Age=604800
content-type: application/json
transfer-encoding: chunked
[{"foo":"bar"}]
```

View File

@ -0,0 +1,82 @@
+++
title = "Overview"
weight = 1
template = "doc.html"
+++
Urbit code lives in the following basic categories:
- Runtime (Nock interpreter, persistence engine, IO drivers, jets)
- Kernel vanes (managed by Arvo)
- Userspace agents (managed by Gall, permanent state)
- Userspace threads (managed by Spider, transient state)
This describes the last category: Threads.
A thread is a monadic function that takes arguments and produces a
result. It may perform input and output while running, so it is not a
pure function. It may fail.
An agent's strength is that it's permanent and bulletproof. All state
transitions are defined, and each action it performs is a transaction.
Code upgrades preserve existing state.
An agent's weakness is complex input and output. Since each state
transition must be explicitly handled, the complexity of an agent
explodes with the amount of IO it handles. At best, this results in
long and complex code; at worst, unexpected states are mishandled,
corrupting permanent state.
A thread's strength is that it can easily perform complex IO operations.
It uses what's often called the IO monad (plus the exception monad) to
provide a natural framework for IO.
A thread's weakness is that it's impermanent and may fail unexpectedly.
In most of its intermediate states, it expects only a small number of
events (usually one), so if it receives anything it didn't expect, it
fails. When code is upgraded, it's impossible to upgrade a running
thread, so it fails.
Thus, for anything that needs to be permanent, use an agent. When you
need to do a long or complex sequence of IO operations, reduce that to a
single logical IO operation by spinning it out into a thread. If you
only change your agent's state in response to success of the thread, an
IO failure will never result in partially applied state changes.
A thread may also be run from the dojo by prefixing its name with `-`
and giving it any arguments it requires. If alone, any result will be
printed to the screen; else the output may be piped into an agent or
other sinks.
## Thread basics
These docs walk through the fundamental things you need to know to write threads. They're focused on basic thread composition so don't touch on interacting with threads from gall agents and such. The included examples can all just be run from the dojo.
1. [Thread Fundamentals](/docs/userspace/threads/basics/fundamentals) - Basic information and overview of threads, strands, `form` & `pure`.
2. [Micgal and Bind](/docs/userspace/threads/basics/bind) - Covers using micgal and `bind` to chain strands.
3. [Strand Input](/docs/userspace/threads/basics/input) - What strands receive as input
4. [Strand Output](/docs/userspace/threads/basics/output) - What strands produce
5. [Summary](/docs/userspace/threads/basics/summary)
## Gall
These docs walk through the basics of interacting with threads from gall agents.
1. [Start a thread](/docs/userspace/threads/gall/start-thread)
2. [Subscribe for result](/docs/userspace/threads/gall/take-result)
3. [Subscribe for facts](/docs/userspace/threads/gall/take-facts)
4. [Stop a thread](/docs/userspace/threads/gall/stop-thread)
5. [Poke a thread](/docs/userspace/threads/gall/poke-thread)
## How-tos & Examples
- [Grab some JSON from a URL](/docs/userspace/threads/examples/get-json) - Here's an example of chaining a couple of external http requests for JSON.
- [Start a child thread](/docs/userspace/threads/examples/child-thread) - Starting and managing child threads.
- [Main Loop](/docs/userspace/threads/examples/main-loop) - Some notes and examples of the `strandio` function `main-loop`.
- [Poke an agent](/docs/userspace/threads/examples/poke-agent) - Example of poking an agent from a thread.
- [Scry](/docs/userspace/threads/examples/scry) - Scry arvo or an agent.
- [Take a fact](/docs/userspace/threads/examples/take-fact) - Subscribe to an agent and receive a fact.
## [Reference](/docs/userspace/threads/reference)
Basic reference information. For usage of particular `strandio` functions just refer directly to `/lib/strandio/hoon` since they're largely self-explanatory.

View File

@ -0,0 +1,87 @@
+++
title = "Reference"
weight = 50
template = "doc.html"
+++
## Start thread
Poke `spider` with mark `%spider-start` and a vase containing `start-args`:
```hoon
+$  start-args
  [parent=(unit tid) use=(unit tid) =beak file=term =vase]
```
Where:
- `parent` - optional `tid` of parent thread if the thread is a child. If specified, the child thread will be killed when the parent thread ends.
- `use` - `tid` (thread ID) to give the new thread. Can be generated with something like `(scot %ta (cat 3 'my-agent_' (scot %uv (sham eny))))`. However you do it, make sure it's unique.
- `beak` - A `$beak` is a triple of `[p=ship q=desk r=case]`. `p` is always our ship, `q` is the desk which contains the thread we want to run. `r` is a `case`, which specifies a desk revision and is a tagged union of:
```hoon
+$  case
  $%  [%da p=@da]    ::  date
      [%tas p=@tas]  ::  label
      [%ud p=@ud]    ::  number
  ==
```
You'll almost always just want the current revision, so you can specify the `case` as `da+now.bowl`. If the thread is on the same desk as the agent you can also just use `byk.bowl(r da+now.bowl)` for the `beak`.
- `file` - name of the thread file in `/ted`. For example, if the thread you want to start is `/ted/foo/hoon` you'd specify `%foo`.
- `vase` - `vase` to be given to the thread when it's started. Can be whatever or just `!>(~)` if it doesn't need any args.
#### Example
```hoon
[%pass /some-path %agent [our.bowl %spider] %poke %spider-start !>([~ `tid byk.bowl %foo !>(~)])]
```
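Putting the pieces together, here is a sketch of how an agent arm might generate a `tid`, subscribe for the result and start the thread in one go. It is a fragment meant to sit at the end of an arm that returns `(quip card _this)`; the `/thread/[tid]` wire, the `'thread_'` prefix and the `%foo` thread name are assumptions for illustration:

```hoon
::  generate a unique thread ID from entropy
=/  tid  `@ta`(cat 3 'thread_' (scot %uv (sham eny.bowl)))
::  no parent, our tid, current revision of our desk, /ted/foo/hoon, no args
=/  start-args  [~ `tid byk.bowl(r da+now.bowl) %foo !>(~)]
:_  this
:~  ::  watch for the result first, then start the thread
    [%pass /thread/[tid] %agent [our.bowl %spider] %watch /thread-result/[tid]]
    [%pass /thread/[tid] %agent [our.bowl %spider] %poke %spider-start !>(start-args)]
==
```

Sending the `%watch` card before the `%poke` card follows the advice to subscribe before starting the thread, so a fast-finishing thread can't produce its result before anyone is listening.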
## Stop thread
Poke `spider` with mark `%spider-stop` and a vase containing `[tid ?]`, where:
- `tid` - the `tid` of the thread you want to stop
- `?` - whether thread should end nicely. If `%.y` it'll end with mark `%thread-done` and the bunt value of a vase. If `%.n` it'll end with mark `%thread-fail` and a `[term tang]` where `term` is `%cancelled` and `tang` is `~`.
#### Example
```hoon
[%pass /some-path %agent [our.bowl %spider] %poke %spider-stop !>([tid %.y])]
```
## Subscribe for result
Spider will send the result on `/thread-result/[tid]` so you can subscribe there for the result. You should subscribe before starting the thread.
The result will have a mark of either `%thread-fail` or `%thread-done`.
- `%thread-fail` - has a vase containing a `[term tang]` where `term` is an error message and `tang` is a traceback.
- `%thread-done` - has a vase of the result of the thread.
#### Example
```hoon
[%pass /some-path %agent [our.bowl %spider] %watch /thread-result/[tid]]
```
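The result arrives in the agent's `++on-agent` arm as a `%fact`. The following is a rough sketch of handling both marks, assuming the subscription was made on a wire beginning with `/thread` and that `def` is the usual `default-agent` alias; the log messages are purely illustrative:

```hoon
++  on-agent
  |=  [=wire =sign:agent:gall]
  ^-  (quip card _this)
  ::  only handle facts that arrive on our (assumed) /thread/... wire
  ?.  ?=([%thread *] wire)
    (on-agent:def wire sign)
  ?.  ?=(%fact -.sign)
    (on-agent:def wire sign)
  ?+    p.cage.sign  (on-agent:def wire sign)
      %thread-fail
    ::  extract the error term and traceback, then print them
    =/  err  !<((pair term tang) q.cage.sign)
    %-  (slog leaf+"thread failed: {(trip p.err)}" q.err)
    `this
  ::
      %thread-done
    ::  q.cage.sign holds the thread's result vase
    %-  (slog leaf+"thread done" ~)
    `this
  ==
```

In a real agent you would extract `q.cage.sign` with the mold your thread actually produces and update state accordingly, rather than just printing.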
## Subscribe to thread
You can subscribe to a thread on `/thread/[tid]/path`. Note this is for facts sent off by the thread while it's running, not the final result. The path depends on the particular thread.
#### Example
```hoon
[%pass /some-path %agent [our.bowl %spider] %watch /thread/[tid]/thread-path]
```
## Poke thread
To poke a thread you poke spider with a mark of `%spider-input` and a vase of `[tid cage]`.
- `tid` is the tid of the thread you want to poke
- `cage` is whatever mark and vase you want to poke it with
#### Example
```hoon
[%pass /some-path %agent [our.bowl %spider] %poke %spider-input !>([tid %foo !>('foooo')])]
```
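On the receiving side, a thread can wait for such a poke with `take-poke` from `strandio`. This is a minimal sketch, assuming the `%foo` mark from the example above; the thread simply returns whatever vase it was poked with:

```hoon
/-  spider
/+  strandio
=,  strand=strand:spider
^-  thread:spider
|=  arg=vase
=/  m  (strand ,vase)
^-  form:m
::  block until a poke with mark %foo arrives, binding its vase to msg
;<  msg=vase  bind:m  (take-poke:strandio %foo)
::  finish, producing the poked vase as the thread's result
(pure:m msg)
```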

View File

@ -0,0 +1,89 @@
+++
title = "1. Introduction"
weight = 1
template = "doc.html"
+++
This series walks through the writing of a full Gall agent, and then the process
of integrating it with a React front-end. This series follows on from the
previous [Gall Guide](/docs/userspace/gall-guide/intro). If you haven't
completed that, or otherwise aren't familiar with the basics of writing Gall
agents, it's strongly recommended to work through that guide first.
The app we'll be looking at is a simple journal with an agent called `%journal`.
In the browser, users will be able to add plain text journal entries organized
by date. Entries may be scrolled through in ascending date order, with more
entries loaded each time the bottom of the list is reached. Old entries will be
able to be edited and deleted, and users will be able to search through entries
by specifying a date range.
The `Journal` app we'll be looking at can be installed on a live ship from
`~pocwet/journal`, and its source code is available [here](https://github.com/urbit/docs-examples/tree/main/journal-app).
![journal ui screenshot](https://media.urbit.org/docs/userspace/full-stack-guide/entries.png)
This walkthrough does not contain exercises, nor does it completely cover every
aspect of building the app in full depth. Rather, its purpose is to demonstrate
the process of creating a full-stack Urbit app, showing how everything fits
together, and how concepts you've previously learned are applied in practice.
The primary focus of the walkthrough is to show how a Javascript front-end is
integrated with a Gall agent and distributed as a complete app. Consequently,
the example app is fairly simple and runs on a local ship only, rather than one
with more complex inter-ship networking.
Each section of this walkthrough will list additional resources and learning
material at the bottom of the page, which will cover the concepts discussed in a
more comprehensive manner.
Here is the basic structure of the app we'll be building:
![journal app
diagram](https://media.urbit.org/docs/userspace/full-stack-guide/journal-app-diagram.svg)
## Sections
#### [1. Introduction](/docs/userspace/full-stack/1-intro)
An overview of the guide and table of contents.
#### [2. Types](/docs/userspace/full-stack/2-types)
Creating the `/sur` structure file for our `%journal` agent.
#### [3. Agent](/docs/userspace/full-stack/3-agent)
Creating the `%journal` agent itself.
#### [4. JSON](/docs/userspace/full-stack/4-json)
Writing a library to convert between our agent's marks and JSON. This lets our
React front-end poke our agent, and our agent send updates back to it.
#### [5. Marks](/docs/userspace/full-stack/5-marks)
Creating the mark files for the pokes our agent takes and updates it sends out.
#### [6. Eyre](/docs/userspace/full-stack/6-eyre)
A brief overview of how the webserver vane Eyre works.
#### [7. React App Setup](/docs/userspace/full-stack/7-react-setup)
Creating a new React app, installing the required packages, and setting up some
basic things for our front-end.
#### [8. React App Logic](/docs/userspace/full-stack/8-http-api)
Analyzing the core logic of our React app, with particular focus on using
methods of the `Urbit` class from `@urbit/http-api` to communicate with our
agent.
#### [9. Desk and Glob](/docs/userspace/full-stack/9-web-scries)
Building and "globbing" our front-end, and putting together a desk for
distribution.
#### [10. Summary](/docs/userspace/full-stack/10-final)
Some final comments and additional resources.

View File

@ -0,0 +1,118 @@
+++
title = "10. Summary"
weight = 10
template = "doc.html"
+++
That's it! We've built our agent and React front-end, put together a desk and
published it. We hope this walkthrough has helped you see how all the pieces
fit together for building and distributing an app in Urbit.
The reference material for each section of this walkthrough is listed
[below](#reference-material), the source code for our app is available
[here](https://github.com/urbit/docs-examples/tree/main/journal-app), and it can
be installed from `~pocwet/journal`.
In this guide we've built a separate React app for the front-end, but Hoon also
has a native domain-specific language for composing HTML structures called Sail.
Sail allows you to compose a front-end inside a Gall agent and serve it
directly. See the [Sail guide](/docs/hoon/guides/sail) for details.
Along with `@urbit/http-api`, there's also the `@urbit/api` NPM package, which
contains a large number of helpful functions for dealing with Hoon data types
and interacting with a number of agents - particularly those used by the Groups
app. Its source code is [available
here](https://github.com/urbit/urbit/tree/master/pkg/npm/api).
## Reference material
Here is the reference material for each section of this walkthrough.
#### Types
- [Gall Guide /sur section](/docs/userspace/gall-guide/7-sur-and-marks#sur) -
This section of the Gall Guide covers writing a `/sur` structure library for
an agent.
- [Ordered map functions in
`zuse.hoon`](https://github.com/urbit/urbit/blob/master/pkg/arvo/sys/zuse.hoon#L5284-L5688) -
This section of `zuse.hoon` contains all the functions for working with
`mop`s, and is well commented.
#### Agent
- [The Gall Guide](/docs/userspace/gall-guide/intro) - The Gall Guide covers all
aspects of writing Gall agents in detail.
- [Ordered map functions in
`zuse.hoon`](https://github.com/urbit/urbit/blob/master/pkg/arvo/sys/zuse.hoon#L5284-L5688) -
This section of `zuse.hoon` contains all the functions for working with
`mop`s, and is well commented.
- [`/lib/agentio.hoon`](https://github.com/urbit/urbit/blob/master/pkg/base-dev/lib/agentio.hoon) -
The `agentio` library in the `%base` desk contains a large number of useful
functions which make writing Gall agents easier.
#### JSON
- [The JSON Guide](/docs/hoon/guides/json-guide) - The stand-alone JSON guide
covers JSON encoding/decoding in great detail.
- [The Zuse Reference](/docs/hoon/reference/zuse/table-of-contents) - The
`zuse.hoon` reference documents all JSON-related functions in detail.
- [`++enjs:format` reference](/docs/hoon/reference/zuse/2d_1-5#enjsformat) -
This section of the `zuse.hoon` documentation covers all JSON encoding
functions.
- [`++dejs:format` reference](/docs/hoon/reference/zuse/2d_6) - This section of
the `zuse.hoon` documentation covers all JSON _decoding_ functions.
- [Eyre Overview](/docs/arvo/eyre/eyre) - This section of the Eyre vane
documentation goes over the basic features of the Eyre vane.
#### Marks
- [The Marks section of the Clay documentation](/docs/arvo/clay/marks/marks) -
This section of the Clay vane documentation covers mark files comprehensively.
- [The mark file section of the Gall
Guide](/docs/userspace/gall-guide/7-sur-and-marks#mark-files) - This part of
the Gall Guide goes through the basics of mark files.
- [The JSON Guide](/docs/hoon/guides/json-guide) - This also covers writing mark
files to convert to/from JSON.
#### Eyre
- [The Eyre vane documentation](/docs/arvo/eyre/eyre) - This section of the vane
docs covers all aspects of Eyre.
- [Eyre External API Reference](/docs/arvo/eyre/external-api-ref) - This section
of the Eyre documentation contains reference material for Eyre's external API.
- [The Eyre Guide](/docs/arvo/eyre/guide) - This section of the Eyre
documentation walks through using Eyre's external API at a low level (using
`curl`).
#### React App Setup and Logic
- [HTTP API Guide](/docs/userspace/http-api-guide) - Reference documentation for
`@urbit/http-api`.
- [React app source
code](https://github.com/urbit/docs-examples/tree/main/journal-app/ui) - The
source code for the Journal app UI.
- [`@urbit/http-api` source
code](https://github.com/urbit/urbit/tree/master/pkg/npm/http-api) - The
source code for the `@urbit/http-api` NPM package.
#### Desk and Glob
- [App publishing/distribution docs](/docs/userspace/dist/dist) -
Documentation covering third party desk composition, publishing and
distribution.
- [Glob documentation](/docs/userspace/dist/glob) - Comprehensive documentation
of handling front-end files.
- [Desk publishing guide](/docs/userspace/dist/guide) - A step-by-step guide to
creating and publishing a desk.

View File

@ -0,0 +1,215 @@
+++
title = "2. Types"
weight = 2
template = "doc.html"
+++
The best place to start when building a new agent is its type definitions in its
`/sur` structure file. The main things to think through are:
1. What basic types of data does my agent deal with?
2. What actions/commands does my agent need to handle?
3. What updates/events will my agent need to send out to subscribers?
4. What does my agent need to store in its state?
Let's look at each of these questions in turn, and put together our agent's
`/sur` file, which we'll call `/sur/journal.hoon`.
### 1. Basic types
Our journal entries will just be plain text, so a simple `@t` will work fine to
store their contents. Entries will be organized by date, so we'll also need to
decide a format for that.
One option would be to use an `@da`, and then use the date functions included
in the `@urbit/api` NPM package on the front-end to convert them to ordinary
Javascript `Date` objects. In this case, to keep it simple, we'll just use the
number of milliseconds since the Unix Epoch as an `atom`, since it's natively
supported by the Javascript `Date` object.
The structure for a journal entry can therefore be:
```hoon
+$  id  @
+$  txt  @t
+$  entry  [=id =txt]
```
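To get a feel for these values, the following expressions can be tried in the dojo one at a time. This is only a sketch: it uses the `++unm:chrono:userlib` conversion that the agent itself relies on later, and the entry text is made up:

```hoon
::  the current time as milliseconds since the Unix epoch
(unm:chrono:userlib now)
::  an entry-shaped cell using that timestamp as its id
[id=(unm:chrono:userlib now) txt='My first entry']
```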
### 2. Actions
Now that we know what a journal entry looks like, we can think about what kind
of actions/commands our agent will handle in its `++on-poke` arm. For our
journal app, there are three basic things we might do:
1. Add a new journal entry.
2. Edit an existing journal entry.
3. Delete an existing journal entry.
We can create a tagged union structure for these actions, like so:
```hoon
+$  action
  $%  [%add =id =txt]
      [%edit =id =txt]
      [%del =id]
  ==
```
### 3. Updates
Updates are a little more complicated than our actions. Firstly, our front-end
needs to be able to retrieve an initial list of journal entries to display. Once
it has that, it also needs to be notified of any changes. For example, if a new
entry is added, it needs to know so it can add it to the list it's displaying.
If an entry gets deleted, it needs to remove it from the list. Etc.
The simplest approach to the initial entries is just a `(list entry)`. Then, for
the subsequent updates, we could send out the `$action`. Since an `$action` is a
tagged union, it's simpler to have all updates be a tagged union, so when we get
to doing mark conversions we can just switch on the head tag. Therefore, we can
define an `$update` structure like so:
```hoon
+$  update
  $%  action
      [%jrnl list=(list entry)]
  ==
```
There's one drawback to this structure. Suppose either an agent on a remote ship
or an instance of the front-end client is subscribed for updates, and the
network connection is disrupted. In the remote ship case, Gall will only allow
so many undelivered messages to accumulate in Ames before it automatically kicks
the unresponsive subscriber. In the front-end case, the subscription will also
be ended if enough unacknowledged messages accumulate, and additionally the
client may sometimes need to establish an entirely new connection with the ship,
discarding existing subscriptions. When this happens, the remote ship or web
client has no way to know how many (if any) updates they've missed.
The only way to resynchronize their state with ours is to discard their existing
state, refetch the entire initial state once again, and then resubscribe for
updates. This might be fine if the state of our agent is small, but it becomes a
problem if it's very large. For example, if our agent holds tens of thousands of
chat messages, having to resend them all every time anyone has connectivity
issues is quite inefficient.
One solution to this is to keep an _update log_. Each update can be tagged with
the time it occurred, and stored in our agent's state, separately to the
entries. If an agent or web client needs to resynchronize with our agent, it can
just request all updates since the last one it received. This approach is used
by the `%graph-store` agent, for example. Our agent is local-only and doesn't
have a huge state so it might not be strictly necessary, but we'll use it to
demonstrate the approach.
We can define a logged update like so, where the `@` is the update timestamp in
milliseconds since the Unix Epoch:
```hoon
+$  logged  (pair @ action)
+$  update
  %+  pair  @
  $%  action
      [%jrnl list=(list entry)]
      [%logs list=(list logged)]
  ==
```
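As an illustration, here are two values that should fit the `$update` mold, assuming the structures above are in scope. The timestamps and text are invented. The first is a single timestamped action, as would be sent to subscribers:

```hoon
^-  update
[1.648.051.573.200 %add 1.648.051.573.109 'Dear diary...']
```

The second is a batch of logged updates, as might be returned to a client that is resynchronizing:

```hoon
^-  update
:-  1.648.051.580.000
:-  %logs
:~  [1.648.051.573.200 %add 1.648.051.573.109 'Dear diary...']
    [1.648.051.575.000 %del 1.648.051.573.109]
==
```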
### 4. State
We need to store two things in our state: the journal entries and the update
log. We could just use a couple of `map`s like so:
```hoon
+$  journal  (map id txt)
+$  log  (map @ action)
```
Ordinary `map`s are fine if we just want to access one value at a time, but we
want to be able to:
1. Retrieve only some of the journal entries at a time, so we can have "lazy
loading" in the front-end, loading more entries each time the user scrolls to
the bottom of the list.
2. Retrieve only logged updates newer than a certain time, in the case where the
subscription is interrupted due to connectivity issues.
3. Retrieve journal entries between two dates.
Maps are ordered by the hash of their key, so if we convert them to a list
they'll come out in seemingly random order. That means we'd have to convert the
map to a list, sort the list, and then iterate over it again to pull out the
items we want. We could alternatively store things in a list directly, but
retrieving or modifying arbitrary items would be less efficient.
To solve this, rather than using a `map` or a `list`, we can use an _ordered
map_. The mold builder for an ordered map is a `mop`, and it's included in the
[`zuse.hoon`](https://github.com/urbit/urbit/blob/master/pkg/arvo/sys/zuse.hoon#L5284)
utility library rather than the standard library.
A `mop` is defined similarly to a `map`, but it takes an extra argument in the
following manner:
```hoon
((mop key-mold val-mold) comparator-gate)
```
The gate is a binary gate which takes two keys and produces a `?`. The
comparator is used to decide how to order the items in the mop. In our case,
we'll create a `$journal` and `$log` `mop` like so:
```hoon
+$  journal  ((mop id txt) gth)
+$  log  ((mop @ action) lth)
```
The entries in `$journal` are ordered with `++gth`, so larger (newer) keys come
first and the right-most item is the oldest. The `$log` `mop` contains the update
log and is ordered with `++lth`, so it runs in ascending time order and the
right-most item is the newest.
We'll look at how to use ordered maps later when we get to writing the agent
itself.
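As a quick preview, here is a small dojo sketch of an ordered map. The `@`/`@t` molds and sample values are assumptions chosen for illustration; `++gas`, `++tap` and `++has` are arms of the `++on` core in `zuse.hoon`. The tall-form expression can be entered on one line with double spaces between the parts:

```hoon
::  set up the ordered-map core with gth as the comparator
=/  orm  ((on @ @t) gth)
::  insert three items, deliberately out of order
=/  jrnl  (gas:orm ~ ~[[1 'one'] [3 'three'] [2 'two']])
::  convert to a list and check for a key
[(tap:orm jrnl) (has:orm jrnl 2)]
```

With `gth` as the comparator, `++tap` should yield the item with the largest key first, regardless of the order in which the items were inserted.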
## Conclusion
When we put each of these parts together, we have our complete
`/sur/journal.hoon` file:
```hoon
|%
::  Basic types of the data we're dealing with
::
+$  id  @
+$  txt  @t
+$  entry  [=id =txt]
::  Poke actions
::
+$  action
  $%  [%add =id =txt]
      [%edit =id =txt]
      [%del =id]
  ==
::  Types for updates to subscribers or returned via scries
::
+$  logged  (pair @ action)
+$  update
  %+  pair  @
  $%  action
      [%jrnl list=(list entry)]
      [%logs list=(list logged)]
  ==
::  Types for our agent's state
::
+$  journal  ((mop id txt) gth)
+$  log  ((mop @ action) lth)
--
```
## Resources
- [Gall Guide /sur section](/docs/userspace/gall-guide/7-sur-and-marks#sur) -
This section of the Gall Guide covers writing a `/sur` structure library for
an agent.
- [Ordered map functions in
`zuse.hoon`](https://github.com/urbit/urbit/blob/master/pkg/arvo/sys/zuse.hoon#L5284-L5688) -
This section of `zuse.hoon` contains all the functions for working with
`mop`s, and is well commented.

View File

@ -0,0 +1,338 @@
+++
title = "3. Agent"
weight = 3
template = "doc.html"
+++
Now that we have our agent's types defined and have thought through its
behavior, we can write the `%journal` agent itself.
## Imports
```hoon
/-  *journal
/+  default-agent, dbug, agentio
```
We first import the `/sur/journal.hoon` file we previously created and expose
its structures. We import the standard `default-agent` and `dbug`, and also an additional library called `agentio`.
Agentio contains a number of convenience functions to make common agent tasks
simpler. For example, rather than writing out the full `$card`s when sending
`%fact`s to subscribers, we can call `++fact` in `agentio` with the `cage` and
`path`s and it will compose them for us. There are many more functions in
`agentio` than we'll use here - you can have a look through the library in
[`/base/lib/agentio.hoon`](https://github.com/urbit/urbit/blob/master/pkg/base-dev/lib/agentio.hoon)
to see what else it can do.
## State and type core
```hoon
|%
+$  versioned-state
  $%  state-0
  ==
+$  state-0  [%0 =journal =log]
+$  card  card:agent:gall
++  j-orm  ((on id txt) gth)
++  log-orm  ((on @ action) lth)
++  unique-time
  |=  [=time =log]
  ^-  @
  =/  unix-ms=@
    (unm:chrono:userlib time)
  |-
  ?.  (has:log-orm log unix-ms)
    unix-ms
  ::  key already taken: bump the candidate by one millisecond and retry
  $(unix-ms (add unix-ms 1))
--
```
As we discussed in the previous section, our state will contain a `$journal`
structure containing all our journal entries, and a `$log` structure containing
the update log. These are both _ordered maps_, defined as `((mop id txt) gth)`
and `((mop @ action) lth)` respectively. We can therefore define our _versioned
state_ as `[%0 =journal =log]`, in the usual manner.
We've defined `$card` for convenience as usual, and we've also added three more
arms. The first two relate to our two ordered maps. If you'll recall, an
ordinary `map` is called with the `++by` door in the standard library, like so:
```hoon
(~(get by foo) %bar)
```
An ordered map uses the `++on` gate in `zuse.hoon` rather than `++by`, and its
invocation is slightly different. It must first be set up in a similar manner to
the `mop` type, by providing it the key/value molds and comparator gate. Once
that's done, its individual functions can be called with the `mop` and
arguments, like:
```hoon
(get:((on @ud @ud) gth) foo %bar)
```
This is quite a cumbersome expression to use every time we want to interact with
our `mop`. To make it easier, we can store the `((on @ud @ud) gth)` part in an
arm, and then when we need to use it we can just do `(get:arm-name foo %bar)`.
In this case, we've done this for each of our ordered maps like so:
```hoon
++  j-orm  ((on id txt) gth)
++  log-orm  ((on @ action) lth)
```
The last arm in our state definition core is `++unique-time`. Since we'll use
`now.bowl` to derive the timestamp for updates, we run into an issue if multiple
pokes arrive in a single Arvo event. In that case, `now.bowl` would be the same
for each poke, so they'd be given the same key and override each other in the
`mop`. To avoid this, `++unique-time` is just a simple recursive function that
will increment the timestamp by one millisecond if the key already exists in the
`$log` `mop`, ensuring all updates get unique timestamps and there are no
collisions.
## Agent core setup
```hoon
%-  agent:dbug
=|  state-0
=*  state  -
^-  agent:gall
|_  =bowl:gall
+*  this  .
    def   ~(. (default-agent this %|) bowl)
    io    ~(. agentio bowl)
++  on-init  on-init:def
++  on-save
  ^-  vase
  !>(state)
::
++  on-load
  |=  old-vase=vase
  ^-  (quip card _this)
  `this(state !<(versioned-state old-vase))
::
```
Here we set up our agent core and define the three lifecycle arms. Since we only
have a single state version at present, these are very simple functions. You'll
notice in our `+*` arm, along with the usual `this` and `def`, we've also set up
the `agentio` library we imported, giving it the bowl and an alias of `io`.
## Pokes
```hoon
++  on-poke
  |=  [=mark =vase]
  ^-  (quip card _this)
  |^
  ?>  (team:title our.bowl src.bowl)
  ?.  ?=(%journal-action mark)  (on-poke:def mark vase)
  =/  now=@  (unique-time now.bowl log)
  =/  act  !<(action vase)
  =.  state  (poke-action act)
  :_  this(log (put:log-orm log now act))
  ~[(fact:io journal-update+!>(`update`[now act]) ~[/updates])]
  ::
  ++  poke-action
    |=  act=action
    ^-  _state
    ?-    -.act
        %add
      ?<  (has:j-orm journal id.act)
      state(journal (put:j-orm journal id.act txt.act))
    ::
        %edit
      ?>  (has:j-orm journal id.act)
      state(journal (put:j-orm journal id.act txt.act))
    ::
        %del
      ?>  (has:j-orm journal id.act)
      state(journal +:(del:j-orm journal id.act))
    ==
  --
::
```
Here we have our `++on-poke` arm, where we handle `$action`s. Since our
`%journal` agent is intended for local use only, we make sure only our ship or
our moons may perform actions with:
```hoon
?>  (team:title our.bowl src.bowl)
```
We haven't yet written our mark files, but our mark for `$action`s will be
`%journal-action`, so we make sure that's what we've received and if not, call
`++on-poke:def` to crash with an error message. We make sure the timestamps
are unique with our `++unique-time` function described earlier, and then we
extract the poke's vase to an `$action` structure and call `++poke-action` to
handle it. We've used `|^` to make `++on-poke` a core with a separate
`++poke-action` arm to make the logic a little simpler, but in principle we
could have had it all directly inside the main `++on-poke` gate, or even
separated it out into a helper core below.
The logic in `++poke-action` is very simple, with one case for each of the three possible `$action`s:
- `%add` - Add a new journal entry. We check it doesn't already exist with
`++has:j-orm`, and then add it to our `$journal` with `++put:j-orm`.
- `%edit` - Edit an existing journal entry. We make sure it _does_ exist with
`++has:j-orm`, and then override the old entry with the new one using
`++put:j-orm` again.
- `%del` - Delete an existing journal entry. We make sure it exists again with
`++has:j-orm`, and then use `++del:j-orm` to delete it from our `$journal`
`mop`.
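Once the whole desk is in place (including the `%journal-action` mark file, which we haven't written yet), the three cases could be exercised from the dojo along these lines. The id and text values are made up for illustration:

```
:journal &journal-action [%add 1.648.051.573.109 'First entry']
:journal &journal-action [%edit 1.648.051.573.109 'First entry, edited']
:journal &journal-action [%del 1.648.051.573.109]
:journal +dbug
```

Because the agent is wrapped with `agent:dbug`, the final command should print the agent's current state, letting you confirm that both the `$journal` and the `$log` changed as expected. Repeating the `%add` with the same id should crash the poke, since `++poke-action` asserts the entry doesn't already exist.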
Back in the main part of `++on-poke`, `++poke-action` updates the state with the
new `$journal`, then we proceed to:
```hoon
:_  this(log (put:log-orm log now act))
~[(fact:io journal-update+!>(`update`[now act]) ~[/updates])]
```
We add the timestamp to the action, converting it to a logged update. We add it
to the `$log` update log using `++put:log-orm`, and also send the logged update
out to subscribers on the `/updates` subscription path. We haven't written our
mark files yet, but `%journal-update` is the mark we'll use for `$update`s, so
we pack the `$update` in a vase and add the mark to make it a `$cage`. Notice
we're using the `++fact` function in `agentio` (which we aliased as `io`) rather
than manually composing the `%fact`.
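For comparison, here is a sketch of the same two lines with the card written out by hand instead of using `++fact:io`. It assumes `++fact` in `agentio` simply builds a `[%give %fact paths cage]` card from its arguments:

```hoon
:_  this(log (put:log-orm log now act))
:~  [%give %fact ~[/updates] %journal-update !>(`update`[now act])]
==
```

The two forms should behave the same; the library version is just a little shorter and harder to get wrong.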
## Subscriptions
```hoon
++  on-watch
  |=  =path
  ^-  (quip card _this)
  ?>  (team:title our.bowl src.bowl)
  ?+  path  (on-watch:def path)
    [%updates ~]  `this
  ==
::
```
Our subscription logic is extremely simple - we just have a single `/updates`
path, which the front-end or other local agents may subscribe to. All updates
get sent out on this path. We enforce local-only with the `team:title` check.
We could have had our `++on-watch` arm send out some initial state to new
subscribers, but for our front-end we'll instead fetch the initial state
separately with a scry. This just makes it slightly easier if our front-end
needs to resubscribe at some point - it'll already have some state in that case
so we don't want it to get sent again.
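For reference, here is a sketch of the alternative we decided against, where `++on-watch` sends the full journal to each new subscriber. This is not what our agent does; it only illustrates the trade-off. A `%fact` given with an empty path list in `++on-watch` goes only to the new subscriber:

```hoon
++  on-watch
  |=  =path
  ^-  (quip card _this)
  ?>  (team:title our.bowl src.bowl)
  ?+    path  (on-watch:def path)
      [%updates ~]
    =/  now=@  (unm:chrono:userlib now.bowl)
    :_  this
    ::  give the new subscriber every entry as an initial %jrnl update
    :~  [%give %fact ~ %journal-update !>(`update`[now %jrnl (tap:j-orm journal)])]
    ==
  ==
```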
## Scry Endpoints
```hoon
++  on-peek
  |=  =path
  ^-  (unit (unit cage))
  ?>  (team:title our.bowl src.bowl)
  =/  now=@  (unm:chrono:userlib now.bowl)
  ?+    path  (on-peek:def path)
      [%x %entries *]
    ?+    t.t.path  (on-peek:def path)
        [%all ~]
      :^  ~  ~  %journal-update
      !>  ^-  update
      [now %jrnl (tap:j-orm journal)]
    ::
        [%before @ @ ~]
      =/  before=@  (rash i.t.t.t.path dem)
      =/  max=@  (rash i.t.t.t.t.path dem)
      :^  ~  ~  %journal-update
      !>  ^-  update
      [now %jrnl (tab:j-orm journal `before max)]
    ::
        [%between @ @ ~]
      =/  start=@
        =+  (rash i.t.t.t.path dem)
        ?:(=(0 -) - (sub - 1))
      =/  end=@  (add 1 (rash i.t.t.t.t.path dem))
      :^  ~  ~  %journal-update
      !>  ^-  update
      [now %jrnl (tap:j-orm (lot:j-orm journal `end `start))]
    ==
  ::
      [%x %updates *]
    ?+    t.t.path  (on-peek:def path)
        [%all ~]
      :^  ~  ~  %journal-update
      !>  ^-  update
      [now %logs (tap:log-orm log)]
    ::
        [%since @ ~]
      =/  since=@  (rash i.t.t.t.path dem)
      :^  ~  ~  %journal-update
      !>  ^-  update
      [now %logs (tap:log-orm (lot:log-orm log `since ~))]
    ==
  ==
::
```
Here we have our `++on-peek` arm. The scry endpoints we've defined are divided
into two parts: querying the update `$log` and retrieving entries from the
`$journal`. Each end-point is as follows:
- `/x/entries/all` - Retrieve all entries in the `$journal`. Our front-end will
use lazy-loading and only get a few at a time, so it won't use this. It's nice
to have it though, in case other agents want to get that data.
- `/x/entries/before/[before]/[max]` - Retrieve at most `[max]` entries older
than the entry on `[before]` date. This is so our lazy-loading front-end can
progressively load more as the user scrolls down the page. The Javascript
front-end will format numbers without dot separators, so the path will look
like `/x/entries/before/1648051573109/10`. We therefore have to use the
[`++dem`](/docs/hoon/reference/stdlib/4i#dem) parsing `rule` in a
[`++rash`](/docs/hoon/reference/stdlib/4g#rash) parser to convert it to an
ordinary atom. We then use the `++tab:j-orm` `mop` function to retrieve the
requested range as a list and return it as an `$update` with a
`%journal-update` mark.
- `/x/entries/between/[start]/[end]` - Retrieve all journal entries between two
dates. This is so our front-end can have a search function, where the user can
enter a start and end date and get all the entries in between. The
`++lot:j-orm` `mop` function returns the subset of a `mop` between the two
given keys as a `mop`, and then we call `++tap:j-orm` to convert it to a list.
The `++lot:j-orm` function excludes the start and end values, so we subtract 1
from the start and add 1 to the end to make sure it includes the full range.
- `/x/updates/all` - Retrieve the entire update `$log`. Our front-end won't use
this but it might be useful for other agents, so we've included it here.
- `/x/updates/since/[since]` - Retrieve all `$update`s that have happened since
the specified timestamp, if any. This is so our front-end (or another agent)
can resynchronize its state in the event its subscription is interrupted,
without having to fetch everything from scratch again.
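These endpoints could be tried from the dojo roughly as follows. This is a sketch under a few assumptions: the desk and agent are both named `%journal`, the `%journal-update` mark file (not yet written) can convert to `%noun`, and the timestamp is made up. The first line builds the `/sur` file so the `$update` mold is available as `update:j`:

```
=j -build-file /=journal=/sur/journal/hoon
.^(update:j %gx /=journal=/entries/all/noun)
.^(update:j %gx /=journal=/entries/before/1648051573109/10/noun)
.^(update:j %gx /=journal=/updates/since/0/noun)
```

Note that the leading `/x` in the agent's paths corresponds to the `x` in the `%gx` care, and that a mark (`noun` here) is appended to the end of each scry path.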
We don't use any of the other agent arms, so the remainder have all been passed
to `default-agent` for handling:
```hoon
++  on-leave  on-leave:def
++  on-agent  on-agent:def
++  on-arvo   on-arvo:def
++  on-fail   on-fail:def
--
```
The full agent source can be viewed
[here](https://github.com/urbit/docs-examples/blob/main/journal-app/desk/app/journal.hoon).
## Resources
- [The Gall Guide](/docs/userspace/gall-guide/intro) - The Gall Guide covers all
aspects of writing Gall agents in detail.
- [Ordered map functions in
`zuse.hoon`](https://github.com/urbit/urbit/blob/master/pkg/arvo/sys/zuse.hoon#L5284-L5688) -
This section of `zuse.hoon` contains all the functions for working with
`mop`s, and is well commented.
- [`/lib/agentio.hoon`](https://github.com/urbit/urbit/blob/master/pkg/base-dev/lib/agentio.hoon) -
The `agentio` library in the `%base` desk contains a large number of useful
functions which make writing Gall agents easier.

View File

@ -0,0 +1,357 @@
+++
title = "4. JSON"
weight = 4
template = "doc.html"
+++
Data sent between our agent and our front-end will all be encoded as JSON. In
this section, we'll briefly look at how JSON works in Urbit, and write a library
to convert our agent's structures to and from JSON for our front-end.
JSON data comes into Eyre as a string, and Eyre parses it with the
[`++de-json:html`](/docs/hoon/reference/zuse/2e_2-3#de-jsonhtml) function in
[`zuse.hoon`](/docs/hoon/reference/zuse/table-of-contents). The
hoon type it's parsed to is `$json`, which is defined as:
```hoon
+$  json                    ::  normal json value
  $@  ~                     ::  null
  $%  [%a p=(list json)]    ::  array
      [%b p=?]              ::  boolean
      [%o p=(map @t json)]  ::  object
      [%n p=@ta]            ::  number
      [%s p=@t]             ::  string
  ==                        ::
```
Once Eyre has converted the raw JSON string to a `$json` structure, it will be
converted to the mark the web client specified and then delivered to the target
agent (unless the mark specified is already `%json`, in which case it will be
delivered directly). Outbound facts will go through the same process in
reverse - converted from the agent's native mark to `$json`, then encoded in a
string by Eyre using
[`++en-json:html`](/docs/hoon/reference/zuse/2e_2-3#en-jsonhtml) and delivered
to the web client. The basic flow for both inbound messages (pokes) and outbound
messages (facts and scry results) looks like this:
![eyre mark flow diagram](https://media.urbit.org/docs/userspace/full-stack-guide/eyre-mark-flow-diagram.svg)
The mark conversion will be done by the corresponding mark file in `/mar` on the
agent's desk. In our case it would be `/mar/journal/action.hoon` and
`/mar/journal/update.hoon` in the `%journal` desk for our `%journal-action` and
`%journal-update` marks, which are for the `$action` and `$update` structures we
defined previously.
Mark conversion functions can be included directly in the mark file, or they can
be written in a separate library, then imported and called by the mark file. We
will do the latter in this case, so before we create the mark files themselves,
we'll write a library called `/lib/journal.hoon` with the conversion functions.
## `$json` utilities
[`zuse.hoon`](/docs/hoon/reference/zuse/table-of-contents) contains three main
cores for converting to and from `$json`:
- [`++enjs:format`](/docs/hoon/reference/zuse/2d_1-5#enjsformat) - Functions to
help encode data structures as `$json`.
- [`++dejs:format`](/docs/hoon/reference/zuse/2d_6#dejsformat) - Functions to
decode `$json` to other data structures.
- [`++dejs-soft:format`](/docs/hoon/reference/zuse/2d_7#dejs-softformat) -
Mostly the same as `++dejs:format` except the functions produce units which
are null if decoding fails, rather than just crashing.
### `++enjs:format`
This contains ten functions for encoding `$json`. Most of them are for specific
hoon data types, such as `++tape:enjs:format`, `++ship:enjs:format`,
`++path:enjs:format`, etc. We'll just have a look at the two most general and
useful ones: `++frond:enjs:format` and `++pairs:enjs:format`.
#### `++frond`
This function is for forming a JSON object from a single key-value pair. For
example:
```
> (frond:enjs:format 'foo' s+'bar')
[%o p={[p='foo' q=[%s p='bar']]}]
```
When stringified by Eyre, this will look like:
```json
{ "foo": "bar" }
```
#### `++pairs`
This is similar to `++frond` and also forms a JSON object, but it takes multiple
key-value pairs rather than just one:
```
> (pairs:enjs:format ~[['foo' n+~.123] ['bar' s+'abc'] ['baz' b+&]])
[%o p={[p='bar' q=[%s p='abc']] [p='baz' q=[%b p=%.y]] [p='foo' q=[%n p=~.123]]}]
```
When stringified by Eyre, this will look like:
```json
{
"foo": 123,
"baz": true,
"bar": "abc"
}
```
Notice that we used a knot for the value of `foo` (`n+~.123`). Numbers in JSON
can be signed or unsigned and integers or floating point values. The `$json`
structure uses a knot so that you can decide whether a particular number should
be treated as `@ud`, `@sd`, `@rs`, etc.
### `++dejs:format`
This core contains many functions for decoding `$json`. We'll touch on some
useful families of `++dejs` functions in brief, but because there's so many, in
practice you'll need to look through the [`++dejs`
reference](/docs/hoon/reference/zuse/2d_6) to find the correct functions for
your use case.
#### Number functions
- `++ne` - decode a number to a `@rd`.
- `++ni` - decode a number to a `@ud`.
- `++no` - decode a number to a `@ta`.
- `++nu` - decode a hexadecimal string to a `@ux`.
For example:
```
> (ni:dejs:format n+'123')
123
```
#### String functions
- `++sa` - decode a string to a `tape`.
- `++sd` - decode a string containing a `@da` aura date value to a `@da`.
- `++se` - decode a string containing the specified aura to that aura.
- `++so` - decode a string to a `@t`.
- `++su` - decode a string by parsing it with the given [parsing
rule](/docs/hoon/reference/stdlib/4f).
#### Array functions
`++ar`, `++as`, and `++at` decode a `$json` array to a `list`, `set`, and
n-tuple respectively. These gates take other `++dejs` functions as an argument,
producing a new gate that will then take the `$json` array. For example:
```
> ((ar so):dejs:format a+[s+'foo' s+'bar' s+'baz' ~])
<|foo bar baz|>
```
Notice that `++so` is given as the argument to `++ar`. `++so` is a `++dejs`
function that decodes a `$json` string to a `cord`. The gate resulting from `(ar so)` is then called with a `$json` array as its argument, and its product is a
`(list @t)` of the elements of the array.
Many `++dejs` functions take other `++dejs` functions as their arguments. A
complex nested `$json` decoding function can be built up in this manner.
#### Object functions
- `++of` - decode an object containing a single key-value pair to a head-tagged
cell.
- `++ot` - decode an object to an n-tuple.
- `++ou` - decode an object to an n-tuple, replacing optional missing values
with a given value.
- `++oj` - decode an object of arrays to a `jug`.
- `++om` - decode an object to a `map`.
- `++op` - decode an object to a `map`, and also parse the object keys with a
[parsing rule](/docs/hoon/reference/stdlib/4f).
For example:
```
> =js %-  need  %-  de-json:html
  '''
  {
    "foo": "hello",
    "baz": true,
    "bar": 123
  }
  '''
> %-  (ot ~[foo+so bar+ni]):dejs:format  js
['hello' 123]
```
## Our types as JSON
We need to decide how our `$action` and `$update` types will be represented as
JSON in order to write our conversion functions. There are many ways to do this,
but in this case we'll do it as follows:
### Actions
| JSON | Noun |
| ------------------------------------------------- | ---------------------------------------------- |
| `{"add":{"id":1648366311070,"txt":"some text"}}`  | `[%add id=1.648.366.311.070 txt='some text']`  |
| `{"edit":{"id":1648366311070,"txt":"some text"}}` | `[%edit id=1.648.366.311.070 txt='some text']` |
| `{"del":{"id":1648366311070}}`                    | `[%del id=1.648.366.311.070]`                  |
### Updates
| Noun                                                                                             | JSON                                                                                                      |
| ------------------------------------------------------------------------------------------------ | --------------------------------------------------------------------------------------------------------- |
| `[1.648.366.481.425 %add id=1.648.366.311.070 txt='some text']`                                  | `{"time":1648366481425,"add":{"id":1648366311070,"txt":"some text"}}`                                     |
| `[1.648.366.481.425 %edit id=1.648.366.311.070 txt='some text']`                                 | `{"time":1648366481425,"edit":{"id":1648366311070,"txt":"some text"}}`                                    |
| `[1.648.366.481.425 %del id=1.648.366.311.070]`                                                  | `{"time":1648366481425,"del":{"id":1648366311070}}`                                                       |
| `[1.648.366.481.425 %jrnl ~[[id=1.648.366.311.070 txt='some text'] ...]]`                        | `{"time":1648366481425,"entries":[{"id":1648366311070,"txt":"some text"},...]}`                           |
| `[1.648.366.481.425 %logs ~[[1.648.366.481.425 %add id=1.648.366.311.070 txt='some text'] ...]]` | `{"time":1648366481425,"logs":[{"time":1648366481425,"add":{"id":1648366311070,"txt":"some text"}},...]}` |
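Seen from the front-end, the tables above mean an action is an object with a single key naming the action, and an update is the same shape with an extra `time` key (or an `entries`/`logs` array). The following is a minimal JavaScript sketch of those shapes. The helper names (`addAction`, `editAction`, `delAction`, `updateKind`) are hypothetical and are not part of the journal app or any library:

```js
// Hypothetical helpers that build the JSON form of each $action.
const addAction = (id, txt) => ({ add: { id, txt } });
const editAction = (id, txt) => ({ edit: { id, txt } });
const delAction = (id) => ({ del: { id } });

// Work out which kind of $update a decoded JSON object represents by
// looking for the one key it carries besides "time".
const updateKind = (upd) =>
  ["add", "edit", "del", "entries", "logs"].find((k) => k in upd) ?? null;

console.log(JSON.stringify(addAction(1648366311070, "some text")));
console.log(updateKind({ time: 1648366481425, del: { id: 1648366311070 } }));
```

Keeping one key per action is what lets the decoder use `++of` on the Hoon side, as described in the next section.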
Now let's write our library of encoding/decoding functions.
## `/lib/journal.hoon`
```hoon
/-  *journal
|%
```
First, we'll import the `/sur/journal.hoon` structures we previously created.
Next, we'll create two arms in our core, `++dejs-action` and `++enjs-update`, to
handle incoming poke `$action`s and outgoing facts or scry result `$update`s.
### `$json` to `$action`
```hoon
++  dejs-action
  =,  dejs:format
  |=  jon=json
  ^-  action
  %.  jon
  %-  of
  :~  [%add (ot ~[id+ni txt+so])]
      [%edit (ot ~[id+ni txt+so])]
      [%del (ot ~[id+ni])]
  ==
```
The first thing we do is use the [`=,`
rune](/docs/hoon/reference/rune/tis#-tiscom) to expose the `++dejs:format`
namespace. This allows us to reference `ot`, `ni`, etc rather than having to
write `ot:dejs:format` every time. Note that you should be careful using `=,`
generally as the exposed wings can shadow previous wings if they have the same
name.
We then create a gate that takes `$json` and returns a `$action` structure.
Since we'll only take one action at a time, we can use the `++of` function,
which takes a single key-value pair. `++of` takes a list of all possible `$json`
objects it will receive, tagged by key.
For each key, we specify a function to handle its value. Ours will be objects,
so we use `++ot` and specify the pairs of the key and `++dejs` function to decode
it. We then cast the output to our `$action` structure.
You'll notice the nesting of these `++dejs` functions approximately reflects the
nested structure of the `$json` it's decoding.
### `$update` to `$json`
```hoon
++  enjs-update
  =,  enjs:format
  |=  upd=update
  ^-  json
  |^
  ?+    -.q.upd  (logged upd)
      %jrnl
    %-  pairs
    :~  ['time' (numb p.upd)]
        ['entries' a+(turn list.q.upd entry)]
    ==
  ::
      %logs
    %-  pairs
    :~  ['time' (numb p.upd)]
        ['logs' a+(turn list.q.upd logged)]
    ==
  ==
  ++  entry
    |=  ent=^entry
    ^-  json
    %-  pairs
    :~  ['id' (numb id.ent)]
        ['txt' s+txt.ent]
    ==
  ++  logged
    |=  lgd=^logged
    ^-  json
    ?-    -.q.lgd
        %add
      %-  pairs
      :~  ['time' (numb p.lgd)]
          :-  'add'
          %-  pairs
          :~  ['id' (numb id.q.lgd)]
              ['txt' s+txt.q.lgd]
      ==  ==
        %edit
      %-  pairs
      :~  ['time' (numb p.lgd)]
          :-  'edit'
          %-  pairs
          :~  ['id' (numb id.q.lgd)]
              ['txt' s+txt.q.lgd]
      ==  ==
        %del
      %-  pairs
      :~  ['time' (numb p.lgd)]
          :-  'del'
          (frond 'id' (numb id.q.lgd))
      ==
    ==
  --
--
```
Our `$update` encoding function's a little more complex than our `$action`
decoding function, since our `$update` structure is more complex.
Like the previous one, we use `=,` to expose the namespace of `++enjs:format`.
Our gate takes an `$update` and returns a `$json` structure. We use `|^` so we
can separate out the encoding functions for individual entries (`++entry`) and
individual logged actions (`++logged`).
We first test the head of the `$update`, and if it's `%jrnl` (a list of
entries), we `turn` over the entries and call `++entry` to encode each one. If
it's `%logs`, we do the same, but call `++logged` for each item in the list.
Otherwise, if it's just a single update, we encode it with `++logged`.
We primarily use `++pairs` to form the object, though sometimes `++frond` if it
only contains a single key-value pair. We also use `++numb` to encode numerical
values.
You'll notice more of our encoding function is done manually than our previous
decoding function. For example, we form arrays by tagging an ordinary `list`
with `%a`, and strings by tagging an ordinary `cord` with `%s`. This is typical
when you write `$json` encoding functions, and is the reason there are far fewer
`++enjs` functions than `++dejs` functions.
## Resources
- [The JSON Guide](/docs/hoon/guides/json-guide) - The stand-alone JSON guide
covers JSON encoding/decoding in great detail.
- [The Zuse reference](/docs/hoon/reference/zuse/table-of-contents) - The
`zuse.hoon` reference documents all JSON-related functions in detail.
- [`++enjs:format` reference](/docs/hoon/reference/zuse/2d_1-5#enjsformat) -
This section of the `zuse.hoon` documentation covers all JSON encoding
functions.
- [`++dejs:format` reference](/docs/hoon/reference/zuse/2d_6) - This section of
the `zuse.hoon` documentation covers all JSON _decoding_ functions.
- [Eyre overview](/docs/arvo/eyre/eyre) - This section of the Eyre vane
documentation goes over the basic features of the Eyre vane.

View File

@ -0,0 +1,97 @@
+++
title = "5. Marks"
weight = 5
template = "doc.html"
+++
In this section we'll write the mark files for our agent. We'll need two marks,
one for poke `$action`s and one for subscription updates and scry results, both
of which are `$update`s. Our `$action` mark will be called `%journal-action` and
our `$update` mark will be called `%journal-update`. These will be located at
`/mar/journal/action.hoon` and `/mar/journal/update.hoon`.
Note that a mark called `%foo-bar` will first be looked for in
`/mar/foo-bar.hoon`, and if it's not there it will be looked for in
`/mar/foo/bar.hoon`. That's why we can have a single name like `%journal-action`
but have it in `/mar/journal/action.hoon`.
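The lookup order described above can be sketched in JavaScript as follows. `markPaths` is a hypothetical helper written only for illustration. It covers just the two candidate locations mentioned in the text, for a mark name with a single hyphen, and makes no claim about how Clay treats names with more hyphens:

```js
// Hypothetical helper: the two locations described above for a mark name
// containing a single hyphen, in the order they are checked.
const markPaths = (mark) => {
  const flat = `/mar/${mark}.hoon`;
  const i = mark.indexOf("-");
  if (i === -1) return [flat]; // no hyphen: only one possible location
  const nested = `/mar/${mark.slice(0, i)}/${mark.slice(i + 1)}.hoon`;
  return [flat, nested];
};

console.log(markPaths("journal-action"));
console.log(markPaths("journal-update"));
```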
## `%journal-action`
```hoon
/-  *journal
/+  *journal
|_  act=action
++  grow
  |%
  ++  noun  act
  --
++  grab
  |%
  ++  noun  action
  ++  json  dejs-action
  --
++  grad  %noun
--
```
First we import our `/sur/journal.hoon` structure file and also our
`/lib/journal.hoon` library (containing our `$json` conversion functions). The
sample of our mark door is just our `$action` structure. The `++grow` arm of a
mark core, if you recall, contains methods for converting _from_ our mark _to_
another mark. Actions only ever come inwards in pokes, so we don't need to worry
about converting an `$action` _to_ `$json`. The `++grow` arm can therefore just
handle the generic `%noun` case, simply returning our mark door's sample without
doing anything.
`++grab`, conversely, defines methods for converting _to_ our mark _from_
another mark. Since `$action`s will come in from the front-end as `$json`, we
need to be able to convert `$json` data to our `$action` structure. Our
`/lib/journal.hoon` library contains the `++dejs-action` function for performing
this conversion, so we can just specify that function for the `%json` case.
We'll also define a standard `%noun` method, which will just "clam" (or "mold")
the incoming noun with the `$action` mold. Clamming/molding coerces a noun to a
type and is done by calling a mold as a function.
Lastly, `++grad` defines revision control methods, but can be delegated to
another mark. Since this mark will never be used for actually storing files in
Clay, we can just delegate it to the generic `%noun` mark rather than writing a
proper set of `++grad` methods.
## `%journal-update`
```hoon
/-  *journal
/+  *journal
|_  upd=update
++  grow
  |%
  ++  noun  upd
  ++  json  (enjs-update upd)
  --
++  grab
  |%
  ++  noun  update
  --
++  grad  %noun
--
```
Next we have our `%journal-update` mark file. The sample of our mark door is our
`$update` structure. Our `$update`s are always outbound, never inbound, so we
only need to define a method for converting our `$update` structure to `$json`
in the `++grow` arm, and not the opposite direction in `++grab`. Our
`/lib/journal.hoon` library contains the `++enjs-update` function for performing
this conversion, so we can call it with the sample `$update` as its argument. We
can add `%noun` conversion methods and delegate revision control to the `%noun`
mark in the same manner as our `%journal-action` mark above.
## Resources
- [The Marks section of the Clay documentation](/docs/arvo/clay/marks/marks) -
This section of the Clay vane documentation covers mark files comprehensively.
- [The mark file section of the Gall
Guide](/docs/userspace/gall-guide/7-sur-and-marks#mark-files) - This part of
the Gall Guide goes through the basics of mark files.
- [The JSON Guide](/docs/hoon/guides/json-guide) - This also covers writing mark
files to convert to/from JSON.

View File

@ -0,0 +1,79 @@
+++
title = "6. Eyre"
weight = 6
template = "doc.html"
+++
Now that we have our structure file, agent, `$json` conversion library and mark
file, our back-end is complete. Before we start writing our front-end, though,
we should give a brief overview of how Eyre works.
[Eyre](/docs/arvo/eyre/eyre) is the HTTP server [vane](/docs/glossary/vane) of
Arvo. Eyre has a handful of different subsystems, but the main two are the
channel system and the scry interface. These two are what we'll focus on here.
In order to use the channel system or perform scries, a web client must have
authenticated with the ship's web login code (e.g.
`lidlut-tabwed-pillex-ridrup`) and obtained a session cookie. Our front-end will
be served directly from the ship by the `%docket` agent, so we can assume a
session cookie was already obtained when the user logged into Landscape, and
skip over authentication.
## Channels
Eyre's channel system is the main way to interact with agents from a web client.
It provides a JSON interface to the ordinary poke and subscription system for
Gall agents.
First, a unique channel ID is generated by the web client (`@urbit/http-api`
uses the current Unix time concatenated with a random hex string). The client
then sends a poke or subscription request for the channel, and Eyre
automatically opens a new channel with that ID. Once open, the client can then
connect to the channel and receive any events such as poke acks, watch acks,
facts from subscriptions, etc.
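As an illustration of the channel ID described above, here is a minimal sketch that joins the current Unix time to a random hex string. The exact format (seconds rather than milliseconds, a hyphen separator, six hex digits) and the names `randomHex` and `makeChannelId` are assumptions for this example, not a specification of what `@urbit/http-api` does:

```js
// Build a random lowercase hex string of the given length.
const randomHex = (digits) => {
  let out = "";
  for (let i = 0; i < digits; i++) {
    out += Math.floor(Math.random() * 16).toString(16);
  }
  return out;
};

// Hypothetical channel ID: Unix time in seconds, a hyphen, then random hex.
// The separator and the hex length are assumptions made for illustration.
const makeChannelId = (now = Date.now()) =>
  `${Math.floor(now / 1000)}-${randomHex(6)}`;

console.log(makeChannelId());
```

All that matters to Eyre is that the ID is unique, which is why a timestamp plus some randomness is sufficient.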
The new channel is an SSE ([Server Sent
Event](https://html.spec.whatwg.org/#server-sent-events)) stream, and can be
handled by an `EventSource` object in JavaScript. The `@urbit/http-api` library
we'll use abstracts this for us, so we won't need to deal with an `EventSource`
object directly. The channel can handle multiple concurrent subscriptions to
different agent subscription paths, and different agents can be poked through
the one channel. This means a client only needs to open a single channel for all
of its interactions with the ship. Each subscription is given a different ID, so
they can be individually unsubscribed later.
A channel will time out after 12 hours of inactivity, and the timeout is reset
every time Eyre receives a message of any kind from the client. Additionally,
each subscription on the channel may only accumulate 50 unacknowledged facts
before it's considered "clogged", in which case the individual clogged
subscription will be closed by Eyre after a short delay. All events of any kind
which Eyre sends out on the channel must be ack'd by the client. Ack'ing one
event will also ack all previous events too. The `@urbit/http-api` library we'll
use automatically acks events for us, so we don't need to worry about clogged
subscriptions or manually ack'ing events.
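To make the ack rules above concrete, here is a toy model in JavaScript. It is not Eyre's implementation, and the class and method names are hypothetical. It only illustrates two points from the text: ack'ing one event acks every earlier event too, and a subscription counts as clogged once too many of its facts are unacknowledged. Whether the limit of 50 is inclusive is an assumption here:

```js
// Toy model of one channel's unacknowledged events. This is NOT Eyre's
// implementation; it only illustrates cumulative acks and the clog limit.
class ChannelModel {
  constructor(clogLimit = 50) {
    this.clogLimit = clogLimit; // the limit quoted in the text above
    this.nextId = 1;
    this.unacked = []; // [{ id, subscription }]
  }

  // Record a fact sent out for a given subscription; returns its event ID.
  sendFact(subscription) {
    const id = this.nextId++;
    this.unacked.push({ id, subscription });
    return id;
  }

  // Ack'ing one event also acks every event sent before it.
  ack(eventId) {
    this.unacked = this.unacked.filter((e) => e.id > eventId);
  }

  // Assumption: "clogged" means strictly more unacked facts than the limit.
  isClogged(subscription) {
    const n = this.unacked.filter((e) => e.subscription === subscription).length;
    return n > this.clogLimit;
  }
}

const chan = new ChannelModel();
let lastId = 0;
for (let i = 0; i < 51; i++) lastId = chan.sendFact(1);
console.log(chan.isClogged(1));
chan.ack(lastId); // a single ack clears everything up to and including lastId
console.log(chan.isClogged(1));
```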
Eyre expects a particular JSON object structure for each of these different
requests, but the `@urbit/http-api` library we'll use includes functions to send
pokes, subscription requests, etc, so we won't need to manually construct the
JSON objects in our front-end.
## Scries
Eyre's scry interface is separate to the channel system. Scries are performed by
a simple GET request to a path with a format of `/~/scry/{agent}{path}.{mark}`.
If successful, the HTTP response will contain the result with the mark
specified. If unsuccessful, Eyre will respond with an HTTP error status.
The `@urbit/http-api` library we'll use includes a function for performing
scries, so we'll not need to manually send GET requests to the ship.
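For reference, the path format above can be sketched as a small JavaScript helper. `scryUrl` is a hypothetical name used only for illustration. The example path `/updates/all` is the scry endpoint defined in our agent, and the `json` default is an assumption suited to our front-end:

```js
// Hypothetical helper that builds the scry URL path described above.
// `path` is expected to begin with "/", e.g. "/updates/all".
const scryUrl = (agent, path, mark = "json") =>
  `/~/scry/${agent}${path}.${mark}`;

console.log(scryUrl("journal", "/updates/all"));
```

Note that the care (`/x`) is not part of the URL.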
## Resources
- [The Eyre vane documentation](/docs/arvo/eyre/eyre) - This section of the vane
docs covers all aspects of Eyre.
- [Eyre External API Reference](/docs/arvo/eyre/external-api-ref) - This section
of the Eyre documentation contains reference material for Eyre's external API.
- [The Eyre Guide](/docs/arvo/eyre/guide) - This section of the Eyre
documentation walks through using Eyre's external API at a low level (using
`curl`).

View File

@ -0,0 +1,214 @@
+++
title = "7. React app setup"
weight = 7
template = "doc.html"
+++
Now that we have a basic idea of how Eyre works, we can begin working on our
React app front-end.
## Create React app
Node.js must be installed, and can be downloaded from their
[website](https://nodejs.org/en/download). With that installed, we'll have the
`npm` package manager available. The first thing we'll do is globally install
the `create-react-app` package with the following command:
```sh
npm install -g create-react-app
```
Once installed, we can use it to create a new `journal-ui` directory and set up a
new React app in it with the following command:
```sh
create-react-app journal-ui
```
We can then open our new directory:
```sh
cd journal-ui
```
Its contents should look something like this:
```
journal-ui
├── node_modules
├── package.json
├── package-lock.json
├── public
├── README.md
└── src
```
## Install `http-api`
Inside our React app directory, let's install the `@urbit/http-api` NPM package:
```sh
npm i @urbit/http-api
```
We also install a handful of other packages for the UI components (`bootstrap react-bootstrap react-textarea-autosize date-fns react-bottom-scroll-listener react-day-picker`), but that's not important to our purposes here.
## Additional tweaks
Our front-end will be served directly from the ship by the `%docket` app, where
a user will open it by clicking on its homescreen tile. Docket serves such
front-ends with a base URL path of `/apps/[desk]/`, so in our case it will be
`/apps/journal`. In order for our app to be built with correct resource paths,
we must add the following line to `package.json`:
```json
"homepage": "/apps/journal/",
```
Our app also needs to know the name of the ship it's being served from in order
to talk with it. The `%docket` agent serves a small file for this purpose at
`[host]/session.js`. This file is very simple and just contains:
```js
window.ship = "sampel-palnet";
```
`sampel-palnet` will of course be replaced by the actual name of the ship. We
include this script by adding the following line to the `<head>` section of
`public/index.html`:
```html
<script src="/session.js"></script>
```
## Basic API setup
With everything now set up, we can begin work on the app itself. In this case
we'll just edit the existing `App.js` file in the `/src` directory. The first thing is to import the `Urbit` class from `@urbit/http-api`:
```js
import Urbit from "@urbit/http-api";
```
We also need to import a few other things, mostly relating to UI components (but
these aren't important for our purposes here):
```js
import React, { Component } from "react";
import "bootstrap/dist/css/bootstrap.min.css";
import "react-day-picker/lib/style.css";
import TextareaAutosize from "react-textarea-autosize";
import Button from "react-bootstrap/Button";
import Card from "react-bootstrap/Card";
import Stack from "react-bootstrap/Stack";
import Tab from "react-bootstrap/Tab";
import Tabs from "react-bootstrap/Tabs";
import ToastContainer from "react-bootstrap/ToastContainer";
import Toast from "react-bootstrap/Toast";
import Spinner from "react-bootstrap/Spinner";
import CloseButton from "react-bootstrap/CloseButton";
import Modal from "react-bootstrap/Modal";
import DayPickerInput from "react-day-picker/DayPickerInput";
import endOfDay from "date-fns/endOfDay";
import startOfDay from "date-fns/startOfDay";
import { BottomScrollListener } from "react-bottom-scroll-listener";
```
Inside the existing `App` class:
```js
class App extends Component {
```
...we'll clear out the existing demo code and start adding ours. The first thing
is to define our app's state. We'll look at most of the state entries in the
next section. For now, we'll just consider `status`.
```js
state = {
  // .....
  status: null,
  // .....
};
```
Next, we'll set up the `Urbit` API object in `componentDidMount`. We could do
this outside the `App` class since we're adding it to `window`, but we'll do it
this way so it's all in one place:
```js
componentDidMount() {
  window.urbit = new Urbit("");
  window.urbit.ship = window.ship;
  window.urbit.onOpen = () => this.setState({status: "con"});
  window.urbit.onRetry = () => this.setState({status: "try"});
  window.urbit.onError = (err) => this.setState({status: "err"});
  this.init();
};
```
The first thing we do is create a new instance of the `Urbit` class we imported
from `@urbit/http-api`, and save it to `window.urbit`. The `Urbit` class
constructor takes three arguments: `url`, `desk` and `code`, of which only `url`
is mandatory.
- `url` is the URL of the ship we want to talk to. Since our React app will be
served by the ship, we can just leave it as an empty `""` string and let
`Urbit` use root-relative paths.
- `desk` is only necessary if we want to run threads through Eyre, and since
we're not going to do that, we can exclude it.
- `code` is the web login code for authentication, but since the user will
already have logged in, we can also exclude that.
Therefore, we call the class constructor with just the empty `url` string:
```js
window.urbit = new Urbit("");
```
Next, we need to set the ship name in our `Urbit` instance. Eyre requires the
ship name be specified in all requests, so if we don't set it, Eyre will reject
all the messages we send. We previously included `session.js` which sets
`window.ship` to the ship name, so we just set `window.urbit.ship` as that:
```js
window.urbit.ship = window.ship;
```
Next, we set three callbacks: `onOpen`, `onRetry`, and `onError`. These
callbacks are fired when the state of our channel connection changes:
- `onOpen` is called when a connection is established.
- `onRetry` is called when a channel connection has been interrupted (such as by
network issues) and the `Urbit` object is trying to reconnect. Reconnection
will be attempted up to three times: immediately, after 750ms, and after
3000ms.
- `onError` is called with an `Error` message once all retries have failed, or
otherwise when a fatal error occurs.
We'll look at how we handle these cases in the next section. For now, we'll just
set the `status` entry in the state to either `"con"`, `"try"`, or `"err"` as
the case may be. Note that it's not mandatory to set these callbacks, but
leaving connection problems unhandled is usually a bad idea.
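The callback wiring can be tried without a live ship by using a plain object in place of the `Urbit` instance. In this sketch, `wireStatus` and `fakeApi` are hypothetical names for illustration. Only the three callback property names come from the text above:

```js
// Hypothetical helper: attach the three connection callbacks to any object
// shaped like the `Urbit` instance, reporting status through `setStatus`.
const wireStatus = (api, setStatus) => {
  api.onOpen = () => setStatus("con");
  api.onRetry = () => setStatus("try");
  api.onError = (err) => setStatus("err");
  return api;
};

// Stand-in object used here instead of a real connection.
const fakeApi = {};
const seen = [];
wireStatus(fakeApi, (s) => seen.push(s));

// Simulate a connection opening, dropping, and finally giving up.
fakeApi.onOpen();
fakeApi.onRetry();
fakeApi.onError(new Error("gave up"));
console.log(seen);
```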
The last thing we do is call:
```js
this.init();
```
This function will fetch initial entries and subscribe for updates. We'll look
at it in the next section.
## Resources
- [HTTP API Guide](/docs/userspace/http-api-guide) - Reference documentation for
`@urbit/http-api`.
- [React app source
code](https://github.com/urbit/docs-examples/tree/main/journal-app/ui) - The
source code for the Journal app UI.
- [`@urbit/http-api` source
code](https://github.com/urbit/urbit/tree/master/pkg/npm/http-api) - The
source code for the `@urbit/http-api` NPM package.

View File

@ -0,0 +1,406 @@
+++
title = "8. React app logic"
weight = 8
template = "doc.html"
+++
With the basic things set up, we can now go over the logic of our app. We'll just
focus on functions that are related to ship communications using the `Urbit`
object we previously set up, and ignore UI components and other helper functions.
## State
In the previous section we just mentioned the connection `status` field of our
state. Here's the full state of our App:
```js
state = {
  entries: [],         // list of journal entries for display
  drafts: {},          // edits which haven't been submitted yet
  newDraft: {},        // new entry which hasn't been submitted yet
  results: [],         // search results
  searchStart: null,   // search query start date
  searchEnd: null,     // search query end date
  resultStart: null,   // search results start date
  resultEnd: null,     // search results end date
  searchTime: null,    // time of last search
  latestUpdate: null,  // most recent update we've received
  entryToDelete: null, // deletion target for confirmation modal
  status: null,        // connection status (con, try, err)
  errorCount: 0,       // number of errors so far
  errors: new Map(),   // list of error messages for display
};
```
We'll see how these are used subsequently.
## Initialize
The first thing our app does is call `init()`:
```js
init = () => {
  this.getEntries().then(
    (result) => {
      this.handleUpdate(result);
      this.setState({ latestUpdate: result.time });
      this.subscribe();
    },
    (err) => {
      this.setErrorMsg("Connection failed");
      this.setState({ status: "err" });
    }
  );
};
```
This function just calls `getEntries()` to retrieve the initial list of journal
entries then, if that succeeded, it calls `subscribe()` to subscribe for new
updates. If the initial entry retrieval failed, we set the connection `status`
and save an error message in the `errors` map. We'll look at what we do with
errors later.
## Getting entries
![entries screenshot](https://media.urbit.org/docs/userspace/full-stack-guide/entries.png)
The `getEntries` function scries our `%journal` agent for up to 10 entries
before the oldest we currently have. We call this initially, and then each time
the user scrolls to the bottom of the list.
```js
getEntries = async () => {
  const { entries: e } = this.state;
  const before = e.length === 0 ? Date.now() : e[e.length - 1].id;
  const max = 10;
  const path = `/entries/before/${before}/${max}`;
  return window.urbit.scry({
    app: "journal",
    path: path,
  });
};
```
The scry is done with the `Urbit.scry` method. This function takes two arguments
in an object:
- `app` - the agent to scry.
- `path` - the scry path. Note the `care` is not included - all scries through
Eyre are `%x` scries.
The `Urbit.scry` method only allows JSON results, but note that scries done via
direct GET requests allow other marks too.
The `Urbit.scry` method returns a Promise, which will be rejected with an HTTP
error if the scry failed. We handle it with a `.then` expression back in the
function that called it, either [`init()`](#initialize) or `moreEntries()`. If
the Promise resolves successfully, the results are passed to the
[`handleUpdate`](#updates) function which appends the new entries to the
existing ones in state.
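The path logic in `getEntries` can be pulled out into a pure function, which makes the paging behaviour easier to see. `entriesPath` is a hypothetical helper for illustration. It assumes, as `getEntries` does, that `entries` is ordered newest first, so the last element is the oldest:

```js
// Hypothetical pure version of the path logic in getEntries.
// `entries` is assumed to be sorted newest first, so the last is the oldest.
const entriesPath = (entries, max = 10, now = Date.now()) => {
  const before = entries.length === 0 ? now : entries[entries.length - 1].id;
  return `/entries/before/${before}/${max}`;
};

// With nothing loaded yet, page backwards from the current time.
console.log(entriesPath([], 10, 1648366481425));
// Otherwise, page backwards from the oldest entry we already have.
console.log(entriesPath([{ id: 1648366311070 }, { id: 1648366034844 }]));
```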
## Subscription
A subscription to the `/updates` path of our `%journal` agent is opened with our
`subscribe()` function:
```js
subscribe = () => {
  try {
    window.urbit.subscribe({
      app: "journal",
      path: "/updates",
      event: this.handleUpdate,
      err: () => this.setErrorMsg("Subscription rejected"),
      quit: () => this.setErrorMsg("Kicked from subscription"),
    });
  } catch {
    this.setErrorMsg("Subscription failed");
  }
};
```
We use the `Urbit.subscribe` method for this, which takes five arguments in an
object:
- `app` - the target agent.
- `path` - the `%watch` path we're subscribing to.
- `event` - a function to handle each fact the agent sends out. We call our
`handleUpdate` function, which we'll describe below.
- `err` - a function to call if the subscription request is rejected (nacked).
We just display an error in this case.
- `quit` - a function to call if we get kicked from the subscription. We also
just display an error in this case.
Note that the `Urbit.subscribe` method returns a subscription ID number. Since
we only have one subscription in our app which we never close, we don't bother
to record it. If your app has multiple subscriptions to manage, you may wish to
keep track of these IDs in your app's state.
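If an app did need to manage several subscriptions, one possible approach is sketched below. `SubscriptionTracker` is a hypothetical class and not part of `@urbit/http-api`. It assumes only that the API object has `subscribe(params)` and `unsubscribe(id)` methods, and it uses `await` so that it works whether `subscribe` returns the ID directly or a Promise of it:

```js
// Hypothetical tracker for subscription IDs. `api` is any object with
// `subscribe(params)` and `unsubscribe(id)` methods, such as the `Urbit`
// instance; `subscribe` may return the ID directly or a Promise of it.
class SubscriptionTracker {
  constructor(api) {
    this.api = api;
    this.ids = new Map(); // "app" + "path" key -> subscription ID
  }

  // Open a subscription and remember its ID under the app and path.
  async open(params) {
    const id = await this.api.subscribe(params);
    this.ids.set(`${params.app}${params.path}`, id);
    return id;
  }

  // Close a tracked subscription; returns false if it was never opened.
  async close(app, path) {
    const key = `${app}${path}`;
    if (!this.ids.has(key)) return false;
    await this.api.unsubscribe(this.ids.get(key));
    this.ids.delete(key);
    return true;
  }
}
```

In our app this would be used as `new SubscriptionTracker(window.urbit)`, with `open` taking the same argument object that `subscribe()` passes above.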
## Updates
This `handleUpdate` function handles all updates we receive. It's called
whenever an event comes in for our subscription, and it's also called with the
results of [`getEntries`](#getting-entries) and [`getUpdates`](#error-handling)
(described later).
It's a bit complex, but basically it just checks whether the JSON object is
`add`, `edit`, `del`, or `entries`, and then updates the state appropriately.
The object it's receiving is just the `$update` structure converted to JSON by
the mark conversion functions we wrote previously.
```js
handleUpdate = (upd) => {
  const { entries, drafts, results, latestUpdate } = this.state;
  if (upd.time !== latestUpdate) {
    if ("entries" in upd) {
      this.setState({ entries: entries.concat(upd.entries) });
    } else if ("add" in upd) {
      const { time, add } = upd;
      const eInd = this.spot(add.id, entries);
      const rInd = this.spot(add.id, results);
      const toE =
        entries.length === 0 || add.id > entries[entries.length - 1].id;
      const toR = this.inSearch(add.id, time);
      toE && entries.splice(eInd, 0, add);
      toR && results.splice(rInd, 0, add);
      this.setState({
        ...(toE && { entries: entries }),
        ...(toR && { results: results }),
        latestUpdate: time,
      });
    } else if ("edit" in upd) {
      const { time, edit } = upd;
      const eInd = entries.findIndex((e) => e.id === edit.id);
      const rInd = results.findIndex((e) => e.id === edit.id);
      const toE = eInd !== -1;
      const toR = rInd !== -1 && this.inSearch(edit.id, time);
      if (toE) entries[eInd] = edit;
      if (toR) results[rInd] = edit;
      (toE || toR) && delete drafts[edit.id];
      this.setState({
        ...(toE && { entries: entries }),
        ...(toR && { results: results }),
        ...((toE || toR) && { drafts: drafts }),
        latestUpdate: time,
      });
    } else if ("del" in upd) {
      const { time, del } = upd;
      const eInd = entries.findIndex((e) => e.id === del.id);
      const rInd = results.findIndex((e) => e.id === del.id);
      const toE = eInd !== -1;
      const toR = this.inSearch(del.id, time) && rInd !== -1;
      toE && entries.splice(eInd, 1);
      toR && results.splice(rInd, 1);
      (toE || toR) && delete drafts[del.id];
      this.setState({
        ...(toE && { entries: entries }),
        ...(toR && { results: results }),
        ...((toE || toR) && { drafts: drafts }),
        latestUpdate: time,
      });
    }
  }
};
```
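To make the shape of these updates concrete, here is a small sketch that classifies an update object by the same keys `handleUpdate` checks. The `updateKind` helper and the sample objects are illustrative assumptions, not part of the app; the real field contents are whatever the mark conversion functions produce.

```js
// Hypothetical helper: reports which kind of update a JSON object represents,
// using the same keys that handleUpdate tests for. Returns null if none match.
const updateKind = (upd) =>
  ["entries", "add", "edit", "del"].find((k) => k in upd) ?? null;

// Sample objects (made-up values) in the shape handleUpdate expects.
const samples = [
  { time: 1000, add: { id: 1, txt: "first" } },
  { time: 2000, edit: { id: 1, txt: "first, edited" } },
  { time: 3000, del: { id: 1 } },
  { time: 3000, entries: [] },
];

// Log the kind of each sample.
console.log(samples.map(updateKind));
```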
## Add, edit, delete
![add screenshot](https://media.urbit.org/docs/userspace/full-stack-guide/add.png)
When a user writes a new journal entry and hits submit, the `submitNew` function
is called. It uses the `Urbit.poke` method to poke our `%journal` agent.
```js
submitNew = (id, txt) => {
  window.urbit.poke({
    app: "journal",
    mark: "journal-action",
    json: { add: { id: id, txt: txt } },
    onSuccess: () => this.setState({ newDraft: {} }),
    onError: () => this.setErrorMsg("New entry rejected"),
  });
};
```
The `Urbit.poke` method takes a single object argument, of which we use five fields:
- `app` is the agent to poke.
- `mark` is the mark of the data we're sending. We specify `"journal-action"`,
so Eyre will use the `/mar/journal/action.hoon` mark we created to convert it
to a `$action` structure with a `%journal-action` mark before it's delivered
to our agent.
- `json` is the actual data we're poking our agent with. In this case it's the
JSON form of the `%add` `$action`.
- `onSuccess` is a callback that fires if we get a positive ack in response. In
this case we just clear the draft.
- `onError` is a callback that fires if we get a negative ack (nack) in
response, meaning the poke failed. In this case we just set an error message
to be displayed.
`onSuccess` and `onError` are optional, but it's usually desirable to handle
these cases.
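Since every poke to the agent shares the same `app` and `mark`, the common parts can be factored out. This is a sketch only: `journalPoke` is a hypothetical helper, not something the app or `@urbit/http-api` provides.

```js
// Hypothetical helper: builds the parameter object passed to Urbit.poke.
// Only `json` and the callbacks differ between add, edit and del pokes.
const journalPoke = (json, { onSuccess, onError } = {}) => ({
  app: "journal",
  mark: "journal-action",
  json,
  // The callbacks are optional, so only include the ones that were supplied.
  ...(onSuccess && { onSuccess }),
  ...(onError && { onError }),
});

// Usage, assuming window.urbit has been set up as earlier in the guide:
// window.urbit.poke(journalPoke({ del: { id: 123 } }));
```

Whether this is worth doing depends on how many different pokes your front-end sends; for three call sites, writing the objects out in full is arguably just as clear.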
The `delete` and `submitEdit` functions are similar to `submitNew`, but for the
`%del` and `%edit` actions rather than `%add`:
![edit screenshot](https://media.urbit.org/docs/userspace/full-stack-guide/edit.png)
```js
submitEdit = (id, txt) => {
  if (txt !== null) {
    window.urbit.poke({
      app: "journal",
      mark: "journal-action",
      json: { edit: { id: id, txt: txt } },
      onError: () => this.setErrorMsg("Edit rejected"),
    });
  } else this.cancelEdit(id);
};
```
![delete screenshot](https://media.urbit.org/docs/userspace/full-stack-guide/delete.png)
```js
delete = (id) => {
  window.urbit.poke({
    app: "journal",
    mark: "journal-action",
    json: { del: { id: id } },
    onError: () => this.setErrorMsg("Deletion rejected"),
  });
  this.setState({ rmModalShow: false, entryToDelete: null });
};
```
Note that whether we're adding, editing or deleting entries, we update our state
when we receive the update back on the `/updates` subscription, not when we poke
our agent.
## Search
![search screenshot](https://media.urbit.org/docs/userspace/full-stack-guide/search.png)
When searching for entries between two dates, the `getSearch` function is
called, which uses the `Urbit.scry` method to scry for the results in a similar
fashion to [`getEntries`](#getting-entries), but using the
`/x/entries/between/[start]/[end]` endpoint.
```js
getSearch = async () => {
  const { searchStart: ss, searchEnd: se, latestUpdate: lu } = this.state;
  if (lu !== null && ss !== null && se !== null) {
    let start = ss.getTime();
    let end = se.getTime();
    if (start < 0) start = 0;
    if (end < 0) end = 0;
    const path = `/entries/between/${start}/${end}`;
    window.urbit
      .scry({
        app: "journal",
        path: path,
      })
      .then(
        (result) => {
          this.setState({
            searchTime: result.time,
            searchStart: null,
            searchEnd: null,
            resultStart: ss,
            resultEnd: se,
            results: result.entries,
          });
        },
        (err) => {
          this.setErrorMsg("Search failed");
        }
      );
  } else {
    lu !== null && this.setErrorMsg("Search failed");
  }
};
```
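The path construction in `getSearch` can be isolated as a pure function, which makes the clamping of negative timestamps easier to see. `searchPath` is a hypothetical name used only for this sketch.

```js
// Hypothetical helper: turns two Date objects into the scry path used above.
const searchPath = (startDate, endDate) => {
  // getTime() gives milliseconds since the Unix epoch; dates before 1970 are
  // negative, so clamp them to 0, mirroring the checks in getSearch.
  const start = Math.max(0, startDate.getTime());
  const end = Math.max(0, endDate.getTime());
  return `/entries/between/${start}/${end}`;
};

console.log(searchPath(new Date(0), new Date(86400000)));
// logs "/entries/between/0/86400000"
```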
## Error handling
When the channel connection is interrupted, the `Urbit` object will begin trying to reconnect. On each attempt, it sets the connection `status` to `"try"`, as we specified for the `onRetry` callback. When this is set, a "reconnecting" message is displayed at the bottom of the screen:
![reconnecting screenshot](https://media.urbit.org/docs/userspace/full-stack-guide/reconnecting.png)
If all three reconnection attempts fail, the `onError` callback is fired and we replace the "reconnecting" message with a "reconnect" button:
![reconnect screenshot](https://media.urbit.org/docs/userspace/full-stack-guide/reconnect.png)
When clicked, the following function is called:
```js
reconnect = () => {
  window.urbit.reset();
  const latest = this.state.latestUpdate;
  if (latest === null) {
    this.init();
  } else {
    this.getUpdates().then(
      (result) => {
        result.logs.map((e) => this.handleUpdate(e));
        this.subscribe();
      },
      (err) => {
        this.setErrorMsg("Connection failed");
        this.setState({ status: "err" });
      }
    );
  }
};
```
Our `reconnect()` function first calls the `Urbit.reset` method. This closes the
channel connection, wipes event counts and subscriptions, and generates a new
channel ID. We could have tried reconnecting without resetting the connection,
but we don't know whether the channel still exists. We could time how long the
connection has been down and estimate whether it still exists, but it's easier
to just start fresh in this case.
Since we've reset the channel, we don't know if we've missed any updates. Rather
than having to refresh our whole state, we can use the `getUpdates()` function
to get any missing updates:
```js
getUpdates = async () => {
  const { latestUpdate: latest } = this.state;
  const since = latest === null ? Date.now() : latest;
  const path = `/updates/since/${since}`;
  return window.urbit.scry({
    app: "journal",
    path: path,
  });
};
```
This function uses the `Urbit.scry` method to scry the
`/x/updates/since/[since]` path, querying the update `$log` for entries more
recent than `latestUpdate`, which is always set to the last logged action we
received. The `getUpdates` function returns a Promise to the `reconnect`
function above which called it. The `reconnect` function handles it in a `.then`
expression, where the success case passes each update retrieved to the
[`handleUpdate`](#updates) function, updating our state.
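The replay step can be sketched on its own. This is an illustration, not the app's code: `replayUpdates` is a hypothetical helper, and sorting by `time` is an added assumption in case the scry result isn't already in chronological order.

```js
// Hypothetical helper: applies missed updates oldest-first.
const replayUpdates = (logs, handle) => {
  // Copy before sorting so the caller's array is left untouched.
  const ordered = [...logs].sort((a, b) => a.time - b.time);
  // forEach rather than map, since only the side effects matter here.
  ordered.forEach((upd) => handle(upd));
  return ordered.length;
};

// Usage inside the success branch of reconnect might look like:
// replayUpdates(result.logs, (upd) => this.handleUpdate(upd));
```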
Lastly, as well as handling channel connection errors, we also handle errors
such as poke nacks or failed scries by printing error messages added to the
`error` map by the `setErrorMsg` function. You could of course handle nacks,
kicks, scry failures, etc. differently than just printing an error; it depends on
the needs of your app.
![search failed screenshot](https://media.urbit.org/docs/userspace/full-stack-guide/search-failed.png)
## Resources
- [HTTP API Guide](/docs/userspace/http-api-guide) - Reference documentation for
`@urbit/http-api`.
- [React app source
code](https://github.com/urbit/docs-examples/tree/main/journal-app/ui) - The
source code for the Journal app UI.
- [`@urbit/http-api` source
code](https://github.com/urbit/urbit/tree/master/pkg/npm/http-api) - The
source code for the `@urbit/http-api` NPM package.

View File

@ -0,0 +1,226 @@
+++
title = "9. Desk and glob"
weight = 9
template = "doc.html"
+++
With our React app now complete, we can put together the final desk and publish
it.
## Config files
So far we've written the following files for the back-end:
```
ourfiles
├── app
│   └── journal.hoon
├── lib
│   └── journal.hoon
├── mar
│   └── journal
│       ├── action.hoon
│       └── update.hoon
└── sur
    └── journal.hoon
```
There's a handful of extra files we need in the root of our desk:
- `desk.bill` - the list of agents that should be started when our app is
installed.
- `sys.kelvin` - the kernel version our app is compatible with.
- `desk.docket-0` - configuration of our app tile, front-end glob and other
metadata.
We only have one agent to start, so `desk.bill` is very simple:
```
:~  %journal
==
```
Likewise, `sys.kelvin` just contains:
```
[%zuse 418]
```
The `desk.docket-0` file is slightly more complicated:
```
:~
  title+'Journal'
  info+'Dear diary...'
  color+0xd9.b06d
  version+[0 1 0]
  website+'https://urbit.org'
  license+'MIT'
  base+'journal'
  glob-ames+[~zod 0v0]
==
```
The fields are as follows:
- `title` is the name of the app - this will be displayed on the tile and when
people search for the app to install it.
- `info` is a brief description of the app.
- `color` - the RGB hex color of the tile.
- `version` - the version number of the app. The fields represent major, minor
and patch version.
- `website` - a link to a website for the app. This would often be its GitHub repo.
- `license` - the license for the app.
- `base` - the desk name of the app.
- `glob-ames` - the ship to retrieve the front-end files from, and the hash of
those files. We've put `~zod` here but this would be the actual ship
distributing the app when it's live on the network. The hash is `0v0`
initially, but once we upload the front-end files it will be updated to the
hash of those files automatically. Note that it's also possible to distribute
front-end files from a separate web server. In that case, you'd use
`glob-http` rather than `glob-ames`. The [Glob section of the distribution
guide](/docs/userspace/dist/glob) covers this alternative approach in more
detail.
Our files should now look like this:
```
ourfiles
├── app
│   └── journal.hoon
├── desk.bill
├── desk.docket-0
├── lib
│   └── journal.hoon
├── mar
│   └── journal
│       ├── action.hoon
│       └── update.hoon
├── sur
│   └── journal.hoon
└── sys.kelvin
```
## New desk
Next, we'll create a new `%journal` desk on our ship by forking an existing one.
Once created, we can mount it to the unix filesystem.
In the dojo of a fake ship:
```
> |merge %journal our %webterm
>=
> |mount %journal
>=
```
Now we can browse to it in the unix terminal:
```sh
cd ~/zod/journal
```
Currently it has the same files as the `%webterm` desk, so we need to delete
those:
```sh
rm -r *
```
Apart from the kernel and standard library, desks need to be totally
self-contained, including all mark files and libraries necessary to build them.
For example, since our app contains a number of `.hoon` files, we need the
`hoon.hoon` mark, and its dependencies. The easiest way to ensure our desk has
everything it needs is to copy in the "dev" versions of the `%base` and
`%garden` desks. To do this, we first clone the Urbit git repository:
```sh
git clone https://github.com/urbit/urbit.git urbit-git
```
If we navigate to the `pkg` directory in the cloned repo:
```sh
cd ~/urbit-git/pkg
```
...we can combine the `base-dev` and `garden-dev` desks with the included
`symbolic-merge.sh` script:
```sh
./symbolic-merge.sh base-dev journal
./symbolic-merge.sh garden-dev journal
```
Now, we copy the contents of the new `journal` folder into our empty desk:
```sh
cp -rL journal/* ~/zod/journal/
```
Note we've used the `L` flag to resolve symbolic links, because the dev-desks
contain symlinks to files in the actual `arvo` and `garden` folders.
We can copy across all of our own files too:
```sh
cp -r ~/ourfiles/* ~/zod/journal/
```
Finally, in the dojo, we can commit the whole lot:
```
|commit %journal
```
## Glob
The next step is to build our front-end and upload the files to our ship. In the
`journal-ui` folder containing our React app, we can run:
```sh
npm run build
```
This will create a `build` directory containing the compiled front-end files. To
upload it to our ship, we need to first install the `%journal` desk. In the
dojo:
```
|install our %journal
```
Next, in the browser, we navigate to the `%docket` globulator at
`http://localhost:8080/docket/upload` (replacing localhost with the actual host):
![globulator screenshot](https://m.tinnus-napbus.xyz/pub/globulator.png)
We select our `journal` desk, then we hit `Choose file`, and select the whole
`build` directory which was created when we built our React app. Finally, we hit
`glob!` to upload it.
If we now return to the homescreen of our ship, we'll see our tile displayed, and we can open our app by clicking on it:
![tiles screenshot](https://m.tinnus-napbus.xyz/pub/tiles.png)
## Publishing
The last thing we need to do is publish our app, so other users can install it
from our ship. To do that, we just run the following command in the dojo:
```
:treaty|publish %journal
```
## Resources
- [App publishing/distribution documentation](/docs/userspace/dist/dist) -
Documentation covering third party desk composition, publishing and
distribution.
- [Glob documentation](/docs/userspace/dist/glob) - Comprehensive documentation
of handling front-end files.
- [Desk publishing guide](/docs/userspace/dist/guide) - A step-by-step guide to
creating and publishing a desk.

View File

@ -0,0 +1,52 @@
+++
title = "Full-Stack Walkthrough"
weight = 10
sort_by = "weight"
template = "sections/docs/chapters.html"
insert_anchor_links = "right"
+++
#### [1. Introduction](/docs/userspace/gall-2/1-intro)
An overview of the guide and table of contents.
#### [2. Types](/docs/userspace/gall-2/2-types)
Creating the `/sur` structure file for our `%journal` agent.
#### [3. Agent](/docs/userspace/gall-2/3-agent)
Creating the `%journal` agent itself.
#### [4. JSON](/docs/userspace/gall-2/5-json)
Writing a library to convert between our agent's marks and JSON. This lets our
React front-end poke our agent, and our agent send updates back to it.
#### [5. Marks](/docs/userspace/gall-2/4-marks)
Creating the mark files for the pokes our agent takes and updates it sends out.
#### [6. Eyre](/docs/userspace/gall-2/6-eyre)
A brief overview of how the webserver vane Eyre works.
#### [7. React App Setup](/docs/userspace/gall-2/7-react-setup)
Creating a new React app, installing the required packages, and setting up some
basic things for our front-end.
#### [8. React App Logic](/docs/userspace/gall-2/8-http-api)
Analysing the core logic of our React app, with particular focus on using
methods of the `Urbit` class from `@urbit/http-api` to communicate with our
agent.
#### [9. Desk and Glob](/docs/userspace/gall-2/9-web-scries)
Building and "globbing" our front-end, and putting together a desk for
distribution.
#### [10. Summary](/docs/userspace/gall-2/10-final)
Some final comments and additional resources.

View File

@ -0,0 +1,305 @@
+++
title = "1. Arvo"
weight = 5
template = "doc.html"
+++
This document is a prologue to the Gall guide. If you've worked through [Hoon
School](/docs/hoon/hoon-school/intro) (or have otherwise learned the basics of
Hoon), you'll likely be familiar with generators, but not with all the other
parts of the Arvo operating system or the way it fits together. We'll go over
the basic details here so you're better oriented to learn Gall agent
development. We'll not go into the internal workings of the kernel much, but
just what is necessary to understand it from the perspective of userspace.
## Arvo and its Vanes
[Arvo](/docs/arvo/overview) is the Urbit OS and kernel which is written in
[Hoon](/docs/glossary/hoon), compiled to [Nock](/docs/glossary/nock), and
executed by the runtime environment and virtual machine
[Vere](/docs/glossary/vere). Arvo has eight kernel modules called vanes:
[Ames](/docs/arvo/ames/ames), [Behn](/docs/arvo/behn/behn),
[Clay](/docs/arvo/clay/clay), [Dill](/docs/arvo/dill/dill),
[Eyre](/docs/arvo/eyre/eyre), [Gall](/docs/arvo/gall/gall),
[Iris](/docs/arvo/iris/iris), and [Jael](/docs/arvo/jael/jael).
Arvo itself has its own small codebase in `/sys/arvo.hoon` which primarily
implements the [transition function](/docs/arvo/overview#operating-function) `(State, Event) -> (State, Effects)` for
events injected by the runtime. It also handles inter-vane messaging, the
[scry](/docs/arvo/concepts/scry) system, and a couple of other things. Most of
the heavy lifting is done by the vanes themselves - Arvo itself typically just
routes events to the relevant vanes.
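As a rough mental model only, the transition function can be pictured like this in JavaScript. Nothing here reflects how Arvo is actually implemented: the `transition` function, the event shape and the toy `behn` handler are all invented for illustration.

```js
// Toy model of (State, Event) -> (State, Effects). Each "vane" is a pure
// function from its own state and an event to a new state plus effects.
const vanes = {
  // Invented stand-in for a timer vane: it just records requested timers.
  behn: (state, event) => ({
    state: { timers: [...state.timers, event.at] },
    effects: [{ kind: "timer-set", at: event.at }],
  }),
};

const transition = (state, event) => {
  const handler = vanes[event.vane];
  // Unknown vane: the state is unchanged and no effects are emitted.
  if (!handler) return [state, []];
  const result = handler(state[event.vane], event);
  // The old state is never mutated; a new one is returned with the effects.
  return [{ ...state, [event.vane]: result.state }, result.effects];
};

const [next, effects] = transition(
  { behn: { timers: [] } },
  { vane: "behn", at: 42 }
);
console.log(next, effects);
```

The point of the model is that the whole OS state goes in and comes out of a single function, with routing to the relevant vane happening inside it.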
Each vane has its own state. Gall's state contains the agents it's managing,
Clay's state contains all the desks and their files, Jael's state contains all
its PKI data, etc. All the vanes and their states live in Arvo's state, so
Arvo's state ultimately contains the entire OS and its data.
Here's a brief summary of each of the vanes:
- **Ames**: This is both the name of Urbit's networking protocol, as well as the
vane that handles communications over it. All inter-ship communications are
done with Ames, but you'd not typically deal with it directly in a Gall agent
because Gall itself handles it for you.
- **Behn**: A simple timer vane. Behn lets your Gall agent set timers which go off
at the time specified and notify your agent.
- **Clay**: Filesystem vane. Clay is a revision-controlled, typed filesystem with a
built-in build system. Your agent's source code lives in Clay. Your agent's
source code and relevant files are automatically built and loaded upon
installation, so your Gall agent itself would not need to interact with Clay
unless you specifically wanted to read and write files.
- **Dill**: Terminal driver vane. You would not typically interact with Dill
directly; printing debug messages to the terminal is usually done with hinting runes
and functions rather than tasks to Dill, and CLI apps are mediated by a
sub-module of the `%hood` system agent called `%drum`. CLI apps will not be touched
on in this guide, but there's a separate [CLI
Apps](/docs/hoon/guides/cli-tutorial) guide which covers them if you're
interested.
- **Eyre**: Webserver vane. App web front-ends are served via Eyre. It's possible to
handle HTTP requests directly in a Gall agent (see the [Eyre
Guide](/docs/arvo/eyre/guide) for details), but usually you'd just serve a
front-end [glob](/docs/userspace/dist/glob) via the `%docket` agent, so you'd
not typically have your agent deal with Eyre directly.
- **Gall**: App management vane; this is where your agent lives.
- **Iris**: Web client vane. If you want your agent to query external web APIs and
the like, it's done via Iris. Oftentimes web API interactions are
spun out into [threads](/docs/userspace/threads/overview) to avoid
complicating the Gall agent itself, so a Gall agent would not necessarily deal
with Iris directly, even if it made use of external APIs.
- **Jael**: Key infrastructure vane. Jael keeps track of PKI data for your ship and
other ships on the network. Jael's data is most heavily used by Ames, and
since Gall handles Ames communications for you, you'd not typically deal with
Jael directly unless you were specifically writing something that made use of
its data.
## Userspace
Gall agents live in "userspace" as opposed to "kernelspace". Kernelspace is Arvo
and its eight vanes. Userspace is primarily Gall agents, generators, threads,
front-ends, and all of their related files in Clay. The distinction looks
something like this:
[![kernelspace/userspace diagram](https://media.urbit.org/docs/userspace/gall-guide/kernelspace-userspace-diagram.svg)](https://media.urbit.org/docs/userspace/gall-guide/kernelspace-userspace-diagram.svg)
By and large, Gall _is_ the userspace vane - the majority of userspace is either
Gall agents, or things used by Gall agents. Apart from the agents themselves,
there's also:
- **Generators**: These are basically scripts. You'll likely already be familiar
with these from Hoon School. Aside from learning exercises, their main use is
to make interacting with Gall agents easier from the dojo. Rather than having
to manually craft `%poke`s to agents, generators can take a simpler input,
reformat it into what the agent actually expects, and poke the agent with it.
When you do something like `:dojo|wipe` in the dojo, you're actually running
the `/gen/dojo/wipe.hoon` generator and poking the `%dojo` agent with its
output.
- **Threads**: While generators are for strictly synchronous operations, threads
make it easy to implement sequences of asynchronous operations. Threads are
managed by the `%spider` agent. They can be used as mere scripts like
generators, but their main purpose is for performing complex IO. For example,
suppose you need to query some external web API, then with the data in its
response you make another API call, and then another, before finally having
the data you need. If one of the API calls fails, your Gall agent is
potentially left in a strange intermediary state. Instead, you can put all the
IO logic in a separate thread which is completely atomic. That way the Gall
agent only has to deal with the two conditions of success or failure. Writing
threads is covered in a [separate
guide](/docs/userspace/threads/basics/fundamentals), which you might like to
work through after completing the Gall Guide.
- **Front-end**: Web UIs. It's possible for Gall agents to handle HTTP requests
directly and dynamically produce responses, but it's also possible to have a
static [glob](/docs/userspace/dist/glob) of HTML, CSS, Javascript, images,
etc, which are served to the client like an ordinary web app. Such front-end
files are typically managed by the `%docket` agent which serves them via Eyre.
The [software distribution guide](/docs/userspace/dist/guide) covers this in
detail, and you might like to work through it after completing the Gall Guide.
## The Filesystem
On an ordinary OS, you have persistent disk storage and volatile memory. An
application is launched by reading an executable file on disk, loading it into
memory and running it. The application will maybe read some more files from
disk, deserialize them into data structures in memory, perform some computations
and manipulate the data, then serialize the new data and write it back to disk.
This process is necessary because persistent storage is too slow to operate on
directly and the fast memory is wiped when it loses power. The result is that
all non-ephemeral data is ultimately stored as files in the filesystem on disk.
Arvo on the other hand is completely different.
Arvo has no concept of volatile memory - its whole state is assumed to be
persistent. This means it's unnecessary for a Gall agent to write its data to
the filesystem or read it in from the filesystem - an agent can just modify its
state in situ and leave it there. The Urbit runtime writes events to disk and
backs up Arvo's state on the host OS to ensure data integrity but Arvo itself
isn't concerned with such details.
The result of this total persistence is that the filesystem—Clay—does not have
the same fundamental role as on an ordinary OS. In Arvo, very little of its data
is actually stored in Clay. The vast majority is just in the state of Gall
agents and vanes. For example, none of the chat messages, notebooks, etc, in the
Groups app exist in Clay - they're all in the state of the `%graph-store` agent.
For the most part, Clay just stores source code.
Clay has a few unique features - it's a typed filesystem, with all file types
defined in `mark` files. It's revision controlled, in a similar way to git. It
also has a built-in build system (formerly a separate vane called Ford, which was
merged into Clay in 2020 to make atomicity of upgrades easier). We'll look at
some of these features in more detail later in the guide.
## Desk Anatomy
The fundamental unit in Clay is a desk. Desks are kind of like git repositories.
By default, new urbits come with the following desks included: `%base`, `%garden`, `%landscape`, `%webterm`
and `%bitcoin`.
- `%base` - This desk contains the kernel as well as some core agents and utilities.
- `%garden` - This desk contains agents and utilities for managing apps, and the
home screen that displays other app tiles.
- `%landscape` - This desk contains everything for the Groups app.
- `%webterm` - This desk is for the web dojo app.
- `%bitcoin` - This desk is for the bitcoin wallet app and bitcoin provider.
You'll typically also have a `%kids` desk, which is just a copy of `%base` from
upstream that sponsored ships (moons in the case of a planet, planets in the
case of a star) sync their `%base` desk from. Any third-party apps you've
installed will also have their own desks.
Desks are typically assumed to store their files according to the following directory structure:
```
desk
├── app
├── gen
├── lib
├── mar
├── sur
├── sys
├── ted
└── tests
```
- `app`: Gall agents.
- `gen`: Generators.
- `lib`: Libraries - these are imported with the `/+` Ford rune.
- `mar`: `mark` files, which are filetype definitions.
- `sur`: Structures - these typically contain type definitions and structures,
and would be imported with the `/-` Ford rune.
- `sys`: Kernel files and standard library. Only the `%base` desk has this
directory, it's omitted entirely in all other desks.
- `ted`: Threads.
- `tests`: Unit tests, to be run by the `%test` thread. This is often omitted in distributed desks.
This directory hierarchy is not strictly enforced, but most tools expect things
to be in their right place. Any of these folders can be omitted if they'd
otherwise be empty.
As mentioned, the `%base` desk alone includes a `/sys` directory containing the
kernel and standard libraries. It looks like this:
```
sys
├── arvo.hoon
├── hoon.hoon
├── lull.hoon
├── vane
│   ├── ames.hoon
│   ├── behn.hoon
│   ├── clay.hoon
│   ├── dill.hoon
│   ├── eyre.hoon
│   ├── gall.hoon
│   ├── iris.hoon
│   └── jael.hoon
└── zuse.hoon
```
- `arvo.hoon`: Source code for Arvo itself.
- `hoon.hoon`: Hoon standard library and compiler.
- `lull.hoon`: Mostly structures and type definitions for interacting with
vanes.
- `vane`: This directory contains the source code for each of the vanes.
- `zuse.hoon`: This is an extra utility library. It mostly contains
cryptographic functions and functions for dealing with web data like JSON.
The chain of dependency for the core kernel files is `hoon.hoon` -> `arvo.hoon`
-> `lull.hoon` -> `zuse.hoon`. For more information, see the [Filesystem
Hierarchy](/docs/arvo/reference/filesystem) documentation.
In addition to the directories discussed, there's a handful of special files a
desk might contain. All of them live in the root of the desk, and all are
optional in the general case, except for `sys.kelvin`, which is mandatory.
- `sys.kelvin`: Specifies the kernel version with which the desk is compatible.
- `desk.bill`: Specifies Gall agents to be auto-started upon desk installation.
- `desk.ship`: If the desk is being republished, the original publisher can be
specified here.
- `desk.docket-0`: Configures the front-end, tile, and other metadata for desks
which include a home screen app.
Each desk must be self-contained; it must include all the `mark`s, libraries,
threads, etc, that it needs. The one exception is the kernel and standard
libraries from the `%base` desk. Agents, threads and generators in other desks
all have these libraries available to them in their subject.
## APIs
You should now have a general idea of the different parts of Arvo, but how does
a Gall agent interact with these things?
There are two basic ways of interacting with other parts of the system: by
scrying into them, and by passing them messages and receiving messages in response.
There are also two basic things to interact with: vanes, and other agents.
- Scries: The scry system allows you to access the state of other agents and
vanes in a read-only fashion. Scries can be performed from any context with
the dotket (`.^`) rune. Each vane has "scry endpoints" which define what you
can read, and these are comprehensively documented in the Scry Reference of
each vane's section of the [Arvo documentation](/docs/arvo/overview). Agents
define scry endpoints in the `+on-peek` arm of their agent core.
Scries can only be done on the local ship; it is not yet possible to perform
scries over the network (but this functionality is planned for the future). There is a separate [guide to
scries](/docs/arvo/concepts/scry) which you might like to read through for
more details.
- Messages:
  - Vanes: Each vane has a number of `task`s it can be passed and `gift`s it can
    respond with, defined in its respective section of `lull.hoon`. These might do all
    manner of things, depending on the vane. For example, Iris might fetch an
    external HTTP resource for you, Clay might read or build a specified file,
    etc. The `task`s and `gift`s of each vane are comprehensively documented in the
    API Reference of each vane's section of the [Arvo
    documentation](/docs/arvo/overview).
  - Agents: These can be `%poke`d with some data, which is a request to perform a single action. They can also be `%watch`ed,
    which means to subscribe for updates. We'll discuss these in detail later in
    the guide.
Here's a simplified diagram of the ways an agent can interact with other parts
of the system:
![api diagram](https://media.urbit.org/docs/userspace/gall-guide/api-diagram.svg)
Things like `on-poke` are arms of the agent core. Don't worry about their
meaning for now, we'll discuss them in detail later in the guide.
Inter-agent messaging can occur over the network, so you can interact with
agents on other ships as well as local ones. You can only talk to local vanes,
but some vanes like Clay are able to make requests to other ships on your
behalf. Note this summary is simplified - vanes don't just talk in `task`s and
`gift`s in all cases. For example, requests from HTTP clients through Eyre (the
webserver vane) behave more like those from agents than vanes, and a couple of
other vanes also have some different behaviours. Agent interactions are also a
little more complicated, and we'll discuss that later, but the basic patterns
described here cover the majority of cases.
## Environment Setup
Before proceeding with the Gall Guide, you'll need to have an appropriate text
editor installed and configured, and know how to work with a fake ship for
development. Best practices are described in the [environment setup
guide](/docs/development/environment). Example agents and other code throughout
this guide will just be committed to the `%base` desk of a fake ship, but it's a
good idea to have a read through that guide for when you begin work on your own
apps.

View File

@ -0,0 +1,363 @@
+++
title = "10. Scries"
weight = 50
template = "doc.html"
+++
In this lesson we'll look at scrying agents, as well as how agents handle such
scries. If you're not at all familiar with performing scries in general, have a
read through the [Scry Guide](/docs/arvo/concepts/scry), as well as the [dotket
rune documentation](/docs/hoon/reference/rune/dot#-dotket).
## Scrying
A scry is a read-only request to Arvo's global namespace. Vanes and agents
define _scry endpoints_ which allow data to be requested from their states. The
endpoints can process the data in any way before returning it, but they cannot
alter the actual state - scries can only read, not modify.
Most of the time, scry requests are handled by Arvo, which routes the request to
the appropriate vane. When you scry a Gall agent you actually scry Gall itself.
Gall interprets the request, runs it on the specified agent, and then returns
the result. Scries are performed with the
[dotket](/docs/hoon/reference/rune/dot#-dotket) (`.^`) rune. Here's a summary of
their format:
![scry summary diagram](https://media.urbit.org/docs/arvo/scry-diagram-v2.svg)
A note on `care`s: Cares are most carefully implemented by Clay, where they specify
submodules and have tightly defined behaviors. For Gall agents, most of these
don't have any special behavior, and are just used to indicate the general kind
of data produced by the endpoint. There are a handful of exceptions to this:
`%d`, `%e`, `%u` and `%x`.
#### `%d`
A scry to Gall with a `%d` `care` and no `path` will produce the `desk` in which
the specified agent resides. For example:
```
> .^(desk %gd /=hark-store=)
%garden
> .^(desk %gd /=hood=)
%base
```
#### `%e`
A scry to Gall with a `%e` `care`, a `desk` rather than agent in the `desk`
field of the above diagram, and no path, will produce a set of all installed
agents on that desk and their status. For example:
```
> .^((set [=dude:gall live=?]) %ge /=garden=)
{ [dude=%hark-system-hook live=%.y]
  [dude=%treaty live=%.y]
  [dude=%docket live=%.y]
  [dude=%settings-store live=%.y]
  [dude=%hark-store live=%.y]
}
```
#### `%u`
A scry to Gall with a `%u` `care` and no `path` will check whether or not the
specified agent is installed and running:
```
> .^(? %gu /=btc-wallet=)
%.y
> .^(? %gu /=btc-provider=)
%.n
> .^(? %gu /=foobar=)
%.n
```
#### `%x`
A scry to Gall with a `%x` `care` will be passed to the agent for handling. Gall
handles `%x` specially, and expects an extra field at the end of the `path` that
specifies the `mark` to return. Gall will take the data produced by the
specified endpoint and try to convert it to the given mark, crashing if the mark
conversion fails. The extra field specifying the mark is not passed through to
the agent itself. Here's a couple of examples:
```
> =store -build-file /=landscape=/sur/graph-store/hoon
> .^(update:store %gx /=graph-store=/keys/noun)
[p=~2021.11.18..10.50.41..c914 q=[%keys resources={[entity=~zod name=%dm-inbox]}]]
> (crip (en-json:html .^(json %gx /=graph-store=/keys/json)))
'{"graph-update":{"keys":[{"name":"dm-inbox","ship":"zod"}]}}'
```
The majority of Gall agents simply take `%x` `care`s in their scry endpoints,
but in principle it's possible for a Gall agent to define a scry endpoint that
takes any one of the `care`s listed in the diagram above. An agent's scry
endpoints are defined in its `on-peek` arm, which we'll look at next.
## Handling scries
When a scry is performed on a Gall agent, Gall will strip out some extraneous
parts, and deliver it to the agent's `on-peek` arm as a `path`. The `path` will
only have two components from the diagram above: the _care_ and the _path_. For
example, a scry of `.^(update:store %gx /=graph-store=/keys/noun)` will come
into the `on-peek` arm of `%graph-store` as `/x/keys`.
The `on-peek` arm produces a `(unit (unit cage))`. The reason for the double
`unit` is that Arvo interprets `~` to mean the scry path couldn't be resolved,
and interprets `[~ ~]` to mean it resolved to nothing. In either case the
dotket expression which initiated the scry will crash. The `cage` will contain
the actual data to return.
An ordinary `on-peek` arm, therefore, begins like so:
```hoon
++  on-peek
  |=  =path
  ^-  (unit (unit cage))
  ....
```
Typically, you'd handle the `path` similarly to `on-watch`, as we discussed in
the lesson on subscriptions. You'd use something like a wutlus expression to
test the value of the `path`, defining your scry endpoints like so:
```hoon
?+    path  (on-peek:def path)
    [%x %some %path ~]  ....
    [%x %foo ~]  ....
    [%x %blah @ ~]
  =/  =ship  (slav %p i.t.t.path)
  .....
....
```
Each endpoint would then compose the `(unit (unit cage))`. The simplest way to
format it is like:
```hoon
``noun+!>('some data')
```
If it requires a more complex expression to retrieve or compose the data, you
can do something like:
```hoon
:^  ~  ~  %some-mark
!>  ^-  some-type
:+  'foo'
  'bar'
'baz'
```
Previously we discussed custom `mark` files. Such mark files are most commonly
used when the data might be accessed through Eyre's HTTP API, and therefore
required JSON conversion methods. We cover such things separately in the
[Full-Stack Walkthrough](/docs/userspace/full-stack/1-intro), but note that if
that's the case for your agent, you may wish to also have your scry endpoints
return data with your custom `mark` so it can easily be converted to JSON when
accessed from the web.
In some cases, typically with scry `path`s that contain wildcards like the `[%x %blah @ ~]` example above, your agent may not always be able to find the
requested data. In such cases, you can just produce a cell of `[~ ~]` for the
`(unit (unit cage))`. Keep in mind, however, that this will result in a crash
for the dotket expression which initiated the scry. In some cases you may want
that, but in other cases you may not, so instead you could wrap the data inside
the `vase` in a `unit` and have _that_ be null instead. It all depends on the
needs of your particular application and its clients.
## Example
Here's a simple example agent with three scry endpoints:
#### `peeker.hoon`
```hoon
/+  default-agent, dbug
|%
+$  versioned-state
  $%  state-0
  ==
+$  state-0  [%0 data=(map @p @t)]
+$  card  card:agent:gall
--
%-  agent:dbug
=|  state-0
=*  state  -
^-  agent:gall
|_  =bowl:gall
+*  this  .
    def   ~(. (default-agent this %.n) bowl)
::
++  on-init
  ^-  (quip card _this)
  `this
::
++  on-save
  ^-  vase
  !>(state)
::
++  on-load
  |=  old-state=vase
  ^-  (quip card _this)
  =/  old  !<(versioned-state old-state)
  ?-  -.old
    %0  `this(state old)
  ==
::
++  on-poke
  |=  [=mark =vase]
  ^-  (quip card _this)
  ?>  =(src.bowl our.bowl)
  ?+    mark  (on-poke:def mark vase)
      %noun
    `this(data (~(put by data) !<([@p @t] vase)))
  ==
::
++  on-watch  on-watch:def
++  on-leave  on-leave:def
::
++  on-peek
  |=  =path
  ^-  (unit (unit cage))
  ?+    path  (on-peek:def path)
    [%x %all ~]  ``noun+!>(data)
  ::
      [%x %has @ ~]
    =/  who=@p  (slav %p i.t.t.path)
    ``noun+!>(`?`(~(has by data) who))
  ::
      [%x %get @ ~]
    =/  who=@p  (slav %p i.t.t.path)
    =/  maybe-res  (~(get by data) who)
    ?~  maybe-res
      [~ ~]
    ``noun+!>(`@t`u.maybe-res)
  ==
::
++  on-agent  on-agent:def
++  on-arvo   on-arvo:def
++  on-fail   on-fail:def
--
```
The agent's `on-poke` arm takes a cell of `[@p @t]` and saves it in the agent's
state, which contains a `(map @p @t)` called `data`. The `on-peek` arm is:
```hoon
++  on-peek
  |=  =path
  ^-  (unit (unit cage))
  ?+    path  (on-peek:def path)
    [%x %all ~]  ``noun+!>(data)
  ::
      [%x %has @ ~]
    =/  who=@p  (slav %p i.t.t.path)
    ``noun+!>(`?`(~(has by data) who))
  ::
      [%x %get @ ~]
    =/  who=@p  (slav %p i.t.t.path)
    =/  maybe-res  (~(get by data) who)
    ?~  maybe-res
      [~ ~]
    ``noun+!>(`@t`u.maybe-res)
  ==
```
It defines three scry endpoints, all using a `%x` `care`: `/x/all`,
`/x/has/[ship]`, and `/x/get/[ship]`. The first will simply return the entire
`(map @p @t)` in the agent's state. The second will check whether the given ship
is in the map and produce a `?`. The third will produce the `@t` for the given
`@p` if it exists in the map, or else return `[~ ~]` to indicate the data
doesn't exist, producing a crash in the dotket expression.
Let's try it out. Save the agent above as `/app/peeker.hoon` in the `%base`
desk, `|commit %base` and start the agent with `|rein %base [& %peeker]`.
First, let's add some data to the map:
```
> :peeker [~zod 'foo']
>=
> :peeker [~nut 'bar']
>=
> :peeker [~wet 'baz']
>=
```
Now if we use `dbug` to inspect the state, we'll see the data has been added:
```
> {[p=~wet q='baz'] [p=~nut q='bar'] [p=~zod q='foo']}
> :peeker +dbug [%state %data]
>=
```
Next, let's try the `/x/all` scry endpoint:
```
> .^((map @p @t) %gx /=peeker=/all/noun)
{[p=~wet q='baz'] [p=~nut q='bar'] [p=~zod q='foo']}
```
The `/x/has/[ship]` endpoint:
```
> .^(? %gx /=peeker=/has/~zod/noun)
%.y
> .^(? %gx /=peeker=/has/~wet/noun)
%.y
> .^(? %gx /=peeker=/has/~nes/noun)
%.n
```
And finally, the `/x/get/[ship]` endpoint:
```
> .^(@t %gx /=peeker=/get/~zod/noun)
'foo'
> .^(@t %gx /=peeker=/get/~wet/noun)
'baz'
```
We'll now try scrying for a ship that doesn't exist in the map. Note that due to
a bug at the time of writing, the resulting crash will wipe the dojo's subject,
so don't try this if you've got anything pinned there that you want to keep.
```
~zod:dojo> .^(@t %gx /=peeker=/get/~nes/noun)
crash!
```
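If you'd rather a missing key didn't crash the client, the alternative mentioned
earlier (putting a `unit` inside the `vase`) might look something like the
sketch below. The `/x/maybe/[ship]` endpoint name is hypothetical and is not
part of the agent above. The arm is shown with only this one case for brevity;
in practice you'd add the case alongside the existing ones.
```hoon
++  on-peek
  |=  =path
  ^-  (unit (unit cage))
  ?+    path  (on-peek:def path)
  ::  hypothetical endpoint: it always resolves, and the vase holds a
  ::  (unit @t) which is null when the ship isn't in the map
  ::
      [%x %maybe @ ~]
    =/  who=@p  (slav %p i.t.t.path)
    ``noun+!>(`(unit @t)`(~(get by data) who))
  ==
```
A client would then scry with a return type of `(unit @t)`, for example
`.^((unit @t) %gx /=peeker=/maybe/~nes/noun)`, and handle the null case itself.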
## Summary
- Scries are read-only requests to vanes or agents which can be done inside any
code, during its evaluation.
- Scries are performed with the dotket (`.^`) rune.
- Scries will fail if the scry endpoint does not exist, the requested data does
not exist, or the data does not nest in the return type specified.
- Scries can only be performed on the local ship, not on remote ships.
- Gall scries with an agent name in the `desk` field will be passed to that
agent's `on-peek` arm for handling.
- Gall scries with a `%x` `care` take a `mark` at the end of the scry `path`,
telling Gall to convert the data returned by the scry endpoint to the mark
specified.
- The `on-peek` arm takes a `path` with the `care` in the head and the `path` part
of the scry in the tail, like `/x/some/path`.
- The `on-peek` arm produces a `(unit (unit cage))`. The outer `unit` is null if
the scry endpoint does not exist, and the inner `unit` is null if the data
does not exist.
## Exercises
- Have a read through the [Scry Guide](/docs/arvo/concepts/scry).
- Have a read through the [dotket rune
documentation](/docs/hoon/reference/rune/dot#-dotket).
- Run through the [Example](#example) yourself if you've not done so already.
- Try adding another scry endpoint to the `peeker.hoon` agent, which uses a
[`wyt:by`](/docs/hoon/reference/stdlib/2i#wytby) map function to produce the
number of items in the `data` map.
- Have a look through the `on-peek` arms of some other agents on your ship, and
try performing some scries to some of the endpoints.

View File

@ -0,0 +1,146 @@
+++
title = "11. Failure"
weight = 55
template = "doc.html"
+++
In this lesson we'll cover the last agent arm we haven't touched on yet:
`on-fail`. We'll also touch on one last concept, which is the _helper core_.
# Failures
When crashes or errors occur in certain cases, Gall passes them to an agent's
`on-fail` arm for handling. This arm is very seldom used; almost all agents
leave it for `default-agent` to handle, which just prints the error message to
the terminal. While you're unlikely to use this arm, we'll briefly go over its
behavior for completeness.
`on-fail` takes a `term` error message and a `tang`, typically containing a
stack trace, and often with additional messages about the error. If it weren't
delegated to `on-fail:def`, it would begin with:
```hoon
++  on-fail
  |=  [=term =tang]
  ^-  (quip card _this)
  ....
```
Gall calls `on-fail` in four cases:
- When there's a crash in the `on-arvo` arm.
- When there's a crash in the `on-agent` arm.
- When there's a crash in the `on-leave` arm.
- When an agent produces a `%watch` card but the `wire`, ship, agent and `path`
specified are the same as an existing subscription.
For an `on-arvo` failure, the `term` will always be `%arvo-response`, and the
`tang` will contain a stack trace.
For `on-agent`, the `term` will be the head of the `sign` (`%poke-ack`, `%fact`,
etc). The `tang` will contain a stack trace and a message of "closing
subscription".
For an `on-leave` failure, the `term` will always be `%leave`, and the `tang`
will contain a stack trace.
For a `%watch` failure, the `term` will be `%watch-not-unique`. The `tang` will
include a message of "subscribe wire not unique", as well as the agent name, the
`wire`, the target ship and the target agent.
How you might handle these cases (if you wanted to manually handle them) depends
on the purpose of your particular agent.
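As a rough illustration only, here is a sketch of an `on-fail` arm that treats
one of the cases above specially and leaves the rest to `default-agent`. It
assumes the usual `this`, `def` and `card` definitions from earlier lessons, and
the decision to ignore duplicate subscriptions is an assumption made for the
example, not a recommendation.
```hoon
++  on-fail
  |=  [=term =tang]
  ^-  (quip card _this)
  ::  anything other than a duplicate subscription goes to the default handler
  ::
  ?.  ?=(%watch-not-unique term)
    (on-fail:def term tang)
  ::  assumption: for this agent a duplicate %watch is harmless, so we
  ::  print a note along with the tang and leave the state untouched
  ::
  %-  (slog leaf+"{<dap.bowl>}: duplicate subscription ignored" tang)
  `this
```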
## Helper core
Back in the lesson on lustar virtual arms, we briefly mentioned a common pattern
is to define a deferred expression for a helper core named `hc` like:
```hoon
+*  this  .
    def   ~(. (default-agent this %.n) bowl)
    hc    ~(. +> bowl)
```
The name `do` is also used frequently besides `hc`.
A helper core is a separate core composed into the subject of the agent core,
containing useful functions for use by the agent arms. Such a helper core would
typically contain functions that would only ever be used internally by the
agent - more general functions would usually be included in a separate `/lib`
library and imported with a [faslus](/docs/arvo/ford/ford#ford-runes) (`/+`)
rune. Additionally, you might recall that the example agent of the
[subscriptions lesson](/docs/userspace/gall-guide/8-subscriptions#example) used
a barket (`|^`) rune to create a core in the `on-poke` arm with a separate
`handle-poke` arm. That approach is typically used when functions will only be
used in that one arm. The helper core, on the other hand, is useful when
functions will be used by multiple agent arms.
The conventional pattern is to have the helper core _below_ the agent core, so
the structure of the agent file is like:
```
[imports]
[state types core]
[agent core]
[helper core]
```
Recall that the build system will implicitly compose any discrete expressions.
If we simply added the helper core below the agent core, the agent core would be
composed into the subject of the helper core, which is the opposite of what we
want. Instead, we must inversely compose the two cores with a
[tisgal](/docs/hoon/reference/rune/tis#-tisgal) (`=<`) rune. We add the tisgal
rune directly above the agent core like:
```hoon
.....
=<
|_  =bowl:gall
+*  this  .
    def   ~(. (default-agent this %.n) bowl)
    hc    ~(. +> bowl)
++  on-init
.....
```
We can then add the helper core below the agent core. The helper core is most
typically a door like the agent core, also with the `bowl` as its sample. This
is just so any functions you define in it have ready access to the `bowl`. It
would look like:
```hoon
|_  =bowl:gall
++  some-function  ...
++  another  ....
++  etc  ...
--
```
Back in the lustar virtual arm of the agent core, we give it a deferred expression name of `hc`
and call it like so:
```hoon
hc    ~(. +> bowl)
```
To get to the helper core we composed from within the door, we use a
[censig](/docs/hoon/reference/rune/cen#-censig) expression. `+>` of the agent
door is its context, which is the helper core, and `~(. +> bowl)` produces that
core itself (`.`) with the `bowl` as its sample. After that, any agent arms can
make use of helper core functions by calling them like `(some-function:hc ....)`.
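Putting the pieces together, a complete agent file with a helper core might be
laid out as in the following sketch. The `greeting` arm is a hypothetical helper
invented for illustration; everything else follows the conventions already
described in this guide.
```hoon
/+  default-agent, dbug
|%
+$  card  card:agent:gall
--
%-  agent:dbug
^-  agent:gall
=<
|_  =bowl:gall
+*  this  .
    def   ~(. (default-agent this %.n) bowl)
    hc    ~(. +> bowl)
++  on-init
  ^-  (quip card _this)
  ::  print a message built by the helper core
  ::
  ~&  >  greeting:hc
  `this
++  on-save   on-save:def
++  on-load   on-load:def
++  on-poke   on-poke:def
++  on-watch  on-watch:def
++  on-leave  on-leave:def
++  on-peek   on-peek:def
++  on-agent  on-agent:def
++  on-arvo   on-arvo:def
++  on-fail   on-fail:def
--
::  helper core: a door with the bowl as its sample
::
|_  =bowl:gall
::  +greeting: hypothetical helper describing the running agent
::
++  greeting
  ^-  tape
  "{<dap.bowl>} running on {<our.bowl>}"
--
```
Because `greeting` is a plain arm rather than a gate, it's referenced as
`greeting:hc` with no arguments; a helper gate would be called like
`(some-function:hc ....)` instead.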
## Summary
- `on-fail` is called in certain cases of crashes or failures.
- Crashes in the `on-agent`, `on-arvo`, or `on-leave` arms will trigger a call
to `on-fail`.
- A non-unique `%watch` `card` will also trigger a call to `on-fail`.
- `on-fail` is seldom used - most agents just leave it to `default-agent` to
handle, which just prints the error to the terminal.
- A helper core is an extra core of useful functions, composed into the subject
of the agent core.
- Helper cores are typically placed below the agent core, and composed with a
tisgal (`=<`) rune.
- The helper core is typically a door with the `bowl` as a sample.
- The helper core is typically given a name of `hc` or `do` in the lustar virtual arm
of the agent core.

View File

@ -0,0 +1,50 @@
+++
title = "12. Next Steps"
weight = 60
template = "doc.html"
+++
We've now covered all the arms of a Gall agent, and everything you need to know
to start writing your own agent.
The things we haven't touched on yet are front-end development and integration,
Eyre's HTTP API for communicating with agents from the web, and dealing with
JSON data. The [Full-stack Walkthrough](/docs/userspace/full-stack/1-intro)
covers these aspects of Urbit app development, and it also puts into practice
many of the concepts we've discussed in this guide, so you might like to work
through that next. In addition to that walkthrough, you can refer to the
following documents for help writing a web front-end for your app:
- [Eyre's external API reference](/docs/arvo/eyre/external-api-ref) - This
explains Eyre's HTTP API, through which a browser or other HTTP client can
interact with a Gall agent.
- [Eyre's usage guide](/docs/arvo/eyre/guide) - This walks through examples of
using Eyre's HTTP API.
- [JSON guide](/docs/hoon/guides/json-guide) - This walks through the basics of
converting Hoon data structures to JSON, for use with a web client. It also
covers JSON conversion methods in `mark` files.
- [Zuse reference](/docs/hoon/reference/zuse/table-of-contents) - This contains
documentation of all JSON encoding and decoding functions included in the
`zuse.hoon` utility library.
- [The software distribution guide](/docs/userspace/dist/dist) - This covers
everything you need to know to distribute apps to other ships. It includes
details of bundling a web front-end and serving it to the user in the browser.
- [The HTTP API guide](/docs/userspace/http-api-guide) - This is a reference
and guide to using the `@urbit/http-api` NPM module.
- [The Sail guide](/docs/hoon/guides/sail) - Sail is a domain-specific language
for composing XML structure in Hoon. It can be used to compose front-ends for
Urbit apps directly in agents, as an alternative approach to having a
separate JavaScript app.
In addition to these documents about creating a web-based user interface for
your app, there are some other guides you might like to have a look at:
- [Threads guide](/docs/userspace/threads/overview) - Threads are like transient
agents, typically used for handling complex I/O functionality for Gall
agents - like interacting with an external HTTP API.
- [The software distribution guide](/docs/userspace/dist/dist) - This explains
how to set up a desk for distribution, so other people can install your app.
For more development resources, and for ways to get involved with the Urbit
development community, see the [Urbit Developers
site](https://developers.urbit.org/).

View File

@ -0,0 +1,305 @@
+++
title = "2. The Agent Core"
weight = 10
template = "doc.html"
+++
In this lesson we'll look at the basic type and structure of a Gall agent.
A Gall agent is a [door](/docs/glossary/door) with exactly ten [arms](/docs/glossary/arm). Each arm is responsible for
handling certain kinds of events that Gall feeds in to the agent. A door is
just a [core](/docs/glossary/core) with a sample - it's made with the
[barcab](/docs/hoon/reference/rune/bar#_-barcab) rune (`|_`) instead of the
usual [barcen](/docs/hoon/reference/rune/bar#-barcen) rune (`|%`).
## The ten arms
We'll discuss each of the arms in detail later. For now, here's a quick summary.
The arms of an agent can be roughly grouped by purpose:
### State management
These arms are primarily for initializing and upgrading an agent.
- `on-init`: Handles starting an agent for the first time.
- `on-save`: Handles exporting an agent's state - typically as part of the
upgrade process but also when suspending, uninstalling and debugging.
- `on-load`: Handles loading a previously exported agent state - typically as
part of the upgrade process but also when resuming or reinstalling an agent.
### Request handlers
These arms handle requests initiated by outside entities, e.g. other agents,
HTTP requests from the front-end, etc.
- `on-poke`: Handles one-off requests, actions, etc.
- `on-watch`: Handles subscription requests from other entities.
- `on-leave`: Handles unsubscribe notifications from other, previously subscribed
entities.
### Response handlers
These two arms handle responses to requests our agent previously initiated.
- `on-agent`: Handles request acknowledgements and subscription updates from
other agents.
- `on-arvo`: Handles responses from vanes.
### Scry handler
- `on-peek`: Handles local read-only requests.
### Failure handler
- `on-fail`: Handles certain kinds of crash reports from Gall.
## Bowl
The sample of a Gall agent door is always a `bowl:gall`. Every time an event triggers the
agent, Gall populates the bowl with things like the current date-time, fresh entropy,
subscription information, which ship the request came from, etc, so that all the
arms of the agent have access to that data. For the exact structure and contents
of the bowl, have a read through [its entry in the Gall vane types
documentation](/docs/arvo/gall/data-types#bowl).
One important thing to note is that the bowl is only repopulated when there's a
new Arvo event. If a local agent or web client were to send multiple
messages to your agent at the same time, these would all arrive in the same
event. This means if your agent depended on a unique date-time or entropy to
process each message, you could run into problems if your agent doesn't account
for this possibility.
## State
If you've worked through [Hoon School](/docs/hoon/hoon-school/intro), you may
recall that a core is a cell of `[battery payload]`. The battery is the core's
arms compiled to Nock, and the payload is the subject which they operate on.
For an agent, the payload will at least contain the bowl, the usual Hoon and `zuse` standard
library functions, and the **state** of the agent. For example, if your agent
were for an address book app, it might keep a `map` of ships to address book
entries. It might add entries, delete entries, and modify entries. This address
book `map` would be part of the state stored in the payload.
## Transition function
If you recall from the prologue, the whole Arvo operating system works on the
basis of a simple transition function `(event, oldState) -> (effects, newState)`. Gall agents also function the same way. Eight of an agent's ten arms
produce the same thing, a cell of:
- **Head**: A list of effects called `card`s (which we'll discuss later).
- **Tail**: A new agent core, possibly with a modified payload.
It goes something like this:
1. An event is routed to Gall.
2. Gall calls the appropriate arm of the agent, depending on the kind of event.
3. That arm processes the event, returning a list of `card`s to be sent off, and
the agent core itself with a modified state in the payload.
4. Gall sends the `card`s off and saves the modified agent core.
5. Rinse and repeat.
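The shape of such a transition function can be sketched in plain Hoon, outside
of Gall entirely. This is a toy model and not how Gall itself is implemented:
the "state" is just a running total, each "event" is a number to add, and the
"effects" are simple messages rather than real `card`s.
```hoon
::  toy model only: (event, old state) -> (effects, new state)
::
|=  [event=@ud old-state=@ud]
^-  [effects=(list @t) new-state=@ud]
::  head: a list of effects; tail: the new state
::
:-  ~['added']
(add old-state event)
```
Calling this gate with an event of `3` and an old state of `4` produces a single
`'added'` effect and a new state of `7`. An agent arm does the same kind of
thing, except the effects are `card`s and the new state lives in the agent core
it returns.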
## Virtualization
When a crash occurs in the kernel, the system usually aborts the computation and
discards the event as though it never happened. Gall on the other hand
virtualizes all its agents, so this doesn't happen. Instead, when a crash occurs
in an agent, Gall intercepts the crash and takes appropriate action depending on
the kind of event that caused it. For example, if a poke from another ship
caused a crash in the `on-poke` arm, Gall will respond to the poke with a
"nack", a negative acknowledgement, telling the original ship the poke was
rejected.
What this means is that you can intentionally design your agent to crash in
cases it can't handle. For example, if a poke comes in with an unexpected
`mark`, it crashes. If a permission check fails, it crashes. This is quite
different to most programs written in procedural languages, which must handle
all exceptions to avoid crashing.
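As a minimal sketch of this style, here is what an `on-poke` arm that crashes on
anything it doesn't expect might look like. It assumes it sits in an agent door
with `bowl` as its sample, like the skeleton agent in the example that follows,
and the choice of `%noun` as the only accepted `mark` is arbitrary.
```hoon
++  on-poke
  |=  [=mark =vase]
  ::  permission check: crash unless the poke is from our own ship
  ::
  ?>  =(our.bowl src.bowl)
  ::  crash on any mark other than %noun
  ::
  ?+  mark  !!
    %noun  `..on-poke
  ==
```
Neither failure case needs any error-handling code in the agent itself; it
simply crashes and leaves Gall to deal with the result.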
## Example
Here's about the simplest valid Gall agent:
```hoon
|_  =bowl:gall
++  on-init   `..on-init
++  on-save   !>(~)
++  on-load   |=(vase `..on-init)
++  on-poke   |=(cage !!)
++  on-watch  |=(path !!)
++  on-leave  |=(path `..on-init)
++  on-peek   |=(path ~)
++  on-agent  |=([wire sign:agent:gall] !!)
++  on-arvo   |=([wire sign-arvo] !!)
++  on-fail   |=([term tang] `..on-init)
--
```
This is just a dummy agent that does absolutely nothing - it has no state and
rejects all messages by crashing. Typically we'd cast this to an `agent:gall`,
but in this instance we won't so it's easier to examine its structure in the
dojo. We'll get to what each of the arms do later. For now, we'll just consider
a few particular points.
Firstly, note its structure - it's a door (created with `|_`) with a sample of
`bowl:gall` and the ten arms described earlier.
Secondly, you'll notice some of the arms return:
```hoon
`..on-init
```
A backtick at the beginning is an irregular syntax meaning "prepend with null",
so for example, in the dojo:
```
> `50
[~ 50]
```
The next part has `..on-init`, which means "the subject of the `on-init` arm".
The subject of the `on-init` arm is our whole agent. In the [transition
function](#transition-function) section we mentioned that most arms return a
list of effects called `card`s and a new agent core. Since an empty list is `~`,
we've created a cell that fits that description.
Let's examine our agent. In the dojo of a fake ship, mount the `%base` desk with
`|mount %base`. On the Unix side, navigate to `/path/to/fake/ship/base`, and save
the above agent in the `/app` directory as `skeleton.hoon`. Back in the dojo,
commit the file to the desk with `|commit %base`.
For the moment we won't install our `%skeleton` agent. Instead, we'll use the
`%build-file` thread to build it and save it in the dojo's subject so we can
have a look. Run the following in the dojo:
```
> =skeleton -build-file %/app/skeleton/hoon
```
Now, let's have a look:
```
> skeleton
< 10.fxw
  [ bowl
    [ [our=@p src=@p dap=@tas]
      [ wex=nlr([p=[wire=/ ship=@p term=@tas] q=[acked=?(%.y %.n) path=/]])
        sup=nlr([p=it(/) q=[p=@p q=/]])
      ]
      act=@ud
      eny=@uvJ
      now=@da
      byk=[p=@p q=@tas r=?([%da p=@da] [%tas p=@tas] [%ud p=@ud])]
    ]
    <17.zbp 33.wxp 14.dyd 53.vlb 77.wir 232.wfe 51.qbt 123.zao 46.hgz 1.pnw %140>
  ]
>
```
The dojo pretty-prints cores with a format of `number-of-arms.hash`. You can see
the head of `skeleton` is `10.fxw` - that's the battery of the core, our 10-arm
agent. If we try printing the head of `skeleton` we'll see it's a whole lot of
compiled Nock:
```
> -.skeleton
[ [ 11
    [ 1.953.460.339
      1
      [ 7.368.801
        7.957.707.045.546.060.659
        1.852.796.776
        0
      ]
      [7 15]
      7
      34
    ]
...(truncated for brevity)...
```
The battery's not too important, it's not something we'd ever touch in practice.
Instead, let's have a look at the core's payload by printing the tail of
`skeleton`. We'll see its head is the `bowl:gall` sample we specified, and then
the tail is just all the usual standard library functions:
```
> +.skeleton
[ bowl
  [ [our=~zod src=~zod dap=%$]
    [wex={} sup={}]
    act=0
    eny=0v0
    now=~2000.1.1
    byk=[p=~zod q=%$ r=[%ud p=0]]
  ]
  <17.zbp 33.wxp 14.dyd 53.vlb 77.wir 232.wfe 51.qbt 123.zao 46.hgz 1.pnw %140>
]
```
Currently `skeleton` has no state, but of course in practice you'd want to store
some actual data. We'll add `foo=42` as our state with the `=+` rune at the
beginning of our agent:
```hoon
=+  foo=42
|_  =bowl:gall
++  on-init   `..on-init
++  on-save   !>(~)
++  on-load   |=(vase `..on-init)
++  on-poke   |=(cage !!)
++  on-watch  |=(path !!)
++  on-leave  |=(path `..on-init)
++  on-peek   |=(path ~)
++  on-agent  |=([wire sign:agent:gall] !!)
++  on-arvo   |=([wire sign-arvo] !!)
++  on-fail   |=([term tang] `..on-init)
--
```
Save the modified `skeleton.hoon` in `/app` on the `%base` desk like before, and run `|commit %base` again in the dojo. Then, rebuild it with the same `%build-file` command as before:
```
> =skeleton -build-file %/app/skeleton/hoon
```
If we again examine our agent core's payload by looking at the tail of
`skeleton`, we'll see `foo=42` is now included:
```
> +.skeleton
[ bowl
  [ [our=~zod src=~zod dap=%$]
    [wex={} sup={}]
    act=0
    eny=0v0
    now=~2000.1.1
    byk=[p=~zod q=%$ r=[%ud p=0]]
  ]
  foo=42
  <17.zbp 33.wxp 14.dyd 53.vlb 77.wir 232.wfe 51.qbt 123.zao 46.hgz 1.pnw %140>
]
```
## Summary
- A Gall agent is a door with exactly ten specific arms and a sample of `bowl:gall`.
- Each of the ten arms handles different kinds of events - Gall calls the
appropriate arm for the kind of event it receives.
- The ten arms fit roughly into five categories:
  - State management.
  - Request handlers.
  - Response handlers.
  - Scry handler.
  - Failure handler.
- The state of an agent—the data it's storing—lives in the core's payload.
- Most arms produce a list of effects called `card`s, and a new agent core with
a modified state in its payload.
## Exercises
- Run through the [Example](#example) yourself on a fake ship if you've not done
so already.
- Have a look at the [`bowl` entry in the Gall data types
documentation](/docs/arvo/gall/data-types#bowl) if you've not done so already.

View File

@ -0,0 +1,285 @@
+++
title = "3. Imports and Aliases"
weight = 15
template = "doc.html"
+++
In the last lesson we looked at the most basic aspects of a Gall agent's
structure. Before we get into the different agent arms in detail, there's some
boilerplate to cover that makes life easier when writing Gall agents.
## Useful libraries
There are a couple of libraries that you'll very likely use in every agent you
write. These are [`default-agent`](#default-agent) and [`dbug`](#dbug). In
brief, `default-agent` provides simple default behaviours for each agent arm,
and `dbug` lets you inspect the state and bowl of an agent from the dojo, for
debugging purposes. Every example agent we look at from here on out will make
use of both libraries.
Let's look at each in more detail:
### `default-agent`
The `default-agent` library contains a basic agent with sane default behaviours
for each arm. In some cases it just crashes and prints an error message to the
terminal, and in others it succeeds but does nothing. It has two primary uses:
- For any agent arms you don't need, you can just have them call the matching
function in `default-agent`, rather than having to manually handle events on
those arms.
- A common pattern in an agent is to switch on the input of an arm with
[wutlus](/docs/hoon/reference/rune/wut#-wutlus) (`?+`) runes or maybe
[wutcol](/docs/hoon/reference/rune/wut#wutcol) (`?:`) runes. For any
unexpected input, you can just pass it to the relevant arm of `default-agent`
rather than handling it manually.
The `default-agent` library lives in `/lib/default-agent/hoon` of the `%base`
desk, and you would typically include a copy in any new desk you created. It's
imported at the beginning of an agent with the
[faslus](/docs/arvo/ford/ford#ford-runes) (`/+`) rune.
The library is a wet gate which takes two arguments: `agent` and `help`. The
first is your agent core itself, and the second is a `?`. If `help` is `%.y` (equivalently, `%&`), it
will crash in all cases. If `help` is `%.n` (equivalently, `%|`), it will use its defaults. You would
almost always have `help` as `%.n`.
The wet gate returns an `agent:gall` door with a sample of `bowl:gall` - a
typical agent core. Usually you would define an alias for it in a virtual arm
([explained below](#virtual-arms)) so it's simple to call.
### `dbug`
The `dbug` library lets you inspect the state and `bowl` of your agent from the
dojo. It includes an `agent:dbug` function which wraps your whole `agent:gall`
door, adding its extra debugging functionality while transparently passing
events to your agent for handling like usual.
To use it, you just import `dbug` with a
[faslus](/docs/arvo/ford/ford#ford-runes) (`/+`) rune at the beginning, then add
the following line directly before the door of your agent:
```hoon
%-  agent:dbug
```
With that done, you can poke your agent with the `+dbug` generator from the dojo
and it will pretty-print its state, like:
```
> :your-agent +dbug
```
The generator also has a few useful optional arguments:
- `%bowl`: Print the agent's bowl.
- `[%state 'some code']`: Evaluate some code with the agent's state as its
subject and print the result. The most common case is `[%state %some-face]`,
which will print the contents of the wing with the given face.
- `[%incoming ...]`: Print details of the matching incoming subscription, one
  of:
  - `[%incoming %ship ~some-ship]`
  - `[%incoming %path /part/of/path]`
  - `[%incoming %wire /part/of/wire]`
- `[%outgoing ...]`: Print details of the matching outgoing subscription, one
  of:
  - `[%outgoing %ship ~some-ship]`
  - `[%outgoing %path /part/of/path]`
  - `[%outgoing %wire /part/of/wire]`
  - `[%outgoing %term %agent-name]`
By default it will retrieve your agent's state by using its `on-save` arm, but
if your app implements a scry endpoint with a path of `/x/dbug/state`, it will
use that instead.
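As a rough sketch, such an endpoint might be defined in the `on-peek` arm as
follows. It assumes a `state` alias for the agent's state, along with the `def`
alias for `default-agent`, both of which are only introduced later in this
guide. Here it just hands back the state unchanged, though an agent could
equally return a trimmed-down or reformatted version.
```hoon
++  on-peek
  |=  =path
  ^-  (unit (unit cage))
  ?+    path  (on-peek:def path)
  ::  endpoint that +dbug checks for; the vase in the cage is what
  ::  gets printed in place of the result of +on-save
  ::
    [%x %dbug %state ~]  ``noun+!>(state)
  ==
```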
We haven't yet covered some of the concepts described here, so don't worry if
you don't fully understand `dbug`'s functionality - you can refer back here
later.
## Virtual arms
An agent core must have exactly ten arms. However, there's a special kind of
"virtual arm" that can be added without actually increasing the core's arm
count, since it really just adds code to the other arms in the core. A virtual arm is created with the
[lustar](/docs/hoon/reference/rune/lus#-lustar) (`+*`) rune, and its purpose is
to define _deferred expressions_. It takes a list of pairs of names and Hoon
expressions. When compiled, the deferred expressions defined in the virtual arm are
implicitly inserted at the beginning of every other arm of the core, so they all
have access to them. Each time a name in a `+*` is called, the associated Hoon is evaluated in its place, similar to lazy evaluation except it is re-evaluated whenever needed. See the [tistar](/docs/hoon/reference/rune/tis#-tistar) reference for more information on deferred expressions.
A virtual arm in an agent often looks something like this:
```hoon
+*  this  .
    def   ~(. (default-agent this %.n) bowl)
```
`this` and `def` are the deferred expressions, and next to each one is the Hoon
expression it evaluates whenever called. Notice that unlike most things that
take _n_ arguments, a virtual arm is not terminated with a `==`. You can define
as many aliases as you like. The two in this example are conventional ones you'd
use in most agents you write. Their purposes are:
```hoon
this  .
```
Rather than having to return `..on-init` like we did in the last lesson,
instead our arms can just refer to `this` whenever modifying or returning the
agent core.
```hoon
def ~(. (default-agent this %.n) bowl)
```
This sets up the `default-agent` library we [described above](#default-agent),
so you can easily call its arms like `on-poke:def`, `on-agent:def`, etc.
## Additional cores
While Gall expects a single 10-arm agent core, it's possible to include
additional cores by composing them into the subject of the agent core itself.
The contents of these cores will then be available to arms of the agent core.
Usually to compose cores in this way, you'd have to do something like insert
[tisgar](/docs/hoon/reference/rune/tis#-tisgar) (`=>`) runes in between them.
However, Clay's build system implicitly composes everything in a file by
wrapping it in a [tissig](/docs/hoon/reference/rune/tis#-tissig) (`=~`)
expression, which means you can just butt separate cores up against one another
and they'll all still get composed.
You can add as many extra cores as you'd like before the agent core, but
typically you'd just add one containing type definitions for the agent's state,
as well as any other useful structures. We'll look at the state in more detail
in the next lesson.
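As a rough sketch of what that implicit composition amounts to, here are two small cores composed explicitly with `=>`. The arm names `two` and `four` are made up for illustration, and this is a standalone expression rather than part of an agent:

```hoon
::  evaluate the arm `four` in the second core
=<  four
::  compose the first core into the subject of the second
=>  |%
    ++  two  2
    --
|%
::  `two` resolves because the first core is in this core's subject
++  four  (add two two)
--
```

The expression evaluates to `4`. In an agent file you'd leave out the `=>` and simply write one core after the other, relying on the build system to compose them.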
## Example
Here's the `skeleton.hoon` dummy agent from the previous lesson, modified with
the concepts discussed here:
```hoon
/+  default-agent, dbug
|%
+$  card  card:agent:gall
--
%-  agent:dbug
^-  agent:gall
|_  =bowl:gall
+*  this  .
    def   ~(. (default-agent this %.n) bowl)
++  on-init
  ^-  (quip card _this)
  `this
++  on-save   on-save:def
++  on-load   on-load:def
++  on-poke   on-poke:def
++  on-watch  on-watch:def
++  on-leave  on-leave:def
++  on-peek   on-peek:def
++  on-agent  on-agent:def
++  on-arvo   on-arvo:def
++  on-fail   on-fail:def
--
```
The first line uses the faslus (`/+`) Ford rune to import
`/lib/default-agent.hoon` and `/lib/dbug.hoon`, building them and loading them
into the subject of our agent so they're available for use. You can read more
about Ford runes in the [Ford section of the vane
documentation](/docs/arvo/ford/ford#ford-runes).
Next, we've added an extra core. Notice how it's not explicitly composed, since
the build system will do that for us. In this case we've just added a single
`card` arm, which makes it simpler to reference the `card:agent:gall` type.
After that core, we call `agent:dbug` with our whole agent core as its argument.
This allows us to use the `dbug` features described earlier.
Inside our agent door, we've added an extra virtual arm and defined a couple
deferred expressions:
```hoon
+*  this  .
    def   ~(. (default-agent this %.n) bowl)
```
In most of the arms, you see we've been able to replace the dummy code with
simple calls to the corresponding arms of `default-agent`, which we set up as a deferred
expression named `def` in the virtual arm. We've also replaced the old `..on-init`
with our deferred expression named `this` in the `on-init` arm as an example - it makes things a bit
simpler.
You can save the code above in `/app/skeleton.hoon` of your `%base` desk like
before and `|commit %base` in the dojo. Additionally, you can start the agent so
we can try out `dbug`. To start it, run the following in the dojo:
```
> |rein %base [& %skeleton]
```
For details of using the `|rein` generator, see the [Dojo
Tools](/docs/userspace/dist/tools#rein) section of the software distribution
documentation.
Now our agent should be running, so let's try out `dbug`. In the dojo, let's try
poking our agent with the `+dbug` generator:
```
> ~
> :skeleton +dbug
>=
```
It just printed out `~`. Our dummy `skeleton` agent doesn't have any state
defined, so it's printing out null as a result. Let's try printing the `bowl`
instead:
```
> [ [our=~zod src=~zod dap=%skeleton]
[wex={} sup={}]
act=5
eny
0v209.tg795.bc2e8.uja0d.11eq9.qp3b3.mlttd.gmf09.q7ro3.6unfh.16jiu.m9lh9.6jlt8.4f847.f0qfh.up08t.3h4l2.qm39h.r3qdd.k1r11.bja8l
now=~2021.11.5..13.28.24..e20e
byk=[p=~zod q=%base r=[%da p=~2021.11.5..12.02.22..f99b]]
]
> :skeleton +dbug %bowl
>=
```
We'll use `dbug` more throughout the guide, but hopefully you should now have an
idea of its basic usage.
## Summary
The key takeaways are:
- Libraries are imported with `/+`.
- `default-agent` is a library that provides default behaviors for Gall agent
arms.
- `dbug` is a library that lets you inspect the state and `bowl` of an agent
from the dojo, with the `+dbug` generator.
- Deferred expressions, which give convenient names to Hoon expressions, can be defined in a virtual arm with
the [lustar](/docs/hoon/reference/rune/lus#-lustar) (`+*`) rune.
- `this` is a conventional deferred expression name for the agent core itself.
- `def` is a conventional deferred expression name for accessing arms in the `default-agent`
library.
- Extra cores can be composed into the subject of the agent core. The
composition is done implicitly by the build system. Typically we'd include one
extra core that defines types for our agent's state and maybe other useful
types as well.
## Exercises
- Run through the [example](#example) yourself on a fake ship if you've not done
so already.
- Have a read through the [Ford rune
documentation](/docs/arvo/ford/ford#ford-runes) for details about importing
libraries, structures and other things.
- Try the `+dbug` generator out on some other agents, like `:settings-store +dbug`, `:btc-wallet +dbug`, etc, and try some of its options [described
above](#dbug).
- Have a quick look over the source of the `default-agent` library, located at
`/lib/default-agent.hoon` in the `%base` desk. We've not yet covered what the
different arms do but it's still useful to get a general idea, and you'll
likely want to refer back to it later.

View File

@ -0,0 +1,499 @@
+++
title = "4. Lifecycle"
weight = 20
template = "doc.html"
+++
In the last lesson we looked at a couple of useful things used as boilerplate in
most agents. Now we're going to get into the guts of how agents work, and start
looking at what the agent arms do. The first thing we'll look at is the agent's
state, and the three arms for managing it: `on-init`, `on-save`, and `on-load`.
These arms handle what we call an agent's "lifecycle".
## Lifecycle
An agent's lifecycle starts when it's first installed. At this point, the
agent's `on-init` arm is called. This is the _only_ time `on-init` is ever
called - its purpose is just to initialize the agent. The `on-init` arm might be
very simple and just set an initial value for the state, or even do nothing at
all and return the agent core exactly as-is. It may also be more complicated,
and perform some [scries](/docs/arvo/concepts/scry) to obtain extra data or
check that another agent is also installed. It might send off some `card`s to
other agents or vanes to do things like load data into the `%settings-store`
agent, bind an Eyre endpoint, or anything else. It all depends on the needs of
your particular application. If `on-init` fails for whatever reason, the agent
installation will fail and be aborted.
Once initialized, an agent will just go on doing its thing - processing events,
updating its state, producing effects, etc. At some point, you'll likely want to
push an update for your agent. Maybe it's a bug fix, maybe you want to add extra
features. Whatever the reason, you need to change the source code of your agent,
so you commit a modified version of the file to Clay. When the commit completes, Gall updates the app as follows:
- The agent's `on-save` arm is called, which packs the agent's state in a `vase`
and exports it.
- The new version of the agent is built and loaded into Gall.
- The previously exported `vase` is passed to the `on-load` arm of the newly
built agent. The `on-load` arm will process it, convert it to the new version
of the state if necessary, and load it back into the state of the agent.
A `vase` is just a cell of `[type-of-the-noun the-noun]`. Most data an agent
sends or receives will be encapsulated in a vase. A vase is made with the
[zapgar](/docs/hoon/reference/rune/zap#-zapgar) (`!>`) rune like
`!>(some-data)`, and unpacked with the
[zapgal](/docs/hoon/reference/rune/zap#-zapgal) (`!<`) rune like
`!<(type-to-extract vase)`. See the [`vase` section of the type
reference](/docs/userspace/gall-guide/types#vase) for details.
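For a quick feel for how these two runes fit together, you can try them in the dojo. This is only an illustrative sketch with an arbitrary value:

```
> !>(42)
[#t/@ud q=42]
> !<(@ud !>(42))
42
```

The first result is the vase itself (the type is printed in an abbreviated form), and the second extracts the atom back out of it.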
We'll look at the three arms described here in a little more detail, but first
we need to touch on the state itself.
## Versioned state type
In the previous lesson we introduced the idea of composing additional cores into
the subject of the agent core. Here we'll look at using such a core to define
the type of the agent's state. In principle, we could make it as simple as this:
```hoon
|%
+$  my-state-type  @ud
--
```
However, when you update your agent as described in the [Lifecycle](#lifecycle)
section, you may want to change the type of the state itself. This means
`on-load` might find different versions of the state in the `vase` it receives,
and it might not be able to distinguish between them.
For example, if you were creating an agent for a To-Do task management app, your
tasks might initially have a `?(%todo %done)` union to specify whether they're
complete or not. Something like:
```hoon
(map title=@t status=?(%todo %done))
```
At some point, you might want to add a third status to represent "in progress",
which might involve changing `status` like:
```hoon
(map title=@t status=?(%todo %done %work))
```
The conventional way to keep this manageable and reliably differentiate possible
state types is to have _versioned states_. The first version of the state would
typically be called `state-0`, and its head would be tagged with `%0`. Then,
when you change the state's type in an update, you'd add a new structure called
`state-1` and tag its head with `%1`. The next would then be `state-2`, and so
on.
In addition to each of those individual state versions, you'd also define a
structure called `versioned-state`, which just contains a union of all the
possible states. This way, the vase `on-load` receives can be unpacked to a
`versioned-state` type, and then a
[wuthep](/docs/hoon/reference/rune/wut#--wuthep) (`?-`) expression can switch on
the head (`%0`, `%1`, `%2`, etc) and process each one appropriately.
For example, your state definition core might initially look like:
```hoon
|%
+$  versioned-state
  $%  state-0
  ==
+$  state-0  [%0 tasks=(map title=@t status=?(%todo %done))]
--
```
When you later update your agent with a new state version, you'd change it to:
```hoon
|%
+$  versioned-state
  $%  state-0
      state-1
  ==
+$  state-0  [%0 tasks=(map title=@t status=?(%todo %done))]
+$  state-1  [%1 tasks=(map title=@t status=?(%todo %done %work))]
--
```
Another reason for versioning the state type is that there may be cases where
the state type doesn't change, but you still want to apply special transition
logic for an old state during upgrade. For example, you may need to reprocess
the data for a new feature or to fix a bug.
## Adding the state
Along with a core defining the type of the state, we also need to actually add
it to the subject of the core. The conventional way to do this is by adding the following immediately before the agent core itself:
```hoon
=|  state-0
=*  state  -
```
The first line bunts (produces the default value of) the state type we defined
in the previous core, and adds it to the head of the subject _without a face_.
The next line uses [tistar](/docs/hoon/reference/rune/tis#-tistar) to give it
the name of `state`. You might wonder why we don't just give it a face when we
bunt it and skip the tistar part. If we did that, we'd have to refer to `tasks`
as `tasks.state`. With tistar, we can just reference `tasks` while also being
able to reference the whole `state` when necessary.
Note that adding the state like this only happens when the agent is built - from
then on the arms of our agent will just modify it.
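Here's a minimal standalone sketch of the same pattern, using a made-up state type of `[%0 val=@ud]` written inline rather than a named `state-0`:

```hoon
::  pin the bunt of the type to the head of the subject, without a face
=|  [%0 val=@ud]
::  name the head of the subject `state`
=*  state  -
::  `val` can be referenced directly, and `state` refers to the whole thing
[val state]
```

This produces a cell of `0` (the bunted `val`, referenced directly) and the whole bunted state `[%0 val=0]` (referenced by the `state` name).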
## State management arms
We've described the basic lifecycle process and the purpose of each state
management arm. Now let's look at each arm in detail:
### `on-init`
This arm takes no argument, and produces a `(quip card _this)`. It's called
exactly once, when the agent is first installed. Its purpose is to initialize
the agent.
`(quip a b)` is equivalent to `[(list a) b]`, see the [types
reference](/docs/userspace/gall-guide/types#quip) for details.
A `card` is a message to another agent or vane. We'll discuss `card`s in detail
later.
`this` is our agent core, which we give the `this` alias in the virtual arm
described in the previous lesson. The underscore at the beginning is the
irregular syntax for the [buccab](/docs/hoon/reference/rune/buc#_-buccab) (`$_`)
rune. Buccab is like an inverted bunt - rather than producing the default value
of a type, it produces the type of some value. So `_this` means "the
type of `this`" - the type of our agent core.
Recall that in the last lesson, we said that most arms return a cell of
`[effects new-agent-core]`. That's exactly what `(quip card _this)` is.
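The examples in this guide produce that cell with a leading backtick, which is the irregular syntax for a cell with a null head. So `` `this `` is `[~ this]` - an empty list of `card`s and the agent core. A quick dojo sketch, with an arbitrary value standing in for the core:

```
> `42
[~ 42]
```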
### `on-save`
This arm takes no argument, and produces a `vase`. Its purpose is to export the
state of an agent - the state is packed into the vase it produces. The main time
it's called is when an agent is upgraded. When that happens, the agent's state
is exported with `on-save`, the new version of the agent is compiled and loaded,
and then the state is imported back into the new version of the agent via the
[`on-load`](#on-load) arm.
As well as the agent upgrade process, `on-save` is also used when an agent is
suspended or an app is uninstalled, so that the state can be restored when it's
resumed or reinstalled.
The state is packed in a vase with the
[zapgar](/docs/hoon/reference/rune/zap#-zapgar) (`!>`) rune, like `!>(state)`.
### `on-load`
This arm takes a `vase` and produces a `(quip card _this)`. Its purpose is to
import a state previously exported with [`on-save`](#on-save). Typically
you'd have used a [versioned state](#versioned-state-type) as described above,
so this arm would test which state version the imported data has, convert data
from an old version to the new version if necessary, and load it into the
`state` wing of the subject.
The vase would be unpacked with a
[zapgal](/docs/hoon/reference/rune/zap#-zapgal) (`!<`) rune, and then typically
you'd test its version with a [wuthep](/docs/hoon/reference/rune/wut#--wuthep)
(`?-`) expression.
## Example
Here's a new agent to demonstrate the concepts we've discussed here:
```hoon
/+  default-agent, dbug
|%
+$  versioned-state
  $%  state-0
  ==
+$  state-0  [%0 val=@ud]
+$  card  card:agent:gall
--
%-  agent:dbug
=|  state-0
=*  state  -
^-  agent:gall
|_  =bowl:gall
+*  this  .
    def   ~(. (default-agent this %.n) bowl)
::
++  on-init
  ^-  (quip card _this)
  `this(val 42)
::
++  on-save
  ^-  vase
  !>(state)
::
++  on-load
  |=  old-state=vase
  ^-  (quip card _this)
  =/  old  !<(versioned-state old-state)
  ?-  -.old
    %0  `this(state old)
  ==
::
++  on-poke   on-poke:def
++  on-watch  on-watch:def
++  on-leave  on-leave:def
++  on-peek   on-peek:def
++  on-agent  on-agent:def
++  on-arvo   on-arvo:def
++  on-fail   on-fail:def
--
```
Let's break it down and have a look at the new parts we've added. First, the
state core:
```hoon
|%
+$  versioned-state
  $%  state-0
  ==
+$  state-0  [%0 val=@ud]
+$  card  card:agent:gall
--
```
In `state-0` we've defined the structure of our state, which is just a `@ud`.
We've tagged the head with a `%0` constant representing the version number, so
`on-load` can easily test the state version. In `versioned-state` we've created
a union and just added our `state-0` type. We've added an extra `card` arm as
well, just so we can use `card` as a type, rather than the unwieldy
`card:agent:gall`.
After that core, we have the usual `agent:dbug` call, and then we have this:
```hoon
=|  state-0
=*  state  -
```
We've just bunted the `state-0` type, which will produce `[%0 val=0]`, pinning
it to the head of the subject. Then, we've used
[tistar](/docs/hoon/reference/rune/tis#-tistar) (`=*`) to give it a name of
`state`.
Inside our agent core, we have `on-init`:
```hoon
++  on-init
  ^-  (quip card _this)
  `this(val 42)
```
The `a(b c)` syntax is the irregular form of the
[centis](/docs/hoon/reference/rune/cen#-centis) (`%=`) rune. You'll likely be
familiar with this from recursive functions, where you'll typically call the buc
arm of a trap like `$(a b, c d, ...)`. It's the same concept here - we're saying
`this` (our agent core) with `val` replaced by `42`. Since `on-init` is only
called when the agent is first installed, we're just initializing the state.
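If you'd like to see the same syntax on something simpler than an agent core, here's a small dojo sketch. The faces `a`, `b` and `c` are arbitrary names chosen for illustration:

```
> =a [b=1 c=2]
> a(b 3)
[b=3 c=2]
> a
[b=1 c=2]
```

Note that `a(b 3)` produces a modified copy - `a` itself is unchanged. In the same way, `this(val 42)` produces a new agent core, which is why it's returned in the `quip`.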
Next we have `on-save`:
```hoon
++  on-save
  ^-  vase
  !>(state)
```
This exports our agent's state, and is called during upgrades, suspensions, etc.
We're having it pack the `state` value in a `vase`.
Finally, we have `on-load`:
```hoon
++  on-load
  |=  old-state=vase
  ^-  (quip card _this)
  =/  old  !<(versioned-state old-state)
  ?-  -.old
    %0  `this(state old)
  ==
```
It takes in the old state in a `vase`, then unpacks it to the `versioned-state`
type we defined earlier. We test its head for the version, and load it back into
the state of our agent if it matches. This test is a bit redundant at this stage
since we only have one state version, but you'll soon see the purpose of it.
You can save it as `/app/lifecycle.hoon` in the `%base` desk and `|commit %base`. Then, run `|rein %base [& %lifecycle]` to start it.
Let's try inspecting our state with `dbug`:
```
> [%0 val=42]
> :lifecycle +dbug
>=
```
`dbug` can also dig into the state with the `%state` argument, printing the value of the specified face:
```
> 42
> :lifecycle +dbug [%state %val]
>=
```
Next, we're going to modify our agent and change the structure of the state so
we can test out the upgrade process. Here's a modified version, which you can
again save in `/app/lifecycle.hoon` and `|commit %base`:
```hoon
/+  default-agent, dbug
|%
+$  versioned-state
  $%  state-0
      state-1
  ==
+$  state-0  [%0 val=@ud]
+$  state-1  [%1 val=[@ud @ud]]
+$  card  card:agent:gall
--
%-  agent:dbug
=|  state-1
=*  state  -
^-  agent:gall
|_  =bowl:gall
+*  this  .
    def   ~(. (default-agent this %.n) bowl)
::
++  on-init
  ^-  (quip card _this)
  `this(val [27 32])
::
++  on-save
  ^-  vase
  !>(state)
::
++  on-load
  |=  old-state=vase
  ^-  (quip card _this)
  =/  old  !<(versioned-state old-state)
  ?-  -.old
    %1  `this(state old)
    %0  `this(state 1+[val.old val.old])
  ==
::
++  on-poke   on-poke:def
++  on-watch  on-watch:def
++  on-leave  on-leave:def
++  on-peek   on-peek:def
++  on-agent  on-agent:def
++  on-arvo   on-arvo:def
++  on-fail   on-fail:def
--
```
As soon as you `|commit` it, Gall will immediately export the existing state
with `on-save`, build the new version of the agent, then import the state back
in with `on-load`.
In the state definition core, you'll see we've added a new state version with a different structure:
```hoon
|%
+$  versioned-state
  $%  state-0
      state-1
  ==
+$  state-0  [%0 val=@ud]
+$  state-1  [%1 val=[@ud @ud]]
+$  card  card:agent:gall
--
```
We've also changed the part that adds the state, so it uses the new version instead:
```hoon
=|  state-1
=*  state  -
```
In `on-init`, we've updated it to initialize the state with a value that fits the new type we've defined:
```hoon
++  on-init
  ^-  (quip card _this)
  `this(val [27 32])
```
`on-init` won't be called in this case, but if someone were to directly install
this new version of the agent, it would be, so we still need to update it.
`on-save` has been left unchanged, but `on-load` has been updated like so:
```hoon
++  on-load
  |=  old-state=vase
  ^-  (quip card _this)
  =/  old  !<(versioned-state old-state)
  ?-  -.old
    %1  `this(state old)
    %0  `this(state 1+[val.old val.old])
  ==
```
We've updated the `?-` expression with a new case that handles our new state
type, and for the old state type we've added a function that converts it to the
new type - in this case by duplicating `val` and changing the head-tag from `%0`
to `%1`. This is an extremely simple state type transition function - it would
likely be more complicated for an agent with real functionality.
Note: the `a+b` syntax (as in `1+[val.old val.old]`) forms a cell of the
constant `%a` and the noun `b`. The constant may either be an integer or a `@tas`.
For example:
```
> foo+'bar'
[%foo 'bar']
> 42+'bar'
[%42 'bar']
```
Let's now use `dbug` to confirm our state has successfully been updated to the new
type:
```
> [%1 val=[42 42]]
> :lifecycle +dbug
>=
```
## Summary
- The app lifecycle roughly consists of initialization, state export, upgrade,
state import and state version transition.
- This is managed by three arms: `on-init`, `on-save` and `on-load`.
- `on-init` initializes the agent and is called when it's first installed.
- `on-save` exports the agent's state and is called during upgrade or
when an app is suspended.
- `on-load` imports an agent's state and is called during upgrade or
when an app is unsuspended. It also handles converting data from old
state versions to new state versions.
- The type of an agent's state is typically defined in a separate core.
- The state type is typically versioned, with a new type definition for each
version of the state.
- The state is initially added by bunting the state type and then naming it
`state` with the tistar (`=*`) rune, so its contents can be referenced
directly.
- A `vase` is a cell of `[type-of-the-noun the-noun]`.
- `(quip a b)` is the same as `[(list a) b]`, and is the `[effects new-agent-core]` pair returned by many arms of an agent core.
## Exercises
- Run through the [example](#example) yourself on a fake ship if you've not done
so already.
- Have a look at the [`vase` entry in the type
reference](/docs/userspace/gall-guide/types#vase).
- Have a look at the [`quip` entry in the type
reference](/docs/userspace/gall-guide/types#quip).
- Try modifying the second version of the agent in the [example](#example)
section, adding a third state version. Include functions in the wuthep
expression in `on-load` to convert old versions to your new state type.

View File

@ -0,0 +1,244 @@
+++
title = "5. Cards"
weight = 25
template = "doc.html"
+++
As we previously discussed, most arms of an agent core produce a cell of
`[effects new-agent-core]`, and the type we use for this is typically `(quip card _this)`. We've covered `_this`, but we haven't yet looked at `card` effects
in detail. That's what we'll do here. In explaining `card`s we'll touch on some
concepts relating to the mechanics of pokes, subscriptions and other things
we've not yet covered. Don't worry if you don't understand how it all fits
together yet, we just want to give you a basic idea of `card`s so we can then
dig into how they work in practice.
## `card` type
The `card:agent:gall` type (henceforth just `card`) has a slightly complex
structure, so we'll walk through it step-by-step.
`lull.hoon` defines a `card` like so:
```hoon
+$  card  (wind note gift)
```
A `wind` is defined in `arvo.hoon` as:
```hoon
++  wind
  |$  [a b]
  $%  [%pass p=wire q=a]
      [%slip p=a]
      [%give p=b]
  ==
```
Gall will not accept a `%slip`, so we can ignore that. A `card`, then, is one
of:
```hoon
[%pass wire note]
[%give gift]
```
We'll consider each separately.
## `%pass`
```hoon
[%pass wire note]
```
The purpose of a `%pass` card is to send some kind of one-off request, action,
task, or what have you, to another agent or vane. A `%pass` card is a request
your agent _initiates_. This is in contrast to a [`%give`](#give) card, which is
sent in _response_ to another agent or vane.
The type of the first field in a `%pass` card is a `wire`. A `wire` is just a
list of `@ta`, with a syntax of `/foo/bar/baz`. When you `%pass` something to an
agent or vane, the response will come back on the `wire` you specify here. Your
agent can then check the `wire` and maybe do different things depending on its
content. The [`wire`](/docs/userspace/gall-guide/types#wire) type is covered in
the [types reference](/docs/userspace/gall-guide/types). We'll show how `wire`s
are practically used later on.
The type of the next field is a `note:agent:gall` (henceforth just `note`), which
`lull.hoon` defines as:
```hoon
+$  note
  $%  [%agent [=ship name=term] =task]
      [%arvo note-arvo]
      [%pyre =tang]
  ==
```
- An `%agent` `note` is a request to another Gall agent, either local or on a
remote ship. The `ship` and `name` fields are just the target ship and agent
name. The `task` is the request itself, we'll discuss it separately
[below](#task).
- An `%arvo` `note` is a request to a vane. We'll discuss such requests
[below](#note-arvo).
- A `%pyre` `note` is used to abort an event. It's mostly used internally by
`kiln` (a submodule of `%hood`), it's unlikely you'd use it in your own agent. The `tang` contains an
error message.
### `task`
A `task:agent:gall` (henceforth just `task`) is defined in `lull.hoon` as:
```hoon
+$  task
  $%  [%watch =path]
      [%watch-as =mark =path]
      [%leave ~]
      [%poke =cage]
      [%poke-as =mark =cage]
  ==
```
Note a few of these include a `path` field. The `path` type is exactly the same
as a `wire` - a list of `@ta` with a syntax of `/foo/bar/baz`. The reason for
the `wire`/`path` distinction is just to indicate their separate purposes. While
a `wire` is for _responses_, a `path` is for _requests_. The
[`path`](/docs/userspace/gall-guide/types#path) type is also covered in the
[types reference](/docs/userspace/gall-guide/types).
The kinds of `task`s can be divided into two categories:
#### Subscriptions
`%watch`, `%watch-as` and `%leave` all pertain to subscriptions.
- `%watch`: A request to subscribe to the specified `path`. Once subscribed,
your agent will receive any updates the other agent sends out on that `path`.
You can subscribe more than once to the same `path`, but each subscription
must have a separate `wire` specified at the beginning of the [`%pass`
card](#pass).
- `%watch-as`: This is the same as `%watch`, except Gall will convert updates to
the given `mark` before delivering them to your agent.
- `%leave`: Unsubscribe. The subscription to cancel is determined by the `wire`
at the beginning of the [`%pass` card](#pass) rather than the subscription
`path`, so its argument is just `~`.
**Examples**
![subscription card examples](https://media.urbit.org/docs/userspace/gall-guide/sub-cards.svg)
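In text form, subscription cards might look something like the following sketch. The ship `~nec`, the agent name `%some-agent`, the wires and the path are all placeholders rather than anything that really exists:

```hoon
::  subscribe to /updates; anything sent back will arrive on the wire /sub/one
[%pass /sub/one %agent [~nec %some-agent] %watch /updates]
::  the same, but ask Gall to convert updates to the %noun mark
[%pass /sub/two %agent [~nec %some-agent] %watch-as %noun /updates]
::  cancel the first subscription, identified by its wire
[%pass /sub/one %agent [~nec %some-agent] %leave ~]
```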
#### Pokes
Pokes are requests, actions, or just some data which you send to another agent.
Unlike subscriptions, these are just one-off messages.
A `%poke` contains a `cage` of some data. A `cage` is a cell of `[mark vase]`.
The `mark` is just a `@tas` like `%foo`, and corresponds to a mark file in the
`/mar` directory. We'll cover `mark`s in greater detail later. The `vase` contains
the actual data you're sending.
The `%poke-as` task is the same as `%poke` except Gall will convert the `mark`
in the `cage` to the `mark` you specify before sending it off.
**Examples**
![poke card examples](https://media.urbit.org/docs/userspace/gall-guide/poke-cards.svg)
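As a textual sketch, poke cards might look like this. The target ship, agent name and wires are placeholders, and `%some-mark` is a hypothetical mark assumed to exist in `/mar`:

```hoon
::  poke the target with the atom 42 under the %noun mark
[%pass /poke/one %agent [~nec %some-agent] %poke %noun !>(42)]
::  the same data, but ask Gall to convert it from %noun to %some-mark first
[%pass /poke/two %agent [~nec %some-agent] %poke-as %some-mark %noun !>(42)]
```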
### `note-arvo`
A `note-arvo` is defined in `lull.hoon` like so:
```hoon
+$  note-arvo
  $~  [%b %wake ~]
  $%  [%a task:ames]
      [%b task:behn]
      [%c task:clay]
      [%d task:dill]
      [%e task:eyre]
      [%g task:gall]
      [%i task:iris]
      [%j task:jael]
      [%$ %whiz ~]
      [@tas %meta vase]
  ==
```
The letter at the beginning corresponds to the vane - `%b` for Behn, `%c` for
Clay, etc. After the vane letter comes the task. Each vane has an API with a
set of tasks that it will accept, which are defined in each vane's section of
`lull.hoon`. Each vane's tasks are documented on the API Reference page of its
section in the [Arvo documentation](/docs/arvo/arvo).
#### Examples
![arvo card examples](https://media.urbit.org/docs/userspace/gall-guide/arvo-cards.svg)
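As one textual sketch, a card asking Behn to set a timer might look like the following. The wire `/timer` is a placeholder, and the example assumes it appears inside an agent arm where `bowl` is available:

```hoon
::  ask Behn (%b) to set a timer for ten seconds from now;
::  Behn's response will come back on the wire /timer
[%pass /timer %arvo %b %wait (add now.bowl ~s10)]
```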
## `%give`
```hoon
[%give gift]
```
The purpose of a `%give` card is to respond to a request made by another agent
or vane. More specifically, it's either for acknowledging a request, or for
sending out updates to subscribers. This is in contrast to a [`%pass`](#pass)
card, which is essentially unsolicited.
A `%give` card contains a `gift:agent:gall` (henceforth just `gift`), which is
defined in `lull.hoon` as:
```hoon
+$  gift
  $%  [%fact paths=(list path) =cage]
      [%kick paths=(list path) ship=(unit ship)]
      [%watch-ack p=(unit tang)]
      [%poke-ack p=(unit tang)]
  ==
```
These can be divided into two categories:
### Acknowledgements
`%watch-ack` is sent in response to a `%watch` or `%watch-as` request, and
`%poke-ack` is sent in response to a `%poke` or `%poke-as` request. If the
`(unit tang)` is null, it's an ack - a positive acknowledgement. If the `(unit tang)` is non-null, it's a nack - a negative acknowledgement, and the `tang`
contains an error message. Gall automatically sends a nack with a stack trace if
your agent crashes while processing the request, and automatically sends an ack
if it does not. Therefore, you would not explicitly produce a `%watch-ack` or
`%poke-ack` gift.
#### Examples
![ack card examples](https://media.urbit.org/docs/userspace/gall-guide/ack-cards.svg)
### Subscriptions
`%fact` and `%kick` are both sent out to existing subscribers - entities that
have previously `%watch`ed a path on your ship.
A `%kick` gift takes a list of subscription `path`s and a `(unit ship)`, which
is the ship to kick from those paths. If the `unit` is null, all subscribers are
kicked from the specified paths. Note that sometimes Gall can produce `%kick`
gifts without your agent explicitly sending a card, due to networking
conditions.
`%fact`s are how updates are sent out to subscribers. The `paths` field is a
list of subscription paths - all subscribers of the specified `path`s will
receive the `%fact`. The `cage` is the data itself - a cell of a `mark` and a
`vase`.
#### Examples
![gift card examples](https://media.urbit.org/docs/userspace/gall-guide/gift-cards.svg)
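In text form, subscription gifts might look like this sketch. The path `/updates`, the data and the ship `~nec` are placeholders:

```hoon
::  send an update to everyone subscribed to /updates
[%give %fact ~[/updates] %noun !>('new data')]
::  kick all subscribers from /updates
[%give %kick ~[/updates] ~]
::  kick only ~nec from /updates
[%give %kick ~[/updates] `~nec]
```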
## Summary
Here's a diagram that summarizes the different kinds of `card`s:
[![card diagram](https://media.urbit.org/docs/userspace/gall-guide/card-diagram.svg)](https://media.urbit.org/docs/userspace/gall-guide/card-diagram.svg)
## Exercises
- Have a read of the [`wire`](/docs/userspace/gall-guide/types#wire) and
[`path`](/docs/userspace/gall-guide/types#path) entries in the type reference.

View File

@ -0,0 +1,538 @@
+++
title = "6. Pokes"
weight = 30
template = "doc.html"
+++
In this lesson we'll look at sending and receiving one-off messages called
`%poke`s. We'll look at the `on-poke` agent arm which handles incoming pokes.
We'll also introduce the `on-agent` arm, and look at one kind of response it can
take - a `%poke-ack`.
## Receiving a poke
Whenever something tries to poke your agent, Gall calls your agent's `on-poke`
arm and gives it the `cage` from the poke as its sample. The `on-poke` arm will
produce a `(quip card _this)`. Here's how it would typically begin:
```hoon
++  on-poke
  |=  [=mark =vase]
  ^-  (quip card _this)
  ...
```
The sample of the gate is usually specified as a cell of `mark` and `vase`
rather than just `cage`, simply because it's easier to work with.
Typically, you'd first test the `mark` with something like a
[wutlus](/docs/hoon/reference/rune/wut#-wutlus) `?+` expression, passing
unexpected `mark`s to `default-agent`, which just crashes. We'll look at custom
`mark`s in a subsequent lesson, but the basic pattern looks like:
```hoon
?+  mark  (on-poke:def mark vase)
  %noun            ...
  %something-else  ...
  ...
==
```
After testing the `mark`, you'd usually extract the `vase` to the expected type,
and then apply whatever logic you need. For example:
```hoon
=/  action  !<(some-type vase)
?-  -.action
  %foo  ...
  %bar  ...
  ...
==
```
Your agent will then produce a list of `card`s to be sent off and a new,
modified state, as appropriate. We'll go into subscriptions in the next lesson,
but just to give you an idea of a typical pattern: An agent for a chat app might
take new messages as pokes, add them to the list of messages in its state, and
send out the new messages to subscribed chat participants as `gift`s.
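Putting those pieces together, a complete `on-poke` arm might look roughly like the sketch below. It assumes an agent set up like the one in the lifecycle lesson (a state containing `val=@ud`, plus the `this` and `def` deferred expressions), and the `%set`/`%reset` action type is hypothetical, defined inline purely for illustration:

```hoon
++  on-poke
  |=  [=mark =vase]
  ^-  (quip card _this)
  ::  only handle the %noun mark; anything else crashes via default-agent
  ?+    mark  (on-poke:def mark vase)
      %noun
    ::  extract the vase to the (hypothetical) action type
    =/  action  !<($%([%set val=@ud] [%reset ~]) vase)
    ::  no cards to send, just a modified state
    ?-  -.action
      %set    `this(val val.action)
      %reset  `this(val 0)
    ==
  ==
```

Here each branch produces an empty list of `card`s and a new agent core with `val` changed. In a real agent you'd normally define the action type in a separate core or structure file rather than inline.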
As discussed in the previous lesson, Gall will automatically send a `%poke-ack`
`gift` back to wherever the poke came from. The `%poke-ack` will be a nack if
your agent crashed while processing the poke, and an ack otherwise. If it's a
nack, the `tang` in the `%poke-ack` will contain a stack trace of the crash.
As a result, you do not need to explicitly send a `%poke-ack`. Instead, you
would design your agent to handle only what you expect and crash in all other
cases. You can crash by passing the `cage` to `default-agent`, or just with a
`!!`. In the latter case, if you want to add an error message to the stack
trace, you can do so like:
```hoon
~|  "some error message"
!!
```
This will produce a trace that looks something like:
```
/sys/vane/gall/hoon:<[1.372 9].[1.372 37]>
/app/pokeme/hoon:<[31 3].[43 5]>
/app/pokeme/hoon:<[32 3].[43 5]>
/app/pokeme/hoon:<[34 5].[42 7]>
/app/pokeme/hoon:<[35 5].[42 7]>
/app/pokeme/hoon:<[38 7].[41 27]>
/app/pokeme/hoon:<[39 9].[40 11]>
"some error message"
/app/pokeme/hoon:<[40 9].[40 11]>
```
Note that the `tang` in the nack is just for debugging purposes. You should not
try to pass actual data by encoding it in the nack `tang`.
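The error message can also carry some debugging context, since a tape may interpolate values from the subject. The following is a minimal sketch that assumes it sits inside an `on-poke` arm where `mark` is in scope; the message wording is purely illustrative:

```hoon
::  include the offending mark in the stack trace, then crash
~|  "unexpected mark: {<mark>}"
!!
```

This is still only for debugging: the resulting `tang` is meant to be read by a developer, not parsed by the agent that sent the poke.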
## Sending a poke
An agent can send pokes to other agents by producing [`%poke`
`card`s](/docs/userspace/gall-guide/5-cards#pokes). Any agent arm apart from
`on-peek` and `on-save` can produce such `card`s. The arms would typically
produce the `(quip card _this)` like so:
```hoon
:_  this
:~  [%pass /some/wire %agent [~target-ship %target-agent] %poke %some-mark !>('some data')]
==
```
The [colcab](/docs/hoon/reference/rune/col#_-colcab) (`:_`) rune makes an
inverted cell: it's just `:-` but with the head and tail swapped. We use colcab
to produce the `(quip card _this)` because the list of cards is "heavier"
here than the new agent core expression (`this`), so putting the cards last
makes the code more readable.
### Receiving the `%poke-ack`
The pokes will be processed by their targets [as described in the previous
section](#receiving-a-poke), and they'll `%give` back a `%poke-ack` on the
`wire` you specified (`/some/wire` in the previous example). When Gall gets the
`%poke-ack` back, it will call the `on-agent` arm of your agent, with the `wire`
it came in on and the `%poke-ack` itself in a `sign:agent:gall`. Your `on-agent`
arm would therefore begin like so:
```hoon
++  on-agent
  |=  [=wire =sign:agent:gall]
  ^-  (quip card _this)
  ...
```
A `sign:agent:gall` (henceforth just `sign`) is defined in `lull.hoon` as:
```hoon
+$  sign
  $%  [%poke-ack p=(unit tang)]
      [%watch-ack p=(unit tang)]
      [%fact =cage]
      [%kick ~]
  ==
```
It's basically the same as a [`gift`](/docs/userspace/gall-guide/5-cards#give),
but incoming instead of outgoing.
The simplest way to handle a `%poke-ack` is to pass it to `default-agent`'s
`on-agent` arm, which will just print an error message to the terminal if it's a
nack, and otherwise do nothing. Sometimes you'll want your agent to do something
different depending on whether the poke failed or succeeded (and therefore
whether it's a nack or an ack).
As stated in the [Precepts](/docs/development/precepts#specifics): "Route on wire before sign, never sign before wire." Thus you first test the
`wire` to tell what the `%poke-ack` was for. You might do something
like:
```hoon
?+  wire  (on-agent:def wire sign)
  [%some %wire ~]  ...
  ...
==
```
After that, you'll need to see what kind of `sign` it is:
```hoon
?+  -.sign  (on-agent:def wire sign)
  %poke-ack  ...
  ...
==
```
Then, you can tell whether it's an ack or a nack by testing whether the `(unit tang)` in the `%poke-ack` is null:
```hoon
?~  p.sign
  ...(what to do if the poke succeeded)...
...(what to do if the poke failed)...
```
Finally, you can produce the `(quip card _this)`.
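Putting the three steps together, an `on-agent` arm might look roughly like the sketch below. It assumes the usual agent boilerplate (`this`, `def` and `card` defined as in the examples in this lesson) and reuses the `/some/wire` wire from the earlier `%poke` card; both branches simply return the unchanged agent as a placeholder:

```hoon
++  on-agent
  |=  [=wire =sign:agent:gall]
  ^-  (quip card _this)
  ::  1. route on the wire first
  ?+    wire  (on-agent:def wire sign)
      [%some %wire ~]
    ::  2. then check what kind of sign it is
    ?+    -.sign  (on-agent:def wire sign)
        %poke-ack
      ::  3. a null p.sign is an ack, anything else is a nack
      ?~  p.sign
        `this
      `this
    ==
  ==
```

Unrecognized wires and signs fall through to `default-agent`, so only the cases the agent actually expects need explicit handling.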
## Example
We're going to look at a couple of agents to demonstrate both sending and
receiving pokes. Here's the first, an agent that receives pokes:
### `pokeme.hoon`
```hoon
/+  default-agent, dbug
|%
+$  versioned-state
  $%  state-0
  ==
+$  state-0  [%0 val=@ud]
+$  card  card:agent:gall
--
%-  agent:dbug
=|  state-0
=*  state  -
^-  agent:gall
|_  =bowl:gall
+*  this  .
    def   ~(. (default-agent this %.n) bowl)
::
++  on-init
  ^-  (quip card _this)
  `this
::
++  on-save
  ^-  vase
  !>(state)
::
++  on-load
  |=  old-state=vase
  ^-  (quip card _this)
  =/  old  !<(versioned-state old-state)
  ?-  -.old
    %0  `this(state old)
  ==
::
++  on-poke
  |=  [=mark =vase]
  ^-  (quip card _this)
  ?+    mark  (on-poke:def mark vase)
      %noun
    =/  action  !<(?(%inc %dec) vase)
    ?-    action
      %inc  `this(val +(val))
    ::
        %dec
      ?:  =(0 val)
        ~|  "Can't decrement - already zero!"
        !!
      `this(val (dec val))
    ==
  ==
::
++  on-watch  on-watch:def
++  on-leave  on-leave:def
++  on-peek   on-peek:def
++  on-agent  on-agent:def
++  on-arvo   on-arvo:def
++  on-fail   on-fail:def
--
```
This is a very simple agent that just has `val`, a number, in its state. It will
take pokes that either increment or decrement `val`. Here's its `on-poke` arm:
```hoon
++  on-poke
  |=  [=mark =vase]
  ^-  (quip card _this)
  ?+    mark  (on-poke:def mark vase)
      %noun
    =/  action  !<(?(%inc %dec) vase)
    ?-    action
      %inc  `this(val +(val))
    ::
        %dec
      ?:  =(0 val)
        ~|  "Can't decrement - already zero!"
        !!
      `this(val (dec val))
    ==
  ==
```
It only expects pokes with a `%noun` mark, and passes all others to
`on-poke:def`, which just crashes. For `%noun` pokes, it expects to receive
either `%inc` or `%dec` in the `vase`. If it's `%inc`, it produces a new `this`
with `val` incremented. If it's `%dec`, it produces `this` with `val`
decremented, or crashes if `val` is already zero.
Let's try it out. Save the agent above as `/app/pokeme.hoon` in the `%base` desk
and `|commit %base`. Then, start it up with `|rein %base [& %pokeme]`. We can
check its initial state with `dbug`:
```
> 0
> :pokeme +dbug [%state %val]
>=
```
Next, we'll try poking it. The dojo lets you poke agents with the following syntax:
```
:agent-name &some-mark ['some' 'noun']
```
If the `mark` part is omitted, it'll just default to `%noun`. Since our agent
only takes a `%noun` mark, we can skip that. The rest will be packed in a vase
by the dojo and delivered as a poke, so we can do:
```
> :pokeme %inc
>=
```
If we now look at the state with `dbug`, we'll see the poke was successful and
it's been incremented:
```
> 1
> :pokeme +dbug [%state %val]
>=
```
Let's try decrement:
```
> :pokeme %dec
>=
> 0
> :pokeme +dbug [%state %val]
>=
```
As you can see, it's back at zero. If we try again, we'll see it fails, and the
dojo will print the `tang` in the `%poke-ack` nack:
```
> :pokeme %dec
/sys/vane/gall/hoon:<[1.372 9].[1.372 37]>
/app/pokeme/hoon:<[31 3].[43 5]>
/app/pokeme/hoon:<[32 3].[43 5]>
/app/pokeme/hoon:<[34 5].[42 7]>
/app/pokeme/hoon:<[35 5].[42 7]>
/app/pokeme/hoon:<[38 7].[41 27]>
/app/pokeme/hoon:<[39 9].[40 11]>
"Can't decrement - already zero!"
/app/pokeme/hoon:<[40 9].[40 11]>
dojo: app poke failed
```
### `pokeit.hoon`
Here's a second agent. It takes a poke of `%inc` or `%dec` like before, but
rather than updating its own state, it sends two pokes to `%pokeme`, so
`%pokeme`'s state will be incremented or decremented by two.
```hoon
/+  default-agent, dbug
|%
+$  versioned-state
  $%  state-0
  ==
+$  state-0  [%0 ~]
+$  card  card:agent:gall
--
%-  agent:dbug
=|  state-0
=*  state  -
^-  agent:gall
|_  =bowl:gall
+*  this  .
    def   ~(. (default-agent this %.n) bowl)
::
++  on-init
  ^-  (quip card _this)
  `this
::
++  on-save
  ^-  vase
  !>(state)
::
++  on-load
  |=  old-state=vase
  ^-  (quip card _this)
  =/  old  !<(versioned-state old-state)
  ?-  -.old
    %0  `this(state old)
  ==
::
++  on-poke
  |=  [=mark =vase]
  ^-  (quip card _this)
  ?+    mark  (on-poke:def mark vase)
      %noun
    =/  action  !<(?(%inc %dec) vase)
    ?-    action
        %inc
      :_  this
      :~  [%pass /inc %agent [our.bowl %pokeme] %poke %noun !>(%inc)]
          [%pass /inc %agent [our.bowl %pokeme] %poke %noun !>(%inc)]
      ==
    ::
        %dec
      :_  this
      :~  [%pass /dec %agent [our.bowl %pokeme] %poke %noun !>(%dec)]
          [%pass /dec %agent [our.bowl %pokeme] %poke %noun !>(%dec)]
      ==
    ==
  ==
::
++  on-watch  on-watch:def
++  on-leave  on-leave:def
++  on-peek   on-peek:def
::
++  on-agent
  |=  [=wire =sign:agent:gall]
  ^-  (quip card _this)
  ?+    wire  (on-agent:def wire sign)
      [%inc ~]
    ?.  ?=(%poke-ack -.sign)
      (on-agent:def wire sign)
    ?~  p.sign
      %-  (slog '%pokeit: Increment poke succeeded!' ~)
      `this
    %-  (slog '%pokeit: Increment poke failed!' ~)
    `this
  ::
      [%dec ~]
    ?.  ?=(%poke-ack -.sign)
      (on-agent:def wire sign)
    ?~  p.sign
      %-  (slog '%pokeit: Decrement poke succeeded!' ~)
      `this
    %-  (slog '%pokeit: Decrement poke failed!' ~)
    `this
  ==
::
++  on-arvo  on-arvo:def
++  on-fail  on-fail:def
--
```
Here's the `on-poke` arm:
```hoon
++  on-poke
  |=  [=mark =vase]
  ^-  (quip card _this)
  ?+    mark  (on-poke:def mark vase)
      %noun
    =/  action  !<(?(%inc %dec) vase)
    ?-    action
        %inc
      :_  this
      :~  [%pass /inc %agent [our.bowl %pokeme] %poke %noun !>(%inc)]
          [%pass /inc %agent [our.bowl %pokeme] %poke %noun !>(%inc)]
      ==
    ::
        %dec
      :_  this
      :~  [%pass /dec %agent [our.bowl %pokeme] %poke %noun !>(%dec)]
          [%pass /dec %agent [our.bowl %pokeme] %poke %noun !>(%dec)]
      ==
    ==
  ==
```
It's similar to `%pokeme`, except it sends two `%poke` `card`s to `%pokeme` for
each case, rather than modifying its own state. The `%inc` pokes specify a
`wire` of `/inc`, and the `%dec` pokes specify a `wire` of `/dec`, so we can
differentiate the responses. It also has the following `on-agent`:
```hoon
++  on-agent
  |=  [=wire =sign:agent:gall]
  ^-  (quip card _this)
  ?+    wire  (on-agent:def wire sign)
      [%inc ~]
    ?.  ?=(%poke-ack -.sign)
      (on-agent:def wire sign)
    ?~  p.sign
      %-  (slog '%pokeit: Increment poke succeeded!' ~)
      `this
    %-  (slog '%pokeit: Increment poke failed!' ~)
    `this
  ::
      [%dec ~]
    ?.  ?=(%poke-ack -.sign)
      (on-agent:def wire sign)
    ?~  p.sign
      %-  (slog '%pokeit: Decrement poke succeeded!' ~)
      `this
    %-  (slog '%pokeit: Decrement poke failed!' ~)
    `this
  ==
```
`on-agent` tests the `wire`, checks if it's a `%poke-ack`, and then prints to
the terminal whether it succeeded or failed.
Save this agent to `/app/pokeit.hoon` on the `%base` desk, `|commit %base`, and
start it with `|rein %base [& %pokeme] [& %pokeit]`.
Let's try it out:
```
%pokeit: Increment poke succeeded!
%pokeit: Increment poke succeeded!
> :pokeit %inc
>=
```
`%pokeit` has received positive `%poke-ack`s, which means both pokes succeeded.
It could tell they were increments because the `%poke-ack`s came back on the
`/inc` wire we specified. We can check the state of `%pokeme` to confirm:
```
> 2
> :pokeme +dbug [%state %val]
>=
```
Let's try decrementing `%pokeme` so it's an odd number, and then try a `%dec`
via `%pokeit`:
```
> :pokeme %dec
>=
%pokeit: Decrement poke succeeded!
%pokeit: Decrement poke failed!
> :pokeit %dec
>=
```
The `on-agent` arm of `%pokeit` has received one ack and one nack. The first
took `val` to zero, and the second crashed trying to decrement below zero.
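If you also wanted `%pokeit` to show why a poke failed, the nack branch could print the `tang` it received. The fragment below is a hypothetical variation on the failure branch of the `[%dec ~]` case, not part of the agent above; it relies on `p.sign` being known to be non-null at that point, so `u.p.sign` is the `tang`:

```hoon
::  replaces the last two lines of the [%dec ~] case:
::  print our own message together with the trace from the nack
%-  (slog leaf+"%pokeit: Decrement poke failed!" u.p.sign)
`this
```

Here `leaf+"..."` builds a `tank` from a tape, and prepending it to the nack's `tang` gives `slog` a single list to print.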
## Summary
- Incoming pokes go to the `on-poke` arm of an agent.
- The `on-poke` arm takes a `cage` and produces a `(quip card _this)`.
- Gall will automatically return a `%poke-ack` to the poke's source, with a
stack trace in the `(unit tang)` if your agent crashed while processing the
poke.
- Outgoing pokes can be sent by including `%poke` `%pass` `card`s in the `quip`
produced by most agent arms.
- `%poke-ack`s in response to pokes you've sent will come in to the `on-agent`
arm in a `sign`, on the `wire` you specified in the original `%poke` `card`.
- You can poke agents from the dojo with a syntax of `:agent &mark ['some' 'noun']`.
## Exercises
- Run through the [example](#example) yourself on a fake ship if you've not done
so already.
- Have a look at the `on-agent` arm of `/lib/default-agent.hoon` to see how
`default-agent` handles incoming `sign`s.
- Try modifying the `%pokeme` agent with another action of your choice (in
addition to `%inc` and `%dec`).
- Try modifying the `%pokeit` agent to send your new type of poke to `%pokeme`,
and handle the `%poke-ack` it gets back.

View File

@ -0,0 +1,408 @@
+++
title = "7. Structures and Marks"
weight = 35
template = "doc.html"
+++
Before we get into subscription mechanics, there are three things we need to touch
on that are very commonly used in Gall agents. The first is defining an agent's
types in a `/sur` structure file, the second is `mark` files, and the third is
permissions. Note that the example code presented in this lesson will not yet build a
fully functioning Gall agent; we'll get to that in the next lesson.
## `/sur`
In the [previous lesson on pokes](/docs/userspace/gall-guide/6-pokes), we used a
very simple union in the `vase` for incoming pokes:
```hoon
=/  action  !<(?(%inc %dec) vase)
```
A real Gall agent is likely to have a more complicated API. The most common
approach is to define a head-tagged union of all possible poke types the agent
will accept, and another for all possible updates it might send out to
subscribers. Rather than defining these types in the agent itself, you would
typically define them in a separate core saved in the `/sur` directory of the
desk. The `/sur` directory is the canonical location for userspace type
definitions.
With this approach, your agent can simply import the structures file and make use
of its types. Additionally, if someone else wants to write an agent that
interfaces with yours, they can include your structure file in their own desk
to interact with your agent's API in a type-safe way.
#### Example
Let's look at a practical example. If we were creating a simple To-Do app, our
agent might accept a few possible `action`s as pokes: Adding a new task,
deleting a task, toggling a task's "done" status, and renaming an existing task.
It might also be able to send `update`s out to subscribers when these events
occur. If our agent were named `%todo`, it might have the following structure in
`/sur/todo.hoon`:
```hoon
|%
+$  id  @
+$  name  @t
+$  task  [=name done=?]
+$  tasks  (map id task)
+$  action
  $%  [%add =name]
      [%del =id]
      [%toggle =id]
      [%rename =id =name]
  ==
+$  update
  $%  [%add =id =name]
      [%del =id]
      [%toggle =id]
      [%rename =id =name]
      [%initial =tasks]
  ==
--
```
Our `%todo` agent could then import this structure file with a [fashep ford
rune](/docs/arvo/ford/ford#ford-runes) (`/-`) at the beginning of the agent like
so:
```hoon
/-  todo
```
The agent's state could be defined like:
```hoon
|%
+$  versioned-state
  $%  state-0
  ==
+$  state-0  [%0 =tasks:todo]
+$  card  card:agent:gall
--
```
Then, in its `on-poke` arm, it could handle these actions in the following
manner:
```hoon
++  on-poke
  |=  [=mark =vase]
  ^-  (quip card _this)
  |^
  ?>  =(src.bowl our.bowl)
  ?+    mark  (on-poke:def mark vase)
      %todo-action
    =^  cards  state
      (handle-poke !<(action:todo vase))
    [cards this]
  ==
  ::
  ++  handle-poke
    |=  =action:todo
    ^-  (quip card _state)
    ?-    -.action
        %add
      :_  state(tasks (~(put by tasks) now.bowl [name.action %.n]))
      :~  :*  %give  %fact  ~[/updates]  %todo-update
              !>(`update:todo`[%add now.bowl name.action])
          ==
      ==
    ::
        %del
      :_  state(tasks (~(del by tasks) id.action))
      :~  :*  %give  %fact  ~[/updates]  %todo-update
              !>(`update:todo`action)
          ==
      ==
    ::
        %toggle
      :_  %=  state
            tasks  %+  ~(jab by tasks)
                     id.action
                   |=(=task:todo task(done !done.task))
          ==
      :~  :*  %give  %fact  ~[/updates]  %todo-update
              !>(`update:todo`action)
          ==
      ==
    ::
        %rename
      :_  %=  state
            tasks  %+  ~(jab by tasks)
                     id.action
                   |=(=task:todo task(name name.action))
          ==
      :~  :*  %give  %fact  ~[/updates]  %todo-update
              !>(`update:todo`action)
          ==
      ==
    ==
  --
```
Let's break this down a bit. Firstly, our `on-poke` arm includes a
[barket](/docs/hoon/reference/rune/bar#-barket) (`|^`) rune. Barket creates a
core with a `$` arm that's computed immediately. We extract the `vase` to the
`action:todo` type and immediately pass it to the `handle-poke` arm of the core
created with the barket. This `handle-poke` arm tests what kind of `action` it's
received by checking its head. It then updates the state, and also sends an
update to subscribers, as appropriate. Don't worry too much about the `%give`
`card` for now - we'll cover subscriptions in the next lesson.
Notice that the `handle-poke` arm produces a `(quip card _state)` rather than
`(quip card _this)`. The call to `handle-poke` is also part of the following
expression:
```hoon
=^  cards  state
  (handle-poke !<(action:todo vase))
[cards this]
```
The [tisket](/docs/hoon/reference/rune/tis#-tisket) (`=^`) expression takes two
arguments: A new named noun to pin to the subject (`cards` in this case), and an
existing wing of the subject to modify (`state` in this case). Since
`handle-poke` produces `(quip card _state)`, we're saving the `card`s it
produces to `cards` and replacing the existing `state` with its new one.
Finally, we produce `[cards this]`, where `this` will now contain the modified
`state`. The `[cards this]` is a `(quip card _this)`, which our `on-poke` arm is
expected to produce.
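To see tisket in isolation, outside of an agent, here is a small sketch using the standard library's `og` random number door, whose `rads` arm produces a cell of a random number and an updated generator core. The seed value and the face names `rng`, `r1` and `r2` are arbitrary choices for illustration:

```hoon
::  pin a pseudorandom generator seeded with an arbitrary value
=/  rng  ~(. og 42)
::  each =^ saves the head of the product under a new face (r1, r2)
::  and overwrites rng with the tail, the advanced generator
=^  r1  rng  (rads:rng 100)
=^  r2  rng  (rads:rng 100)
[r1 r2]
```

The shape is the same as in `on-poke`: the head of the product plays the role of `cards`, and the tail replaces an existing wing, `rng` here and `state` in the agent.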
This might seem a little convoluted, but it's a common pattern we do for two
reasons. Firstly, it's not ideal to be passing around the entire `this` agent
core - it's much tidier just passing around the `state`, until you actually want
to return it to Gall. Secondly, it's much easier to read when the poke handling
logic is separated into its own arm. This is a fairly simple example but if your
agent is more complex, handling multiple marks and containing additional logic
before it gets to the actual contents of the `vase`, structuring things this way
can be useful.
You can of course structure your `on-poke` arm differently than we've done
here - we're just demonstrating a typical pattern.
## `mark` files
So far we've just used a `%noun` mark for pokes - we haven't really delved into
what such `mark`s represent, or considered writing custom ones.
Formally, marks are file types in the Clay filesystem. They correspond to mark
files in the `/mar` directory of a desk. The `%noun` mark, for example,
corresponds to the `/mar/noun.hoon` file. Mark files define the actual hoon data
type for the file (e.g. a `*` noun for the `%noun` mark), but they also specify
some extra things:
- Methods for converting between the mark in question and other marks.
- Revision control functions like patching, diffing, merging, etc.
Aside from their use by Clay for storing files in the filesystem, they're also
used extensively for exchanging data with the outside world, and for exchanging
data between Gall agents. When data comes in from a remote ship, destined for a
particular Gall agent, it will be validated by the file in `/mar` that
corresponds to its mark before being delivered to the agent. If the remote data
has no corresponding mark file in `/mar` or it fails validation, it will crash
before it touches the agent.
A mark file is a door with exactly three arms. The door's sample is the data type the
mark will handle. For example, the sample of the `%noun` mark is just `non=*`,
since it handles any noun. The three arms are as follows:
- `grab`: Methods for converting _to_ our mark _from_ other marks.
- `grow`: Methods for converting _from_ our mark _to_ other marks.
- `grad`: Revision control functions.
In the context of Gall agents, you'll likely just use marks for sending and
receiving data, and not for actually storing files in Clay. Therefore, it's
unlikely you'll need to write custom revision control functions in the `grad`
arm. Instead, you can simply delegate `grad` functions to another mark -
typically `%noun`. If you want to learn more about writing such `grad`
functions, you can refer to the [Marks Guide](/docs/arvo/clay/marks/marks) in
the Clay vane documentation, which is much more comprehensive, but it's not
necessary for our purposes here.
#### Example
Here's a very simple mark file for the `action` structure we created in the
[previous section](#sur):
```hoon
/-  todo
|_  =action:todo
++  grab
  |%
  ++  noun  action:todo
  --
++  grow
  |%
  ++  noun  action
  --
++  grad  %noun
--
```
We've imported the `/sur/todo.hoon` structure library from the previous section,
and we've defined the sample of the door as `=action:todo`, since that's what
it will handle. Now let's consider the arms:
- `grab`: This handles conversion methods _to_ our mark. It contains a core with
arm names corresponding to other marks. In this case, it can only convert from
a `noun` mark, so that's the core's only arm. The `noun` arm simply calls the
`action` structure from our structure library. This is called "clamming" or
"molding" - when some noun comes in, it gets called like `(action:todo [some-noun])` - producing data of the `action` type if it nests, and crashing
otherwise.
- `grow`: This handles conversion methods _from_ our mark. Like `grab`, it
contains a core with arm names corresponding to other marks. Here we've also
only added an arm for a `%noun` mark. In this case, `action` data will come in
as the sample of our door, and the `noun` arm simply returns it, since it's
already a noun (as everything is in Hoon).
- `grad`: This is the revision control arm, and as you can see we've simply
delegated it to the `%noun` mark.
This mark file could be saved as `/mar/todo/action.hoon`, and an `on-poke` arm
can then test for `%todo-action` rather than `%noun`, as the earlier example did:
```hoon
++  on-poke
  |=  [=mark =vase]
  |^  ^-  (quip card _this)
  ?+    mark  (on-poke:def mark vase)
      %todo-action
    ...
```
Note how `%todo-action` will be resolved to `/mar/todo/action.hoon` - the hyphen
will be interpreted as `/` if there's not already a `/mar/todo-action.hoon`.
This simple mark file isn't all that useful. Typically, you'd add `json` arms
to `grow` and `grab`, which allow your data to be converted to and from JSON,
and therefore allow your agent to communicate with a web front-end. Front-ends,
JSON, and Eyre's APIs which facilitate such communications will be covered in
the separate [Full-Stack Walkthrough](/docs/userspace/full-stack/1-intro),
which you might like to work through after completing this guide. For now
though, it's still useful to use marks and understand how they work.
One further note on marks - while data from remote ships must have a matching
mark file in `/mar`, it's possible to exchange data between local agents with
"fake" marks - ones that don't exist in `/mar`. Your `on-poke` arm could, for
example, use a made-up mark like `%foobar` for actions initiated locally. This
is because marks come into play only at validation boundaries, none of which are
crossed when doing local agent-to-agent communications.
## Permissions
In example agents so far, we haven't bothered to check where events such as
pokes are actually coming from - our example agents would accept data from
anywhere, including random foreign ships. We'll now have a look at how to handle
such permission checks.
Back in [lesson 2](/docs/userspace/gall-guide/2-agent#bowl) we discussed the
[bowl](/docs/arvo/gall/data-types#bowl). The `bowl` includes a couple of useful
fields: `our` and `src`. The `our` field just contains the `@p` of the local
ship. The `src` field contains the `@p` of the ship from which the event
originated, and is updated for every new event.
When messages come in over Ames from other ships on the network, they're
[encrypted](/docs/arvo/ames/cryptography) with our ship's public keys and signed by the ship which sent them.
The Ames vane decrypts and verifies the messages using keys in the Jael vane,
which are obtained from the [Azimuth Ethereum contract](/docs/azimuth/azimuth-eth) and [Layer 2 data](/docs/azimuth/l2/layer2) where Urbit ID ownership
and keys are recorded. This means the originating `@p` of every message is
cryptographically validated before being passed on to Gall, so the `@p`
specified in the `src` field of the `bowl` can be trusted to be correct, which
makes checking permissions very simple.
You're free to use whatever logic you want for this, but the most common way is
to use [wutgar](/docs/hoon/reference/rune/wut#-wutgar) (`?>`) and
[wutgal](/docs/hoon/reference/rune/wut#-wutgal) (`?<`) runes, which are
respectively True and False assertions that crash if they don't evaluate to the
expected truth value. To only allow messages from the local ship, you can just
do the following in the relevant agent arm:
```hoon
?>  =(src.bowl our.bowl)
```
A common permission is to allow messages from the local ship, as well as all of
its moons, which can be done with the `team:title` standard library function:
```hoon
?>  (team:title our.bowl src.bowl)
```
If we want to only allow messages from a particular set of ships, we could, for
example, have a `(set @p)` in our agent's state called `allowed`. Then, we can
use the `has:in` set function to check:
```hoon
?>  (~(has in allowed) src.bowl)
```
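These checks can also be combined. As a minimal sketch, assuming the same hypothetical `allowed` set in the agent's state, the following accepts the local ship, its moons, and any explicitly allowed ship:

```hoon
::  allow our own ship and its moons, plus anyone in the allowed set;
::  crash (and therefore nack) for everyone else
?>  ?|  (team:title our.bowl src.bowl)
        (~(has in allowed) src.bowl)
    ==
```

Wutbar (`?|`) is a logical OR, so the assertion passes as soon as either test is true.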
If we wanted to check a ship was allowed in a particular group in the Groups
app, we could scry our ship's `%group-store` agent and compare:
```hoon
?>  .^(? %gx /(scot %p our.bowl)/group-store/(scot %da now.bowl)/groups/ship/~bitbet-bolbel/urbit-community/join/(scot %p src.bowl)/noun)
```
There are many ways to handle permissions; which one is appropriate depends on
your particular use case.
## Summary
Type definitions:
- An agent's type definitions live in the `/sur` directory of a desk.
- The `/sur` file is a core, typically containing a number of lusbuc (`+$`)
arms.
- `/sur` files are imported with the fashep (`/-`) Ford rune at the beginning
of a file.
- Agent API types, for pokes and updates to subscribers, are commonly defined as
head-tagged unions such as `[%foo bar=baz]`.
Mark files:
- Mark files live in the `/mar` directory of a desk.
- A mark like `%foo` corresponds to a file in `/mar` like `/mar/foo.hoon`.
- Marks are file types in Clay, but are also used for passing data between
agents as well as for external data generally.
- A mark file is a door with a sample of the data type it handles and exactly three
arms: `grab`, `grow` and `grad`.
- `grab` and `grow` each contain a core with arm names corresponding to other marks.
- `grab` and `grow` define functions for converting to and from our mark,
respectively.
- `grad` defines revision control functions for Clay, but you'd typically just
delegate such functions to the `%noun` mark.
- Incoming data from remote ships will have their marks validated by the
corresponding mark file in `/mar`.
- Messages passed between agents on a local ship don't necessarily need mark
files in `/mar`.
- Mark files are most commonly used for converting an agent's native types to
JSON, in order to interact with a web front-end.
Permissions:
- The source of incoming messages from remote ships is cryptographically
  validated by Ames and provided to Gall, which then populates the `src` field
  of the `bowl` with the `@p`.
- Permissions are most commonly enforced with wutgar (`?>`) and wutgal (`?<`)
assertions in the relevant agent arms.
- Messages can be restricted to the local ship with `?> =(src.bowl our.bowl)` or to
its moons as well with `?> (team:title our.bowl src.bowl)`.
- There are many other ways to handle permissions; the right one depends on the
  needs of the particular agent.
## Exercises
- Have a quick look at the [tisket
documentation](/docs/hoon/reference/rune/tis#-tisket).
- Try writing a mark file for the `update:todo` type, in a similar fashion to
the `action:todo` one in the [mark file section](#mark-files). You can compare
yours to the one we'll use in the next lesson.

View File

@ -0,0 +1,911 @@
+++
title = "8. Subscriptions"
weight = 40
template = "doc.html"
+++
In this lesson we're going to look at subscriptions. Subscriptions are probably
the most complicated part of writing agents, so there's a fair bit to cover.
Before we get into the nitty-gritty details, we'll give a brief overview of
Gall's subscription mechanics.
The basic unit of subscriptions is the _path_. An agent will typically define a
number of subscription paths in its `on-watch` arm, and other agents (local or
remote) can subscribe to those paths. The agent will then send out updates
called `%fact`s on one or more of its paths, and _all_ subscribers of those
paths will receive them. An agent cannot send out updates to specific
subscribers; it can only target its paths. An agent can kick subscribers from
its paths, and subscribers can unsubscribe from any paths.
The subscription paths an agent defines can be simple and fixed like
`/foo/bar/baz`. They can also be dynamic, containing data of a particular atom
aura encoded in certain elements of the path. These paths can therefore be as
simple or complex as you need for your particular application.
Note it's not strictly necessary to define subscription paths explicitly. As
long as the arm doesn't crash, the subscription will succeed. In practice,
however, it's nearly always appropriate to define them explicitly and crash on
unrecognized paths.
For a deeper explanation of subscription mechanics in Arvo, you can refer to
Arvo's [Subscriptions](/docs/arvo/concepts/subscriptions) section.
## Incoming subscriptions
Subscription requests from other entities arrive in your agent's `on-watch` arm.
The `on-watch` arm takes the `path` to which they're subscribing, and produces a
`(quip card _this)`:
```hoon
++  on-watch
  |=  =path
  ^-  (quip card _this)
  ...
```
Your agent's subscription paths would be defined in this arm, typically in a
wutlus (`?+`) expression or similar:
```hoon
?+    path  (on-watch:def path)
    [%updates ~]
  ......
  ......
    [%blah %blah ~]
  ......
  ......
    [%foo @ ~]
  =/  when=@da  (slav %da i.t.path)
  ......
  ......
    [%bar %baz *]
  ?+    t.t.path  (on-watch:def path)
    ~              .....
    [%abc %def ~]  .....
    [%blah ~]      .....
  ==
==
```
Subscription paths can be simple and fixed like the first two examples above:
`/updates` and `/blah/blah`. They can also contain "wildcard" elements, with an
atom of a particular aura encoded in an element of the `path`, as in the `[%foo @ ~]` example. The type pattern matcher is quite limited, so we just specify
such variable elements as `@`, and then decode them with something like `(slav %da i.t.path)` (for a `@da`), as in the example. The incoming `path` in this
example would look like `/foo/~2021.11.14..13.30.39..6b17`. For more information
on decoding atoms in strings, see the [Strings
Guide](/docs/hoon/guides/strings#decoding-from-text).
In the last case of `[%bar %baz *]`, we're allowing a variable number of elements
in the path. First we check it's `/bar/baz/...something...`, and then we check
what the "something" is in another wutlus expression and handle it
appropriately. In this case, it could be `/bar/baz`, `/bar/baz/abc/def`, or
`/bar/baz/blah`. You could of course also have "wildcard" elements here too, so
there's not really a limit to the complexity of your subscription paths, or the
data that might be encoded therein.
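To make the encoding side of a "wildcard" element concrete, here is a small round-trip sketch using the standard library's `scot`, `slav` and `slaw` functions. The face names and the sample date are illustrative only:

```hoon
::  encode a @da as a path element (a knot), as a subscriber would
=/  elem=@ta  (scot %da ~2021.11.14..13.30.39..6b17)
::  decode it back, as the [%foo @ ~] case does with i.t.path
=/  when=@da  (slav %da elem)
::  slaw is the non-crashing variant: it produces a unit instead
[when (slaw %da ~.not-a-date)]
```

Since `slav` crashes on text that doesn't parse as the requested aura, using it in `on-watch` means a malformed path element rejects the subscription, which is usually the behavior you want.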
Permissions can be checked as described in the previous lesson, comparing the
source `@p` of the request in `src.bowl` to `our.bowl` or any other logic you
find appropriate.
If a permission check fails, the path is not valid, or any other reason you want
to reject the subscription request, your agent can simply crash. The behavior
here is the same as with `on-poke` - Gall will send a `%watch-ack` card in
response, which is either an ack (positive acknowledgement) or a nack (negative
acknowledgement). The `(unit tang)` in the `%watch-ack` will be null if
processing succeeded, and non-null if it crashed, with a stack trace in the
`tang`. Like with `%poke-ack`s, you don't need to explicitly send a
`%watch-ack` - Gall will do it automatically.
As well as sending a `%watch-ack`, Gall will also record the subscription in the
`sup` field of the `bowl`, if it succeeded. Then, when you send updates out to
subscribers of the `path` in question, the new subscriber will begin receiving
them as well.
Updates to subscribers would usually be sent from other arms, but there's one
special case for `on-watch` which is very useful. Normally updates can only be
sent to all subscribers of a particular path - you can't target a specific
subscriber. There's one exception to this: In `on-watch`, when there's a new
subscription, you can send a `%fact` back with an empty `(list path)`, and it'll
only go to the new subscriber. This is most useful when you want to give the
subscriber some initial state, which you otherwise couldn't do without sending
it to everyone. It might look something like this:
```hoon
:_  this
:~  [%give %fact ~ %todo-update !>(`update:todo`initial+tasks)]
==
```
## Sending updates to subscribers
Once your agent has subscribers, it's easy to send them out updates. All you
need to do is produce `card`s with `%fact`s in them:
```hoon
:_  this
:~  [%give %fact ~[/some/path /another/path] %some-mark !>('some data')]
    [%give %fact ~[/some/path] %some-mark !>('more data')]
    ....
==
```
The `(list path)` in the `%fact` specifies which subscription `path`s the
`%fact` should be sent on. All subscribers of all `path`s specified will receive
the `%fact`. Any agent arm which produces a `(quip card _this)` can send
`%fact`s to subscribers. Most often they will be produced in the `on-poke` arm,
since new data will often be added in `poke`s.
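For instance, a minimal sketch of an `on-poke` arm that relays whatever atom it is poked with to subscribers might look like this. The `/updates` path and the choice of the generic `%noun` mark are assumptions made for illustration:

```hoon
++  on-poke
  |=  [=mark =vase]
  ^-  (quip card _this)
  ?+    mark  (on-poke:def mark vase)
      %noun
    ::  extract the atom from the poke's vase
    =/  val=@  !<(@ vase)
    ::  send it to every subscriber of /updates; state is unchanged
    :_  this
    :~  [%give %fact ~[/updates] %noun !>(val)]
    ==
  ==
```

A real agent would normally define its own `mark` for updates, as the example later in this lesson does.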
## Kicking subscribers
To kick a subscriber, you just send a `%kick` `card`:
```hoon
[%give %kick ~[/some/path] `~sampel-palnet]
```
The `(list path)` specifies which subscription `path`s the ship should be kicked
from, and the `(unit ship)` specifies which ship to kick. The `(unit ship)` can also be null, like so:
```hoon
[%give %kick ~[/some/path] ~]
```
In this case, all subscribers to the specified `path`s will be kicked.
Note that `%kick`s are not exclusively sent by the agent itself - Gall itself
can also kick subscribers under certain network conditions. Because of this,
`%kick`s are not assumed to be intentional, and the usual behavior is for a
kicked agent to try and resubscribe. Therefore, if you want to disallow a
particular subscriber, your agent's `on-watch` arm should reject further
subscription requests from them - your agent should not just `%kick` them and
call it a day.
## Outgoing subscriptions
Now that we've covered incoming subscriptions, we'll look at the other side of
it: Subscribing to other agents. This is done by `%pass`ing the target agent a
`%watch` task in a `card`:
```hoon
[%pass /some/wire %agent [~some-ship %some-agent] %watch /some/path]
```
If your agent's subscription request is successful, updates will come in to your
agent's `on-agent` arm on the `wire` specified (`/some/wire` in this example).
The `wire` can be anything you like - its purpose is for your agent to figure
out which subscription the updates came from. The `[ship term]` pair specifies the
ship and agent you're trying to subscribe to, and the final `path` (`/some/path`
in this example) is the path you want to subscribe to - a `path` the target
agent has defined in its `on-watch` arm.
Gall will deliver the `card` to the target agent and call that agent's
`on-watch` arm, which will process the request [as described
above](#incoming-subscription-requests), accept or reject it, and send back
either a positive or negative `%watch-ack`. The `%watch-ack` will come back in
to your agent's `on-agent` arm in a `sign`, along with the `wire` you specified
(`/some/wire` in this example). Recall in the lesson on pokes, the `on-agent`
arm starts with:
```hoon
++  on-agent
  |=  [=wire =sign:agent:gall]
  ^-  (quip card _this)
  .....
```
The `sign` will be of the following format:
```hoon
[%watch-ack p=(unit tang)]
```
How you want to handle the `%watch-ack` really depends on the particular agent.
In the simplest case, you can just pass it to the `on-agent` arm of
`default-agent`, which will just accept it and do nothing apart from printing
the error in the `%watch-ack` `tang` if it's a nack. You shouldn't have your
agent crash on a `%watch-ack` - even if it's a nack your agent should process it
successfully. If you wanted to apply some additional logic on receipt of the
`%watch-ack`, you'd typically first test the `wire`, then test whether it's a
`%watch-ack`, then test whether it's an ack or a nack and do whatever's
appropriate:
```hoon
++  on-agent
  |=  [=wire =sign:agent:gall]
  ^-  (quip card _this)
  ?+    wire  (on-agent:def wire sign)
      [%expected %wire ~]
    ?+    -.sign  (on-agent:def wire sign)
        %watch-ack
      ?~  p.sign
        ...(do something if ack)...
      ...(do something if nack)...
  ......
```
The `on-agent` arm produces a `(quip card _this)`, so you can produce new
`card`s and update your agent's state, as appropriate.
One further thing to note with subscriptions is that you can subscribe multiple
times to the same `path` on the same ship and agent, as long as the `wire` is
unique. If the ship, agent, `path` and `wire` are all the same as an existing
subscription, Gall will not allow the request to be sent, and instead fail with
an error message fed into the `on-fail` arm of your agent.
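As a sketch, the following two `card`s both subscribe to the same `path` on the same placeholder ship and agent. Gall should treat them as separate subscriptions because their `wire`s differ; the wire names here are arbitrary:

```hoon
:_  this
:~  ::  first subscription, identified by /first/wire
    [%pass /first/wire %agent [~some-ship %some-agent] %watch /some/path]
    ::  second subscription to the same path, on a different wire
    [%pass /second/wire %agent [~some-ship %some-agent] %watch /some/path]
==
```

Updates for each subscription then arrive in `on-agent` on their own `wire`.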
## Receiving updates
Assuming the `%watch` succeeded, your agent will now begin receiving any
`%fact`s the other agent publishes on the `path` to which you've subscribed. These
`%fact`s will also come in to your agent's `on-agent` arm in a `sign`, just like
the initial `%watch-ack`. The `%fact` `sign` will have the following format:
```hoon
[%fact =cage]
```
You would typically handle such `%fact`s in the following manner: Test the
`wire`, test whether the `sign` is a `%fact`, test the `mark` in the `cage`,
extract the data from the `vase` in the `cage`, and apply your logic. Again, routing on `wire` before `sign` is one of the [Precepts](/docs/development/precepts#specifics). For example:
```hoon
++  on-agent
  |=  [=wire =sign:agent:gall]
  ^-  (quip card _this)
  ?+    wire  (on-agent:def wire sign)
      [%expected %wire ~]
    ?+    -.sign  (on-agent:def wire sign)
        %fact
      ?+    p.cage.sign  (on-agent:def wire sign)
          %expected-mark
        =/  foo  !<(expected-type q.cage.sign)
        .....
  ......
```
Note that Gall will not allow `sign`s to come into `on-agent` unsolicited, so
you don't necessarily need to include permission logic in this arm.
The `on-agent` arm produces a `(quip card _this)`, so you can produce new
`card`s and update your agent's state, as appropriate.
## Getting kicked
For whatever reason, the agent you're `%watch`ing might want to kick your agent
from a `path` to which it's subscribed, ending your subscription and ceasing to
send your agent `%fact`s. To do this, it will send your agent a `%kick` card [as
described above](#kicking-subscribers). The `%kick` will come in to your agent's
`on-agent` arm in a `sign`, like `%watch-ack`s and `%fact`s do. The `%kick`
`sign` will have the following format:
```hoon
[%kick ~]
```
Since the `%kick` itself contains no information, you'll need to consider the
`wire` it comes in on to know what it pertains to. As explained previously,
`%kick`s aren't always intentional - sometimes Gall will kick subscribers due to
network issues. Your `on-agent` arm therefore has no way to know whether the
other agent actually intended to kick it. This means _your agent should almost
always try to resubscribe if it gets kicked_. Then, if the resubscribe `%watch`
request is rejected with a negative `%watch-ack`, you can conclude that it was
intentional and give up. The logic would look something like this:
```hoon
++  on-agent
  |=  [=wire =sign:agent:gall]
  ^-  (quip card _this)
  ?+    wire  (on-agent:def wire sign)
      [%some %wire ~]
    ?+    -.sign  (on-agent:def wire sign)
        %kick
      ::  resubscribe to the agent that kicked us, on the same wire
      :_  this
      :~  [%pass /some/wire %agent [src.bowl %some-agent] %watch /some/path]
      ==
  .......
```
## Leaving a subscription
Eventually you may wish to unsubscribe from a `path` in another agent and stop
receiving updates. This is done by `%pass`ing a `%leave` task to the agent in
question:
```hoon
[%pass /some/wire %agent [~some-ship %some-agent] %leave ~]
```
The subscription to be ended is determined by the combination of the `wire`, ship
and agent, so the `%leave` task itself always just has `~` at the end.
## Example
Here we're going to give a pretty well fleshed out example. It will demonstrate
both inbound and outbound subscriptions, most of the concepts we've discussed
here, as well as some from the previous lesson - `/sur` files, `mark` files, and
permission checks.
In previous lessons we've only dealt with things on a local ship - this example
will demonstrate messages being sent over the network.
The example will be composed of two separate agents - a publisher called
`/app/todo.hoon` and a subscriber called `/app/todo-watcher.hoon`, which will
live on separate ships. It will be a very rudimentary To-Do app - to-do tasks
will be poked into the publisher and sent out to the subscriber as `%fact`s,
which will just print them to the dojo. It will have its types defined in
`/sur/todo.hoon`, and it will have a couple of `mark` files for pokes and
updates: `/mar/todo/action.hoon` and `/mar/todo/update.hoon`.
Before we get into trying it out, we'll first walk through the `/sur` file, mark
files, and each agent.
### Types and marks
#### `/sur/todo.hoon`
```hoon
|%
+$  id  @
+$  name  @t
+$  task  [=name done=?]
+$  tasks  (map id task)
+$  who  @p
+$  friends  (set who)
+$  action
  $%  [%add =name]
      [%del =id]
      [%toggle =id]
      [%rename =id =name]
      [%allow =who]
      [%kick =who]
  ==
+$  update
  $%  [%add =id =name]
      [%del =id]
      [%toggle =id]
      [%rename =id =name]
      [%initial =tasks]
  ==
--
```
This file defines most of the types for the agents. The list of to-do tasks will
be stored in the state of the publisher agent as the `tasks` type, a `(map id task)`, where a `task` is a `[=name done=?]`. The set of ships allowed to
subscribe will be stored in `friends`, a `(set @p)`, also in the publisher's
state. After that, there are the head-tagged unions of accepted poke `action`s
and `update`s for subscribers.
#### `/mar/todo/action.hoon`
```hoon
/-  todo
|_  =action:todo
++  grab
  |%
  ++  noun  action:todo
  --
++  grow
  |%
  ++  noun  action
  --
++  grad  %noun
--
```
This is a very simple mark file for the `action` type.
#### `/mar/todo/update.hoon`
```hoon
/-  todo
|_  =update:todo
++  grab
  |%
  ++  noun  update:todo
  --
++  grow
  |%
  ++  noun  update
  --
++  grad  %noun
--
```
This is a very simple mark file for the `update` type.
### Publisher
#### `/app/todo.hoon`
```hoon
/-  todo
/+  default-agent, dbug
|%
+$  versioned-state
    $%  state-0
    ==
+$  state-0  [%0 =friends:todo =tasks:todo]
+$  card  card:agent:gall
--
%-  agent:dbug
=|  state-0
=*  state  -
^-  agent:gall
|_  =bowl:gall
+*  this  .
    def   ~(. (default-agent this %.n) bowl)
::
++  on-init
  ^-  (quip card _this)
  `this
::
++  on-save
  ^-  vase
  !>(state)
::
++  on-load
  |=  old-state=vase
  ^-  (quip card _this)
  =/  old  !<(versioned-state old-state)
  ?-  -.old
    %0  `this(state old)
  ==
::
++  on-poke
  |=  [=mark =vase]
  ^-  (quip card _this)
  |^
  ?>  =(src.bowl our.bowl)
  ?+    mark  (on-poke:def mark vase)
      %todo-action
    =^  cards  state
      (handle-poke !<(action:todo vase))
    [cards this]
  ==
  ++  handle-poke
    |=  =action:todo
    ^-  (quip card _state)
    ?-    -.action
        %add
      ?:  (~(has by tasks) now.bowl)
        $(now.bowl (add now.bowl ~s0..0001))
      :_  state(tasks (~(put by tasks) now.bowl [name.action %.n]))
      :~  :*  %give  %fact  ~[/updates]  %todo-update
              !>(`update:todo`[%add now.bowl name.action])
          ==
      ==
    ::
        %del
      :_  state(tasks (~(del by tasks) id.action))
      :~  :*  %give  %fact  ~[/updates]  %todo-update
              !>(`update:todo`action)
          ==
      ==
    ::
        %toggle
      :-  :~  :*  %give  %fact  ~[/updates]  %todo-update
                  !>(`update:todo`action)
          ==  ==
      %=  state
        tasks  %+  ~(jab by tasks)
                 id.action
               |=(=task:todo task(done !done.task))
      ==
    ::
        %rename
      :-  :~  :*  %give  %fact  ~[/updates]  %todo-update
                  !>(`update:todo`action)
          ==  ==
      %=  state
        tasks  %+  ~(jab by tasks)
                 id.action
               |=(=task:todo task(name name.action))
      ==
    ::
        %allow
      `state(friends (~(put in friends) who.action))
    ::
        %kick
      :_  state(friends (~(del in friends) who.action))
      :~  [%give %kick ~[/updates] `who.action]
      ==
    ==
  --
::
++  on-watch
  |=  =path
  ^-  (quip card _this)
  ?+    path  (on-watch:def path)
      [%updates ~]
    ?>  (~(has in friends) src.bowl)
    :_  this
    :~  [%give %fact ~ %todo-update !>(`update:todo`initial+tasks)]
    ==
  ==
::
++  on-leave  on-leave:def
++  on-peek   on-peek:def
++  on-agent  on-agent:def
++  on-arvo   on-arvo:def
++  on-fail   on-fail:def
--
```
This is the publisher agent, `todo.hoon`. The bulk of its logic is in its
`on-poke` arm, where it handles the various possible actions like `%add`ing a
task, `%toggle`ing its "done" state, `%rename`ing a task, and so on. It also has
a couple of `action`s for `%allow`ing and `%kick`ing subscribers.
Most of these cases both update the state of the agent, as well as producing
`%fact` cards to send out to subscribers with the new data.
You'll notice it only allows these pokes from the local ship, and enforces this
in `on-poke` with:
```hoon
?>  =(src.bowl our.bowl)
```
Additionally, you might notice the `%add` case in `handle-poke` begins with the
following:
```hoon
?:  (~(has by tasks) now.bowl)
  $(now.bowl (add now.bowl ~s0..0001))
```
Back in lesson two, we mentioned that the bowl is only repopulated when there's
a new Arvo event, so simultaneous messages from a local agent or web client
would be processed with the same bowl. Since we're using `now.bowl` for the task
ID, this means multiple `%add` actions could collide. To handle this case, we
check if there's already an entry in the `tasks` map with the current date-time,
and if there is, we increase the time by a fraction of a second and try again.
Let's now look at `on-watch`:
```hoon
++  on-watch
  |=  =path
  ^-  (quip card _this)
  ?+    path  (on-watch:def path)
      [%updates ~]
    ?>  (~(has in friends) src.bowl)
    :_  this
    :~  [%give %fact ~ %todo-update !>(`update:todo`initial+tasks)]
    ==
  ==
```
When `on-watch` gets a subscription request, it checks whether the requesting
ship is in the `friends` set, and crashes if it is not. If they're in `friends`,
it produces a `%fact` card with a null `(list path)`, which means it goes only
to the new subscriber. This `%fact` contains the entire `tasks` map as it
currently exists, getting the new subscriber up to date.
### Subscriber
#### `/app/todo-watcher.hoon`
```hoon
/-  todo
/+  default-agent, dbug
|%
+$  versioned-state
    $%  state-0
    ==
+$  state-0  [%0 ~]
+$  card  card:agent:gall
--
%-  agent:dbug
=|  state-0
=*  state  -
^-  agent:gall
|_  =bowl:gall
+*  this  .
    def   ~(. (default-agent this %.n) bowl)
::
++  on-init
  ^-  (quip card _this)
  `this
::
++  on-save
  ^-  vase
  !>(state)
::
++  on-load
  |=  old-state=vase
  ^-  (quip card _this)
  =/  old  !<(versioned-state old-state)
  ?-  -.old
    %0  `this(state old)
  ==
::
++  on-poke
  |=  [=mark =vase]
  ^-  (quip card _this)
  ?>  =(src.bowl our.bowl)
  ?+    mark  (on-poke:def mark vase)
      %noun
    =/  action  !<(?([%sub @p] [%unsub @p]) vase)
    ?-    -.action
        %sub
      :_  this
      :~  [%pass /todos %agent [+.action %todo] %watch /updates]
      ==
        %unsub
      :_  this
      :~  [%pass /todos %agent [+.action %todo] %leave ~]
      ==
    ==
  ==
::
++  on-watch  on-watch:def
++  on-leave  on-leave:def
++  on-peek   on-peek:def
::
++  on-agent
  |=  [=wire =sign:agent:gall]
  ^-  (quip card _this)
  ?+    wire  (on-agent:def wire sign)
      [%todos ~]
    ?+    -.sign  (on-agent:def wire sign)
        %watch-ack
      ?~  p.sign
        ((slog '%todo-watcher: Subscribe succeeded!' ~) `this)
      ((slog '%todo-watcher: Subscribe failed!' ~) `this)
    ::
        %kick
      %-  (slog '%todo-watcher: Got kick, resubscribing...' ~)
      :_  this
      :~  [%pass /todos %agent [src.bowl %todo] %watch /updates]
      ==
    ::
        %fact
      ?+    p.cage.sign  (on-agent:def wire sign)
          %todo-update
        ~&  !<(update:todo q.cage.sign)
        `this
      ==
    ==
  ==
::
++  on-arvo  on-arvo:def
++  on-fail  on-fail:def
--
```
This is the subscriber agent. Since it's just for demonstrative purposes, it has
no state and just prints the updates it receives. In practice it would keep the
`tasks` map it receives in its own state, and then update it as it receives new
`%fact`s.
The `on-poke` arm is fairly simple - it accepts two pokes: `[%sub ~some-ship]` to subscribe to `%todo` on that ship, and `[%unsub ~some-ship]` to unsubscribe.
The `on-agent` arm will print whether a subscription request succeeded or
failed, as well as printing a message when it gets kicked. When it receives a
`%fact` from the publisher agent, it will just print it to the terminal with a
`~&` expression.
### Trying it out
We're going to try this between two different ships. The first ship will be the
usual fakezod. We'll add both `mark` files, the `/sur` file, and the `todo.hoon`
agent to the `%base` desk of our fakezod, putting them in the following
directories:
```
base
├── app
│   └── todo.hoon
├── mar
│   └── todo
│       ├── action.hoon
│       └── update.hoon
└── sur
    └── todo.hoon
```
In `~zod`'s dojo, we can `|commit %base`, and then start the `%todo` agent:
```
|rein %base [& %todo]
```
Now we need to spin up another fake ship. We'll use `~nut` in this example:
```
urbit -F nut
```
Once it's booted, we can `|mount %base` and then add just the `update.hoon` mark
file, the `/sur` file, and the `todo-watcher.hoon` agent like so:
```
base
├── app
│   └── todo-watcher.hoon
├── mar
│   └── todo
│       └── update.hoon
└── sur
    └── todo.hoon
```
On `~nut` we can then `|commit %base`, and start the `%todo-watcher` agent:
```
|rein %base [& %todo-watcher]
```
Now, on `~nut`, let's try subscribing:
```
> :todo-watcher [%sub ~zod]
>=
%todo-watcher: Subscribe failed!
```
Our `%todo-watcher` agent tried, but received a negative `%watch-ack` from
`%todo`, because we haven't yet added `~nut` to the `friends` set of allowed
ships. Let's now remedy that on `~zod`:
```
> :todo &todo-action [%allow ~nut]
>=
```
Let's also add a couple of to-do tasks, on `~zod`:
```
> :todo &todo-action [%add 'foo']
>=
> :todo &todo-action [%add 'bar']
>=
```
If we now check its state with `+dbug`, we'll see they're in the `tasks` map,
and `~nut` will also now be in the `friends` set:
```
> [ %0
friends={~nut}
tasks
{ [ p=170.141.184.505.349.079.206.522.766.950.035.095.552
q=[name='foo' done=%.n]
]
[ p=170.141.184.505.349.079.278.538.984.166.386.565.120
q=[name='bar' done=%.n]
]
}
]
> :todo +dbug
>=
```
Let's now try subscribing again on `~nut`:
```
> :todo-watcher [%sub ~zod]
>=
%todo-watcher: Subscribe succeeded!
[ %initial
tasks
{ [ p=170.141.184.505.349.079.206.522.766.950.035.095.552
q=[name='foo' done=%.n]
]
[ p=170.141.184.505.349.079.278.538.984.166.386.565.120
q=[name='bar' done=%.n]
]
}
]
```
As you can see, this time it's worked, and we've immediately received the
initial `tasks` map.
Now, let's try adding another task on `~zod`:
```
> :todo &todo-action [%add 'baz']
>=
```
On `~nut`, we'll see it has received the `%fact` with the new task in it:
```
[ %add
id=170.141.184.505.349.082.779.030.192.959.445.270.528
name='baz'
]
```
Let's try toggling its done state on `~zod`:
```
> :todo &todo-action [%toggle 170.141.184.505.349.082.779.030.192.959.445.270.528]
>=
```
`~nut` will again get the `%fact`:
```
[ %toggle
id=170.141.184.505.349.082.779.030.192.959.445.270.528
]
```
Recall that incoming subscriptions are stored in `sup.bowl`, and outgoing
subscriptions are stored in `wex.bowl`. Let's have a look at the incoming
subscription on `~zod`:
```
> [ path=/updates
from=~nut
duct=~[/gall/sys/req/~nut/todo /ames/bone/~nut/1 //ames]
]
> :todo +dbug [%incoming %ship ~nut]
>=
```
On `~nut`, let's look at the outgoing subscription:
```
> [wire=/todos agnt=[~zod %todo] path=/updates ackd=%.y]
> :todo-watcher +dbug [%outgoing %ship ~zod]
>=
```
Now on `~zod`, let's try kicking `~nut` and removing it from our `friends` set:
```
> :todo &todo-action [%kick ~nut]
>=
```
On `~nut`, we'll see it got the `%kick`, tried resubscribing automatically, but
was rejected because `~nut` is no longer in `friends`:
```
%todo-watcher: Got kick, resubscribing...
%todo-watcher: Subscribe failed!
```
## Summary
- Incoming subscription requests arrive in an agent's `on-watch` arm.
- An agent will define various subscription `path`s in its `on-watch` arm, which
others can subscribe to.
- Gall will automatically produce a negative `%watch-ack` if `on-watch` crashed,
and a positive one if it was successful.
- Incoming subscribers are recorded in the `sup` field of the `bowl`.
- `on-watch` can produce a `%fact` with a null `(list path)` which will go only
to the new subscriber.
- Updates are sent to subscribers in `%fact` cards, and contain a `cage` with a
`mark` and some data in a `vase`.
- `%fact`s are sent to all subscribers of the paths specified in the `(list path)`.
- A subscriber can be kicked from subscription paths with a `%kick` card
specifying the ship in the `(unit ship)`. All subscribers of the specified
paths will be kicked if the `(unit ship)` is null.
- An outgoing subscription can be initiated with a `%watch` card.
- The `%watch-ack` will come back in to the subscriber's `on-agent` arm as a
`sign`, and may be positive or negative, depending on whether the `(unit tang)` is null.
- `%kick`s will also arrive in the subscriber's `on-agent` arm as a `sign`.
Since kicks may not be intentional, the subscriber should attempt to
resubscribe and only give up if the subsequent `%watch-ack` is negative.
- `%fact`s will also arrive in the subscriber's `on-agent` arm.
- All such `sign`s that arrive in `on-agent` will also have a `wire`.
- The `wire` for subscription updates to arrive on is specified in the initial
`%watch` card.
- A subscriber can unsubscribe by passing a `%leave` card on the original
`wire`.
## Exercises
- Have a look at the [Strings Guide](/docs/hoon/guides/strings) if you're not
already familiar with decoding/encoding atoms in strings.
- Try running through the [example](#example) yourself, if you've not done so
already.
- Try modifying `%todo-watcher` to record the data it receives in its state,
rather than simply printing it to the terminal.
- If you'd like, try going back to [lesson
6](/docs/userspace/gall-guide/6-pokes) (on pokes) and modifying the agents
with an appropriate permission system, and also try running them on separate
ships.


@ -0,0 +1,284 @@
+++
title = "9. Vanes"
weight = 45
template = "doc.html"
+++
In this lesson we're going to look at interacting with vanes (kernel modules).
The API for each vane consists of `task`s it can take, and `gift`s it can
return. The `task`s and `gift`s for each vane are defined in its section of
`lull.hoon`. Here are the `task:iris`s and `gift:iris`s for Iris, the HTTP client
vane, as an example:
```hoon
|%
+$  gift
  $%  [%request id=@ud request=request:http]
      [%cancel-request id=@ud]
      [%http-response =client-response]
  ==
+$  task
  $~  [%vega ~]
  $%  $>(%born vane-task)
      $>(%trim vane-task)
      $>(%vega vane-task)
      [%request =request:http =outbound-config]
      [%cancel-request ~]
      [%receive id=@ud =http-event:http]
  ==
```
The API of each vane is documented in its respective section of the [Arvo
documentation](/docs/arvo/overview). Each vane has a detailed API reference and
examples of its usage. There are far too many `task`s and `gift`s across the
vanes to cover here, so in the [`Example`](#example) section of this document,
we'll just look at a single, simple example with a Behn timer. The basic pattern
in the example is broadly applicable to the other vanes as well.
## Sending a vane task
A `task` can be sent to a vane by `%pass`ing it an `%arvo` card. We touched on
these in the [Cards](/docs/userspace/gall-guide/5-cards) lesson, but we'll
briefly recap it here. The type of the card is as follows:
```hoon
[%pass path %arvo note-arvo]
```
The `path` will just be the `wire` you want the response to arrive on. The
`note-arvo` is the following union:
```hoon
+$  note-arvo
  $~  [%b %wake ~]
  $%  [%a task:ames]
      [%b task:behn]
      [%c task:clay]
      [%d task:dill]
      [%e task:eyre]
      [%g task:gall]
      [%i task:iris]
      [%j task:jael]
      [%$ %whiz ~]
      [@tas %meta vase]
  ==
```
The letter tags just specify which vane it goes to, and then follows the `task`
itself. Here are a couple of examples. The first sends a `%wait` `task:behn` to
Behn, setting a timer to go off one minute in the future. The second sends a
`%warp` `task:clay` to Clay, asking whether `sys.kelvin` exists on the `%base`
desk.
```hoon
[%pass /some/wire %arvo %b %wait (add ~m1 now.bowl)]
[%pass /some/wire %arvo %c %warp our.bowl %base ~ %sing %u da+now.bowl /sys/kelvin]
```
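Behn's `task:behn` also includes a `%rest` task, which cancels a timer previously set with `%wait`. As a hedged sketch, with an arbitrary example date and wire, setting and then cancelling a timer could look like this:

```hoon
::  set a timer for a fixed, arbitrary date
[%pass /some/wire %arvo %b %wait ~2030.1.1]
::  cancel it again: same wire, same @da
[%pass /some/wire %arvo %b %rest ~2030.1.1]
```

As with the examples above, each of these is a `card` your agent would produce from one of its arms.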
## Receiving a vane gift
Once a `task` has been sent to a vane, any `gift`s the vane sends back in
response will arrive in the `on-arvo` arm of your agent. The `on-arvo` arm
exclusively handles such vane `gift`s. The `gift`s will arrive in a `sign-arvo`,
along with the `wire` specified in the original request. The `on-arvo` arm
produces a `(quip card _this)` like usual, so it would look like:
```hoon
++  on-arvo
  |=  [=wire =sign-arvo]
  ^-  (quip card _this)
  .....
```
A `sign-arvo` is the following structure, defined in `lull.hoon`:
```hoon
+$  sign-arvo
  $%  [%ames gift:ames]
      $:  %behn
          $%  gift:behn
              $>(%wris gift:clay)
              $>(%writ gift:clay)
              $>(%mere gift:clay)
              $>(%unto gift:gall)
          ==
      ==
      [%clay gift:clay]
      [%dill gift:dill]
      [%eyre gift:eyre]
      [%gall gift:gall]
      [%iris gift:iris]
      [%jael gift:jael]
  ==
```
The head of the `sign-arvo` will be the name of the vane like `%behn`, `%clay`,
etc. The tail will be the `gift` itself. Here are a couple of `sign-arvo`
examples, which are the responses to the example `task`s in the previous section:
```hoon
[%behn %wake ~]
```
```
[ %clay
[ %writ
p
[ ~
[ p=[p=%u q=[%da p=~2021.11.17..13.55.00..c195] r=%base]
q=/sys/kelvin
r=[p=%flag q=[#t/?(%.y %.n) q=0]]
]
]
]
]
```
The typical pattern is to first test the `wire` with something like a wutlus
(`?+`) expression, and then test the `sign-arvo`. Since most `gift`s are
head-tagged, you can test both the vane and the gift at the same time like:
```hoon
?+    sign-arvo  (on-arvo:def wire sign-arvo)
    [%behn %wake *]
  .....
....
```
## Example
Here's a very simple example that takes a poke of a `@dr` (a relative date-time
value) and sends Behn a `%wait` `task:behn`, setting a timer to go off `@dr` in
the future. When the timer goes off, `on-arvo` will take the `%wake` `gift:behn`
and print "Ding!" to the terminal.
#### `ding.hoon`
```hoon
/+  default-agent, dbug
|%
+$  card  card:agent:gall
--
%-  agent:dbug
^-  agent:gall
|_  =bowl:gall
+*  this  .
    def   ~(. (default-agent this %.n) bowl)
++  on-init  on-init:def
++  on-save  on-save:def
++  on-load  on-load:def
++  on-poke
  |=  [=mark =vase]
  ^-  (quip card _this)
  ?+    mark  (on-poke:def mark vase)
      %noun
    :_  this
    :~  [%pass /timers %arvo %b %wait (add now.bowl !<(@dr vase))]
    ==
  ==
++  on-watch  on-watch:def
++  on-leave  on-leave:def
++  on-peek   on-peek:def
++  on-agent  on-agent:def
++  on-arvo
  |=  [=wire =sign-arvo]
  ^-  (quip card _this)
  ?+    wire  (on-arvo:def wire sign-arvo)
      [%timers ~]
    ?+    sign-arvo  (on-arvo:def wire sign-arvo)
        [%behn %wake *]
      ?~  error.sign-arvo
        ((slog 'Ding!' ~) `this)
      (on-arvo:def wire sign-arvo)
    ==
  ==
++  on-fail  on-fail:def
--
```
Let's examine the `on-poke` arm:
```hoon
++  on-poke
  |=  [=mark =vase]
  ^-  (quip card _this)
  ?+    mark  (on-poke:def mark vase)
      %noun
    :_  this
    :~  [%pass /timers %arvo %b %wait (add now.bowl !<(@dr vase))]
    ==
  ==
```
A Behn `%wait` task has the format `[%wait @da]` - the `@da` (an absolute
date-time value) is the time the timer should go off. The `vase` of the poke
contains a `@dr`, so we extract it directly into an `add` expression, producing a
`@da` that is that `@dr` from now. Behn will receive the `%wait` task and set the timer
in Unix. When it fires, Behn will produce a `%wake` `gift:behn` and deliver it
to `on-arvo`, on the `wire` we specified (`/timers`). Here's the `on-arvo` arm:
```hoon
++  on-arvo
  |=  [=wire =sign-arvo]
  ^-  (quip card _this)
  ?+    wire  (on-arvo:def wire sign-arvo)
      [%timers ~]
    ?+    sign-arvo  (on-arvo:def wire sign-arvo)
        [%behn %wake *]
      ?~  error.sign-arvo
        ((slog 'Ding!' ~) `this)
      (on-arvo:def wire sign-arvo)
    ==
  ==
```
We remark that, just like in the case of agent-agent communication, `gift`s from Arvo are also routed `wire` before `sign-arvo`.
First we check the `wire` is `/timers`, and then we check the `sign-arvo` begins
with `[%behn %wake ....]`. Behn's `%wake` gift has the following format:
```hoon
[%wake error=(unit tang)]
```
The `error` is null if the timer fired successfully, and contains an error in
the `tang` if it did not. We therefore test whether `error.sign-arvo` is `~`,
and if it is, we print `Ding!` to the terminal. If the `wire`, `sign-arvo` or
`error` is something unexpected, we pass it to the `on-arvo` arm of `default-agent`, which
will just crash and print an error message.
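As a variation, and only as a sketch, the `%wake` case could also set a fresh timer each time it fires, turning the agent into a repeating timer. The five-second interval is an arbitrary assumption; a real agent would more likely keep the interval in its state:

```hoon
++  on-arvo
  |=  [=wire =sign-arvo]
  ^-  (quip card _this)
  ?+    wire  (on-arvo:def wire sign-arvo)
      [%timers ~]
    ?+    sign-arvo  (on-arvo:def wire sign-arvo)
        [%behn %wake *]
      ::  pass timer errors to default-agent, as before
      ?^  error.sign-arvo
        (on-arvo:def wire sign-arvo)
      ::  print, then set the next timer five seconds from now
      %-  (slog 'Ding!' ~)
      :_  this
      :~  [%pass /timers %arvo %b %wait (add now.bowl ~s5)]
      ==
    ==
  ==
```

Note that this sketch has no way to stop the timer once it has been started.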
Let's try it out. Save the agent above as `/app/ding.hoon` on the `%base` desk
and `|commit %base`. Then, start the agent with `|rein %base [& %ding]`.
Next, in the dojo let's try poking our agent, setting a timer for five seconds
from now:
```
> :ding ~s5
>=
```
After approximately five seconds, we see the timer fired successfully:
```
> Ding!
```
## Summary
- Each vane has an API composed of `task`s it takes and `gift`s it produces.
- Each vane's `task`s and `gift`s are defined in `lull.hoon`.
- Each vane's section of the [Arvo documentation](/docs/arvo/overview) includes
an API reference that explains its `task`s and `gift`s, as well as an Examples
section demonstrating their usage.
- Vane `task`s can be sent to vanes by `%pass`ing them an `%arvo` `card`.
- Vane `gift`s come back to the `on-arvo` arm of the agent core in a
`sign-arvo`.
## Exercises
- Run through the [Example](#example) yourself if you've not done so already.
- Have a look at some vane sections of `lull.hoon` to familiarize yourself with
its structure.
- Have a quick look at the API reference sections of a couple of vanes in the
[Arvo documentation](/docs/arvo/overview).


@ -0,0 +1,45 @@
+++
title = "Gall Guide"
weight = 5
sort_by = "weight"
template = "sections/docs/chapters.html"
insert_anchor_links = "right"
+++
## Table of Contents
- [Introduction](/docs/userspace/gall-guide/intro)
#### Lessons
1. [Arvo](/docs/userspace/gall-guide/1-arvo) - This lesson provides an
overview of the Arvo operating system, and some other useful background
information.
2. [The Agent Core](/docs/userspace/gall-guide/2-agent) - This lesson goes over
the basic structure of a Gall agent.
3. [Imports and Aliases](/docs/userspace/gall-guide/3-imports-and-aliases) -
This lesson covers some useful libraries, concepts and boilerplate commonly
used when writing Gall agents.
4. [Lifecycle](/docs/userspace/gall-guide/4-lifecycle) - This lesson introduces
the state management arms of an agent.
5. [Cards](/docs/userspace/gall-guide/5-cards) - This lesson covers `card`s -
the structure used to pass messages to other agents and vanes.
6. [Pokes](/docs/userspace/gall-guide/6-pokes) - This lesson covers sending and
receiving one-off messages called "pokes" between agents.
7. [Structures and Marks](/docs/userspace/gall-guide/7-sur-and-marks) - This
lesson talks about importing type definitions, and writing `mark` files.
8. [Subscriptions](/docs/userspace/gall-guide/8-subscriptions) - This lesson
goes through the mechanics of subscriptions - both inbound and outbound.
9. [Vanes](/docs/userspace/gall-guide/9-vanes) - This lesson explains how to
interact with vanes (kernel modules) from an agent.
10. [Scries](/docs/userspace/gall-guide/10-scry) - This lesson gives an overview
of scrying Gall agents, and how scry endpoints are defined in agents.
11. [Failure](/docs/userspace/gall-guide/11-fail) - This lesson covers how Gall
handles certain errors and crashes, as well as the concept of a helper core.
12. [Next Steps](/docs/userspace/gall-guide/12-next-steps) - The Gall Guide is
now complete - here are some things you can look at next.
#### Appendix
- [Types](/docs/userspace/gall-guide/types) - A reference for a few of
the types used in the Gall Guide.


@ -0,0 +1,83 @@
+++
title = "Introduction"
weight = 1
template = "doc.html"
+++
This guide will walk through everything you need to know to write your own Gall
agents.
The Gall Guide is suitable for anyone with an intermediate knowledge of Hoon. If
you've worked through [Hoon School](/docs/hoon/hoon-school/intro) or something
equivalent, you should be fine.
## What are Gall agents?
Gall is one of the eight vanes (kernel modules) of Arvo, Urbit's operating
system. Gall's purpose is to manage userspace applications called _agents_.
An agent is a piece of software that is primarily focused on maintaining and
distributing a piece of state with a defined structure. It exposes an interface
that lets programs read, subscribe to, and manipulate the state. Every event
happens in an atomic transaction, so the state is never inconsistent. Since the
state is permanent, when the agent is upgraded with a change to the structure of
the state, the developer provides a migration function from the old state type
to the new state type.
It's not too far off to think of an agent as simply a database with
developer-defined logic. But an agent is significantly less constrained than a
database. Databases are usually tightly constrained in one or more ways because
they need to provide certain guarantees (like atomicity) or optimizations (like
indexes). Arvo is a [single-level store](/docs/arvo/overview#single-level-store), so atomicity comes for free. Many
applications don't use databases because they need relational indices; rather,
they use them for their guarantees around persistence. Some do need the indices,
though, and it's not hard to imagine an agent which provides a SQL-like
interface.
On the other hand, an agent is also a lot like what many systems call a
"service". An agent is permanent and addressable -- a running program can talk
to an agent just by naming it. An agent can perform [IO](/blog/io-in-hoon), unlike most databases.
This is a critical part of an agent: it performs IO along the same transaction
boundaries as changes to its state, so if an effect happens, you know that the
associated state change has happened.
But the best way to think about an agent is as a state machine. Like a state
machine, any input could happen at any time, and it must react coherently to
that input. Output (effects) and the next state of the machine are a pure
function of the previous state and the input event.
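The state-machine view can be sketched as a plain gate. This is a toy illustration only, not a real Gall agent: the counter state, the `%inc`/`%reset` events, and the text "effects" are all hypothetical names chosen for the example.

```hoon
::  hypothetical toy state machine (not a Gall agent):
::  takes the previous state and an input event,
::  produces a list of effects and the next state
|=  [state=@ud event=?(%inc %reset)]
^-  [effects=(list @t) state=@ud]
?-  event
  %inc    [~['incremented'] +(state)]
  %reset  [~['reset'] 0]
==
```

The gate never mutates anything; it returns the effects and the new state together, which is the same shape a real agent arm returns.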
## Table of Contents
#### Lessons
1. [Arvo](/docs/userspace/gall-guide/1-arvo) - This lesson provides an
overview of the Arvo operating system, and some other useful background
information.
2. [The Agent Core](/docs/userspace/gall-guide/2-agent) - This lesson goes over
the basic structure of a Gall agent.
3. [Imports and Aliases](/docs/userspace/gall-guide/3-imports-and-aliases) -
This lesson covers some useful libraries, concepts and boilerplate commonly
used when writing Gall agents.
4. [Lifecycle](/docs/userspace/gall-guide/4-lifecycle) - This lesson introduces
the state management arms of an agent.
5. [Cards](/docs/userspace/gall-guide/5-cards) - This lesson covers `card`s -
the structure used to pass messages to other agents and vanes.
6. [Pokes](/docs/userspace/gall-guide/6-pokes) - This lesson covers sending and
receiving one-off messages called "pokes" between agents.
7. [Structures and Marks](/docs/userspace/gall-guide/7-sur-and-marks) - This
lesson talks about importing type definitions, and writing `mark` files.
8. [Subscriptions](/docs/userspace/gall-guide/8-subscriptions) - This lesson
goes through the mechanics of subscriptions - both inbound and outbound.
9. [Vanes](/docs/userspace/gall-guide/9-vanes) - This lesson explains how to
interact with vanes (kernel modules) from an agent.
10. [Scries](/docs/userspace/gall-guide/10-scry) - This lesson gives an overview
of scrying Gall agents, and how scry endpoints are defined in agents.
11. [Failure](/docs/userspace/gall-guide/11-fail) - This lesson covers how Gall
handles certain errors and crashes, as well as the concept of a helper core.
12. [Next Steps](/docs/userspace/gall-guide/12-next-steps) - The Gall Guide is
now complete - here are some things you can look at next.
#### Appendix
- [Types](/docs/userspace/gall-guide/types) - A reference for a few of
the types used in the Gall Guide.


@ -0,0 +1,228 @@
+++
title = "Appendix: Types"
weight = 65
template = "doc.html"
+++
This document explains a few of the types commonly used in Gall agents. In
addition to these, the [Data Types](/docs/arvo/gall/data-types) section of the
Gall vane documentation is a useful reference. In particular, the whole
[`agent`](/docs/arvo/gall/data-types#agent) subsection, as well as
[`bowl`](/docs/arvo/gall/data-types#bowl),
[`boat`](/docs/arvo/gall/data-types#boat), and
[`bitt`](/docs/arvo/gall/data-types#bitt).
## `vase`
Vases are used to encapsulate _dynamically typed_ data - they let typed data be
moved around in contexts where you can't know the type ahead of time, and
therefore can't have a _static_ type.
Vases are used extensively - almost all data your agent will send
and receive is wrapped in a vase.
A vase is just a cell with data in the tail and the type of the data in the
head. Its formal definition is:
```hoon
+$ vase [p=type q=*]
```
Here's what it looks like if we bunt a vase in the dojo:
```
> *vase
[#t/* q=0]
```
There are two simple runes used to create and unpack vases. We'll look at each
of these next.
### Create a `vase`
The [zapgar](https://urbit.org/docs/hoon/reference/rune/zap#-zapgar) rune (`!>`)
takes a single argument of any noun, and wraps it in a vase. For example, in the
dojo:
```
> !>([1 2 3])
[#t/[@ud @ud @ud] q=[1 2 3]]
> !>('foo')
[#t/@t q=7.303.014]
> !>([[0xdead 0xb33f] 'foo'])
[#t/[[@ux @ux] @t] q=[[57.005 45.887] 7.303.014]]
> !>(foo='bar')
[#t/foo=@t q=7.496.034]
```
You would typically use `!>` as part of a [`cage`](#cage) when you're
constructing a `card` like a poke or a `%fact` `gift` to be sent off.
### Extract data from `vase`
The [zapgal](https://urbit.org/docs/hoon/reference/rune/zap#-zapgal) rune (`!<`)
takes two arguments: A mold specifying the type to try and extract the data as,
and the vase to be extracted.
Let's look at an example in the dojo. First, let's create a vase of `[@t @ux @ud]`:
```
> =myvase !>(['foo' 0xabcd 123])
> myvase
[#t/[@t @ux @ud] q=[7.303.014 43.981 123]]
```
Next, let's try extracting our vase:
```
> !<  [@t @ux @ud]  myvase
['foo' 0xabcd 123]
```
Now let's try asking for a `@p` rather than `@t`:
```
> !<  [@p @ux @ud]  myvase
-need.@p
-have.@t
nest-fail
```
As you can see, it will crash if the type does not nest. Note that
rather than using `!<`, you can also just clam the tail of the vase like:
```
> ((trel @t @ux @ud) +.myvase)
[p='foo' q=0xabcd r=123]
```
The only problem is that you can't tell if the auras were wrong:
```
> ((trel @p @ud @ux) +.myvase)
[p=~sibtel-tallyd q=43.981 r=0x7b]
```
You'd typically use `!<` on the data in `card`s that come in from other ships,
agents, etc.
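As a minimal sketch of the full round trip, the two runes can be combined in wide form. The face `packed` is just an illustrative name:

```hoon
::  wrap a noun in a vase, then unpack it with a matching mold
=/  packed=vase  !>(42)
!<(@ud packed)
```

Since the type stored in the vase (`@ud`) nests in the requested mold, this should produce `42`.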
## `mark`
The `mark` type is just a `@tas` like `%foo`, and specifies the Clay filetype of
some data. The `mark` corresponds to a mark file in the `/mar` directory, so a
`mark` of `%foo` corresponds to `/mar/foo/hoon`. Mark files are used for saving
data in Clay, validating data sent between agents or over the network, and
converting between different data types. For more information about mark files,
you can refer to the [Marks section of the Clay
documentation](/docs/arvo/clay/marks/marks).
## `cage`
A `cage` is a cell of a [`mark`](#mark) and a [`vase`](#vase), like `[%foo !>('bar')]`. The data in the vase should match the data type of the specified
mark.
Most data an agent sends will be in a `cage`, and most data it receives will
arrive in a `cage`. The `mark` may be used to validate or convert the data in
the `vase`, depending on the context.
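A short sketch of building a `cage` and taking it apart again follows. It assumes the generic `%noun` mark, which accepts any noun, and relies on `cage` having the faces `p` (the mark) and `q` (the vase):

```hoon
::  build a cage of a mark and a vase
=/  =cage  [%noun !>(42)]
::  p.cage is the mark, q.cage is the vase
[p.cage !<(@ud q.cage)]
```

This should produce the cell `[%noun 42]`.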
## `quip`
`quip` is a mold-builder. A `(quip a b)` is equivalent to `[(list a) b]`; it's
just a more convenient way to specify it. Most arms of an agent return a `(quip card _this)`, which is a list of effects and a new state.
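As a small illustration outside of an agent, here is a cast to a `quip` of text "effects" and a numeric state. The values are made up for the example; a real agent would use `card` and its own state type instead:

```hoon
::  a list of items in the head, a single state value in the tail
^-  (quip @t @ud)
[~['first effect' 'second effect'] 42]
```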
## `path`
The `path` type is formally defined as:
```hoon
+$ path (list knot)
```
A knot is a `@ta` text atom (see the [Strings guide](/docs/hoon/guides/strings)
for details), so a `path` is just a list of text. Rather than having to write
`[~.foo ~.bar ~.baz ~]` though, it has its own syntax which looks like
`/foo/bar/baz`.
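To see that the two notations describe the same noun, you can cast the explicit list to a `path`. This is only a sketch for experimenting in the dojo:

```hoon
::  a null-terminated list of @ta knots, cast to path
^-  path
[~.foo ~.bar ~.baz ~]
```

The dojo should print the result using the path syntax, `/foo/bar/baz`.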
A `path` is similar to a filesystem path in Unix, giving data a location in a
nested hierarchy. In Arvo though, they're not only used for files, but are a
more general type used for several different purposes. Its elements have no
inherent significance; their meaning depends on the context. In a Gall agent, a `path` is
most commonly a subscription path - you might subscribe for updates to
`/foo/bar` on another agent, or another agent might subscribe to `/baz` on your
agent.
A `path` might just be a series of fixed `@ta` like `/foo/bar`, but some
elements might also be variable and include encoded atoms, or some other datum. For
example, you might like to include a date in the path like
`/updates/~2021.10.31..07.24.27..db68`. Other agents might create the path by
doing something like:
```hoon
/updates/(scot %da now.bowl)
```
Then, when you get a subscription request, you might do something like:
```hoon
?+    path  !!
    [%updates @ ~]
  =/  date=@da  (slav %da i.t.path)
  ...(rest of code)...
```
See the [Encoding in text](/docs/hoon/guides/strings#encoding-in-text) and
[Decoding from text](/docs/hoon/guides/strings#decoding-from-text) sections of
the Strings guide for more information on dealing with atoms encoded in strings.
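The encode and decode steps above can be tried together outside of an agent. In this sketch a fixed date stands in for `now.bowl`, and `encoded` is an illustrative face:

```hoon
::  encode a date as a knot, as when building the path
=/  encoded=@ta  (scot %da ~2021.10.31)
::  decode it again, as when handling the subscription request;
::  +slav returns a bare atom, so cast it back to @da
`@da`(slav %da encoded)
```

This should give back the original date, `~2021.10.31`.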
Aside from using function calls when constructing a `path` as demonstrated
above, you can also insert text you've previously stored with `=/` or what have
you, simply by enclosing it in brackets. For example, in the dojo:
```
> =const ~.bar
> `path`/foo/[const]/baz
/foo/bar/baz
```
## `wire`
The type of a wire is formally defined as:
```hoon
+$ wire path
```
So, a `wire` is just a [`path`](#path); type-wise they're exactly the same. The
reason there's a separate `wire` type is just to differentiate their purpose. A
`wire` is a path for responses to requests an agent initiates. If you subscribe
to the `path` `/some/path` on another agent, you also specify `/some/wire`.
Then, when that agent sends out updates to subscribers of `/some/path`, your
agent receives them on `/some/wire`.
More formally, `wire`s are used by Arvo to represent an event cause, and
therefore return path, in a call stack called a
[`duct`](/docs/arvo/overview#duct). Inter-vane communications happen over
`duct`s as [`move`](/docs/arvo/overview#moves)s, and Gall converts the `card`s
produced by agents into such `move`s behind the scenes. A detailed understanding
of this system is not necessary to write Gall agents, but if you're interested
it's comprehensively documented in the [Arvo overview](/docs/arvo/overview) and
[move trace tutorial](/docs/arvo/tutorials/move-trace).
For agents, the `wire` is specified in the second argument of a `%pass` `card`.
It's used for anything you can `%pass`, such as `%poke`s, `%watch`es, and
`%arvo` notes. For example:
```hoon
[%pass /this/is/wire %agent [~zod %foobar] %watch /this/is/path]
::
[%pass /this/is/wire %agent [~zod %foobar] %poke %foo !>('hello')]
::
[%pass /this/is/wire %arvo %b %wait (add now.bowl ~m1)]
```
The `on-agent` and `on-arvo` arms of the agent core include a `wire` in their
respective sample. Responses from agents come in to the former, and responses
from vanes come in to the latter.
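As a rough sketch of how a response is matched to its request, the gate below takes the same kind of sample as `on-agent` (a `wire` and a `sign:agent:gall`) and branches on the `wire`. It is a standalone illustration rather than a real agent arm: it ignores the `sign` and returns a text message instead of a `quip`.

```hoon
::  hypothetical standalone gate, not an actual +on-agent arm
|=  [=wire =sign:agent:gall]
^-  @t
?+  wire  'unexpected wire'
  [%this %is %wire ~]  'response on /this/is/wire'
==
```

In a real agent the matching branch would go on to inspect the `sign` and produce cards and a new state.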


@ -0,0 +1,131 @@
---
title: Introduction to Hoon
nodes: 100, 103
objectives:
- "Explain what an Urbit ship is."
- "Distinguish a fakeship from a liveship."
- "Pronounce ASCII characters per standard Hoon developer practice."
---
# Introduction to Hoon
Hoon School is designed to teach you Hoon without assuming you have an extensive programming background. You should be able to follow most of it even if you have no programming experience at all yet, though of course experience helps. We strongly encourage you to try out all the examples of each lesson. Hoon School is meant for the beginner, but it's not meant to be skimmed. Each lesson consists of:
- **Explanations**, which are prose-heavy commentary on the Hoon fundamentals.
- **Exercises**, which challenge you to clarify or expand your own understanding in practice.
- **Tutorials**, which are line-by-line commentary on example programs.
There are two flavors of Hoon School: the Hoon School Live cohort class, in which you work through lessons with other students and receive a certification (`%gora`) for completion, and these written Hoon School docs. To sign up for a future cohort of Hoon School Live, please [let us know of your interest here](https://forms.gle/bbW6QtJPMhsjCCML8) and we'll be in touch.
<!-- TODO point to HSL/ASL landing pages -->
## Why Hoon?
The short version is that Hoon uses Urbit's provisions and protocols to enable very fast application development with shared primitives, sensible affordances, and straightforward distribution.
Urbit consists of an identity protocol (“Azimuth”, or “Urbit ID”) and a system protocol (“Arvo”, or “Urbit OS”). These two parts work hand-in-hand to build your hundred-year computer.
1. **Urbit ID (Azimuth)** is a general-purpose public-key infrastructure (PKI) on the Ethereum blockchain, used as a platform for Urbit identities. It provides a system of scarce and immutable identities which are cryptographically secure.
2. **Urbit OS (Arvo)** is an operating system which provides the software for the personal server platform that constitutes the day-to-day usage of Urbit. Arvo works over a [peer-to-peer](https://en.wikipedia.org/wiki/Peer-to-peer) [end-to-end-encrypted](https://en.wikipedia.org/wiki/End-to-end_encryption) network to interact with other Urbit ships (or unique instances).
Arvo is an axiomatic operating system which restricts itself to pure mathematical functions, making it [deterministic](https://en.wikipedia.org/wiki/Deterministic_algorithm) and [functional-as-in-programming](https://en.wikipedia.org/wiki/Functional_programming). Such strong guarantees require an operating protocol, the [Nock virtual machine](https://urbit.org/docs/nock/definition), which will be persistent across hardware changes and always provide an upgrade path for necessary changes.
It's hard to write a purely functional operating system on hardware which doesn't make such guarantees, so Urbit OS uses a new language, Hoon, which compiles to Nock and hews to the necessary conceptual models for a platform like Urbit. [The Hoon overview](https://urbit.org/docs/hoon/overview) covers more of the high-level design decisions behind the language, as does [developer ~rovnys-ricfer's explanation](https://urbit.org/blog/why-hoon/).
Hoon School introduces and explains the fundamental concepts you need in order to understand Hoon's semantics. It then introduces a number of key examples and higher-order abstractions which will make you a more fluent Hoon programmer.
Once you have completed Hoon School, you should work through the [Gall Guide](https://urbit.org/docs/userspace/gall-guide/intro) to learn how to build full applications on Urbit.
## Environment Setup
An Urbit ship is a particular realization of an _identity_ and an _event log_ or _state_. Both of these are necessary.
Since live network identities (_liveships_) are finite, scarce, and valuable, most developers prefer to write new code using fake identities (_fakeships_ or _fakezods_). A fakeship is also different from a comet, which is an unkeyed liveship.
Two fakeships can communicate with each other on the same machine, but have no awareness of the broader Urbit network. We won't need to use this capability in Hoon School Live, but it will be helpful later when you start developing networked apps.
Before beginning, you'll need to get a development ship running and configure an appropriate editor. See the [Environment Setup](https://urbit.org/docs/development/environment) guide for details.
Once you have a `dojo>` prompt, the system is ready to go and waiting on input.
## Getting started
Once you've created your development ship, let's try a basic command. Type `%-  add  [2 2]` at the prompt and hit `Return`. (Note the double spaces before and after `add`.) Your screen now shows:
```hoon
fake: ~zod
ames: czar: ~zod on 31337 (localhost only)
http: live (insecure, public) on 80
http: live (insecure, loopback) on 12321
> %-  add  [2 2]
4
~zod:dojo>
```
You just used a function from the Hoon standard library, `add`, which for reasons that will become clear later is frequently written [`++add`](https://urbit.org/docs/hoon/reference/stdlib/1a#add). Next, quit Urbit by entering `|exit`:
```hoon
> %-  add  [2 2]
4
~zod:dojo> |exit
$
```
Your ship isn't running anymore and you're back at your computer's normal terminal prompt. If your ship is ~zod, then you can restart the ship by typing:
```hoon
urbit zod
```
You've already used a standard library function to produce one value, in the Dojo. Now that your ship is running again, let's try another. Enter the number `17`.
(We won't show the `~zod:dojo>` prompt from here on out. We'll just show the echoed command along with its result.)
You'll see:
```hoon
> 17
17
```
You asked Dojo to evaluate `17` and it echoed the number back at you. This value is a _noun_. We'll talk more about nouns in the next lesson.
Basically, every Hoon expression operates on the values it is given until it reduces to some form that can't evaluate any further. This is then returned as the result of the evaluation.
One more:
```hoon
> :-  1  2
[1 2]
```
This `:-` rune takes two values and composes them into a _cell_, a pair of values.
## Pronouncing Hoon
Hoon uses _runes_, or two-character ASCII symbols, to describe its structure. (These are analogous to keywords in other programming languages.) Because there has not really been a standard way of pronouncing, say, `#` (hash, pound, number, sharp, hatch) or `!` (exclamation point, bang, shriek, pling), the authors of Urbit decided to adopt a one-syllable mnemonic to uniquely refer to each.
It is highly advisable for you to learn these pronunciations, as the documentation and other developers employ them frequently. For instance, a rune like `|=` is called a “bartis”, and you will find it designated as such in the docs, in the source code, and among the developers.
| Name | Character | Name | Character | Name | Character |
| ---- | ----- | ---- | ----- | ---- | ----- |
| `ace` | `␣` | `gap` | `␣␣`, `\n` | `pat` | `@` |
| `bar` | `|` | `gar` | `>` | `sel` | `[` |
| `bas` | `\` | `hax` | `#` | `ser` | `]` |
| `buc` | `$` | `hep` | `-` | `sig` | `~` |
| `cab` | `_` | `kel` | `{` | `soq` | `'` |
| `cen` | `%` | `ker` | `}` | `tar` | `*` |
| `col` | `:` | `ket` | `^` | `tic` | `\`` |
| `com` | `,` | `lus` | `+` | `tis` | `=` |
| `doq` | `"` | `mic` | `;` | `wut` | `?` |
| `dot` | `.` | `pal` | `(` | `zap` | `!` |
| `fas` | `/` | `pam` | `&` | |
| `gal` | `<` | `par` | `)` | |
Note that the list includes two separate whitespace forms: `ace` for a single space `␣`; `gap` is either two or more spaces `␣␣` or a line break `\n`. In Hoon, the only whitespace significance is the distinction between `ace` and `gap`—i.e., the distinction between one space and more than one.


@ -0,0 +1,561 @@
---
title: Hoon Syntax
nodes: 110, 113
objectives:
- "Distinguish nouns, cells, and atoms."
- "Apply auras to transform an atom."
- "Identify common Hoon molds, such as cells, lists, and tapes."
- "Pin a face to the subject."
- "Make a decision at a branch point."
- "Distinguish loobean from boolean operations."
- "Slam a gate (call a function)."
---
# Hoon Syntax
_This module will discuss the fundamental data concepts of Hoon and how programs effect control flow._
The study of Hoon can be divided into two parts: syntax and semantics.
1. The **syntax** of a programming language is the set of rules that determine what counts as admissible code in that language. It determines which characters may be used in the source, and also how these characters may be assembled to constitute a program. Attempting to run a program that doesn't follow these rules will result in a syntax error.
2. The **semantics** of a programming language concerns the meaning of the various parts of that language's code.
In this lesson we will give a general overview of Hoon's syntax. By the end of it, you should be familiar with all the basic elements of Hoon code.
## Hoon Elements
An [**expression**](https://en.wikipedia.org/wiki/Expression_%28computer_science%29) is a combination of characters that a language interprets and evaluates to produce a value. All Hoon programs are built of expressions, rather like mathematical equations. Hoon expressions are built along a backbone of _runes_, which are two-character symbols that act like keywords in other programming languages to define the syntax, or grammar, of the expression.
Runes are the building blocks of all Hoon code, represented as a pair of non-alphanumeric ASCII characters. Runes form expressions; runes are used how keywords are used in other languages. In other words, all computations in Hoon ultimately require runes. Runes and other Hoon expressions are all separated from one another by either two spaces or a line break.
All runes take a fixed number of “children” or “daughters”. Children can themselves be runes with children, and Hoon programs work by chaining through these until a value—not another rune—is arrived at. For this reason, we very rarely need to close expressions. Keep this scheme in mind when examining Hoon code.
Hoon expressions can be either basic or complex. Basic expressions of Hoon are fundamental, meaning that they can't be broken down into smaller expressions. Complex expressions are made up of smaller expressions (which are called **subexpressions**).
The Urbit operating system hews to a conceptual model wherein each expression takes place in a certain context (the _subject_). While sharing a lot of practicality with other programming paradigms and platforms, Urbit's model is mathematically well-defined and unambiguously specified. Every expression of Hoon is evaluated relative to its subject, a piece of data that represents the environment, or the context, of an expression.
At its root, Urbit is completely specified by [Nock](https://urbit.org/docs/nock/definition), sort of a machine language for the Urbit virtual machine layer and event log. However, Nock code is basically unreadable (and unwriteable) for a human. [One worked example](https://urbit.org/docs/nock/example) yields, for decrementing a value by one, the Nock formula:
```hoon
[8 [1 0] 8 [1 6 [5 [0 7] 4 0 6] [0 6] 9 2 [0 2] [4 0 6] 0 7] 9 2 0 1]
```
This is like reading binary machine code: we mortals need a clearer vernacular.
Hoon serves as Urbit's practical programming language. Everything in Urbit OS is written in Hoon, and many of the ancillary tools as well.
Any operation in Urbit ultimately results in a value. Much like machine language designates any value as a command, an address, or a number, a Hoon value is interpreted per the Nock rules and results in a basic data value at the end. So what are our data values in Hoon, and how do they relate to each other?
## Nouns
Think about a child persistently asking you what a thing is made of. At first, you may respond, “plastic”, or “metal”. Eventually, the child may wear you down to a more fundamental level: atoms and molecules (bonded atoms).
In a very similar sense, everything in a Hoon program is an atom or a bond. Metaphorically, a Hoon program is a complex molecule, a digital chemistry that describes one mathematical representation of data.
The most general data category in Hoon is a _noun_. This is just about as broad as saying “thing”, so let's be more specific:
> A noun is an atom or a cell.
Progress? We can say, in plain English, that
- An _atom_ is a non-negative integer (0 to +∞), e.g. `42`.
- A _cell_ is a pair of two nouns, written in square brackets, e.g. `[0 1]`.
_Everything_ in Hoon (and Nock, and Urbit) is a noun. The Urbit OS itself is a noun. So given any noun, the Urbit VM simply applies the Nock rules to change the noun in well-defined mechanical ways.
### Atoms
If an atom is a non-negative number, how do we represent anything else? Hoon provides each atom an _aura_, a tag which lets you treat a number as text, time, date, Urbit address, IP address, and much more.
An aura always begins with `@` pat, which denotes an atom (as opposed to a cell, `^` ket, or the general noun, `*` tar). The next letter or letters tells you what kind of representation you want the value to have.
For instance, to change the representation of a regular decimal number like `32` to a binary representation (i.e. for 2⁵), use `@ub`:
```
> `@ub`32
0b10.0000
```
(The tic marks are a shorthand which we'll explain later.)
Aura values are all designed to be [URL-safe](https://developers.google.com/maps/url-encoding), so the European-style thousands separator `.` dot is used instead of the English `,` com. `1.000` is one thousand, not one with a fractional part of zero.
While there are dozens of auras for specialized applications, here are the most important ones for you to know:
| Aura | Meaning | Example | Comment |
| ---- | ------- | ------- | ------- |
| `@` | Empty aura | `100` | (displays as `@ud`) |
| `@da` | Date (absolute) | `~2022.2.8..16.48.20..b53a` | Epoch calculated from 292 billion B.C. |
| `@p` | Ship name | `~zod` | |
| `@rs` | Number with fractional part | `.3.1415` | Note the preceding `.` dot. |
| `@t` | Text (“cord”) | `'hello'` | One of Urbit's several text types; only UTF-8 values are valid. |
| `@ub` | Binary value | `0b1100.0101` | |
| `@ud` | Decimal value | `100.000` | Note that German-style thousands separator is used, `.` dot. |
| `@ux` | Hexadecimal value | `0x1f.3c4b` | |
Hearkening back to our discussion of interchangeable representations in Lesson -1, you can see that these are all different-but-equivalent ways of representing the same underlying data values.
There's a special value that recurs in many contexts in Hoon: `~` sig is the null or zero value.
The [`^-` kethep](https://urbit.org/docs/hoon/reference/rune/ket#--kethep) rune is useful for ensuring that everything in the second child matches the type (aura) of the first, e.g.
```
^-  @ux  0x1ab4
```
We will use `^-` kethep extensively to enforce type constraints, a very useful tool in Hoon code.
#### Exercise: Aura Conversions
Convert between some of the given auras at the Dojo prompt, e.g.:
- `100` to `@p`
- `0b1100.0101` to `@p`
- `0b1100.0101` to `@ux`
- `0b1100.0101` to `@ud`
- `~` to any other aura
### Cells
A cell is a pair of two nouns. Cells are traditionally written using square brackets: `[]`. For now, just recall the square brackets and that cells are always _pairs_ of values.
```
[1 2]
[@p @t]
[[1 2] [3 4]]
```
This is actually a shorthand for a rune as well, [`:-` colhep](https://urbit.org/docs/hoon/reference/rune/col#--colhep):
```
:-  1  2
```
produces a cell `[1 2]`. You can chain these together:
```
:-  1  :-  2  3
```
to produce `[1 [2 3]]` or `[1 2 3]`.
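The two notations name the same noun because cells associate to the right. A quick way to check this at the Dojo is to enter the fully bracketed form and see how it is printed back:

```hoon
> [1 [2 3]]
[1 2 3]

> :-  1  :-  2  3
[1 2 3]
```

The Dojo drops the redundant inner brackets on the right-hand side when it displays a cell.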
We deal with cells in more detail below.
> ### Hoon as Noun
>
> We mentioned earlier that everything in Urbit is a noun, including the program itself. This is true, but getting from the rune expression in Hoon to the numeric expression requires a few more tools than we currently are prepared to introduce.
>
> For now, you can preview the structure of the Urbit OS as a noun by typing `.` dot at the Dojo prompt. This displays a summary of the structure of the operating function itself as a noun.
{: .callout}
## Verbs (Runes)
The backbone of any Hoon expression is a scaffolding of _runes_, which are essentially mathematical relationships between daughter components. If nouns are nouns, then runes are verbs: they describe how nouns relate. Runes provide the structural and logical relationship between noun values.
A rune is just a pair of ASCII characters (a digraph). We usually pronounce runes by combining their characters' names, e.g.: “kethep” for `^-`, “bartis” for `|=`, and “barcen” for `|%`.
For instance, when we called a function earlier (in Hoon parlance, we _slammed a gate_), we needed to provide the [`%-` cenhep](https://urbit.org/docs/hoon/reference/rune/cen#-cenhep) rune with two bits of information, a function name and the values to associate with it:
```hoon
%-
add
[1 2]
```
The operation you just completed is straightforward enough: `1 + 2`, in many languages, or `(+ 1 2)` in a [Lisp dialect](https://en.wikipedia.org/wiki/Lisp_%28programming_language%29) like [Clojure](https://en.wikipedia.org/wiki/Clojure). Literally, we can interpret `%-  add  [1 2]` as “evaluate the `add` core on the input values `[1 2]`”.
[`++add`](https://urbit.org/docs/hoon/reference/stdlib/1a#add) expects precisely two values (or _arguments_), which are provided by `%-` in the neighboring child expression as a cell. There's really no limit to the complexity of Hoon expressions: they can track deep and wide. They also don't care much about layout, which leaves you a lot of latitude. The only hard-and-fast rule is that there are single spaces (`ace`s) and everything else (`gap`s).
```hoon
%-
add
[%-(add [1 2]) 3]
```
(Notice that inside of the `[]` cell notation we are using a slightly different form of the `%-` rune call. In general, there are several ways to use many runes, and we will introduce these gradually. We'll see more expressive ways to write Hoon code after you're comfortable using runes.)
For instance, here are some of the standard library functions which have a similar architecture in common:
- [`++add`](https://urbit.org/docs/hoon/reference/stdlib/1a#add) (addition)
- [`++sub`](https://urbit.org/docs/hoon/reference/stdlib/1a#sub) (subtraction, positive results only—what happens if you subtract past zero?)
- [`++mul`](https://urbit.org/docs/hoon/reference/stdlib/1a#mul) (multiplication)
- [`++div`](https://urbit.org/docs/hoon/reference/stdlib/1a#div) (integer division, no remainder)
- [`++pow`](https://urbit.org/docs/hoon/reference/stdlib/1a#pow) (power or exponentiation)
- [`++mod`](https://urbit.org/docs/hoon/reference/stdlib/1a#mod) (modulus, remainder after integer division)
- [`++dvr`](https://urbit.org/docs/hoon/reference/stdlib/1a#dvr) (integer division with remainder)
- [`++max`](https://urbit.org/docs/hoon/reference/stdlib/1a#max) (maximum of two numbers)
- [`++min`](https://urbit.org/docs/hoon/reference/stdlib/1a#min) (minimum of two numbers)
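Here is a short Dojo session exercising a few of these gates. It uses the compact `%-( )` form of the rune shown above; the particular numbers are arbitrary.

```hoon
> %-(sub [5 3])
2

> %-(div [7 2])
3

> %-(mod [7 2])
1

> %-(dvr [7 2])
[p=3 q=1]

> %-(pow [2 5])
32

> %-(max [2 5])
5
```

Note how `++div` discards the remainder, `++mod` keeps only the remainder, and `++dvr` returns both as a cell.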
### Rune Expressions
Any Hoon program is architected around runes. If you have used another programming language, you can see these as analogous to keywords, although they also make explicit what most language syntax parsers leave implicit. Hoon aims at a parsimony of representation while leaving latitude for aesthetics. In other words, Hoon strives to give you a unique characteristic way of writing a correct expression, but it leaves you flexibility in how you lay out the components to maximize readability.
We are only going to introduce a handful of runes in this lesson, but by the time we're done with Hoon School, you'll know the twenty-five or so runes that yield 80% of the capability.
#### Exercise: Identifying Unknown Runes
Here is a lightly-edited snippet of Hoon code. Anything written after a `::` colcol is a _comment_ and is ignored by the computer. (Comments are useful for human-language explanations.)
```hoon
%-  send
::  forwards compatibility with next-dill
?@  p.kyz  [%txt p.kyz ~]
?:  ?=  %hit  -.p.kyz
  [%txt ~]
?.  ?=  %mod  -.p.kyz
  p.kyz
=/  =@c
  ?@  key.p.kyz  key.p.kyz
  ?:  ?=  ?(%bac %del %ret)  -.key.p.kyz
    `@`-.key.p.kyz
  ~-
?:  ?=  %met  mod.p.kyz  [%met c]  [%ctl c]
```
1. Mark each rune.
2. For each rune, find its corresponding children. (You don't need to know what a rune does to identify how things slot together.)
3. Consider these questions:
- Is every pair of punctuation marks a rune?
- How can you tell a rune from other kinds of marks?
One clue: every rune in Hoon (except for one, not in the above code) has _at least one child_.
#### Exercise: Inferring Rune Behavior
Here is a snippet of Hoon code:
```hoon
^-  list
:~  [hen %lsip %e %init ~]
    [hen %lsip %d %init ~]
    [hen %lsip %g %init ~]
    [hen %lsip %c %init ~]
    [hen %lsip %a %init ~]
==
```
Without looking it up first, what does the [`==` tistis](https://urbit.org/docs/hoon/reference/rune/terminators#-tistis) do for the [`:~` colsig](https://urbit.org/docs/hoon/reference/rune/col#-colsig) rune? Hint: some runes can take any number of arguments.
> Most runes are used at the beginning of a complex expression, but there are exceptions. For example, the runes [`--` hephep](https://urbit.org/docs/hoon/reference/rune/terminators#-hephep) and [`==` tistis](https://urbit.org/docs/hoon/reference/rune/terminators#-tistis) are used at the end of certain expressions.
#### Aside: Writing Incorrect Code
At the Dojo, you can attempt to operate using the wrong values; for instance, `++add` doesn't know how to add three numbers at the same time.
```hoon
> %-  add  [1 2 3]
-need.@
-have.[@ud @ud]
nest-fail
dojo: hoon expression failed
```
So this statement above is _syntactically_ correct (for the `%-` rune) but in practice fails because the expected input arguments don't match. Any time you see a `need`/`have` pair, this is what it means.
### Rune Families
Runes are classified by family (with the exceptions of `--` hephep and `==` tistis). The first of the two symbols indicates the family; for example, the `^-` kethep rune is in the `^` ket family of runes, and the `|=` bartis and `|%` barcen runes are in the `|` bar family. The runes of a particular family usually have related meanings. Two simple examples: the runes in the `|` bar family are all used to create cores, and the runes in the `:` col family are all used to create cells.
Rune expressions are usually complex, which means they usually have one or more subexpressions. The appropriate syntax varies from rune to rune; after all, they're used for different purposes. To see the syntax rules for a particular rune, consult the rune reference. Nevertheless, there are some general principles that hold of all rune expressions.
Runes generally have a fixed number of expected children, and thus do not need to be closed. In other languages you'll see an abundance of terminators, such as opening and closing parentheses, and this way of doing things is largely absent from Urbit. That's because most runes take a fixed number of children. Children of runes can themselves be runes (with more children), and Hoon programs work by chaining through these series of children until a value (not another rune) is arrived at. This makes Hoon code nice and neat to look at.
### Tall and Wide Forms
We call rune expressions separated by `gap`s **tall form** and those using parentheses **wide form**. Tall form is usually used for multi-line expressions, and wide form is used for one-line expressions. Most runes can be used in either tall or wide form. Tall form expressions may contain wide form subexpressions, but wide form expressions may not contain tall form.
The spacing rules differ in the two forms. In tall form, each rune and subexpression must be separated from the others by a `gap`: two or more spaces, or a line break. In wide form the rune is immediately followed by parentheses `( )`, and the various subexpressions inside the parentheses must be separated from the others by an `ace`: a single space.
Seeing an example will help you understand the difference. The `:-` colhep rune is used to produce a cell. Accordingly, it is followed by two subexpressions: the first defines the head of the cell, and the second defines the tail. Here are three different ways to write a `:-` colhep expression in tall form:
```hoon
> :-  11  22
[11 22]
> :-  11
  22
[11 22]
> :-
  11
  22
[11 22]
```
All of these expressions do the same thing. The first example shows that, if you want to, you can write tall form code on a single line. Notice that there are two spaces between the `:-` colhep rune and `11`, and also between `11` and `22`. This is the minimum spacing necessary between the various parts of a tall form expression—any fewer will result in a syntax error.
Usually one or more line breaks are used to break up a tall form expression. This is especially useful when the subexpressions are themselves long stretches of code. The same `:-` colhep expression in wide form is:
```hoon
> :-(11 22)
[11 22]
```
This is the preferred way to write an expression on a single line. The rune itself is followed by a set of parentheses, and the subexpressions inside are separated by a single space. Any more spacing than that results in a syntax error.
Nearly all rune expressions can be written in either form, but there are exceptions. `|%` barcen and `|_` barcab expressions, for example, can only be written in tall form. (Those are a bit too complicated to fit comfortably on one line anyway.)
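As a small sketch of mixing the two forms (not part of the original lesson, and with arbitrary numbers), the tall-form `:-` colhep below takes two wide-form `%-` cenhep expressions as its children:

```hoon
::  Tall-form colhep whose two children are wide-form cenhep calls.
:-  %-(add [1 2])
%-(mul [2 3])
```

This should produce the cell `[3 6]`. The reverse arrangement, a tall-form expression inside wide-form parentheses, is the one that is not permitted.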
### Nesting Runes
Since runes take a fixed number of children, one can visualize how Hoon expressions are built by thinking of each rune being followed by a series of boxes to be filled—one for each of its children. Let us illustrate this with the `:-` colhep rune.
![Colhep rune with two empty boxes for children.](https://media.urbit.org/docs/hoon-syntax/cell1.png)
Here we have drawn the `:-` colhep rune followed by a box for each of its two children. We can fill these boxes with either a value or an additional rune. The following figure corresponds to the Hoon expression `:- 2 3`.
![Colhep rune with two boxes for children containing 2 and 3.](https://media.urbit.org/docs/hoon-syntax/cell2.png)
This, of course, evaluates to the cell `[2 3]`.
The next figure corresponds to the Hoon expression `:- :- 2 3 4`.
![Colhep rune with two boxes for children, one containing a colhep rune with two boxes for children containing 2 and 3, and 4.](https://media.urbit.org/docs/hoon-syntax/cell3.png)
This evaluates to `[[2 3] 4]`, and we can think of the second `:-` colhep as being “nested” inside of the first `:-` colhep.
What Hoon expression does the following figure correspond to, and what does it evaluate to?
![Colhep rune with two boxes for children containing 2 and a colhep rune with two boxes for children containing 3 and 4.](https://media.urbit.org/docs/hoon-syntax/cell4.png)
This represents the Hoon expression `:- 2 :- 3 4`, and evaluates to `[2 [3 4]]`. (If you input this into dojo it will print as `[2 3 4]`, which we'll consider later.)
Thinking in terms of such “LEGO brick” diagrams can be a helpful learning and debugging tactic.
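To make the diagrams concrete, here is a sketch of the last nested expression typed into the Dojo three ways. The particular line layouts are illustrative choices; all three should parse to the same thing.

```hoon
> :-  2  :-  3  4
[2 3 4]
> :-  2
  :-  3
  4
[2 3 4]
> :-(2 :-(3 4))
[2 3 4]
```

In each case the inner `:-` colhep fills the second box of the outer one, so the result is `[2 [3 4]]`, which the Dojo prints as `[2 3 4]`.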
## Preserving Values with Faces
A Hoon expression is evaluated against a particular subject, which includes Hoon definitions and the standard library, as well as any user-specified values which have been made available. Unlike many procedural programming languages, a Hoon expression only knows what it has been told explicitly. This means that as soon as we calculate a value, it returns and falls back into the ether.
```
%-  sub  [5 1]
```
Right now, we don't have a way of preserving values for subsequent use in a more complicated Hoon expression.
We are going to store the value as a variable, or in Hoon, “pin a face to the subject”. Hoon faces aren't exactly like variables in other programming languages, but for now we can treat them that way, with the caveat that they are only accessible to daughter or sister expressions.
When we used `++add` or `++sub` previously, we wanted an immediate answer. There's not much more to say than `5 + 1`. In contrast, pinning a face accepts three daughter expressions: a name (or face), a value, and the rest of the expression.
```hoon
=/  perfect-number  28
%-  add  [perfect-number 10]
```
This yields `38`, but if you attempt to refer to `perfect-number` again on the next line, the Dojo fails to locate the value.
```hoon
> =/  perfect-number  28
  %-  add  [perfect-number 10]
38
> perfect-number
-find.perfect-number
dojo: hoon expression failed
```
This syntax is a little bit strange in the Dojo because the face is not available to subsequent expressions, although it works quite well in long-form code. The Dojo offers a workaround to retain named values:
```
> =perfect-number 28
> %-  add  [perfect-number 10]
38
> perfect-number
38
```
The difference is that the Dojo “pin” is permanent until deleted:
```
=perfect-number
```
rather than only effective for the daughter expressions of a `=/` tisfas rune. (We also won't be able to use this Dojo-style pin in a regular Hoon program.)
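Because the third child of `=/` tisfas is simply the rest of the expression, that child can itself be another `=/` tisfas. A minimal sketch, using two hypothetical faces `base` and `offset` that do not appear elsewhere in this lesson:

```hoon
::  Each =/ pins one face; the final expression can see both of them.
=/  base  7
=/  offset  10
%-  add  [base offset]
```

This should yield `17`. The inner `=/` tisfas is a daughter expression of the outer one, which is why `base` is still visible when `%-  add` is evaluated.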
### Exercise: A Large Power of Two
Create two numbers named `two` and `twenty`, with appropriate values, using the `=/` tisfas rune.
Then use these values to calculate 2²⁰ with `++pow` and `%-` cenhep.
## Containers & Basic Data Structures
Atoms are well and fine for relatively simple data, but we already know about cells as pairs of nouns. How else can we think of collections of data?
### Cells
A cell is formally a pair of two objects, but as long as the second (right-hand) object is a cell, these can be written stacked together:
```hoon
> [1 [2 3]]
[1 2 3]
> [1 [2 [3 4]]]
[1 2 3 4]
```
This convention keeps the notation from getting too cluttered. For now, let's call this a “running cell” because it consists of several cells run together.
Since almost all cells branch rightwards, the pretty-printer (the printing routine that the Dojo uses) prefers to omit `[]` brackets marking the rightmost cells in a running cell. These read to the right—that is, `[1 2 3]` is the same as `[1 [2 3]]`.
#### Exercise: Comparing Cells
Enter the following cells:
```hoon
[1 2 3]
[1 [2 3]]
[[1 2] 3]
[[1 2 3]]
[1 [2 [3]]]
[[1 2] [3 4]]
[[[1 2] [3 4]] [[5 6] [7 8]]]
```
Note which are the same as each other, and which are not. We'll look at the deeper structure of cells later when we consider trees.
### Lists
A running cell which terminates in a `~` sig (null) atom is a list.
- What is `~`'s value? Try casting it to another aura.
`~` is the null value, and here acts as a list terminator.
Lists are ubiquitous in Hoon, and many specialized tools exist to work with them. (For instance, to apply a gate to each value in a list, or to sum up the values in a list, etc.) We'll see more of them in a future lesson.
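As an illustrative sketch, the `:~` colsig rune from the earlier exercise builds exactly this kind of null-terminated running cell. The `^-` kethep cast to `(list @ud)` is an addition here so that the Dojo prints the result as a list:

```hoon
::  :~ collects its children and appends the ~ terminator for us.
^-  (list @ud)
:~  1
    2
    3
==
```

This should print as `~[1 2 3]`, the same value as `[1 2 3 ~]` cast to `(list @ud)`.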
#### Exercise: Making a List from a Null-Terminated Cell
You can cast a null-terminated running cell to explicitly designate it as a list containing a particular type of data. Sometimes you have to clear the existing aura using a more general aura (like `@`) before the conversion can work.
```hoon
> `(list @ud)`[1 2 3 ~]
~[1 2 3]
> `(list @ux)`[1 2 3 ~]
mint-nice
-need.?(%~ [i=@ux t=it(@ux)])
-have.[@ud @ud @ud %~]
nest-fail
dojo: hoon expression failed
> `(list @)`[1 2 3 ~]
~[1 2 3]
> `(list @ux)``(list @)`[1 2 3 ~]
~[0x1 0x2 0x3]
```
### Text
There are two ways to represent text in Urbit: cords (`@t` aura atoms) and tapes (lists of individual characters). Both of these are commonly called [“strings”](https://en.wikipedia.org/wiki/String_%28computer_science%29).
Why represent text? What does that mean? We have to have a way of distinguishing words that mean something to Hoon (like `list`) from words that mean something to a human or a process (like `'hello world'`).
Right now, all you need to know is that there are (at least) two valid ways to write text:
- `'with single quotes'` as a cord.
- `"with double quotes"` as a tape.
We will use these incidentally for now and explain their characteristics in a later lesson. Cords and tapes both use [UTF-8](https://en.wikipedia.org/wiki/UTF-8) representation, but all actual code is [ASCII](https://en.wikipedia.org/wiki/ASCII).
```hoon
> "You can put ½ in quotes, but not elsewhere!"
"You can put ½ in quotes, but not elsewhere!"
> 'You can put ½ in single quotes, too.'
'You can put ½ in single quotes, too.'
> "Some UTF-8: ἄλφα"
"Some UTF-8: ἄλφα"
```
#### Exercise: ASCII Values in Text
A cord (`@t`) represents text as a sequence of characters. If you know the [ASCII](https://en.wikipedia.org/wiki/ASCII) value for a particular character, you can identify how the text is structured as a number. (This is most easily done using the hexadecimal `@ux` representation due to bit alignment.)
![](https://upload.wikimedia.org/wikipedia/commons/thumb/1/1b/ASCII-Table-wide.svg/1024px-ASCII-Table-wide.svg.png)
If you produce a text string as a cord, you can see the internal structure easily in Hoon:
```hoon
> `@ux`'Mars'
0x7372.614d
```
that is, the character codes `0x73` = `'s'`, `0x72` = `'r'`, `0x61` = `'a'`, and `0x4d` = `'M'`. Thus a cord has its first letter as the smallest (least significant, in computer-science parlance) byte.
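The conversion also runs the other way. As a short sketch using only the values already shown above, a single character casts to a single byte, and the hexadecimal number casts back to the original cord:

```hoon
> `@ux`'M'
0x4d
> `@t`0x7372.614d
'Mars'
```

Nothing about the underlying atom changes in either cast; only the aura, and therefore the way the Dojo prints it, is different.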
## Making a Decision
The final rune we introduce in this lesson will allow us to select between two different Hoon expressions, like picking a fork in a road. Any computational process requires the ability to distinguish options. For this, we first require a basis for discrimination: truthness.
Essentially, we have to be able to decide whether or not some value or expression evaluates as `%.y` _true_ (in which case we will do one thing) or `%.n` _false_ (in which case we do another). At this point, our basic expressions are always mathematical; later on we will check for existence, for equality of two values, etc.
- [`++gth`](https://urbit.org/docs/hoon/reference/stdlib/1a#gth) (greater than `>`)
- [`++lth`](https://urbit.org/docs/hoon/reference/stdlib/1a#lth) (less than `<`)
- [`++gte`](https://urbit.org/docs/hoon/reference/stdlib/1a#gte) (greater than or equal to `≥`)
- [`++lte`](https://urbit.org/docs/hoon/reference/stdlib/1a#lte) (less than or equal to `≤`)
If we supply one of these gates with a pair of numbers using a `%-` cenhep call, we can see whether the expression is considered `%.y` true or `%.n` false.
```
> %-  gth  [5 6]
%.n
> %-  lth  [7 6]
%.n
> %-  gte  [7 6]
%.y
> %-  lte  [7 7]
%.y
```
Given a test expression like those above, we can use the `?:` wutcol rune to decide between the two possible alternatives. `?:` wutcol accepts three children: a true/false statement, an expression for the `%.y` true case, and an expression for the `%.n` false case.
[Piecewise mathematical functions](https://en.wikipedia.org/wiki/Piecewise) require precisely this functionality. For instance, the Heaviside function, in the convention used here, is a piecewise mathematical function which is equal to zero for inputs less than or equal to zero and one for inputs greater than zero.
<img src="https://latex.codecogs.com/svg.image?\large&space;H(x)=\begin{cases}&space;1,&space;&&space;x&space;>&space;0&space;\\&space;0,&space;&&space;x&space;\le&space;0&space;\end{cases}" title="https://latex.codecogs.com/svg.image?\large H(x):=\begin{cases} 1, & x > 0 \\ 0, & x \le 0 \end{cases}" />
<!--$$
H(x)
=
\begin{cases} 1, & x > 0 \\ 0, & x \le 0 \end{cases}
$$-->
_However_, we don't yet know how to represent a negative value! All of the decimal values we have used thus far are unsigned (non-negative) values, `@ud`. For now, the easiest solution is to just translate the Heaviside function so it activates at a different value:
<img src="https://latex.codecogs.com/svg.image?\large&space;H_{10}(x)=\begin{cases}&space;1,&space;&&space;x&space;>&space;10&space;\\&space;0,&space;&&space;x&space;\le&space;10&space;\end{cases}" title="https://latex.codecogs.com/svg.image?\large H_{10}(x):=\begin{cases} 1, & x > 10 \\ 0, & x \le 10 \end{cases}" />
<!--$$
H_{10}(x)
=
\begin{cases} 1, & x > 10 \\ 0, & x \le 10 \end{cases}
$$-->
Thus equipped, we can evaluate the Heaviside function for particular values of `x`:
```hoon
=/  x  10
?:  %-  gth  [x 10]
  1
0
```
We don't know yet how to store this capability for future use on as-yet-unknown values of `x`, but we'll see how to do so in a future lesson.
Carefully map how the runes in that statement relate to each other, and notice how the taller structure makes it relatively easier to read and understand what's going on.
#### Exercise: “Absolute” Value (Around Ten)
Implement a version of the absolute value function, _|x|_, similar to the Heaviside implementation above. (Translate it to 10 as well since we still can't deal with negative numbers; call this $|x|_{10}$.)
<img src="https://latex.codecogs.com/svg.image?|x|_{10}=\begin{cases}&space;x-10,&space;&&space;x&space;>&space;10&space;\\&space;10-x,&space;&&space;x&space;\le&space;10&space;\end{cases}" title="https://latex.codecogs.com/svg.image?|x|_{10}=\begin{cases} x-10, & x > 10 \\ 10-x, & x \le 10 \end{cases}" />
<!--$$
|x|_{10}
=
\begin{cases} x-10, & x > 10 \\ 10-x, & x \le 10 \end{cases}
$$-->
Test it on a few values like 8, 9, 10, 11, and 12.


@ -0,0 +1,241 @@
---
title: The Structure of Azimuth
nodes: 102, 112
objectives:
- "Understand the role of the public-key infrastructure in Urbit."
- "Describe the high-level architecture of the Urbit ID address space and distinguish types of points."
- "Interpret and apply the Azimuth point naming scheme."
- "Identify point features such as activity."
- "List at least two services/roles provided by a galaxy for the network."
- "List at least two services provided by a star for its planets."
- "Use Hoon to map the Azimuth address space domains and boundaries."
- "Identify points, sponsors, neighbors, etc. from `@p` identifiers and simple operations."
---
# The Structure of Azimuth
_This module introduces how Urbit ID is structured and provides practice in converting and working with `@p` identity points. It may be considered optional and skipped if you are speedrunning Hoon School._
## A Public-Key Infrastructure
What is the purpose of a [public-key infrastructure](https://en.wikipedia.org/wiki/Public_key_infrastructure)? Essentially, a PKI defines a protocol for publishing a public key (which anyone can use to check that a message really came from its claimed sender) while retaining a private key, which the owner uses as a cryptographically secure tool for signing electronic transactions. Azimuth functions as a PKI so that Urbit ID points can be uniquely controlled, transferred, and used to work with instances of Urbit OS (ships).
Urbit ID (=Azimuth) provides persistent and stable futureproof identity to its users through a hierarchical address space. Any particular Urbit ID plays a particular role in the overall Urbit system which is determined by its point number and classified into ranks.
### The Urbit Address Space
Each Urbit ID point is a 128-bit address. Urbit is structured with a hierarchy of addressable points, and bands of smaller values (preceded by many zeroes) have more “weight” in the system and broker access for higher-addressed points.
- **Galaxies** represent the “governing council” of Urbit, primarily concerned with peer discovery and packet routing as well as network protocol governance. Galaxies allocate star address space.
- **Stars** provide peer discovery services, handle distribution of software updates, and allocate planet address space.
- **Planets** are the primary single-user identities.
- **Moons** are intended to represent devices and associated accounts for the owning planet, but are currently only rarely used. Each planet has 2³² moons available to it.
- **Comets** are zero-reputation instances, in principle spammers or bots. Comets require a star sponsor to access the network, but once online they are persistent. They are also free to spin up.
In total there are 2¹²⁸ addressable points, of which the vast majority are available as unclaimed “comet space.”
#### Naming
Urbit uses a system of mnemonic syllables to uniquely identify each address point. These mnemonic names, called “`patp`s” after their Hoon representation `@p`, occur in a set of 256 suffixes (such as “zod”) and 256 prefixes (such as “lit”). They were selected to be memorable and pronounceable but not meaningful.
| Number | Prefix | Suffix |
| -----: | :----: | :----: |
| 0 | doz | zod |
| 1 | mar | nec |
| 2 | bin | bud |
| 3 | wan | wes |
| 4 | sam | sev |
| … | … | … |
| 254 | mip | nev |
| 255 | fip | fes |
Many points may be determined from the prefix and suffix alone, but planet names are obfuscated, meaning that they are scrambled so that the sponsor is not readily apparent to a peer.
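As a quick sketch of how the table is used, galaxy and star names can be converted to and from their numbers in the Dojo with ordinary aura casts. The particular points chosen here are just the ones that appear in the tables of this lesson:

```hoon
> `@p`0
~zod
> `@p`256
~marzod
> `@ud`~fes
255
> `@ux`~fipfes
0xffff
```

A star name reads as prefix then suffix, so ~marzod is prefix 1 (`mar`) followed by suffix 0 (`zod`), or `0x100`.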
#### Galaxy
Galaxies span the first 2⁸ addresses of Azimuth. Each galaxy has 255 (`0x100` - 1) associated stars; counting the galaxy itself yields 256 points (not counting moons). Galaxy names are suffix-only.
| | First Address | Last Address |
| ------------ | ------------- | ------------ |
| Decimal | `0` | `255` |
| Hexadecimal | `0x0` | `0xff` |
| `@p` | ~zod | ~fes |
As galaxies have no sponsors, they instead have an IP address determined by `gal.urbit.org` at port `13337`+galaxy number.
At the current time, galaxies play the role of network peer discovery, but at some future time this will fall to the stars instead.
#### Star
Peer discovery, the primary role of stars besides planet allocation, is an important step in responsibly controlling network traffic. “The basic idea is, you need someone to sponsor your membership on the network. An address that can't find a sponsor is probably a bot or a spammer” ([docs](https://urbit.org/understanding-urbit/)).
Stars span the remaining addresses to 2¹⁶. There are thus 65,536 -
256 = 65,280 stars. Star names have prefix and suffix. They share the
suffix with their sponsoring galaxy.
| | First Address | Last Address |
| ------------ | ------------- | ------------ |
| Decimal | `256` | `65.535` |
| Hexadecimal | `0x100` | `0xffff` |
| `@p` | ~marzod | ~fipfes |
A star's sponsor is its address modulo 2⁸. The first star of
~zod is `0x100` ~marzod. The last star of ~zod is `0xffff` - `0xff` =
`0xff00` ~fipzod. The last star (of ~fes) is `0xffff` ~fipfes.
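The modulo rule can be checked directly in the Dojo. This is only a sketch: it uses the standard library gate `++mod` (remainder after division), which this lesson has not otherwise introduced, together with the star names given above.

```hoon
> `@p`%-(mod [~marzod 256])
~zod
> `@p`%-(mod [~fipzod 256])
~zod
> `@p`%-(mod [~fipfes 256])
~fes
```

Taking the remainder modulo 2⁸ = 256 keeps only the low byte of the address, which is the galaxy's suffix.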
#### Planet
Planets span the remaining addresses to 2³². There are thus
4,294,967,296 - 65,536 = 4,294,901,760 planets. Planet names occur in
pairs separated by a single hyphen. A planet's name is obfuscated so it
is not immediately apparent who its sponsor is.
| | First Address | Last Address |
| ------------ | ------------- | ------------ |
| Decimal | `65.536` | `4.294.967.295` |
| Hexadecimal | `0x1.0000` | `0xffff.ffff` |
| `@p` | ~dapnep-ronmyl | ~dostec-risfen |
A planet's sponsor is its address modulo 2¹⁶.
Galaxy planets occupy points beginning with `0x1.0000` ~dapnep-ronmyl
(for ~zod); ~zod's last galaxy planet is `0xffff.ffff` - `0xffff` =
`0xffff.0000` ~lodnyt-ranrud. The last galaxy planet (of ~fes) is
`0xffff.ffff` - `0xffff` + `0x100` = `0xffff.0100` ~hidwyt-mogbud.
Star planets span the remaining space. The first star planet (of
~marzod) is `0x1.0000` + `0x100` = `0x1.0100` ~wicdev-wisryt. The last star
planet (of ~fipfes) is `0xffff.ffff` ~dostec-risfen. Remember that star
planets recur modulo 2¹⁶.
#### Moon
Moons occupy the block to 2⁶⁴, with 2³² moons for each planet. Moon
names have more than two blocks (three or four) separated by single
hyphens.
| | First Address | Last Address |
| ------------ | ------------- | ------------ |
| Decimal | `4.294.967.296` | `18.446.744.073.709.551.615` |
| Hexadecimal | `0x1.0000.0000` | `0xffff.ffff.ffff.ffff` |
| `@p` | ~doznec-dozzod-dozzod | ~fipfes-fipfes-dostec-risfen |
Moons recur modulo 2³² from their sponsor. Thus dividing a moon's
address by 2³² and taking the remainder yields the address of the
sponsor.
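A sketch of that remainder calculation in the Dojo, again assuming the standard library gate `++mod` and using only moon names that appear in this lesson:

```hoon
> `@p`%-(mod [~doznec-dozzod-dozzod 4.294.967.296])
~zod
> `@p`%-(mod [~doznec-sampel-palnet 4.294.967.296])
~sampel-palnet
```

The first result is a galaxy moon, so the remainder is a galaxy; the second is a planet moon, so the remainder is the owning planet.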
Any moon that begins with the prefix ~dopzod-dozzod-doz___ is a
galaxy moon, but not every galaxy moon begins with that prefix. The
first galaxy moon of ~zod is `0x1.0000.0000` ~doznec-dozzod-dozzod; the
last is `0xffff.ffff.ffff.ffff` - `0xffff.ffff` = `0xffff.ffff.0000.0000` ~fipfes-fipfes-dozzod-dozzod.
Any moon that begins with the prefix ~dopzod-dozzod-______ is a
star moon (other than galaxy moons), but not every star moon begins with
that prefix. The first star moon of ~marzod is `0x1.0000.0000` + `0x100` =
`0x1.0000.0100` ~doznec-dozzod-marzod; the last is `0xffff.ffff.ffff.ffff` -
`0xffff.ffff` + `0x100` = `0xffff.ffff.0000.0100`
~fipfes-fipfes-dozzod-marzod.
Any moon from ~dopzod-______-______ onwards is a planet
moon.
#### Comet
Comets occupy the upper portion of the Urbit address space. There are
approximately 3.4×10³⁸ comets, a fantastically large number. Comet
names occur in blocks of five to eight syllable pairs, separated by a double hyphen at the fourth.
| | First Address | Last Address |
| ------------ | ------------- | ------------ |
| Decimal | `18.446.744.073.709.551.616` | `340.282.366.920.938.463.463.374.607.431.768.211.455` |
| Hexadecimal | `0x1.0000.0000.0000.0000` | `0xffff.ffff.ffff.ffff.ffff.ffff.ffff.ffff` |
| `@p` | ~doznec--dozzod-dozzod-dozzod-dozzod | ~fipfes-fipfes-fipfes-fipfes--fipfes-fipfes-fipfes-fipfes |
A comet is sponsored by a star. Currently star sponsors are determined
randomly from a list supplied to `u3_dawn_come` in
`pkg/urbit/vere/dawn.c` from a [jamfile](https://urbit.org/docs/hoon/reference/stdlib/2p#jam) provided by urbit.org at
`https://bootstrap.urbit.org/comet-stars.jam`.
Comets cannot be breached or rekeyed: possession of the comet is *ipso
facto* attestation of ownership.
## Calculating with Addresses
### Sponsors
Each point other than a galaxy has a sponsor. To determine the sponsor of any point, use `++sein:title`:
```hoon
%-(sein:title [our now ~marzod])
```
where ~marzod is the point in question; or more succinctly:
```hoon
(sein:title our now ~marzod)
```
(This previews the irregular syntax of `%-` cenhep; it is equivalent to `%- sein:title [our now ~marzod]`.)
#### Exercise: Finding neighbors
A neighbor of a point is the point immediately above or below it in `@ud` numbering.
For instance, the `@ud` of ~sampel-palnet may be found by:
```hoon
> `@ud`~sampel-palnet
1.624.961.343
```
The previous neighbor of ~sampel-palnet is thus:
```hoon
> %-(sub [1.624.961.343 1])
1.624.961.342
> `@p`1.624.961.342
~datwyn-lavrud
```
- Find the next neighbor of ~sampel-palnet.
#### Exercise: Finding the sponsor of a neighbor
The sponsor of ~sampel-palnet may be found by:
```hoon
> (sein:title our now ~sampel-palnet)
~talpur
```
The sponsor of the previous neighbor of ~sampel-palnet is thus:
```hoon
> %-(sub [1.624.961.343 1])
1.624.961.342
> `@p`1.624.961.342
~datwyn-lavrud
> (sein:title our now ~datwyn-lavrud)
~talnep
```
- Find the sponsor of the next neighbor of ~sampel-palnet.
#### Exercise: Finding the child of a point
A point has many children, but the first moon of a planet is located at that point plus 2³² = `4.294.967.296`.
The first moon of ~sampel-palnet is:
```hoon
> `@p`%-(add [~sampel-palnet 4.294.967.296])
~doznec-sampel-palnet
```
- What are the first moon children of ~sampel-palnet's neighbors?
- What is the first planet of the star ~sampel? (Check the above text to determine the offset.)

View File

@ -0,0 +1,399 @@
---
title: Gates
nodes: 111, 115, 120
objectives:
- "Use the `+ls` generator to show a directory's contents."
- "`|mount` and `|commit` a desk."
- "Identify current known irregular syntax."
- "Convert between regular and irregular forms of runes to date."
- "Employ a gate to defer a computation."
- "Produce a gate as a generator."
- "Annotate Hoon code with comments."
- "Produce a generator to convert a value between auras."
---
# Gates (Functions)
_This module will teach you how to produce deferred computations for later use, like functions in other languages._
## A Spoonful of Sugar
Until this point in Hoon School, we have rigorously adhered to the regular syntax of runes so that you could get used to using them. In fact, the only two irregular forms we used were these:
- Cell definition `[a b]` which represents the [`:-` colhep](https://urbit.org/docs/hoon/reference/rune/col#--colhep) rune, `:- a b`.
That is, these expressions are all the same for Hoon:
```hoon
> [1 2]
[1 2]
> :-  1  2
[1 2]
> :-
  1
  2
[1 2]
```
- Aura application ``@ux`500`` which represents a double [`^-` kethep](https://urbit.org/docs/hoon/reference/rune/ket#--kethep), `^- @ux ^- @ 500`.
These are equivalent in Hoon:
```hoon
> ^-  @p  ^-  @  255
~fes
> `@p`255
~fes
```
(Why two `^-`s? We have to clear the type information in general to be able to apply new type information.)
Hoon developers often employ irregular forms, sometimes called “sugar syntax”. Besides the `:-` colhep and `^-` kethep forms, we will commonly use a new form for [`%-` cenhep](https://urbit.org/docs/hoon/reference/rune/cen#cenhep) “function calls”:
```hoon
> %-  add  [1 2]
3
> (add 1 2)
3
```
You should get used to reading and interpreting these forms. We will start to use them actively during this lesson. You can find other irregular forms in the [irregular forms reference](https://urbit.org/docs/hoon/reference/irregular).
#### Exercise: Converting Between Forms
Convert each of the following irregular forms into the correct regular runic syntax.
1. `(add 1 2)`
2. ``@ub`16`
3. `[%lorem %ipsum]`
4. `[%lorem %ipsum %dolor]` (can do two ways)
Convert each of the following regular forms into the correct irregular syntax.
1. `:- %lemon %jello`
2. `^- @p ^- @ 256`
3. `%- pow :- 2 16`
## Deferring Computations
So far, every time we have calculated something, we have had to build it from scratch in Dojo. This is completely untenable for nontrivial calculations, and clearly the Urbit OS itself is built on persistent code structures defining the behavior.
```hoon
::  Confirm whether a value is greater than one.
=/  a  5
?:  (gth a 1)
  'yes'
'no'
```
This has no flexibility: if we want to change `a` we have to rewrite the whole thing every time!
(Note also our introduction of the [`::` colcol](https://urbit.org/docs/hoon/reference/rune/col#-colcol) digraph in the above code block. This marks anything following it as a _comment_, meaning that it is meant for the developer and reader, and ignored by the computer.)
Hoon uses _gates_ as deferred computations. What this means is that we can build a Hoon expression now and use it at need later on, perhaps many times. More than that, we can also use it on different data values. A gate is the Hoon analogue of a [function or subroutine](https://en.wikipedia.org/wiki/Subroutine) in other programming languages.
The word "function" is used in various ways, but let's start by talking about them in the [mathematical sense](https://en.wikipedia.org/wiki/Function_%28mathematics%29). Roughly put, a function takes one or more arguments (i.e., input values) and returns a value. The return value depends solely on the argument(s), and nothing else. For example, we can understand multiplication as a function: it takes two numbers and returns another number. It doesn't matter where you ask, when you ask, or what kind of hat you're wearing when you ask. If you pass the same two numbers (e.g., `3` and `4`), you get the same answer returned every time (`12`).
That output value depends solely upon input value(s) is an important property of functions. This property is called [referential transparency](https://en.wikipedia.org/wiki/Referential_transparency), and it's one of the key ingredients to building a secure Urbit stack.
Functions are implemented in Hoon with a special kind of [core](https://urbit.org/docs/glossary/core/) called a _gate_. In this lesson you'll learn what a gate is and how a gate represents a function. (We _won't_ talk about what a core is quite yet.) Along the way you'll build some example gates of your own.
### Building a Gate
Syntactically, a gate is a [`|=` bartis](https://urbit.org/docs/hoon/reference/rune/bar#-bartis) rune with two children: a [`spec`](https://urbit.org/docs/hoon/reference/stdlib/4o#spec) (specification of input) and a [`hoon`](https://urbit.org/docs/hoon/reference/stdlib/4o#hoon) (body). Think of just replacing the `=/` tisfas with the `|=` bartis:
```hoon
::  Confirm whether a value is greater than one.
|=  a=@ud
?:  (gth a 1)
  'yes'
'no'
```
Compare this to other programming languages, if you know any:
- Does it have a name?
- Does it have a return value?
Beyond those, what is the purpose of each line?
The [`spec`](https://urbit.org/docs/hoon/reference/stdlib/4o#spec) gives the type as a mold and attaches a face to it for use in the gate.
The [`hoon`](https://urbit.org/docs/hoon/reference/stdlib/4o#hoon) body expression evaluates and yields a result, ultimately sent back to the call site. Frequently it is wise to explicitly require a particular type for the return value using the [`^-` kethep](https://urbit.org/docs/hoon/reference/rune/ket#--kethep) rune:
```hoon
::  Confirm whether a value is greater than one.
|=  a=@ud
^-  @t
?:  (gth a 1)
  'yes'
'no'
```
The input value, which is described by the `spec`, is sometimes called the argument or parameter in mathematics and other programming languages. Hoon prefers to call it the `sample` for reasons that will become apparent later on, but you won't confuse other developers if you call it the argument or input.
Note as well that the backbone of the program runs straight down the left-hand margin. This makes it easier to read the essential mainline logic of the program.
Gates enforce the type of incoming and outgoing values. In other words, a `spec` is a kind of type which is fixing the possible noun inputs. (The lesson on types which follows this one will go into greater detail.)
Gates can take multiple arguments as a cell:
```hoon
::  Return which of two numbers is larger.
|=  [a=@ud b=@ud]
?:  (gth a b)
  a
b
```
You can also call them different ways with raw [`%` cen](https://urbit.org/docs/hoon/reference/rune/cen) runes:
```hoon
%-  max  [100 200]
%+  max  100  200
```
### Creating Your Own Gate
You can type the above Hoon code snippets directly into Dojo, but there's no way to actually use them yet! The Dojo recognizes the expression as valid Hoon code, but can't actually apply it to an input `sample` yet.
```hoon
> |=  [a=@ud b=@ud]
  ?:  (gth a b)
    a
  b
< 1.tfm
  [ [a=@ud b=@ud]
    [our=@p now=@da eny=@uvJ]
    <17.bny 33.ehb 14.dyd 53.vlb 77.lrt 232.oiq 51.qbt 123.zao 46.hgz 1.pnw %140>
  ]
>
```
We need to attach a _name_ or a `face` to the expression. Then we'll be able to use it directly. Somewhat confusingly, there are three common ways to do this:
1. Attach the face (name) directly in Dojo. (This is a good quick solution, and we'll use it when teaching and testing code, but it doesn't work inside of code files.)
2. Save the gate as a _generator_ file and call it using the name of the file. (We'll do this in the next section of this lesson.)
3. Attach the face (name) as an _arm_ in a _core_. (We don't know what those are yet, so we'll set them aside for a couple of lessons.)
To name a gate in Dojo (or any expression resulting in a value, which is _every_ expression), you can use the Dojo-specific syntax `=name value`:
```hoon
> =inc |=  [a=@]
  (add 1 a)
> (inc 1)
2
> (inc 12)
13
> (inc 5)
6
```
Notice that there is _one_ space (`ace`) after the `=name` term and then regular `gap`s thereafter. We could also do this in one line using wide form:
```hoon
> =inc |=(a=@ (add 1 a))
> (inc 123)
124
```
To reiterate: we typically use the `|=` bartis rune to create a gate. In the expression above the `|=` is immediately followed by a set of parentheses containing two subexpressions: `a=@` and `(add 1 a)`. The first defines the gate's `sample` (input value type), and the second defines the gate's product (output value).
In the example gate above, `inc`, the sample is defined by `a=@`. This means that the sample is an atom `@`, so the gate will accept any atom as input (but not a cell). The `sample` is given the face `a`, which makes it easier to refer to the `sample` value in later code.
The second subexpression after the `|=` bartis rune is used to build the gate's body, where all the computations go. In `inc`, the product is defined by `(add 1 a)`. There's not much to it—it returns the value of `a+1`!
#### Exercise: Double a Value
- Produce a gate which accepts any `@` unsigned integer value and doubles it. Call it `double`.
```hoon
> =double |=(a=@ (mul a 2))
> (double 5)
10
```
#### Exercise: Convert Between Auras
- Produce a gate which accepts any `@` unsigned integer value and converts it to the `@p` equivalent. Call it `myship`.
- Produce a gate which accepts any `@` unsigned integer value and calculates the next neighbor (the `@p` of the number plus one). Call it `myneighbor`.
- Produce a gate which accepts a `@p` ship name and produces the `@ux` unsigned hexadecimal integer value of the ship. Call it `mynumber`.
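One possible set of answers is sketched below, using the backtick cast syntax. The first two gates take a bare atom `@`, which can be cast straight to `@p`. The third starts from a `@p`, which does not nest directly under `@ux`, so it is assumed here that the value has to be cast to the bare atom `@` first.

```hoon
> =myship |=(a=@ `@p`a)
> (myship 0)
~zod
> =myneighbor |=(a=@ `@p`(add a 1))
> (myneighbor 0)
~nec
> =mynumber |=(a=@p `@ux``@`a)
> (mynumber ~zod)
0x0
```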
### Output Values
How can we control what kind of value a gate returns? Many programming languages (such as C and Java) are _extremely_ concerned about this specification. Others, like Python and MATLAB, are _laissez-faire_. Hoon tends to be strict, but leaves some discretion over _how_ strict to you, the developer.
Remember `^-` kethep? We will use `^-` as a _fence_, a way of making sure only data matching the appropriate structure get passed on.
```hoon
::  Confirm whether a value is greater than one.
|=  a=@ud
^-  ?
?:  (gth a 1)
  %.y
%.n
```
**This is the correct way to define a gate.** Frequent annotation of type with `^-` kethep fences is _essential_ to producing good Hoon code. From this point forward in Hoon School, we will hew to this standard.
In technical language, we describe Hoon as a _statically typed_ language. This means that it enforces type constraints on all values very aggressively. If you are used to a dynamic language like Python or Ruby, this will seem very restrictive at first. The flip side is that once your code compiles correctly, you will often find that it is already well on its way to being a working, correct product.
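As a minimal sketch of what the fence buys you, consider a gate whose body cannot produce the promised type. The face `bad` is only illustrative. The body yields a cord (`@t`) while the `^-` kethep fence promises a `@ud`, so the Dojo should refuse the definition when it compiles it, before the gate is ever called. The crash message is abbreviated here.

```hoon
> =bad |=(a=@ud ^-(@ud 'yes'))
nest-fail
```

Changing the fence to `@t`, or the body to something that produces a `@ud`, lets the definition go through.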
## Coordinating Files
In pragmatic terms, an Urbit ship is what results when you successfully boot a new ship. If you are in the host OS, what you see is an apparently-empty folder:
```sh
$ ls zod
$
```
(For this lesson in particular take pains to distinguish the host OS prompt `$ ` from the Urbit Dojo prompt `> `. You should look into particular system setup instructions for Windows, macOS, and Linux hosts.) <!-- TODO -->
Contrast that apparently empty folder with what the `+ls %` command shows you from inside of your Urbit (at the Dojo prompt):
```hoon
> +ls %
app/ desk/bill gen/ lib/ mar/ sur/ sys/ ted/
```
Urbit organizes its internal view of data and files as _desks_, which are associated collections of code and data. These are not visible to the host operating system unless you explicitly mount them, and changes on one side are not made clear to the other until you “commit” them. (Think of Dropbox, except that you have to explicitly synchronize to see changes somewhere else.)
Inside of your ship (“Mars”), you can mount a particular desk to the host operating system (“Earth”):
```hoon
> |mount %base
```
Now check what happens outside of your ship:
```sh
$ ls zod
base/
$ ls zod/base
app/ desk.bill gen/ lib/ mar/ sur/ sys/ ted/
```
If we make a change in the folder on Earth, the contents will only update on Mars if we explicitly tell the two systems to coordinate.
On Earth:
```sh
$ cp zod/base/desk.bill zod/base/desk.txt
```
On Mars:
```hoon
> |commit %base
+ /~zod/base/2/desk/txt
```
You can verify the contents of the copied files are the same using the `+cat` command:
```hoon
> +cat %/desk/bill
> +cat %/desk/txt
```
(Dojo does know what a `bill` file is, so it displays the contents slightly formatted. They are actually identical.)
We will use this `|commit` pattern to store persistent code as files, editing on Earth and then synchronizing to Mars.
## Building Code
The missing piece to really tie all of this together is the ability to store a gate and use it at a later time, not just in the same long Dojo session. Enter the _generator_.
A generator is a simple program which can be called from the Dojo. It is a gate, so it takes some input as sample and produces some result. Naked generators are the simplest generators possible, having access only to information passed to them directly in their `sample`.
In this section, we will compose our first generator.
### The Gate
```hoon
::  Square a number.
|=  a=@ud
^-  @ud
%+  mul
  a
a
```
(Any time you write code to use later, you should include some comments to explain what the code does and perhaps how it does that.)
### The Process
1. Open a text editor.
2. Copy the gate above into the text editor. (Double-check that two-space gaps are still gaps; some text editors chew them up into single-space aces.)
3. Save the gate as `square.hoon` in the `base/gen` folder of your fakeship.
4. In the Dojo, `|commit %base`. _You should see a message indicating that the file has been loaded._
5. Run the generator with `+square 5`.
Any generator can be run the same way, beginning with the `+` lus character and followed by the name of a file in the `base/gen` directory.
### Hoon Source and Special Characters
Hoon source files are composed almost entirely of the printable ASCII characters. Hoon does not accept any other characters in source files except for [UTF-8](https://en.wikipedia.org/wiki/UTF-8) in quoted strings. Hard tab characters are illegal; use two spaces instead.
```hoon
> "You can put ½ in quotes, but not elsewhere!"
"You can put ½ in quotes, but not elsewhere!"
> 'You can put ½ in single quotes, too.'
'You can put ½ in single quotes, too.'
> "Some UTF-8: ἄλφα"
"Some UTF-8: ἄλφα"
```
**Note**: If you're using VS Code on Windows, you might need to manually change the line endings from Windows-style `CRLF` to Unix-style `LF` in the status bar at the bottom. Urbit requires Unix-style line endings for Hoon files.
#### Exercise: Triangular Function
- Implement the triangular function as a gate and save it as a generator `tri.hoon`.
![](https://lh4.googleusercontent.com/zdauTDEWvhhOkFEb6VcDEJ4SITsHOgcStf4NYFQSIVjTDPjaCqYGdin9TDCCeTG3OyMrUUdq-JtViiu_c9wuojim_mHpV6-DoTNwZzYz5_6qVVvN5fc3hEuSna2GwY15RQ=w740)
### Coding Piecemeal
If you need to test code without completing it, you can stub out as-yet-undefined arms with the [`!!` zapzap](https://urbit.org/docs/hoon/reference/rune/zap#-zapzap) crash rune. `!!` is the only rune which has no children, and it's helpful when you need something to satisfy Hoon syntax but aren't ready to flesh out the program yet.
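A small sketch of stubbing with a made-up gate: the even case is written, while the odd case is left as `!!`. The gate still compiles, since `!!` satisfies any fence, and it only crashes if the unfinished branch is actually reached.

```hoon
::  Halve even numbers; the odd case is not written yet.
|=  a=@ud
^-  @ud
?:  =(0 (mod a 2))
  (div a 2)
!!
```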
### Building Code Generally
A generator gives us on-demand access to code, but it is helpful to load and use code from files while we work in the Dojo.
A conventional library import with [`/+` faslus](https://urbit.org/docs/arvo/ford/ford#ford-runes) will work in a generator or another file, but won't work in Dojo, so you can't use `/+` faslus interactively.
Instead, you need to use the `-build-file` thread to load the code. Most commonly, you will do this with library code when you need a particular core's functionality.
`-build-file` accepts a file path and returns the built operational code, to which you can then attach a `face`. For instance:
```hoon
> =ntw -build-file %/lib/number-to-words/hoon
> one-hundred:numbers:ntw
100
> (to-words:eng-us:numbers:ntw 19)
[~ "nineteen"]
```
There are also a number of other import runes which make library, structure, and mark code available to you. Right now, the only one you need to worry about is `/+` faslus.
For simplicity, everything we do will take place on the `%base` desk for now. We will learn how to create a library in a subsequent lesson.
#### Exercise: Loading a Library
In a generator, load the `number-to-words` library using the `/+` faslus rune. (This must take place at the very top of your file.)
Use this to produce a gate which accepts an unsigned decimal integer and returns the text interpretation of its increment.
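One way this could look is sketched here. It assumes that `/+` faslus exposes the library under the face `number-to-words`, and that the arms are reached the same way as in the `-build-file` example above. The generator's file name is up to you.

```hoon
/+  number-to-words
::  Spell out the increment of a number in words.
|=  n=@ud
(to-words:eng-us:numbers:number-to-words +(n))
```

The product has the same shape as the `[~ "nineteen"]` result shown earlier, since it comes from the same `to-words` arm.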

View File

@ -0,0 +1,451 @@
---
title: Gates
nodes: 125
objectives:
- "Identify a mold in the hierarchy of Urbit types (nouns, molds, marks)."
- "Understand how type inference and type checking takes place."
- "Bunt a mold."
- "Produce a type union."
- "Produce a named tuple."
- "Identify type using `!>`."
---
# Molds (Types)
_This module will introduce the Hoon type system and illustrate how type checking and type inference work._
## The Hoon Type System
Programming languages use data types to distinguish different kinds of data and associated rules. For instance, what does it mean to add 3 to the letter A? Depending on your programming language, you could see `A3`, `D`, or an error.
Like most modern high-level programming languages, Hoon has a type system. Because Hoon is a functional programming language, its type system differs somewhat from those of non-functional languages. In this lesson we'll introduce Hoon's type system and point out some of its distinctive features. Certain advanced topics (e.g. type polymorphism) won't be addressed until a later chapter.
A type is ordinarily understood to be a set of values. Examples: the set of all atoms is a type, the set of all cells is a type, and so on.
Type systems provide type safety, in part by making sure functions produce values of the correct type. When you write a function whose product is intended to be an atom, it would be nice to know that the function is guaranteed to produce an atom. Hoon's type system provides such guarantees with _type checking_ and _type inference_.
A _type_ is really a rule for interpretation. But for our Hoonish purposes, it's rather too broad a notion and we need to clarify some different kinds of things we could refer to as “type”. It is instructive for learners to distinguish three kinds of types in Hoon:
1. Atoms: values with auras.
2. Molds: structures. Think of cells, lists, and sets.
3. Marks: file types. Compare to conventional files distinguished by extension and definite internal structure.
To employ a chemical metaphor, an atom is an atom; a cell is a molecule; a mold is a molecule definition, a template or structural representation; a mark is like a protein, a more complex transformation rule. **All of these are molds, or Hoon types. We are simply separating them by complexity as you learn.**
You have seen and worked with the trivial atoms and cells. We will leave marks until a later discussion of Gall agents or the Clay filesystem, which use marks to type file data. For now, we focus on molds.
This lesson will talk about atoms, cells, then molds in a general sense. We allude to several topics which will be explored in Data Structures.
## Atoms and Auras
In the most straightforward sense, atoms simply are unsigned integers. But they can also be interpreted as representing signed integers, ASCII symbols, floating-point values, dates, binary numbers, hexadecimal numbers, and more. Every atom is, in and of itself, just an unsigned integer; but Hoon keeps track of type information about each atom, and this bit of metadata tells Hoon how to interpret the atom in question.
The piece of type information that determines how Hoon interprets an atom is called an **aura**. The set of all atoms is indicated with the symbol `@`. An aura is indicated with `@` followed by some letters, e.g., `@ud` for unsigned decimal. Accordingly, the Hoon type system does more than track sets of values. It also tracks certain other relevant metadata about how those values are to be interpreted.
How is aura information generated so that it can be tracked? One way involves **type inference**. In certain cases Hoon's type system can infer the type of an expression using syntactic clues. The most straightforward case of type inference is for a [literal](https://en.wikipedia.org/wiki/Literal_%28computer_programming%29) expression of data, such as `0x1000` for `@ux`. Hoon recognizes the aura literal syntax and infers that the data in question is an atom with the aura associated with that syntax.
To see the inferred type of a literal expression in the Dojo, use the `?` operator. (This operator isn't part of the Hoon programming language; it's a Dojo-only tool.)
The `?` Dojo operator shows both the product and the inferred type of an expression. Let's try `?` on `15`:
```hoon
> 15
15
> ? 15
@ud
15
```
`@ud` is the inferred type of `15` (and of course `15` is the product). The `@` is for “atom” and the `ud` is for “unsigned decimal”. The letters after the `@` indicate the “aura” of the atom.
One important role played by the type system is to make sure that the output of an expression is of the intended data type. If the output is of the wrong type then the programmer did something wrong. How does Hoon know what the intended data type is? The programmer must specify this explicitly by using a _cast_. To cast for an unsigned decimal atom, you can use the `^-` kethep rune along with the `@ud` from above.
What exactly does the `^-` kethep rune do? It compares the inferred type of some expression with the desired cast type. If the expression's inferred type _nests_ under the desired type, then the product of the expression is returned.
Let's try one in the Dojo.
```hoon
> ^-(@ud 15)
15
```
Because `@ud` is the inferred type of `15`, the cast succeeds. Notice that the `^-` kethep expression never does anything to modify the underlying [noun](https://urbit.org/docs/glossary/noun/) of the second subexpression. It's used simply to mandate a type-check on that expression. This check occurs at compile-time (when the expression is compiled to Nock).
What if the inferred type doesn't fit under the cast type? You will see a `nest-fail` crash at compile-time:
```hoon
> ^-(@ud [13 14])
nest-fail
[crash message]
```
Why `nest-fail`? The inferred type of `[13 14]` doesn't nest under the cast type `@ud`. It's a cell, not an atom. But if we use the symbol for nouns, `*`, then the cast succeeds:
```hoon
> ^-(* [13 14])
[13 14]
```
A cell of atoms is a noun, so the inferred type of `[13 14]` nests under `*`. Every product of a Hoon expression nests under `*` because every product is a noun.
### What Auras are There?
Hoon has a wide (but not extensible) variety of atom literal syntaxes. Each literal syntax indicates to the Hoon type checker which predefined aura is intended. Hoon can also pretty-print any aura literal it can parse. Because atoms make great path nodes and paths make great URLs, all regular atom literal syntaxes use only URL-safe characters. The pretty-printer is convenient when you are used to it, but may surprise you occasionally as a learner.
Here's a non-exhaustive list of auras, along with examples of corresponding literal syntax:
| Aura | Meaning | Example Literal Syntax |
|:-------|:-----------------------------|:-----------------------|
| `@d` | date | no literal |
| `@da` | absolute date | `~2018.5.14..22.31.46..1435` |
| `@dr` | relative date (ie, timespan) | `~h5.m30.s12` |
| `@p` | phonemic base (ship name) | `~sorreg-namtyv` |
| `@r` | [IEEE-754](https://en.wikipedia.org/wiki/IEEE_754) floating-point | |
| `@rd` | double precision (64 bits) | `.~6.02214085774e23` |
| `@rh` | half precision (16 bits) | `.~~3.14` |
| `@rq` | quad precision (128 bits) | `.~~~6.02214085774e23` |
| `@rs` | single precision (32 bits) | `.6.022141e23` |
| `@s` | signed integer, sign bit low | no literal |
| `@sb` | signed binary | `--0b11.1000` |
| `@sd` | signed decimal | `--1.000.056` |
| `@sv` | signed base32 | `-0v1df64.49beg` |
| `@sw` | signed base64 | `--0wbnC.8haTg` |
| `@sx` | signed hexadecimal | `-0x5f5.e138` |
| `@t` | UTF-8 text (cord) | `'howdy'` |
| `@ta` | ASCII text (knot) | `~.howdy` |
| `@tas` | ASCII text symbol (term) | `%howdy` |
| `@u` | unsigned integer | no literal |
| `@ub` | unsigned binary | `0b11.1000` |
| `@ud` | unsigned decimal | `1.000.056` |
| `@uv` | unsigned base32 | `0v1df64.49beg` |
| `@uw` | unsigned base64 | `0wbnC.8haTg` |
| `@ux` | unsigned hexadecimal | `0x5f5.e138` |
Some of these auras nest under others. For example, `@u` is for all unsigned auras. But there are other, more specific auras; `@ub` for unsigned binary numbers, `@ux` for unsigned hexadecimal numbers, etc. (For a more complete list of auras, see [Auras](https://urbit.org/docs/hoon/reference/auras).)
### Aura Inference in Hoon
Let's work a few more examples in the Dojo using the `?` operator. We'll focus on just the unsigned auras for now:
```hoon
> 15
15
> ? 15
@ud
15
> 0x15
0x15
> ? 0x15
@ux
0x15
```
When you enter just `15`, the Hoon type checker infers from the syntax that its aura is `@ud` because you typed an unsigned integer in decimal notation. Hence, when you use `?` to check the aura, you get `@ud`.
And when you enter `0x15` the type checker infers that its aura is `@ux`, because you used `0x` before the number to indicate the unsigned hexadecimal literal syntax. In both cases, Hoon pretty-prints the appropriate literal syntax by using inferred type information from the input expression; the Dojo isn't (just) echoing what you enter.
More generally: for each atom expression in Hoon, you can use the literal syntax of an aura to force Hoon to interpret the atom as having that aura type. For example, when you type `~sorreg-namtyv` Hoon will interpret it as an atom with aura `@p` and treat it accordingly.
Here's another example of type inference at work:
```hoon
> (add 15 15)
30
> ? (add 15 15)
@
30
> (add 0x15 15)
36
> ? (add 0x15 15)
@
36
```
The `add` function in the Hoon standard library operates on all atoms, regardless of aura, and returns atoms with no aura specified. Hoon isn't able to infer anything more specific than `@` for the product of `add`. This is by design, however. Notice that when you `add` a decimal and a hexadecimal above, the correct answer is returned (pretty-printed as a decimal). This works for all of the unsigned auras:
```hoon
> (add 100 0b101)
105
> (add 100 0xf)
115
> (add 0b1101 0x11)
30
```
The reason these add up correctly is that unsigned auras all map directly to the 'correct' atom underneath. For example, `16`, `0b1.0000`, and `0x10` are all the exact same atom, just with different literal syntax. (This doesn't hold for signed versions of the auras!)
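A short Dojo sketch makes this concrete. Each conversion goes through the bare atom `@` first, because one specific aura does not nest directly under another. The last two lines show why signed auras are the exception: the sign is stored in the low bit, so the underlying atom differs from the number you typed.

```hoon
> `@ux``@`16
0x10
> `@ub``@`16
0b1.0000
> `@`--1
2
> `@`-1
1
```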
## Cells
Let's move on to consider cells. For now we'll limit ourselves to simple cell types made up of various atom types.
### Generic Cells
The `^` ket symbol is used to indicate the type for cells (i.e., the set of all cells). We can use it for casting as we did with atom auras, like `@ux` and `@t`:
```hoon
> ^-(^ [12 13])
[12 13]
> ^-(^ [[12 13] 14])
[[12 13] 14]
> ^-(^ [[12 13] [14 15 16]])
[[12 13] [14 15 16]]
> ^-(^ 123)
nest-fail
> ^-(^ 0x10)
nest-fail
```
If the expression to be evaluated produces a cell, the cast succeeds; if the expression produces an atom, the cast fails with a `nest-fail` crash.
The downside of using `^` ket for casts is that Hoon will infer only that the product of the expression is a cell; it won't know what kind of cell is produced.
```hoon
> ? ^-(^ [12 13])
{* *}
[12 13]
> ? ^-(^ [[12 13] 14])
{* *}
[[12 13] 14]
> ? ^-(^ [[12 13] [14 15 16]])
{* *}
[[12 13] [14 15 16]]
```
When we use the `?` operator to see the type inferred by Hoon for the expression, in all three of the above cases the same thing is returned: `{* *}`. The `*` symbol indicates the type for any noun, and the curly braces `{ }` indicate a cell. Every cell in Hoon is a cell of nouns; remember that cells are defined as pairs of nouns.
Yet the cell `[[12 13] [14 15 16]]` is a bit more complex than the cell `[12 13]`. Can we use the type system to distinguish them? Yes.
### Getting More Specific
What if you want to cast for a particular kind of cell? You can use square brackets when casting for a specific cell type. For example, if you want to cast for a cell in which the head and the tail must each be an atom, then simply cast using `[@ @]`:
```hoon
> ^-([@ @] [12 13])
[12 13]
> ? ^-([@ @] [12 13])
{@ @}
[12 13]
> ^-([@ @] 12)
nest-fail
> ^-([@ @] [[12 13] 14])
nest-fail
```
The `[@ @]` cast accepts any expression that evaluates to a cell with exactly two atoms, and crashes with a `nest-fail` for any expression that evaluates to something different. The expression `12` doesn't evaluate to a cell; and while the expression `[[12 13] 14]` does evaluate to a cell, the left-hand side isn't an atom, but is instead another cell.
You can get even more specific about the kind of cell you want by using atom auras:
```hoon
> ^-([@ud @ux] [12 0x10])
[12 0x10]
> ^-([@ub @ux] [0b11 0x10])
[0b11 0x10]
> ? ^-([@ub @ux] [0b11 0x10])
{@ub @ux}
[0b11 0x10]
> ^-([@ub @ux] [12 13])
nest-fail
```
You are also free to embed more square brackets `[ ]` to indicate cells within cells:
```hoon
> ^-([[@ud @sb] @ux] [[12 --0b1101] 0xdead.beef])
[[12 --0b1101] 0xdead.beef]
> ? ^-([[@ud @sb] @ux] [[12 --0b1101] 0xdead.beef])
{{@ud @sb} @ux}
[[12 --0b1101] 0xdead.beef]
> ^-([[@ @] @] [12 13])
nest-fail
```
You can also be highly specific with certain parts of the type structure, leaving other parts more general. Keep in mind that when you do this, Hoon's type system will infer a general type from the general part of the cast. Type information may be thrown away:
```hoon
> ^-([^ @ux] [[12 --0b1101] 0xdead.beef])
[[12 26] 0xdead.beef]
> ? ^-([^ @ux] [[12 --0b1101] 0xdead.beef])
{{* *} @ux}
[[12 26] 0xdead.beef]
> ^-(* [[12 --0b1101] 0xdead.beef])
[[12 26] 3.735.928.559]
> ? ^-(* [[12 --0b1101] 0xdead.beef])
*
[[12 26] 3.735.928.559]
```
Because every piece of Hoon data is a noun, everything nests under `*`. When you cast to `*` you can see the raw noun with cells as brackets and atoms as unsigned integers.
## Molds
Molds are templates or rules for identifying actual type structures. They are actually gates, meaning that they operate on a value to coerce it to a particular structure. Technically, a mold is a function from a noun to a noun. What this means is that we can use a mold to map any noun to a typed value—if this fails, then the mold crashes.
```hoon
> (^ [1 2])
[1 2]
> (@ [1 2])
dojo: hoon expression failed
> `@`[1 2]
mint-nice
-need.@
-have.[@ud @ud]
nest-fail
dojo: hoon expression failed
```
We commonly need to do one of two things with a mold:
1. Validate the shape of a noun (_clam_).
```hoon
> (@ux 0x1000)
0x1000
> (@ux [1 2])
dojo: hoon expression failed
```
2. Produce an example value (_bunt_).
We often use bunts to clam; for example, `` `@ud` `` implicitly uses the `@ud` default value (`0`) as the type specimen which the computation must match.
To _actually_ get the bunt value, use the [`^*` kettar](https://urbit.org/docs/hoon/reference/rune/ket#kettar) rune, almost always used in its irregular form `*` tar:
```hoon
> ^*  @ud
0
> ^*  @da
~2000.1.1
> *@da
~2000.1.1
> *[@ud @ux @ub]
[0 0x0 0b0]
```
One more way to validate against type is to use an example instead of the extracted mold. This uses the [`^+` ketlus](https://urbit.org/docs/hoon/reference/rune/ket#ketlus) rune similarly to how we used `^-` kethep previously:
```hoon
^+(1.000 100)
```
(This is what `^-` is actually doing: `^-(p q)` reduces to `^+(^*(p) q)`. Many runes we use actually reduce to other rune forms, and have been introduced for ease of use.)
We can use more complex structures for molds though, including built-in types like `list`s and `tape`s. (A `tape` represents text.)
```hoon
`(list @)`[104 101 108 108 111 32 77 97 114 115 33 ~]
`tape``(list @)`[104 101 108 108 111 32 77 97 114 115 33 ~]
`(list @)`[144 57 195 46 200 165 186 88 118 99 ~]
`(list @p)``(list @)`[144 57 195 46 200 165 186 88 118 99 ~]
```
(Sometimes you see a `%bad-text` when using `tape`s, which means that you've tried to convert a number into text which isn't text. More on `tape`s in Trees.)
- Why does this mold conversion fail?
```hoon
`(list @ux)`[1 2 3 ~]
```
What do we need to do in order to make it succeed?
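One possible answer, sketched in the same style as the conversions above: the literal atoms are inferred as `@ud`, which does not nest under `@ux`, so the list is first cast to a list of bare atoms and only then to the target aura.

```hoon
> `(list @ux)``(list @)`[1 2 3 ~]
~[0x1 0x2 0x3]
```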
We can have more complex molds as well:
```hoon
:: [[from-ship to-ship] points]
[[@p @p] @ud]
```
Most of the time, we will define such complex types using specific runes and “mold builder” tools. Thus a `list` needs an associated type `(list @)` to correctly denote the data type.
### Identifying Molds
Besides `?` (which is a Dojo-specific tool), the programmatic way to find out which mold the Hoon compiler infers for a value is to use the [`!>` zapgar](https://urbit.org/docs/hoon/reference/rune/zap#-zapgar) rune.
```
> !>(0xace2.bead)
[#t/@ux q=2.900.541.101]
```
For reasons which will be elaborated in Trees, this is often employed as the so-called “type spear” `-:!>`:
```
> -:!>(0xace2.bead)
#t/@ux
```
### Type Unions
[`$?` bucwut](https://urbit.org/docs/hoon/reference/rune/buc#-bucwut) forms a type union.
For instance, if you wanted a gate to accept a value of one of several unsigned aura types, but no other type, you could define a type union thus:
```hoon
$?(@ud @ux @ub)
```
and use it in a gate (bound here to the face `foo` in the Dojo so that it can be called):
```hoon
> =foo |=  [n=$?(@ud @ux @ub)]
  (add n 1)
```
```hoon
> (foo 4)
5
> (foo 0x5)
6
> (foo 0b110)
7
> (foo ~zod)
-need.?(@ub @ud @ux)
-have.@p
nest-fail
dojo: hoon expression failed
```
The irregular form of `$?` bucwut looks like this:
```hoon
?(@ud @ux @ub)
```
Type unions are mainly helpful when you need to match something that can have multiple options. We will use them extensively with `@tas` terms, such as `?(%red %green %blue)` which would only admit one of those three tags.
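A minimal sketch of such a tag union in use follows. The face `hue` is only an illustrative name, and the crash message is abbreviated. The gate simply hands back its sample, so the only work being done is the type check on the way in.

```hoon
> =hue |=(c=?(%red %green %blue) c)
> (hue %red)
%red
> (hue %purple)
nest-fail
```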

View File

@ -0,0 +1,939 @@
---
title: Cores
nodes: 130, 133
objectives:
- "Employ a trap to produce a reentrant block of code."
- "Produce a recursive gate."
- "Distinguish head and tail recursion."
- "Consider Hoon structures as cores."
- "Identify the special role of the `$` buc arm in many cores."
- "Order neighboring cores within the subject for addressability."
- "Produce a type arm."
---
# Cores
_This module will introduce the key Hoon data structure known as the **core**, as well as ramifications._
The Hoon subject is a noun. One way to look at this noun is to denote each fragment of it as either a computation or data. By strictly separating these two kinds of things, we derive the data structure known within Hoon as a _core_.
Cores are the most important data structure in Hoon. They allow you to solve many coding problems by identifying a pattern and supplying a proper data structure apt to the challenge. You have already started using cores with `|=` bartis gate construction and use.
This lesson will introduce another core to solve a specific use case, then continue with a general discussion of cores. Getting cores straight will be key to understanding why Hoon has the structure and internal logic it does.
## Repeating Yourself Using a Trap
Computers were built and designed to carry out tasks which were too dainty and temperamental for humans to repeat consistently, or too prodigiously numerous for humans to ever complete. At this point, you know how to build code that can make a decision between two branches, two different Hoon expressions. Computers can decide between alternatives, but they also need to carry out a task until some condition is met. (We can think of it as a recipe step, like “crack five eggs into a bowl”. Until that process is complete, we as humans continue to carry out the equivalent action again and again until the process has been completed.)
In programming, we call this behavior a “loop”. A loop describes the situation in which we set up some condition, and repeat a process over and over until something we do meets that condition. _Most_ of the time, this means counting once for each item in a collection, like a list.
Hoon effects the concept of a loop using recursion: returning to a particular point in an expression (presumably with some different values). One way to do this is using the [`|-` barhep](https://urbit.org/docs/hoon/reference/rune/bar#-barhep) rune, which creates a structure called a _trap_. (Think of the “trap” in the bottom of your sink.) It means a point to which you can return again, perhaps with some key values (like a counter) changed. Then you can repeat the calculation inside the trap again. This continues until some single value, some noun, results, thereby handing a value back out of the expression. (Remember that every Hoon expression results in a value.)
This program adds 1+2+3+4+5 and returns the sum:
```hoon
=/  counter  1
=/  sum  0
|-
?:  (gth counter 5)
  sum
%=  $
  counter  (add counter 1)
  sum      (add sum counter)
==
```
(The last two lines happen simultaneously, so make sure to refer to the _current_ version of any variables.)
Let's unroll it:
0. `counter = 1`
`sum = 0`
1. `(gth counter 5) = %.n`
`counter ← (add counter 1) = 2`
`sum ← (add sum counter) = 0 + 1 = 1`
2. `(gth counter 5) = %.n`
`counter ← (add counter 1) = 3`
`sum ← (add sum counter) = 1 + 2 = 3`
3. `(gth counter 5) = %.n`
`counter ← (add counter 1) = 4`
`sum ← (add sum counter) = 3 + 3 = 6`
4. `(gth counter 5) = %.n`
`counter ← (add counter 1) = 5`
`sum ← (add sum counter) = 6 + 4 = 10`
5. `(gth counter 5) = %.n`
`counter ← (add counter 1) = 6`
`sum ← (add sum counter) = 10 + 5 = 15`
6. `(gth counter 5) = %.y`
And thus `sum` yields the final value of `15`.
It is frequently helpful, when constructing these, to be able to output the values at each step of the process. Use the [`~&` sigpam](https://urbit.org/docs/hoon/reference/rune/sig#-sigpam) rune to create output without changing any values:
```hoon
=/  counter  1
=/  sum  0
|-
~&  "counter:"
~&  counter
~&  "sum:"
~&  sum
?:  (gth counter 5)
  sum
%=  $
  counter  (add counter 1)
  sum      (add sum counter)
==
```
You can do even better using _interpolation_:
```hoon
=/  counter  1
=/  sum  0
|-
~&  "counter: {<counter>}"
~&  "sum: {<sum>}"
?:  (gth counter 5)
  sum
%=  $
  counter  (add counter 1)
  sum      (add sum counter)
==
```
#### Exercise: Calculate a Factorial
- Let's calculate a [factorial](https://mathworld.wolfram.com/Factorial.html). The factorial of a number _n_ is _n_×(_n_-1)×...×2×1. We will introduce a couple of new bits of syntax and a new gate (`++dec`). Make this into a generator `factorial.hoon`:
```hoon
|=  n=@ud
|-
~&  n
?:  =(n 1)
  1
%+  mul
  n
%=  $
  n  (dec n)
==
```
- We are using the `=` irregular syntax for the [`.=` dottis](https://urbit.org/docs/hoon/reference/rune/dot#dottis) rune, which tests for the equality of two expressions.
- We are using the `dec` gate from the standard library, which decrements a value (subtracts one). Its counterpart, the `+` irregular syntax for the [`.+` dotlus](https://urbit.org/docs/hoon/reference/rune/dot#dotlus) rune, increments a value (adds one).
```hoon
> +factorial 5
120
```
Let's visualize the operation of this gate using pseudocode (fake code that's explanatory but may not be operational). Here's basically what's happening when `factorial` receives the value `5`:
```hoon
(factorial 5)
(mul 5 (factorial 4))
(mul 5 (mul 4 (factorial 3)))
(mul 5 (mul 4 (mul 3 (factorial 2))))
(mul 5 (mul 4 (mul 3 (mul 2 (factorial 1)))))
(mul 5 (mul 4 (mul 3 (mul 2 1))))
(mul 5 (mul 4 (mul 3 2)))
(mul 5 (mul 4 6))
(mul 5 24)
120
```
We're “floating” gate calls until we reach the final iteration of such calls that only produces a value. The `mul n` component of the gate leaves `mul 5` waiting for the final series of terms to be operated upon. The `%=($ n (dec n))` component expands the expression outwards, as illustrated by `(factorial 4)`. This continues until the expression is not expanded further, at which point the operations work backwards, successively feeding values into the `mul` functions behind them.
The pyramid-shaped illustration approximates what's happening on the _call stack_, a memory structure that tracks the instructions of the program. In this code, every time a parent gate calls another gate, the gate being called is "pushed" to the top of the stack in the form of a frame. This process continues until a value is produced instead of a function, completing the stack.
- Why do we return the result (`product` in Hoon parlance) at 1 instead of 0?
#### Exercise: Tracking Expression Structure
As we write more complicated programs, it is helpful to learn to read the runes by identifying which daughter expressions attach to which runes, e.g.:
```
=/
  n
  15
  |-
  ~&
    n
    ?:
      =(n 1)        ::  .=  n  1
      1
      %+
        mul
        n
        %=
          $
          n
          (dec n)   ::  %-  dec  n
        ==
```
Recall that the `::` digraph tells the compiler to ignore the rest of the text on the line. Such text is referred to as a "comment" because, instead of performing a computation, it exists to explain things to human readers of the source code. Here, we have also explicitly marked the expansion of the irregular forms.
We will revert to the irregular form more and more. If you would like to see exactly how an expression is structured, you can use the [`!,` zapcom](https://urbit.org/docs/hoon/reference/rune/zap#-zapcom) rune. `!,` zapcom produces an annotated _abstract syntax tree_ (AST) which labels every value and expands any irregular syntax into the regular runic form.
```hoon
> !,  *hoon  (add 5 6)
[%cncl p=[%wing p=~[%add]] q=~[[%sand p=%ud q=5] [%sand p=%ud q=6]]]
```
```hoon
> !,  *hoon  |=  n=@ud
  |-
  ~&  n
  ?:  =(n 1)
    n
  %+  mul
    n
  %=  $
    n  (dec n)
  ==
[ %brts
  p=[%bcts p=term=%n q=[%base p=[%atom p=~.ud]]]
    q
  [ %brhp
      p
    [ %sgpm
      p=0
      q=[%wing p=~[%n]]
        r
      [ %wtcl
        p=[%dtts p=[%wing p=~[%n]] q=[%sand p=%ud q=1]]
        q=[%wing p=~[%n]]
          r
        [ %cnls
          p=[%wing p=~[%mul]]
          q=[%wing p=~[%n]]
          r=[%cnts p=~[%$] q=~[[p=~[%n] q=[%cncl p=[%wing p=~[%dec]] q=~[[%wing p=~[%n]]]]]]]
        ]
      ]
    ]
  ]
]
```
(_There's a lot going on in there._ Focus on the four-letter runic identifiers: `%sgpm` for `~&` sigpam, for instance.)
#### Exercise: Calculate a sequence of numbers
Produce a gate (generator) which accepts a `@ud` value and calculates the series where the *i*th term in the series is given by the equation
![](https://latex.codecogs.com/png.image?\large%20\dpi{110}n_{i}%20=%20i^{2}\textrm{,})
<!--
$$
n_{i} = i^{2}
\textrm{,}
$$
-->
that is, the first numbers are 0, 1, 4, 9, 16, 25, etc.
For this exercise, you do not need to store these values in a list. Calculate each one but only return the final value.
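One possible approach, offered as a sketch rather than the official solution, reuses the counting trap from earlier in this lesson. The face `index` is a name chosen for this example, not something the exercise requires:

```hoon
::  Sketch: print i^2 for i = 0..n, then produce n^2.
::  `index` is an illustrative name chosen for this example.
|=  n=@ud
=/  index=@ud  0
|-  ^-  @ud
~&  (mul index index)
?:  =(index n)
  (mul index index)
$(index +(index))
```

Each pass prints the current square with `~&` and only the last square is returned, which matches the instruction not to store the values in a list.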
#### Exercise: Output each letter in a `tape`
Produce a gate (generator) which accepts a `tape` value and returns a `(list @ud)` containing the ASCII value of each character. Use a `|-` barhep trap.
The previous code simply modified a value by addition. You can generalize this to other arithmetic processes, like multiplication, but you can also grow a data structure like a list.
For example, given the `tape` `"hello"`, the generator should return the list `~[104 101 108 108 111]`.
Two tools that may help:
- You can retrieve the _n_-th element in a `tape` using the [`++snag`](https://urbit.org/docs/hoon/reference/stdlib/2b#snag) gate, e.g. ``(snag 3 `(list @ud)`~[1 2 3 4 5])`` yields `4` (so `++snag` is zero-indexed; it counts from zero).
- You can join an element to a list using the [`++snoc`](https://urbit.org/docs/hoon/reference/stdlib/2b#snoc) gate, e.g. ``(snoc `(list @ud)`~[1 2 3] 4)`` yields `~[1 2 3 4]`.
```hoon
|=  [input=tape]
=/  counter  0
=/  results  *(list @ud)
|-
?:  =(counter (lent input))
  results
=/  ascii  `@ud`(snag counter input)
%=  $
  counter  (add counter 1)
  results  (snoc results ascii)
==
```
## Cores
So far we have introduced and worked with a few key structures:
1. Nouns
2. Molds (types)
3. Gates
4. Traps
Some of them are _data_, like raw values: `0x1234.5678.abcd` and `[5 6 7]`. Others are _code_, programs that do something. What unifies all of these under the hood?
A core is a cell pairing operations to data. Formally, we'll say a core is a cell `[battery payload]`, where `battery` describes the things that can be done (the operations) and `payload` describes the data on which those operations rely. (For many English speakers, the word “battery” evokes a [voltaic pile](https://en.wikipedia.org/wiki/Voltaic_pile) more than a bank of guns, but the artillery metaphor is a better mnemonic for `[battery payload]`.)
**Cores are the most important structural concept for you to grasp in Hoon.** Everything nontrivial is a core. Some of the runes you have used already produce cores, like the gate. That is, a gate marries a `battery` (the operating code) to the `payload` (the input values AND the “subject” or operating context).
Urbit adopts an innovative programming paradigm called _subject-oriented programming_. By and large, Hoon (and Nock) is a functional programming language in that running a piece of code twice will always yield the same result, and because runes cause a program to explicitly compose various subexpressions in a somewhat mathematical way.
Hoon (and Nock) very carefully bounds the known context of any part of the program as the _subject_. Basically, the subject is the noun against which any arbitrary Hoon code is evaluated.
For instance, when we first composed generators, we made what are called “naked generators”: that is, they do not have access to any information outside of the base subject (Arvo, Hoon, and `%zuse`) and their sample (arguments). Other generators (such as `%say` generators, described below) can have more contextual information, including random number generators and optional arguments, passed to them to form part of their subject.
Cores have two kinds of values attached: arms and legs, both called limbs. Arms describe known labeled addresses (with `++` luslus or `+$` lusbuc) which carry out computations. Legs are limbs which store data (with e.g. `=/` tisfas).
### Arms
So legs are for data and arms are for computations. But what _specifically_ is an arm, and how is it used for computation? Let's begin with a preliminary explanation that we'll refine later.
An _arm_ is some expression of Hoon encoded as a noun. (By 'encoded as a noun' we literally mean: 'compiled to a Nock formula'. But you don't need to know anything about Nock to understand Hoon.) You virtually never need to treat an arm as raw data, even though technically you can—it's just a noun like any other. You almost always want to think of an arm simply as a way of running some Hoon code.
Every expression of Hoon is evaluated relative to a subject. An [_arm_](https://urbit.org/docs/glossary/arm) is a Hoon expression to be evaluated against the core subject (i.e. its parent core is its subject).
#### Arms for Gates
Within a core, we label arms as Hoon expressions (frequently `|=` bartis gates) using the [`++` luslus](https://urbit.org/docs/hoon/reference/rune/lus#-luslus) digraph. (`++` isn't formally a rune because it doesn't actually change the structure of a Hoon expression; it simply marks a name for an expression or value. The `--` hephep limiter digraph is used because `|%` barcen can have any number of arms attached. Like `++`, it is not formally a rune.)
```hoon
|%
++  add-one
  |=  a=@ud
  ^-  @ud
  (add a 1)
++  sub-one
  |=  a=@ud
  ^-  @ud
  (sub a 1)
--
```
Give the name `adder` to the above, and use it thus:
```hoon
> (add-one:adder 5)
6
> (sub-one:adder 5)
4
```
Notice here that we read the arm resolution from right-to-left. This isn't the only way to address an arm, but it's the most common one.
#### Exercise: Produce a Gate Arm
- Compose a core which contains arms for multiplying a value by two and for dividing a value by two.
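A minimal sketch of such a core follows the same pattern as `add-one` and `sub-one` above. The arm names `double` and `halve` are assumptions made for this example:

```hoon
::  Sketch: a core with one arm per operation.
::  `double` and `halve` are illustrative arm names.
|%
++  double
  |=  a=@ud
  ^-  @ud
  (mul a 2)
++  halve
  |=  a=@ud
  ^-  @ud
  ::  ++div is integer division, so odd inputs round down
  (div a 2)
--
```

Because atoms are unsigned integers, `halve` truncates: halving `5` produces `2`.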
#### Arms for Types
We can define custom types for a core using [`+$` lusbuc](https://urbit.org/docs/hoon/reference/rune/lus#-lusbuc) digraphs. We won't do much with these yet but they will come in handy for custom types later on.
This core defines a set of types intended to work with playing cards:
```hoon
|%
+$  suit  ?(%hearts %spades %clubs %diamonds)
+$  rank  ?(1 2 3 4 5 6 7 8 9 10 11 12 13)
+$  card  [sut=suit val=rank]
+$  deck  (list card)
--
```
#### Cores in Generators
When we write generators, we can include helpful tools as arms either before the main code (with `=>` tisgar) or after the main code (with `=<` tisgal):
```hoon
|=  n=@ud
=<
(add-one n)
|%
++  add-one
  |=  a=@ud
  ^-  @ud
  (add a 1)
--
```
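For comparison, here is a sketch of the same generator with the helper core placed before the main code using `=>` tisgar. It is meant to behave identically to the `=<` tisgal version; only the order of the two children is swapped:

```hoon
::  Sketch: the helper core comes first and becomes the subject
::  against which the final expression is evaluated.
|=  n=@ud
=>
|%
++  add-one
  |=  a=@ud
  ^-  @ud
  (add a 1)
--
(add-one n)
```

Which form to use is largely a matter of readability: `=<` tisgal puts the main logic at the top of the file, while `=>` tisgar defines the tools before they are used.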
A library (a file in `/lib`) is typically structured as a `|%` barcen core.
### Legs
A [_leg_](https://urbit.org/docs/hoon/hoon-school/the-subject-and-its-legs) is a data value. Legs tend to be trivial but useful ways to pin constants. `=/` tisfas values are legs, for instance.
```hoon
> =/  a  1
  +(a)
2
```
Under the hood, legs and arms are distinguished by the Nock instructions used in each case. A leg is evaluated by Nock 0, while an arm is evaluated by Nock 9.
### Recalculating a Limb
Arms and legs are both _limbs_. Either one can be replaced in a given subject. This turns out to be very powerful, and permits Hoon to implement gates (functions) in a mathematically rigorous way, among other applications.
Often a leg of the subject is produced with its value unchanged. But there is a way to produce a modified version of the leg as well. To do so, we use the `%=` centis rune:
```hoon
%=  subject-limb
  leg-1  new-leg-1
  leg-2  new-leg-2
  ...
==
```
`%=` centis is frequently used in its irregular form, particularly if the expression within it fits on a single line. The irregular form prepends the arm (often `$`) to parentheses `()`. In its irregular form, the above would be:
```hoon
subject-limb(leg-1 new-leg-1, leg-2 new-leg-2, ...)
```
In the first example, we saw the expression
```hoon
%=  $
  counter  (add counter 1)
  sum      (add sum counter)
==
```
which can equivalently be expressed as
```hoon
$(counter (add counter 1), sum (add sum counter))
```
This statement means that we recalculate the `$` buc arm of the current subject with the indicated changes. But what is `$` buc? `$` buc is the _default arm_ for many core structures, including `|=` bartis gate cores and `|-` barhep trap cores.
### What is a Gate?
A core is a cell: `[battery payload]`.
A gate is a core with two distinctive properties:
1. The **battery** of a gate contains an arm which has the special name `$` buc. The `$` buc arm contains the instructions for the function in question.
2. The **payload** of a gate consists of a cell of `[sample context]`.
1. The **sample** is the part of the payload that stores the "argument" (i.e., input value) of the function call.
2. The **context** contains all other data that is needed for computing the `$` buc arm of the gate correctly.
As a tree, a gate looks like the following:
```
[$ [sample context]]
        gate
       /    \
      $      .
            / \
      sample   context
```
Like all arms, `$` buc is computed with its parent core as the subject. When `$` buc is computed, the resulting value is called the “product” of the gate. No other data is used to calculate the product other than the data in the gate itself.
We will always call the values supplied to the gate the “sample” since we will later discover that this technical meaning (`[battery [sample context]]`) holds throughout more advanced cores.
#### Exercise: Another Way to Calculate a Factorial
Let's revisit our factorial code from above:
```hoon
|=  n=@ud
|-
?:  =(n 1)
  1
%+  mul
  n
%=  $
  n  (dec n)
==
```
We can write this code in several ways using the `%=` centis plus `$` buc structure.
For instance, we can eliminate the trap by recursing straight back to the gate:
```hoon
|=  n=@ud
?:  =(n 1)
  1
%+  mul
  n
%=  $
  n  (dec n)
==
```
Even more compactly, the tall-form `%+` and `%=` expressions can be replaced by their irregular forms, `(mul ...)` and `$(...)`, for the equivalent version:
```hoon
|=  n=@ud
?:  =(n 1)
  1
(mul n $(n (dec n)))
```
(Remember that sugar syntax like `$()` does not affect code efficiency, merely visual layout.)
#### The `$` Buc Arm
The (only) arm of a gate encodes the instructions for the Hoon function in question.
```hoon
> =inc |=(a=@ (add 1 a))
> (inc 5)
6
```
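The pretty printer represents the battery of `inc`, which holds only the `$` buc arm, with a short label such as `1.yop`. The `1` counts the arms and the letters are a hash, so the label shown on your ship may differ. To see the actual noun of the `$` buc arm, enter `+2:inc` into the Dojo: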
```hoon
> +2:inc
[8 [9 36 0 8.191] 9 2 10 [6 [7 [0 3] 1 1] 0 14] 0 2]
```
This is un-computed Nock. You don't need to understand any of this, except that code and data are homoiconic—they are in a sense the same for Urbit programs.
It's worth pointing out that the arm named `$` buc can be used like any other name. We can compute `$` buc directly with `$:inc` in the Dojo:
```hoon
> $:inc
1
```
This result may seem a bit strange. We didn't call `inc` or in any other way pass it a number. Yet using `$` buc to evaluate `inc`'s arm seems to work—sort of, anyway. Why is it giving us `1` as the return value? We can answer this question after we understand gate samples a little better.
#### The Sample
The sample of a gate is the address reserved for storing the argument(s) to the Hoon function. Although we don't know about addressing yet, you saw above that `+2` referred to the battery. The sample is always at the head of the gate's tail, `+6`. (We'll look at addressing in more depth in [the next module](./G-trees.md).)
Let's look at the gate for inc again, paying particular attention to its sample:
```hoon
> inc
< 1.mgz
[ a=@
[our=@p now=@da eny=@uvJ]
<17.bny 33.ehb 14.dyd 53.vlb 77.lrt 232.oiq 51.qbt 123.zao 46.hgz 1.pnw %140>
]
>
```
We see `a=@`. This may not be totally clear, but at least the `@` should make a little sense. This is the pretty-printer's way of indicating an atom with the face `a`. Let's take a closer look:
```hoon
> +6:inc
a=0
```
We see now that the sample of `inc` is the value `0`, and has `a` as a face. This is a placeholder value for the function argument. If you evaluate the `$` buc arm of `inc` without passing it an argument the placeholder value is used for the computation, and the return value will thus be `0+1`:
```hoon
> $:inc
1
```
The placeholder value, as you saw in the previous module, is sometimes called the bunt value. The bunt value is determined by the input type; for `@` atoms the bunt value is typically `0`.
The face value of `a` comes from the way we defined the gate above: `|=(a=@ (add 1 a))`. This was so we can use `a` to refer to the sample to generate the product with `(add 1 a)`.
#### The Context
The context of a gate contains other data that may be necessary for the `$` buc arm to evaluate correctly. The context is always located at the tail of the tail of the gate, i.e., `+7` of the gate. There is no requirement for the context to have any particular arrangement, though often it does.
Let's look at the context of inc:
```hoon
> +7:inc
[ [ our=~nec
now=~2022.6.21..19.26.59..9016
eny
0v304.vhjvs.406g0.bn6ph.ggd02.buadd.2lot0.va6q0.fiqb1.a96gj.9jmb2.6kk07.5d75s.thpbg.9idrt.vmg9j.e748l.fea0l.7ckcf.ieesj.7q6lr
]
<17.bny 33.ehb 14.dyd 53.vlb 77.lrt 232.oiq 51.qbt 123.zao 46.hgz 1.pnw %140>
]
```
This is the default Dojo subject from before we put `inc` into the subject. The `|=` bartis expression defines the context as whatever the subject is. This guarantees that the context has all the information it needs to have for the `$` buc arm to work correctly.
#### Gates Define Functions of the Sample
The value of a function's output depends solely upon the input value. This is one of the features that make functions desirable in many programming contexts. It's worth going over how Hoon function calls implement this feature.
In Hoon, one can use `(gate arg)` syntax to make a function call. For example:
```hoon
> (inc 234)
235
```
The name of the gate is `inc`. How is the `$` buc arm of inc evaluated? When a function call occurs, a copy of the `inc` gate is created, but with one modification: the sample is replaced with the function argument. Then the `$` buc arm is computed against this modified version of the `inc` gate.
Remember that the default or “bunt” value of the sample of inc is `0`. In the function call above, a copy of the `inc` gate is made but with a sample value of `234`. When `$` buc is computed against this modified core, the product is `235`.
Notice that neither the arm nor the context is modified before the arm is evaluated. That means that the only part of the gate that changes before the arm evaluation is the sample. Hence, we may understand each gate as defining a function whose argument is the sample. If you call a gate with the same sample, you'll get the same value returned to you every time.
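The copy-and-replace step can also be spelled out by hand. As a sketch, assuming `inc` is still bound in the Dojo as above, both of the following replace the sample with `234` and then compute the `$` buc arm, which is what `(inc 234)` does for us:

```hoon
> %~($ inc 234)
235
> =<($ inc(a 234))
235
```

The first form uses the `%~` censig rune, which evaluates a named arm of a core after replacing its sample. The second modifies the sample through its face `a` and then evaluates `$` against the modified gate.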
Let's unbind inc to keep the subject tidy:
```hoon
> =inc
> inc
-find.inc
```
#### Modifying the Context of a Gate
It is possible to modify the context of a gate when you make a function call; or, to be more precise, it's possible to call a _mutant copy_ of the gate in which the context is modified. To illustrate this let's use another example gate. Let's write a gate which uses a value from the context to generate the product. Bind `b` to the value 10:
```hoon
> =b 10
> b
10
```
Now let's write a gate called `ten` that adds `b` to the input value:
```hoon
> =ten |=(a=@ (add a b))
> (ten 10)
20
> (ten 20)
30
> (ten 25)
35
```
We can unbind `b` from the Dojo subject, and `ten` works just as well because it's using a copy of `b` stored in its context:
```hoon
> =b
> (ten 15)
25
> (ten 35)
45
> b.+14.ten
10
```
We can use `ten(b 25)` to produce a variant of `ten`. Calling this mutant version of ten causes a different value to be returned than we'd get with a normal `ten` call:
```hoon
> (ten(b 25) 10)
35
> (ten(b 1) 25)
26
> (ten(b 75) 100)
175
```
Before finishing the lesson let's unbind ten:
```hoon
> =ten
```
### Recursion
_Recursion_ refers to a return to the same logical point in a program again and again. It's a common pattern for solving certain problems in most programming languages, and Hoon is no exception.
In the following code, the `|-` barhep trap serves as the point of recursion, and the return to that point (with changes) is indicated by the `%=` centis. All this code does is count to the given number, then return that number.
```hoon
|=  n=@ud
=/  index  0
|-
?:  =(index n)
  index
%=($ index +(index))
```
In a formal sense, we have to make sure that there is always a base case, a way of actually ending the recursion—if there isn't, we end up with an [infinite loop](https://en.wikipedia.org/wiki/Infinite_loop)! Some children's songs like [“Yon Yonson”](https://en.wikipedia.org/wiki/Yon_Yonson) or [“The Song That Never Ends”](https://en.wikipedia.org/wiki/The_Song_That_Never_Ends) rely on such recursive humor.
> This is the song that never ends
> Yes, it goes on and on, my friends
> Some people started singing it not knowing what it was
> And they'll continue singing it forever just because...
>
> This is the song that never ends
> . . .
You need to make sure when you compose a trap that it has a base case which returns a noun. The following trap results in an infinite loop:
```hoon
=/  index  1
|-
?:  (lth index 1)  ~
$(index +(index))
```
If you find yourself caught in such a loop, press `Ctrl`+`C` to stop execution.
Recursion can be set up in different ways. A full treatment requires thinking about [algorithmic complexity and efficiency](https://en.wikipedia.org/wiki/Big_O_notation), but we can highlight some good rules of thumb here.
#### Tutorial: The Fibonacci Sequence
For instance, let's talk about calculating the [Fibonacci sequence](https://en.wikipedia.org/wiki/Fibonacci_sequence), which is a sequence of numbers wherein each is formed by adding the two previous numbers together. Thus 1, 1, 1+1→2, 1+2→3, 2+3→5, and so forth. We may write the _n_th Fibonacci number in a generic way as:
<img src="https://latex.codecogs.com/gif.image?\large&space;\dpi{110}F_n&space;=&space;F_{n-1}&space;&plus;&space;F_{n-2}" title="https://latex.codecogs.com/gif.image?\large \dpi{110}F_n = F_{n-1} + F_{n-2}" />
<!--
F_n = F_{n-1} + F_{n-2}
-->
and verify that our program correctly produces the sequence of numbers 1, 1, 2, 3, 5, 8, 13, 21, 34, 55, 89, ….
- Compose a Fibonacci sequence program which produces a `list` of the appropriate values.
We can elide some details of working with `list`s until the next lesson; simply recall that they are a way of storing multiple values in a cell of cells of cells….
The most naïve version of this calculation simply calculates all previous numbers in the sequence every time they are needed.
```hoon
|=  n=@ud
^-  @ud
?:  =(n 1)  1
?:  =(n 2)  1
(add $(n (dec n)) $(n (dec (dec n))))
```
We can use _two_ recursion points with `%=` centis. The first calculates _F_ for _n_-1; the second calculates _F_ for _n_-2. These are then added together. If we diagram what's happening, we can see that each additional number costs roughly as much as the two previous numbers combined:
```
(fibonacci 5)
(add (fibonacci 4) (fibonacci 3))
(add (add (fibonacci 3) (fibonacci 2)) (add (fibonacci 2) (fibonacci 1)))
(add (add (add (fibonacci 2) (fibonacci 1)) (fibonacci 2)) (add (fibonacci 2) (fibonacci 1)))
(add (add (add 1 1) 1) (add 1 1))
5
```
```
(fibonacci 6)
(add (fibonacci 5) (fibonacci 4))
...
(add (add (add (add (fibonacci 2) (fibonacci 1)) (fibonacci 2)) (add (fibonacci 2) (fibonacci 1))) (add (add (fibonacci 2) (fibonacci 1)) (fibonacci 2)))
(add (add (add (add 1 1) 1) (add 1 1)) (add (add 1 1) 1))
8
```
This fully recursive version of the Fibonacci calculation is very wasteful because it keeps no intermediate results.
An improved version stores each value in the sequence as an element in a list so that it can be used rather than re-calculated. We use the [`++snoc`](https://urbit.org/docs/hoon/reference/stdlib/2b#snoc) gate to append a noun to a `list`.
```hoon
|=  n=@ud
=/  index  0
=/  p  0
=/  q  1
=/  r  *(list @ud)
|-  ^-  (list @ud)
?:  =(index n)  r
~&  >  [index p q r]
%=  $
  index  +(index)
  p      q
  q      (add p q)
  r      (snoc r q)
==
```
This version is a little more complicated to compare using a diagram because of the trap, but yields something like this:
```
(fibonacci 5)
~[1]
~[1 1]
~[1 1 2]
~[1 1 2 3]
~[1 1 2 3 5]
```
The program can be improved somewhat again by prepending each new value to the head of the list (rather than using `++snoc`). This builds a list in a backwards order, so we apply the [`++flop`](https://urbit.org/docs/hoon/reference/stdlib/2b#flop) gate to flip the order of the list before we return it.
```hoon
|=  n=@ud
%-  flop
=/  index  0
=/  p  0
=/  q  1
=/  r  *(list @ud)
|-  ^-  (list @ud)
?:  =(index n)  r
%=  $
  index  +(index)
  p      q
  q      (add p q)
  r      [q r]
==
```
Why are we building the list backwards instead of just producing the list in the order we want it in the first place? Because with lists, adding an element to the end is a computationally expensive operation that gets more expensive the longer the list is, due to the fact that you need to traverse to the end of the tree. Adding an element to the front, however, is cheap. In Big-O notation, adding to the end of a list is _O_(_n_) while adding to the front is _O_(1).
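The three list operations involved can be compared directly in the Dojo. This is a small sketch using literal lists; the casts to `(list @ud)` are there only so the results print as lists:

```hoon
> `(list @ud)`[0 `(list @ud)`~[1 2 3]]
~[0 1 2 3]
> (snoc `(list @ud)`~[1 2 3] 4)
~[1 2 3 4]
> (flop `(list @ud)`~[3 2 1 0])
~[0 1 2 3]
```

Prepending is just forming one new cell, whereas `++snoc` has to walk the whole list to reach its end. `++flop` also walks the whole list, but we pay that cost once at the end rather than on every step.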
Here's our diagram:
```
(fibonacci 5)
~[1]
~[1 1]
~[2 1 1]
~[3 2 1 1]
~[5 3 2 1 1]
~[1 1 2 3 5]
```
Finally (and then we'll move along) here's a very efficient implementation, which starts with a `0` but builds the list entirely from cells, then terminates it with `~` (the null value, `0`) at the end:
```hoon
|=  n=@ud
^-  (list @ud)
=/  f0  *@ud
=/  f1=@ud  1
:-  0
|-  ^-  (list @ud)
?:  =(n 0)
  ~
[f1 $(f0 f1, f1 (add f0 f1), n (dec n))]
```
- Produce a diagram of how this last implementation yields a Fibonacci sequence for _F_₅, `(fibonacci 5)`.
#### Tutorial: Tail-Call Optimization of the Factorial Gate
The last factorial gate we produced looked like this:
```hoon
|=  n=@ud
?:  =(n 1)
  1
(mul n $(n (dec n)))
```
This example isn't a very efficient use of computing resources. The pyramid-shaped illustration from up above approximates what's happening on the _call stack_, a memory structure that tracks the instructions of the program. In our example code, every time a parent gate calls another gate, the gate being called is "pushed" to the top of the stack in the form of a frame. This process continues until a value is produced instead of a function, completing the stack.
```
                  Push order      Pop order
(fifth frame)         ^               |
(fourth frame)        |               |
(third frame)         |               |
(second frame)        |               |
(first frame)         |               V
```
Once this stack of frames is completed, frames "pop" off the stack starting at the top. When a frame is popped, it executes the contained gate and passes produced data to the frame below it. This process continues until the stack is empty, giving us the gate's output.
When a program's final expression uses the stack in this way, it's considered to be **not tail-recursive**. This usually happens when the last line of executable code calls more than one gate, our example code's `(mul n $(n (dec n)))` being such a case. That's because such an expression needs to hold each iteration of `$(n (dec n))` in memory so that it can know what to run against the `mul` function every time.
To reiterate: if you have to manipulate the result of a recursion as the last expression of your gate, as we did in our example, the function is not tail-recursive, and therefore not very efficient with memory. A problem arises when we try to recurse more times than we have space on the stack. This will result in our computation failing and producing a stack overflow. If we tried to find the factorial of `5.000.000`, for example, we would almost certainly run out of stack space.
But the Hoon compiler, like most compilers, is smart enough to notice when the last statement of a parent can reuse the same frame instead of needing to add new ones onto the stack. If we write our code properly, we can use a single frame that simply has its values replaced with each recursion.
- Change the order of the aspects of the call in such a way that the compiler can produce a more [tail-recursive](https://en.wikipedia.org/wiki/Tail_call) program.
With a bit of refactoring, we can write a version of our factorial gate that is tail-recursive and can take advantage of this feature:
```hoon
|=  n=@ud
=/  t=@ud  1
|-
^-  @ud
?:  =(n 1)  t
$(n (dec n), t (mul t n))
```
The above code should look familiar. We are still building a gate that takes one argument, a `@ud` unsigned decimal integer `n`. The `|-` here is used to create a trap (a core with one [arm](https://urbit.org/docs/glossary/arm) `$`) and immediately call it. As before, think of `|-` as the recursion point.
We then evaluate `n` to see if it is 1. If it is, we return the value of `t`. If `n` is anything other than 1, we perform our recursion:
```hoon
$(n (dec n), t (mul t n))
```
All we are doing here is recursing our new trap and modifying the values of `n` and `t`. `t` is used as an accumulator variable that we use to keep a running total for the factorial computation.
Let's use more of our pseudo-Hoon to illustrate how the stack is working in this example for the factorial of 5.
```
(factorial 5)
(|- 5 1)
(|- 4 5)
(|- 3 20)
(|- 2 60)
(|- 1 120)
120
```
We simply multiply `t` and `n` to produce the new value of `t`, and then decrement `n` before repeating. Since this `$` call is the final and solitary thing that is run in the default case and since we are doing all computation before the call, this version is properly tail-recursive. We don't need to do anything to the result of the recursion except recurse it again. That means that each iteration can be replaced instead of held in memory.
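The same accumulator pattern applies to the Fibonacci calculation from the previous tutorial. The following is a sketch, not code from that tutorial: it returns only the _n_th number rather than a list, and it assumes the convention that the zeroth number is `0`. The faces `p` and `q` are borrowed from the list-building version.

```hoon
::  Sketch: tail-recursive nth Fibonacci number.
::  p holds the current number and q holds the next one.
|=  n=@ud
=/  p=@ud  0
=/  q=@ud  1
|-  ^-  @ud
?:  =(n 0)  p
$(n (dec n), p q, q (add p q))
```

All of the new values in `$()` are computed against the old subject, so `q (add p q)` still sees the old `p`. Nothing is done to the result of the recursion, so each step can reuse the same frame instead of growing the stack the way the doubly recursive version does.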
#### Tutorial: The Ackermann Function
The [Ackermann function](https://en.wikipedia.org/wiki/Ackermann_function) is one of the earliest examples of a function that is total computable (it always terminates with a value) but not primitive recursive (it cannot be computed using only loops whose number of iterations is fixed before the loop starts).
<img src="https://latex.codecogs.com/svg.image?\large&space;\begin{array}{lcl}\operatorname{A}(0,&space;n)&space;&&space;=&space;&&space;n&space;&plus;&space;1&space;\\\operatorname{A}(m&plus;1,&space;0)&space;&&space;=&space;&&space;\operatorname{A}(m,&space;1)&space;\\\operatorname{A}(m&plus;1,&space;n&plus;1)&space;&&space;=&space;&&space;\operatorname{A}(m,&space;\operatorname{A}(m&plus;1,&space;n))\end{array}" title="https://latex.codecogs.com/svg.image?\large \begin{array}{lcl}\operatorname{A}(0, n) & = & n + 1 \\\operatorname{A}(m+1, 0) & = & \operatorname{A}(m, 1) \\\operatorname{A}(m+1, n+1) & = & \operatorname{A}(m, \operatorname{A}(m+1, n))\end{array}" />
<!--
\begin{array}{lcl}
\operatorname{A}(0, n) & = & n + 1 \\
\operatorname{A}(m+1, 0) & = & \operatorname{A}(m, 1) \\
\operatorname{A}(m+1, n+1) & = & \operatorname{A}(m, \operatorname{A}(m+1, n))
\end{array}
-->
- Compose a gate that computes the Ackermann function.
```hoon
|=  [m=@ n=@]
^-  @
?:  =(m 0)  +(n)
?:  =(n 0)  $(m (dec m), n 1)
$(m (dec m), n $(n (dec n)))
```
This gate accepts two arguments of `@` atom type and yields an atom.
There are three cases to consider:
1. If `m` is zero, return the increment of `n`.
2. If `n` is zero, decrement `m`, set `n` to 1 and recurse.
3. Else, decrement `m` and set `n` to be the value of the Ackermann function with `m` and the decrement of `n` as arguments.
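To see the three cases interact, here is a hand expansion of a small input in the same pseudo-Hoon used for the factorial diagrams. The name `ackermann` is only a label for the gate above, and the comments name the case applied at each step:

```
(ackermann 1 1)
(ackermann 0 (ackermann 1 0))    ::  case 3: m and n both nonzero
(ackermann 0 (ackermann 0 1))    ::  case 2 on the inner call
(ackermann 0 2)                  ::  case 1 on the inner call
3                                ::  case 1
```

Notice that the inner call must finish before the outer one can proceed, which is why this function cannot be made tail-recursive the way the factorial gate was.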
The Ackermann function is not terribly useful in and of itself, but it has an interesting history in mathematics. When running this function the value grows rapidly even for very small input. The value of computing this where `m` is `4` and `n` is `2` is an integer with 19,729 digits.
- Calculate some of the _m_/_n_ pairs given in [the table](https://en.wikipedia.org/wiki/Ackermann_function#Table_of_values).
#### Exercise: The Sudan Function
The [Sudan function](https://en.wikipedia.org/wiki/Sudan_function) is related to the Ackermann function.
<img src="https://latex.codecogs.com/svg.image?\large&space;\begin{array}{lll}F_0&space;(x,&space;y)&space;&&space;=&space;x&plus;y&space;\\F_{n&plus;1}&space;(x,&space;0)&space;&&space;=&space;x&space;&&space;\text{if&space;}&space;n&space;\ge&space;0&space;\\F_{n&plus;1}&space;(x,&space;y&plus;1)&space;&&space;=&space;F_n&space;(F_{n&plus;1}&space;(x,&space;y),&space;F_{n&plus;1}&space;(x,&space;y)&space;&plus;&space;y&space;&plus;&space;1)&space;&&space;\text{if&space;}&space;n\ge&space;0&space;\\\end{array}" title="https://latex.codecogs.com/svg.image?\large \begin{array}{lll}F_0 (x, y) & = x+y \\F_{n+1} (x, 0) & = x & \text{if } n \ge 0 \\F_{n+1} (x, y+1) & = F_n (F_{n+1} (x, y), F_{n+1} (x, y) + y + 1) & \text{if } n\ge 0 \\\end{array}" />
<!--
\begin{array}{lll}
F_0 (x, y) & = x+y \\
F_{n+1} (x, 0) & = x & \text{if } n \ge 0 \\
F_{n+1} (x, y+1) & = F_n (F_{n+1} (x, y), F_{n+1} (x, y) + y + 1) & \text{if } n\ge 0 \\
\end{array}
-->
- Implement the Sudan function as a gate.
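If you would like something to check your work against, here is one possible sketch. The sample order `[n x y]` and the face `prev` are assumptions of this example rather than part of the exercise:

```hoon
::  Sketch of the Sudan function F_n(x, y).
::  The sample order [n x y] and the face `prev` are illustrative.
|=  [n=@ x=@ y=@]
^-  @
?:  =(n 0)  (add x y)
?:  =(y 0)  x
::  prev is F_n(x, y-1), which appears twice in the third equation
=/  prev  $(y (dec y))
$(n (dec n), x prev, y (add prev y))
```

Pinning the inner value as `prev` means it is computed once instead of twice. As with the Ackermann gate, results grow very quickly, so test with small inputs.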

View File

@ -0,0 +1,891 @@
---
title: Trees, Addressing, and Lists
nodes: 135, 140, 156
objectives:
- "Address nodes in a tree using numeric notation."
- "Address nodes in a tree using lark notation."
- "Address data in a tree using faces."
- "Distinguish `.` and `:` notation."
- "Diagram Hoon structures such as gates into the corresponding abstract syntax tree."
- "Use lists to organize data."
- "Convert between kinds of lists (e.g. tapes)."
- "Diagram lists as binary trees."
- "Operate on list elements using `snag`, `find`, `weld`, etc."
- "Explain how Hoon manages the subject and wing search paths."
- "Explain how to skip to particular matches in a wing search path through the subject."
- "Identify common Hoon patterns: batteries, and doors, arms, wings, and legs."
---
# Trees, Addressing, and Lists
_Every noun in Urbit is an atom or a cell. This module will elaborate how we can use this fact to locate data and evaluate code in a given expression. It will also discuss the important `list` mold builder and a number of standard library operations._
## Trees
Every noun in Urbit is either an atom or a cell. Since a cell has only two elements, a head and a tail, we can derive that everything is representable as a [_binary tree_](https://en.wikipedia.org/wiki/Binary_tree). We can draw this layout naturally:
![Binary tree with labeled nodes](./binary-tree.png)
A binary tree has a single base node, and each node of the tree may have up to two child nodes (but it need not have any). A node without children is a “leaf”. You can think of a noun as a binary tree whose leaves are atoms, i.e., unsigned integers. All non-leaf nodes are cells. An atom is a trivial tree of just one node; e.g., `17`.
For instance, if we produce a cell in the Dojo
```hoon
> =a [[[8 9] [10 11]] [[12 13] [14 15]]]
```
it can be represented as a tree with the contents
![Binary tree with bottom row only populated](./binary-tree-bottom-row.png)
We will use the convention in these graphics that black-text-on-white-circle represents an address, and that green-text-on-black-circle represents the content at that address. So another way to represent the same data would be this:
![Binary tree with bottom row only populated](./binary-tree-bottom-row-full.png)
When we input the above cell representation into the Dojo, the pretty-printer hides the rightwards-branching `[]` sel/ser brackets.
```hoon
> [[[8 9] [10 11]] [[12 13] [14 15]]]
[[[8 9] 10 11] [12 13] 14 15]
```
We can refer to any data stored anywhere in this tree. The numbers in the labeled diagram above are the _numerical addresses_ of the tree, and may be extended indefinitely downwards into ever-deeper tree representations.
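The numbering follows a simple rule: the head of the node at address `n` sits at address `2n`, and its tail sits at `2n+1`. As a brief sketch, here is the same noun pinned in the Dojo (the face `a` is just the name used for it above) and a few of its addresses retrieved with the `+` lus operator:

```hoon
> =a [[[8 9] [10 11]] [[12 13] [14 15]]]
> +2:a
[[8 9] 10 11]
> +3:a
[[12 13] 14 15]
> +5:a
[10 11]
> +8:a
8
> +15:a
15
```

Address `5` is the tail of address `2` (2 × 2 + 1), and address `15` is the tail of address `7`, which is why the last row of the diagram runs from `8` to `15`.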
Most of any possible tree will be unoccupied for any actual data structure. For instance, `list`s (and thus `tape`s) are collections of values which occupy the tails of cells, leading to a rightwards-branching tree representation. (Although this may seem extravagant, it has effectively no bearing on efficiency in and of itself—that's a function of the algorithms working with the data.)
#### Exercise: Map Nouns to Tree Diagrams
- Consider each of the following nouns. Which tree diagram do they correspond to?
| Noun | Tree Diagram |
| ---- | ------------ |
| `[[[1 2] 3] 4]` | ![](./binary-tree-exercise-1.png) |
| `[[1 2] 3 4]` | ![](./binary-tree-exercise-2.png) |
| `[1 2 3 4]` | ![](./binary-tree-exercise-3.png) |
#### Exercise: Produce a List of Numbers
- Produce a generator called `list.hoon` which accepts a single `@ud` number `n` as input and produces a list of numbers from `1` up to (but not including) `n`. For example, if the user provides the number `5`, the program will produce: `~[1 2 3 4]`.
```hoon
|=  end=@
=/  count=@  1
|-
^-  (list @)
?:  =(end count)
  ~
:-  count
$(count (add 1 count))
```
In the Dojo:
```hoon
> +list 5
~[1 2 3 4]
> +list 10
~[1 2 3 4 5 6 7 8 9]
> +list 1
~
```
OK, we've seen these runes before. This time we want to focus on the list, the thing that's being built here.
This program works by having each iteration of the list create a cell. In each of these cells, the head—the cell's first position—is filled with the current-iteration value of `count`. The tail of the cell, its second position, is filled with _the product of a new iteration of our code_ that starts at `|-`. This iteration will itself create another cell, the head of which will be filled by the incremented value of `count`, and the tail of which will start another iteration. This process continues until `?:` branches to `~` (`null`). When that happens, it terminates the list and the expression ends. A built-out list of nested cells can be visualized like this:
```
[1 [2 [3 [4 ~]]]]
  .
 / \
1   .
   / \
  2   .
     / \
    3   .
       / \
      4   ~
```
### Tuples as Trees
What we've been calling a running cell would more conventionally be named a _tuple_, so we'll switch to that term now that the idea is more familiar. Basically it's a series of nested cells which doesn't necessarily end in `~`.
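As a small illustration (the values here are arbitrary), the Dojo treats a tuple and its explicitly nested, rightwards-branching form as the same noun:

```hoon
> [1 [2 [3 4]]]
[1 2 3 4]
> =([1 2 3 4] [1 [2 [3 4]]])
%.y
```

The pretty-printer simply drops the brackets that branch to the right.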
Given the cell `[1 2 3 4 ~]` (or equivalently `~[1 2 3 4]`, an irregular form for a null-terminated tuple or list), what tree address does each value occupy?
![A binary tree of the cell [1 2 3 4 ~].](./binary-tree-1234.png)
At this point, you should start to be able to work this out in your head, at least for the first few rows. The `+` lus operator can be used to return the limb of the subject at a given numeric address. If there is no such limb, the result is a crash.
```hoon
> =data ~[1 2 3 4]
> +1:data
[1 2 3 4 ~]
> +2:data
1
> +3:data
[2 3 4 ~]
> +4:data
dojo: hoon expression failed
> +6:data
2
> +7:data
[3 4 ~]
> +14:data
3
> +15:data
[4 ~]
> +30:data
4
> +31:data
~
```
### Lists as Trees
We have used lists incidentally. A `list` is an ordered arrangement of elements ending in a `~` (null). Most lists have the same kind of content in every element (for instance, a `(list @rs)`, a list of numbers with a fractional part), but some lists have many kinds of things within them. Some lists are even empty.
```hoon
> `(list @)`['a' %b 100 ~]
~[97 98 100]
```
(Notice that all values are converted to the specified aura, in this case the empty aura.)
A `list` is built with the `list` mold. A `list` is actually a _mold builder_, a gate that produces a gate. This is a common design pattern in Hoon. (Remember that a mold is a type and can be used as an enforcer: it attempts to convert any data it receives into the given structure, and crashes if it fails to do so.)
Lists are commonly written with a shorthand `~[]`:
```hoon
> `(list)`~['a' %b 100]
~[97 98 100]
```
```hoon
> `(list (list @ud))`~[~[1 2 3] ~[4 5 6]]
~[~[1 2 3] ~[4 5 6]]
```
True `list`s have `i` and `t` faces which allow the head and tail of the data to be quickly and conveniently accessed; the _head_ is the first element while the _tail_ is everything else. If something has the same _structure_ as a `list` but hasn't been explicitly labeled as such, then Hoon won't always recognize it as a `list`. In such cases, you'll need to explicitly mark it as such:
```hoon
> [3 4 5 ~]
[3 4 5 ~]
> `(list @ud)`[3 4 5 ~]
~[3 4 5]
> -:!>([3 4 5 ~])
#t/[@ud @ud @ud %~]
> -:!>(`(list @ud)`[3 4 5 ~])
#t/it(@ud)
```
A null-terminated tuple is almost the same thing as a list. (That is, to Hoon all lists are null-terminated tuples, but not all null-terminated tuples are lists. This gets rather involved in subtleties, but you should cast a value as `(list @)` or another type as appropriate whenever you need a `list`. See also [`++limo`](https://urbit.org/docs/hoon/reference/stdlib/2b#limo) which explicitly marks a null-terminated tuple as a `list`.)
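As a minimal sketch of the `i` and `t` faces in use (the face `l` is an arbitrary name chosen for this example): because a `list` may be empty, Hoon expects you to rule out the `~` case before the faces are available, which is commonly done with `?~` wutsig. Here the empty case simply crashes with `!!`:

```hoon
> =l `(list @ud)`~[3 4 5]
> ?~(l !! i.l)
3
> ?~(l !! t.l)
~[4 5]
```

Asking for `i.l` without first testing for null is rejected by the compiler, since it cannot tell whether the list has a head at all.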
## Addressing Limbs
Everything in Urbit is a binary tree. And all code in Urbit is also represented as data. One corollary of these facts is that we can access any arbitrary part of an expression, gate, core, whatever, via addressing (assuming proper permissions, of course). (In fact, we can even hot-swap parts of cores, which is how [wet gates](./Q-metals.md) work.)
There are three different ways to access values:
1. _Numeric addressing_ is useful when you know the address, rather like knowing a house's street address directly.
2. _Positional addressing_ is helpful when you don't want to figure out the room number, but you know how to navigate to the value. This is like knowing the directions somewhere even if you don't know the house number.
3. _Wing addressing_ is a way of attaching a name to the address so that you can access it directly.
### Numeric Addressing
We have already seen numeric addressing used to refer to parts of a binary tree.
![Binary tree with labeled nodes](./binary-tree.png)
Since a node is _either_ an atom (value) _or_ a cell (fork), you never have to decide if the contents of a node is a direct value or a tree: it just happens.
#### Exercise: Tapes for Text
A `tape` is one way of representing a text message in Hoon. It is written with double quotes:
```hoon
"I am the very model of a modern Major-General"
```
A `tape` is actually a `(list @t)`, a binary tree of single characters which only branches rightwards and ends in a `~`:
![](./binary-tree-tape.png)
- What are the addresses of each letter in the tree for the Gilbert & Sullivan quote above? Can you see the pattern? Can you get the address of EVERY letter through `l`?
### Positional Addressing (Lark Notation)
Much like relative directions, one can also state “left, left, right, left” or similar to locate a particular node in the tree. These are written using `-` (left) and `+` (right) alternating with `<` (left) and `>` (right).
![](binary-tree-lark.png)
Lark notation can locate a position in a tree of any size. However, it is most commonly used to grab the head or tail of a cell, e.g. in the _type spear_ (on which [more later](./L2-struct.md)):
```hoon
-:!>('hello Mars')
```
Lark notation is not preferred in modern Hoon for more than one or two elements deep, but it can be helpful when working interactively with a complicated data structure like a JSON data object.
When lark expressions resolve to the part of the subject containing an arm, they don't evaluate the arm. They simply return the indicated noun fragment of the subject, as if it were a leg.
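A short Dojo sketch may help here. The noun and the face `n` are made up for illustration; each lark expression is read left to right, one step down the tree per character:

```hoon
> =n [[1 2] 3 4]
> -:n
[1 2]
> +:n
[3 4]
> -<:n
1
> ->:n
2
> +<:n
3
> +>:n
4
> +4:n
1
```

The last line shows that `-<` and the numeric address `+4` name the same node: the head of the head.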
#### Exercise: Address the Fruit Tree
Produce the numeric and lark-notated equivalent addresses for each of the following nodes in the binary fruit tree:
![A fruit tree](./binary-tree-fruit.png)
- 🍇
- 🍌
- 🍉
- 🍏
- 🍋
- 🍑
- 🍊
- 🍍
- 🍒
There is a solution at the bottom of the page.
#### Exercise: Lark Notation
- Use a lark expression to obtain the value 6 in the following noun represented by a binary tree:
```
             .
            / \
           /   \
          /     \
         .       .
        / \     / \
       /   .   10  .
      /   / \     / \
     .   8   9   11  .
    / \             / \
   5   .           12  13
      / \
     6   7
```
- Use a lark expression to obtain the value `9` in the following noun: `[[5 6] 7 [[8 9 10] 3] 2]`.
Solutions to these exercises may be found at the bottom of this lesson.
### Wings
One can also identify a resource by a label, called a _wing_. A wing represents a depth-first search into the current subject (context). A wing is a limb resolution path into the subject. A wing expression indicates the path as a series of limb expressions separated by the `.` character. E.g.,
```hoon
inner-limb.outer-limb.limb
```
You can read this as `inner-limb` in `outer-limb` in `limb`, etc. Notice that these read left-to-right!
A wing is a resolution path pointing to a limb. It's a search path, like an index to a particular labeled part of the subject.
Here are some examples:
```hoon
> c.b:[[4 a=5] b=[c=14 15]]
14
> b.b:[b=[a=1 b=2 c=3] a=11]
2
> a.b:[b=[a=1 b=2 c=3] a=11]
1
> c.b:[b=[a=1 b=2 c=3] a=11]
3
> a:[b=[a=1 b=2 c=3] a=11]
11
> b.a:[b=[a=1 b=2 c=3] a=11]
-find.b.a
> g.s:[s=[c=[d=12 e='hello'] g=[h=0xff i=0b11]] r='howdy']
[h=0xff i=0b11]
> c.s:[s=[c=[d=12 e='hello'] g=[h=0xff i=0b11]] r='howdy']
[d=12 e='hello']
> e.c.s:[s=[c=[d=12 e='hello'] g=[h=0xff i=0b11]] r='howdy']
'hello'
> +3:[s=[c=[d=12 e='hello'] g=[h=0xff i=0b11]] r='howdy']
r='howdy'
> r.+3:[s=[c=[d=12 e='hello'] g=[h=0xff i=0b11]] r='howdy']
'howdy'
```
To locate a value in a named tuple data structure:
```hoon
> =data [a=[aa=[aaa=[1 2] bbb=[3 4]] bb=[5 6]] b=[7 8]]
> -:aaa.aa.a.data
1
```
A wing is a limb resolution path into the subject. This definition includes as a trivial case a path of just one limb. Thus, all limbs are wings, and all limb expressions are wing expressions.
We mention this because it is convenient to refer to all limbs and non-trivial wings as simply “wings”.
#### Names and Faces
A name can resolve to either an arm or a leg of the subject. Recall that arms are for computations and legs are for data. When a name resolves to an arm, the relevant computation is run and the product of the computation is produced. When a limb name resolves to a leg, the value of that leg is produced.
Hoon doesn't have variables like other programming languages do; it has _faces_. Faces are like variables in certain respects, but not in others. Faces play various roles in Hoon, but most frequently faces are used simply as labels for legs.
A face is a limb expression that consists of a series of alphanumeric characters. A face has a combination of lowercase letters, numbers, and the `-` character. Some example faces: `b`, `c3`, `var`, `this-is-kebab-case123`. Faces must begin with a letter.
There are various ways to affix a face to a limb of the subject, but for now we'll use the simplest method: `face=value`. An expression of this form is equivalent in value to simply `value`. Hoon registers the given `face` as metadata about where the value is stored in the subject, so that when that face is invoked later its data is produced.
Now we have several ways to access values:
```hoon
> b=5
b=5
> [b=5 cat=6]
[b=5 cat=6]
> -:[b=5 cat=6]
b=5
> b:[b=5 cat=6]
5
> b2:[[4 b2=5] [cat=6 d=[14 15]]]
5
> d:[[4 b2=5] [cat=6 d=[14 15]]]
[14 15]
```
To be clear, `b=5` is equivalent in value to `5`, and `[[4 b2=5] [cat=6 d=[14 15]]]` is equivalent in value to `[[4 5] 6 14 15]`. The faces are not part of the underlying noun; they're stored as metadata about address values in the subject.
```hoon
> (add b=5 1)
6
```
If you use a face that isn't in the subject you'll get a `find.[face]` crash:
```
> a:[b=12 c=14]
-find.a
[crash message]
```
You can even give faces to faces:
```hoon
> b:[b=c=123 d=456]
c=123
```
#### Duplicate Faces
There is no restriction against using the same face name for multiple limbs of the subject. This is one way in which faces aren't like ordinary variables:
```hoon
> [[4 b=5] [b=6 b=[14 15]]]
[[4 b=5] b=6 b=[14 15]]
> b:[[4 b=5] [b=6 b=[14 15]]]
5
```
Why does this return `5` rather than `6` or `[14 15]`? When a face is evaluated on a subject, a head-first binary tree search occurs starting at address `1` of the subject. If there is no matching face for address `n` of the subject, first the head of `n` is searched and then `n`'s tail. The complete search path for `[[4 b=5] [b=6 b=[14 15]]]` is:
1. `[[4 b=5] [b=6 b=[14 15]]]`
2. `[4 b=5]`
3. `4`
4. `b=5`
5. `[b=6 b=[14 15]]`
6. `b=6`
7. `b=[14 15]`
There are matches at steps 4, 6, and 7 of the total search path, but the search ends when the first match is found at step 4.
The children of legs bearing names aren't included in the search path. For example, the search path of `[[4 a=5] b=[c=14 15]]` is:
1. `[[4 a=5] b=[c=14 15]]`
2. `[4 a=5]`
3. `4`
4. `a=5`
5. `b=[c=14 15]`
Neither of the legs `c=14` or `15` is checked. Accordingly, a search for `c` of `[[4 a=5] b=[c=14 15]]` fails:
```hoon
> c:[[4 a=5] b=[c=14 15]]
-find.c [crash message]
```
In any programming paradigm, good names are valuable and collisions (repetitions, e.g. a list named `list`) are likely. If multiple values match a particular face, we need a way to distinguish them. In other words, there are cases when you don't want the limb of the first matching face. You can skip the first match by prepending `^` to the face. Upon discovery of the first match at address `n`, the search skips `n` (as well as its children) and continues the search elsewhere:
```hoon
> ^b:[[4 b=5] [b=6 b=[14 15]]]
6
```
Recall that the search path for this noun is:
1. `[[4 b=5] [b=6 b=[14 15]]]`
2. `[4 b=5]`
3. `4`
4. `b=5`
5. `[b=6 b=[14 15]]`
6. `b=6`
7. `b=[14 15]`
The second match in the search path is step 6, `b=6`, so the value at that leg is produced. You can stack `^` characters to skip more than one matching face:
```hoon
> a:[[[a=1 a=2] a=3] a=4]
1
> ^a:[[[a=1 a=2] a=3] a=4]
2
> ^^a:[[[a=1 a=2] a=3] a=4]
3
> ^^^a:[[[a=1 a=2] a=3] a=4]
4
```
When a face is skipped at some address `n`, neither the head nor the tail of `n` is searched:
```hoon
> b:[b=[a=1 b=2 c=3] a=11]
[a=1 b=2 c=3]
> ^b:[b=[a=1 b=2 c=3] a=11]
-find.^b
```
The first `b`, `b=[a=1 b=2 c=3]`, is skipped; so the entire head of the subject is skipped. The tail has no `b`; so `^b` doesn't resolve to a limb when the subject is `[b=[a=1 b=2 c=3] a=11]`.
How do you get to that `b=2`? And how do you get to the `c` in `[[4 a=5] b=[c=14 15]]`? In each case you should use a wing.
We say that the outer face has been _shadowed_ when an inner name obscures it.
If you run into `^$`, don't go look for a `^$` ketbuc rune: it's matching the outer `$` buc arm. `^$` is one way of setting up a `%=` centis loop/recursion of multiple cores with a `|-` barhep trap nested inside of a `|=` bartis gate, for instance.
### Limb Resolution Operators
There are two symbols we use to search for a face or limb:
- `.` dot resolves the wing path into the current subject.
- `:` col resolves the wing path with the right-hand-side as the subject.
Logically, `a:b` is two operations, while `a.b` is one operation. The compiler is smart about `:` col wing resolutions and reduces it to a regular lookup, though.
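To make the difference concrete, here is a brief Dojo sketch with a made-up noun pinned as `x`. All three lookups reach the same leg; they differ only in whether the path is resolved as one wing or as successive changes of subject:

```hoon
> =x [a=[b=1 c=2] d=3]
> b.a.x
1
> b.a:x
1
> b:a:x
1
> d:x
3
```

`b.a.x` is a single wing resolved against the current subject, while `b.a:x` first makes `x` the subject and then resolves the wing `b.a` inside it.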
### What `%=` Does
Now we're equipped to go back and examine the syntax of the `%=` centis rune we have been using for recursion: it _resolves a wing with changes_, which in this particular case means that it takes the `$` (default) arm of the trap core, applies certain changes, and re-evaluates the expression.
```hoon
|=  n=@ud
|-
~&  n
?:  =(n 1)
  n
%+  mul
  n
$(n (dec n))
```
The `$()` syntax is the commonly-used irregular form of the [`%=` centis](https://urbit.org/docs/hoon/reference/rune/cen#centis) rune.
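For comparison, here is a sketch of the same factorial gate with the recursion spelled out in the regular tall form of `%=` centis instead of `$()` (the `~&` debugging line is omitted for brevity). The wing being resolved is `$`, and each following pair names a leg to change and its new value:

```hoon
|=  n=@ud
|-
?:  =(n 1)
  n
%+  mul
  n
%=  $
  n  (dec n)
==
```

Both spellings are intended to behave the same way; the irregular form is simply more compact when only one or two values change.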
Now, we noted that `$` buc is the default arm for the trap. It turns out that `$` is also the default arm for some other structures, like the gate! That means we can cut out the trap, in the factorial example, and write something more compact like this:
```hoon
|=  n=@ud
?:  =(n 1)
  1
(mul n $(n (dec n)))
```
It's far more common to just use a trap, but you will see `$` buc used to manipulate a core in many in-depth code instances.
### Expanding the Runes
`|=` bartis produces a gate. It actually expands to
```hoon
=|  a=spec
|%  ++  $  b=hoon
--
```
where `=|` tisbar means to add its sample to the current subject with the given face.
Similarly, `|-` barhep produces a core with one arm `$`. How could you write that in terms of `|%` and `++`?
#### Example: Number to Digits
- Compose a generator which accepts a number as `@ud` unsigned decimal and returns a list of its digits.
One verbose Hoon program:
```hoon
!:
|=  [n=@ud]
=/  values  *(list @ud)
|-  ^-  (list @ud)
?:  (lte n 0)  values
%=  $
  n       (div n 10)
  values  (weld ~[(mod n 10)] values)
==
```
Save this as a file `/gen/num2dig.hoon`, `|commit %base`, and run it:
```hoon
> +num2dig 1.000
~[1 0 0 0]
> +num2dig 123.456.789
~[1 2 3 4 5 6 7 8 9]
```
A more idiomatic solution would use the `^` ket infix to compose a cell and build the list from the head first. (This saves a call to `++weld`.)
```hoon
!:
|=  [n=@ud]
=/  values  *(list @ud)
|-  ^-  (list @ud)
?:  (lte n 0)  values
%=  $
  n       (div n 10)
  values  (mod n 10)^values
==
```
A further tweak maps to `@t` ASCII characters instead of the digits.
```hoon
!:
|=  [n=@ud]
=/  values  *(list @t)
|-  ^-  (list @t)
?:  (lte n 0)  values
%=  $
  n       (div n 10)
  values  (@t (add 48 (mod n 10)))^values
==
```
(Notice that we apply `@t` as a mold gate rather than using the tic notation. This is because `^` ket is a rare case where the order of evaluation of operators would cause the intuitive writing to fail.)
- Extend the above generator so that it accepts a cell of type and value (a `vase` as produced by the [`!>` zapgar](https://urbit.org/docs/hoon/reference/rune/zap#-zapgar) rune). Use the type to determine which number base the digit string should be constructed from; e.g. `+num2dig !>(0xdead.beef)` should yield `~['d' 'e' 'a' 'd' 'b' 'e' 'e' 'f']`.
#### Exercise: Resolving Wings
Enter the following into dojo:
```hoon
=a [[[b=%bweh a=%.y c=8] b="no" c="false"] 9]
```
- Test your knowledge from this lesson by evaluating the following expressions and then checking your answer in the dojo or see the solutions below.
1. `b:a(a [b=%skrt a="four"])`
2. `^b:a(a [b=%skrt a="four"])`
3. `^^b:a(a [b=%skrt a="four"])`
4. `b.a:a(a [b=%skrt a="four"])`
5. `a.a:a(a [b=%skrt a="four"])`
6. `+.a:a(a [b=%skrt a="four"])`
7. `a:+.a:a(a [b=%skrt a="four"])`
8. `a(a a)`
9. `b:-<.a(a a)`
10. How many times does the atom `9` appear in `a(a a(a a))`?
The answers are at the bottom of the page.
### List operations
Once you have your data in the form of a `list`, there are a lot of tools available to manipulate and analyze the data:
- [`++flop`](https://urbit.org/docs/hoon/hoon-school/lists#flop) reverses the order of the elements (exclusive of the `~`):
```hoon
> (flop ~[1 2 3 4 5])
~[5 4 3 2 1]
```
**Exercise: `++flop` Yourself**
- Without using flop, write a gate that takes a `(list @)` and returns it in reverse order. There is a solution at the bottom of the page.
- [`++sort`](https://urbit.org/docs/hoon/hoon-school/lists#sort) uses a `list` and a comparison function (like `++lth`) to order things:
```hoon
> (sort ~[1 3 5 2 4] lth)
~[1 2 3 4 5]
```
- [`++snag`](https://urbit.org/docs/hoon/hoon-school/lists#snag) takes an index and a `list` to grab out a particular element (note that it starts counting at zero):
```hoon
> (snag 0 `(list @)`~[11 22 33 44])
11
> (snag 1 `(list @)`~[11 22 33 44])
22
> (snag 3 `(list @)`~[11 22 33 44])
44
> (snag 3 "Hello!")
'l'
> (snag 1 "Hello!")
'e'
> (snag 5 "Hello!")
'!'
```
- [`++weld`](https://urbit.org/docs/hoon/hoon-school/lists#weld) takes two lists of the same type and concatenates them:
```hoon
> (weld ~[1 2 3] ~[4 5 6])
~[1 2 3 4 5 6]
> (weld "Happy " "Birthday!")
"Happy Birthday!"
```
**Exercise: `++weld` Yourself**
- Without using weld, write a gate that takes a `[(list @) (list @)]` of which the product is the concatenation of these two lists. There is a solution at the bottom of the page.
There are a couple of sometimes-useful `list` builders:
- [`++gulf`](https://urbit.org/docs/hoon/reference/stdlib/2b#gulf) spans between two numeric values (inclusive of both):
```hoon
> (gulf 5 10)
~[5 6 7 8 9 10]
```
- [`++reap`](https://urbit.org/docs/hoon/reference/stdlib/2b#reap) repeats a value many times in a `list`:
```hoon
> (reap 5 0x0)
~[0x0 0x0 0x0 0x0 0x0]
> (reap 8 'a')
<|a a a a a a a a|>
> `tape`(reap 8 'a')
"aaaaaaaa"
> (reap 5 (gulf 5 10))
~[~[5 6 7 8 9 10] ~[5 6 7 8 9 10] ~[5 6 7 8 9 10] ~[5 6 7 8 9 10] ~[5 6 7 8 9 10]]
```
- [`++roll`](https://urbit.org/docs/hoon/reference/stdlib/2b#roll) takes a list and a gate, and accumulates a value of the list items using that gate. For example, if you want to add or multiply all the items in a list of atoms, you would use roll:
```hoon
> (roll `(list @)`~[11 22 33 44 55] add)
165
> (roll `(list @)`~[11 22 33 44 55] mul)
19.326.120
```
Once you have a `list` (including a `tape`), there are a lot of manipulation tools you can use to extract data from it or modify it:
- [`++find`](https://urbit.org/docs/hoon/reference/stdlib/2b#find) `[nedl=(list) hstk=(list)]` locates a sublist (`nedl`, needle) in the list (`hstk`, haystack)
- [`++snag`](https://urbit.org/docs/hoon/reference/stdlib/2b#snag) `[a=@ b=(list)]` produces the element at an index in the list (zero-indexed)
```hoon
> (snag 0 `(list @)`~[11 22 33 44])
11
> (snag 1 `(list @)`~[11 22 33 44])
22
> (snag 3 `(list @)`~[11 22 33 44])
44
```
- [`++snap`](https://urbit.org/docs/hoon/reference/stdlib/2b#snap) `[a=(list) b=@ c=*]` replaces the element at an index in the list (zero-indexed) with something else
- [`++scag`](https://urbit.org/docs/hoon/reference/stdlib/2b#scag) `[a=@ b=(list)]` produces the first _a_ elements from the front of the list
- [`++slag`](https://urbit.org/docs/hoon/reference/stdlib/2b#slag) `[a=@ b=(list)]` drops the first _a_ elements and produces the rest of the list
- [`++weld`](https://urbit.org/docs/hoon/reference/stdlib/2b#weld) `[a=(list) b=(list)]` glues two `list`s together (_not_ a single item to the end)
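A few of these gates in action, as a quick Dojo sketch with arbitrary sample values. Note that `++find` produces a `unit` (either `~` or a cell of `~` and the zero-based index), and that `++scag` and `++slag` split a list at the same point: one keeps the front, the other keeps what is left after the front is dropped.

```hoon
> (find "ll" "hello")
[~ 2]
> (find "z" "hello")
~
> (snap `(list @)`~[11 22 33 44] 1 99)
~[11 99 33 44]
> (scag 2 `(list @)`~[11 22 33 44])
~[11 22]
> (slag 2 `(list @)`~[11 22 33 44])
~[33 44]
```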
There are a few more that you should pick up eventually, but these are enough to get you started.
Using what we know to date, most operations that we would do on a collection of data require a trap.
#### Exercise: Evaluating Expressions
- Without entering these expressions into the Dojo, what are the products of the following expressions?
```hoon
> (lent ~[1 2 3 4 5])
> (lent ~[~[1 2] ~[1 2 3] ~[2 3 4]])
> (lent ~[1 2 (weld ~[1 2 3] ~[4 5 6])])
```
#### Exercise: Welding Nouns
First, bind these faces.
```hoon
> =b ~['moon' 'planet' 'star' 'galaxy']
> =c ~[1 2 3]
```
- Determine whether the following Dojo expressions are valid, and if so, what they evaluate to.
```hoon
> (weld b b)
> (weld b c)
> (lent (weld b c))
> (add (lent b) (lent c))
```
#### Exercise: Palindrome
- Write a gate that takes in a list `a` and returns `%.y` if `a` is a palindrome and `%.n` otherwise. You may use the `++flop` function.
---
#### Solutions to Exercises
- Fruit Tree:
- 🍇 `9` or `-<+`
- 🍌 `11` or `->+`
- 🍉 `12` or `+<-`
- 🍏 `16` or `-<-<`
- 🍋 `27` or `+<+>`
- 🍑 `42` or `->->-`
- 🍊 `62` or `+>+>-`
- 🍍 `87` or `->->+>`
- 🍒 `126` or `+>+>+<`
- Resolving Lark Expressions
```hoon
> =b [[[5 6 7] 8 9] 10 11 12 13]
> -<+<:b
6
```
- Resolving Wing Expressions
1. `%bweh`
2. `"no"`
3. Error: `ford: %slim failed:`
4. `%skrt`
5. `"four"`
6. `a="four"` - Note that this is different from the above!
7. `"four"`
8. `[[[b=%bweh a=[[[b=%bweh a=%.y c=8] b="no" c="false"] 9] c=8] b="no" c="false"] 9]`
9. `%bweh`
10. `9` appears 3 times:
```hoon
> a(a a(a a))
[[[b=%bweh a=[[[b=%bweh a=[[[b=%bweh a=%.y c=8] b="no" c="false"] 9] c=8] b="no" c="false"] 9] c=8] b="no" c="false"] 9]
```
- Roll-Your-Own-`++flop`:
```hoon
::  /gen/flop.hoon
::
|=  a=(list @)
=|  b=(list @)
|-  ^-  (list @)
?~  a  b
$(b [i.a b], a t.a)
```
- Roll-Your-Own-`++weld`:
```hoon
::  /gen/weld.hoon
::
|=  [a=(list @) b=(list @)]
|-  ^-  (list @)
?~  a  b
[i.a $(a t.a)]
```
- `++lent` expressions
Running each one in the Dojo:
```hoon
> (lent ~[1 2 3 4 5])
5
> (lent ~[~[1 2] ~[1 2 3] ~[2 3 4]])
3
> (lent ~[1 2 (weld ~[1 2 3] ~[4 5 6])])
3
```
- `++weld` expressions
Running each one in the Dojo:
```hoon
> (weld b b)
<|moon planet star galaxy moon planet star galaxy|>
```
This will not run because `weld` expects the elements of both lists to be of the same type:
```hoon
> (weld b c)
```
This also fails for the same reason, but it is important to note that in some languages that are more lazily evaluated, such an expression would still work since it would only look at the length of `b` and `c` and not worry about what the elements were. In that case, it would return `7`.
```hoon
> (lent (weld b c))
```
We see here the correct way to find the sum of the length of two lists of unknown type.
```hoon
> (add (lent b) (lent c))
7
```
- Palindrome
```hoon
::  palindrome.hoon
::
|=  a=(list)
=(a (flop a))
```

View File

@ -0,0 +1,475 @@
---
title: Libraries
nodes: 145, 153, 175
objectives:
- "Import a library using `/+` faslus."
- "Create a new library in `/lib`."
- "Identify the role of a desk in the Clay filesystem."
- "Identify the components of a beak."
- "Identify filesystem locations (including desks)."
- "Identify the components of a path."
- "Build code samples with `-build-file` thread."
- "Discuss Ford import runes."
---
# Libraries
_Libraries allow you to import and share processing code. This module will discuss how libraries can be produced, imported, and used._
## Importing a Library
If you have only built generators, you will sooner or later become frustrated with the apparent requirement that you manually reproduce helper cores and arms every time you need them in a different generator. Libraries are cores stored in `/lib` which provide access to arms and legs (operations and data). While the Hoon standard library is directly available in the regular subject, many other elements of functionality have been introduced by software authors.
### Building Code Generally
A generator gives us on-demand access to code, but it is helpful to load and use code from files while we work in the Dojo.
A conventional library import with [`/+` faslus](https://urbit.org/docs/arvo/ford/ford#ford-runes) will work in a generator or another file, but won't work in Dojo, so you can't use `/+` faslus interactively. The first line of many generators will include an import line like this:
```hoon
/+  number-to-words
```
Subsequent invocations of the core require you to refer to it by name:
**/gen/n2w.hoon**
```hoon
/+  number-to-words
|=  n=@ud
(to-words:eng-us:numbers:number-to-words n)
```
Since `/` fas runes don't work in the Dojo, you need to instead use the `-build-file` thread to load the code. Most commonly, you will do this with library code when you need a particular gate's functionality for interactive coding.
`-build-file` accepts a file path and returns the built operational code. For instance:
```hoon
> =ntw -build-file %/lib/number-to-words/hoon
> one-hundred:numbers:ntw
100
> (to-words:eng-us:numbers:ntw 19)
[~ "nineteen"]
```
There are also a number of other import runes which make library, structure, and mark code available to you. For now, the only one you need to worry about is `/+` faslus.
For simplicity, everything we do will take place on the `%base` desk for now. We will learn how to create a library in a subsequent lesson.
> ### Loading a Library
>
> In a generator, load the `number-to-words` library using the
> `/+` faslus rune. (This must take place at the very top of
> your file.)
>
> Use this to produce a gate which accepts an unsigned decimal
> integer and returns the text interpretation of its increment.
{: .challenge}
### Helper Cores
Another common design pattern besides creating a library is to sequester core-specific behavior in a helper core, which sits next to the interface operations. Two runes are used to compose expressions together so that the subject has everything it needs to carry out the desired calculations.
- [`=>` tisgar](https://urbit.org/docs/hoon/reference/rune/tis#-tisgar) composes two expressions so that the first is included in the second's subject (and thus can see it).
- [`=<` tisgal](https://urbit.org/docs/hoon/reference/rune/tis#-tisgal) inverts the order of composition, allowing heavier helper cores to be composed after the core's logic but still be available for use.
Watch for these being used in generators and libraries over the next few modules.
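As a minimal sketch of the helper-core pattern (the arm name `double` and the gate itself are invented for illustration, not taken from any library), `=<` tisgal lets the short "main" expression come first while the heavier core that supports it sits underneath:

```hoon
::  a gate that relies on a helper core composed in with =<
::
|=  n=@ud
=<  (double n)
|%
++  double
  |=  a=@ud
  (mul 2 a)
--
```

Saved as a generator, this would be expected to produce twice its argument. Writing it with `=>` tisgar instead would require placing the `|%` core before the expression that calls `double`.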
#### Exercise: A Playing Card Library
In this exercise, we examine a library that can be used to represent a deck of 52 playing cards. The core below builds such a library, and can be accessed by programs. You should recognize most of the things this program does aside from the `++shuffle-deck` arm which uses a [door](./K-doors.md) to produce [randomness](./N-subject.md). This is fairly idiomatic Hoon and it relies a lot on the convention that heavier code should be lower in the expression. This means that instead of `?:` wutcol you may see [`?.` wutdot](https://urbit.org/docs/hoon/reference/rune/wut#-wutdot), which inverts the order of the true/false arms, as well as other new constructions.
```hoon
|%
+$  suit  ?(%hearts %spades %clubs %diamonds)
+$  darc  [sut=suit val=@ud]
+$  deck  (list darc)
++  make-deck
  ^-  deck
  =/  mydeck  *deck
  =/  i  1
  |-
  ?:  (gth i 4)
    mydeck
  =/  j  2
  |-
  ?.  (lte j 14)
    ^$(i +(i))
  %=  $
    j       +(j)
    mydeck  [[(num-to-suit i) j] mydeck]
  ==
++  num-to-suit
  |=  val=@ud
  ^-  suit
  ?+  val  !!
    %1  %hearts
    %2  %spades
    %3  %clubs
    %4  %diamonds
  ==
++  shuffle-deck
  |=  [unshuffled=deck entropy=@]
  ^-  deck
  =/  shuffled  *deck
  =/  random  ~(. og entropy)
  =/  remaining  (lent unshuffled)
  |-
  ?:  =(remaining 1)
    :_  shuffled
    (snag 0 unshuffled)
  =^  index  random  (rads:random remaining)
  %=  $
    shuffled    [(snag index unshuffled) shuffled]
    remaining   (dec remaining)
    unshuffled  (oust [index 1] unshuffled)
  ==
++  draw
  |=  [n=@ud d=deck]
  ^-  [hand=deck rest=deck]
  :-  (scag n d)
  (slag n d)
--
```
The `|%` barcen core created at the top of the file contains the entire library's code, and is closed by `--` hephep on the last line.
To create three types we're going to need, we use `+$` lusbuc, which is an arm used to define a type.
- `+$ suit ?(%hearts %spades %clubs %diamonds)` defines `+$suit`, which can be either `%hearts`, `%spades`, `%clubs`, or `%diamonds`. It's a type union created by the irregular form of `$?` bucwut.
- `+$ darc [sut=suit val=@ud]` defines `+$darc`, which is a pair of `suit` and a `@ud`. By pairing a suit and a number, it represents a particular playing card, such as “nine of hearts”. Why do we call it `darc` and not `card`? Because `card` already has a meaning in Gall, the Arvo app module, where one would likely use this (or any) library. It's worthwhile to avoid any confusion over names.
- `+$ deck (list darc)` is simply a `list` of `darc`.
One way to get a feel for how a library works is to skim the `++` luslus arm-names before diving into any specific arm. In this library, the arms are `++make-deck`, `++num-to-suit`, `++shuffle-deck`, and `++draw`. These names should be very clear, with the exception of `++num-to-suit` (although you could hazard a guess at what it does). Let's take a closer look at it first:
```hoon
++  num-to-suit
  |=  val=@ud
  ^-  suit
  ?+  val  !!
    %1  %hearts
    %2  %spades
    %3  %clubs
    %4  %diamonds
  ==
```
`++num-to-suit` defines a gate which takes a single `@ud` unsigned decimal integer and produces a `suit`. The [`?+` wutlus](https://urbit.org/docs/hoon/reference/rune/wut#-wutlus) rune creates a structure to switch against a value with a default in case there are no matches. (Here the default is to crash with [`!!` zapzap](https://urbit.org/docs/hoon/reference/rune/zap#-zapzap).) We then have options 1 through 4, each resulting in a different suit.
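As a point of comparison, here is a sketch of the inverse conversion. The arm name `++suit-to-num` is hypothetical (it is not part of the library above) and would need to live in the same core to see `suit`. Because a `suit` can only be one of four known values, `?-` wuthep can be used without a default case:

```hoon
::  hypothetical arm, not part of the library above
++  suit-to-num
  |=  s=suit
  ^-  @ud
  ?-  s
    %hearts    1
    %spades    2
    %clubs     3
    %diamonds  4
  ==
```

`++num-to-suit` needs `?+` wutlus with a default because its input is any `@ud`, most of which do not correspond to a suit.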
```hoon
++  make-deck
  ^-  deck
  =/  mydeck  *deck
  =/  i  1
  |-
  ?:  (gth i 4)
    mydeck
  =/  j  2
  |-
  ?.  (lte j 14)
    ^$(i +(i))
  %=  $
    j       +(j)
    mydeck  [[(num-to-suit i) j] mydeck]
  ==
```
`++make-deck` assembles a deck of 52 cards by cycling through every possible suit and number and combining them. It uses `++num-to-suit` and a couple of loops to go through the counters. It has an interesting `^$` loop skip: when `j` is greater than 14, it jumps instead to the outer loop, incrementing `i`.
[`?.` wutdot](https://urbit.org/docs/hoon/reference/rune/wut#-wutdot) may be an unfamiliar rune; it is simply the inverted version of `?:` wutcol, so the first branch is actually the if-false branch and the second is the if-true branch. This is done to keep the “heaviest” branch at the bottom, which makes for more idiomatic and readable Hoon code.
```hoon
++  draw
  |=  [n=@ud d=deck]
  ^-  [hand=deck rest=deck]
  :-  (scag n d)
  (slag n d)
```
`++draw` takes two arguments: `n`, an unsigned integer, and `d`, a `deck`. The gate will produce a cell of two `deck`s using [`++scag`](https://urbit.org/docs/hoon/reference/stdlib/2b#scag) and [`++slag`](https://urbit.org/docs/hoon/reference/stdlib/2b#slag). `++scag` is a standard library gate that produces the first `n` elements from a list, while `++slag` is a standard library gate that produces the remaining elements of a list starting after the `n`th element. So we use `++scag` to produce the drawn hand of `n` cards in the head of the cell as `hand`, and `++slag` to produce the remaining deck in the tail of the cell as `rest`.
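To see the split that `++draw` relies on, you can try `++scag` and `++slag` in the Dojo on a small list of numbers (the list here is just an illustrative stand-in for a deck):

```hoon
> (scag 2 `(list @ud)`~[1 2 3 4 5])
~[1 2]
> (slag 2 `(list @ud)`~[1 2 3 4 5])
~[3 4 5]
```

Together the two results always account for the whole list, which is why `++draw` never loses or duplicates a card.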
```hoon
++  shuffle-deck
  |=  [unshuffled=deck entropy=@]
  ^-  deck
  =/  shuffled  *deck
  =/  random  ~(. og entropy)
  =/  remaining  (lent unshuffled)
  |-
  ?:  =(remaining 1)
    :_  shuffled
    (snag 0 unshuffled)
  =^  index  random  (rads:random remaining)
  %=  $
    shuffled    [(snag index unshuffled) shuffled]
    remaining   (dec remaining)
    unshuffled  (oust [index 1] unshuffled)
  ==
```
Finally we come to `++shuffle-deck`. This gate takes two arguments: a `deck`, and a `@` as a bit of `entropy` to seed the `og` random-number core. It will produce a `deck`.
We add a bunted `deck`, then encounter a very interesting statement that you haven't run into yet. This is the irregular form of [`%~` censig](https://urbit.org/docs/hoon/reference/rune/cen#-censig), which “evaluates an arm in a door.” For our purposes now, you can see it as a way of creating a random-value arm that we'll use later on with `++rads:random`.
With `=/ remaining (lent unshuffled)`, we get the length of the unshuffled deck with [`++lent`](https://urbit.org/docs/hoon/reference/stdlib/2b#lent).
`?: =(remaining 1)` checks if we have only one card remaining. If that's true, we prepend the one card left in `unshuffled` to `shuffled` and produce the result. We use the [`:_` colcab](https://urbit.org/docs/hoon/reference/rune/col#-colcab) rune here, so that the “heavier” expression is at the bottom.
If the above conditional evaluates to `%.n` false, we need to do a little work. [`=^` tisket](https://urbit.org/docs/hoon/reference/rune/tis#-tisket) is a rune that pins the head of a pair and changes a leg in the subject with the tail. It's useful for interacting with the `og` core arms, as many of them produce a pair of a random number and the next state of the core. We're going to put the random number in the subject with the face `index` and change `random` to be the next core.
With that completed, we use `%=` centis to call `$` buc to recurse back up to `|-` barhep with a few changes:
- `shuffled` gets the `darc` from `unshuffled` at `index` added to the front of it.
- `remaining` gets decremented. Why are we using a counter here instead of just checking the length of `unshuffled` on each loop? `lent` traverses the entire list every time it's called so maintaining a counter in this fashion is much faster.
- `unshuffled` becomes the result of using `oust` to remove 1 `darc` at `index` on `unshuffled`.
This is a very naive shuffling algorithm. We leave the implementation of a better shuffling algorithm as an exercise for the reader.
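The `=^` tisket pattern used in `++shuffle-deck` can also be tried on its own. The following is a minimal sketch meant to be entered in the Dojo, where `eny` is available; the faces `rng`, `a`, and `b` are arbitrary names chosen for illustration:

```hoon
=/  rng  ~(. og eny)
::  each call pins a random number and replaces rng with the next core
=^  a  rng  (rads:rng 52)
=^  b  rng  (rads:rng 52)
[a b]
```

Because `rng` is replaced after every call, the second draw does not simply repeat the first. If you called `(rads:rng 52)` twice without updating `rng`, you would get the same number both times.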
#### Exercise: Using the Playing Card Library
Unfortunately `/` fas runes don't work in the Dojo right now, so we need to build code using the `-build-file` thread if we want to use the library directly.
- Import the `/lib/playing-cards.hoon` library and use it to shuffle and show a deck and a random hand of five cards.
We first import the library:
```hoon
> =playing-cards -build-file /===/lib/playing-cards/hoon
```
We then invoke it using the _entropy_ or system randomness. (This is an unpredictable value we will use when we want a process to be random. We will discuss it in detail when we talk about [subject-oriented programming](./N-subject.md).)
```hoon
> =deck (shuffle-deck:playing-cards make-deck:playing-cards eny)
> deck
~[
[sut=%spades val=12]
[sut=%spades val=8]
[sut=%hearts val=5]
[sut=%clubs val=2]
[sut=%diamonds val=10]
...
[sut=%spades val=2]
[sut=%hearts val=6]
[sut=%hearts val=12]
]
```
Draw a hand of five cards from the deck:
```hoon
> (draw:playing-cards 5 deck)
[ hand
~[
[sut=%spades val=12]
[sut=%spades val=8]
[sut=%hearts val=5]
[sut=%clubs val=2]
[sut=%diamonds val=10]
]
rest
~[
[sut=%hearts val=2]
[sut=%clubs val=7]
[sut=%clubs val=9]
[sut=%diamonds val=6]
[sut=%diamonds val=8]
...
[sut=%spades val=2]
[sut=%hearts val=6]
[sut=%hearts val=12]
]
]
```
Of course, since the deck was shuffled once, any time we draw from the same deck we will get the same hand. But if we replace the deck with the `rest` remaining, then we can continue to draw new hands.
## Desks
A [desk](https://urbit.org/docs/glossary/desk) organizes a collection of files, including generators, libraries, agents, and system code, into one coherent bundle. A desk is similar to a file drive in a conventional computer, or a Git branch. Desks are supported by the Clay vane in Arvo, the Urbit OS.
At this point, you've likely only worked on the `%base` desk. You can see data about any particular desk using the `+vat` generator:
```hoon
> +vat %base
%base
/sys/kelvin: [%zuse 418]
base hash: ~
%cz hash: 0v2.r1lbp.i9jr2.hosbi.rvg16.pqe7u.i3hnp.j7k27.9jsgv.8k7rp.oi98q
app status: running
force on: ~
force off: ~
publishing ship: ~
updates: local
source ship: ~
source desk: ~
source aeon: ~
pending updates: ~
```
You'll see a slightly different configuration on the particular ship you are running.
### Aside: Filesystems
A filesystem is responsible for providing access to blobs of data somewhere on a disk drive. If you have worked with Windows or macOS, you have become accustomed to using a file browser to view and interact with files. Mobile devices tend to obscure the nature of files more, in favor of just providing an end-user interface for working with or viewing the data. To use files effectively, you need to know a few things:
1. How to identify the data.
2. How to locate the data.
3. How to read or interpret the data.
Files are identified by a _file name_, which is typically a short descriptor like `Waterfall Visit 5.jpg` (if produced by a human) or `DSC_54694.jpg` (if produced by a machine).
Files are located using the _path_ or _file path_. Colloquially, this is what we mean when we ask which folder or directory a file is located in. It's an address that users and programs can use to uniquely locate a particular file, even if that file has the same name as another file.
An Earth filesystem and path orients itself around some key metaphor:
- Windows machines organize the world by drive, e.g. `C:\`.
- Unix machines (including macOS and Linux) organize the world from `/`, the root directory.
**Absolute paths** are like street addresses, or latitude and longitude. They let you unambiguously locate a file or folder. **Relative paths** are more like informal (but correct) instructions: “It's on the right just three houses past the church.” They are often shorter but require the user to know the starting point.
Once you have located a particular file, you need to load the data. Conventionally, _file extensions_ indicate what kind of file you are dealing with: `.jpg`, `.png`, and `.gif` are image files, for instance; `.txt`, `.docx`, and `.pdf` are different kinds of documents; and `.mp3` and `.ogg` are audio files. Simply changing the extension on the file doesn't change the underlying data, but it can either elicit a stern warning from the OS or confuse it, depending on the OS. Normally you have to open the file in an appropriate program and save it as a new type if such a conversion is possible.
### File Data in Urbit
On Mars, we treat a filesystem as a way of organizing arbitrary access to blocks of persistent data. There are some concessions to Earth-style filesystems, but Clay (Urbit's filesystem) organizes everything with respect to a `desk`, a discrete collection of static data on a particular ship. Of course, like everything else in Hoon, a desk is a tree as well.
So far everything we have done has taken place on the `%base` desk. You have by this point become proficient at synchronizing Earthling data (Unix data) and Martian data (Urbit data), using `|mount` and `|commit`, and every time you've done this with `%base` that has been recorded in the update report the Dojo makes to you.
```hoon
> |commit %base
kiln: commit detected at %base (local)
+ /~zod/base/2/gen/demo/hoon
```
This message says that a file `demo.hoon` was added to the Urbit filesystem at the path in `/gen`. What is the rest of it, though, the first three components? We call this the _beak_. The beak lets Clay globally identify any resource on any ship at any point in time. A beak has three components:
1. The **ship**, here `~zod`. (You can find this out on any ship using `our`.)
2. The **desk**, here `%base`.
3. A **revision number** or **timestamp**, here `2`. (The current system time is available as `now`.) Clay tracks the history of each file, so older versions can be accessed by their revision number. (This is uncommon to need to do today.)
The beak is commonly constructed with the `/` fas prefix and `=` tis signs for the three components:
```hoon
> /===
[~.~zod ~.base ~.~2022.6.14..18.13.35..ccaf ~]
```
Any one of those can be replaced as necessary:
```hoon
> /=sandbox=
[~.~zod %sandbox ~.~2022.6.14..18.14.49..a3da ~]
```
You'll also sometimes see `%` cen stand in for the whole beak, including the “current” desk. The current desk is a Dojo concept, since for Clay we can access any desk at any time (with permission).
```hoon
> %
[~.~zod ~.base ~.~2022.6.14..18.15.10..698c ~]
```
#### Paths and Files
A `path` is a `(list @ta)`, a list of text identifiers. The first three are always the beak and the last one conventionally refers to the mark by which the file is represented.
For instance, the `+cat` generator displays the contents of any path, e.g.
```hoon
> +cat /===/gen/ls/hoon
/~zod/base/~2022.6.14..18.16.53..2102/gen/ls/hoon
:: LiSt directory subnodes
::
:::: /hoon/ls/gen
::
/? 310
/+ show-dir
::
::::
::
~& %
:- %say
|= [^ [arg=path ~] vane=?(%g %c)]
=+ lon=.^(arch (cat 3 vane %y) arg)
tang+[?~(dir.lon leaf+"~" (show-dir vane arg dir.lon))]~
```
If no data are located at the given path, `+cat` simply shows `~` null:
```hoon
> +cat /=garden=/gen/ls/hoon
~ /~zod/garden/~2022.6.14..18.17.16..07ff/gen/ls/hoon
```
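The `=` tis shorthand for the beak is a Dojo convenience. Elsewhere you will often see the same kind of path spelled out by converting each component to text with `++scot`. A sketch of the long form, using `our` and `now` as they are available in the Dojo:

```hoon
> /(scot %p our)/base/(scot %da now)/gen/ls/hoon
```

This should produce a path of the same shape as `/===/gen/ls/hoon`, with the ship, desk, and timestamp filled in explicitly.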
Every desk has a standard directory structure:
- `/app` for agents
- `/gen` for generators
- `/lib` for library and helper files
- `/mar` for marks
- `/sur` for shared structures
- `/ted` for threads
To run a generator from a different desk in Dojo, you need to prefix the desk name to the generator; to run `/=landscape=/gen/tally/hoon`, you would say:
```hoon
> +landscape!tally
tallied your activity score! find the results below.
to show non-anonymized resource identifiers, +tally |
counted from groups and channels that you are hosting.
groups are listed with their member count.
channels are listed with activity from the past week:
- amount of top-level content
- amount of unique authors
the date is ~2022.6.14..18.19.30..8c94
you are in 0 group(s):
you are hosting 0 group(s):
```
#### Marks
Marks play the role of file extensions, with an important upgrade: they are actually molds and define conversion paths. We won't write them in Hoon School, but you will encounter them when you begin writing apps. They are used more broadly than merely as file types, because they act as smart molds to ingest and yield data structures such as JSON and HTML from Hoon data structures.
In brief, each mark has a `++grab` arm to convert from other types to it; a `++grow` arm to convert it to other types; and a `++grad` arm for some standard operations across marks. You can explore the marks in `/mar`.
## Other Ford Runes
The `++ford` arm of Clay builds Hoon code. It provides [a number of runes](https://urbit.org/docs/arvo/ford/ford#ford-runes) which allow fine-grained control over building and importing files. These must be in the specific order at the top of any file. (They also don't work in Dojo; see `-build-file` for a workaround.) The runes include:
- `/-` fashep imports a structure file from `/sur`. Structure files are a way to share common data structures (across agents, for instance).
- `/+` faslus imports a library file from `/lib`.
Both `/-` fashep and `/+` faslus allow you to import by affecting the name of the exposed core:
1. With the default name:
```hoon
/+ apple
```
2. With no name:
```hoon
/- *orange
```
3. With a new name:
```hoon
/+ pomme=apple
```
`*` is useful when importing libraries with unwieldy names, but otherwise should be avoided as it can shadow names in your current subject.
- `/=` fastis builds a user-specified path and wraps it with a given face.
- `/*` fastar imports the contents of a file, applies a mark to convert it, and wraps it with a given face.
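As a sketch of how a renamed import is used, consider a small generator that relies on the playing card library from earlier. The file name `/gen/deal.hoon` and the face `pc` are assumptions made for this example:

```hoon
::  /gen/deal.hoon (hypothetical file name)
/+  pc=playing-cards
|=  n=@ud
::  arms of the library are reached through the chosen face
(draw:pc n make-deck:pc)
```

Had the library been imported as `/+  *playing-cards`, the arms would be available directly as `draw` and `make-deck`, at the cost of possibly shadowing other names in the subject.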


@ -0,0 +1,363 @@
---
title: Building Code Confidently
nodes: 170, 190
objectives:
- "Run existing unit tests."
- "Produce a unit test."
- "Employ a debugging strategy to identify and correct errors in Hoon code."
---
# Building Code Confidently
_This module will discuss how we can have confidence that a program does what it claims to do, using unit testing and debugging strategies._
> Code courageously.
>
> If you avoid changing a section of code for fear of awakening the demons therein, you are living in fear. If you stay in the comfortable confines of the small section of the code you wrote or know well, you will never write legendary code. All code was written by humans and can be mastered by humans.
>
> It's natural to feel fear of code; however, you must act as though you are able to master and change any part of it. To code courageously is to walk into any abyss, bring light, and make it right.
>
> (~wicdev-wisryt, [“Urbit Precepts” C1](https://urbit.org/blog/precepts))
When you produce software, how much confidence do you have that it does what you think it does? Bugs in code are common, but judicious testing can manifest failures so that the bugs can be identified and corrected. We can classify a testing regimen for Urbit code into a couple of layers: fences and unit tests.
### Fences
_Fences_ are barriers employed to block program execution if the state isn't adequate to the intended task. Typically, these are implemented with `assert` or similar enforcement. In Hoon, this means `?>` wutgar, `?<` wutgal, and `?~` wutsig, or judicious use of `^-` kethep and `^+` ketlus. For conditions that must succeed, the failure branch in Hoon should be `!!`, which crashes the program.
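A minimal sketch of a fence, assuming a simple gate that divides two numbers. `++div` would crash on a zero divisor anyway; the fence just makes the precondition explicit at the top of the gate:

```hoon
|=  [a=@ud b=@ud]
^-  @ud
::  fence: refuse to continue if the divisor is zero
?<  =(0 b)
(div a b)
```

The `^-` kethep cast acts as a second fence, since the gate will not compile unless its product nests in `@ud`.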
### Unit Tests
> Unit tests are so called because they exercise the functionality of the code by interrogating individual functions and methods. Functions and methods can often be considered the atomic units of software because they are indivisible. However, what is considered to be the smallest code unit is subjective. The body of a function can be long or short, and shorter functions are arguably more unit-like than long ones.
>
> (Katy Huff, [“Python Testing and Continuous Integration”](https://mq-software-carpentry.github.io/python-testing/05-units/))
In many languages, unit tests refer to functions, often prefixed `test`, that specify (and enforce) the expected behavior of a given function. Unit tests typically contain setup, assertions, and tear-down. In academic terms, they're a grading script.
In Hoon, the `tests/` directory contains the relevant tests for the testing framework to grab and utilize. These can be invoked with the `-test` thread:
```hoon
> -test /=landscape=/tests ~
built /tests/lib/pull-hook-virt/hoon
built /tests/lib/versioning/hoon
> test-supported: took 1047µs
OK /lib/versioning/test-supported
> test-read-version: took 28317µs
OK /lib/versioning/test-read-version
> test-is-root: took 28786µs
OK /lib/versioning/test-is-root
> test-current-version: took 507µs
OK /lib/versioning/test-current-version
> test-append-version: took 4804µs
OK /lib/versioning/test-append-version
> test-mule-scry-bad-time: took 8437µs
OK /lib/pull-hook-virt/test-mule-scry-bad-time
> test-mule-scry-bad-ship: took 8279µs
OK /lib/pull-hook-virt/test-mule-scry-bad-ship
> test-kick-mule: took 4614µs
OK /lib/pull-hook-virt/test-kick-mule
ok=%.y
```
(Depending on when you built your fakeship, particular tests may or may not be present. You can download them from [the Urbit repo](https://github.com/urbit/urbit) and add them manually if you like.)
Hoon unit tests come in two categories:
1. `++expect-eq` (equality of two values)
2. `++expect-fail` (failure/crash)
Let's look at a practical example first, then dissect these.
#### Exercise: Testing a Library
Consider an absolute value arm `++absolute` for `@rs` values. The unit tests for `++absolute` should accomplish a few things:
- Verify correct behavior for positive numeric input.
- Verify correct behavior for negative numeric input.
- Verify correct behavior for zero input.
- Verify an exception is raised for nonnumeric input. (Properly speaking Hoon doesn't have exceptions because Nock is crash-only; tools like `unit` are a way of dealing with failed computations.)
By convention any testing suite has the import line `/+ *test` at the top.
**/tests/lib/absolute.hoon**
```hoon
/+  *test, *absolute
|%
++  test-absolute
  ;:  weld
  %+  expect-eq
    !>  .1
    !>  (absolute .-1)
  %+  expect-eq
    !>  .1
    !>  (absolute .1)
  %+  expect-eq
    !>  .0
    !>  (absolute .0)
  %-  expect-fail
    |.  (absolute '0')  ::actually succeeds
  ==
--
```
Note that at this point we don't care what the function looks like, only how it behaves.
**/lib/absolute.hoon**
```hoon
|%
++  absolute
  |=  a=@rs
  ?:  (gth a .0)  a
  (sub:rs .0 a)
--
```
- Use the tests to determine what is wrong with this library code and correct it.
The dcSpark blog post [“Writing Robust Hoon — A Guide To Urbit Unit Testing”](https://medium.com/dcspark/writing-robust-hoon-a-guide-to-urbit-unit-testing-82b2631fe20a) covers some more good ideas about testing Hoon code.
### `/lib/test.hoon`
In `/lib/test.hoon` we find a core with a few gates: `++expect`, `++expect-eq`, and `++expect-fail`, among others.
`++expect-eq` checks whether two vases are equal and pretty-prints the result of that test. It is our workhorse. The source for `++expect-eq` is:
```hoon
++  expect-eq
  |=  [expected=vase actual=vase]
  ^-  tang
  ::
  =|  result=tang
  ::
  =?  result  !=(q.expected q.actual)
    %+  weld  result
    ^-  tang
    :~  [%palm [": " ~ ~ ~] [leaf+"expected" (sell expected) ~]]
        [%palm [": " ~ ~ ~] [leaf+"actual  " (sell actual) ~]]
    ==
  ::
  =?  result  !(~(nest ut p.actual) | p.expected)
    %+  weld  result
    ^-  tang
    :~  :+  %palm  [": " ~ ~ ~]
        :~  [%leaf "failed to nest"]
            (~(dunk ut p.actual) %actual)
            (~(dunk ut p.expected) %expected)
    ==  ==
  result
```
Test code deals in `vase`s, which are produced by [`!>` zapgar](https://urbit.org/docs/hoon/reference/rune/zap#-zapgar) as a cell of the type of a value and the value.
`++expect-fail` by contrast takes a `|.` bardot trap (a trap that has the `$` buc arm but hasn't been called yet) and verifies that the code within fails.
```hoon
> (expect-fail:test |.(!!))
~
> (expect-fail:test |.((sub 0 1)))
~
> (expect-fail:test |.((sub 1 1)))
~[[%leaf p="expected failure - succeeded"]]
```
(Recall that `~` null is `%.y` true.)
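Since `++expect-eq` compares vases rather than bare values, it can help to look at one directly in the Dojo. The head is the type and the tail, with the face `q`, is the value:

```hoon
> !>(5)
[#t/@ud q=5]
```

This is why `++expect-eq` checks both `q.expected` against `q.actual` and whether the two types nest.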
## Producing Error Messages
Formal error messages in Urbit are built of tanks.
- A `tank` is a structure for printing data.
- `leaf` is for printing a single noun.
- `palm` is for printing backstep-indented lists.
- `rose` is for printing rows of data.
- A `tang` is a `(list tank)`.
As your code evaluates, the Arvo runtime maintains a _stack trace_, or list of the evaluations and expressions that got the program to its notional point of computation. When the code fails, any error hints currently on the stack are dumped to the terminal for you to see what has gone wrong.
- The [`~_` sigcab](https://urbit.org/docs/reference/hoon-expressions/rune/sig/#sigcab) rune, described as a “user-formatted tracing printf”, can include an error message for you, requiring you to explicitly build the `tank`. (`printf` is a reference to [C's I/O library](https://en.wikipedia.org/wiki/Printf_format_string).)
- The [`~|` sigbar](https://urbit.org/docs/reference/hoon-expressions/rune/sig/#sigbar) rune, a “tracing printf”, can include an error message from a simple `@t` cord.
What this means is that these print to the stack trace if something fails, so you can use either rune to contribute to the error description:
```hoon
|=  [a=@ud]
~_  leaf+"This code failed"
!!
```
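The `~|` sigbar form is usually shorter to write, since it accepts a plain cord. A minimal sketch, assuming a gate that divides two numbers (the message text is arbitrary):

```hoon
|=  [a=@ud b=@ud]
::  the cord is only printed if something below it crashes
~|  'cannot divide by zero'
(div a b)
```

If `b` is zero, the crash from `++div` should be accompanied by the cord in the stack trace; if the division succeeds, the message never appears.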
- The [`!:` zapcol](https://urbit.org/docs/reference/hoon-expressions/rune/zap/#-zapcol) rune turns on line-by-line stack tracing, which is extremely helpful when debugging programs. Drop it in on the first Hoon line (after `/` fas imports) of a generator or library while developing.
```hoon
> (sub 0 1)
subtract-underflow
dojo: hoon expression failed
> !:((sub 0 1))
/~zod/base/~2022.6.14..20.47.19..3b7a:<[1 4].[1 13]>
subtract-underflow
dojo: hoon expression failed
```
When you compose your own library cores, include error messages for likely failure modes.
## Test-Driven Development
_In extremis_, rigorous unit testing yields test-driven development (TDD). Test-driven development refers to the practice of fully specifying desired function behavior before composing the function itself. The advantage of this approach is that it forces you to clarify ahead of time what you expect, rather than making it up on the fly.
For instance, one could publish a set of tests which characterize the behavior of a Roman numeral translation library sufficiently that when such a library is provided it is immediately demonstrable.
**/tests/lib/roman.hoon**
```hoon
/+  *test, *roman
|%
++  test-output-one
  =/  src  "i"
  =/  trg  1
  ;:  weld
    %+  expect-eq
      !>  trg
      !>  (from-roman src)
    %+  expect-eq
      !>  trg
      !>  (from-roman (cuss src))
  ==
++  test-output-two
  =/  src  "ii"
  =/  trg  2
  ;:  weld
    %+  expect-eq
      !>  trg
      !>  (from-roman src)
    %+  expect-eq
      !>  trg
      !>  (from-roman (cuss src))
  ==
::  and so forth
++  test-input-one
  =/  trg  "i"
  =/  src  1
  ;:  weld
    %+  expect-eq
      !>  trg
      !>  (to-roman src)
  ==
++  test-input-two
  =/  trg  "ii"
  =/  src  2
  ;:  weld
    %+  expect-eq
      !>  trg
      !>  (to-roman src)
  ==
::  and so forth
--
```
By composing the unit tests ahead of time, you exercise a discipline of thinking carefully through details of the interface and implementation before you write a single line of implementation code.
## Debugging Common Errors
Let's enumerate the errors you are likely to have encountered by this point:
### `nest-fail`
`nest-fail` may be the most common. Likely you are using an atom or a cell where the other is expected.
```hoon
> (add 'a' 'b')
195
> (add "a" "b")
-need.@
-have.[i=@tD t=""]
nest-fail
dojo: hoon expression failed
```
### `mint-nice`
`mint-nice` arises from typechecking errors:
```hoon
> ^-(tape ~[78 97 114 110 105 97])
mint-nice
-need.?(%~ [i=@tD t=""])
-have.[@ud @ud @ud @ud @ud @ud %~]
nest-fail
dojo: hoon expression failed
```
Conversion without casting via auras fails because the atom types (auras) don't nest without explicit downcasting to `@`.
```hoon
> `(list @ud)`~[0x0 0x1 0x2]
mint-nice
-need.?(%~ [i=@ud t=it(@ud)])
-have.[@ux @ux @ux %~]
nest-fail
dojo: hoon expression failed
> `(list @ud)``(list @)`~[0x0 0x1 0x2]
~[0 1 2]
```
### `fish-loop`
A `fish-loop` arises when using a recursive mold definition like `list`. (The relevant mnemonic is that `++fish` goes fishing for the type of an expression.) Alas, this fails today:
```hoon
> ?=((list @) ~[1 2 3 4])
[%test ~[[%.y p=2]]]
fish-loop
```
although a promised `?#` wuthax rune should match it once implemented.
### `generator-build-fail`
A `generator-build-fail` most commonly results from composing code with mismatched runes (and thus the wrong children including hanging expected-but-empty slots).
Also check if you are using Windows-style line endings, as Unix-style line endings should be employed throughout Urbit.
### Misusing the `$` buc Arm
Another common mistake is to attempt to use the default `$` buc arm in something that doesn't have it. This typically happens for one of two reasons:
- `-find.$.+2` means that `%-` cenhep or an equivalent function call cannot locate a battery. This can occur when you try to use a non-gate as a gate. In particular, if you mask the name of a mold (such as `list`), then a subsequent expression that requires the mold will experience this problem.
```hoon
> =/  list  ~[1 2 3]
  =/  a  ~[4 5 6]
  `(list @ud)`a
-find.$.+2
```
- `-find.$` similarly looks for a `$` buc arm in something that _is_ a core but doesn't have the `$` buc arm present.
```hoon
> *tape
""
> (tape)
""
> *(tape)
-find.$
```
- [“Hoon Errors”](https://urbit.org/docs/hoon/reference/hoon-errors)
### Debugging Strategies
What are some strategies for debugging?
- **Debugging stack.** Use the `!:` zapcol rune to turn on the debugging stack, `!.` zapdot to turn it off again. (Most of the time you just pop this on at the top of a generator and leave it there.)
- **`printf` debugging.** If your code will compile and run, employ `~&` sigpam frequently to make sure that your code is doing what you think it's doing.
- **Typecast.** Include `^` ket casts frequently throughout your code. Entire categories of error can be excluded by satisfying the Hoon typechecker.
- **The only wolf in Alaska.** Essentially a bisection search, you split your code into smaller modules and run each part until you know where the bug arose (where the wolf howled). Then you keep fencing it in tighter and tighter until you have isolated it. You can stub out arms with `!!` zapzap.
- **Build it again.** Remove all of the complicated code from your program and add it in one line at a time. For instance, replace a complicated function with either a `~&` sigpam and `!!` zapzap, or return a known static hard-coded value instead. That way as you reintroduce lines of code or parts of expressions you can narrow down what went wrong and why.
- **Run without networking**. If you run the Urbit executable with `-L`, you cut off external networking. This is helpful if you want to mess with a _copy_ of an actual ship without producing remote effects. That is, if other parts of Ames don't know what you're doing, then you can delete that copy (COPY!) of your pier and continue with the original. This is an alternative to using fakezods which is occasionally helpful in debugging userspace apps in Gall. You can also develop using a moon if you want to.
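As an illustration of `printf` debugging, here is a sketch of a gate that sums the numbers from `n` down to zero and reports its state on every pass through the loop. The gate itself and the face `total` are assumptions made for this example:

```hoon
|=  n=@ud
=/  total  0
|-
::  print the loop state each time through; remove once the bug is found
~&  [%n n %total total]
?:  =(n 0)
  total
$(n (dec n), total (add total n))
```

Watching the printed values change on each iteration is often the quickest way to spot a counter that moves the wrong way or a value that is never updated.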


@ -0,0 +1,653 @@
---
title: Handling Text
nodes: 160, 163
objectives:
- "Review Unicode text structure."
- "Distinguish cords and tapes and their characteristics."
- "Transform and manipulate text using text conversion arms."
- "Interpolate text."
- "Employ sigpam logging levels."
- "Create a `%say` generator."
- "Identify how Dojo sees and interprets a generator as a cell with a head tag."
- "Identify the elements of a `sample` for a `%say` generator."
- "Produce a `%say` generator with optional arguments."
---
# Text Processing I
_This module will discuss how text is represented in Hoon, discuss tools for producing and manipulating text, and introduce the `%say` generator, a new generator type. We don't deal with formatted text (`tank`s) or parsers here, deferring that discussion. Formatted text and text parsing are covered [in a later module](./O-stdlib-io.md)._
## Text in Hoon
We've incidentally used `'messages written as cords'` and `"as tapes"`, but aside from taking a brief look at how `list`s (and thus `tape`s) work with tree addressing, we haven't discussed why these differ or how text works more broadly.
There are four basic ways to represent text in Urbit:
- `@t`, a `cord`, which is an atom (single value)
- `@ta`, a `knot` or ASCII text, which is an atom (single value)
- `@tas`, a `term` or ASCII text symbol
- `tape`, which is a `(list @t)`
This is more ways than many languages support: most languages simply store text directly as a character array, or list of characters in memory. Colloquially, we would only call cords and tapes [_strings_](https://en.wikipedia.org/wiki/String_%28computer_science%29), however.
What are the applications of each?
### `@t` `cord`
What is a written character? Essentially it is a representation of human semantic content (not sound strictly). (Note that we don't refer to _alphabets_, which prescribe a particular relationship of sound to symbol: there are ideographic and logographic scripts, syllabaries, and other representations. Thus, _characters_ not _letters_.) Characters can be combined—particularly in ideographic languages like Mandarin Chinese.
One way to handle text is to assign a code value to each letter, then represent these as subsequent values in memory. (Think, for instance, of [Morse code](https://en.wikipedia.org/wiki/Morse_code).) On all modern computers, the numeric values used for each letter are given by the [ASCII](https://en.wikipedia.org/wiki/ASCII) standard, which defines 128 unique characters (2⁷ = 128).
```
65 83 67 73 73
A S C I I
```
A cord simply shunts these values together in one-byte-wide slots and represents them as an integer.
```hoon
> 'this is a cord'
'this is a cord'
> `@`'this is a cord'
2.037.307.443.564.446.887.986.503.990.470.772
```
It's very helpful to use the `@ux` aura if you are trying to see the internal structure of a `cord`. Since the ASCII values align at the 8-bit wide characters, you can see each character delineated by a hexadecimal pair.
```hoon
> `@ux`'HELLO'
0x4f.4c4c.4548
> `@ub`'HELLO'
0b100.1111.0100.1100.0100.1100.0100.0101.0100.1000
```
You can think of this a couple of different ways. One way is to simply think of them as chained together, with the first letter in the rightmost position. Another is to think of them as values multiplied by a “place value”:
| Letter | ASCII | Place | “Place Value” |
| ------ | ----- | ----- | ------------- |
| `H` | 0x48 | 0 | 2⁰ = 1 → 0x48 |
| `E` | 0x45 | 1 | 2⁸ = 256 = 0x100 → 0x4500 |
| `L` | 0x4c | 2 | 2¹⁶ = 65.536 = 0x1.0000 → 0x4c.0000 |
| `L` | 0x4c | 3 | 2²⁴ = 16.777.216 = 0x100.0000 → 0x4c00.0000 |
| `O` | 0x4f | 4 | 2³² = 4.294.967.296 = 0x1.0000.0000 → 0x4f.0000.0000 |
This way, each value slots in after the preceding value.
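To check the place-value picture for yourself, here is a small Dojo sketch using two standard library gates: `++rip`, which splits an atom into blocks, and `++rep`, which reassembles them. The `3` is the block size (2³ = 8 bits, i.e., one byte), and the choice of `'HELLO'` simply mirrors the table above.

```hoon
> (rip 3 'HELLO')
~[72 69 76 76 79]
> `@ux`(rep 3 ~[72 69 76 76 79])
0x4f.4c4c.4548
```

The first letter comes out first from `++rip` because it occupies the lowest-order byte of the atom.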
Special characters (non-ASCII, beyond the standard keyboard, basically) are represented using a more complex numbering convention. [Unicode](https://en.wikipedia.org/wiki/Unicode) defines a standard specification for _code points_ or numbers assigned to characters, and a few specific bitwise _encodings_ (such as the ubiquitous UTF-8). Urbit uses UTF-8 for `@t` values (thus both `cord` and `tape`).
### `(list @t)` `tape`
There are some tools to work with atom `cord`s of text, but most of the time it is more convenient to unpack the atom into a `tape`. A `tape` splits out the individual characters from a `cord` into a `list` of character values.
![](./binary-tree-tape.png)
We've hinted a bit at the structure of `list`s before; for now the main thing you need to know is that they are cells which end in a `~` sig. So rather than have all of the text values stored sequentially in a single atom, they are stored sequentially in a rightwards-branching binary tree of cells.
A tape is a list of `@tD` atoms (i.e., single bytes, each of which is one character for ASCII text). (The upper-case character at the end of the aura hints that the `@t` values are D→3, so 2³ = 8 bits wide.)
```hoon
> "this is a tape"
"this is a tape"
> `(list @)`"this is a tape"
~[116 104 105 115 32 105 115 32 97 32 116 97 112 101]
```
Since a `tape` is a `(list @tD)`, all of the `list` tools we have seen before work on them.
### `@ta` `knot`
If we restrict the character set to certain ASCII characters instead of UTF-8, we can use this restricted representation for system labels as well (such as URLs, file system paths, permissions). `@ta` `knot`s and `@tas` `term`s both fill this role for Hoon.
```hoon
> `@ta`'hello'
~.hello
```
Every valid `@ta` is a valid `@t`, but `@ta` does not permit spaces or a number of other characters. (See `++sane`, discussed below.)
### `@tas` `term`
A further tweak of the ASCII-only concept, the `@tas` `term` permits only “text constants”, values that are first and foremost only _themselves_.
> [`@tas` permits only] a restricted text atom for Hoon constants. The only characters permitted are lowercase ASCII letters, `-`, and `0-9`, the latter two of which cannot be the first character. The syntax for `@tas` is the text itself, always preceded by `%`. The empty `@tas` has a special syntax, `$`.
`term`s are rarely used for message-like text, but they are used all the time for internal labels in code. They differ from regular text in a couple of key ways that can confuse you until you're used to them.
For instance, a `@tas` value is also a mold, and the value will _only_ match its own mold, so they are commonly used with [type unions](./M-logic.md) to filter for acceptable values.
```hoon
> ^- @tas %5
mint-nice
-need.@tas
-have.%5
nest-fail
dojo: hoon expression failed
> ^- ?(%5) %5
%5
> (?(%5) %5)
%5
```
For instance, imagine creating a function to ensure that only a certain [classical element](https://en.wikipedia.org/wiki/Classical_element) can pass through a gate. (This gate is superfluous given how molds work, but it shows off a point.)
```hoon
|=  input=@t
=<
(validate-element input)
|%
+$  element  ?(%earth %air %fire %water)
++  validate-element
  |=  incoming=@t
  %-  element  incoming
--
```
(See how that `=<` tisgal works with the helper core?)
## Text Operations
Text-based data commonly needs to be _produced_, _manipulated_, or _analyzed_ (including parsing).
### Producing Text
String interpolation puts the result of an expression directly into a `tape`:
```hoon
> "{<(add 5 6)>} is the answer."
"11 is the answer."
```
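Interpolation comes in two flavors, which the following Dojo sketch contrasts: plain braces `{...}` splice in an expression that is already a `tape`, while `{<...>}` first renders any value as text. (The particular strings are only illustrative.)

```hoon
> "Hello, {(trip 'Mars')}!"
"Hello, Mars!"
> "Hello, {<'Mars'>}!"
"Hello, 'Mars'!"
```

Note that rendering a `cord` with `{<...>}` keeps its quotation marks, since the value is printed the way Dojo would display it.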
`++weld` can be used to glue two `tape`s together:
```hoon
> (weld "Hello" "Mars!")
"HelloMars!"
```
```hoon
|=  [t1=tape t2=tape]
^-  tape
(weld t1 t2)
```
### Manipulating Text
If you have text but you need to change part of it or alter its form, you can use standard library `list` operators like `++flop` as well as `tape`-specific arms.
Applicable `list` operations—some of which you've seen before—include:
- [`++flop`](https://urbit.org/docs/hoon/reference/stdlib/2b#flop) takes a list and returns it in reverse order:
```hoon
> (flop "Hello!")
"!olleH"
> (flop (flop "Hello!"))
"Hello!"
```
- [`++sort`](https://urbit.org/docs/hoon/reference/stdlib/2b#sort) uses the [quicksort algorithm](https://en.wikipedia.org/wiki/Quicksort) to sort a list. It takes a `list` to sort and a gate that serves as a comparator. For example, if you want to sort the list `~[37 62 49 921 123]` from least to greatest, you would pass that list along with the `++lth` gate (for “less than”):
```hoon
> (sort ~[37 62 49 921 123] lth)
~[37 49 62 123 921]
```
To sort the list from greatest to least, use the `++gth` gate (“greater than”) as the basis of comparison instead:
```hoon
> (sort ~[37 62 49 921 123] gth)
~[921 123 62 49 37]
```
You can sort letters this way as well:
```hoon
> (sort ~['a' 'f' 'e' 'k' 'j'] lth)
<|a e f j k|>
```
The function passed to sort must produce a flag, i.e., `?`.
- [`++weld`](https://urbit.org/docs/hoon/reference/stdlib/2b#weld) takes two lists of the same type and concatenates them:
```hoon
> (weld "Happy " "Birthday!")
"Happy Birthday!"
```
It does not inject a separator character like a space.
- [`++snag`](https://urbit.org/docs/hoon/reference/stdlib/2b#snag) takes an atom `n` and a list, and returns the `n`th item of the list, where 0 is the first item:
```hoon
> (snag 3 "Hello!")
'l'
> (snag 1 "Hello!")
'e'
> (snag 5 "Hello!")
'!'
```
**Exercise: `++snag` Yourself**
- Without using `++snag`, write a gate that returns the `n`th item of a list. There is a solution at the bottom of the page.
- [`++oust`](https://urbit.org/docs/hoon/reference/stdlib/2b#oust) takes a pair of atoms `[a=@ b=@]` and a list, and returns the list with `b` items removed, starting at item `a`:
```hoon
> (oust [0 1] `(list @)`~[11 22 33 44])
~[22 33 44]
> (oust [0 2] `(list @)`~[11 22 33 44])
~[33 44]
> (oust [1 2] `(list @)`~[11 22 33 44])
~[11 44]
> (oust [2 2] "Hello!")
"Heo!"
```
- [`++lent`](https://urbit.org/docs/hoon/reference/stdlib/2b#lent) takes a list and returns the number of items in it:
```hoon
> (lent ~[11 22 33 44])
4
> (lent "Hello!")
6
```
**Exercise: Count the Number of Characters in Text**
- There is a built-in `++lent` function that counts the number of characters in a `tape`. Build your own `tape`-length character counting function without using `++lent`.
You may find the [`?~` wutsig](https://urbit.org/docs/hoon/reference/rune/wut#-wutsig) rune to be helpful. It tells you whether a value is `~` or not. (How would you do this with a regular `?:` wutcol?)
The foregoing are `list` operations. The following, in contrast, are `tape`-specific operations:
- [`++crip`](https://urbit.org/docs/hoon/reference/stdlib/4b#crip) converts a `tape` to a `cord` (`tape`→`cord`).
```hoon
> (crip "Mars")
'Mars'
```
- [`++trip`](https://urbit.org/docs/hoon/reference/stdlib/4b#trip) converts a `cord` to a `tape` (`cord`→`tape`).
```hoon
> (trip 'Earth')
"Earth"
```
- [`++cass`](https://urbit.org/docs/hoon/reference/stdlib/4b#cass): convert upper-case text to lower-case (`tape`→`tape`)
```hoon
> (cass "Hello Mars")
"hello mars"
```
- [`++cuss`](https://urbit.org/docs/hoon/reference/stdlib/4b#cuss): convert lower-case text to upper-case (`tape`→`tape`)
```hoon
> (cuss "Hello Mars")
"HELLO MARS"
```
### Analyzing Text
Given a string of text, what can you do with it?
1. Search
2. Tokenize
3. Convert into data
#### Search
- [`++find`](https://urbit.org/docs/hoon/reference/stdlib/2b#find) `[nedl=(list) hstk=(list)]` locates a sublist (`nedl`, needle) in the list (`hstk`, haystack). (`++find` starts counting from zero.)
```hoon
> (find "brillig" "'Twas brillig and the slithy toves")
[~ 6]
```
`++find` returns a `unit`, which right now means that we need to distinguish between nothing found (`~` null) and zero `[~ 0]`. `unit`s are discussed in more detail in [a later lesson](./L-struct.md).
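A short Dojo sketch of the two cases that are easy to confuse (the search strings here are arbitrary):

```hoon
> (find "xyzzy" "'Twas brillig")
~
> (find "'Twas" "'Twas brillig")
[~ 0]
```

A bare `~` means the needle was not found, while `[~ 0]` means it was found at the very start of the haystack.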
#### Tokenize/Parse
To _tokenize_ text is to break it into pieces according to some rule. For instance, to count words one needs to break at some delimiter.
```
"the sky above the port was the color of television tuned to a dead channel"
  1   2    3    4   5    6   7    8   9      10      11   12 13 14    15
```
Hoon has a sophisticated parser built into it that [we'll use later](./O-stdlib-io.md). There are a lot of rules to deciding what is and isn't a rune, and how the various parts of an expression relate to each other. We don't need that level of power to work with basic text operations, so we'll instead use basic `list` tools whenever we need to extract or break text apart for now.
#### Exercise: Break Text at a Space
Hoon has a very powerful text parsing engine, built to compile Hoon itself. However, it tends to be quite obscure to new learners. We can build a simple one using `list` tools.
- Compose a gate which parses a long `tape` into smaller `tape`s by splitting the text at single spaces. For example, given a `tape`
```hoon
"the sky above the port was the color of television tuned to a dead channel"
```
the gate should yield
```hoon
~["the" "sky" "above" "the" ...]
```
To complete this, you'll need [`++scag`](https://urbit.org/docs/hoon/reference/stdlib/2b#scag) and [`++slag`](https://urbit.org/docs/hoon/reference/stdlib/2b#slag) (who sound like villainous henchmen from a children's cartoon).
```hoon
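|=  ex=tape
=/  index  0
=/  result  *(list tape)
|-  ^-  (list tape)
::  at the end of the text, append whatever remains as the final word
?:  =(index (lent ex))
  (weld result ~[`tape`(scag index ex)])
::  at a space, store the word before it and restart on the remainder
?:  =((snag index ex) ' ')
  $(index 0, ex `tape`(slag +(index) ex), result (weld result ~[`tape`(scag index ex)]))
$(index +(index))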
```
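The two helpers used in this gate can be tried on their own in the Dojo: `++scag` keeps the first `n` items of a list, and `++slag` drops them. The casts to `tape` are there only so that the results print as text.

```hoon
> `tape`(scag 3 "the sky")
"the"
> `tape`(slag 4 "the sky")
"sky"
```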
#### Convert
If you have a Hoon value and you want to convert it into text as such, use `++scot` and `++scow`. These call for a value of type `+$dime`, which is a pair of an aura name written as a `@tas` (such as `%ud` or `%p`) and the atom to render. `++scot` is labeled as returning a `cord` (`@t`) but in practice seems to return a `knot` (`@ta`).
- [`++scot`](https://urbit.org/docs/reference/library/4m/#scot) renders a `dime` as a `cord` (`dime`→`cord`); the user must include any necessary aura transformation.
```hoon
> `@t`(scot %ud 54.321)
'54.321'
> `@t`(scot %ux 54.000)
'0xd2f0'
```
```hoon
> (scot %p ~sampel-palnet)
~.~sampel-palnet
> `@t`(scot %p ~sampel-palnet)
'~sampel-palnet'
```
- [`++scow`](https://urbit.org/docs/reference/library/4m/#scow) renders a `dime` as a `tape` (`dime`→`tape`); it is otherwise identical to `++scot`.
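A brief Dojo sketch of `++scow`, reusing the values from the `++scot` examples:

```hoon
> (scow %ud 54.321)
"54.321"
> (scow %ux 54.000)
"0xd2f0"
> (scow %p ~sampel-palnet)
"~sampel-palnet"
```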
- [`++sane`](https://urbit.org/docs/hoon/reference/stdlib/4b#sane) checks the validity of a possible text string as a `knot` or `term`. The usage of `++sane` will feel a bit strange to you: it doesn't apply directly to the text you want to check, but it produces a gate that checks for the aura (as `%ta` or `%tas`). (The gate-builder is a fairly common pattern in Hoon that we've started to hint at by using molds.) `++sane` is also not infallible yet.
```hoon
> ((sane %ta) 'ångstrom')
%.n
> ((sane %ta) 'angstrom')
%.y
> ((sane %tas) 'ångstrom')
%.n
> ((sane %tas) 'angstrom')
%.y
```
Why is this sort of check necessary? Two reasons:
1. `@ta` `knot`s and `@tas` `term`s have strict rules, such as being ASCII-only.
2. Not every sequence of bits has a conversion to a text representation. That is, ASCII and Unicode have structural rules that limit the possible conversions which can be made. If things don't work, you'll get a `%bad-text` response.
```hoon
> 0x1234.5678.90ab.cdef
0x1234.5678.90ab.cdef
> `@t`0x1234.5678.90ab.cdef
[%bad-text "[39 239 205 171 144 120 86 52 92 49 50 39 0]"]
```
Because an aura cast does not check the contents of the atom, Hoon will let you produce an erroneous `term` (`@tas`):
```hoon
> `@tas`'hello mars'
%hello mars
```
Since a `@tas` cannot include a space, this is formally incorrect, as `++sane` reveals:
```hoon
> ((sane %tas) 'hello')
%.y
> ((sane %tas) 'hello mars')
%.n
```
#### Exercise: Building Your Own Library
Let's take some of the code we've built above for processing text and turn it into a library we can use in another generator.
- Take the space-breaking and element-counting gates from above and include them in a `|%` barcen core. Save this file as `lib/text.hoon` in the `%base` desk of your fakeship and commit.
- Produce a generator `gen/text-user.hoon` which accepts a `tape` and returns the number of words in the text (separated by spaces). (How would you obtain this from those two operations?)
## Logging
The most time-honored method of debugging is to simply output relevant values at key points throughout a program in order to make sure they are doing what you think they are doing. To this end, we introduced `~&` sigpam in the last lesson.
The `~&` sigpam rune offers some finer-grained output options than just printing a simple value to the screen. For instance, you can use it with string interpolation to produce detailed error messages.
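As a minimal sketch of that idea, the naked generator below prints a message containing its argument before producing its result. The file name `gen/double.hoon` and the face `n` are placeholders chosen for this example.

```hoon
::  gen/double.hoon (hypothetical file name)
::
|=  n=@ud
::  print a tape built by string interpolation as a side effect
~&  "doubling {<n>}"
::  then produce the actual result
(mul 2 n)
```

After committing, running `+double 5` should print `"doubling 5"` followed by the product `10`.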
There are also `>` modifiers which can be included to mark “debugging levels”, really just color-coding the output:
1. No `>`: regular
2. `>`: information
3. `>>`: warning
4. `>>>`: error
(Since all `~&` sigpam output is a side effect of the compiler, it doesn't map to the Unix [`stdout`/`stderr` streams](https://en.wikipedia.org/wiki/Standard_streams) separately; it's all `stdout`.)
You can use these to differentiate messages when debugging or otherwise auditing the behavior of a generator or library. Try these in your own Dojo:
```hoon
> ~&  'Hello Mars!'  ~
'Hello Mars!'
~
> ~&  >  'Hello Mars!'  ~
>   'Hello Mars!'
~
> ~&  >>  'Hello Mars!'  ~
>>  'Hello Mars!'
~
> ~&  >>>  'Hello Mars!'  ~
>>> 'Hello Mars!'
~
```
## `%say` Generators
A naked generator is merely a gate: a core with a `$` arm that Dojo knows to call. However, we can also invoke a generator which is a cell of a metadata tag and a core. The next level-up for our generator skills is the `%say` generator, a cell of `[%say core]` that affords slightly more sophisticated evaluation.
We use `%say` generators when we want to provide something else in Arvo, the Urbit operating system, with metadata about the generator's output. This is useful when a generator is needed to pipe data to another program, a frequent occurrence.
To that end, `%say` generators use `mark`s to make it clear, to other Arvo computations, exactly what kind of data their output is. A `mark` is akin to a MIME type on the Arvo level. A `mark` describes the data in some way, indicating that it's an `%atom`, or that it's a standard such as `%json`, or even that it's an application-specific data structure like `%talk-command`. `mark`s are not specific to `%say` generators; whenever data moves between programs in Arvo, that data is marked.
So, more formally, a `%say` generator is a `cell`. The head of that cell is the `%say` tag, and the tail is a `gate` that produces a `cask` -- a pair of the output data and the `mark` describing that data.
Save this example as `add.hoon` in the `/gen` directory of your `%base` desk:
```hoon
:-  %say
|=  *
:-  %noun
(add 40 2)
```
Run it with:
```hoon
> |commit %base
> +add
42
```
Notice that we used no argument, something that is possible with `%say` generators but impossible with naked generators. We'll explain that in a moment. For now, let's focus on the code that is necessary to make something a `%say` generator.
```hoon
:-  %say
```
Recall that the rune `:-` produces a cell, with the first following expression as its head and the second following expression as its tail.
The expression above creates a cell with `%say` as the head. The tail is the `|= *` expression on the line that follows.
```hoon
|=  *
:-  %noun
(add 40 2)
```
`|= *` constructs a [gate](https://urbit.org/docs/glossary/gate/) that takes a noun. This gate will itself produce a `cask`, which is a cell formed by the preceding `:-`. The head of that `cask` is `%noun` and the tail is the rest of the program, `(add 40 2)`. The tail of the `cask` will be our actual data produced by the body of the program: in this case, just adding 40 and 2 together.
A `%say` generator has access to values besides those passed into it and the Hoon standard subject. Namely, a `%say` generator knows about `now`, `eny`, and `bec` (and, through `bec`, `our`):
- `now` is the current system timestamp.
- `eny` is entropy, a source of randomness.
- `bec` is the current path (beak), a triple of ship, desk, and case.
- `our` is our current ship identity, available as `p.bec`.
Dojo will automatically supply these values to the gate unless they are stubbed out with `*`.
### `%say` generators with arguments
We can modify the boilerplate code to allow arguments to be passed into a `%say` generator, but in a way that gives us more power than we would have if we just used a naked generator.
Naked generators are limited because they have no way of accessing data that exists in Arvo, such as the date and time or pieces of fresh entropy. In `%say` generators, however, we can access that kind of data by naming it in the gate's sample, which we only specified as `*` in the previous example. We can do more with `%say` generators if we do more with that sample. Any valid sample will follow this 3-tuple scheme:
`[[now=@da eny=@uvJ bec=beak] [list of unnamed arguments] [list of named arguments]]`
This entire structure is a noun, which is why `*` is a valid sample if we wish to not use any of the information here in a generator. The first element is the data Dojo supplies from Arvo; the second is a null-terminated list of required, unnamed arguments; and the third is a null-terminated list of optional, named arguments. The exercises below use each of these in turn.
#### Exercise: The Magic 8-Ball
This Magic 8-Ball generator returns one of a variety of answers in response to a call. In its entirety:
```hoon
!:
:-  %say
|=  [[* eny=@uvJ *] *]
:-  %noun
^-  tape
=/  answers=(list tape)
  :~  "It is certain."
      "It is decidedly so."
      "Without a doubt."
      "Yes - definitely."
      "You may rely on it."
      "As I see it, yes."
      "Most likely."
      "Outlook good."
      "Yes."
      "Signs point to yes."
      "Reply hazy, try again"
      "Ask again later."
      "Better not tell you now."
      "Cannot predict now."
      "Concentrate and ask again."
      "Don't count on it."
      "My reply is no."
      "My sources say no."
      "Outlook not so good."
      "Very doubtful."
  ==
=/  rng  ~(. og eny)
=/  val  (rad:rng (lent answers))
(snag val answers)
```
Most of the work is being done by these two lines:
```hoon
=/  rng  ~(. og eny)
=/  val  (rad:rng (lent answers))
```
`~(. og eny)` starts a random number generator with a seed from the current entropy. A [random number generator](https://en.wikipedia.org/wiki/Random_number_generation) is a stateful mathematical function that produces an unpredictable result (unless you know the algorithm AND the starting value, or seed). Here we pull the subject of [`++og`](https://urbit.org/docs/hoon/reference/stdlib/3d#og), the randomness core in Hoon, to start the RNG. This is an uncommon turn of phrase, but will become more clear in the next lesson, on doors.
Then we slam the `++rad:rng` gate which returns a random number from 0 to _n_-1 inclusive. This gives us a random value from the list of possible answers.
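To see the role of the seed, here is a small Dojo sketch. The seed `42` and the range `6` are arbitrary. The first line checks that the same seed always yields the same value, and the second that a draw seeded with Dojo's `eny` stays below the requested bound.

```hoon
> =((~(rad og 42) 6) (~(rad og 42) 6))
%.y
> (lth (~(rad og eny) 6) 6)
%.y
```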
Since this is a `%say` generator, we can run it without arguments:
```hoon
> +magic-8
"Ask again later."
```
#### Exercise: Dice Roll
Let's look at an example that uses all three parts. Save the code below in a file called `dice.hoon` in the `/gen` directory of your `%base` desk.
```hoon
:-  %say
|=  [[now=@da eny=@uvJ bec=beak] [n=@ud ~] [bet=@ud ~]]
:-  %noun
[(~(rad og eny) n) bet]
```
This is a very simple dice program with an optional betting functionality. In the code, our sample specifies faces on all of the Arvo data, meaning that we can easily access them. We also require the argument `[n=@ud ~]`, and allow the _optional_ argument `[bet=@ud ~]`.
We can run this generator like so:
```hoon
> +dice 6, =bet 2
[4 2]
> +dice 6
[5 0]
> +dice 6
[2 0]
> +dice 6, =bet 200
[0 200]
> +dice
nest-fail
```
We get a different value from the same generator between runs, something that isn't possible with a naked generator. Another novelty is the ability to choose to not use the second argument.
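Since `++rad` counts from zero, a roll of `0` is possible, which is unusual for a die. A possible variation, offered only as a sketch, increments the result with `+( )` so that rolls fall between 1 and `n`:

```hoon
:-  %say
|=  [[now=@da eny=@uvJ bec=beak] [n=@ud ~] [bet=@ud ~]]
:-  %noun
::  rad yields 0 to n-1, so add one to shift the range to 1 through n
[+((~(rad og eny) n)) bet]
```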
#### Exercise: Using the Playing Card Library
Recall the playing card library `/lib/playing-cards.hoon`. Let's use it with a `%say` generator.
**`/gen/cards.hoon`**
```hoon
/+  playing-cards
:-  %say
|=  [[* eny=@uv *] *]
:-  %noun
(shuffle-deck:playing-cards make-deck:playing-cards eny)
```
Having already saved the library as `/lib/playing-cards.hoon`, you can import it with the `/+` faslus rune. When `cards.hoon` gets built, the Hoon builder will pull in the requested library and also build that. It will also create a dependency so that if `/lib/playing-cards.hoon` changes, this file will also get rebuilt.
Below `/+  playing-cards`, you have the standard `%say` generator boilerplate that allows us to get a bit of entropy from Arvo when the generator is run. Then we feed the entropy and a `deck` created by `make-deck` into `shuffle-deck` to get back a shuffled `deck`.
---
#### Solutions to Exercises
- Roll-Your-Own-`++snag`:
```hoon
::  snag.hoon
::
|=  [a=@ b=(list @)]
?~  b  !!
?:  =(0 a)  i.b
$(a (dec a), b t.b)
```


@ -0,0 +1,964 @@
---
title: Doors
nodes: 150, 155
objectives:
- "Identify the structure of a door and relate it to a core."
- "Pull an arm in a door."
- "Build cores for later use and with custom samples."
- "Identify the `$` buc arm in several structures and its role."
---
# Cores and Doors
_Hoon is statically typed, which means (among other things) that auras are subject to strict nesting rules, molds are crash-only, and the whole thing is rather cantankerous about matching types. However, since gate-building arms are possible, Hoon developers frequently employ them as templates to build type-appropriate cores, including gates. This module will start by introducing the concept of gate-building gates; then it will expand our notion of cores to include doors; finally it will introduce a common door, the `++map`, to illustrate how doors work._
## Gate-Building Gates
### Calling Gates
There are two ways of making a function call in Hoon. First, you can call a gate in the subject by name. For instance, we can produce a gate `inc` which adds `1` to an input:
```hoon
> =inc |=(a=@ (add 1 a))
> (inc 10)
11
> =inc
```
The second way of making a function call involves an expression that _produces_ a gate on demand:
```hoon
> (|=(a=@ (add 1 a)) 123)
124
> (|=(a=@ (mul 2 a)) 123)
246
```
The difference is subtle: the first case has an already-created gate in the subject when we called it, while the latter involves producing a gate that doesn't exist anywhere in the subject, and then calling it.
Are calls to `++add` and `++mul` of the Hoon standard library of the first kind, or the second?
```hoon
> (add 12 23)
35
> (mul 12 23)
276
```
They're of the second kind. Neither `++add` nor `++mul` resolves to a gate directly; they're each arms that _produce_ gates.
Often the difference doesn't matter much. Either way you can do a function call using the `(gate arg)` syntax.
It's important to learn the difference, however, because for certain use cases you'll want the extra flexibility that comes with having an already produced core in the subject.
### Building Gates
Let's make a core with arms that build gates of various kinds. As we did in a previous lesson, we'll use the `|%` rune. Copy and paste the following into the Dojo:
```hoon
> =c |%
  ++  inc      |=(a=@ (add 1 a))
  ++  add-two  |=(a=@ (inc (inc a)))
  ++  double   |=(a=@ (mul 2 a))
  ++  triple   |=(a=@ (mul 3 a))
  --
```
Let's try out these arms, using them for function calls:
```hoon
> (inc:c 10)
11
> (add-two:c 10)
12
> (double:c 10)
20
> (triple:c 10)
30
```
Notice that each arm in core `c` is able to call the other arms of `c`—`++add-two` uses the `++inc` arm to increment a number twice. As a reminder, each arm is evaluated with its parent core as the subject. In the case of `++add-two` the parent core is `c`, which has `++inc` in it.
#### Mutating a Gate
Let's say you want to modify the default sample of the gate for `double`. We can infer the default sample by calling `double` with no argument:
```hoon
> (double:c)
0
```
Given that `a × 2 = 0`, `a` must be `0`. (Remember that `a` is the face for the `double` sample, as defined in the core we bound to `c` above.)
Let's say we want to mutate the `++double` gate so that the default sample is `25`. There is only one problem: `++double` isn't a gate!
```hoon
> double.c(a 25)
-tack.a
-find.a
dojo: hoon expression failed
```
It's an arm that produces a gate, and `a` cannot be found in `++double` until the gate is created. Furthermore, every time the gate is created, it has the default sample, `0`. If you want to mutate the gate produced by `++double`, you'll first have to put a copy of that gate into the subject:
```hoon
> =double-copy double:c
> (double-copy 123)
246
```
Now let's mutate the sample to `25`, and check that it worked with `+6`. (The sample lives at `+6` in a given core tree.)
```hoon
> +6:double-copy(a 25)
a=25
```
Good. Let's call it with no argument and see if it returns double the value of the modified sample.
```hoon
> (double-copy(a 25))
50
```
It does indeed. Unbind `c` and `double-copy`:
```hoon
> =c
> =double-copy
```
Contrast this with the behavior of `++add`. We can look at the sample of the gate for `add` with `+6:add`:
```hoon
> +6:add
[a=0 b=0]
```
If you try to mutate the default sample of `++add`, it won't work:
```hoon
> add(a 3)
-tack.a
-find.a
dojo: hoon expression failed
```
As before with `++double`, Hoon can't find an `a` to modify in a gate that doesn't exist yet.
### Slamming a Gate
If you check the docs on our now-familiar [`%-` cenhep](https://urbit.org/docs/hoon/reference/rune/cen#cenhep), you'll find that it is actually syntactic sugar for another rune:
> This rune is for evaluating the `$` arm of a gate, i.e., calling a gate as a function. `a` is the gate, and `b` is the desired sample value (i.e., input value) for the gate.
>
> ```hoon
> %~($ a b)
> ```
So all gate calls actually pass back through [`%~` censig](https://urbit.org/docs/hoon/reference/rune/cen#-censig). What's the difference?
The [`%~` censig](https://urbit.org/docs/hoon/reference/rune/cen#censig) rune accepts three children: a wing which resolves to an arm in a _door_; the aforesaid door; and a `sample` for the door.
Basically, whenever you use `%-` cenhep, it actually uses `%~` censig to look up a wing in a door, which is a more general type of core than a gate. Whatever that wing resolves to is then provided a `sample`. The resulting Hoon expression is evaluated and the value is returned.
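The equivalence can be checked in the Dojo. In this sketch, each line calls `++add` on the same sample, moving from the familiar irregular form to the explicit `%~` censig form and its own irregular spelling `~( )`:

```hoon
> (add 2 3)
5
> %-(add [2 3])
5
> %~($ add [2 3])
5
> ~($ add [2 3])
5
```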
## Doors
Doors are another kind of core whose arms evaluate to make gates, as we just discovered. The difference is that a door also has its own sample. A door is the most general case of a function in Hoon. (You could say a "gate-building core" or a "function-building function" to clarify what the intent of most of these is.)
A core is a cell of code and data, called `[battery payload]`. The `battery` contains a series of arms, and the `payload` contains all the data necessary to run those arms correctly.
A _door_ is a core with a sample. That is, a door is a core whose payload is a cell of sample and context: `[sample context]`. A door's overall sample can affect how its gate-building arms work.
```
door
/ \
battery .
/ \
sample context
```
It follows from this definition that a gate is a special case of a door. A gate is a door with exactly one arm, named `$` buc.
Doors are created with the [`|_` barcab](https://urbit.org/docs/hoon/reference/rune/bar#_-barcab) rune. Doors get used for a few different purposes in the standard library:
- instrumenting and storing persistent data structures like `map`s (this module and the next)
- implementing state machines (the [subject-oriented programming module](./N-subject.md))
One BIG pitfall for thinking about doors is thinking of them as “containing” gates, as if they were more like “objects”. Instead, think of them the same way as you think of gates, just that they can be altered at a higher level.
#### Example: The Quadratic Equation
First, a mathematical example. If we wanted to calculate a quadratic polynomial, _y = a x² + b x + c_, then we need to know two kinds of things: the unknown or variable _x_, AND the parameters _a_, _b_, and _c_. These aren't really the same kind of thing. When we calculate a particular curve _y_(_x_), we assume that the parameters _a_, _b_, and _c_ stay constant across evaluations of _x_, and it's inconvenient for us to specify them every single time.
If we were to build this as a gate, we would need to pass in four parameters:
```hoon
> =poly-gate |=  [x=@ud a=@ud b=@ud c=@ud]
  (add (add (mul a (mul x x)) (mul b x)) c)
```
Any time we call the gate, we have to provide all four values: one unknown, three parameters. But there's a sense in which we want to separate the three parameters and only call the gate with one _x_ value. One way to accomplish this is to wrap the gate inside of another:
```hoon
> =wrapped-gate |=  [x=@ud]
  =/  a  5
  =/  b  4
  =/  c  3
  (poly-gate x a b c)
```
If we built this as a door instead, we could push the parameters out to a different layer of the structure. In this case, the parameters are the sample of the door, while the arm `++quad` builds a gate that corresponds to those parameters and only accepts one unknown variable `x`. To make a door we use the `|_` barcab rune, which we'll discuss later:
```hoon
> =poly |_  [a=@ud b=@ud c=@ud]
  ++  quad
    |=  x=@ud
    (add (add (mul a (mul x x)) (mul b x)) c)
  --
```
This will be used in two steps: a gate-building step then a gate usage step.
We produce a gate from a door's arm using the [`%~` censig](https://urbit.org/docs/hoon/reference/rune/cen#-censig) rune, almost always used in its irregular form, `~()`. Here we prime the door with `[5 4 3]`, which yields a gate:
```hoon
~(quad poly [5 4 3])
```
By itself, there is not much to say about this gate. We could pin it into the Dojo, for instance, to use later. Our ultimate goal is to use the built gate on particular data, however:
```hoon
> (~(quad poly [5 4 3]) 2)
31
```
By hand: 5×2² + 4×2 + 3 = 31, so that's correct.
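Pinning the built gate for later use might look like the following sketch. The name `curve` is just a placeholder for this example, and `poly` is assumed to still be bound in the Dojo from the earlier step.

```hoon
> =curve ~(quad poly [5 4 3])
> (curve 0)
3
> (curve 1)
12
> (curve 2)
31
```

Each call reuses the same parameters and varies only the unknown.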
Doors will enable us to build some very powerful data storage tools by letting us defer parts of a gate calculation to other stages of building and calculating the gate.
#### Example: A Calculator
Let's unpack what's going on more with this next door. Each of the arms in this example door will define a simple gate. Let's bind the door to `c`. To make a door we use the `|_` barcab rune:
```hoon
> =c |_  b=@
  ++  plus  |=(a=@ (add a b))
  ++  times  |=(a=@ (mul a b))
  ++  greater  |=(a=@ (gth a b))
  --
```
If you type this into the dojo manually, make sure you attend carefully to the spacing. Feel free to cut and paste the code, if desired.
Before getting into what these arms do, let's digress into how the `|_` barcab rune works in general.
[`|_` barcab](https://urbit.org/docs/hoon/reference/rune/bar#_-barcab) works exactly like the `|%` rune for making a core, except that it takes one additional daughter expression, the door's sample. Following that are a series of `++` runes, each of which defines an arm of the door. Finally, the expression is terminated with a `--` rune.
A door really is, at the bedrock level, the same thing as a core with a `sample`. Let's ask Dojo to pretty print a simple door.
```hoon
> =a =>  ~  |_  b=@  ++  foo  b  --
> a
<1.zgd [b=@ %~]>
```
Dojo tells us that `a` is a core with one arm and a payload of `[b=@ %~]`. Since a door's payload is `[sample context]`, this means that `b` is the sample and the context is null. (The `=> ~` set the context. We did this to avoid including the standard library that is included in the context by default in Dojo, which would have made the pretty-printed core much more verbose. Try it without `=> ~` as well.)
For the door defined above, `c`, the sample is defined as an `@` atom and given the face `b`. The `++plus` arm defines a gate that takes a single atom as its argument `a` and returns the sum of `a` and `b`. The `++times` arm defines a gate that takes a single atom `a` and returns the product of `a` and `b`. The `++greater` arm defines a gate that takes a single atom `a`, and returns `%.y` if `a` is greater than `b`; otherwise it returns `%.n`.
Let's try out the arms of `c` with ordinary function calls:
```hoon
> (plus:c 10)
10
> (times:c 10)
0
> (greater:c 10)
%.y
```
This works, but the results are not exciting. Passing `10` to the `plus` gate returns `10`, so it must be that the value of `b` is `0` (the bunt value of `@`). The products of the other function calls reinforce that assessment. Let's look directly at `+6` of `c` to see the sample:
```hoon
> +6:c
b=0
```
Having confirmed that `b` is `0`, let's mutate the `c` sample and then call its arms:
```hoon
> (plus:c(b 7) 10)
17
> (times:c(b 7) 10)
70
> (greater:c(b 7) 10)
%.y
> (greater:c(b 17) 10)
%.n
```
Doing the same mutation repeatedly can be tedious, so let's bind `c` to the modified version of the door, where `b` is `7`:
```hoon
> =c c(b 7)
> (plus:c 10)
17
> (times:c 10)
70
> (greater:c 10)
%.y
```
There's a more direct way of passing arguments for both the door sample and the gate sample simultaneously. We may use the `~(arm door arg)` syntax. This generates the `arm` product after modifying the `door`'s sample to be `arg`.
```hoon
> (~(plus c 7) 10)
17
> (~(times c 7) 10)
70
> (~(greater c 7) 10)
%.y
> (~(greater c 17) 10)
%.n
```
Readers with some mathematical background may notice that `~( )` expressions allow us to [curry](https://en.wikipedia.org/wiki/Currying). For each of the arms above, the `~( )` expression is used to create different versions of the same gate:
```hoon
> ~(plus c 7)
< 1.xpd
  [ a=@
    < 3.bnz
      [ b=@
        [our=@p now=@da eny=@uvJ]
        <17.ayh 34.ozb 14.usy 54.uao 77.gmv 232.hhi 51.qbt 123.ppa 46.hgz 1.pnw %140>
      ]
    >
  ]
>
> b:~(plus c 7)
7
> b:~(plus c 17)
17
```
Thus, you may think of the `c` door as a function for making functions. Use the `~(arm c arg)` syntax—`arm` defines which kind of gate is produced (i.e., which arm of the door is used to create the gate), and `arg` defines the value of `b` in that gate, which in turn affects the product value of the gate produced.
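As a small illustration of that idea (a sketch that assumes the door `c` from above is still pinned in your Dojo; the name `add-seven` is our own and not part of any library), you could pin one of these built gates and reuse it:

```hoon
> =add-seven ~(plus c 7)
> (add-seven 10)
17
> (add-seven 100)
107
```

The door's sample is fixed once, when the gate is built, so each later call only needs to supply the gate's own argument.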
The standard library provides [currying functionality](./P-func.md) outside of the context of doors.
#### Creating Doors with a Modified Sample
In the above example we created a door `c` with sample `b=@` and found that the initial value of `b` was `0`, the bunt value of `@`. We then created a new door from `c` by modifying the value of `b`. But what if we wish to define a door with a chosen sample value directly? We make use of the `$_` rune, whose irregular form is simply `_`. To create the door `c` with the sample `b=@` set to have the value `7` in the dojo, we would write:
```hoon
> =c |_  b=_7
  ++  plus  |=(a=@ (add a b))
  ++  times  |=(a=@ (mul a b))
  ++  greater  |=(a=@ (gth a b))
  --
```
Here the type of `b` is inferred to be `@` based on the example value `7`, similar to how we've seen casting done by example. You will learn more about how types are inferred in the [next module](./L-struct.md).
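Assuming the door above has been pinned as `c` exactly as written, its arms should now see `b` as `7` without any further modification. A quick check might look like this:

```hoon
> (plus:c 10)
17
> (times:c 10)
70
> +6:c
b=7
```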
#### Exercise: Adding Arms to a Door
Recall the quadratic equation door.
```hoon
|_  [a=@ud b=@ud c=@ud]
++  quad
  |=  x=@ud
  (add (add (mul a (mul x x)) (mul b x)) c)
--
```
- Add an arm to the door which calculates the linear function _a_×_x_ + _b_.
- Add another arm which calculates the derivative of the first quadratic function, 2×_a_×_x_ + _b_.
## Key-Value Pairs: `map` as Door
In general terms, a map is a pattern from a key to a value. You can think of a dictionary, or an index, or a data table. Essentially it scans for a particular key, then returns the data associated with that key (which may be any noun).
| Key | Value |
| ----------- | ---------- |
| 'Mazda' | 'RX-8' |
| 'Dodge' | 'Viper' |
| 'Ford' | 'Mustang' |
| 'Chevrolet' | 'Chevelle' |
| 'Porsche' | 'Boxster' |
| 'Bugatti' | 'Type 22' |
While `map` is the mold or type of the value, the door which affords `map`-related functionality is named `++by`. (This felicitously affords us a way to read `map` operations in an English-friendly phrasing.)
In Urbit, all values are static and never change. (This is why we “overwrite” or replace the values in a limb to change it with `%=` centis.) This means that when we build a `map`, we often rather awkwardly replace it with its modified value explicitly.
We'll build a color `map`, from a `@tas` of a [color's name](https://en.wikipedia.org/wiki/List_of_Crayola_crayon_colors) to its HTML hexadecimal representation as a `@ux` hex value.
We can produce a `map` from a `list` of key-value cells using the [`++malt`](https://urbit.org/docs/hoon/reference/stdlib/2l#malt) function. Using `@tas` terms as keys (which is common) requires us to explicitly mark the list as `(list (pair @tas @ux))`:
```hoon
> =colors (malt `(list (pair @tas @ux))`~[[%red 0xed.0a3f] [%yellow 0xfb.e870] [%green 0x1.a638] [%blue 0x66ff]])
```
To insert one key-value pair at a time, we use `put`. In Dojo, we need to either pin it into the subject or modify a copy of the map for the rest of the expression using `=/` tisfas.
```hoon
> =colors (~(put by colors) [%orange 0xff.8833])
> =colors (~(put by colors) [%violet 0x83.59a3])
> =colors (~(put by colors) [%black 0x0])
```
Note the pattern here: there is a `++put` arm of `++by` which builds a gate to modify `colors` by inserting a value.
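Because values never change in place, `++put:by` hands back a new `map` and leaves the original alone. The following sketch assumes `colors` is pinned as above and uses `++has:by` (covered a little further on) to check for a key; `%teal` and its value are illustrative only. Without re-pinning the result with `=colors`, the new entry is visible only inside the expression that created it:

```hoon
> (~(has by (~(put by colors) [%teal 0x8080])) %teal)
%.y
> (~(has by colors) %teal)
%.n
```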
What happens if we try to add something that doesn't match the type?
```hoon
> =colors (~(put by colors) [%cerulean '#02A4D3'])
```
We'll see a `mull-grow`, a `mull-nice`, and a `nest-fail`. Essentially these are all flavors of mold-matching errors.
(As an aside, `++put:by` is also how you'd replace a key's value.)
The point of a `map` is to make it easy to retrieve data values given their appropriate key. Use `++get:by`:
```hoon
> (~(get by colors) %orange)
[~ 0xff.8833]
```
What is that cell? Wasn't the value stored as `0xff.8833`? Well, one fundamental problem that a `map` needs to solve is to allow us to distinguish an _empty_ result (or failure to locate a value) from a _zero_ result (or an answer that's actually zero). To this end, the `unit` was introduced, a type union of a `~` (for no result) and `[~ item]` (for when a result exists).
- What does `[~ ~]` mean when returned from a `map`?
`unit`s are common enough that they have their own syntax and set of operational functions. We'll look at them more in [the next module](./L-struct.md).
```hoon
> (~(get by colors) %brown)
~
```
([`++got:by`](https://urbit.org/docs/hoon/reference/stdlib/2i#gotby) returns the value without the `unit` wrapper, but crashes on failure to locate. I recommend just using `++get` and extracting the tail of the resulting cell after confirming it isn't null with `?~` wutsig. See also [`++gut:by`](https://urbit.org/docs/hoon/reference/stdlib/2i#gutby) which allows a default in case of failure to locate.)
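A minimal sketch of that recommended pattern, assuming `colors` is pinned as above (the face `shade` and the fallback value `0x0` are our own choices for illustration):

```hoon
::  look up a key, then unwrap the unit only after ruling out null
=/  shade  (~(get by colors) %orange)
?~  shade
  0x0
u.shade
```

If the key is missing, the `?~` branch yields the fallback; otherwise the stored value is available as `u.shade`, the tail of the `unit` cell.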
You can check whether a key is present using `++has:by`:
```hoon
> (~(has by colors) %teal)
%.n
> (~(has by colors) %green)
%.y
```
You can get a list of all keys with `++key:by`:
```hoon
> ~(key by colors)
{%black %red %blue %violet %green %yellow %orange}
```
You can apply a gate to each value, rather like `++turn` in Lesson 4, using `++run:by`. For instance, these gates will break the color hexadecimal value into red, green, and blue components:
```hoon
> =red |=(a=@ux ^-(@ux (cut 2 [4 2] a)))
> =green |=(a=@ux ^-(@ux (cut 2 [2 2] a)))
> =blue |=(a=@ux ^-(@ux (cut 2 [0 2] a)))
> (~(run by colors) blue)
{ [p=%black q=0x0]
[p=%red q=0x3f]
[p=%blue q=0xff]
[p=%violet q=0xa3]
[p=%green q=0x38]
[p=%yellow q=0x70]
[p=%orange q=0x33]
}
```
#### Exercise: Display Cards
- Recall the `/lib/playing-cards.hoon` library. Use a map to pretty-print the `darc`s as Unicode card symbols.
The map type should be `(map darc @t)`. We'll use `++malt` to build it and associate the fancy (if tiny) [Unicode playing card symbols](https://en.wikipedia.org/wiki/Playing_cards_in_Unicode).
Add the following arms to the library core:
```hoon
++  pp-card
  |=  c=darc
  (~(got by card-table) c)
++  card-table
  %-  malt
  ^-  (list [darc @t])
  :~  :-  [sut=%clubs val=1]  '🃑'
      :-  [sut=%clubs val=2]  '🃒'
      :-  [sut=%clubs val=3]  '🃓'
      :-  [sut=%clubs val=4]  '🃔'
      :-  [sut=%clubs val=5]  '🃕'
      :-  [sut=%clubs val=6]  '🃖'
      :-  [sut=%clubs val=7]  '🃗'
      :-  [sut=%clubs val=8]  '🃘'
      :-  [sut=%clubs val=9]  '🃙'
      :-  [sut=%clubs val=10]  '🃚'
      :-  [sut=%clubs val=11]  '🃛'
      :-  [sut=%clubs val=12]  '🃝'
      :-  [sut=%clubs val=13]  '🃞'
      :-  [sut=%diamonds val=1]  '🃁'
      :-  [sut=%diamonds val=2]  '🃂'
      :-  [sut=%diamonds val=3]  '🃃'
      :-  [sut=%diamonds val=4]  '🃄'
      :-  [sut=%diamonds val=5]  '🃅'
      :-  [sut=%diamonds val=6]  '🃆'
      :-  [sut=%diamonds val=7]  '🃇'
      :-  [sut=%diamonds val=8]  '🃈'
      :-  [sut=%diamonds val=9]  '🃉'
      :-  [sut=%diamonds val=10]  '🃊'
      :-  [sut=%diamonds val=11]  '🃋'
      :-  [sut=%diamonds val=12]  '🃍'
      :-  [sut=%diamonds val=13]  '🃎'
      :-  [sut=%hearts val=1]  '🂱'
      :-  [sut=%hearts val=2]  '🂲'
      :-  [sut=%hearts val=3]  '🂳'
      :-  [sut=%hearts val=4]  '🂴'
      :-  [sut=%hearts val=5]  '🂵'
      :-  [sut=%hearts val=6]  '🂶'
      :-  [sut=%hearts val=7]  '🂷'
      :-  [sut=%hearts val=8]  '🂸'
      :-  [sut=%hearts val=9]  '🂹'
      :-  [sut=%hearts val=10]  '🂺'
      :-  [sut=%hearts val=11]  '🂻'
      :-  [sut=%hearts val=12]  '🂽'
      :-  [sut=%hearts val=13]  '🂾'
      :-  [sut=%spades val=1]  '🂡'
      :-  [sut=%spades val=2]  '🂢'
      :-  [sut=%spades val=3]  '🂣'
      :-  [sut=%spades val=4]  '🂤'
      :-  [sut=%spades val=5]  '🂥'
      :-  [sut=%spades val=6]  '🂦'
      :-  [sut=%spades val=7]  '🂧'
      :-  [sut=%spades val=8]  '🂨'
      :-  [sut=%spades val=9]  '🂩'
      :-  [sut=%spades val=10]  '🂪'
      :-  [sut=%spades val=11]  '🂫'
      :-  [sut=%spades val=12]  '🂭'
      :-  [sut=%spades val=13]  '🂮'
  ==
```
Import the library in Dojo (or use `/+` in a generator) and build a deck:
```hoon
> =playing-cards -build-file /===/lib/playing-cards/hoon
> =deck (shuffle-deck:playing-cards make-deck:playing-cards eny)
> deck
~[
[sut=%spades val=12]
[sut=%spades val=8]
[sut=%hearts val=5]
[sut=%clubs val=2]
[sut=%diamonds val=10]
...
[sut=%spades val=2]
[sut=%hearts val=6]
[sut=%hearts val=12]
]
```
Finally, render each card in the hand to a `@t` cord:
```hoon
> =new-deck (draw:playing-cards 5 deck)
> =/  index  0
  =/  hand  *(list @t)
  |-
  ?:  =(index (lent hand:new-deck))
    hand
  $(index +(index), hand (snoc hand (pp-card:playing-cards (snag index hand:new-deck))))
<|🂭 🂨 🂵 🃒 🃊|>
```
#### Tutorial: Caesar Cipher
The Caesar cipher is a shift cipher ([that was indeed used anciently](https://en.wikipedia.org/wiki/Caesar_cipher)) wherein each letter in a message is encrypted by replacing it with one shifted some number of positions down the alphabet. For example, with a “right-shift” of `1`, `a` would become `b`, `j` would become `k`, and `z` would wrap around back to `a`.
Consider the message below, and the cipher that results when we Caesar-shift the message to the right by 1.
```
Plaintext message: "do not give way to anger"
Right-shifted cipher: "ep opu hjwf xbz up bohfs"
```
Below is a generator that performs a Caesar cipher on a `tape`. This example isn't the most compact implementation of such a cipher in Hoon, but it demonstrates important principles that more laconic code would not. Save it as `/gen/caesar.hoon` on your `%base` desk.
**/gen/caesar.hoon**
```hoon
!:
|=  [msg=tape steps=@ud]
=<
=.  msg  (cass msg)
:-  (shift msg steps)
    (unshift msg steps)
::
|%
++  alpha  "abcdefghijklmnopqrstuvwxyz"
::  Shift a message to the right.
::
++  shift
  |=  [message=tape shift-steps=@ud]
  ^-  tape
  (operate message (encoder shift-steps))
::  Shift a message to the left.
::
++  unshift
  |=  [message=tape shift-steps=@ud]
  ^-  tape
  (operate message (decoder shift-steps))
::  Rotate forwards into encryption.
::
++  encoder
  |=  [steps=@ud]
  ^-  (map @t @t)
  =/  value-tape=tape  (rotation alpha steps)
  (space-adder alpha value-tape)
::  Rotate backwards out of encryption.
::
++  decoder
  |=  [steps=@ud]
  ^-  (map @t @t)
  =/  key-tape=tape  (rotation alpha steps)
  (space-adder key-tape alpha)
::  Apply the map of decrypted->encrypted letters to the message.
::
++  operate
  |=  [message=tape shift-map=(map @t @t)]
  ^-  tape
  %+  turn  message
  |=  a=@t
  (~(got by shift-map) a)
::  Handle spaces in the message.
::
++  space-adder
  |=  [key-position=tape value-result=tape]
  ^-  (map @t @t)
  (~(put by (map-maker key-position value-result)) ' ' ' ')
::  Produce a map from each letter to its encrypted value.
::
++  map-maker
  |=  [key-position=tape value-result=tape]
  ^-  (map @t @t)
  =|  chart=(map @t @t)
  ?.  =((lent key-position) (lent value-result))
    ~|  %uneven-lengths  !!
  |-
  ?:  |(?=(~ key-position) ?=(~ value-result))
    chart
  $(chart (~(put by chart) i.key-position i.value-result), key-position t.key-position, value-result t.value-result)
::  Cycle an alphabet around, e.g. from
::  'ABCDEFGHIJKLMNOPQRSTUVWXYZ' to 'BCDEFGHIJKLMNOPQRSTUVWXYZA'
::
++  rotation
  |=  [my-alphabet=tape my-steps=@ud]
  =/  length=@ud  (lent my-alphabet)
  =+  (trim (mod my-steps length) my-alphabet)
  (weld q p)
--
```
This generator takes two arguments: a `tape`, which is your plaintext message, and an unsigned integer, which is the shift-value of the cipher. It produces a cell of two `tape`s: one that has been shifted right by the value, and another that has been shifted left. It also converts any uppercase input into lowercase.
Try it out in the Dojo:
```hoon
> +caesar ["abcdef" 1]
["bcdefg" "zabcde"]
> +caesar ["test" 2]
["vguv" "rcqr"]
> +caesar ["test" 26]
["test" "test"]
> +caesar ["test" 28]
["vguv" "rcqr"]
> +caesar ["test" 104]
["test" "test"]
> +caesar ["tESt" 2]
["vguv" "rcqr"]
> +caesar ["test!" 2]
nest-fail
```
##### Examining the Code
Let's examine our caesar.hoon code piece by piece. We won't necessarily go in written order; instead, we'll cover code in the intuitive order of the program. For each chunk that we cover, try to read and understand the code itself before reading the explanation.
There are a few runes in this which we haven't seen yet; we will deal with them incidentally in the commentary.
```hoon
!:
|=  [msg=tape steps=@ud]
=<
```
The `!:` in the first line of the above code enables a full stack trace in the event of an error.
`|= [msg=tape steps=@ud]` creates a [gate](https://urbit.org/docs/glossary/gate) that takes a cell. The head of this cell is a `tape`, which is a string type that's a list of single-character `cord`s. Tapes are represented as text surrounded by double-quotes, such as this: `"a tape"`. We give this input tape the face `msg`. The tail of our cell is a `@ud`, an unsigned decimal [atom](https://urbit.org/docs/glossary/atom), that we give the face `steps`.
`=<` is the rune that evaluates its first child expression with respect to its second child expression as the subject. In this case, we evaluate the expressions in the code chunk below against the [core](https://urbit.org/docs/glossary/core) declared later, which allows us to reference the core's contained [arms](https://urbit.org/docs/glossary/arm) before they are defined. Without `=<`, we would need to put the code chunk below at the bottom of our program. In Hoon, as previously stated, we always want to keep the longer code towards the bottom of our programs, and `=<` helps us do that.
```hoon
=.  msg  (cass msg)
:-  (shift msg steps)
    (unshift msg steps)
```
`=. msg (cass msg)` changes the input string `msg` to lowercase. `=.` changes a leg of the subject to something else. In our case, the leg to be changed is `msg`, and the thing to replace it is `(cass msg)`. `cass` is a standard-library gate that converts uppercase letters to lowercase.
`:- (shift msg steps)` and `(unshift msg steps)` simply compose a cell of a right-shifted cipher and a left-shifted cipher of our original message. We will see how this is done using the core described below, but this is the final output of our generator. We have indented the lower line, which is not strictly good Hoon style but makes the intent clearer.
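To see `=<` on its own, here is a smaller sketch in the same shape as the generator. The arm name `double` is hypothetical and exists only for this illustration:

```hoon
::  the call comes first; the core it relies on comes second
=<  (double 4)
|%
++  double  |=(a=@ (mul 2 a))
--
```

The call `(double 4)` is written first but evaluated against the core that follows it, so `double` resolves and the expression should produce `8`.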
```hoon
|%
```
`|%` creates a `core`, the second child of `=<`. Everything after `|%` is part of that second child `core`, and will be used as the subject of the first child of `=<`, described above. The various parts, or `arm`s, of the `core` are denoted by `++` beneath it, for instance:
```hoon
++  rotation
  |=  [my-alphabet=tape my-steps=@ud]
  =/  length=@ud  (lent my-alphabet)
  =+  (trim (mod my-steps length) my-alphabet)
  (weld q p)
```
The `++rotation` arm takes a specified number of characters off the front of a tape and puts them on the end of the tape. We're going to use this to create our shifted alphabet, based on the number of `steps` given as an argument to our gate.
`|= [my-alphabet=tape my-steps=@ud]` creates a gate that takes two arguments: `my-alphabet`, a `tape`, and `my-steps`, a `@ud`.
`=/ length=@ud (lent my-alphabet)` stores the length of `my-alphabet` to make the following code a little clearer.
`trim` is a gate from the standard library that splits a tape into two parts at a specified position. So `=+ (trim (mod my-steps length) my-alphabet)` splits the tape `my-alphabet` into two parts, `p` and `q`, which are now directly available in the subject. We call the modulus operation `mod` to make sure that the point at which we split our `tape` is a valid point inside of `my-alphabet` even if `my-steps` is greater than `length`, the length of `my-alphabet`. Try `trim` in the dojo:
```hoon
> (trim 2 "abcdefg")
[p="ab" q="cdefg"]
> (trim 4 "yourbeard")
[p="your" q="beard"]
```
`(weld q p)` uses `weld`, which combines two strings into one. Remember that `trim` has given us a split version of `my-alphabet` with `p` being the front half that was split off of `my-alphabet` and `q` being the back half. Here we are welding the two parts back together, but in reverse order: the second part `q` is welded to the front, and the first part `p` is welded to the back.
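Putting those two steps together on a short tape shows the rotation in miniature. This is a sketch that mirrors the body of `++rotation`; the sample tape and split position are arbitrary:

```hoon
::  split after two characters, then weld the halves in reverse order
=+  (trim 2 "abcdefg")
(weld q p)
```

With `p="ab"` and `q="cdefg"`, this should produce `"cdefgab"`.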
```hoon
++  map-maker
  |=  [key-position=tape value-result=tape]
  ^-  (map @t @t)
  =|  chart=(map @t @t)
  ?.  =((lent key-position) (lent value-result))
    ~|  %uneven-lengths  !!
  |-
  ?:  |(?=(~ key-position) ?=(~ value-result))
    chart
  $(chart (~(put by chart) i.key-position i.value-result), key-position t.key-position, value-result t.value-result)
```
The `++map-maker` arm, as the name implies, takes two tapes and creates a [`map`](https://urbit.org/docs/hoon/reference/stdlib/2o#map) out of them. A `map` is a type equivalent to a dictionary in other languages: it's a data structure that associates a key with a value. If, for example, we wanted to have an association between `a` and 1 and `b` and 2, we could use a `map`.
`|= [key-position=tape value-result=tape]` builds a gate that takes two tapes, `key-position` and `value-result`, as its sample.
`^- (map @t @t)` casts the gate's product to a `map` with a `cord` (or `@t`) key and a `cord` value.
You might wonder, if our gate in this arm takes `tape`s, why then are we producing a map of `cord` keys and values?
As we discussed earlier, a `tape` is a list of `cord`s. In this case what we are going to do is map a single element of a `tape` (either our alphabet or shifted-alphabet) to an element of a different `tape` (either our shifted-alphabet or our alphabet). This pair will therefore be a pair of `cord`s. When we go to use this `map` to convert our incoming `msg`, we will take each element (`cord`) of our `msg` `tape`, use it as a `key` when accessing our `map` and get the corresponding `value` from that position in the `map`. This is how we're going to encode or decode our `msg` `tape`.
`=| chart=(map @t @t)` adds a [noun](https://urbit.org/docs/glossary/noun) to the subject with the default value of the `(map @t @t)` type, and gives that noun the face `chart`.
`?. =((lent key-position) (lent value-result))` checks if the two `tape`s are the same length. If not, the program crashes with an error message of `%uneven-lengths`, using `~| %uneven-lengths !!`.
If the two `tape`s are of the same length, we continue on to create a trap. `|-` creates a [trap](https://urbit.org/docs/glossary/trap), a gate with no arguments that is called immediately.
`?: |(?=(~ key-position) ?=(~ value-result))` checks if either `tape` is empty. If this is true, the `map-maker` arm is finished and can return `chart`, the `map` that we have been creating.
If the above test finds that the `tape`s are not empty, we trigger a recursion that constructs our `map`: `$(chart (~(put by chart) i.key-position i.value-result), key-position t.key-position, value-result t.value-result)`. This code recursively adds an entry in our `map` where the head of the `tape` `key-position` maps to the head of the `tape` `value-result` with `~(put by chart)`, our calling of the `put` arm of the `by` map-engine core. (Note that `~(<wing> <door> <sample>)` is shorthand for `%~ <wing> <door> <sample>`; see the [Calls % ('cen')](https://urbit.org/docs/hoon/reference/rune/cen#censig) documentation for more information.) The recursion also "consumes" those heads with every iteration by changing `key-position` and `value-result` to their tails using `key-position t.key-position, value-result t.value-result`.
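The same `put` call can be tried on its own. This sketch starts from the bunt of `(map @t @t)`, just as `=|` does for `chart`, and then reads the entry back with `++got:by`:

```hoon
> =chart (~(put by *(map @t @t)) 'a' 'b')
> (~(got by chart) 'a')
'b'
```

Pinning the result as `chart` in the Dojo is only for this illustration; inside `++map-maker` the updated map is passed along through the recursion instead.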
We have three related arms to look at next, `++decoder`, `++encoder`, and `++space-adder`. `++space-adder` is required for the other two, so we'll look at it first.
```hoon
++  space-adder
  |=  [key-position=tape value-result=tape]
  ^-  (map @t @t)
  (~(put by (map-maker key-position value-result)) ' ' ' ')
```
`|= [key-position=tape value-result=tape]` creates a gate that takes two `tapes`.
We use the `put` arm of the `by` core on the next line, giving it a `map` produced by the `map-maker` arm that we created before as its sample. This adds an entry to the map where the space character (called `ace`) simply maps to itself. This is done to simplify the handling of spaces in `tapes` we want to encode, since we don't want to shift them.
```hoon
++  encoder
  |=  [steps=@ud]
  ^-  (map @t @t)
  =/  value-tape=tape  (rotation alpha steps)
  (space-adder alpha value-tape)
++  decoder
  |=  [steps=@ud]
  ^-  (map @t @t)
  =/  key-tape=tape  (rotation alpha steps)
  (space-adder key-tape alpha)
```
`++encoder` and `++decoder` utilize the `rotation` and `space-adder` arms. These gates are essentially identical, with the arguments passed to `space-adder` reversed. They simplify the two common transactions you want to do in this program: producing `maps` that we can use to encode and decode messages.
In both cases, we create a gate that accepts a `@ud` named `steps`.
In `encoder`: `=/ value-tape=tape (rotation alpha steps)` creates a `value-tape` noun by calling `rotation` on `alpha`. `alpha` is our arm which contains a `tape` of the entire alphabet. The `value-tape` will be the list of `value`s in our `map`.
In `decoder`: `=/ key-tape=tape (rotation alpha steps)` does the same work, but when passed to `space-adder` it will be the list of `key`s in our `map`.
`(space-adder alpha value-tape)`, for `encoder`, and `(space-adder key-tape alpha)`, for `decoder`, produce a `map` that has the first argument as the keys and the second as the values.
If our two inputs to `space-adder` were `"abcdefghijklmnopqrstuvwxyz"` and `"bcdefghijklmnopqrstuvwxyza"`, we would get a `map` where `'a'` maps to `'b'`, `'b'` to `'c'` and so on. By doing this we can produce a `map` that gives us a translation between the alphabet and our shifted alphabet, or vice versa.
Still with us? Good. We are finally about to use all the stuff that we've walked through.
```hoon
++  shift
  |=  [message=tape shift-steps=@ud]
  ^-  tape
  (operate message (encoder shift-steps))
++  unshift
  |=  [message=tape shift-steps=@ud]
  ^-  tape
  (operate message (decoder shift-steps))
```
Both `++shift` and `++unshift` take two arguments: our `message`, the `tape` that we want to manipulate; and our `shift-steps`, the number of positions of the alphabet by which we want to shift our message.
`++shift` is for encoding, and `++unshift` is for decoding. Thus, `++shift` calls the `operate` arm with `(operate message (encoder shift-steps))`, and `++unshift` makes that call with `(operate message (decoder shift-steps))`. These both produce the final output of the core, to be called in the form of `(shift msg steps)` and `(unshift msg steps)` in the cell being created at the beginning of our code.
```hoon
++  operate
  |=  [message=tape shift-map=(map @t @t)]
  ^-  tape
  %+  turn  message
  |=  a=@t
  (~(got by shift-map) a)
```
`++operate` produces a `tape`. The `%+` rune calls a gate with a pair sample. The gate we are going to call is `turn`, which takes two arguments: a `list` and a `gate` to apply to each element of the `list`.
In this case, the `gate` we apply to our `message` uses the `got` arm of the `by` door with our `shift-map` as the sample. Depending on whether we are encoding or decoding, `shift-map` has either the standard alphabet for keys and the shifted alphabet for values, or the other way around. The gate looks up each `cord` in our `message`, one by one, and replaces it with the `value` from our `map` (either the encoded or decoded version).
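A reduced sketch of the same lookup, using a hypothetical two-entry map pinned as `swap` rather than a full alphabet:

```hoon
> =swap (~(put by (~(put by *(map @t @t)) 'a' 'b')) 'b' 'a')
> `tape`(turn "abba" |=(x=@t (~(got by swap) x)))
"baab"
```

Any character missing from the map would make `++got:by` crash, which is why `++space-adder` adds an entry for the space character.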
Let's give our generator Caesar's famous statement (translated into English!) and get our right-cipher and left-cipher.
```hoon
> +caesar ["i came i saw i conquered" 4]
["m geqi m wea m gsruyivih" "e ywia e ows e ykjmqanaz"]
```
Now, to decode, we can put either of our ciphers in with the appropriate key and look for the legible result.
```hoon
> +caesar ["m geqi m wea m gsruyivih" 4]
["q kium q aie q kwvycmzml" "i came i saw i conquered"]
> +caesar ["e ywia e ows e ykjmqanaz" 4]
["i came i saw i conquered" "a usew a kso a ugfimwjwv"]
```
##### Further Exercise
1. Take the example generator and modify it to add a second layer of shifts.
2. Extend the example generator to allow for use of characters other than a-z. Make it shift the new characters independently of the alpha characters, such that punctuation is only encoded as other punctuation marks.
3. Build a gate that can take a Caesar shifted `tape` and produce all possible unshifted `tapes`.
4. Modify the example generator into a `%say` generator.
## A Bit More on Cores
The [`|^` barket](https://urbit.org/docs/hoon/reference/rune/bar#-barket) rune is an example of what we can call a _convenience rune_, similar to the idea of sugar syntax (irregular syntax to make writing certain things out in a more expressive manner). `|^` barket produces a core with _at least_ a `$` buc arm and computes it immediately, called a _cork_. (So a cork is like a trap in the regard of computing immediately, but it has more arms than just `$` buc.)
This code calculates the volume of a cylinder, _V=πr²h_.
```hoon
=volume-of-cylinder |^
(mul:rs (area-of-circle .2.0) height)
++  area-of-circle
  |=  r=@rs
  (mul:rs pi (mul:rs r r))
++  pi  .3.1415926
++  height  .10.0
--
```
Since all of the values either have to be pinned ahead of time or made available as arms, a `|^` barket would probably be used inside of a gate. Of course, since it is a core with a `$` buc arm, one could also use it recursively to calculate values like the factorial.
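As a sketch of that recursive use, here is a factorial written with `|^` inside a gate. The helper arm `done` and the file name `/gen/factorial.hoon` are hypothetical names chosen for this example, and this is only one of several reasonable ways to write it:

```hoon
::  /gen/factorial.hoon (hypothetical file name)
|=  n=@ud
|^  ^-  @ud
?:  (done n)
  1
(mul n $(n (dec n)))
::  helper arm: has the countdown reached zero?
++  done
  |=  a=@ud
  =(a 0)
--
```

Here `$(n (dec n))` re-evaluates the `$` buc arm of the cork with `n` decremented, and `done` is available because it lives in the same core. Saved as a generator, `+factorial 5` should produce `120`.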
If you read the docs, you'll find that a [`|-` barhep](https://urbit.org/docs/hoon/reference/rune/bar#-barhep) rune “produces a trap (a core with one arm `$`) and evaluates it.” So a trap actually evaluates to a `|%` barcen core with an arm `$`:
```hoon
::  count to five
=/  index  1
|-
?:  =(index 5)  index
$(index +(index))
```
actually translates to
```hoon
::  count to five
=/  index  1
=<  $
|%
++  $
  ?:  =(index 5)  index
  %=  $
    index  +(index)
  ==
--
```
You can also create a trap for later use with the [`|.` bardot](https://urbit.org/docs/hoon/reference/rune/bar#-bardot) rune. It's quite similar, but it lacks the `=<($...` part, so it doesn't get evaluated immediately.
```hoon
> =forty-two |.(42)
> $:forty-two
42
> (forty-two)
42
```
What is a gate? It is a door with only one arm `$` buc, and whenever you invoke it then that default arm's expression is referred to and evaluated.
A _gate_ and a _trap_ are actually very similar: a [gate](https://urbit.org/docs/hoon/reference/rune/bar#-bartis) simply has a sample (and can actively change when evaluated or via a `%=` centis), whereas a trap does not (and can _only_ be passively changed via something like `%=` centis).

View File

@ -0,0 +1,370 @@
---
title: Data Structures
nodes: 183
objectives:
- "Identify units, sets, maps, and compound structures like jars and jugs."
- "Explain why units and vases are necessary."
- "Use helper arms and syntax: `` ` ``, `biff`, `some`, etc."
---
# Data Structures
_This module will introduce you to several useful data structures built on the door, then discuss how the compiler handles types and the sample._
## Key Data Structures and Molds
`++map`s are a versatile way to store and access data, but they are far from the only useful pattern. `++map`s were documented in [the previous module](./K-doors.md).
### `tree`
We use `tree` to make a binary tree data structure in Hoon, e.g., `(tree @)` for a binary tree of atoms.
There are two kinds of `tree` in Hoon:
1. The null tree `~`.
2. A non-null tree which is a cell with three parts.
   1. The node value.
   2. The left child of the node.
   3. The right child of the node.
Each child is itself a tree. The node value has the face `n`, the left child has the face `l`, and the right child has the face `r`. The following diagram provides an illustration of a `(tree @)` (without the faces):
```
          12
        /    \
       8      14
     /   \   /   \
    4     ~ ~     16
  /  \           /  \
 ~    ~         ~    ~
```
Hoon supports trees of any type that can be constructed in Hoon, e.g.: `(tree @)`, `(tree ^)`, `(tree [@ ?])`, etc. Let's construct the tree in the diagram above in the dojo, casting it accordingly:
```
> `(tree @)`[12 [8 [4 ~ ~] ~] [14 ~ [16 ~ ~]]]
{4 8 12 14 16}
```
Notice that we don't have to insert the faces manually; by casting the [noun](/docs/glossary/noun/) above to a `(tree @)` Hoon inserts the faces for us. Let's put this noun in the dojo subject with the face `b` and pull out the tree at the left child of the `12` node:
```
> =b `(tree @)`[12 [8 [4 ~ ~] ~] [14 ~ [16 ~ ~]]]
> b
{4 8 12 14 16}
> l.b
-find.l.b
find-fork-d
```
This didn't work because we haven't first proved to Hoon that `b` is a non-null tree. A null tree has no `l` in it, after all. Let's try again, using `?~` to prove that `b` isn't null. We can also look at `r` and `n`:
```
> ?~(b ~ l.b)
{4 8}
> ?~(b ~ r.b)
{14 16}
> ?~(b ~ n.b)
12
```
#### Find and Replace
Here's a program that finds and replaces certain atoms in a `(tree @)`.
```hoon
|=  [nedl=@ hay=(tree @) new=@]
^-  (tree @)
?~  hay  ~
:+  ?:  =(n.hay nedl)
      new
    n.hay
  $(hay l.hay)
$(hay r.hay)
```
`nedl` is the atom to be replaced, `hay` is the tree, and `new` is the new atom with which to replace `nedl`. Save this as `findreplacetree.hoon` and run in the dojo:
```
> b
{4 8 12 14 16}
> +findreplacetree [8 b 94]
{4 94 12 14 16}
> +findreplacetree [14 b 94]
{4 8 12 94 16}
```
### `set`
A `set` is rather like a `list` except that each entry can only be represented once. As with a `map`, a `set` is typically associated with a particular type, such as `(set @ud)` for a collection of decimal values. (`set`s also don't have an order, so they're basically a bag of unique values.)
`set` operations are provided by `++in`. Most names are similar to `map`/`++by` operations when applicable.
[`++silt`](https://urbit.org/docs/hoon/reference/stdlib/2l#silt) produces a `set` from a `list`:
```hoon
> =primes (silt ~[2 3 5 7 11 13])
```
`++put:in` adds a value to a `set` (and is a no-op when the value is already present):
```hoon
> =primes (~(put in primes) 17)
> =primes (~(put in primes) 13)
```
`++del:in` removes a value from a `set`:
```hoon
> =primes (~(put in primes) 18)
> =primes (~(del in primes) 18)
```
`++has:in` checks for existence:
```hoon
> (~(has in primes) 15)
%.n
> (~(has in primes) 17)
%.y
```
`++tap:in` yields a `list` of the values:
```hoon
> ~(tap in primes)
~[3 2 7 5 11 13 17]
> (sort ~(tap in primes) lth)
~[2 3 5 7 11 13 17]
```
`++run:in` applies a function across all values:
```hoon
> (~(run in primes) dec)
{10 6 12 1 2 16 4}
```
#### Example: Cartesian Product
Here's a program that takes two sets of atoms and returns the [Cartesian product](https://en.wikipedia.org/wiki/Cartesian_product) of those sets. A Cartesian product of two sets `a` and `b` is a set of all the cells whose head is a member of `a` and whose tail is a member of `b`.
```hoon
|=  [a=(set @) b=(set @)]
=/  c=(list @)  ~(tap in a)
=/  d=(list @)  ~(tap in b)
=|  acc=(set [@ @])
|-  ^-  (set [@ @])
?~  c  acc
%=  $
  c  t.c
  acc  |-  ?~  d  acc
       %=  $
         d  t.d
         acc  (~(put in acc) [i.c i.d])
       ==
==
```
Save this as `/gen/cartesian.hoon` in your urbit's pier and run it in the Dojo:
```
> =c `(set @)`(sy ~[1 2 3])
> c
{1 2 3}
> =d `(set @)`(sy ~[4 5 6])
> d
{5 6 4}
> +cartesian [c d]
{[2 6] [1 6] [3 6] [1 4] [1 5] [2 4] [3 5] [3 4] [2 5]}
```
### `unit` Redux (and `vase`)
We encountered the `unit` briefly as a tool for distinguishing null results from actual zeroes: using a `unit` allows you to specify something that may not be there. For this reason, `unit`s are commonly used for operations that sometimes fail, such as search functions, database lookups, remote data requests, etc.
You can build a `unit` using the tic special notation or [`++some`](https://urbit.org/docs/hoon/reference/stdlib/2a#some):
```hoon
> `%mars
[~ %mars]
> (some %mars)
[~ u=%mars]
```
While `++got:by` is one way to get a value back without wrapping it in a `unit`, it's better practice to use the [`unit` logic](https://urbit.org/docs/hoon/reference/stdlib/2a) gates, which let your code work correctly with `unit`s.
For example, use [`++need`](https://urbit.org/docs/hoon/reference/stdlib/2a#need) to unwrap a `unit`, or crash if the `unit` is `~` null.
```hoon
> =colors (malt `(list (pair @tas @ux))`~[[%red 0xed.0a3f] [%yellow 0xfb.e870] [%green 0x1.a638] [%blue 0x66ff]])
> (~(get by colors) %yellow)
[~ q=0xfb.e870]
> (need (~(get by colors) %yellow))
0xfb.e870
> (~(get by colors) %teal)
~
> (need (~(get by colors) %teal))
dojo: hoon expression failed
```
Rather than unwrap a `unit`, one can apply gates to `unit`s directly even if the gates are not natively set up that way. For instance, one cannot decrement a `unit` because `++dec` doesn't accept a `unit`. [`++bind`](https://urbit.org/docs/hoon/reference/stdlib/2a#bind) applies a non-`unit` gate to the value inside a `unit` and passes `~` through unchanged.
```hoon
> (bind ((unit @ud) [~ 2]) dec)
[~ 1]
> (bind (~(get by colors) %orange) red)
[~ 0xff]
```
(There are several other tools listed [on that page](https://urbit.org/docs/hoon/reference/stdlib/2a) which may be useful to you.)
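One of those tools is `++fall`, which unwraps a `unit` but substitutes a default value instead of crashing when the `unit` is `~`. A minimal sketch, using explicitly cast `unit`s so that it stands alone:

```hoon
> (fall `(unit @ud)`[~ 5] 0)
5
> (fall `(unit @ud)`~ 0)
0
```

The same pattern should apply to map lookups, e.g. `(fall (~(get by colors) %teal) 0x0)` to fall back to a default color when a key is missing.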
A `+$vase` is a pair of type and value, such as that returned by `!>` zapgar. A `vase` is useful when transmitting data in a way that may lose its type information.
### Containers of Containers
`map`s and `set`s are frequently used in the standard library and in the extended ecosystem (such as in `graph-store`). There are some other common patterns which recur often enough that they have their own names:
- [`++jar`](https://urbit.org/docs/hoon/reference/stdlib/2o#jar) is a mold for a `map` of `list`s. `++jar` uses the [`++ja`](https://urbit.org/docs/hoon/reference/stdlib/2j#ja) core.
- [`++jug`](https://urbit.org/docs/hoon/reference/stdlib/2o#jug) is a mold for a `map` of `set`s. `++jug` uses the [`++ju`](https://urbit.org/docs/hoon/reference/stdlib/2j#ju) core.
- `++mip` is a mold for a map of maps. `++mip` lives in the `%garden` desk in the Urbit repo in `/lib/mip.hoon`. Affordances are still few and there are not currently docs on how to use `++mip`, but a short example follows:
```hoon
> =mip -build-file /=garden=/lib/mip/hoon
> =my-map-warm (malt `(list (pair @tas @ux))`~[[%red 0xed.0a3f] [%yellow 0xfb.e870]])
> =my-map-cool (malt `(list (pair @tas @ux))`~[[%green 0x1.a638] [%blue 0x66ff]])
> =my-mip *(mip:mip @tas (map @tas @ux))
> =my-mip (~(put bi:mip my-mip) %cool %blue 0x66ff)
> =my-mip (~(put bi:mip my-mip) %cool %green 0x1.a638)
> =my-mip (~(put bi:mip my-mip) %warm %red 0xed.0a3f)
> =my-mip (~(put bi:mip my-mip) %warm %yellow 0xfb.e870)
> my-mip
[ n=[p=%warm q=[n=[p=%yellow q=0xfb.e870] l=[n=[p=%red q=0xed.0a3f] l=~ r=~] r=~]]
l=[n=[p=%cool q=[n=[p=%green q=0x1.a638] l=[n=[p=%blue q=0x66ff] l=~ r=~] r=~]] l=~ r=~]
r=~
]
> (~(got bi:mip my-mip) %cool %green)
0x1.a638
> ~(tap bi:mip my-mip)
~[
[x=%warm y=%yellow v=0xfb.e870]
[x=%warm y=%red v=0xed.0a3f]
[x=%cool y=%green v=0x1.a638]
[x=%cool y=%blue v=0x66ff]
]
```
`mip`s are unjetted and quite slow but serve as a proof of concept.
- `++mop` ordered maps are discussed in [the App School guides](TODO).
## Molds and Samples
### Modifying Gate Behavior
Sometimes you need to modify parts of a core (like a gate) on-the-fly to get the desired behavior. For instance, if you are using `++roll` to calculate the multiplicative product of the elements of a list, this “just works”:
```hoon
> (roll `(list @ud)`~[10 12 14 16 18] mul)
483.840
```
In contrast, if you do the same thing to a list of numbers with a fractional part (`@rs` floating-point values), the naïve operation will fail:
```hoon
> (roll `(list @rs)`~[.10 .12 .14 .16 .18] mul:rs)
.0
```
Why is this? Let's peek inside the gates and see. Since we know a core is a cell of `[battery payload]`, let's take a look at the `payload`:
```hoon
> +:mul
[[a=1 b=1] <46.hgz 1.pnw %140>]
> +:mul:rs
[[a=.0 b=.0] <21.hqd [r=?(%d %n %u %z) <51.qbt 123.zao 46.hgz 1.pnw %140>]>]
```
The correct behavior for `++mul:rs` is really to multiply starting from one, not zero, so that `++roll` doesn't wipe out the entire product.
### Custom Samples
In an earlier exercise we created a door with sample `[a=@ud b=@ud c=@ud]`. If we investigated, we would find that the initial value of each is `0`, the bunt value of `@ud`.
```hoon
> +6:poly
[a=0 b=0 c=0]
```
What if we wish to define a door with a chosen sample value directly? We can make use of the `$_` rune, whose irregular form is simply `_`. To create the door `poly` with the sample set to have certain values in the Dojo, we would write
```hoon
> =poly |_  [a=_5 b=_4 c=_3]
++  quad
  |=  x=@ud
  (add (add (mul a (mul x x)) (mul b x)) c)
--
> (quad:poly 2)
31
```
For our earlier example with `++roll`, if we wanted to set the default sample to have a different value than the bunt of the type, we could use `_` cab:
```hoon
> =mmul |=([a=_.1 b=_.1] (mul:rs a b))
> (roll `(list @rs)`~[.10 .12 .14 .16 .18] mmul)
.483840
```
### Named Tuples
A named tuple is a structured collection of values with faces. The [`$:` buccol](https://urbit.org/docs/hoon/reference/rune/buc#-buccol) rune forms a named tuple. We use these implicitly in an irregular form when we specify the sample of a gate, as `|=([a=@ b=@] (add a b))` expands to a `$:` buccol expression for `[a=@ b=@]`. Otherwise, we only need these if we are building a special type like a vector (e.g. with two components like an _x_ and a _y_).
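As a quick illustration of that equivalence, the two gates below should behave identically; the first uses the irregular sample form and the second spells out `$:` buccol in its wide form. This is a sketch to try in the Dojo rather than code from any library:

```hoon
> (|=([a=@ b=@] (add a b)) 2 3)
5
> (|=($:(a=@ b=@) (add a b)) 2 3)
5
```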
### Structure Mode
Most Hoon expressions evaluate in what we'll call _noun mode_ (or _normal mode_), which produces values. However, sample definitions and `+$` lusbuc mold specification arms evaluate in what is called _structure mode_. (You may occasionally see this called by the older term “spec mode”.) Structure mode expressions use a similar syntax to regular Hoon expressions but create structure definitions instead.
For instance, in noun mode the expression `p=1` is an irregular form of the [`^=` kettis](https://urbit.org/docs/hoon/reference/rune/ket#-kettis) rune. This is one way to define a variable using a [`=+` tislus](https://urbit.org/docs/hoon/reference/rune/tis#-tislus); these are equivalent statements:
```hoon
> =+(hello=1 hello)
1
> =+(^=(hello 1) hello)
1
```
(Normally we have preferred [`=/` tisfas](https://urbit.org/docs/hoon/reference/rune/tis#-tisfas) in the Hoon School docs, but that is just for consistency.)
In a sample definition, such as in a gate, the statement is evaluated in structure mode; these are equivalent statements:
```hoon
|=(hello=@ hello)
|=($=(hello @) hello)
```
There are several other subtle cases where normal mode and structure mode diverge, but most of the time structure mode is invisible to you. The [`$` buc runes](https://urbit.org/docs/hoon/reference/rune/buc) are typically invoked in structure mode.

View File

@ -0,0 +1,856 @@
---
title: Type Checking
nodes: 183
objectives:
- "Use assertions to enforce type constraints."
---
# Type Checking
_In this module we'll cover how the Hoon compiler infers type, as well as various cases in which a type check is performed._
## Type Casting
Casting is used to explain to the Hoon compiler exactly what it is we mean with a given data structure. As you get in the habit of casting your data structures, it will not only help anyone reading your code, but it will help you in hunting down bugs in your code.
`++list` is a mold builder that is used to produce a mold, i.e. a list of a particular type (like `(list @)` for a list of atoms). A list can be thought of as an ordered arrangement of zero or more elements terminated by a `~` (null). There is a difference to Hoon, however, between something explicitly tagged as a `list` of some kind and a null-terminated tuple.
```hoon
> -:!>(~[1 2 3])
#t/[@ud @ud @ud %~]
> -:!>(`(list @)`~[1 2 3])
#t/it(@)
```
The former is inflexible and doesn't have the `i`/`t` faces that a list presents. By marking the type explicitly as a `(list @)` for the compiler, we achieve some stronger guarantees that many of the `list` operators require.
However, we still don't get the faces for free:
```hoon
> =a `(list @)`~[1 2 3]
> i.a
-find.i.a
find-fork
dojo: hoon expression failed
```
What's going on? Formally, a list can be either null or non-null. When the list contains only `~` and no items, it's the null list. Most lists are, however, non-null lists, which have items preceding the `~`. Non-null lists, called _lests_, are cells in which the head is the first list item, and the tail is the rest of the list. The tail is itself a list, and if such a list is also non-null, the head of this sublist is the second item in the greater list, and so on. To illustrate, let's look at a list `[1 2 3 4 ~]` with the cell-delineating brackets left in:
```hoon
[1 [2 [3 [4 ~]]]]
```
It's easy to see where the heads are and where the nesting tails are. The head of the above list is the atom `1` and the tail is the list `[2 [3 [4 ~]]]` (or `[2 3 4 ~]`). Recall that whenever cell brackets are omitted so that visually there appears to be more than two child nouns, it is implicitly understood that the right-most nouns constitute a cell.
You can construct lists of any type. `(list @)` indicates a list of atoms, `(list ^)` indicates a list of cells, `(list [@ ?])` indicates a list of cells whose head is an atom and whose tail is a flag, etc.
```hoon
> `(list @)`~
~
> `(list @)`[1 2 3 4 5 ~]
~[1 2 3 4 5]
> `(list @)`[1 [2 [3 [4 [5 ~]]]]]
~[1 2 3 4 5]
> `(list @)`~[1 2 3 4 5]
~[1 2 3 4 5]
```
Notice how the last Dojo command has a different construction, with the `~` in front of the bracketed items. This is just another way of writing the same thing; `~[1 2]` is semantically identical to `[1 2 ~]`.
Back to our earlier example:
```hoon
> =a `(list @)`~[1 2 3]
> i.a
-find.i.a
find-fork
dojo: hoon expression failed
```
Any time we see a `find-fork` error, it means that the type checker considers the value to be underspecified. In this case, it can't guarantee that `i.a` exists because although `a` is a list, it's not known to be a non-null lest. If we enforce that constraint, then suddenly we can use the faces:
```hoon
> ?:  ?=(~ a)  !!  i.a
1
```
It's important to note that performing tests like this will actually transform a `list` into a `lest`, a non-null list. Because `lest` is a different type than `list`, performing such tests can come back to bite you later in non-obvious ways when you try to use some standard library functions meant for lists.
### Casting Nouns (`^` ket Runes)
As the Hoon compiler compiles your Hoon code, it does a type check on certain expressions to make sure they are guaranteed to produce a value of the correct type. If it cannot be proved that the output value is correctly typed, the compile will fail with a nest-fail crash. In order to figure out what type of value is produced by a given expression, the compiler uses type inference on that code.
Let's enumerate the most common cases where a type check is called for in Hoon.
The most obvious case is when there is a casting `^` ket rune in your code. These runes don't directly have any effect on the compiled result of your code; they simply indicate that a type check should be performed on a piece of code at compile-time.
#### `^-` kethep Cast with a Type
You've already seen one rune that calls for a type check: [`^-` kethep](https://urbit.org/docs/hoon/reference/rune/ket#--kethep):
```hoon
> ^-(@ 12)
12
> ^-(@ 123)
123
> ^-(@ [12 14])
nest-fail
> ^-(^ [12 14])
[12 14]
> ^-(* [12 14])
[12 14]
> ^-(* 12)
12
> ^-([@ *] [12 [23 43]])
[12 [23 43]]
> ^-([@ *] [[12 23] 43])
nest-fail
```
#### `^+` ketlus Cast with an Example Value
The rune [`^+` ketlus](https://urbit.org/docs/hoon/reference/rune/ket#-ketlus) is like `^-` kethep, except that instead of using a type name for the cast, it uses an example value of the type in question. E.g.:
```hoon
> ^+(7 12)
12
> ^+(7 123)
123
> ^+(7 [12 14])
nest-fail
```
The `^+` ketlus rune takes two subexpressions. The first subexpression is evaluated and its type is inferred. The second subexpression is evaluated and its inferred type is compared against the type of the first. If the type of the second provably nests under the type of the first, the result of the `^+` ketlus expression is just the value of its second subexpression. Otherwise, the code fails to compile.
This rune is useful for casting when you already have a noun—or an expression producing a noun—whose type you may not know or be able to construct easily. If you want your output value to be of the same type, you can use `^+` ketlus.
More examples:
```hoon
> ^+([12 13] [123 456])
[123 456]
> ^+([12 13] [123 [12 14]])
nest-fail
> ^+([12 [1 2]] [123 [12 14]])
[123 12 14]
```
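A common place to reach for `^+` ketlus is a trap that builds up an accumulator: casting to the accumulator itself avoids restating its type. The generator below is a sketch (the file name `/gen/reverse.hoon` and the face `acc` are hypothetical) that reverses a list of atoms:

```hoon
::  /gen/reverse.hoon (hypothetical name): reverse a list of atoms
|=  a=(list @)
=|  acc=(list @)
::  cast the trap's product to whatever type acc has
|-  ^+  acc
?~  a
  acc
::  move the head of a onto the front of acc
$(a t.a, acc [i.a acc])
```

If saved under that name, it could be run as:

```hoon
> +reverse ~[1 2 3]
~[3 2 1]
```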
### Nock Checks (`.` dot Runes)
You saw earlier how a type check is performed when [`.=` dottis](https://urbit.org/docs/hoon/reference/rune/dot#dottis)—or more commonly its irregular variant `=( )`—is used. For any expression of the form `=(a b)`, either the type of `a` must be a subset of the type of `b` or the type of `b` must be a subset of the type of `a`. Otherwise, the type check fails and you'll get a `nest-fail`.
```hoon
> =(12 [33 44])
nest-fail
> =([77 88] [33 44])
%.n
```
You can evade the `.=` dottis type-check by casting one of its subexpressions to a `*`, under which all other types nest:
```hoon
> .=(`*`12 [33 44])
%.n
```
(It isn't recommended that you evade the rules in this way, however!)
The [`.+` dotlus](https://urbit.org/docs/hoon/reference/rune/dot#-dotlus) increment rune—including its `+( )` irregular form—does a type check to ensure that its subexpression must evaluate to an atom.
```hoon
> +(12)
13
> +([12 14])
nest-fail
```
### Arm Checks
Whenever an arm is evaluated in Hoon it expects to have some version of its parent core as the subject. Specifically, a type check is performed to see whether the arm subject is of the appropriate type. We see this in action whenever a gate or a multi-arm door is called.
A gate is a one-armed core with a sample. When it is called, its `$` buc arm is evaluated with (a mutated copy of) the gate as the subject. The only part of the core that might change is the payload, including the sample. Of course, we want the sample to be able to change. The sample is where the argument(s) of the function call are placed. For example, when we call `add` the `$` buc arm expects two atoms for the sample, i.e., the two numbers to be added. When the type check occurs, the payload must be of the appropriate type. If it isn't, the result is a `nest-fail` crash.
```hoon
> (add 22 33)
55
> (add [10 22] [10 33])
nest-fail
> (|=(a=@ [a a]) 15)
[15 15]
> (|=(a=@ [a a]) 22)
[22 22]
> (|=(a=@ [a a]) [22 22])
nest-fail
```
We'll talk in more detail about the various kinds of type-checking that can occur at arm evaluation [when we discuss type polymorphism](./Q-metals.md).
This isn't a comprehensive list of the type checks in Hoon: for instance, some other runes that include a type check are [`=.` tisdot](https://urbit.org/docs/hoon/reference/rune/tis#-tisdot) and [`%_` cencab](https://urbit.org/docs/hoon/reference/rune/cen#_-cencab).
## Type Inference
Hoon infers the type of any given expression. How does this inference work? Hoon has available various tools for inferring the type of any given expression: literal syntax, cast expressions, gate sample definitions, conditional expressions, and more.
### Literals
[Literals](https://en.wikipedia.org/wiki/Literal_%28computer_programming%29) are expressions that represent fixed values. Atom and cell literals are supported in Hoon, and every supported aura has an unambiguous representation that allows the parser to directly infer the type from the form. Here are a few examples of auras and associated literal formats:
| Type | Literal |
| ---- | ------- |
| `@ud` | `123`, `1.000` |
| `@ux` | `0x1234`, `0x12.3456` |
| `@ub` | `0b1011.1110` |
| `[@ud @ud]` | `[12 14]` |
| `[@ux @t ?]` | `[0x1f 'hello' %.y]` |
### Casts
Casting with `^` ket runes also shapes how Hoon understands an expression type, as outlined above. The inferred type of a cast expression is just the type being cast for. It can be inferred that, if the cast didn't result in a `nest-fail`, the value produced must be of the cast type. Here are some examples of cast expressions with the inferred output type on the left:
| Type | Cast |
| ---- | ---- |
| `@ud` | `^-(@ud 123)` |
| `@` | `^-(@ 123)` |
| `^` | `^-(^ [12 14])` |
| `[@ @]` | `^-([@ @] [12 14])` |
| `*` | `^-(* [12 14])` |
| `@ud` | `^+(7 123)` |
| `[@ud @ud]` | `^+([7 8] [12 14])` |
| `[@ud @ud]` | `^+([44 55] [12 14])` |
| `[@ux @ub]` | `^+([0x1b 0b11] [0x123 0b101])` |
You can also use the irregular `` ` `` syntax for casting in the same way as `^-` kethep; e.g., `` `@`123 `` for `^-(@ 123)`.
Since casts can throw away type information, if the cast type is more general, then the more specific type information is lost. Consider the literal `[12 14]`. The inferred type of this expression is `[@ @]`, i.e., a cell of two atoms. If we cast over `[12 14]` with `^-(^ [12 14])` then the inferred type is just `^`, the set of all cells. The information about what kind of cell it is has been thrown away. If we cast over `[12 14]` with `^-(* [12 14])` then the inferred type is `*`, the set of all nouns. All interesting type information is thrown away on the latter cast.
It's important to remember to include a cast rune with each gate and trap expression. That way it's clear what the inferred product type will be for calls to that core.
### (Dry) Gate Sample Definitions
By now you've used the `|=` rune to define several gates. This rune is used to produce a _dry gate_, which has different type-checking and type-inference properties than a _wet gate_ does. We won't explain the distinction until [a later module](./Q-metals.md)—for now, just keep in mind that we're only dealing with one kind of gate (albeit the more common kind).
The first subexpression after the `|=` defines the sample type. Any faces used in this definition have the types declared for them there. Consider an addition generator `/gen/add.hoon`:
```hoon
|=  [a=@ b=@]
^-  @
?:  =(b 0)
  a
$(a +(a), b (dec b))
```
We run it in the Dojo using a cell to pass the two arguments:
```hoon
> +add [12 14]
26
> +add 22
nest-fail
-need.[a=@ b=@]
-have.@ud
```
If you try to call this gate with the wrong kind of argument, you get a `nest-fail`. If the call succeeds, then the argument takes on the type of the sample definition: `[a=@ b=@]`. Accordingly, the inferred type of `a` is `@`, and the inferred type of `b` is `@`. In this case some type information has been thrown away; the inferred type of `[12 14]` is `[@ud @ud]`, but the addition program takes all atoms, regardless of aura.
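To see the aura being discarded, one can pass hexadecimal atoms to the same generator. Because the gate casts its product with `^- @`, the result carries no aura and the Dojo prints it as a plain decimal number. This assumes the generator above is saved as `/gen/add.hoon`:

```hoon
> +add [0x12 0x14]
38
```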
### Inferring Type (`?` wut Runes)
#### Using Conditionals for Inference by Branch
You have learned about a few conditional runes (e.g., `?:` wutcol and `?.` wutdot), but other runes of the `?` family are used for branch-specialized type inference. The [`?@` wutpat](https://urbit.org/docs/hoon/reference/rune/wut#-wutpat), [`?^` wutket](https://urbit.org/docs/hoon/reference/rune/wut#-wutket), and [`?~` wutsig](https://urbit.org/docs/hoon/reference/rune/wut#-wutsig) conditionals each take three subexpressions, which play the same basic role as the corresponding subexpressions of `?:` wutcol: the first is the test condition, which evaluates to a flag `?`. If the test condition is true, the second subexpression is evaluated; otherwise the third. These second and third subexpressions are the branches of the conditional.
There is also a [`?=` wuttis](https://urbit.org/docs/hoon/reference/rune/wut#-wuttis) rune for pattern-matching expressions by type, returning `%.y` for a match and `%.n` otherwise.
##### `?=` wuttis Non-recursive Type Match Test
The [`?=` wuttis](https://urbit.org/docs/hoon/reference/rune/wut#-wuttis) rune takes two subexpressions. The first subexpression should be a type. The second subexpression is evaluated and the resulting value is compared to the first type. If the value is an instance of the type, `%.y` is produced. Otherwise, `%.n`. Examples:
```hoon
> ?=(@ 12)
%.y
> ?=(@ [12 14])
%.n
> ?=(^ [12 14])
%.y
> ?=(^ 12)
%.n
> ?=([@ @] [12 14])
%.y
> ?=([@ @] [[12 12] 14])
%.n
```
`?=` wuttis expressions ignore aura information:
```hoon
> ?=(@ud 0x12)
%.y
> ?=(@ux 'hello')
%.y
```
We haven't talked much about types that are made with a type constructor yet. We'll discuss these more soon, but it's worth pointing out that every list type qualifies as such, and hence should not be used with `?=`:
```hoon
> ?=((list @) ~[1 2 3 4])
fish-loop
```
Using these non-basic constructed types with the `?=` wuttis rune results in a `fish-loop` error.
The `?=` wuttis rune is particularly useful when used with the `?:` wutcol rune, because in these cases Hoon uses the result of the `?=` wuttis evaluation to infer type information. To see how this works let's use `=/` tisfas to define a face, `b`, as a generic noun:
```hoon
> =/(b=* 12 b)
12
```
The inferred type of the final `b` is just `*`, because that's how `b` was defined earlier. We can see this by using `?` in the Dojo to see the product type:
```hoon
> ? =/(b=* 12 b)
*
12
```
(Remember that `?` isn't part of Hoon -- it's a Dojo-specific instruction.)
Let's replace that last `b` with a `?:` wutcol expression whose condition subexpression is a `?=` wuttis test. If `b` is an `@`, it'll produce `[& b]`; otherwise `[| b]`:
```hoon
> =/(b=* 12 ?:(?=(@ b) [& b] [| b]))
[%.y 12]
```
You can't see it here, but the inferred type of `b` in `[& b]` is `@`. That subexpression is only evaluated if `?=(@ b)` evaluates as true; hence, Hoon can safely infer that `b` must be an atom in that subexpression. Let's set `b` to a different initial value but leave everything else the same:
```hoon
> =/(b=* [12 14] ?:(?=(@ b) [& b] [| b]))
[%.n 12 14]
```
You can't see it here either, but the inferred type of `b` in `[| b]` is `^`. That subexpression is only evaluated if `?=(@ b)` evaluates as false, so `b` can't be an atom there. It follows that it must be a cell.
##### The Type Spear
What if you want to see the inferred type of `b` for yourself for each conditional branch? One way to do this is with the _type spear_. The [`!>` zapgar](https://urbit.org/docs/hoon/reference/rune/zap#-zapgar) rune takes one subexpression and constructs a cell from it. The subexpression is evaluated and becomes the tail of the product cell, with a `q` face attached. The head of the product cell is the inferred type of the subexpression.
```hoon
> !>(15)
[#t/@ud q=15]
> !>([12 14])
[#t/{@ud @ud} q=[12 14]]
> !>((add 22 55))
[#t/@ q=77]
```
The `#t/` is the pretty-printer's way of indicating a type.
To get just the inferred type of an expression, we only want the head of the `!>` product, `-`. Thus we should use our mighty weapon, the type spear, `-:!>`.
```
> -:!>(15)
#t/@ud
> -:!>([12 14])
#t/[@ud @ud]
> -:!>((add 22 55))
#t/@
```
Now let's try using `?=` wuttis with `?:` wutcol again. But this time we'll replace `[& b]` with `[& -:!>(b)]` and `[| b]` with `[| -:!>(b)]`. With `b` as `12`:
```hoon
> =/(b=* 12 ?:(?=(@ b) [& -:!>(b)] [| -:!>(b)]))
[%.y #t/@]
```
… and with `b` as `[12 14]`:
```
> =/(b=* [12 14] ?:(?=(@ b) [& -:!>(b)] [| -:!>(b)]))
[%.n #t/{* *}]
```
In both cases, `b` is defined initially as a generic noun, `*`. But when using `?:` with `?=(@ b)` as the test condition, `b` is inferred to be an atom, `@`, when the condition is true; otherwise `b` is inferred to be a cell, `^` (identical to `{* *}`).
###### `mint-vain`
Expressions of the form `?:(?=(a b) c d)` should only be used when the previously inferred type of `b` isn't specific enough to determine whether it nests under `a`. This kind of expression is only to be used when `?=` can reveal new type information about `b`, not to confirm information Hoon already has.
For example, if you have a wing expression (e.g., `b`) that is already known to be an atom, `@`, and you use `?=(@ b)` to test whether `b` is an atom, you'll get a `mint-vain` crash. The same thing happens if `b` is initially defined to be a cell `^`:
```
> =/(b=@ 12 ?:(?=(@ b) [& b] [| b]))
mint-vain
> =/(b=^ [12 14] ?:(?=(@ b) [& b] [| b]))
mint-vain
```
In the first case it's already known that `b` is an atom. In the second case it's already known that `b` isn't an atom. Either way, the check is superfluous and thus one of the `?:` wutcol branches will never be taken. The `mint-vain` crash indicates that it's provably the case that one of the branches will never be taken.
#### `?@` wutpat Atom Match Tests
The [`?@` wutpat](https://urbit.org/docs/hoon/reference/rune/wut#-wutpat) rune takes three subexpressions. The first is evaluated, and if its value is an instance of `@`, the second subexpression is evaluated. Otherwise, the third subexpression is evaluated.
```hoon
> =/(b=* 12 ?@(b %atom %cell))
%atom
> =/(b=* [12 14] ?@(b %atom %cell))
%cell
```
If the second `?@` wutpat subexpression is evaluated, Hoon correctly infers that `b` is an atom. If the third subexpression is evaluated, Hoon correctly infers that `b` is a cell.
```hoon
> =/(b=* 12 ?@(b [%atom -:!>(b)] [%cell -:!>(b)]))
[%atom #t/@]
> =/(b=* [12 14] ?@(b [%atom -:!>(b)] [%cell -:!>(b)]))
[%cell #t/{* *}]
```
If the inferred type of the first `?@` wutpat subexpression is already known to be an atom (it nests under `@`) or already known to be a cell (it nests under `^`), then one of the conditional branches provably never runs. Attempting to evaluate the expression results in a `mint-vain`:
```hoon
> ?@(12 %an-atom %not-an-atom)
mint-vain
> ?@([12 14] %an-atom %not-an-atom)
mint-vain
> =/(b=@ 12 ?@(b %an-atom %not-an-atom))
mint-vain
> =/(b=^ [12 14] ?@(b %an-atom %not-an-atom))
mint-vain
```
`?@` wutpat should only be used when it allows for Hoon to infer new type information; it shouldn't be used to confirm type information Hoon already knows.
#### `?^` wutket Cell Match Tests
The [`?^` wutket](https://urbit.org/docs/hoon/reference/rune/wut#-wutket) rune is just like `?@` wutpat except it tests for a cell match instead of for an atom match. The first subexpression is evaluated, and if the resulting value is an instance of `^` the second subexpression is evaluated. Otherwise, the third is run.
```hoon
> =/(b=* 12 ?^(b %cell %atom))
%atom
> =/(b=* [12 14] ?^(b %cell %atom))
%cell
```
Again, if the second subexpression is evaluated Hoon infers that `b` is a cell; if the third, Hoon infers that `b` is an atom. If one of the conditional branches is provably never evaluated, the expression crashes with a `mint-vain`:
```hoon
> =/(b=@ 12 ?^(b %cell %atom))
mint-vain
> =/(b=^ [12 14] ?^(b %cell %atom))
mint-vain
```
#### Tutorial: Leaf Counting
Nouns can be understood as binary trees in which each 'leaf' of the tree is an atom. Let's look at a program that takes a noun and returns the number of leaves in it, i.e., the number of atoms.
```hoon
|=  a=*
^-  @
?@  a
  1
(add $(a -.a) $(a +.a))
```
Save this as `/gen/leafcount.hoon` in your fakeship's pier and run it from the Dojo:
```hoon
> +leafcount 12
1
> +leafcount [12 14]
2
> +leafcount [12 [63 [829 12] 23] 13]
6
```
This program is pretty simple. If the noun `a` is an atom, then it's a tree of one leaf; return `1`. Otherwise, the number of leaves in `a` is the sum of the leaves in the head, `-.a`, and the tail, `+.a`.
We have been careful to use `-.a` and `+.a` only on a branch for which `a` is proved to be a cell -- then it's safe to treat `a` as having a head and a tail.
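The same branch-by-branch inference supports other tree walks. As a sketch, the hypothetical generator `/gen/leafsum.hoon` below adds up the leaves rather than counting them; on the atom branch Hoon knows `a` is an `@`, so it can be returned directly as the product.

```hoon
::  /gen/leafsum.hoon (hypothetical name): sum every atom in a noun
|=  a=*
^-  @
?@  a
  ::  a is known to be an atom on this branch
  a
::  a is known to be a cell on this branch
(add $(a -.a) $(a +.a))
```

If saved under that name, it could be run as:

```hoon
> +leafsum 7
7
> +leafsum [12 [14 15]]
41
```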
#### Tutorial: Cell Counting
Here's a program that counts the number of cells in a noun:
```hoon
|=  a=*
=|  c=@
|-  ^-  @
?@  a
  c
%=  $
  c  $(c +(c), a -.a)
  a  +.a
==
```
Save this as `/gen/cellcount.hoon` and run it from the Dojo:
```hoon
> +cellcount 12
0
> +cellcount [12 14]
1
> +cellcount [12 14 15]
2
> +cellcount [[12 [14 15]] 15]
3
> +cellcount [[12 [14 15]] [15 14]]
4
> +cellcount [[12 [14 15]] [15 [14 22]]]
5
```
This code is a little more tricky. The basic idea, however, is simple. We have a counter value, `c`, whose initial value is `0` (`=|` tisbar pins the bunt of the value with the given face). We trace through the noun `a`, adding `1` to `c` every time we come across a cell. For any part of the noun that is just an atom, `c` is returned unchanged.
What makes this program a little harder to follow is that it recurses within a recursion call. The first recursion expression on line 6 makes changes to two face values: `c`, the counter, and `a`, the input noun. The new value for `c` defined in the line `$(c +(c), a -.a)` is another recursion call (this time in irregular syntax). The new value for `c` is to be the result of running the same function on the head of `a`, `-.a`, and with `1` added to `c`. We add `1` because we know that `a` must be a cell. Otherwise, we're asking for the number of cells in the rest of `-.a`.
Once that new value for `c` is computed from the head of `a`, we're ready to check the tail of `a`, `+.a`. We've already got everything we want from `-.a`, so we throw that away and replace `a` with `+.a`.
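For comparison, the same count can be written without an accumulator, in the style of the leaf-counting program. This is an alternative sketch, not the version discussed above, and the name `/gen/cellcount2.hoon` is hypothetical: an atom contributes `0` cells, and a cell contributes `1` plus the cells in its head and tail.

```hoon
::  /gen/cellcount2.hoon (hypothetical name): count cells without a counter
|=  a=*
^-  @
?@  a
  0
::  one for this cell, plus the cells found in each half
+((add $(a -.a) $(a +.a)))
```

If saved under that name, it could be run as:

```hoon
> +cellcount2 12
0
> +cellcount2 [12 14 15]
2
```

The accumulator version has the advantage that its final recursion on the tail is in tail position; this version trades that for readability.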
### Lists
You learned about lists earlier in the chapter, but we left out a little bit of information about the way Hoon understands list types.
A non-null list is a cell. If `b` is a non-null list then the head of `b` is the first item of `b` _with an `i` face on it_. The tail of `b` is the rest of the list. The 'rest of the list' is itself another list _with a `t` face on it_. We can (and should) use these `i` and `t` faces in list functions.
To illustrate: let's say that `b` is the list of the atoms `11`, `22`, and `33`. Let's construct this in stages:
```
[i=11 t=[rest-of-list-b]]
[i=11 t=[i=22 t=[rest-of-list-b]]]
[i=11 t=[i=22 t=[i=33 t=~]]]
```
(There are lists of every type. Lists of `@ud`, `@ux`, `@` in general, `^`, `[^ [@ @]]`, etc. We can even have lists of lists of `@`, `^`, or `?`, etc.)
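The faces can be checked from the Dojo once Hoon knows the list is non-null. In this sketch the name `nums` is illustrative, and the `!!` crash on the null branch is there only so that the other branch determines the product type:

```hoon
> =nums `(list @)`~[11 22 33]
> ?~(nums !! i.nums)
11
> ?~(nums !! t.nums)
~[22 33]
```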
#### Tutorial: List Spanning Values
Here's a program that takes atoms `a` and `b` and returns a list of all atoms from `a` to `b`:
```hoon
|=  [a=@ b=@]
^-  (list @)
?:  (gth a b)
  ~
[i=a t=$(a +(a))]
```
This program is very simple. It takes two `@` as input, `a` and `b`, and returns a `(list @)`, i.e., a list of `@`. If `a` is greater than `b` the list is finished: return the null list `~`. Otherwise, return a non-null list: a pair in which the head is `a` with an `i` face on it, and in which the tail is another list with the `t` face on it. This embedded list is the product of a recursion call: add `1` to `a` and run the function again.
Save this code as `/gen/gulf.hoon` and run it from the Dojo:
```hoon
> +gulf [1 10]
~[1 2 3 4 5 6 7 8 9 10]
> +gulf [10 20]
~[10 11 12 13 14 15 16 17 18 19 20]
> +gulf [20 10]
~
```
Where are all the `i`s and `t`s??? For the sake of neatness the Hoon pretty-printer doesn't show the `i` and `t` faces of lists, just the items.
In fact, we could have left out the `i` and `t` faces in the program itself:
```hoon
|=  [a=@ b=@]
^-  (list @)
?:  (gth a b)
  ~
[a $(a +(a))]
```
Because there is a cast to a `(list @)` on line 2, Hoon will silently include `i` and `t` faces for the appropriate places of the noun. Remember that faces are recorded in the type information of the noun in question, not as part of the noun itself.
We called this program `gulf.hoon` because it replicates the `gulf` function in the Hoon standard library:
```
> (gulf 1 10)
~[1 2 3 4 5 6 7 8 9 10]
> (gulf 10 20)
~[10 11 12 13 14 15 16 17 18 19 20]
```
#### `?~` wutsig Null Match Test
The [`?~` wutsig](https://urbit.org/docs/hoon/reference/rune/wut#-wutsig) rune is a lot like `?@` wutpat and `?^` wutket. It takes three subexpressions, the first of which is evaluated to see whether the result is `~` null. If so, the second subexpression is evaluated. Otherwise, the third one is evaluated.
```hoon
> =/(b=* ~ ?~(b %null %not-null))
%null
> =/(b=* [12 13] ?~(b %null %not-null))
%not-null
```
The inferred type of `b` must not already be known to be null or non-null; otherwise, the expression will crash with a `mint-vain`:
```
> =/(b=~ ~ ?~(b %null %not-null))
mint-vain
> =/(b=^ [10 12] ?~(b %null %not-null))
mint-vain
> ?~(~ %null %not-null)
mint-vain
```
Hoon will infer that `b` either is or isn't null based on which `?~` branch is evaluated after the test.
##### Using `?~` wutsig With Lists
`?~` wutsig is especially useful for working with lists. Is a list null, or not? You probably want to do different things based on the answer to that question. Above, we used a pattern of `?:` wutcol and `?=` wuttis to answer the question, but `?~` wutsig will let us know in one step. Here's a program using `?~` wutsig to calculate the number of items in a list of atoms:
```hoon
|=  a=(list @)
=|  c=@
|-  ^-  @
?~  a
  c
$(c +(c), a t.a)
```
This function takes a list of `@` and returns an `@`. It uses `c` as a counter value, initially set at `0` on line 2. If `a` is `~` (i.e., a null list) then the computation is finished; return `c`. Otherwise `a` must be a non-null list, in which case there is a recursion to the `|-` on line 3, but with `c` incremented, and with the head of the list `a` thrown away.
It's important to note that if `a` is a list, you can only use `i.a` and `t.a` after Hoon has inferred that `a` is non-null. A null list has no `i` or `t` in it! You'll often use `?~` to distinguish the two kinds of list (null and non-null). If you use `i.a` or `t.a` without showing that `a` is non-null you'll get a `find-fork-d` crash.
A non-null `list` is called a `lest`.
Save the above code as `/gen/lent.hoon` and run it from the Dojo:
```hoon
> +lent ~[11 22 33]
3
> +lent ~[11 22 33 44 55 77]
6
> +lent ~[0xff 0b11 'howdy' %hello]
4
```
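To make the earlier point about `i` and `t` concrete, here is a minimal Dojo sketch. The face `a` and the sample values are chosen only for illustration, and `!!` (crash) is used in the null branch so that each expression has a single product type:

```hoon
> =/(a=(list @) ~[11 22 33] ?~(a !! i.a))
11
> =/(a=(list @) ~[11 22 33] ?~(a !! t.a))
~[22 33]
```

In the non-null branch Hoon knows that `a` is a `lest`, so both faces resolve.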
#### Tutorial: Converting a Noun to a List of its Leaves
Here's a program that takes a noun and returns a list of its 'leaves' (atoms) in order of their appearance:
```hoon
|=  a=*
=/  lis=(list @)  ~
|-  ^-  (list @)
?@  a
  [i=a t=lis]
$(lis $(a +.a), a -.a)
```
The input noun is `a`. The list of atoms to be output is `lis`, which is given an initial value of `~`. If `a` is just an atom, return a non-null list whose head is `a` and whose tail is `lis`. Otherwise, the somewhat complicated recursion `$(lis $(a +.a), a -.a)` is evaluated, in effect looping back to the `|-` with modifications made to `lis` and `a`.
The modification to `lis` in line 6 is to `$(a +.a)`. The latter is a recursion to `|-` but with `a` replaced by its tail. This evaluates to the list of `@` in the tail of `a`. So `lis` becomes the list of atoms in the tail of `a`, and `a` becomes the head of `a`, `-.a`.
Save the above code as `/gen/listleaf.hoon` and run it from the Dojo:
```hoon
> +listleaf [[[[12 13] [33 22] 12] 11] 33]
~[12 13 33 22 12 11 33]
```
### Other Kinds of Type Inference
So far you've learned about four kinds of type inference:
1. literals
2. explicit casts
3. gate sample definitions
4. branch specialization using runes in the `?` family
There are several other ways that Hoon infers type. Any rune expression that evaluates to a `?` flag, e.g., `.=` dottis, is inferred to produce a flag. The `.+` dotlus rune always evaluates to an `@`, and Hoon knows that too. The cell constructor runes, [`:-` colhep](https://urbit.org/docs/hoon/reference/rune/col#--colhep), [`:+` collus](https://urbit.org/docs/hoon/reference/rune/col#--collus), [`:^` colket](https://urbit.org/docs/hoon/reference/rune/col#--colket), and [`:*` coltar](https://urbit.org/docs/hoon/reference/rune/col#--coltar) are all known to produce cells.
More subtly, the [`=+` tislus](https://urbit.org/docs/hoon/reference/rune/tis#-tislus), [`=/` tisfas](https://urbit.org/docs/hoon/reference/rune/tis#-tisfas), and [`=|` tisbar](https://urbit.org/docs/hoon/reference/rune/tis#-tisbar) runes modify the subject by pinning values to the head. Hoon infers from this that the subject has a new type: a cell whose head is the type of the pinned value and whose tail is the type of the (old) subject.
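A one-line Dojo sketch of this, using an arbitrary value: after `=+` pins `42`, the head of the new subject (`-`) is the pinned value.

```hoon
> =+(42 -)
42
```

The old subject is still available as the tail, `+`.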
In general, anything that modifies the subject modifies the type of the subject. Type inference can work in subtle ways for various expressions. However, we have covered enough that it should be relatively clear how to anticipate how type inference works for the vast majority of ordinary use cases.
## Auras as 'Soft' Types
It's important to understand that Hoon's type system doesn't enforce auras as strictly as it does other types. Auras are 'soft' type information. To see how this works, we'll take you through the process of converting the aura of an atom to another aura.
Hoon makes an effort to enforce that the correct aura is produced by an expression:
```hoon
> ^-(@ud 0x10)
nest-fail
> ^-(@ud 0b10)
nest-fail
> ^-(@ux 100)
nest-fail
```
But there are ways around this. First, you can cast to a more general aura, as long as the current aura nests under the cast aura. E.g., `@ub` to `@u`, `@ux` to `@u`, `@u` to `@`, etc. By doing this you're essentially telling Hoon to throw away some aura information:
```hoon
> ^-(@u 0x10)
16
> ? ^-(@u 0x10)
@u
16
> ^-(@u 0b10)
2
> ? ^-(@u 0b10)
@u
2
```
In fact, you can cast any atom all the way to the most general case `@`:
```hoon
> ^-(@ 0x10)
16
> ? ^-(@ 0x10)
@
16
> ^-(@ 0b10)
2
> ? ^-(@ 0b10)
@
2
```
Anything of the general aura `@` can, in turn, be cast to more specific auras. We can show this by embedding a cast expression inside another cast:
```hoon
> ^-(@ud ^-(@ 0x10))
16
> ^-(@ub ^-(@ 0x10))
0b1.0000
> ^-(@ux ^-(@ 10))
0xa
```
Hoon uses the outermost cast to infer the type:
```hoon
> ? ^-(@ub ^-(@ 0x10))
@ub
0b1.0000
```
As you can see, an atom with one aura can be converted to another aura. For a convenient shorthand, you can do this conversion with irregular cast syntax, e.g. `` `@ud` ``, rather than using the `^-` rune twice:
```hoon
> `@ud`0x10
16
> `@ub`0x10
0b1.0000
> `@ux`10
0xa
```
This is what we mean when we call auras 'soft' types. The above examples show that the programmer can get around the type system for auras by casting up to `@` and then back down to the specific aura, say `@ub`; or by casting with `` `@ub` `` for short.
**Note**: there is currently a type system issue that causes some of these functions to fail when passed a list `b` after some type inference has been performed on `b`. For an illustration of the bug, let's set `b` to be a `(list @)` of `~[11 22 33 44]` in the Dojo:
```hoon
> =b `(list @)`~[11 22 33 44]
> b
~[11 22 33 44]
```
Now let's use `?~` to prove that `b` isn't null, and then try to `snag` it:
```hoon
> ?~(b ~ (snag 0 b))
nest-fail
```
The problem is that `++snag` is expecting a raw list, not a list that is known to be non-null.
You can cast `b` back to `(list)` to work around this:
```hoon
> ?~(b ~ (snag 0 `(list)`b))
11
```
### Pattern Matching and Assertions
To summarize, as values get passed around and checked at various points, the Hoon compiler tracks what the possible data structure or mold looks like. The following runes are particularly helpful when inducing the compiler to infer what it needs to know:
- [`?~` wutsig](https://urbit.org/docs/hoon/reference/rune/wut#-wutsig) asserts non-null.
- [`?^` wutket](https://urbit.org/docs/hoon/reference/rune/wut#-wutket) asserts cell.
- [`?@` wutpat](https://urbit.org/docs/hoon/reference/rune/wut#-wutpat) asserts atom.
- [`?=` wuttis](https://urbit.org/docs/hoon/reference/rune/wut#-wuttis) tests for a pattern match in type.
There are two additional assertions which can be used with the type system:
- [`?>` wutgar](https://urbit.org/docs/hoon/reference/rune/wut#-wutgar) is a positive assertion (`%.y` or crash).
- [`?<` wutgal](https://urbit.org/docs/hoon/reference/rune/wut#-wutgal) is a negative assertion (`%.n` or crash).
If you are running into `find-fork` errors in more complicated data structures (like marks or JSONs), consider using these assertions to guide the typechecker.
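For a sense of how the two assertion runes behave, here is a minimal Dojo sketch with arbitrary test expressions. When the assertion holds, the final subexpression is evaluated; otherwise the expression crashes.

```hoon
> ?>(=(2 (add 1 1)) %ok)
%ok
> ?<(=(3 (add 1 1)) %ok)
%ok
```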

@ -0,0 +1,177 @@
---
title: Conditional Logic
nodes: 184
objectives:
- "Produce loobean expressions."
- "Reorder conditional arms."
- "Switch against a union with or without default."
---
# Conditional Logic
_Although you've been using various of the `?` wut runes for a while now, let's wrap up some loose ends. This module will cover the nature of loobean logic and the rest of the `?` wut runes._
## Loobean Logic
Throughout Hoon School, you have been using `%.y` and `%.n`, often implicitly, every time you have asked a question like `?: =(5 4)`. The `=()` expression returns a loobean, a member of the type union `?(%.y %.n)`. (There is a proper aura `@f` but unfortunately it can't be used outside of the compiler.) These can also be written as `&` (`%.y`, true) and `|` (`%.n`, false), which is common in older code but should be avoided for clarity in your own compositions.
What are the actual values of these, _sans_ formatting?
```hoon
> `@`%.y
0
> `@`%.n
1
```
Pretty much all conditional operators rely on loobeans, although it is very uncommon for you to need to unpack them.
## Making Choices
You are familiar in everyday life with making choices on the basis of a decision expression. For instance, you can compare two prices for similar products and select the cheaper one for purchase.
Essentially, we have to be able to decide whether or not some value or expression evaluates as `%.y` true (in which case we will do one thing) or `%.n` false (in which case we do another). Some basic expressions are mathematical, but we also check for existence, for equality of two values, etc.
- [`++gth`](https://urbit.org/docs/hoon/reference/stdlib/1a#gth) (greater than `>`)
- [`++lth`](https://urbit.org/docs/hoon/reference/stdlib/1a#lth) (less than `<`)
- [`++gte`](https://urbit.org/docs/hoon/reference/stdlib/1a#gte) (greater than or equal to `≥`)
- [`++lte`](https://urbit.org/docs/hoon/reference/stdlib/1a#lte) (less than or equal to `≤`)
- [`.=` dottis](https://urbit.org/docs/hoon/reference/rune/dot#-dottis), irregularly `=()` (check for equality)
The key conditional decision-making rune is [`?:` wutcol](https://urbit.org/docs/hoon/reference/rune/wut#-wutcol), which lets you branch between an `expression-if-true` and an `expression-if-false`. [`?.` wutdot](https://urbit.org/docs/hoon/reference/rune/wut#-wutdot) inverts the order of `?:`. Good Hoon style prescribes that the heavier branch of a logical expression should be lower in the file.
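As a small Dojo sketch (the test and the result terms are arbitrary), the same choice can be written with either rune by swapping the order of the branches:

```hoon
> ?:((gth 5 4) %bigger %smaller)
%bigger
> ?.((gth 5 4) %smaller %bigger)
%bigger
```

In tall form, this lets you pick whichever rune puts the longer branch last.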
There are also two long-form decision-making runes, which we will call [_switch statements_](https://en.wikipedia.org/wiki/Switch_statement) by analogy with languages like C.
- [`?-` wuthep](https://urbit.org/docs/hoon/reference/rune/wut#-wuthep) lets you choose between several possibilities, as with a type union. Every case must be handled and no case can be unreachable.
Since `@tas` terms are constants first, and not `@tas` unless marked as such, `?-` wuthep switches over term unions can make it look like the expression is branching on the value. It's actually branching on the _type_. These are almost exclusively used with term type unions.
```hoon
|=  p=?(%1 %2 %3)
?-  p
  %1  1
  %2  2
  %3  3
==
```
- [`?+` wutlus](https://urbit.org/docs/hoon/reference/rune/wut#-wutlus) is similar to `?-` wuthep but allows a default value in case no branch is taken, so not every case needs to be handled explicitly.
```hoon
|=  p=?(%0 %1 %2 %3 %4)
?+  p  0
  %1  1
  %2  2
  %3  3
==
```
## Logical Operators
Mathematical logic allows the collocation of propositions to determine other propositions. In computer science, we use this functionality to determine which part of an expression is evaluated. We can combine logical statements pairwise:
- [`?&` wutpam](https://urbit.org/docs/hoon/reference/rune/wut#-wutpam), irregularly `&()`, is a logical `AND` _p_ ∧ _q_ over loobean values, i.e. both terms must be true.
| `AND` | `%.y` | `%.n` |
|-------|-------|-------|
| `%.y` | `%.y` | `%.n` |
| `%.n` | `%.n` | `%.n` |
```hoon
> =/  a  5
  &((gth a 4) (lth a 7))
%.y
```
- [`?|` wutbar](https://urbit.org/docs/hoon/reference/rune/wut#-wutbar), irregularly `|()`, is a logical `OR` _p_ ∨ _q_ over loobean values, i.e. at least one term must be true.
| `OR` | `%.y` | `%.n` |
|-------|-------|-------|
| `%.y` | `%.y` | `%.y` |
| `%.n` | `%.y` | `%.n` |
```hoon
> =/  a  5
  |((gth a 4) (lth a 7))
%.y
```
- [`?!` wutzap](https://urbit.org/docs/hoon/reference/rune/wut#-wutzap), irregularly `!`, is a logical `NOT` ¬_p_. Sometimes it can be difficult to parse code including `!` because it operates without parentheses.
| | `NOT` |
|-------|-------|
| `%.y` | `%.n` |
| `%.n` | `%.y` |
```hoon
> !%.y
%.n
> !%.n
%.y
```
From these primitive operators, you can build other logical statements at need.
#### Exercise: Design an `XOR` Function
The logical operation `XOR` _p_ ⊕ _q_ (exclusive disjunction) yields true if one but not both operands are true. `XOR` can be calculated by (_p_ ∧ ¬_q_) ∨ (¬_p_ ∧ _q_).
| `XOR` | `%.y` | `%.n` |
|-------|-------|-------|
| `%.y` | `%.n` | `%.y` |
| `%.n` | `%.y` | `%.n` |
- Implement `XOR` as a gate in Hoon.
```hoon
|=  [p=?(%.y %.n) q=?(%.y %.n)]
^-  ?(%.y %.n)
|(&(p !q) &(!p q))
```
#### Exercise: Design a `NAND` Function
The logical operation `NAND` _p_ ↑ _q_ produces false if both operands are true, and true otherwise. `NAND` can be calculated by ¬(_p_ ∧ _q_).
| `NAND` | `%.y` | `%.n` |
|--------|-------|-------|
| `%.y` | `%.n` | `%.y` |
| `%.n` | `%.y` | `%.y` |
- Implement `NAND` as a gate in Hoon.
#### Exercise: Design a `NOR` Function
The logical operation `NOR` _p_ ↓ _q_ produces true if both operands are false, and false otherwise. `NOR` can be calculated by ¬(_p_ ∨ _q_).
| `NOR` | `%.y` | `%.n` |
|-------|-------|-------|
| `%.y` | `%.n` | `%.n` |
| `%.n` | `%.n` | `%.y` |
- Implement `NOR` as a gate in Hoon.
#### Exercise: Implement a Piecewise Boxcar Function
The boxcar function is a piecewise mathematical function which is equal to one for inputs within a given interval and zero otherwise. We implemented the similar Heaviside function [previously](./B-syntax.md) using the `?:` wutcol rune.
- Compose a gate which implements the boxcar function,
<img src="https://latex.codecogs.com/svg.image?\large&space;\text{boxcar}(x):=\begin{pmatrix}1,&space;&&space;10&space;\leq&space;x&space;<&space;20&space;\\0,&space;&&space;\text{otherwise}&space;\\\end{pmatrix}" title="https://latex.codecogs.com/svg.image?\large \text{boxcar}(x):=\begin{matrix}1, & 10 \leq x < 20 \\0, & \text{otherwise} \\\end{matrix}" />
<!--
$$
\text{boxcar}(x)
:=
\begin{matrix}
1, & 10 \leq x < 20 \\
0, & \text{otherwise} \\
\end{matrix}
$$
-->
Use Hoon logical operators to compress the logic into a single statement using at least one `AND` or `OR` operation.

@ -0,0 +1,539 @@
---
title: Subject-Oriented Programming
nodes: 165, 180
objectives:
- "Review subject-oriented programming as a design paradigm."
- "Discuss stateful v. stateless applications and path dependence."
- "Enumerate Hoon's tools for dealing with state: `=.` tisdot, `=^` tisket, `;<` micgal, `;~` micsig."
- "Defer a computation."
---
# Subject-Oriented Programming
_This module discusses how Urbit's subject-oriented programming paradigm structures how cores and values are used and maintain state, as well as how deferred computations and remote value lookups (“scrying”) are handled. This module does not cover core genericity and variance, which will be explained in [a later module](./Q-metals.md)._
## The Subject
As we've said before:
> The Urbit operating system hews to a conceptual model wherein each expression takes place in a certain context (the _subject_). While sharing a lot of practicality with other programming paradigms and platforms, Urbit's model is mathematically well-defined and unambiguously specified. Every expression of Hoon is evaluated relative to its subject, a piece of data that represents the environment, or the context, of an expression.
Subject-oriented programming means that every expression is evaluated with respect to some _subject_. Every arm of a core is evaluated with its parent core as the subject.
You have also seen how wings work as search paths to identify nouns in the subject, and you have learned three ways to access values by address: numeric addressing, lark notation, and wing search expressions.
Generally speaking, the following rune families allow you to do certain things to the subject:
- `|` bar runes create cores, i.e. largely self-contained expressions
- `^` ket runes transform cores, i.e. change core properties
- `%` cen runes pull arms in cores
- `=` tis runes modify the subject by introducing or replacing values
Different kinds of cores can expose or conceal functionality (such as their sample) based on their variance model. We don't need to be concerned about that yet, but if you are building certain kinds of library code or intend to build code expressions directly, you'll need to read [that module](./Q-metals.md) as well.
### Accessing the Subject
Usually the subject of a Hoon expression isn't shown explicitly. In fact, only when using `:`/`.` wing lookup expressions have we made the subject explicit.
An arm is always evaluated with its parent core as its subject. We've briefly mentioned that one can use helper cores (e.g. for generators) by composing the cores side-by-side using [`=<` tisgal](https://urbit.org/docs/hoon/reference/rune/tis#-tisgal) and [`=>` tisgar](https://urbit.org/docs/hoon/reference/rune/tis#-tisgar). This way we can make sure that the arms fall within each other's subject horizon.
Why must an arm have its parent core as the subject, when it's computed? As stated previously, the payload of a core contains all the data needed for computing the arms of that core. Arms can only access data in the subject. By requiring that the parent core be the subject we guarantee that each arm has the appropriate data available to it. The tail of its subject contains the `payload` and thus all the values therein. The head of the subject is the `battery`, which allows for making reference to sibling arms of that same core.
In the Dojo, if you use `+1` by itself, you can see the current subject.
```hoon
> +1
[ [ our=~zod
now=~2022.6.22..18.35.42..da35
eny
0vb6.cve93.67frc.2gtoj.jfl3i.odojg.urrce.o53d3.44h4o.sf3o5.va2mh.ra1ec.jrkej.u512k.l4lin.f003v.li030.l2e6t.ah7ge.6t5cg.epuil
]
<17.bny 33.ehb 14.dyd 53.vlb 77.lrt 232.oiq 51.qbt 123.zao 46.hgz 1.pnw %140>
]
```
`.` does the same thing: it always refers to the current subject.
If `.` is the subject, then `..arm` is the subject of a given `arm` (the second `.` dot being the wing resolution operator). You can check the details of the parent core using something like `..add`. This trick is used when producing agents that have highly nested operations (search `..` in the `/app` directory), or when composing [jets](https://urbit.org/docs/vere/jetting#edit-the-hoon-source-code), for instance.
Another use case for the `..arm` syntax is when there is a core in the subject without a face bound to it; i.e., the core might be nameless. In that case you can use an arm name in that core to refer to the whole core.
```hoon
> ..add
<46.hgz 1.pnw %140>
```
#### Tutorial: The Core Structure of `hoon.hoon`
Let's take a deeper look at how cores can be combined with `=>` tisgar to build up larger structures. `=> p=hoon q=hoon` yields the product of `q` with the product of `p` taken as the subject; i.e. it composes Hoon statements, like cores.
We use this to set the context of cores. Recall that the payload of a gate is a cell of `[sample context]`. For example:
```
> =foo =>([1 2] |=(@ 15))
> +3:foo
[0 1 2]
```
Here we have created a gate with `[1 2]` as its context that takes in an `@` and returns `15`. `+3:foo` shows the payload of the core to be `[0 [1 2]]`. Here `0` is the default value of `@` and is the sample, while `[1 2]` is the context that was given to `foo`.
`=>` tisgar (and its reversed version `=<` tisgal) are used extensively to put cores into the context of other cores.
```hoon
=>
|%
++  foo
  |=  a=@
  (mul a 2)
--
|%
++  bar
  |=  a=@
  (mul (foo a) 2)
--
```
At the level of arms, `++foo` is in the subject of `++bar`, and so `++bar` is able to call `++foo`. On the other hand, `++bar` is not in the subject of `++foo`, so `++foo` cannot call `++bar`; you will get a `-find.bar` error.
At the level of cores, the `=>` sets the context of the core containing `++bar` to be the core containing `++foo`. Recall that arms are evaluated with the parent core as the subject. Thus `++bar` is evaluated with the core containing it as the subject, which has the core containing `++foo` in its context. This is why `++foo` is in the scope of `++bar` but not vice versa.
Let's look inside `/sys/hoon.hoon`, where the standard library is located, to see how this can be used.
The first core listed here has just one arm.
```hoon
=>  %140  =>
|%
++  hoon-version  +
--
```
This is reflected in the subject of `hoon-version`.
```hoon
> ..hoon-version
<1.pnw %140>
```
After several lines that we'll ignore for pedagogical purposes, we see
```hoon
|%
::  #  %math
::    unsigned arithmetic
+|  %math
++  add
  ~/  %add
  ::  unsigned addition
  ::
  ::  a: augend
  ::  b: addend
  |=  [a=@ b=@]
  ::  sum
  ^-  @
  ?:  =(0 a)  b
  $(a (dec a), b +(b))
::
++  dec
and so on, down to
```hoon
++  unit
  |$  [item]
  ::    maybe
  ::
  ::  mold generator: either `~` or `[~ u=a]` where `a` is the
  ::  type that was passed in.
  ::
  $@(~ [~ u=item])
--
```
This core contains the arms in [sections 1a–1c of the standard library documentation](/docs/hoon/reference/stdlib/1a). If you count them, there are 46 arms in the core from `++  add` down to `++  unit`. We can again see this fact reflected in the Dojo by looking at the subject of `add`.
```hoon
> ..add
<46.hgz 1.pnw %140>
```
Here we see that the core containing `hoon-version` is in the subject of the section 1 core.
Next, [section 2](/docs/hoon/reference/stdlib/2a) starts:
```hoon
=>
:: ::
:::: 2: layer two ::
```
...
```
|%
:: ::
:::: 2a: unit logic ::
:: ::
:: biff, bind, bond, both, clap, drop, ::
:: fall, flit, lift, mate, need, some ::
::
++  biff                                                ::  apply
  |*  {a/(unit) b/$-(* (unit))}
  ?~  a  ~
  (b u.a)
```
If you count the arms in this core by hand, you'll come up with 123 arms. This is also reflected in the Dojo:
```hoon
> ..biff
<123.zao 46.hgz 1.pnw %140>
```
and we also see the section 1 core and the core containing `hoon-version` in the subject.
We can also confirm that `++add` is in the subject of `++biff`
```hoon
> add:biff
<1.otf [[a=@ b=@] <46.hgz 1.pnw %140>]>
```
and that `++biff` is not in the subject of `++add`.
```hoon
> biff:add
-find.biff
```
Lastly, let's check the subject of the last arm in `hoon.hoon` (as of June 2022):
```hoon
> ..pi-tell
<77.lrt 232.oiq 51.qbt 123.zao 46.hgz 1.pnw %140>
```
This confirms for us, then, that `hoon.hoon` consists of six nested cores, with one inside the payload of the next, with the `hoon-version` core most deeply nested.
#### Exercise: Explore `hoon.hoon`
- Pick a couple of arms in `hoon.hoon` and check that each is only referenced in its parent core or in cores that have the parent core in their context via the `=>` or `=<` runes.
### Axes of the Subject
The core Arvo subject exposes several axes (plural of _axis_; an `+$axis` is a tree address). You've encountered these before:
- `our` is the ship's identity.
```hoon
> -<..
our=~nec
```
- `now` is a 128-bit timestamp sourced from the wall clock time, Linux's `gettimeofday()`.
```hoon
> ->-..
now=~2022.6.22..20.41.18..82f4
```
- `eny` is 512 bits of entropy as `@uvJ`, sourced from a [CSPRNG](https://en.wikipedia.org/wiki/Cryptographically-secure_pseudorandom_number_generator) and hash-iterated using `++shax`. (`eny` is shared between vanes during an event, so there are currently limits on how much it should be relied on until the Urbit kernel is security-hardened, but it is unique within each Gall agent activation.)
```hoon
> ->+..
eny
0vmr.qobqc.fd9f0.h5hf4.dkurh.b4s37.lt4qf.2k505.j3sir.cnshk.ldpm0.jeppc.ti7gs.vtpru.u09sm.0imu0.cgdln.fvoqc.mt41e.3iga5.qpct7
```
## State and Applications
Default Hoon expressions are stateless. This means that they don't really make reference to any other transactions or events in the system. They don't preserve the results of previous calculations beyond their own transient existence.
However, clearly regular applications, such as Gall agents, are stateful, meaning that they modify their own subject regularly.
There are several ways to manage state. One approach, including `%=` centis, directly modifies the subject using a rune. Another method is to use the other runes to compose or sequence changes together (e.g. as a pipe of gates). By and large the `=` tis runes are responsible for modifying the subject, and the `;` mic runes permit chaining deferred computations together.
To act in a stateful manner, a core must mutate itself and then pin the mutated copy in its place. Most of the time this is handled by Arvo's Gall vane, by the Dojo, or another system service, but we need to explicitly modify and manage state for cores as we work within these kinds of applications.
We will use `%say` generators as a bridge concept. We will produce some short applications that maintain state while carrying out a calculation; they still result in a single return value, but gesture at the big-picture approach to maintaining state in persistent agents.
Here are a couple of new runes for modifying the subject and chaining computations together, aside from `%=` centis which you've already seen:
- [`=.` tisdot](https://urbit.org/docs/hoon/reference/rune/tis#-tisdot) is used to change a leg in the subject.
- [`=~` tissig](https://urbit.org/docs/hoon/reference/rune/tis#-tissig) composes many expressions together serially.
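Both runes can be tried directly in the Dojo. The following is a minimal sketch with arbitrary values: `=.` replaces the leg `a` in the subject, and `=~` evaluates each expression against the product of the previous one (so `.` refers to the result so far).

```hoon
> =/  a  5
  =.  a  (add a 1)
  a
6
> =~(10 [20 .] [30 .] [40 .] .)
[40 30 20 10]
```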
#### Tutorial: Bank Account
In this section, we will write a door that can act as a bank account with the ability to withdraw, deposit, and check the account's balance. This door replaces the sample of the door with the new values as each transaction proceeds.
```hoon
:-  %say
|=  *
:-  %noun
=<  =~  new-account
      (deposit 100)
      (deposit 100)
      (withdraw 50)
      balance
    ==
|%
++  new-account
  |_  balance=@ud
  ++  deposit
    |=  amount=@ud
    +>.$(balance (add balance amount))
  ++  withdraw
    |=  amount=@ud
    +>.$(balance (sub balance amount))
  --
--
```
We start with the three boilerplate lines we have in every `%say` generator:
```hoon
:-  %say
|=  *
:-  %noun
```
In the above code chunk, we're creating a cell. The head of this cell is `%say`. The tail is a gate (`|= *`) that produces another cell (`:- %noun`) whose head is the mark of the kind of data we are going to produce, a `%noun`; the tail of the second cell is the rest of the program.
```hoon
=<  =~  new-account
      (deposit 100)
      (deposit 100)
      (withdraw 50)
      balance
    ==
```
In the code above, we compose two expressions using `=<`, which takes its arguments in inverted order. We use this rune to keep the heaviest twig at the bottom of the code.
The [`=~` tissig](https://urbit.org/docs/hoon/reference/rune/tis#-tissig) rune composes multiple expressions together; we use it here to make the code more readable. We take `new-account` and use that as the subject for the call to `deposit`. `deposit` and `withdraw` both produce a new version of the door that's used in subsequent calls, which is why we are able to chain them in this fashion. The final reference is to `balance`, which is the account balance contained in the [core](/docs/glossary/core/) that we examine below.
```hoon
|%
++  new-account
  |_  balance=@ud
  ++  deposit
    |=  amount=@ud
    +>.$(balance (add balance amount))
  ++  withdraw
    |=  amount=@ud
    +>.$(balance (sub balance amount))
  --
--
```
We've chosen here to wrap our door in its own core to emulate the style of programming that is used when creating libraries. `++new-account` is the name of our door. A door is a core with one or more arms that has a sample. Here, our door has a sample of one `@ud` with the face `balance` and two arms `++deposit` and `++withdraw`.
Each of these arms produces a gate which takes an `@ud` argument. Each of these gates has a similar bit of code inside:
```hoon
+>.$(balance (add balance amount))
```
`+>` is a kind of wing syntax, lark notation. This particular wing construction looks for the tail of the tail (the third element) in `$` buc, the subject of the gate we are in. The `++withdraw` and `++deposit` arms create gates with the entire `new-account` door as the context in their cores' `[battery sample context]`, in the "tail of the tail" slot. We change `balance` to be the result of adding `balance` and `amount` and produce the door as the result. `++withdraw` functions the same way only doing subtraction instead of addition.
It's important to notice that the sample, `balance`, is stored as part of the door rather than existing outside of it.
#### Exercise: Bank Account
- Modify the `%say` generator above to accept a `@ud` unsigned decimal dollar amount and a `?(%deposit %withdraw)` term, and return the result of only that operation on the starting balance of the bank account. (Note that this will only work once on the door, and the state will not persist between generator calls.)
### Deferred Computations
_Deferred computation_ means that parts of the subject have changes that may be underdetermined at first. These must be calculated later using the appropriate runes as new or asynchronous information becomes available.
For instance, a network service call may take a while or may fail. How should the calculation deal with these outcomes? In addition, the successful result of the network data is unpredictable in content (but should not be unpredictable in format!).
We have some more tools available for managing deferred or chained computations, in addition to `=.` tisdot and `=~` tissig:
- [`=^` tisket](https://urbit.org/docs/hoon/reference/rune/tis#-tisket) is used to change a leg in the tail of the subject then evaluate against it. This is commonly used for events that need to be ordered in their resolution e.g. with a `%=` centis. (Used in Gall agents frequently.)
- [`=*` tistar](https://urbit.org/docs/hoon/reference/rune/tis#tistar) defers an expression (rather like a macro).
- [`;<` micgal](https://urbit.org/docs/hoon/reference/rune/mic#-micgal) sequences two computations, particularly for an asynchronous event like a remote system call. (Used in [threads](https://urbit.org/docs/userspace/threads/overview).)
- [`;~` micsig](https://urbit.org/docs/hoon/reference/rune/mic#-micsig) produces a pipeline, a way of piping the output of one gate into another in a chain. (This is particularly helpful when parsing text.)
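As a minimal Dojo sketch of `=^` tisket (the faces `n` and `old` are hypothetical, chosen for illustration): the third child produces a cell whose head is pinned under the new face and whose tail replaces the named leg.

```hoon
> =/  n  0
  =^  old  n  [n +(n)]
  [old n]
[0 1]
```

Here `old` holds the value before the update and `n` holds the value after it, which is the same shape used when threading a random number generator's state through a computation.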
### `++og` Randomness
A _random number generator_ provides a stream of calculable but unpredictable values from some _distribution_. In [a later lesson](./R-math.md), we explain how random numbers can be generated from entropy; for now, let's see what's necessary to use such a random-number generator.
An RNG emits a sequence of values given a starting _seed_. For instance, a very simple RNG could emit digits of the number _π_ given a seed which is the number of digits to start from.
- seed 1: 1, 4, 1, 5, 9, 2, 6, 5, 3, 5
- seed 3: 1, 5, 9, 2, 6, 5, 3, 5, 8, 9
- seed 100: 8, 2, 1, 4, 8, 0, 8, 6, 5, 1
Every time you start this “random” number generator with a given seed, it will reproduce the same sequence of numbers.
While RNGs don't work like our _π_-based example, a given seed will reliably produce the same result every time it is run.
The basic RNG core in Hoon is [`++og`](https://urbit.org/docs/hoon/reference/stdlib/3d#og). `++og` is a door whose sample is its seed. We need to use `eny` to seed it non-deterministically, but we can also pin the state using `=^` tisket. [`++rads:rng`](https://urbit.org/docs/hoon/reference/stdlib/3d#radsog) produces a cell of a random whole number in a given range and a new modified core to continue the random sequence.
```hoon
> =+  rng=~(. og eny)
  [-:(rads:rng 100) -:(rads:rng 100)]
[60 60]
```
Since the `rng` starts from the same seed value every single time, both of the numbers will always be the same. What we have to do is pin the updated version of the RNG (the tail of `++rads:og`'s return cell) to the subject using `=^` tisket, e.g.,
```hoon
> =/  rng  ~(. og eny)
  =^  r1  rng  (rads:rng 100)
  =^  r2  rng  (rads:rng 100)
  [r1 r2]
[21 47]
```
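To see the reproducibility described earlier, one can seed `++og` with a fixed value instead of `eny`. This is a minimal sketch: the seed `42` is an arbitrary choice for illustration, and the numbers produced are not shown because they depend on the RNG implementation.

```hoon
::  a fixed seed (42 is arbitrary) in place of eny
> =/  rng  ~(. og 42)
  =^  r1  rng  (rads:rng 100)
  =^  r2  rng  (rads:rng 100)
  [r1 r2]
```

Because the seed never changes, entering this expression repeatedly should yield the same pair each time, while `r1` and `r2` can still differ from one another.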
#### Tutorial: Magic 8-Ball
The Magic 8-Ball returns one of a variety of answers in response to a call. In its entirety:
```hoon
!:
:-  %say
|=  [[* eny=@uvJ *] *]
:-  %noun
^-  tape
=/  answers=(list tape)
  :~  "It is certain."
      "It is decidedly so."
      "Without a doubt."
      "Yes - definitely."
      "You may rely on it."
      "As I see it, yes."
      "Most likely."
      "Outlook good."
      "Yes."
      "Signs point to yes."
      "Reply hazy, try again"
      "Ask again later."
      "Better not tell you now."
      "Cannot predict now."
      "Concentrate and ask again."
      "Don't count on it."
      "My reply is no."
      "My sources say no."
      "Outlook not so good."
      "Very doubtful."
  ==
=/  rng  ~(. og eny)
=/  val  (rad:rng (lent answers))
(snag val answers)
```
Most of the “work” is being done by these two lines:
```hoon
=/  rng  ~(. og eny)
=/  val  (rad:rng (lent answers))
```
`~(. og eny)` starts a random number generator with a seed from the current entropy. A [random number generator](https://en.wikipedia.org/wiki/Random_number_generation) is a stateful mathematical function that produces an unpredictable result (unless you know the algorithm AND the starting value, or seed). Here we pull the subject of [`++og`](https://urbit.org/docs/hoon/reference/stdlib/3d#og), the randomness core in Hoon, to start the RNG.
Then we slam the `++rad:rng` gate which returns a random number from 0 to _n_-1 inclusive. This gives us a random value from the list of possible answers.
Since this is a `%say` generator, we can run it without arguments:
```hoon
> +magic-8
"Ask again later."
```
#### Tutorial: Dice Roll
Let's look at an example that uses all three parts of a `%say` generator's sample: the Arvo data, the required arguments, and the optional arguments. Save the code below in a file called `dice.hoon` in the `/gen` directory of your `%base` desk.
```hoon
:-  %say
|=  [[now=@da eny=@uvJ bec=beak] [n=@ud ~] [bet=@ud ~]]
:-  %noun
[(~(rad og eny) n) bet]
```
This is a very simple dice program with an optional betting functionality. In the code, our sample specifies faces on all of the Arvo data, meaning that we can easily access them. We also require the argument `[n=@ud ~]`, and allow the _optional_ argument `[bet=@ud ~]`.
We can run this generator like so:
```hoon
> +dice 6, =bet 2
[4 2]
> +dice 6
[5 0]
> +dice 6
[2 0]
> +dice 6, =bet 200
[0 200]
> +dice
nest-fail
```
We get a different value from the same generator between runs, something that isn't possible with a naked generator. Another novelty is the ability to choose to not use the second argument.
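The dice generator draws a single value, so it never needs to carry the RNG state forward. As a sketch of how the `=^` tisket pattern from the previous section fits into a `%say` generator, the following rolls two dice. The file name `dice-pair.hoon` is hypothetical.

```hoon
::  /gen/dice-pair.hoon  (hypothetical file name)
::  roll two n-sided dice, threading the RNG state with =^
:-  %say
|=  [[* eny=@uvJ *] [n=@ud ~] ~]
:-  %noun
=/  rng  ~(. og eny)
=^  r1  rng  (rads:rng n)
=^  r2  rng  (rads:rng n)
[r1 r2]
```

Run as `+dice-pair 6`, this should produce a cell of two values, each from 0 to 5, which are drawn independently rather than repeated.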
## Scrying (In Brief)
A _peek_ or a _scry_ is a request to Arvo to tell you something about the state of part of the Urbit OS. Scries are used to determine the state of an agent or a vane. The [`.^` dotket](https://urbit.org/docs/hoon/reference/rune/dot#-dotket) rune sends the scry request to a particular vane with a certain _care_ or type of scry. The request is then routed to a particular path in that vane. Scries are discussed in detail in [the App Guide](https://urbit.org/docs/userspace/gall-guide/10-scry). We will only briefly introduce them here as we can use them later to find out about Arvo's system state, such as file contents and agent state.
### `%c` Clay
The Clay filesystem stores nouns persistently at hierarchical path addresses. These nouns can be accessed using marks, which are rules for structuring the data. We call the nouns “files” and the path addresses “folders”.
If we want to retrieve the contents of a file or folder, we can directly ask Clay for the data using a scry with an appropriate care.
For instance, the `%x` care to the `%c` Clay vane returns the noun at a given address as a `@` atom.
```hoon
> .^(@ %cx /===/gen/hood/hi/hoon)
3.548.750.706.400.251.607.252.023.288.575.526.190.856.734.474.077.821.289.791.377.301.707.878.697.553.411.219.689.905.949.957.893.633.811.025.757.107.990.477.902.858.170.125.439.223.250.551.937.540.468.638.902.955.378.837.954.792.031.592.462.617.422.136.386.332.469.076.584.061.249.923.938.374.214.925.312.954.606.277.212.923.859.309.330.556.730.410.200.952.056.760.727.611.447.500.996.168.035.027.753.417.869.213.425.113.257.514.474.700.810.203.348.784.547.006.707.150.406.298.809.062.567.217.447.347.357.039.994.339.342.906
```
There are tools like `/lib/pretty-file/hoon` which will render this legible to you by using formatted text `tank`s:
```hoon
> =pretty-file -build-file %/lib/pretty-file/hoon
> (pretty-file .^(noun %cx /===/gen/hood/hi/hoon))
~[
[%leaf p=":: Helm: send message to an urbit"]
[%leaf p="::"]
[%leaf p=":::: /hoon/hi/hood/gen"]
[%leaf p=" ::"]
[%leaf p="/? 310"]
[%leaf p=":- %say"]
[%leaf p="|=([^ [who=ship mez=$@(~ [a=tape ~])] ~] helm-send-hi+[who ?~(mez ~ `a.mez)])"]
]
```
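For a quick look without loading a library, the same atom can also be cast to a `cord`. This is a minimal sketch, assuming the file contents are text (as they are for `hoon`-marked files) and reusing the path from above. The output is omitted here.

```hoon
::  cast the raw atom to a cord to read it as text
> `@t`.^(@ %cx /===/gen/hood/hi/hoon)
```

The Dojo prints the result as one long cord rather than line by line, which is why `pretty-file` is usually the more readable choice.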
Similarly, you can request the contents at a particular directory path:
```hoon
> .^(arch %cy /===/gen/hood)
[ fil=~
dir
{ [p=~.resume q=~]
[p=~.install q=~]
[p=~.pass q=~]
[p=~.doze q=~]
...
[p=~.mount q=~]
}
]
```
There are many more options with Clay than just accessing file and folder data. For instance, we can also scry all of the desks on our current ship with the `%d` care of `%c` Clay:
```hoon
> .^((set desk) %cd %)
{%bitcoin %base %landscape %webterm %garden %kids}
```
Other vanes have their own scry interfaces, which are well-documented in [the Arvo docs](TODO).

View File

@ -0,0 +1,351 @@
---
title: Text Processing II
nodes: 185
objectives:
- "Identify tanks, tangs, wains, walls, and similar formatted printing data structures."
- "Interpret logging message structures (`%leaf`, `%rose`, `%palm`)."
- "Interpolate to tanks with `><` syntax."
- "Produce useful error annotations using `~|` sigbar."
---
# Text Processing II
_This module will elaborate on text representation in Hoon, including formatted text and `%ask` generators. It may be considered optional and skipped if you are speedrunning Hoon School._
## Text Conversions
We frequently need to convert from text to data, and between different text-based representations. Let's examine some specific arms:
- How do we convert text into all lower-case?
    - [`++cass`](https://urbit.org/docs/hoon/reference/stdlib/4b#cass)
- How do we turn a `cord` into a `tape`?
    - [`++trip`](https://urbit.org/docs/hoon/reference/stdlib/4b#trip)
- How can we make a list of a null-terminated tuple?
    - [`++le:nl`](https://urbit.org/docs/hoon/reference/stdlib/2m#lenl)
- How can we evaluate Nock expressions?
    - [`++mink`](https://urbit.org/docs/hoon/reference/stdlib/4n#mink)
(If you see a `|*` bartar rune in the code, it's similar to a `|=` bartis, but produces what's called a [_wet gate_](./Q-metals.md).)
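A few of these arms can be tried directly at the Dojo. The sketch below also uses `++cuss`, the upper-case counterpart of `++cass`, which is a standard library arm not otherwise covered in this lesson.

```hoon
::  ++cass lowers case, ++cuss raises it, ++trip converts cord to tape
> (cass "Hello Mars")
"hello mars"
> (cuss "Hello Mars")
"HELLO MARS"
> (trip 'Hello Mars')
"Hello Mars"
```

Note that `++cass` and `++cuss` operate on a `tape`, so a `cord` needs to pass through `++trip` first.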
The `++html` core of the standard library contains some additional important tools for working with web-based data, such as [MIME types](https://en.wikipedia.org/wiki/Media_type) and [JSON strings](https://en.wikipedia.org/wiki/JSON).
- To convert a `@ux` hexadecimal value to a `cord`:
```hoon
> (en:base16:mimes:html [3 0x12.3456])
'123456'
```
- To convert a `cord` to a `@ux` hexadecimal value:
```hoon
> `@ux`q.+>:(de:base16:mimes:html '123456')
0x12.3456
```
- There are tools for working with Bitcoin wallet base-58 values, JSON strings, XML strings, and more.
```hoon
> (en-urlt:html "https://hello.me")
"https%3A%2F%2Fhello.me"
```
## Formatted Text
Hoon produces messages at the Dojo (or otherwise) using an internal formatted text system, called `tank`s. A `+$tank` is a formatted print tree. Error messages and the like are built of `tank`s. `tank`s are defined in `hoon.hoon`:
```hoon
::  $tank: formatted print tree
::
::    just a cord, or
::    %leaf: just a tape
::    %palm: backstep list
::           flat-mid, open, flat-open, flat-close
::    %rose: flat list
::           flat-mid, open, close
::
+$  tank
  $~  leaf/~
  $@  cord
  $%  [%leaf p=tape]
      [%palm p=(qual tape tape tape tape) q=(list tank)]
      [%rose p=(trel tape tape tape) q=(list tank)]
  ==
+$  tang  (list tank)                                   ::  bottom-first error
```
The [`++ram:re`](https://urbit.org/docs/hoon/reference/stdlib/4c#ramre) arm is used to convert these to actual formatted output as a `tape`, e.g.
```hoon
> ~(ram re leaf+"foo")
"foo"
> ~(ram re [%palm ["|" "(" "!" ")"] leaf+"foo" leaf+"bar" leaf+"baz" ~])
"(!foo|bar|baz)"
> ~(ram re [%rose [" " "[" "]"] leaf+"foo" leaf+"bar" leaf+"baz" ~])
"[foo bar baz]"
```
Many generators build sophisticated output using `tank`s and the short-format cell builder `+`, e.g. in `/gen/azimuth-block/hoon`:
```hoon
[leaf+(scow %ud block)]~
```
which is equivalent to
```hoon
~[[%leaf (scow %ud block)]]
```
`tank`s are the primary output mechanism for more advanced generators. Even if you don't end up writing them much, you will encounter them as you delve into the Urbit codebase.
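As a further illustration of how the three `tape`s in a `%rose` are used, here is a small sketch that joins a list of `%leaf` tanks with a custom separator and delimiters. The item names are made up for the example.

```hoon
::  join %leaf tanks with a %rose: separator ", ", open "<", close ">"
> =/  items=(list tank)  ~[leaf+"alpha" leaf+"beta" leaf+"gamma"]
  ~(ram re [%rose [", " "<" ">"] items])
"<alpha, beta, gamma>"
```

The separator appears only between items, while the open and close `tape`s wrap the whole list.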
#### Tutorial: Deep Dive into `ls.hoon`
The `+ls` generator shows the contents at a particular path in Clay:
```hoon
> +cat /===/gen/ls/hoon
/~nec/base/~2022.6.22..17.25.54..1034/gen/ls/hoon
::  LiSt directory subnodes
::
::::  /hoon/ls/gen
  ::
/?    310
/+    show-dir
::
::::
  ::
~&  %
:-  %say
|=  [^ [arg=path ~] vane=?(%g %c)]
=+  lon=.^(arch (cat 3 vane %y) arg)
tang+[?~(dir.lon leaf+"~" (show-dir vane arg dir.lon))]~
```
Let's go line by line:
```hoon
/?    310
/+    show-dir
```
The first line, `/?` faswut, is a placeholder for future functionality which will allow the version number of the kernel to be pinned. It currently does nothing, but you will see it in many files shipped with Urbit.
Then the `show-dir` library is imported.
```hoon
~&  %
```
A separator `%` is printed.
```hoon
:-  %say
```
A `%say` generator is a cell with a metadata tag `%say` as the head and the gate as the tail.
```hoon
|=  [^ [arg=path ~] vane=?(%g %c)]
```
This generator requires a path argument in its sample and optionally accepts a vane tag (`%g` Gall or `%c` Clay). Most of the time, `+ls` is used with Clay, so `%c` as the last entry in the type union serves as the bunt value.
```hoon
=+  lon=.^(arch (cat 3 vane %y) arg)
```
We saw [`.^` dotket](https://urbit.org/docs/hoon/reference/rune/dot#-dotket) for the first time in [the previous module](./N-subject.md), where we learned that it performs a _peek_ or _scry_ into the state of an Arvo vane. Most of the time this functionality is used to ask `%c` Clay or `%g` Gall for information about a path, desk, agent, etc. In this case, `(cat 3 %c %y)` is a fancy way of concatenating the two `@tas` terms into `%cy`, a Clay file or directory lookup. The type of this lookup is `+$arch`, and the location of the file or directory is given by `arg` from the sample.
```hoon
tang+[?~(dir.lon leaf+"~" (show-dir vane arg dir.lon))]~
```
The result of the lookup on the previous line is adapted into a formatted text block with a head of `%tang`. Its contents depend on whether the directory listing `dir.lon` is `~` null or not.
#### Tutorial: Deep Dive into `cat.hoon`
Next, how does `+cat` work? Let's look at the structure of `/gen/cat/hoon`:
```hoon
::  ConCATenate file listings
::
::::  /hoon/cat/gen
  ::
/?    310
/+    pretty-file, show-dir
::
::::
  ::
:-  %say
|=  [^ [arg=(list path)] vane=?(%g %c)]
=-  tang+(flop `tang`(zing -))
%+  turn  arg
|=  pax=path
^-  tang
=+  ark=.^(arch (cat 3 vane %y) pax)
?^  fil.ark
  ?:  =(%sched -:(flop pax))
    [>.^((map @da cord) (cat 3 vane %x) pax)<]~
  [leaf+(spud pax) (pretty-file .^(noun (cat 3 vane %x) pax))]
?-     dir.ark                                          ::  handle ambiguity
    ~
  [rose+[" " `~]^~[leaf+"~" (smyt pax)]]~
::
    [[@t ~] ~ ~]
  $(pax (welp pax /[p.n.dir.ark]))
::
    *
  =-  [palm+[": " ``~]^-]~
  :~  rose+[" " `~]^~[leaf+"*" (smyt pax)]
      `tank`(show-dir vane pax dir.ark)
  ==
==
```
- What is the top-level structure of the generator? (A cell of `%say` and the gate, previewing `%say` generators.)
- Some points of interest include:
    - `/?` faswut pins the expected Arvo kelvin version; right now it doesn't do anything.
    - [`.^` dotket](https://urbit.org/docs/hoon/reference/rune/dot#-dotket) loads a value from Arvo (called a “scry”).
    - [`++smyt`](https://urbit.org/docs/hoon/reference/stdlib/4m#smyt) pretty-prints a path.
    - [`=-` tishep](https://urbit.org/docs/hoon/reference/rune/tis#--tishep) combines a faced noun with the subject, inverted relative to `=+` tislus/`=/` tisfas.
You can see how much of the generator is concerned with formatting the content of the file into a formatted text `tank` by prepending `%rose` tags and so forth.
- Work line-by-line through the file and clarify parts that are muddy to you at first glance.
### Producing Error Messages
Formal error messages in Urbit are built of tanks. “A `tang` is a list of `tank`s, and a `tank` is a structure for printing data. There are three types of `tank`: `leaf`, `palm`, and `rose`. A `leaf` is for printing a single noun, a `rose` is for printing rows of data, and a `palm` is for printing backstep-indented lists.”
One way to include an error message in your code is the [`~_` sigcab](https://urbit.org/docs/reference/hoon-expressions/rune/sig/#sigcab) rune, described as a “user-formatted tracing printf”, or the [`~|` sigbar](https://urbit.org/docs/reference/hoon-expressions/rune/sig/#sigbar) rune, a “tracing printf”. What this means is that these print to the stack trace if something fails, so you can use either rune to contribute to the error description:
```hoon
|=  [a=@ud]
~_  leaf+"This code failed"
!!
```
When you compose your own library functions, consider including error messages for likely failure points.
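For comparison, here is a sketch of the `~|` sigbar form, which accepts any noun rather than a `tank`. The gate is a hypothetical example written for this illustration, not a standard library arm. It guards a division and interpolates the offending value into the message.

```hoon
::  hypothetical example gate: divide a by b, annotating the crash if b is 0
|=  [a=@ud b=@ud]
~|  "cannot divide {<a>} by zero"
?<  =(0 b)
(div a b)
```

The message only shows up in the stack trace when the `?<` wutgal assertion fails, so it costs nothing on the successful path.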
## `%ask` Generators
Previously, we introduced the concept of a `%say` generator to produce a more versatile form of standalone single computation than a simple naked generator (gate) allowed. Another elaboration, the `%ask` generator, takes things further.
We use an `%ask` generator when we want to create an interactive program that prompts for inputs as it runs, rather than expecting arguments to be passed in at the time of initiation.
This section will briefly walk through an `%ask` generator to give you a taste of how they work. The [CLI app guide](https://urbit.org/docs/hoon/guides/cli-tutorial) walks through the libraries necessary for working with `%ask` generators in greater detail. We also recommend reading [~wicdev-wisryt's “Input and Output in Hoon”](https://urbit.org/blog/io-in-hoon) for an extended consideration of relevant input/output issues.
#### Tutorial: `%ask` Generator
The code below is an `%ask` generator that checks if the user inputs `"blue"` when prompted [per a classic Monty Python scene](https://www.youtube.com/watch?v=L0vlQHxJTp0). Save it as `/gen/axe.hoon` in your `%base` desk.
```hoon
/-  sole
/+  generators
=,  [sole generators]
:-  %ask
|=  *
^-  (sole-result (cask tang))
%+  print    leaf+"What is your favorite color?"
%+  prompt   [%& %prompt "color: "]
|=  t=tape
%+  produce  %tang
?:  =(t "blue")
  :~  leaf+"Oh. Thank you very much."
      leaf+"Right. Off you go then."
  ==
:~  leaf+"Aaaaagh!"
    leaf+"Into the Gorge of Eternal Peril with you!"
==
```
Run the generator from the Dojo:
```hoon
> +axe
What is your favorite color?
: color:
```
Something new has happened. Instead of simply returning something, your Dojo's prompt changed from `~your-urbit:dojo>` to `~your-urbit:dojo: color:`, and now expects additional input. Let's give it an answer:
```hoon
: color: red
Into the Gorge of Eternal Peril with you!
Aaaaagh!
```
Let's go over what exactly is happening in this code.
```hoon
/-  sole
/+  generators
=,  [sole generators]
```
Here we bring in some of the types we are going to need from `/sur/sole` and gates we will use from `/lib/generators`. We use some special runes for this.
- `/-` fashep is a Ford rune used to import types from `/sur`.
- `/+` faslus is a Ford rune used to import libraries from `/lib`.
- `=,` tiscom is a rune that allows us to expose a namespace. We do this to avoid having to write `sole-result:sole` instead of `sole-result` or `print:generators` instead of `print`.
```hoon
:-  %ask
|=  *
```
This code might be familiar. Just as with their `%say` cousins, `%ask` generators need to produce a `cell`, the head of which specifies what kind of generator we are running.
With `|= *`, we create a gate and ignore the standard arguments we are given, because we're not using them.
```hoon
^-  (sole-result (cask tang))
```
`%ask` generators need to have the second half of the cell be a gate that produces a `sole-result`, one that in this case contains a `cask` of `tang`. We use the `^-` kethep rune to constrain the generator's output to such a `sole-result`.
A `cask` is a pair of a `mark` name and a noun. We previously described a `mark` as a kind of complicated mold; here we add that a `mark` can be thought of as an Arvo-level [MIME](https://en.wikipedia.org/wiki/MIME) type for data.
A `tang` is a `list` of `tank`, and a `tank` is a structure for printing data, as described above. There are three types of `tank`: `leaf`, `palm`, and `rose`. A `leaf` is for printing a single noun, a `rose` is for printing rows of data, and a `palm` is for printing backstep-indented lists.
```hoon
%+  print    leaf+"What is your favorite color?"
%+  prompt   [%& %prompt "color: "]
|=  t=tape
%+  produce  %tang
```
Because we imported `generators`, we can access its contained gates, three of which we use in `axe.hoon`: `++print`, `++prompt`, and `++produce`.
- `print` is used for printing a `tank` to the console.
In our example, `%+` cenlus is the rune to call a gate with two arguments: `++print` takes a `tank` to print and the `sole-result` that follows it (here, the rest of the generator). The expression `leaf+"What is your favorite color?"` is syntactic sugar for `[%leaf "What is your favorite color?"]` that just makes it easier to write.
- `prompt` is used to construct a prompt for the user to provide input. It takes a single argument that is a tuple. Most `%ask` generators will want to use the `++prompt` gate.
The first element of the `++prompt` sample is a flag that indicates whether what the user typed should be echoed out to them or hidden. `%&` will produce echoed output and `%|` will hide the output (for use in passwords or other secret text).
The second element of the `++prompt` sample is intended to be information for use in creating autocomplete options for the prompt. This functionality is not yet implemented.
The third element of the `++prompt` sample is the `tape` that we would like to use to prompt the user. In the case of our example, we use `"color: "`.
- `produce` is used to construct the output of the generator. In our example, we produce a `tang`.
```hoon
|=  t=tape
```
Our gate here takes a `tape` that was produced by `++prompt`. If we needed another type of data we could use `++parse` to obtain it.
The rest of this generator should be intelligible to those with Hoon knowledge at this point.
One quirk that you should be aware of, though, is that `tang` prints in reverse order from how it is created. The reason for this is that `tang` was originally created to display stack trace information, which should be produced in reverse order. This leads to an annoyance: we either have to specify our messages backwards or construct them in the order we want and then `++flop` the `list`.
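As a sketch of the second option, the failing branch of `axe.hoon` could keep its messages as written and reverse them with `++flop`. This is a fragment meant to stand in for the final `:~ ... ==` list of the generator above, not a complete program.

```hoon
::  fragment: replaces the last :~ ... == list in axe.hoon
%-  flop
^-  tang
:~  leaf+"Aaaaagh!"
    leaf+"Into the Gorge of Eternal Peril with you!"
==
```

With this change the lines should print in the same order in which they are written in the source.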

View File

@ -0,0 +1,366 @@
---
title: Functional Programming
nodes: 233, 283, 383
objectives:
- "Reel, roll, turn a list."
- "Curry, cork functions."
- "Change arity of a gate."
- "Tokenize text simply using `find` and `trim`."
- "Identify elements of parsing: `nail`, `rule`, etc."
- "Use `++scan` to parse `tape` into atoms."
- "Construct new rules and parse arbitrary text fields."
---
# Functional Programming
_This module will discuss some gates-that-work-on-gates and other assorted operators that are commonly recognized as functional programming tools. It will also cover text parsing._
Given a gate, you can manipulate it to accept a different number of values than its sample formally requires, or otherwise modify its behavior. These techniques mirror some of the common tasks used in other [functional programming languages](https://en.wikipedia.org/wiki/Functional_programming) like Haskell, Clojure, and OCaml.
Functional programming, as a paradigm, tends to prefer rather mathematical expressions with explicit modification of function behavior. It works as a formal system of symbolic expressions manipulated according to given rules and properties. FP was derived from the [lambda calculus](https://en.wikipedia.org/wiki/Lambda_calculus), a cousin of combinator calculi like Nock. (See also [APL](https://en.wikipedia.org/wiki/APL_%28programming_language%29).)
## Changing Arity
If a gate accepts only two values in its sample, for instance, you can chain together multiple calls automatically using the [`;:` miccol](https://urbit.org/docs/hoon/reference/rune/mic#-miccol) rune.
```hoon
> (add 3 (add 4 5))
12
> :(add 3 4 5)
12
> (mul 3 (mul 4 5))
60
> :(mul 3 4 5)
60
```
This is called changing the [_arity_](https://en.wikipedia.org/wiki/Arity) of the gate. (Does this work on `++mul:rs`?)
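The same rune works with any gate that takes a pair, not just arithmetic ones. A short sketch with `++weld`, which joins two lists:

```hoon
::  ;: chains a two-argument gate across three or more values
> (weld "abc" (weld "def" "ghi"))
"abcdefghi"
> :(weld "abc" "def" "ghi")
"abcdefghi"
```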
## Binding the Sample
[_Currying_](https://en.wikipedia.org/wiki/Currying) describes taking a function of multiple arguments and reducing it to a set of functions that each take only one argument. _Binding_, an allied process, is used to set the value of some of those arguments permanently.
If you have a gate which accepts multiple values in the sample, you can fix one of these. To fix the head of the sample (the first argument), use [`++cury`](https://urbit.org/docs/hoon/reference/stdlib/2n#cury); to bind the tail, use [`++curr`](https://urbit.org/docs/hoon/reference/stdlib/2n#curr).
Consider calculating _a x² + b x + c_, a situation we earlier resolved using a door. We can resolve the situation differently using currying:
```hoon
> =full |=([x=@ud a=@ud b=@ud c=@ud] (add (add (mul (mul x x) a) (mul x b)) c))
> (full 5 4 3 2)
117
> =one (curr full [4 3 2])
> (one 5)
117
```
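To make the difference between binding the head and binding the tail concrete, here is a short sketch using `++sub`, where argument order matters. The values are chosen arbitrarily.

```hoon
::  ++cury fixes the first argument: (sub 10 3)
> ((cury sub 10) 3)
7
::  ++curr fixes the last argument: (sub 13 10)
> ((curr sub 10) 13)
3
```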
One can also [`++cork`](https://urbit.org/docs/hoon/reference/stdlib/2n#cork) a gate, or arrange it such that the next gate applies to its result. This pairs well with `;:` miccol. (There is also [`++corl`](https://urbit.org/docs/hoon/reference/stdlib/2n#corl), which composes backwards rather than forwards.) This example decrements a value then converts it to `@ux` by corking a gate and a mold:
```hoon
> ((cork dec @ux) 20)
0x13
```
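The direction of composition is easiest to see with two gates that do not commute. In this sketch, `double` is a helper defined only for the illustration.

```hoon
::  double is a hypothetical helper gate for this example
> =double |=(a=@ (mul 2 a))
::  ++cork: double first, then decrement
> ((cork double dec) 20)
39
::  ++corl: decrement first, then double
> ((corl double dec) 20)
38
```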
#### Exercise: Bind Gate Arguments
- Create a gate `++inc` which increments a value in one step, analogous to `++dec`.
#### Exercise: Chain Gate Values
- Write an expression which yields the parent galaxy of a planet's sponsoring star by composing two gates.
## Working Across `list`s
[`++turn`](https://urbit.org/docs/hoon/reference/stdlib/2b#turn) takes a list and a gate, and returns a list of the products of applying the gate to each item of the input list. For example, to add 1 to each item in a list of atoms:
```hoon
> (turn `(list @)`~[11 22 33 44] |=(a=@ +(a)))
~[12 23 34 45]
```
Or to double each item in a list of atoms:
```hoon
> (turn `(list @)`~[11 22 33 44] |=(a=@ (mul 2 a)))
~[22 44 66 88]
```
`++turn` is Hoon's version of Haskell's `map`.
We can rewrite the Caesar cipher program using `++turn`:
```hoon
|=  [a=@ b=tape]
^-  tape
?:  (gth a 25)
  $(a (sub a 26))
%+  turn  b
|=  c=@tD
?:  &((gte c 'A') (lte c 'Z'))
  =.  c  (add c a)
  ?.  (gth c 'Z')  c
  (sub c 26)
?:  &((gte c 'a') (lte c 'z'))
  =.  c  (add c a)
  ?.  (gth c 'z')  c
  (sub c 26)
c
```
[`++roll`](https://urbit.org/docs/hoon/reference/stdlib/2b#roll) and [`++reel`](https://urbit.org/docs/hoon/reference/stdlib/2b#reel) are used to left-fold and right-fold a list, respectively. Folding a list is similar to [`++turn`](https://urbit.org/docs/hoon/reference/stdlib/2b#turn), except that instead of yielding a `list` of the results of applying the gate to each item, `++roll` and `++reel` produce a single accumulated value.
```hoon
> (roll `(list @)`[1 2 3 4 5 ~] add)
q=15
> (reel `(list @)`[1 2 3 4 5 ~] mul)
120
```
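With `add` and `mul` the fold direction makes no difference to the result. A gate that builds a list shows the distinction. In this sketch, `cons` is a hypothetical helper that prepends each item to the accumulator. The outputs are omitted, since the exact printed form may vary.

```hoon
::  cons is a hypothetical helper: prepend item a to accumulated list b
> =cons |=([a=@ b=(list @)] `(list @)`[a b])
::  left fold: items are consumed first to last
> (roll `(list @)`~[1 2 3] cons)
::  right fold: items are consumed last to first
> (reel `(list @)`~[1 2 3] cons)
```

The left fold should accumulate the items in reversed order (3, 2, 1), while the right fold should preserve the original order (1, 2, 3).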
#### Exercise: Calculate a Factorial
- Use `++reel` to produce a gate which calculates the factorial of a number.
## Parsing Text
We need to build a tool to accept a `tape` containing some characters, then turn it into something else, something computational.
For instance, a calculator could accept an input like `3+4` and return `7`. A command-line interface may look for a program to evaluate (like Bash and `ls`). A search bar may apply logic to the query (like Google and `-` for `NOT`).
The basic problem all parsers face is this:
1. You need to accept a character string.
2. You need to ingest one or more characters and decide what they “mean”, including storing the result of this meaning.
3. You need to loop back to #1 again and again until you are out of characters.
We could build a simple parser out of a trap and `++snag`, but it would be brittle and difficult to extend. The Hoon parser is very sophisticated, since it has to take a file of ASCII characters (and some UTF-8 strings) and turn it via an AST into Nock code. What makes parsing challenging is that we have to wade directly into a sea of new types and processes. To wit:
- A `tape` is the string to be parsed.
- A `hair` is the position in the text the parser is at, as a cell of line & column, `[p=@ud q=@ud]`.
- A `nail` is parser input, a cell of `hair` and `tape`.
- An `edge` is parser output, a cell of `hair` and a `unit` of the parsed result and a `nail` holding the remaining input. (There are some subtleties around failure-to-parse here that we'll defer a moment.)
- A `rule` is a parser, a gate which is applied to a `nail` to yield an `edge`.
Basically, one uses a `rule` on `[hair tape]` to yield an `edge`.
A substantial swath of the standard library is built around parsing for various scenarios, and there's a lot to know to effectively use these tools. **If you can parse arbitrary input using Hoon after this lesson, you're in fantastic shape for building things later.** It's worth spending extra effort to understand how these programs work.
There is a [full guide on parsing](https://urbit.org/docs/hoon/guides/parsing) which goes into more detail than this quick overview.
### Scanning Through a `tape`
[`++scan`](https://urbit.org/docs/hoon/reference/stdlib/4g#scan) parses a `tape` or crashes, simple enough. It will be our workhorse. All we really need to know in order to use it is how to build a `rule`.
Here we will preview using `++shim` to match characters within a given range, here lower-case letters. If you change the character range, e.g. by using `' '` as the lower bound of the `++shim`, the rule will span from ASCII `32`, `' '`, to ASCII `122`, `'z'`.
```hoon
> `(list)`(scan "after" (star (shim 'a' 'z')))
~[97 102 116 101 114]
> `(list)`(scan "after the" (star (shim 'a' 'z')))
{1 6}
syntax error
dojo: hoon expression failed
```
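Widening the range as just described lets the second input parse. The sketch below casts the result to a `tape` so that it prints as text rather than as a list of character codes.

```hoon
::  lower bound ' ' (ASCII 32) admits the space in the input
> `tape`(scan "after the" (star (shim ' ' 'z')))
"after the"
```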
### `rule` Building
The `rule`-building system is vast and often requires combining various components to achieve the desired effect.
#### `rule`s to parse fixed strings
- [`++just`](https://urbit.org/docs/hoon/reference/stdlib/4f/#just) takes in a single `char` and produces a `rule` that attempts to match that `char` to the first character in the `tape` of the input `nail`.
```hoon
> ((just 'a') [[1 1] "abc"])
[p=[p=1 q=2] q=[~ [p='a' q=[p=[p=1 q=2] q="bc"]]]]
```
- [`++jest`](https://urbit.org/docs/hoon/reference/stdlib/4f/#jest) matches a `cord`. It takes an input `cord` and produces a `rule` that attempts to match that `cord` against the beginning of the input.
```hoon
> ((jest 'abc') [[1 1] "abc"])
[p=[p=1 q=4] q=[~ [p='abc' q=[p=[p=1 q=4] q=""]]]]
> ((jest 'abc') [[1 1] "abcabc"])
[p=[p=1 q=4] q=[~ [p='abc' q=[p=[p=1 q=4] q="abc"]]]]
> ((jest 'abc') [[1 1] "abcdef"])
[p=[p=1 q=4] q=[~ [p='abc' q=[p=[p=1 q=4] q="def"]]]]
```
(Keep an eye on the structure of the return `edge` there.)
- [`++shim`](https://urbit.org/docs/hoon/reference/stdlib/4f/#shim) parses characters within a given range. It takes in two atoms and returns a `rule`.
```hoon
> ((shim 'a' 'z') [[1 1] "abc"])
[p=[p=1 q=2] q=[~ [p='a' q=[p=[p=1 q=2] q="bc"]]]]
```
- [`++next`](https://urbit.org/docs/hoon/reference/stdlib/4f/#next) is a simple `rule` that takes in the next character and returns it as the parsing result.
```hoon
> (next [[1 1] "abc"])
[p=[p=1 q=2] q=[~ [p='a' q=[p=[p=1 q=2] q="bc"]]]]
```
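All of the examples above succeed. When a `rule` fails to match, the `edge` it returns has `~` in place of the `unit`, and the `hair` records where the failure happened. A minimal sketch with `++just`:

```hoon
::  'a' does not match the first character of "xyz"
> ((just 'a') [[1 1] "xyz"])
[p=[p=1 q=1] q=~]
```

This is the failure case that `++scan` turns into a `syntax error` crash.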
#### `rule`s to parse flexible strings
So far we can only parse one character at a time, which isn't much better than just using `++snag` in a trap.
```hoon
> (scan "a" (shim 'a' 'z'))
'a'
> (scan "ab" (shim 'a' 'z'))
{1 2}
syntax error
dojo: hoon expression failed
```
How do we parse multiple characters in order to break things up sensibly?
- [`++star`](https://urbit.org/docs/hoon/reference/stdlib/4f#star) matches a `rule` zero or more times and produces a list of the results.
```hoon
> (scan "a" (just 'a'))
'a'
> (scan "aaaaa" (just 'a'))
! {1 2}
! 'syntax-error'
! exit
> (scan "aaaaa" (star (just 'a')))
"aaaaa"
```
- [`++plug`](https://urbit.org/docs/hoon/reference/stdlib/4e/#plug) takes the `nail` in the `edge` produced by one rule and passes it to the next `rule`, forming a cell of the results as it proceeds.
```hoon
> (scan "starship" ;~(plug (jest 'star') (jest 'ship')))
['star' 'ship']
```
- [`++pose`](https://urbit.org/docs/hoon/reference/stdlib/4e/#pose) tries each `rule` you hand it successively until it finds one that works.
```hoon
> (scan "a" ;~(pose (just 'a') (just 'b')))
'a'
> (scan "b" ;~(pose (just 'a') (just 'b')))
'b'
> (;~(pose (just 'a') (just 'b')) [1 1] "ab")
[p=[p=1 q=2] q=[~ u=[p='a' q=[p=[p=1 q=2] q=[i='b' t=""]]]]]
```
- [`++glue`](https://urbit.org/docs/hoon/reference/stdlib/4e/#glue) parses a delimiter in between each `rule` and forms a cell of the results of each `rule`. Delimiter names hew to the aural ASCII pronunciation of symbols, plus `prn` for printable characters.
```hoon
> (scan "a b" ;~((glue ace) (just 'a') (just 'b')))
['a' 'b']
> (scan "a,b" ;~((glue com) (just 'a') (just 'b')))
['a' 'b']
> (scan "a,b,a" ;~((glue com) (just 'a') (just 'b')))
{1 4}
syntax error
> (scan "a,b,a" ;~((glue com) (just 'a') (just 'b') (just 'a')))
['a' 'b' 'a']
```
- The [`;~` micsig](https://urbit.org/docs/hoon/reference/rune/mic/#-micsig) rune takes the form `;~(combinator (list rule))` and uses the combinator to chain multiple `rule`s together.
```hoon
> (scan "after the" ;~((glue ace) (star (shim 'a' 'z')) (star (shim 'a' 'z'))))
[[i='a' t=<|f t e r|>] [i='t' t=<|h e|>]]
> (;~(pose (just 'a') (just 'b')) [1 1] "ab")
[p=[p=1 q=2] q=[~ u=[p='a' q=[p=[p=1 q=2] q=[i='b' t=""]]]]]
```
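Putting these pieces together, here is a sketch that splits a simple `key=value` string using `++glue` with `tis`, the standard library rule matching `=`. The input string is made up for the example, and the cast is only there to make the output print as text.

```hoon
::  two runs of lower-case letters separated by '='
> `[tape tape]`(scan "color=blue" ;~((glue tis) (star (shim 'a' 'z')) (star (shim 'a' 'z'))))
["color" "blue"]
```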
<!-- TODO
~tinnus-napbus:
btw you should almost always avoid recursive welding cos weld has to traverse the entire first list in order to weld it
so you potentially end up traversing the list thousands of times
which involves chasing a gorillion pointers
as a rule of thumb you wanna avoid the recursive use of stdlib list functions in general
-->
At this point we have two problems: we are just getting raw `@t` atoms back, and we can't iteratively process arbitrarily long strings. `++cook` will help us with the first of these:
- [`++cook`](https://urbit.org/docs/hoon/reference/stdlib/4f/#cook) will take a `rule` and a gate to apply to the successful parse.
```hoon
> ((cook ,@ud (just 'a')) [[1 1] "abc"])
[p=[p=1 q=2] q=[~ u=[p=97 q=[p=[p=1 q=2] q="bc"]]]]
> ((cook ,@tas (just 'a')) [[1 1] "abc"])
[p=[p=1 q=2] q=[~ u=[p=%a q=[p=[p=1 q=2] q="bc"]]]]
> ((cook |=(a=@ +(a)) (just 'a')) [[1 1] "abc"])
[p=[p=1 q=2] q=[~ u=[p=98 q=[p=[p=1 q=2] q="bc"]]]]
> ((cook |=(a=@ `@t`+(a)) (just 'a')) [[1 1] "abc"])
[p=[p=1 q=2] q=[~ u=[p='b' q=[p=[p=1 q=2] q="bc"]]]]
```
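`++cook` is not limited to single characters. As a sketch, it can wrap a multi-character `rule` and hand the whole parsed `tape` to a gate such as `++crip`, which converts a `tape` to a `cord`:

```hoon
::  parse a run of lower-case letters, then convert the tape to a cord
> (scan "after" (cook crip (star (shim 'a' 'z'))))
'after'
```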
However, to parse iteratively, we need to use the `++knee` function, which takes a noun as the bunt of the type the `rule` produces, and produces a `rule` that recurses properly. (You'll probably want to treat this as a recipe for now and just copy it when necessary.)
```hoon
|-(;~(plug prn ;~(pose (knee *tape |.(^$)) (easy ~))))
```
There is an example of a calculator [in the docs](https://urbit.org/docs/hoon/guides/parsing#recursive-parsers) that's worth a read. It uses `++knee` to scan in a set of numbers at a time.
```hoon
|=  math=tape
|^  (scan math expr)
++  factor
  %+  knee  *@ud
  |.  ~+
  ;~  pose
    dem
    (ifix [pal par] expr)
  ==
++  term
  %+  knee  *@ud
  |.  ~+
  ;~  pose
    ((slug mul) tar ;~(pose factor term))
    factor
  ==
++  expr
  %+  knee  *@ud
  |.  ~+
  ;~  pose
    ((slug add) lus ;~(pose term expr))
    term
  ==
--
```
#### Example: Parse a String of Numbers
A simple `++shim`-based parser:
```hoon
> (scan "1234567890" (star (shim '0' '9')))
[i='1' t=<|2 3 4 5 6 7 8 9 0|>]
```
A refined `++cook`/`++cury`/`++jest` parser:
```hoon
> ((cook (cury slaw %ud) (jest '1')) [[1 1] "123"])
[p=[p=1 q=2] q=[~ u=[p=[~ 1] q=[p=[p=1 q=2] q="23"]]]]
> ((cook (cury slaw %ud) (jest '12')) [[1 1] "123"])
[p=[p=1 q=3] q=[~ u=[p=[~ 12] q=[p=[p=1 q=3] q="3"]]]]
```
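The digit characters can also be converted straight into a number. As a minimal sketch, the standard library rule `++dem` (the same rule the calculator example above uses for its numeric leaves) parses a run of decimal digits into a single atom:

```hoon
::  parse the whole tape as one decimal number
> (scan "1234567890" dem)
1.234.567.890
```

Because `++scan` requires the rule to consume the entire tape, any non-digit character in the input would make this call crash rather than return a partial result.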

View File

@ -0,0 +1,690 @@
---
title: Gates
nodes: 288, 299
objectives:
- "Distinguish dry and wet cores."
- "Describe use cases for wet gates (using genericity)."
- "Enumerate and distinguish use cases for dry cores (using variance):"
- "- Covariant (`%zinc`)"
- "- Contravariant (`%iron`)"
- "- Bivariant (`%lead`)"
- "- Invariant (`%gold`)"
---
# Adaptive Cores
_This module introduces how cores can be extended for different behavioral patterns. It may be considered optional and skipped if you are speedrunning Hoon School._
Cores can expose and operate with many different assumptions about their inputs and structure. `[battery payload]` describes the top-level structure of a core, but within that we already know other requirements can be enforced, like `[battery [sample context]]` for a gate, or no `sample` for a trap. Cores can also expose and operate on their input values with different relationships. This lesson is concerned with examining [_genericity_](https://en.wikipedia.org/wiki/Generic_programming) including certain kinds of [parametric polymorphism](https://en.wikipedia.org/wiki/Parametric_polymorphism), which allows flexibility in type, and [_variance_](https://en.wikipedia.org/wiki/Covariance_and_contravariance_%28computer_science%29), which allows cores to use different sets of rules as they evaluate.
If cores never changed, we wouldn't need polymorphism. Of course, nouns are immutable and never change, but we use them as templates to construct new nouns around.
Suppose we take a core, a cell `[battery payload]`, and replace `payload` with a different noun. Then, we invoke an arm from the battery.
Is this legal? Does it make sense? Every function call in Hoon does this, so we'd better make it work well.
The full core stores _both_ payload types: the type that describes the `payload` currently in the core, and the type that the core was compiled with.
In the [Bertrand Meyer tradition of type theory](https://en.wikipedia.org/wiki/Object-Oriented_Software_Construction), there are two forms of polymorphism: _variance_ and _genericity_. In Hoon this choice is per core: a core can be either `%wet` or `%dry`. Dry polymorphism relies on variance; wet polymorphism relies on genericity.
This lesson discusses both genericity and variance for core management. These two sections may be read separately or in either order, and none of this content is required for working extensively with Gall agents. If you're just starting off, wet gates (genericity) make the most sense to have in your toolkit now.
## Genericity
Polymorphism is a programming concept that allows a piece of code to use different types at different times. It's a common technique in most languages to make code that can be reused for many different situations, and Hoon is no exception.
### Dry Cores
A dry gate is the kind of gate that you're already familiar with: a one-armed [core](https://urbit.org/docs/glossary/core/) with a sample. A wet gate is also a one-armed [core](https://urbit.org/docs/glossary/core/) with a sample, but there is a difference in how types are handled. With a dry gate, when you pass in an argument and the code gets compiled, the type system will try to cast to the type specified by the gate; if you pass something that does not fit in the specified type, for example a `cord` instead of a `cell`, you will get a `nest-fail` error.
A core's payload can change from its original value. In fact, this happens in the typical function call: the default sample is replaced with an input value. How can we ensure that the core's arms are able to run correctly, that the payload type is still appropriate despite whatever changes it has undergone?
There is a type check for each arm of a dry core, intended to verify that the arm's parent core has a payload of the correct type.
When the `$` buc arm of a dry gate is evaluated it takes its parent core—the dry gate itself—as the subject, often with a modified sample value. But any change in sample type should be conservative; the modified sample value must be of the same type as the default sample value (or possibly a subtype). When the `$` buc arm is evaluated it should have a subject of a type it knows how to use.
### Wet Gates
When you pass arguments to a wet gate, their types are preserved and type analysis is done at the definition site of the gate rather than at the call site. In other words, for a wet gate, we ask: “Suppose this core was actually _compiled_ using the modified payload instead of the one it was originally built with? Would the Nock formula we generated for the original template actually work for the modified `payload`?” Basically, wet gates allow you to hot-swap code at runtime and see if it “just works”—they defer the actual substitution in the `sample`. Wet gates are rather like [macros](https://en.wikipedia.org/wiki/Macro_%28computer_science%29) in this sense.
Consider a function like `++turn` which transforms each element of a list. To use `++turn`, we install a list and a transformation function in a generic core. The type of the list we produce depends on the type of the list and the type of the transformation function. But the Nock formulas for transforming each element of the list will work on any function and any list, so long as the function's argument is the list item.
A wet gate is defined by a [`|*` bartar](https://urbit.org/docs/hoon/reference/rune/bar#-bartar) rune rather than a `|=` bartis. More generally, cores that contain wet arms **must** be defined using [`|@` barpat](https://urbit.org/docs/hoon/reference/rune/bar#-barpat) instead of `|%` barcen (`|*` expands to a `|@` core with `$` buc arm). There is also [`|$` barbuc](https://urbit.org/docs/hoon/reference/rune/bar#-barbuc) which defines the wet gate mold builder (remember, we like gates that build gates).
In a nutshell, compare these two gates:
```hoon
> =dry |=([a=* b=*] [b a])
> =wet |*([a=* b=*] [b a])
> (dry %cat %dog)
[6.778.724 7.627.107]
> (wet %cat %dog)
[%dog %cat]
```
The dry gate does not preserve the type of `a` and `b`, but downcasts it to `*`; the wet gate does preserve the input types. It is good practice to include a cast in all gates, even wet gates. But in many cases the desired output type depends on the input type. How can we cast appropriately? Often we can cast by example, using the input values themselves (using `^+` ketlus).
Wet gates are therefore used when incoming type information is not well known and needs to be preserved. This includes parsing, building, and structuring arbitrary nouns. (If you are familiar with them, you can think of C++'s templates and operator overloading, and Haskell's typeclasses.) Wet gates are very powerful; they're enough rope to hang yourself with. Don't use them unless you have a specific reason to do so. (If you see `mull-*` errors then something has gone wrong with using wet gates.)
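One way to see the difference for yourself is to ask the Dojo for the type of each result rather than its value. This is only a sketch, and it assumes the `dry` and `wet` gates from the comparison above are still bound in your Dojo session. The `!>` zapgar rune wraps a value in a vase, and `-:` takes the head of that vase, which is the type:

```hoon
::  assumes `dry` and `wet` are still bound from the comparison above
::  type of the dry gate's product
> -:!>((dry %cat %dog))
::  type of the wet gate's product
> -:!>((wet %cat %dog))
```

The first expression should report a cell of two plain nouns, since the dry gate only promised `*` for each slot. The second should still carry the two constant terms, because the wet gate was type-checked against the arguments actually passed in.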
#### Exercise: The Trapezoid Rule
The [trapezoid rule](https://en.wikipedia.org/wiki/Trapezoidal_rule) approximates a definite integral by the area of a trapezoid or (commonly) a series of trapezoids under the curve. The rule requires a function as one of the inputs, i.e. it applies _for a specific function_. We will use wet gates to accomplish this without stripping type information of the input gate core.
![](https://upload.wikimedia.org/wikipedia/commons/thumb/d/d1/Integration_num_trapezes_notation.svg/573px-Integration_num_trapezes_notation.svg.png)
<img src="https://latex.codecogs.com/svg.image?\large&space;\int_a^b&space;f(x)&space;\,&space;dx&space;\approx&space;\sum_{k=1}^N&space;\frac{f(x_{k-1})&space;&plus;&space;f(x_k)}{2}&space;\Delta&space;x_k&space;=&space;\tfrac{\Delta&space;x}{2}\left(f(x_0)&space;&plus;&space;2f(x_1)&plus;2f(x_2)&plus;&space;2f(x_3)&plus;2f(x_4)&plus;\cdots&plus;2f(x_{N-1})&space;&plus;&space;f(x_N)\right)" title="https://latex.codecogs.com/svg.image?\large \int_a^b f(x) \, dx \approx \sum_{k=1}^N \frac{f(x_{k-1}) + f(x_k)}{2} \Delta x_k = \tfrac{\Delta x}{2}\left(f(x_0) + 2f(x_1)+2f(x_2)+ 2f(x_3)+2f(x_4)+\cdots+2f(x_{N-1}) + f(x_N)\right)" />
<!--
\int_a^b f(x) \, dx \approx \sum_{k=1}^N \frac{f(x_{k-1}) + f(x_k)}{2} \Delta x_k = \tfrac{\Delta x}{2}\left(f(x_0) + 2f(x_1)+2f(x_2)+ 2f(x_3)+2f(x_4)+\cdots+2f(x_{N-1}) + f(x_N)\right)
-->
- Produce a trapezoid-rule integrator which accepts a wet gate (as a function of a single variable) and a list of _x_ values, and yields the integral as a `@rs` floating-point value. (If you are not yet familiar with these, you may wish to skip ahead to the next lesson.)
```hoon
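++  trapezint
  |*  [a=(list @rs) b=gate]
  =/  n  (lent a)
  =/  k  1
  =/  sum  .0
  |-  ^-  @rs
  ::  stop once every interval [x_(k-1), x_k] has been added
  ?:  (gte k n)  sum
  =/  x0  (snag (dec k) a)
  =/  x1  (snag k a)
  ::  area of one trapezoid: (x1 - x0) * (f(x0) + f(x1)) / 2
  =/  area  (div:rs (mul:rs (sub:rs x1 x0) (add:rs (b x0) (b x1))) .2)
  $(k +(k), sum (add:rs sum area))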
```
The meat of this gate is concerned with correctly implementing the mathematical equation. In particular, wetness is required because `b` can be _any_ gate (although it should only be a gate with one argument, lest the whole thing `mull-grow` fail). If you attempt to create the equivalent dry gate (`|=` bartis), Hoon fails to build it with a `nest-fail` due to the loss of type information from the gate `b`.
#### Tutorial: `++need`
Wet gates and wet cores are used in Hoon when type information isn't well-characterized ahead of time, as when constructing `++map`s or `++set`s. For instance, almost all of the arms in `++by` and `++in`, as well as most `++list` tools, are wet gates.
Let's take a look at a particular wet gate from the Hoon standard library, [`++need`](https://urbit.org/docs/hoon/reference/stdlib/2a#need). `++need` works with a `unit` to produce the value of a successful `unit` call, or crash on `~`. (As this code is already defined in your `hoon.hoon`, you do not need to define it in the Dojo to use it.)
```hoon
++  need                                    ::  demand
  |*  a=(unit)
  ?~  a  ~>(%mean.'need' !!)
  u.a
```
Line by line:
```hoon
|*  a=(unit)
```
This declares a wet gate which accepts a `unit`.
```hoon
?~  a  ~>(%mean.'need' !!)
```
If `a` is empty, `~`, then the `unit` cannot be unwrapped. Crash with [`!!` zapzap](https://urbit.org/docs/hoon/reference/rune/zap#-zapzap), but use [`~>` siggar](https://urbit.org/docs/hoon/reference/rune/sig#-siggar) to hint to the runtime interpreter how to handle the crash.
```hoon
u.a
```
This returns the value in the `unit` since we now know it exists.
`++need` is wet because we don't want to lose type information when we extract from the `unit`.
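A short Dojo sketch makes the type preservation visible. The sample values here are arbitrary illustrations; the casts are included only so that each argument is unambiguously a `unit` of a known type:

```hoon
::  the unwrapped value keeps the aura of the unit's contents
> (need `(unit @ud)`[~ 5])
5
> (need `(unit @t)`[~ 'hello'])
'hello'
```

Had `++need` been a dry gate over `(unit)`, both results would have come back as bare nouns and the second would no longer print as a `cord`. Passing a `unit` that holds `~` crashes instead of returning.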
### Parametric Polymorphism
We encountered `|$` barbuc above: it defines a wet gate mold builder, which takes a list of molds and produces a new mold. Here we take another look at this rune as an implementation of _parametric polymorphism_ in Hoon.
For example, we have `list`s, `tree`s, and `set`s in Hoon, which are each defined in `hoon.hoon` as wet gate mold builders. Take a moment to see for yourself. Each `++` arm is followed by `|$` and a list of labels for input types inside brackets `[ ]`. After that subexpression comes another that defines a type that is parametrically polymorphic with respect to the input values. For example, here is the definition of `list` from `hoon.hoon`:
```hoon
++  list
  |$  [item]
  ::    null-terminated list
  ::
  ::  mold generator: produces a mold of a null-terminated list of the
  ::  homogeneous type {a}.
  ::
  $@(~ [i=item t=(list item)])
```
The `|$` barbuc rune is especially useful for defining containers of various kinds. Indeed, `list`s, `tree`s, and `set`s are all examples of containers that accept subtypes. You can have a `(list @)`, a `(list ^)`, a `(list *)`, a `(tree @)`, a `(tree ^)`, a `(tree *)`, etc. The same holds for `set`.
One nice thing about containers defined by `|$` is that they nest in the expected way. Intuitively a `(list @)` should nest under `(list *)`, because `@` nests under `*`. And so it does:
```hoon
> =a `(list @)`~[11 22 33]
> ^-((list *) a)
~[11 22 33]
```
Conversely, a `(list *)` should not nest under `(list @)`, because `*` does not nest under `@`:
```hoon
> =b `(list *)`~[11 22 33]
> ^-((list @) b)
nest-fail
```
## Variance
Dry polymorphism works by substituting cores. Typically, one core is used as the interface definition, then replaced with another core which does something useful.
For core `b` to nest within core `a`, the batteries of `a` and `b` must have the same tree shape, and the product of each `b` arm must nest within the product of the `a` arm. Wet arms (described above) are not compatible unless the Hoon expression is exactly the same. But for dry cores we also apply a payload test that depends on the rules of variance.
There are four kinds of cores: `%gold`, `%iron`, `%zinc`, and `%lead`. You are able to use core-variance rules to create programs which take other programs as arguments. Which particular rules depends on which kind of core your program needs to complete.
Before we embark on the following discussion, we want you to know that [variance](https://en.wikipedia.org/wiki/Covariance_and_contravariance_%28computer_science%29) is a bright-line idea, much like cores themselves, which once you “get” illuminates you further about Hoon-nature. For the most part, though, you don't need to worry about core variance much unless you are writing kernel code, since it impinges on how cores evaluate with other cores as inputs. Don't sweat it if it takes a while for core variance to click for you. (If you want to dig into resources, check out Meyer type theory. The rules should make sense if you think about them intuitively and don't get hung up on terminology.) You should read up on the [Liskov substitution principle](https://en.wikipedia.org/wiki/Liskov_substitution_principle) if you want to dive deeper. [Vadzim Vysotski](https://vadzimv.dev/2019/10/01/generic-programming-part-1-introduction.html) and [Jamie Kyle](https://medium.com/@thejameskyle/type-systems-covariance-contravariance-bivariance-and-invariance-explained-35f43d1110f8) explain the theory of type system variance accessibly, while [Eric Lippert](https://archive.ph/QmiqB) provides a more technical description. There are many wrinkles that particular languages, such as object-oriented programming languages, introduce which we can elide here.
<!--
https://stackoverflow.com/questions/37467882/why-does-c-sharp-use-contravariance-not-covariance-in-input-parameters-with-de
https://docs.microsoft.com/en-us/dotnet/standard/generics/covariance-and-contravariance
--->
Briefly, computer scientist Eric Lippert [clarifies](https://stackoverflow.com/questions/37467882/why-does-c-sharp-use-contravariance-not-covariance-in-input-parameters-with-de) that “variance is a fact about the preservation of an assignment compatibility relationship across a transformation of types.” What trips learners up about variance is that **variance rules apply to the input and output of a core, not directly to the core itself**. A core has a _variance property_, but that property doesn't manifest until cores are used together with each other.
Variance describes the four possible relationships that type rules are able to have to each other. Hoon imaginatively designates these by metals. Briefly:
1. **Covariance (`%zinc`)** means that specific types nest inside of generic types: it's like claiming that a core that produces a `%plant` can produce a `%tree`, a subcategory of `%plant`. Covariance is useful for flexibility in return values.
2. **Contravariance (`%iron`)** means that generic types are expected to nest inside of specific types: it's like claiming that a core that can accept a `%tree` can accept a `%plant`, the supercategory of `%tree`. (Contravariance seems counterintuitive for many developers when they encounter it for the first time.) Contravariance is useful for flexibility in input values (`sample`s).
3. **Bivariance (`%lead`)** means that we can allow both covariant and contravariant behavior. While bivariance is included for completeness (including a worked example below), it is not commonly used and only a few examples exist in the standard library for building shared data structure cores.
4. **Invariance (`%gold`)** means that types must mutually nest compatibly: a core that accepts or produces a `%tree` can only accept or produce a `%tree`. This is the default behavior of cores, so it's the strongest model you have imprinted on. Cores which allow variance are changing that behavior.
A `%gold` core can be cast or converted to any metal, and any metal can be cast or converted to `%lead`.
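The conversion rules can be sketched in the Dojo with the three conversion runes (`^|` ketbar, `^&` ketpam, and `^?` ketwut). The face names below (`gold-inc` and so on) are hypothetical, chosen only for this illustration:

```hoon
::  an ordinary gate is %gold by default
> =gold-inc |=(a=@ +(a))
::  a %gold core converts to any other metal
> =iron-inc ^|(gold-inc)
> =zinc-inc ^&(gold-inc)
> =lead-inc ^?(gold-inc)
::  any metal converts to %lead
> =lead-too ^?(iron-inc)
::  %gold and %iron gates can still be called as functions
> (gold-inc 10)
11
> (iron-inc 10)
11
```

The conversions only go in the direction of less access: once a core has been made `%iron`, `%zinc`, or `%lead`, the type information that was hidden cannot be recovered by casting it back to `%gold`.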
<!--
TODO
would be nice to explain similar to aura nesting rules, but at the core level
https://medium.com/@thejameskyle/type-systems-covariance-contravariance-bivariance-and-invariance-explained-35f43d1110f8
-->
### `%zinc` Covariance
Covariance means that specific types nest inside of generic types: `%tree` nests inside of `%plant`. Covariant data types are sources, or read-only values.
A zinc core `z` has a read-only sample (payload head, `+6.z`) and an opaque context (payload tail, `+7.z`). (_Opaque_ here means that the faces and arms are not exported into the namespace, and that the values of faces and arms can't be written to. The object in question can be replaced by something else without breaking type safety.) A core `y` which nests within it must be a gold or zinc core, such that `+6.y` nests within `+6.z`. Hence, **covariant**.
<!-- If type `x` nests within type `xx`, and type `y` nests within type `yy`, then a core accepting `yy` and producing `x` nests within an iron core accepting `y` and producing `xx`. TODO not adjusted yet -->
You can read from the sample of a `%zinc` core, but not change it:
```hoon
> =mycore ^&(|=(a=@ 1))
> a.mycore
0
> mycore(a 22)
-tack.a
-find.a
ford: %slim failed:
ford: %ride failed to compute type:
```
The [`^&` ketpam](https://urbit.org/docs/hoon/reference/rune/ket#-ketpam) rune converts a core to a `%zinc` covariant core.
### `%iron` Contravariance
Contravariance means that generic types nest inside of specific types. Contravariant data types are sinks, or write-only values.
An `%iron` core `i` has a write-only sample (payload head, `+6.i`) and an opaque context (payload tail, `+7.i`). A core `j` which nests within it must be a `%gold` or `%iron` core, such that `+6.i` nests within `+6.j`. Hence, **contravariant**.
If type `x` nests within type `xx`, and type `y` nests within type `yy`, then a core accepting `yy` and producing `x` nests within an iron core accepting `y` and producing `xx`.
Informally, a function fits an interface if the function has a more specific result and/or a less specific argument than the interface.
For instance, the archetypal Gall agents in `/sys/lull.hoon` are composed using iron gates since they will be used as examples for building actual agent cores. The `++rs` and sister gates in `/sys/hoon.hoon` are built using iron doors with specified rounding behavior so when you actually use the core (like `++add:rs`) the core you are using has been built as an example.
The [`|~` barsig](https://urbit.org/docs/hoon/reference/rune/bar#-barsig) rune produces an iron gate. The [`^|` ketbar](https://urbit.org/docs/hoon/reference/rune/ket#-ketbar) rune converts a `%gold` invariant core to an iron core.
### `%lead` Bivariance
Bivariance means that both covariance and contravariance apply. Bivariant data types have an opaque `payload` that can be neither read nor written to.
A lead core `l` has an opaque `payload` which can be neither read nor written to. There is no constraint on the payload of a core `m` which nests within it. Hence, **bivariant**.
If type `x` nests within type `xx`, a lead core producing `x` nests within a lead core producing `xx`.
Bivariant data types are neither readable nor writeable, but have no constraints on nesting. These are commonly used for `/mar` marks and `/sur` structure files. They are useful as examples which produce types.
Informally, a more specific generator can be used as a less specific generator.
For instance, several archetypal cores in `/sys/lull.hoon` which define operational data structures for Arvo are composed using lead gates.
The [`|?` barwut](https://urbit.org/docs/hoon/reference/rune/bar#-barwut) rune produces a lead trap. The [`^?` ketwut](https://urbit.org/docs/hoon/reference/rune/ket#-ketwut) rune converts any core to a `%lead` bivariant core.
### `%gold` Invariance
Invariance means that type nesting is disallowed. Invariant data types have a read-write `payload`.
A `%gold` core `g` has a read-write payload; another core `h` that nests within it (i.e., can be substituted for it) must be a `%gold` core whose `payload` is mutually compatible (`+3.g` nests in `+3.h`, `+3.h` nests in `+3.g`). Hence, **invariant**.
By default, cores are `%gold` invariant cores.
### Illustrations
#### Tutorial: `%gold` Invariant Polymorphism
Usually it makes sense to cast for a `%gold` core type when you're treating a core as a state machine. The check ensures that the payload, which includes the relevant state, doesn't vary in type.
Let's look at simpler examples here, using the `^+` ketlus rune:
```hoon
> ^+(|=(^ 15) |=(^ 16))
< 1.jcu
[ [* *]
[our=@p now=@da eny=@uvJ]
<17.bny 33.ehb 14.dyd 53.vlb 77.lrt 232.oiq 51.qbt 123.zao 46.hgz 1.pnw %140>
]
>
> ^+(|=(^ 15) |=([@ @] 16))
mint-nice
-need.@
-have.*
nest-fail
> ^+(|=(^ 15) |=(* 16))
mint-nice
-need.[* *]
-have.*
nest-fail
```
The first cast goes through because the right-hand gold core has the same sample type as the left-hand gold core. The sample types mutually nest. The second cast fails because the right-hand sample type is more specific than the left-hand sample type. (Not all cells, `^`, are pairs of atoms, `[@ @]`.) And the third cast fails because the right-hand sample type is broader than the left-hand sample type. (Not all nouns, `*`, are cells, `^`.)
Two more examples:
```hoon
> ^+(=>([1 2] |=(@ 15)) =>([123 456] |=(@ 16)))
<1.xqz {@ @ud @ud}>
> ^+(=>([1 2] |=(@ 15)) =>([123 456 789] |=(@ 16)))
nest-fail
```
In these examples, the `=>` rune is used to give each core a simple context. The context of the left-hand core in each case is a pair of atoms, `[@ @]`. The first cast goes through because the right-hand core also has a pair of atoms as its context. The second cast fails because the right-hand core has the wrong type of context -- three atoms, `[@ @ @]`.
#### Tutorial: `%iron` Contravariant Polymorphism
`%iron` gates are particularly useful when you want to pass gates (having various payload types) to other gates. We can illustrate this use with a very simple example. Save the following as `/gen/gatepass.hoon` in your `%base` desk:
```hoon
|=  a=_^|(|=(@ 15))
^-  @
=/  b=@  (a 10)
(add b 20)
```
This generator is rather simple except for the first line. The sample is defined as an `%iron` gate and given the face `a`. The function as a whole takes some gate as input, calls it with the value `10`, adds `20` to the result, and returns the sum. Let's try it out in the Dojo:
```hoon
> +gatepass |=(a=@ +(a))
31
> +gatepass |=(a=@ (add 3 a))
33
> +gatepass |=(a=@ (mul 3 a))
50
```
But we still haven't fully explained the first line of the code. What does `_^|(|=(@ 15))` mean? The inside portion is clear enough: `|=(@ 15)` produces a normal (i.e., `%gold`) gate that takes an atom and returns `15`. The [`^|` ketbar](https://urbit.org/docs/hoon/reference/rune/ket#-ketbar) rune is used to turn `%gold` gates to `%iron`. (Reverse alchemy!) And the `_` character turns that `%iron` gate value into a structure, i.e. a type. So the whole subexpression means, roughly: “the same type as an iron gate whose sample is an atom, `@`, and whose product is another atom, `@`”. The context isn't checked at all. This is good, because that allows us to accept gates defined and produced in drastically different environments. Let's try passing a gate with a different context:
```hoon
> +gatepass =>([22 33] |=(a=@ +(a)))
31
```
It still works. You can't do that with a gold core sample!
There's a simpler way to define an iron sample. Revise the first line of `/gen/gatepass.hoon` to the following:
```hoon
|=  a=$-(@ @)
^-  @
=/  b=@  (a 10)
(add b 20)
```
If you test it, you'll find that the generator behaves the same as it did before the edits. The [`$-` buchep](https://urbit.org/docs/hoon/reference/rune/buc#--buchep) rune is used to create an `%iron` gate structure, i.e., an `%iron` gate type. The first expression defines the desired sample type, and the second subexpression defines the gate's desired output type.
The sample type of an `%iron` gate is contravariant. This means that, when doing a cast with some `%iron` gate, the desired gate must have either the same sample type or a superset.
Why is this a useful nesting rule for passing gates? Let's say you're writing a function `F` that takes as input some gate `G`. Let's also say you want `G` to be able to take as input any **mammal**. The code of `F` is going to pass arbitrary **mammals** to `G`, so that `G` needs to know how to handle all **mammals** correctly. You can't pass `F` a gate that only takes **dogs** as input, because `F` might call it with a **cat**. But `F` can accept a gate that takes all **animals** as input, because a gate that can handle any **animal** can handle **any mammal**.
`%iron` cores are designed precisely with this purpose in mind. The reason that the sample is write-only is that we want to be able to assume, within function `F`, that the sample of `G` is a **mammal**. But that might not be true when `G` is first passed into `F`—the default value of `G` could be another **animal**, say, a **lizard**. So we restrict looking into the sample of `G` by making the sample write-only. The illusion is maintained and type safety secured.
Let's illustrate `%iron` core nesting properties:
```hoon
> ^+(^|(|=(^ 15)) |=(^ 16))
< 1|jcu
[ [* *]
[our=@p now=@da eny=@uvJ]
<17.bny 33.ehb 14.dyd 53.vlb 77.lrt 232.oiq 51.qbt 123.zao 46.hgz 1.pnw %140>
]
>
> ^+(^|(|=(^ 15)) |=([@ @] 16))
mint-nice
-need.@
-have.*
nest-fail
> ^+(^|(|=(^ 15)) |=(* 16))
< 1|jcu
[ [* *]
[our=@p now=@da eny=@uvJ]
<17.bny 33.ehb 14.dyd 53.vlb 77.lrt 232.oiq 51.qbt 123.zao 46.hgz 1.pnw %140>
]
>
```
(As before, we use the `^|` ketbar rune to turn `%gold` gates to `%iron`.)
The first cast goes through because the two gates have the same sample type. The second cast fails because the right-hand gate has a more specific sample type than the left-hand gate does. If you're casting for a gate that accepts any cell, `^`, it's because we want to be able to pass any cell to it. A gate that is only designed for pairs of atoms, `[@ @]`, can't handle all such cases, naturally. The third cast goes through because the right-hand gate sample type is broader than the left-hand gate sample type. A gate that can take any noun as its sample, `*`, works just fine if we choose only to pass it cells, `^`.
We mentioned previously that an `%iron` core has a write-only sample and an opaque context. Let's prove it.
Let's define a trivial gate with a context of `[g=22 h=44 .]`, convert it to `%iron` with `^|`, and bind it to `iron-gate` in the dojo:
```hoon
> =iron-gate ^|  =>([g=22 h=44 .] |=(a=@ (add a g)))
> (iron-gate 10)
32
> (iron-gate 11)
33
```
Not a complicated function, but it serves our purposes. Normally (i.e., with `%gold` cores) we can look at a context value `p` of some gate `q` with a wing expression: `p.q`. Not so with the iron gate:
```hoon
> g.iron-gate
-find.g.iron-gate
```
And usually we can look at the sample value using the face given in the gate definition. Not in this case:
```hoon
> a.iron-gate
-find.a.iron-gate
```
If you really want to look at the sample you can check `+6` of `iron-gate`:
```hoon
> +6.iron-gate
0
```
… and if you really want to look at the head of the context (i.e., where `g` is located, `+14`) you can:
```hoon
> +14.iron-gate
22
```
… but in both cases all the relevant type information has been thrown away:
```hoon
> -:!>(+6.iron-gate)
#t/*
> -:!>(+14.iron-gate)
#t/*
```
#### Tutorial: `%zinc` Covariant Polymorphism
As with `%iron` cores, the context of `%zinc` cores is opaque: it cannot be written to or read from. The sample of a `%zinc` core is read-only. That means, among other things, that `%zinc` cores cannot be used for function calls. Function calls in Hoon involve a change to the sample (the default sample is replaced with the argument value), which is disallowed as type-unsafe for `%zinc` cores.
We can illustrate the casting properties of `%zinc` cores with a few examples. The [`^&` ketpam](https://urbit.org/docs/hoon/reference/rune/ket#-ketpam) rune is used to convert `%gold` cores to `%zinc`:
```hoon
> ^+(^&(|=(^ 15)) |=(^ 16))
< 1&jcu
[ [* *]
[our=@p now=@da eny=@uvJ]
<17.bny 33.ehb 14.dyd 53.vlb 77.lrt 232.oiq 51.qbt 123.zao 46.hgz 1.pnw %140>
]
>
> ^+(^&(|=(^ 15)) |=([@ @] 16))
< 1&jcu
[ [* *]
[our=@p now=@da eny=@uvJ]
<17.bny 33.ehb 14.dyd 53.vlb 77.lrt 232.oiq 51.qbt 123.zao 46.hgz 1.pnw %140>
]
>
> ^+(^&(|=(^ 15)) |=(* 16))
mint-nice
-need.[* *]
-have.*
nest-fail
```
The first two casts succeed because the right-hand core sample type is either the same or a subset of the left-hand core sample type. The last one fails because the right-hand sample type is a superset.
Even though you can't function call a `%zinc` core, the arms of a `%zinc` core can be computed and the sample can be read. Let's test this with a `%zinc` gate of our own:
```hoon
> =zinc-gate ^&  |=(a=_22 (add 10 a))
> (zinc-gate 12)
payload-block
> a.zinc-gate
22
> $.zinc-gate
32
```
#### Tutorial: `%lead` Bivariant Polymorphism
`%lead` cores have more permissive nesting rules than either `%iron` or `%zinc` cores. There is no restriction on which payload types nest. That means, among other things, that the payload type of a `%lead` core is both covariant and contravariant (bivariant).
In order to preserve type safety when working with `%lead` cores, a severe restriction is needed. The whole payload of a `%lead` core is opaque: the payload can be neither written to nor read from. For this reason, as was the case with `%zinc` cores, `%lead` cores cannot be called as functions.
The arms of a `%lead` core can still be evaluated, however. We can use the `^?` rune to convert a `%gold`, `%iron`, or `%zinc` core to lead:
```hoon
> =lead-gate ^?  |=(a=_22 (add 10 a))
> $.lead-gate
32
```
But don't try to read the sample:
```hoon
> a.lead-gate
-find.a.lead-gate
```
#### Tutorial: The Fibonacci Series with `%lead` Cores
- Calculate the Fibonacci series using `%lead` and `%iron` cores.
This program produces a list populated by the first ten elements of the `++fib` arm. It consists of five arms; in brief:
- `++fib` is a trap (core with no sample and default arm `$` buc)
- `++stream` is a mold builder that produces a trap, a function with no argument. This trap can yield a value or a `~`.
- `++stream-type` is a wet gate that produces the type of items stored in `++stream`.
- `++to-list` is a wet gate that converts a `++stream` to a `list`.
- `++take` is a wet gate that takes a `++stream` and an atom and yields a modified subject (!) and another trap of `++stream`'s type.
**`/gen/fib.hoon`**
```hoon
=<  (to-list (take fib 10))
|%
++  stream
  |*  of=mold
  $_  ^?  |.
  ^-  $@(~ [item=of more=^$])
  ~
++  stream-type
  |*  s=(stream)
  $_  =>  (s)
  ?~  .  !!
  item
++  to-list
  |*  s=(stream)
  %-  flop
  =|  r=(list (stream-type s))
  |-  ^+  r
  =+  (s)
  ?~  -  r
  %=  $
    r  [item r]
    s  more
  ==
++  take
  |*  [s=(stream) n=@]
  =|  i=@
  ^+  s
  |.
  ?:  =(i n)  ~
  =+  (s)
  ?~  -  ~
  :-  item
  %=  ..$
    i  +(i)
    s  more
  ==
++  fib
  ^-  (stream @ud)
  =+  [p=0 q=1]
  |.  :-  q
  %=  .
    p  q
    q  (add p q)
  ==
--
```
Let's examine each arm in detail.
##### `++stream`
```hoon
++  stream
  |*  of=mold
  $_  ^?  |.
  ^-  $@(~ [item=of more=^$])
  ~
```
`++stream` is a mold-builder. It's a wet gate that takes one argument, `of`, which is a `mold`, and produces a `%lead` trap—a function with no `sample` and an arm `$` buc, with opaque `payload`.
`$_` buccab is a rune that produces a type from an example; `^?` ketwut converts (casts) a core to lead; `|.` bardot forms the trap. So to follow this sequence we read it backwards: we create a trap, convert it to a lead trap (making its payload inaccessible), and then use that lead trap as an example from which to produce a type.
With the line `^- $@(~ [item=of more=^$])`, the output of the trap will be cast into a new type. `$@` bucpat is the rune to describe a data structure that can either be an atom or a cell. The first part describes the atom, which here is going to be `~`. The second part describes a cell, which we define to have the head of type `of` with the face `item`, and a tail with a face of `more`. The expression `^$` is not a rune (no children), but rather a reference to the enclosing wet gate, so the tail of this cell will be of the same type produced by this wet gate.
The final `~` here is used as the type produced when initially calling this wet gate. This is valid because it nests within the type we defined on the previous line.
Now you can see that a `++stream` is either `~` or a pair of a value of some type and a `++stream`. This type can represent a potentially infinite series, because each `more` is itself a trap that is only evaluated on demand.
##### `++stream-type`
```hoon
++  stream-type
  |*  s=(stream)
  $_  =>  (s)
  ?~  .  !!
  item
```
`++stream-type` is a wet gate that produces the type of items stored in the `stream` arm. The `(stream)` syntax is a shortcut for `(stream *)`; a stream of some type.
Calling a `++stream`, which is a trap, will either produce `item` and `more` or it will produce `~`. If it does produce `~`, the `++stream` is empty and we can't find what type it is, so we simply crash with `!!` zapzap.
##### `++take`
```hoon
++  take
  |*  [s=(stream) n=@]
  =|  i=@
  ^+  s
  |.
  ?:  =(i n)  ~
  =+  (s)
  ?~  -  ~
  :-  item
  %=  ..$
    i  +(i)
    s  more
  ==
```
`++take` is another wet gate. This time it takes a `++stream` `s` and an atom `n`. We add an atom to the subject and then make sure that the trap we are creating is going to be of the same type as `s`, the `++stream` we passed in.
If `i` and `n` are equal, the trap will produce `~`. If not, `s` is called and has its result put on the front of the subject. If its value is `~`, then the trap again produces `~`. Otherwise the trap produces a cell of `item`, the first part of the value of `s`, and a new trap that increments `i`, and sets `s` to be the `more` trap which produces the next value of the `++stream`. The result here is a `++stream` that will only ever produce `n` items, even if the stream otherwise would have been infinite.
##### `++to-list`
```hoon
++  to-list
  |*  s=(stream)
  %-  flop
  =|  r=(list (stream-type s))
  |-  ^+  r
  =+  (s)
  ?~  -  r
  %=  $
    r  [item r]
    s  more
  ==
```
`++to-list` is a wet gate that takes `s`, a `++stream`, and, as you may expect, produces a `list`. The rest of this wet gate is straightforward but we can examine it quickly anyway. Each item is prepended to the list as it is read, so the list is built in reverse; `flop` is then used to restore the order of the stream. Recall that adding to the front of a list is cheap, while adding to the back is expensive.
`r` is added to the subject as an empty `list` of whatever type is produced by `s`. A new trap is formed and called, and it will produce the same type as `r`. Then `s` is called and has its value added to the subject. If the result is `~`, the trap produces `r`. Otherwise, we want to call the trap again, adding `item` to the front of `r` and changing `s` to `more`. Now the utility of `take` should be clear. We don't want to feed `to-list` an infinite stream as it would never terminate.
##### `++fib`
```hoon
++  fib
  ^-  (stream @ud)
  =+  [p=0 q=1]
  |.  :-  q
  %=  .
    p  q
    q  (add p q)
  ==
```
The final arm in our core is `++fib`, which is a `++stream` of `@ud` and therefore is a `%lead` core. Its subject contains `p` and `q`, which will not be accessible outside of this trap, but because of the `%=` centis will be retained in their modified form in the product trap. The product of the trap is a pair (`:-` colhep) of an `@ud` and the trap that will produce the next `@ud` in the Fibonacci series.
```hoon
=<  (to-list (take fib 10))
```
Finally, the first line of our program will take the first 10 elements of `fib` and produce them as a list.
```hoon
~[1 1 2 3 5 8 13 21 34 55]
```
This example is a bit overkill for simply calculating the Fibonacci series, but it illustrates how you could use `%lead` cores. Instead of `++fib`, you can supply any infinite sequence and `++stream` will correctly handle it.
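For readers who want to see the same shape outside of Hoon, here is a rough Python analogue of the stream pattern. It is only a sketch: Python has no notion of `%lead` cores or opaque payloads, so closures stand in for traps, `None` stands in for `~`, and the names `fib`, `take`, and `to_list` simply mirror the arms above.

```python
from typing import Callable, List, Optional, Tuple

# A "stream" is a zero-argument function (a stand-in for a trap) that returns
# either None (like ~) or a pair (item, more), where more is another stream.
Stream = Callable[[], Optional[Tuple[int, "Stream"]]]


def fib(p: int = 0, q: int = 1) -> Stream:
    """Infinite Fibonacci stream; p and q are hidden inside the closure."""
    def trap():
        return (q, fib(q, p + q))
    return trap


def take(s: Stream, n: int, i: int = 0) -> Stream:
    """Wrap s so that it yields at most n items."""
    def trap():
        if i == n:
            return None
        result = s()
        if result is None:
            return None
        item, more = result
        return (item, take(more, n, i + 1))
    return trap


def to_list(s: Stream) -> List[int]:
    """Drain a finite stream into a list."""
    out = []
    result = s()
    while result is not None:
        item, more = result
        out.append(item)
        result = more()
    return out


print(to_list(take(fib(), 10)))  # [1, 1, 2, 3, 5, 8, 13, 21, 34, 55]
```

As in the Hoon version, `to_list` would never return if handed the unbounded `fib()` directly, which is why `take` is applied first.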
#### Exercise: `%lead` Bivariant Polymorphism
- Produce a `%say` generator that yields another self-referential sequence, like the [Lucas numbers](https://en.wikipedia.org/wiki/Lucas_number) or the [Thue-Morse sequence](https://en.wikipedia.org/wiki/Thue%E2%80%93Morse_sequence).

View File

@ -0,0 +1,989 @@
---
title: Mathematics
nodes: 234, 236, 284
objectives:
- "Review floating-point mathematics including IEEE-754."
- "Examine `@r` atomic representation of floating-point values."
- "Manipulate and convert floating-point values using the `@r` operations."
- "Examine `@s` atomic representation of signed integer values."
- "Use `+si` to manipulate `@s` signed integer values."
- "Define entropy and its source."
- "Utilize `eny` in a random number generator (`og`)."
- "Distinguish insecure hashing (`mug`) from secure hashing (`shax` and friends)."
---
# Mathematics
_This module introduces how non-`@ud` mathematics are instrumented in Hoon. It may be considered optional and skipped if you are speedrunning Hoon School._
All of the math we've done until this point relied on unsigned integers: there was no negative value possible, and there were no numbers with a fractional part. How can we work with mathematics that require more than just bare unsigned integers?
`@u` unsigned integers (whether `@ud` decimal, `@ux` hexadecimal, etc.) simply count upwards by binary place value from zero. However, if we apply a different interpretive rule to the resulting value, we can treat the integer (in memory) _as if_ it corresponded to a different real value, such as a [negative number](https://en.wikipedia.org/wiki/Integer) or a [number with a fractional part](https://en.wikipedia.org/wiki/Rational_number). Auras make this straightforward to explore:
```hoon
> `@ud`1.000.000
1.000.000
> `@ux`1.000.000
0xf.4240
> `@ub`1.000.000
0b1111.0100.0010.0100.0000
> `@sd`1.000.000
--500.000
> `@rs`1.000.000
.1.401298e-39
> `@rh`1.000.000
.~~3.125
> `@t`1.000.000
'@B\0f'
```
How can we actually treat other modes of interpreting numbers as mathematical quantities correctly? That's the subject of this lesson.
(Ultimately, we are using a concept called [Gödel numbering](https://en.wikipedia.org/wiki/G%C3%B6del_numbering) to justify mapping some data to a particular representation as a unique integer.)
## Floating-Point Mathematics
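A number with a fractional part is called a “floating-point number” in computer science. The name derives from how the representation solves the problem of storing the part less than one: the position of the radix point is not fixed but “floats” according to a stored exponent.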
Consider for a moment how you would represent a regular decimal fraction if you only had integers available. You would probably adopt one of three strategies:
1. [**Rational numbers**](https://en.wikipedia.org/wiki/Fraction). Track whole-number ratios like fractions. Thus 1.25 = 5/4, thence the pair `(5, 4)`. Two numbers have to be tracked: the numerator and the denominator.
2. [**Fixed-point**](https://en.wikipedia.org/wiki/Fixed-point_arithmetic). Track the value in smaller fixed units (such as thousandths). By defining the base unit to be ¹/₁₀₀₀, 1.25 may be written 1250. One number needs to be tracked: the value in terms of the scale. (This is equivalent to rational numbers with only a fixed denominator allowed.)
3. [**Floating-point**](https://en.wikipedia.org/wiki/Floating-point_arithmetic). Track the value at adjustable scale. In this case, one needs to represent 1.25 as something like 125 × 10⁻². Two numbers have to be tracked: the significand (125) and the exponent (-2).
Most systems use floating-point mathematics to solve this problem. For instance, single-precision floating-point mathematics designate one bit for the sign, eight bits for the exponent (which has 127 subtracted from it), and twenty-three bits for the significand (the fractional bits that follow an implied leading 1).
![](https://upload.wikimedia.org/wikipedia/commons/thumb/d/d2/Float_example.svg/640px-Float_example.svg.png)
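This number, `0b11.1110.0010.0000.0000.0000.0000.0000`, is converted to decimal as (-1)⁰ × 2¹²⁴⁻¹²⁷ × 1.25 = 2⁻³ × 1.25 = 0.15625. Here the sign bit is 0, the exponent field is `0111.1100` = 124, and the significand bits begin `01`, adding 0.25 to the implied leading 1.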
(If you want to explore the bitwise representation of values, [this tool](https://evanw.github.io/float-toy/) allows you to tweak values directly and see the results.)
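The decoding rule above can also be checked mechanically. The following Python sketch (an illustration outside Urbit; `decode_f32` is a hypothetical helper name, and it handles only normal numbers, not zeros, subnormals, infinities, or NaN) splits a 32-bit pattern into its three fields and recomputes the value, then cross-checks against Python's own IEEE 754 decoding via the standard `struct` module.

```python
import struct


def decode_f32(bits: int):
    """Split a 32-bit pattern into (sign, exponent, fraction, value).

    Assumes a *normal* single-precision number (exponent field not 0 or 255).
    """
    sign = bits >> 31                 # 1 bit
    exponent = (bits >> 23) & 0xFF    # 8 bits, biased by 127
    fraction = bits & 0x7FFFFF        # 23 bits after the implied leading 1
    value = (-1) ** sign * 2.0 ** (exponent - 127) * (1 + fraction / 2 ** 23)
    return sign, exponent, fraction, value


example = 0b0011_1110_0010_0000_0000_0000_0000_0000
print(decode_f32(example))            # (0, 124, 2097152, 0.15625)

# Cross-check with the platform's own IEEE 754 decoding.
print(struct.unpack(">f", example.to_bytes(4, "big"))[0])  # 0.15625
```

The same function applied to `0x3f80.0000` (written `0x3F800000` in Python) gives exactly 1.0.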
### Hoon Operations
Hoon utilizes the [IEEE 754](https://en.wikipedia.org/wiki/IEEE_754) implementation of floating-point math for four bitwidth representations.
| Aura | Meaning | Example |
| ---- | ------- | ------- |
| `@r` | Floating-point value | |
| `@rh` | Half-precision 16-bit mathematics | `.~~4.5` |
| `@rs` | Single-precision 32-bit mathematics | `.4.5` |
| `@rd` | Double-precision 64-bit mathematics | `.~4.5` |
| `@rq` | Quadruple-precision 128-bit mathematics | `.~~~4.5` |
There are also a few molds which can represent the separate values of the FP representation. These are used internally but mostly don't appear in userspace code.
As the arms for the four `@r` auras are identical within their appropriate core, we will use [`@rs` single-precision floating-point mathematics](https://urbit.org/docs/hoon/reference/stdlib/3b#rs) to demonstrate all operations.
#### Conversion to and from other auras
Because an `@rs` is stored as an ordinary atom, it can be cast directly to a `@ud` unsigned decimal integer (and a `@ud` can likewise be cast directly to an `@rs`):
```hoon
> `@ud`.1
1.065.353.216
```
However, as you can see here, the conversion is not “correct” for the perceived values. Examining the `@ux` hexadecimal and `@ub` binary representation shows why:
```hoon
> `@ux`.1
0x3f80.0000
> `@ub`.1
0b11.1111.1000.0000.0000.0000.0000.0000
```
If you refer back to the 32-bit floating-point example above, you'll see why: to represent one exactly, we have to use 1.0 = (-1)⁰ × 2¹²⁷⁻¹²⁷ × 1 and thus `0b11.1111.1000.0000.0000.0000.0000.0000`.
So to carry out this conversion from `@ud` to `@rs` correctly, we should use the [`++sun:rs`](https://urbit.org/docs/hoon/reference/stdlib/3b#sunrs) arm.
```hoon
> (sun:rs 1)
.1
```
To go the other way requires us to use an algorithm for converting an arbitrary number with a fractional part back into `@ud` unsigned integers. The `++drg:rs` arm serves this purpose: it uses the [Dragon4 algorithm](https://dl.acm.org/doi/10.1145/93548.93559) to produce a tagged tuple of the sign `s`, the decimal exponent `e`, and the integer significand `a`:
```hoon
> (drg:rs .1)
[%d s=%.y e=--0 a=1]
> (drg:rs .3.1415926535)
[%d s=%.y e=-7 a=31.415.927]
> (drg:rs .1000)
[%d s=%.y e=--3 a=1]
```
It's up to you to decide how to handle this result, however! Perhaps a better option for many cases is to round the answer to an `@s` integer with [`++toi:rs`](https://urbit.org/docs/hoon/reference/stdlib/3b#toirs):
```hoon
> (toi:rs .3.1415926535)
[~ --3]
```
(`@s` signed integer math is discussed below.)
### Floating-point specific operations
As with aura conversion, the standard mathematical operators don't work for `@rs`:
```hoon
> (add .1 1)
1.065.353.217
> `@rs`(add .1 1)
.1.0000001
```
The `++rs` core defines a set of `@rs`-affiliated operations which should be used instead:
```hoon
> (add:rs .1 .1)
.2
```
This includes:
- [`++add:rs`](https://urbit.org/docs/hoon/reference/stdlib/3b#addrs), addition
- [`++sub:rs`](https://urbit.org/docs/hoon/reference/stdlib/3b#subrs), subtraction
- [`++mul:rs`](https://urbit.org/docs/hoon/reference/stdlib/3b#mulrs), multiplication
- [`++div:rs`](https://urbit.org/docs/hoon/reference/stdlib/3b#divrs), division
- [`++gth:rs`](https://urbit.org/docs/hoon/reference/stdlib/3b#gthrs), greater than
- [`++gte:rs`](https://urbit.org/docs/hoon/reference/stdlib/3b#gters), greater than or equal to
- [`++lth:rs`](https://urbit.org/docs/hoon/reference/stdlib/3b#lthrs), less than
- [`++lte:rs`](https://urbit.org/docs/hoon/reference/stdlib/3b#lters), less than or equal to
- [`++equ:rs`](https://urbit.org/docs/hoon/reference/stdlib/3b#equrs), check equality (but not nearness!)
- [`++sqt:rs`](https://urbit.org/docs/hoon/reference/stdlib/3b#sqtrs), square root
#### Exercise: `++is-close`
The `++equ:rs` arm checks for complete equality of two values. The downside of this arm is that it doesn't find very close values:
```hoon
> (equ:rs .1 .1)
%.y
> (equ:rs .1 .0.9999999)
%.n
```
- Produce an arm which checks whether two values are close to each other by an absolute amount. It should accept three values: `a`, `b`, and `atol`. It should return the result of the following comparison:
<img src="https://latex.codecogs.com/svg.image?\large&space;|a-b|&space;\leq&space;\texttt{atol}" title="https://latex.codecogs.com/svg.image?\large |a-b| \leq \texttt{atol}" />
### `++rs` as a Door
What is `++rs`? It's a door with 21 arms:
```hoon
> rs
<21|hqd [r=?(%d %n %u %z) <51.qbt 123.ppa 46.hgz 1.pnw %140>]>
```
The battery of this core, pretty-printed as `21|hqd`, has 21 arms that define functions specifically for `@rs` atoms. One of these arms is named `++add`; it's a different `add` from the standard one we've been using for vanilla atoms, and it is the one we used above. When you invoke `add:rs` instead of just `add` in a function call, (1) the `rs` door is produced, and then (2) the name search for `add` resolves to the special `add` arm in `rs`. This produces the gate for adding `@rs` atoms:
```hoon
> add:rs
<1.uka [[a=@rs b=@rs] <21.hqd [r=?(%d %n %u %z) <51.qbt 123.ppa 46.hgz 1.pnw %140>]>]>
```
What about the sample of the `rs` door? The pretty-printer shows `r=?(%d %n %u %z)`. The `rs` sample can take one of four values: `%n`, `%u`, `%d`, and `%z`. These argument values represent four options for how to round `@rs` numbers:
- `%n` rounds to the nearest value
- `%u` rounds up
- `%d` rounds down
- `%z` rounds to zero
The default value is `%z`, round to zero. When we invoke `++add:rs` to call the addition function, there is no way to modify the `rs` door sample, so the default rounding option is used. How do we change it? We use the `~( )` notation: `~(arm door arg)`.
Let's evaluate the `add` arm of `rs`, also modifying the door sample to `%u` for 'round up':
```hoon
> ~(add rs %u)
<1.uka [[a=@rs b=@rs] <21.hqd [r=?(%d %n %u %z) <51.qbt 123.ppa 46.hgz 1.pnw %140>]>]>
```
This is the gate produced by `add`, and you can see that its sample is a pair of `@rs` atoms. But if you look in the context you'll see the `rs` door. Let's look in the sample of that core to make sure that it changed to `%u`. We'll use the wing `+6.+7` to look at the sample of the gate's context:
```hoon
> +6.+7:~(add rs %u)
r=%u
```
It did indeed change. We also see that the door sample uses the face `r`, so let's use that instead of the unwieldy `+6.+7`:
```hoon
> r:~(add rs %u)
%u
```
We can do the same thing for rounding down, `%d`:
```hoon
> r:~(add rs %d)
%d
```
Let's see the rounding differences in action. Because `~(add rs %u)` produces a gate, we can call it like we would any other gate:
```hoon
> (~(add rs %u) .3.14159265 .1.11111111)
.4.252704
> (~(add rs %d) .3.14159265 .1.11111111)
.4.2527037
```
This difference between rounding up and rounding down might seem strange at first. The two printed answers differ by about 0.0000003. Why does this gap exist? Single-precision floats have only 32 bits, so only finitely many values can be distinguished. The exact sum falls between two adjacent representable values, and the rounding mode decides which of the two neighbors is returned.
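To make the gap concrete, the sketch below (Python, outside Urbit, using only the standard `struct` and `fractions` modules; the helper names are illustrative) rounds the two inputs to single precision, computes their exact sum, and finds the two adjacent single-precision values that bracket it. It relies on the fact that, for positive floats, adding or subtracting 1 from the bit pattern moves to the next representable value.

```python
import struct
from fractions import Fraction


def to_f32(x: float) -> float:
    """Round a Python float (double) to the nearest single-precision value."""
    return struct.unpack("<f", struct.pack("<f", x))[0]


def f32_to_bits(x: float) -> int:
    return struct.unpack("<I", struct.pack("<f", x))[0]


def bits_to_f32(b: int) -> float:
    return struct.unpack("<f", struct.pack("<I", b))[0]


a = to_f32(3.14159265)
b = to_f32(1.11111111)
exact = Fraction(a) + Fraction(b)   # exact rational sum of the two inputs

# Find the representable neighbours on either side of the exact sum.
nearest = to_f32(a + b)
if Fraction(nearest) > exact:
    lower, upper = bits_to_f32(f32_to_bits(nearest) - 1), nearest
elif Fraction(nearest) < exact:
    lower, upper = nearest, bits_to_f32(f32_to_bits(nearest) + 1)
else:
    lower = upper = nearest

print(lower, upper, upper - lower)
```

For results between 4 and 8 the spacing between neighbors is 2⁻²¹ (about 4.8 × 10⁻⁷); the printed decimal forms in the dojo are shortened, which is why the visible difference looks slightly smaller.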
Just as there is a door for `@rs` functions, there is a Hoon standard library door for `@rd` functions (double-precision 64-bit floats), another for `@rq` functions (quad-precision 128-bit floats), and one more for `@rh` functions (half-precision 16-bit floats).
## Signed Integer Mathematics
Similar to floating-point representations, [signed integer](https://en.wikipedia.org/wiki/Signed_number_representations) representations use an internal bitwise convention to indicate whether a number should be treated as having a negative sign in front of the magnitude or not. There are several ways to represent signed integers:
1. [**Sign-magnitude**](https://en.wikipedia.org/wiki/Signed_number_representations#Sign%E2%80%93magnitude). Use the first bit in a fixed-bit-width representation to indicate whether the whole should be multiplied by -1, e.g. `0010.1011` for 43₁₀ and `1010.1011` for -43₁₀. (This is similar to the floating-point solution.)
2. [**One's complement**](https://en.wikipedia.org/wiki/Ones%27_complement). Use the bitwise `NOT` operation to represent the value, e.g. `0010.1011` for 43₁₀ and `1101.0100` for -43₁₀. This has the advantage that arithmetic operations are trivial, e.g. 43₁₀-41₁₀ = `0010.1011` + `1101.0110` = `1.0000.0001`, end-around carry the overflow to yield `0000.0010` = 2. (Some early computers used this scheme; most modern hardware uses the closely related [two's complement](https://en.wikipedia.org/wiki/Two%27s_complement) instead.)
3. [**Offset binary**](https://en.wikipedia.org/wiki/Offset_binary). This represents a number normally in binary _except_ that it counts from a point other than zero, like `-256`.
4. [**ZigZag**](https://developers.google.com/protocol-buffers/docs/encoding?hl=en#signed-ints). Positive signed integers correspond to even atoms of twice their absolute value, and negative signed integers correspond to odd atoms of twice their absolute value minus one.
There are tradeoffs in compactness of representation and efficiency of mathematical operations.
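The ZigZag rule in item 4 is small enough to write out in full. This Python sketch (function names are illustrative, not part of any Urbit API) maps a signed integer to its unsigned atom and back, following exactly the rule stated above: non-negative values become twice their magnitude, negative values become twice their magnitude minus one.

```python
def zigzag_encode(n: int) -> int:
    """Signed integer -> unsigned atom."""
    return 2 * n if n >= 0 else 2 * (-n) - 1


def zigzag_decode(a: int) -> int:
    """Unsigned atom -> signed integer."""
    return a // 2 if a % 2 == 0 else -((a + 1) // 2)


print(zigzag_encode(5))          # 10
print(zigzag_encode(-5))         # 9
print(zigzag_decode(1_000_000))  # 500000
```

The last line agrees with the earlier dojo result, where the atom `1.000.000` viewed as `@sd` printed as `--500.000`. Small magnitudes of either sign map to small atoms, which is the compactness property mentioned above.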
### Hoon Operations
`@u`-aura atoms are _unsigned_ values, but there is a complete set of _signed_ auras in the `@s` series. ZigZag was chosen for Hoon's signed integer representation because it represents negative values with small absolute magnitude as short binary terms.
| Aura | Meaning | Example |
| ---- | ------- | ------- |
| `@s` | signed integer| |
| `@sb` | signed binary | `--0b11.1000` (positive) |
| | | `-0b11.1000` (negative) |
| `@sd` | signed decimal | `--1.000.056` (positive) |
| | | `-1.000.056` (negative) |
| `@sx` | signed hexadecimal | `--0x5f5.e138` (positive) |
| | | `-0x5f5.e138` (negative) |
The [`++si`](https://urbit.org/docs/hoon/reference/stdlib/3a#si) core supports signed-integer operations correctly. However, unlike the `@r` operations, `@s` operations have different names (likely to avoid accidental mental overloading).
To produce a signed integer from an unsigned value, use [`++new:si`](https://urbit.org/docs/hoon/reference/stdlib/3a#newsi) with a sign flag, or simply use [`++sun:si`](https://urbit.org/docs/hoon/reference/stdlib/3a#sunsi) for a non-negative value:
```hoon
> (new:si & 2)
--2
> (new:si | 2)
-2
> `@sd`(sun:si 5)
--5
```
To recover an unsigned integer from a signed integer, use [`++old:si`](https://urbit.org/docs/hoon/reference/stdlib/3a#oldsi), which returns the sign and the magnitude.
```hoon
> (old:si --5)
[%.y 5]
> (old:si -5)
[%.n 5]
```
The `++si` core includes the following arithmetic arms:
- [`++sum:si`](https://urbit.org/docs/hoon/reference/stdlib/3a#sumsi), addition
- [`++dif:si`](https://urbit.org/docs/hoon/reference/stdlib/3a#difsi), subtraction
- [`++pro:si`](https://urbit.org/docs/hoon/reference/stdlib/3a#prosi), multiplication
- [`++fra:si`](https://urbit.org/docs/hoon/reference/stdlib/3a#frasi), division
- [`++rem:si`](https://urbit.org/docs/hoon/reference/stdlib/3a#remsi), modulus (remainder after division), b modulo a as `@s`
- [`++abs:si`](https://urbit.org/docs/hoon/reference/stdlib/3a#abssi), absolute value
- [`++cmp:si`](https://urbit.org/docs/hoon/reference/stdlib/3a#cmpsi), test for greater value (as index: `>` yields `--1`, `<` yields `-1`, `=` yields `--0`)
To convert a floating-point value from number (atom) to text, use [`++scow`](https://urbit.org/docs/hoon/reference/stdlib/4m#scow) or [`++r-co:co`](https://urbit.org/docs/hoon/reference/stdlib/4k#r-coco) with [`++rlys`](https://urbit.org/docs/hoon/reference/stdlib/3b#rlys) (and friends):
```hoon
> (scow %rs .3.14159)
".3.14159"
> `tape`(r-co:co (rlys .3.14159))
"3.14159"
```
### Beyond Arithmetic
The Hoon standard library at the current time omits many [transcendental functions](https://en.wikipedia.org/wiki/Transcendental_function), such as the trigonometric functions. It is useful to implement pure-Hoon versions of these, although they are not as efficient as jetted mathematical code would be.
- Produce a version of `++factorial` which can operate on `@rs` inputs correctly.
- Produce an exponentiation function `++pow-n` which operates on integer `@rs` only.
```hoon
++  pow-n
  ::  restricted power, based on integers only
  |=  [x=@rs n=@rs]
  ^-  @rs
  ?:  =(n .0)  .1
  =/  p  x
  |-  ^-  @rs
  ?:  (lth:rs n .2)  p
  $(n (sub:rs n .1), p (mul:rs p x))
```
- Using both of the above, produce the `++sine` function, defined by
<img src="https://latex.codecogs.com/svg.image?\large&space;\sin(x)&space;=&space;\sum_{n=0}^\infty&space;\frac{(-1)^n}{(2n&plus;1)!}x^{2n&plus;1}=&space;x&space;-&space;\frac{x^3}{3!}&space;&plus;&space;\frac{x^5}{5!}&space;-&space;\frac{x^7}{7!}&space;&plus;&space;\cdots" title="https://latex.codecogs.com/svg.image?\large \sin(x) = \sum_{n=0}^\infty \frac{(-1)^n}{(2n+1)!}x^{2n+1}= x - \frac{x^3}{3!} + \frac{x^5}{5!} - \frac{x^7}{7!} + \cdots" />
<!--
\sin(x) = \sum_{n=0}^\infty \frac{(-1)^n}{(2n+1)!}x^{2n+1}= x - \frac{x^3}{3!} + \frac{x^5}{5!} - \frac{x^7}{7!} + \cdots
-->
```hoon
++  sine
  ::  sin x = x - x^3/3! + x^5/5! - x^7/7! + x^9/9! - ...
  ::  assumes ++pow-n, the @rs ++factorial from the exercise above, and an
  ::  ++absolute arm (absolute value of an @rs) are defined in the same core
  |=  x=@rs
  ^-  @rs
  =/  rtol  .1e-5
  =/  p   .0
  =/  po  .-1
  =/  i   .0
  |-  ^-  @rs
  ?:  (lth:rs (absolute (sub:rs po p)) rtol)  p
  =/  ii  (add:rs (mul:rs .2 i) .1)
  =/  term  (mul:rs (pow-n .-1 i) (div:rs (pow-n x ii) (factorial ii)))
  $(i (add:rs i .1), p (add:rs p term), po p)
```
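As a sanity check on the stopping rule used in `++sine`, here is the same loop sketched in Python. This is only a reference for comparison, under a few assumptions: it uses double precision rather than `@rs`, relies on Python's built-in `math.factorial` instead of a hand-written `++factorial`, and the names `sine` and `tol` are illustrative.

```python
import math


def sine(x: float, tol: float = 1e-5) -> float:
    """Sum the Taylor series for sin(x) until successive partial sums
    differ by less than tol (same stopping rule as the Hoon arm)."""
    total = 0.0       # running partial sum (p in the Hoon version)
    previous = -1.0   # previous partial sum (po in the Hoon version)
    n = 0             # term index (i in the Hoon version)
    while abs(previous - total) >= tol:
        k = 2 * n + 1
        term = (-1) ** n * x ** k / math.factorial(k)
        previous, total = total, total + term
        n += 1
    return total


print(sine(1.0), math.sin(1.0))
```

Because the series is centered on zero, convergence is fast for small `x` and slower (with larger rounding error) for large `x`; reducing the argument modulo 2π first is a common refinement.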
- Implement `++cosine`.
<img src="https://latex.codecogs.com/svg.image?\large&space;\cos(x)&space;=&space;\sum_{n=0}^\infty&space;\frac{(-1)^n}{(2n)!}x^{2n}&space;=&space;1&space;-&space;\frac{x^2}{2!}&space;&plus;&space;\frac{x^4}{4!}&space;-&space;\frac{x^6}{6!}&space;&plus;&space;\cdots&space;\\[8pt]&space;" title="https://latex.codecogs.com/svg.image?\large \cos(x) = \sum_{n=0}^\infty \frac{(-1)^n}{(2n)!}x^{2n} = 1 - \frac{x^2}{2!} + \frac{x^4}{4!} - \frac{x^6}{6!} + \cdots \\[8pt] " />
<!--
\cos(x) = \sum_{n=0}^\infty \frac{(-1)^n}{(2n)!}x^{2n} = 1 - \frac{x^2}{2!} + \frac{x^4}{4!} - \frac{x^6}{6!} + \cdots
-->
- Implement `++tangent`.
<img src="https://latex.codecogs.com/svg.image?\large&space;\tan(x)&space;=&space;\frac{\sin(x)}{\cos(x)}" title="https://latex.codecogs.com/svg.image?\large \tan(x) = \frac{\sin(x)}{\cos(x)}" />
<!--
\tan(x) = \frac{\sin(x)}{\cos(x)}
-->
- As a stretch exercise, look up definitions for [exp (e^x)](https://en.wikipedia.org/wiki/Exponentiation#The_exponential_function) and [natural logarithm](https://en.wikipedia.org/wiki/Natural_logarithm), and implement these. You can implement a general-purpose exponentiation function using the formula
<img src="https://latex.codecogs.com/svg.image?\large&space;x^n&space;=&space;\exp(n&space;\,\text{ln}\,&space;x)" title="https://latex.codecogs.com/svg.image?\large x^n = \exp(n \,\text{ln}\, x)" />
<!--
x^n = \exp(n \,\text{ln}\, x)
-->
(We will use these in subsequent examples.)
#### Exercise: Calculate the Fibonacci Sequence
The Binet expression gives the _n_th Fibonacci number.
<img src="https://latex.codecogs.com/svg.image?\large&space;F_n&space;=&space;\frac{\varphi^n-(-\varphi)^{-n}}{\sqrt&space;5}&space;=&space;\frac{\varphi^n-(-\varphi)^{-n}}{2&space;\varphi&space;-&space;1}" title="https://latex.codecogs.com/svg.image?\large F_n = \frac{\varphi^n-(-\varphi)^{-n}}{\sqrt 5} = \frac{\varphi^n-(-\varphi)^{-n}}{2 \varphi - 1}" />
<!--
F_n = \frac{\varphi^n-(-\varphi)^{-n}}{\sqrt 5} = \frac{\varphi^n-(-\varphi)^{-n}}{2 \varphi - 1}
-->
- Implement this analytical formula for the Fibonacci series as a gate.
## Date & Time Mathematics
Date and time calculations are challenging for a number of reasons: What is the correct granularity for an integer to represent? What value should represent the starting value? How should time zones and leap seconds be handled?
One particularly complicating factor is that there is no [Year Zero](https://en.wikipedia.org/wiki/Year_zero); 1 B.C. is immediately followed by A.D. 1.
The Julian date system used in astronomy differs from standard time in this regard.
In computing, absolute dates are calculated with respect to some base value; we refer to this as the _epoch_. Unix/Linux systems count time forward from Thursday 1 January 1970 00:00:00 UT, for instance. Windows systems count in 10⁻⁷ s intervals from 00:00:00 1 January 1601. The Urbit epoch is `~292277024401-.1.1`, or 1 January 292,277,024,401 B.C.; since values are unsigned integers, no date before that time can be represented.
Time values, often referred to as _timestamps_, are commonly represented by the [UTC](https://www.timeanddate.com/time/aboututc.html) value. Time representations are complicated by offsets such as time zones, regular adjustments like daylight saving time, and irregular adjustments like leap seconds. (Read [Dave Taubler's excellent overview](https://levelup.gitconnected.com/why-is-programming-with-dates-so-hard-7477b4aeff4c) of the challenges involved with calculating dates for further considerations, as well as [Martin Thoma's “What Every Developer Should Know About Time” (PDF)](https://zenodo.org/record/1443533/files/2018-10-06-what-developers-should-know-about-time.pdf).)
### Hoon Operations
A timestamp can be separated into the time portion, which is the relative offset within a given day, and the date portion, which represents the absolute day.
There are two ways to represent time in Hoon: the `@d` aura, with `@da` for a full timestamp and `@dr` for an offset; and the [`+$date`](https://urbit.org/docs/hoon/reference/stdlib/2q#date)/[`+$tarp`](https://urbit.org/docs/hoon/reference/stdlib/2q#tarp) structure:
| Aura | Meaning | Example |
| ---- | ------- | ------- |
| `@da` | Absolute date | `~2022.1.1` |
| | | `~2022.1.1..1.1.1..0000` |
| `@dr` | Relative date (difference) | `~h5.m30.s12` |
| | | `~d1000.h5.m30.s12..beef` |
```hoon
+$  date  [[a=? y=@ud] m=@ud t=tarp]
+$  tarp  [d=@ud h=@ud m=@ud s=@ud f=(list @ux)]
```
`now` returns the `@da` of the current timestamp (in UTC).
To go from a `@da` to a `+$tarp`, use [`++yell`](https://urbit.org/docs/hoon/reference/stdlib/3c#yell):
```hoon
> *tarp
[d=0 h=0 m=0 s=0 f=~]
> (yell now)
[d=106.751.991.821.625 h=22 m=58 s=10 f=~[0x44ff]]
> `tarp`(yell ~2014.6.6..21.09.15..0a16)
[d=106.751.991.820.172 h=21 m=9 s=15 f=~[0xa16]]
> (yell ~d20)
[d=20 h=0 m=0 s=0 f=~]
```
To go from a `@da` to a `+$date`, use [`++yore`](https://urbit.org/docs/hoon/reference/stdlib/3c#yore):
```hoon
> (yore ~2014.6.6..21.09.15..0a16)
[[a=%.y y=2.014] m=6 t=[d=6 h=21 m=9 s=15 f=~[0xa16]]]
> (yore now)
[[a=%.y y=2.022] m=5 t=[d=24 h=16 m=20 s=57 f=~[0xbaec]]]
```
To go from a `+$date` to a `@da`, use [`++year`](https://urbit.org/docs/hoon/reference/stdlib/3c#year):
```hoon
> (year [[a=%.y y=2.014] m=8 t=[d=4 h=20 m=4 s=57 f=~[0xd940]]])
~2014.8.4..20.04.57..d940
> (year (yore now))
~2022.5.24..16.24.16..d184
```
To go from a `+$tarp` to a `@da`, use [`++yule`](https://urbit.org/docs/hoon/reference/stdlib/3c#yule):
```hoon
> (yule (yell now))
0x8000000d312b148891f0000000000000
> `@da`(yule (yell now))
~2022.5.24..16.25.48..c915
> `@da`(yule [d=106.751.991.823.081 h=16 m=26 s=14 f=~[0xf727]])
~2022.5.24..16.26.14..f727
```
The Urbit date system correctly compensates for the lack of Year Zero:
```hoon
> ~0.1.1
~1-.1.1
> ~1-.1.1
~1-.1.1
```
The [`++yo`](https://urbit.org/docs/hoon/reference/stdlib/3c#yo) core contains constants useful for calculating time, but in general you should not hand-roll time or timezone calculations.
## Unusual Bases
### Phonetic Base
The `@q` aura is similar to `@p` except for two details: it doesn't obfuscate names (as planets do) and it can be used for any size of atom without adjusting its width to fill the same size. Prefixes and suffixes are in the same order as `@p`, however. Thus:
```hoon
> `@q`0
.~zod
> `@q`256
.~marzod
> `@q`65.536
.~nec-dozzod
> `@q`4.294.967.296
.~nec-dozzod-dozzod
> `@q`(pow 2 128)
.~nec-dozzod-dozzod-dozzod-dozzod-dozzod-dozzod-dozzod-dozzod
```
`@q` auras can be used as sequential mnemonic markers for values.
The [`++po`](https://urbit.org/docs/hoon/reference/stdlib/4a#po) core contains tools for directly parsing `@q` atoms.
### Base-32 and Base-64
The base-32 representation uses the characters `0123456789abcdefghijklmnopqrstuv` to represent values. The digits are separated into collections of five characters separated by `.` dot.
```hoon
> `@uv`0
0v0
> `@uv`100
0v34
> `@uv`1.000.000
0vugi0
> `@uv`1.000.000.000.000
0vt3a.aa400
```
The base-64 representation uses the characters `0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ-~` to represent values. The digits are separated into collections of five characters separated by `.` dot.
```hoon
> `@uw`0
0w0
> `@uw`100
0w1A
> `@uw`1.000.000
0w3Q90
> `@uw`1.000.000.000
0wXCIE0
> `@uw`1.000.000.000.000
0wez.kFh00
```
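Both notations follow the same recipe: repeatedly divide by the base to get digits, then insert a dot every five characters counting from the right. The Python sketch below reproduces the dojo output above from that recipe; the function name `encode` and the constant names are illustrative, and the alphabets are copied from the text.

```python
BASE32 = "0123456789abcdefghijklmnopqrstuv"
BASE64 = "0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ-~"


def encode(n: int, alphabet: str, prefix: str) -> str:
    """Render a non-negative integer in the given alphabet, dotted in fives."""
    base = len(alphabet)
    digits = ""
    while True:
        n, r = divmod(n, base)
        digits = alphabet[r] + digits
        if n == 0:
            break
    # Group into runs of five characters, counting from the right.
    groups = []
    while digits:
        groups.insert(0, digits[-5:])
        digits = digits[:-5]
    return prefix + ".".join(groups)


print(encode(1_000_000, BASE32, "0v"))          # 0vugi0
print(encode(1_000_000_000_000, BASE32, "0v"))  # 0vt3a.aa400
print(encode(1_000_000_000_000, BASE64, "0w"))  # 0wez.kFh00
```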
## Randomness
### Entropy
You previously saw entropy introduced when we discussed stateful random number generation. Let's dig into what's actually going on with entropy.
It is not straightforward for a computer, a deterministic machine, to produce an unpredictable sequence. We can either use a source of true randomness (such as the third significant digit of chip temperature or another [hardware source](https://en.wikipedia.org/wiki/Hardware_random_number_generator)) or a source of artificial randomness (such as a sequence of numbers the user cannot predict).
For instance, consider the sequence _3 1 4 1 5 9 2 6 5 3 5 8 9 7 9 3_. If you recognize the pattern as the constant π, you can predict the first few digits, but almost certainly not more than that. The sequence is deterministic (as it is derived from a well-characterized mathematical process) but unpredictable (as you cannot _a priori_ guess what the next digit will be).
Computers often mix deterministic processes (called “pseudorandom number generators”) with random inputs, such as the current timestamp, to produce high-quality random numbers for use in games, modeling, cryptography, and so forth. The Urbit entropy value `eny` is derived from the underlying host OS's `/dev/urandom` device, which uses sources like keystroke typing latency to produce random bits.
### Random Numbers
Given a source of entropy to seed a random number generator, one can then use the [`++og`](https://urbit.org/docs/hoon/reference/stdlib/3d#og) door to produce various kinds of random numbers. The basic operations of `++og` are described in [the lesson on subject-oriented programming](./N-subject.md).
#### Exercise: Implement a random-number generator from scratch
- Produce a random stream of bits using the linear congruential random number generator.
The linear congruential random number generator produces a stream of pseudorandom bits with a repetition period on the order of 2³¹. Numericist John Cook [explains how LCGs work](https://www.johndcook.com/blog/2017/07/05/simple-random-number-generator/):
> The linear congruential generator used here starts with an arbitrary seed, then at each step produces a new number by multiplying the previous number by a constant and taking the remainder by 2³¹-1.
**`/gen/lcg.hoon`**
```hoon
|=  n=@ud                 :: n is the number of 32-bit words to return
=/  z  20.220.524         :: z is the seed
=/  a  742.938.285        :: a is the multiplier
=/  e  31                 :: e is the exponent
=/  m  (sub (pow 2 e) 1)  :: modulus
=/  index  0
=/  accum  *@ub
|-  ^-  @ub
?:  =(index n)  accum
%=  $
  index  +(index)
  z      (mod (mul a z) m)
  accum  (cat 5 z accum)  :: append the current z as one 32-bit block
==
```
Can you verify that `1`s constitute about half of the values in this bit stream, as Cook illustrates in Python?
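One way to approach that check outside Urbit is sketched below in Python. It reuses the seed, multiplier, and modulus from `/gen/lcg.hoon` and, like the Hoon loop, records the current state before stepping. The function name `lcg` and the choice of 1,000 samples are illustrative. Since every state is below 2³¹, only 31 bits per value are counted.

```python
def lcg(n: int, seed: int = 20_220_524, a: int = 742_938_285, e: int = 31):
    """Return the first n states of the generator, starting with the seed."""
    m = 2 ** e - 1
    z = seed
    out = []
    for _ in range(n):
        out.append(z)       # record the current state ...
        z = (a * z) % m     # ... then advance, as in the Hoon loop
    return out


values = lcg(1000)
ones = sum(bin(v).count("1") for v in values)
fraction = ones / (31 * len(values))
print(f"fraction of 1 bits: {fraction:.4f}")
```

Note that the Hoon version packs each state into a 32-bit block, so the top bit of every block is always zero; counting over all 32 bits would therefore give a fraction slightly below one half.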
#### Exercise: Produce uniformly-distributed random numbers
- Using entropy as the source, produce uniform random numbers: that is, numbers in the range [0, 1] with equal likelihood to machine precision.
We use the LCG defined above, split its output into 32-bit blocks using [`++rip`](https://urbit.org/docs/hoon/reference/stdlib/2c#rip), and keep the low 23 bits of each block as a mantissa, manually compositing the result into a valid floating-point number in the range [0, 1]. (We avoid producing special sequences like [`NaN`](https://en.wikipedia.org/wiki/NaN).)
**`/gen/uniform.hoon`**
```hoon
!:
=<
|=  n=@ud  :: n is the number of values to return
^-  (list @rs)
=/  values      (rip 5 (~(lcg gen 20.220.524) n))
=/  mask-clear  0b111.1111.1111.1111.1111.1111
=/  mask-fill   0b11.1111.0000.0000.0000.0000.0000.0000
=/  clears      (turn values |=(a=@rs (dis mask-clear a)))
(turn clears |=(a=@ (sub:rs (mul:rs .2 (con mask-fill a)) .1.0)))
|%
++  gen
  |_  [z=@ud]
  ++  lcg
    |=  n=@ud                 :: n is the number of 32-bit blocks to return
    =/  a  742.938.285        :: a is the multiplier
    =/  e  31                 :: e is the exponent
    =/  m  (sub (pow 2 e) 1)  :: modulus
    =/  index  0
    =/  accum  *@ub
    |-  ^-  @ub
    ?:  =(index n)  accum
    %=  $
      index  +(index)
      z      (mod (mul a z) m)
      accum  (cat 5 z accum)
    ==
  --
--
```
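The mask-and-fill step can be easier to follow outside of Hoon. The sketch below, in Python, reproduces the same bit manipulation on a single 32-bit block: clear everything but the low 23 bits, OR in the exponent bits of 0.5, reinterpret the result as a single-precision float in [0.5, 1), and rescale. The name `word_to_uniform` is our own, and `struct` is used only to reinterpret the bit pattern.

```python
import struct

MASK_CLEAR = 0x7FFFFF    # low 23 bits: the single-precision mantissa
MASK_FILL = 0x3F000000   # sign 0, exponent bits of 0.5


def word_to_uniform(word: int) -> float:
    """Turn one 32-bit block into a float in [0, 1)."""
    mantissa = word & MASK_CLEAR
    bits = MASK_FILL | mantissa
    # Reinterpret the 32-bit pattern as an IEEE 754 single: a value in [0.5, 1).
    (half_to_one,) = struct.unpack("<f", struct.pack("<I", bits))
    # Stretch [0.5, 1) to [1, 2) and shift down to [0, 1).
    return 2.0 * half_to_one - 1.0


print(word_to_uniform(0x400000))
```

Because the exponent is fixed, every mantissa yields a valid finite number, which is how the special bit patterns are avoided.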
- Convert the above to a `%say` generator that can optionally accept a seed; if no seed is provided, use `eny`.
- Produce a higher-quality Mersenne Twister uniform RNG, such as [per this method](https://xilinx.github.io/Vitis_Libraries/quantitative_finance/2022.1/guide_L1/RNGs/RNG.html).
#### Exercise: Produce normally-distributed random numbers
- Produce a normally-distributed random number generator using the uniform RNG described above.
The normal distribution, or bell curve, describes how random measurement errors are typically distributed. The mean, or average value, is at zero for the standard normal distribution, while values farther from the mean become increasingly unlikely.
![A normal distribution curve with standard deviations marked](https://upload.wikimedia.org/wikipedia/commons/thumb/8/8c/Standard_deviation_diagram.svg/640px-Standard_deviation_diagram.svg.png)
One way to get from a uniform random number to a normal random number is [to treat the uniform random number as a value of the _cumulative distribution function_ (CDF)](https://www.omscs-notes.com/simulation/generating-uniform-random-numbers/), an index into “how far” the value is along the normal curve, and then to invert the CDF to recover the corresponding normal value.
![A cumulative distribution function for three normal distributions](https://upload.wikimedia.org/wikipedia/commons/thumb/c/ca/Normal_Distribution_CDF.svg/640px-Normal_Distribution_CDF.svg.png)
This is an approximation which is accurate to one decimal place:
<img src="https://latex.codecogs.com/svg.image?\large&space;Z&space;=&space;\frac{U^{0.135}-(1-U)^{0.135}}{0.1975}" title="https://latex.codecogs.com/svg.image?\large Z = \frac{U^{0.135}-(1-U)^{0.135}}{0.1975}" />
where _U_ is a uniformly-distributed random number in the range (0, 1).
<!--
$$
Z = \frac{U^{0.135}-(1-U)^{0.135}}{0.1975}
$$
-->
To calculate an arbitrary power of a floating-point number, we require a few transcendental functions, in particular the natural logarithm and exponentiation of base _e_. The following helper core contains relatively inefficient but clear implementations of standard numerical methods.
**`/gen/normal.hoon`**
```hoon
!:
=<
|=  n=@ud  :: n is the number of values to return
^-  (list @rs)
=/  values      (rip 5 (~(lcg gen 20.220.524) n))
=/  mask-clear  0b111.1111.1111.1111.1111.1111
=/  mask-fill   0b11.1111.0000.0000.0000.0000.0000.0000
=/  clears      (turn values |=(a=@rs (dis mask-clear a)))
=/  uniforms    (turn clears |=(a=@ (sub:rs (mul:rs .2 (con mask-fill a)) .1.0)))
(turn uniforms normal)
|%
++  factorial
  :: integer factorial, not gamma function
  |=  x=@rs
  ^-  @rs
  =/  t=@rs  .1
  |-  ^-  @rs
  ?:  |(=(x .1) (lth:rs x .1))  t
  $(x (sub:rs x .1), t (mul:rs t x))
++  absrs
  |=  x=@rs  ^-  @rs
  ?:  (gth:rs x .0)
    x
  (sub:rs .0 x)
++  exp
  |=  x=@rs
  ^-  @rs
  =/  rtol  .1e-5
  =/  p   .1
  =/  po  .-1
  =/  i   .1
  |-  ^-  @rs
  ?:  (lth:rs (absrs (sub:rs po p)) rtol)  p
  $(i (add:rs i .1), p (add:rs p (div:rs (pow-n x i) (factorial i))), po p)
++  pow-n
  :: restricted power, based on integers only
  |=  [x=@rs n=@rs]
  ^-  @rs
  ?:  =(n .0)  .1
  =/  p  x
  |-  ^-  @rs
  ?:  (lth:rs n .2)  p
  $(n (sub:rs n .1), p (mul:rs p x))
++  ln
  :: natural logarithm, z > 0
  |=  z=@rs
  ^-  @rs
  =/  rtol  .1e-5
  =/  p   .0
  =/  po  .-1
  =/  i   .0
  |-  ^-  @rs
  ?:  (lth:rs (absrs (sub:rs po p)) rtol)
    (mul:rs (div:rs (mul:rs .2 (sub:rs z .1)) (add:rs z .1)) p)
  =/  term1  (div:rs .1 (add:rs .1 (mul:rs .2 i)))
  =/  term2  (mul:rs (sub:rs z .1) (sub:rs z .1))
  =/  term3  (mul:rs (add:rs z .1) (add:rs z .1))
  =/  term   (mul:rs term1 (pow-n (div:rs term2 term3) i))
  $(i (add:rs i .1), p (add:rs p term), po p)
++  powrs
  :: general power, based on logarithms
  :: x^n = exp(n ln x)
  |=  [x=@rs n=@rs]
  (exp (mul:rs n (ln x)))
++  normal
  |=  u=@rs
  (div:rs (sub:rs (powrs u .0.135) (powrs (sub:rs .1 u) .0.135)) .0.1975)
++  gen
  |_  [z=@ud]
  ++  lcg
    |=  n=@ud                 :: n is the number of 32-bit blocks to return
    =/  a  742.938.285        :: a is the multiplier
    =/  e  31                 :: e is the exponent
    =/  m  (sub (pow 2 e) 1)  :: modulus
    =/  index  0
    =/  accum  *@ub
    |-  ^-  @ub
    ?:  =(index n)  accum
    %=  $
      index  +(index)
      z      (mod (mul a z) m)
      accum  (cat 5 z accum)
    ==
  --
--
```
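To sanity-check values produced by the generator, it can help to evaluate the same approximation with a language that has transcendental functions built in. The Python sketch below computes powers as exp(n ln x), in the same spirit as the `++powrs` arm, and applies the formula _Z_ = (_U_^0.135 − (1−_U_)^0.135) / 0.1975. The names `pow_via_logs` and `approx_normal` are our own.

```python
import math


def pow_via_logs(x: float, n: float) -> float:
    """x**n for x > 0, computed as exp(n ln x)."""
    return math.exp(n * math.log(x))


def approx_normal(u: float) -> float:
    """Map a uniform u in (0, 1) to an approximately standard-normal value."""
    return (pow_via_logs(u, 0.135) - pow_via_logs(1.0 - u, 0.135)) / 0.1975


for u in (0.025, 0.5, 0.975):
    print(u, approx_normal(u))
```

The input 0.975 should land close to the familiar 1.96, which is consistent with the stated accuracy of about one decimal place.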
#### Exercise: Upgrade the normal RNG
A more complicated formula uses several constants to improve the accuracy significantly:
<img src="https://latex.codecogs.com/svg.image?\large&space;Z&space;=&space;\text{sgn}\left(U-\frac{1}{2}\right)&space;\left(&space;t&space;-&space;\frac{c_{0}&plus;c_{1}&space;t&plus;c_{2}&space;t^{2}}{1&plus;d_{1}&space;t&plus;d_{2}&space;t^{2}&space;&plus;&space;d_{3}&space;t^{3}}&space;\right)" title="https://latex.codecogs.com/svg.image?\large Z = \text{sgn}\left(U-\frac{1}{2}\right) \left( t - \frac{c_{0}+c_{1} t+c_{2} t^{2}}{1+d_{1} t+d_{2} t^{2} + d_{3} t^{3}} \right)" />
where
- sgn is the signum or sign function;
- _t_ is √(−ln[min(_U_, 1−_U_)²]); and
- the constants are:
    - _c_₀ = 2.515517
    - _c_₁ = 0.802853
    - _c_₂ = 0.010328
    - _d_₁ = 1.432788
    - _d_₂ = 0.189269
    - _d_₃ = 0.001308
<!--
$$
Z = \text{sgn}\left(U-\frac{1}{2}\right) \left( t - \frac{c_{0}+c_{1} t+c_{2} t^{2}}{1+d_{1} t+d_{2} t^{2} + d_{3} t^{3}} \right)
$$
-->
- Implement this formula in Hoon to produce normally-distributed random numbers.
- How would you implement other random number generators?
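Before writing the Hoon version, it may help to have reference values to compare against. The following Python sketch evaluates the rational approximation directly. It assumes the formula is the one tabulated as 26.2.23 in Abramowitz and Stegun's handbook and uses the constants as listed there (in particular `D1 = 1.432788` and `D2 = 0.189269`); the names `sgn` and `rational_normal` are our own.

```python
import math

# Constants assumed to follow Abramowitz and Stegun, formula 26.2.23.
C0, C1, C2 = 2.515517, 0.802853, 0.010328
D1, D2, D3 = 1.432788, 0.189269, 0.001308


def sgn(x: float) -> int:
    """Sign function: -1, 0, or 1."""
    return (x > 0) - (x < 0)


def rational_normal(u: float) -> float:
    """Map a uniform u in (0, 1) to an approximately standard-normal value."""
    p = min(u, 1.0 - u)
    t = math.sqrt(-math.log(p * p))
    numerator = C0 + C1 * t + C2 * t * t
    denominator = 1.0 + D1 * t + D2 * t * t + D3 * t ** 3
    return sgn(u - 0.5) * (t - numerator / denominator)


for u in (0.025, 0.5, 0.975):
    print(u, rational_normal(u))
```

Comparing the output at 0.975 with the simpler approximation shows how much closer this formula gets to 1.96.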
<!--
**`/gen/normal2.hoon`**
```hoon
!:
=<
|=  n=@ud  :: n is the number of values to return
^-  (list @rs)
=/  values      (rip 5 (~(lcg gen 20.220.524) n))
=/  mask-clear  0b111.1111.1111.1111.1111.1111
=/  mask-fill   0b11.1111.0000.0000.0000.0000.0000.0000
=/  clears      (turn values |=(a=@rs (dis mask-clear a)))
=/  uniforms    (turn clears |=(a=@ (sub:rs (mul:rs .2 (con mask-fill a)) .1.0)))
(turn uniforms normal)
|%
++  sgn
  |=  x=@rs
  ^-  @rs
  ?:  (lth:rs x .0)
    .-1
  ?:  (gth:rs x .0)
    .1
  .0
++  factorial
  :: integer factorial, not gamma function
  |=  x=@rs
  ^-  @rs
  =/  t=@rs  .1
  |-  ^-  @rs
  ?:  |(=(x .1) (lth:rs x .1))  t
  $(x (sub:rs x .1), t (mul:rs t x))
++  absrs
  |=  x=@rs  ^-  @rs
  ?:  (gth:rs x .0)
    x
  (sub:rs .0 x)
++  exp
  |=  x=@rs
  ^-  @rs
  =/  rtol  .1e-5
  =/  p   .1
  =/  po  .-1
  =/  i   .1
  |-  ^-  @rs
  ?:  (lth:rs (absrs (sub:rs po p)) rtol)  p
  $(i (add:rs i .1), p (add:rs p (div:rs (pow-n x i) (factorial i))), po p)
++  pow-n
  :: restricted power, based on integers only
  |=  [x=@rs n=@rs]
  ^-  @rs
  ?:  =(n .0)  .1
  =/  p  x
  |-  ^-  @rs
  ?:  (lth:rs n .2)  p
  $(n (sub:rs n .1), p (mul:rs p x))
++  ln
  :: natural logarithm, z > 0
  |=  z=@rs
  ^-  @rs
  =/  rtol  .1e-5
  =/  p   .0
  =/  po  .-1
  =/  i   .0
  |-  ^-  @rs
  ?:  (lth:rs (absrs (sub:rs po p)) rtol)
    (mul:rs (div:rs (mul:rs .2 (sub:rs z .1)) (add:rs z .1)) p)
  =/  term1  (div:rs .1 (add:rs .1 (mul:rs .2 i)))
  =/  term2  (mul:rs (sub:rs z .1) (sub:rs z .1))
  =/  term3  (mul:rs (add:rs z .1) (add:rs z .1))
  =/  term   (mul:rs term1 (pow-n (div:rs term2 term3) i))
  $(i (add:rs i .1), p (add:rs p term), po p)
++  powrs
  :: general power, based on logarithms
  :: x^n = exp(n ln x)
  |=  [x=@rs n=@rs]
  (exp (mul:rs n (ln x)))
++  minrs
  |=  [a=@rs b=@rs]
  ?:  (lth:rs a b)  a  b
++  normal
  |=  u=@rs
  =/  c0  .2.515517
  =/  c1  .0.802853
  =/  c2  .0.010328
  =/  d1  .1.432788
  =/  d2  .0.189269
  =/  d3  .0.001308
  :: t = sqrt(-ln(q^2)) where q = min(u, 1-u)
  =/  q   (minrs u (sub:rs .1 u))
  =/  t   (sqt:rs (sub:rs .0 (ln (mul:rs q q))))
  =/  znum  :(add:rs c0 (mul:rs c1 t) (mul:rs c2 (mul:rs t t)))
  =/  zden  :(add:rs .1 (mul:rs d1 t) (mul:rs d2 (mul:rs t t)) (mul:rs d3 (powrs t .3)))
  (mul:rs (sgn (sub:rs u .0.5)) (sub:rs t (div:rs znum zden)))
++  gen
  |_  [z=@ud]
  ++  lcg
    |=  n=@ud                 :: n is the number of 32-bit blocks to return
    =/  a  742.938.285        :: a is the multiplier
    =/  e  31                 :: e is the exponent
    =/  m  (sub (pow 2 e) 1)  :: modulus
    =/  index  0
    =/  accum  *@ub
    |-  ^-  @ub
    ?:  =(index n)  accum
    %=  $
      index  +(index)
      z      (mod (mul a z) m)
      accum  (cat 5 z accum)
    ==
  --
--
```
-->
## Hashing
A [hash function](https://en.wikipedia.org/wiki/Hash_function) is a tool which can take any input data and produce a fixed-length value that corresponds to it. Hashes can be used for many purposes:
1. **Encryption**. A [cryptographic hash function](https://en.wikipedia.org/wiki/Cryptographic_hash_function) leans into the one-way nature of a hash calculation to produce a fast, practically-irreversible hash of a message. Such functions are foundational to modern cryptography.
2. **Attestation or preregistration**. If you wish to demonstrate at a later time that you produced a particular message (including a hypothesis or prediction), or that you solved a particular problem, hashing the text of the solution and posting the hash publicly allows you to verifiably timestamp your work.
3. **Integrity verification**. By comparing the hash of data to its expected hash, you can verify that two copies of data are equivalent (such as a downloaded executable file). The [MD5](https://en.wikipedia.org/wiki/MD5) hash algorithm is frequently used for this purpose as [`md5sum`](https://en.wikipedia.org/wiki/Md5sum).
4. **Data lookup**. [Hash tables](https://en.wikipedia.org/wiki/Hash_table) are one way to implement a key→value mapping, such as the functionality offered by Hoon's `++map`.
Since the number of possible fixed-length hashes is finite while the number of possible inputs is not, many different inputs necessarily share the same hash. Two distinct inputs that produce the same hash are called a [_hash collision_](https://en.wikipedia.org/wiki/Hash_collision), but for many practical purposes such a collision is extremely unlikely.
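The counting argument behind collisions is easy to see with a deliberately tiny hash. The Python sketch below is an illustration only: it truncates SHA-256 to its first byte, giving just 256 possible outputs, so any 257 distinct inputs must contain a collision. The names `tiny_hash` and `find_collision` are our own.

```python
import hashlib


def tiny_hash(data: bytes) -> int:
    """First byte of SHA-256: a deliberately tiny 8-bit hash."""
    return hashlib.sha256(data).digest()[0]


def find_collision():
    """Return two distinct inputs that share the same tiny hash."""
    seen = {}
    for i in range(257):  # 257 inputs but only 256 possible outputs
        message = str(i).encode()
        digest = tiny_hash(message)
        if digest in seen:
            return seen[digest], message
        seen[digest] = message
    raise AssertionError("unreachable: 257 inputs cannot have 257 distinct bytes")


first, second = find_collision()
print(first, second, tiny_hash(first))
```

With a full 256-bit output the same argument still applies, but the number of inputs needed before a collision becomes likely is far beyond practical reach.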
### Hoon Operations
The Hoon standard library supports fast insecure hashing with [`++mug`](https://urbit.org/docs/hoon/reference/stdlib/2e#mug), which accepts any noun and produces the hash as an atom.
```hoon
> `@ux`(mug 1)
0x715c.2a60
> `@ux`(mug 2)
0x718b.9468
> `@ux`(mug 3)
0x72a8.ef1a
> `@ux`(mug 1.000.000)
0x5145.9d7d
> `@ux`(mug .)
0x6c91.8422
```
However, `++mug` operates on the raw form of the noun, without Hoon-specific metadata like aura:
```hoon
> (mug 0x5)
721.923.263
> (mug 5)
721.923.263
```
Hoon also includes [SHA-256 and SHA-512](https://en.wikipedia.org/wiki/SHA-2) [tooling](https://urbit.org/docs/hoon/reference/stdlib/3d). ([`++og`](https://urbit.org/docs/hoon/reference/stdlib/3d#og), the random number generator, is based on SHA-256 hashing.)
- [`++shax`](https://urbit.org/docs/hoon/reference/stdlib/3d#shax) produces a hashed atom of 256 bits from any atom.
```hoon
> (shax 1)
69.779.012.276.202.546.540.741.613.998.220.636.891.790.827.476.075.440.677.599.814.057.037.833.368.907
> `@ux`(shax 1)
0x9a45.8577.3ce2.ccd7.a585.c331.d60a.60d1.e3b7.d28c.bb2e.de3b.c554.4534.2f12.f54b
> `@ux`(shax 2)
0x86d9.5764.98ea.764b.4924.3efe.b05d.f625.0104.38c6.a55d.5b57.8de4.ff00.c9b4.c1db
> `@ux`(shax 3)
0xc529.ffad.9a5a.b611.62b1.1d61.6b63.9e00.586b.a846.746a.197d.4daf.78b9.08ed.4f08
> `@ux`(shax 1.000.000)
0x84a4.929b.1d69.708e.d4b7.0fb8.ca97.cc85.c4a6.1aae.4596.f753.d0d2.6357.e7b9.eb0f
```
- [`++shaz`](https://urbit.org/docs/hoon/reference/stdlib/3d#shaz) produces a hashed atom of 512 bits from any atom.
```hoon
> (shaz 1)
3.031.947.054.025.992.811.210.838.487.475.158.569.967.793.095.050.169.760.709.406.427.393.828.309.497.273.121.275.530.382.185.415.047.474.588.395.933.812.689.047.905.034.106.140.802.678.745.778.695.328.891
> `@ux`(shaz 1)
0x39e3.d936.c6e3.1eaa.c08f.cfcf.e7bb.4434.60c6.1c0b.d5b7.4408.c8bc.c35a.6b8d.6f57.00bd.cdde.aa4b.466a.e65f.8fb6.7f67.ca62.dc34.149e.1d44.d213.ddfb.c136.68b6.547b
> `@ux`(shaz 2)
0xcadc.698f.ca01.cf29.35f7.6027.8554.b4e6.1f35.4539.75a5.bb45.3890.0315.9bc8.485b.7018.dd81.52d9.cc23.b6e9.dd91.b107.380b.9d14.ddbf.9cc0.37ee.53a8.57b6.c948.b8fa
> `@ux`(shaz 3)
0x4ba.a6ba.4a01.12e6.248b.5e89.9389.4786.aced.1a59.136b.78c6.7076.eb90.2221.d7a5.453a.56d1.446d.17d1.33cd.b468.f798.eb6b.dcee.f071.7040.7a2f.aa94.df7d.81f5.5be4
> `@ux`(shaz 1.000.000)
0x4c13.ef8b.09cf.6e59.05c4.f203.71a4.9cec.3432.ba26.0174.f964.48f1.5475.b2dd.2c59.98c2.017c.9c03.cbea.9d5f.591b.ff23.bbff.b0ae.9c67.a4a9.dd8d.748a.8e14.c006.cbcc
```
#### Exercise: Produce a secure password tool
- Produce a basic secure password tool. It should accept a password, salt it (add a predetermined value to the password), and hash it. _That_ hash is then compared to a reference hash to determine whether or not the password is correct.
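The flow described in this exercise (salt, hash, compare) can be sketched outside of Hoon to make the steps concrete. The Python below is a minimal illustration, not a solution in Hoon and not a production design: the salt value and the names `hash_password` and `check_password` are hypothetical, and SHA-256 from `hashlib` stands in for `++shax` without claiming the two produce identical atoms.

```python
import hashlib
import hmac

SALT = b"example-salt"  # hypothetical predetermined salt


def hash_password(password: str, salt: bytes = SALT) -> bytes:
    """Salt the password, then hash it with SHA-256."""
    return hashlib.sha256(salt + password.encode("utf-8")).digest()


def check_password(attempt: str, reference: bytes, salt: bytes = SALT) -> bool:
    """Hash the attempt the same way and compare it to the reference hash."""
    return hmac.compare_digest(hash_password(attempt, salt), reference)


reference_hash = hash_password("correct horse")
print(check_password("correct horse", reference_hash))
print(check_password("wrong horse", reference_hash))
```

Only the reference hash needs to be stored. Because a single SHA-256 pass is fast, real systems usually prefer a deliberately slow key-derivation function for this purpose.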

View File

@ -0,0 +1,33 @@
+++
title = "Hoon School"
weight = 5
sort_by = "weight"
template = "sections/docs/chapters.html"
insert_anchor_links = "right"
+++
## Table of Contents
- [Introduction](/guides/core/hoon-school/A-intro)
#### Lessons
1. [Hoon Syntax](/guides/core/hoon-school/B-syntax) - This module will discuss the fundamental data concepts of Hoon and how programs effect control flow.
2. [Azimuth](/guides/core/hoon-school/C-azimuth) - This module introduces how Urbit ID is structured and provides practice in converting and working with `@p` identity points.
3. [Gates (Functions)](/guides/core/hoon-school/D-gates) - This module will teach you how to produce deferred computations for later use, like functions in other languages.
4. [Molds (Types)](/guides/core/hoon-school/E-types) - This module will introduce the Hoon type system and illustrate how type checking and type inference work.
5. [Cores](/guides/core/hoon-school/F-cores) - This module will introduce the key Hoon data structure known as the **core**, as well as its ramifications.
6. [Trees & Addressing](/guides/core/hoon-school/G-trees) - This module will elaborate how we can use the structure of nouns to locate data and evaluate code in a given expression. It will also discuss the important `list` mold builder and a number of standard library operations.
7. [Libraries](/guides/core/hoon-school/H-libraries) - This module will discuss how libraries can be produced, imported, and used.
8. [Testing Code](/guides/core/hoon-school/I-testing) - This module will discuss how we can have confidence that a program does what it claims to do, using unit testing and debugging strategies.
9. [Text Processing I](/guides/core/hoon-school/J-stdlib-text) - This module will discuss how text is represented in Hoon, discuss tools for producing and manipulating text, and introduce the `%say` generator, a new generator type.
10. [Cores & Doors](/guides/core/hoon-school/K-doors) - This module will start by introducing the concept of gate-building gates; then it will expand our notion of cores to include doors; finally it will introduce a common door, the `++map`, to illustrate how doors work.
11. [Type Checking](/guides/core/hoon-school/L-struct) - This module will cover how the Hoon compiler infers type, as well as various cases in which a type check is performed.
12. [Data Structures](/guides/core/hoon-school/L2-struct) - This module will introduce you to several useful data structures built on the door, then discuss how the compiler handles types and the sample.
13. [Conditional Logic](/guides/core/hoon-school/M-logic) - This module will cover the nature of loobean logic and the rest of the `?` wut runes.
14. [Subject-Oriented Programming](/guides/core/hoon-school/N-subject) - This module discusses how Urbit's subject-oriented programming paradigm structures how cores and values are used and maintain state, as well as how deferred computations and remote value lookups (“scrying”) are handled.
15. [Text Processing II](/guides/core/hoon-school/O-stdlib-io) - This module will elaborate on text representation in Hoon, including formatted text, and `%ask` generators.
16. [Functional Programming](/guides/core/hoon-school/P-func) - This module will discuss some gates-that-work-on-gates and other assorted operators that are commonly recognized as functional programming tools. It will also cover text parsing.
17. [Adaptive Cores](/guides/core/hoon-school/Q-metals) - This module introduces how cores can be extended for different behavioral patterns.
18. [Mathematics](/guides/core/hoon-school/R-math) - This module introduces how non-`@ud` mathematics are instrumented in Hoon.

View File

@ -0,0 +1,77 @@
+++
title = "System Overview"
weight = 200
sort_by = "weight"
template = "sections/docs/chapters.html"
+++
Urbit is a clean-slate software stack designed to implement an encrypted P2P
network of general-purpose personal servers. Each server on this network is a
deterministic computer called an 'urbit' that runs on a Unix-based virtual
machine.
The Urbit stack primarily comprises:
- [Arvo](/docs/system-overview/arvo): the functional operating system of
each urbit, written in Hoon.
- [Hoon](/docs/system-overview/hoon): a strictly typed functional
programming language whose standard library includes a Hoon-to-Nock compiler.
- [Nock](/docs/system-overview/nock): a low-level combinator language whose
formal specification fits readably on a t-shirt.
- [Vere](/docs/system-overview/vere): a Nock interpreter and Unix-based
virtual machine that mediates between each urbit and the Unix software layer.
- [Azimuth](/docs/system-overview/azimuth): the Urbit identity layer, built
on the Ethereum blockchain.
Central to the operation of Urbit are cryptographic methods. We give a
high-level overview of the usage of cryptography in Urbit and how it is
implemented [here](/docs/system-overview/cryptography).
## Anatomy of a personal server
Your urbit is a deterministic computer in the sense that its state is a pure
function of its event history. Every event in this history is a
[transaction](https://en.wikipedia.org/wiki/Transaction_processing); your
urbit's state is effectively an [ACID database](https://en.wikipedia.org/wiki/ACID).
Because each urbit is deterministic we can describe its role appropriately in
purely functional terms: it maps an input event and the old urbit state to a
list of output actions and the subsequent state. This is the Urbit transition
function.
```
<input event, old state> -> <output actions, new state>
```
For example, one input event could be a keystroke from the terminal, say
`[enter]` after having already typed `(add 2 2)`; and an output action could be
to print in the terminal window the resulting value of a computation performed
when the user hit `[enter]`, in this case `4`. The input event is stored in the
urbit's event history.
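As a rough illustration of what "state is a pure function of event history" means, the Python sketch below models a toy transition function and shows that replaying the same events always rebuilds the same state. Everything here is an assumption made for illustration: the state is just a tuple of received events, and the names `transition` and `replay` do not correspond to anything in Arvo.

```python
from typing import List, Tuple

State = Tuple[str, ...]  # toy state: every event received so far
Event = str
Effect = str


def transition(event: Event, state: State) -> Tuple[List[Effect], State]:
    """<input event, old state> -> <output actions, new state>, with no side effects."""
    new_state = state + (event,)
    effects = [f"print: received {event!r} (event #{len(new_state)})"]
    return effects, new_state


def replay(events: List[Event]) -> State:
    """Rebuild the state from nothing but the event history."""
    state: State = ()
    for event in events:
        _, state = transition(event, state)
    return state


history = ["(add 2 2)", "[enter]"]
print(replay(history))
```

Note that `transition` only returns a description of its output actions; in this sketch, as in the text, carrying them out is left to whatever sits outside the function.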
Events always start from outside of your urbit, whether they're local to the
computer running the urbit (e.g., a keystroke in the terminal) or they originate
elsewhere (e.g., a packet received from another urbit). When an event is
processed, various parts of the urbit state can be modified before the resulting
list of output actions is returned.
Can output actions from your urbit cause side-effects in the outside world?
The answer had better be "yes," because a personal server without side effects
isn't useful for much. In another sense the answer had better be "no," or else
there is a risk of losing functional purity; your urbit cannot _guarantee_ that
the side effects in question actually occur. What's the solution?
Each urbit is
[sandboxed](https://en.wikipedia.org/wiki/Sandbox_%28computer_security%29) in a
virtual machine, Vere, which runs on Unix. Code running in your urbit cannot
make Unix system calls or otherwise directly affect the underlying platform.
Strictly speaking, internal urbit code can only change internal urbit state; it
has no way of sending events outside of its runtime environment. Functional
purity is preserved.
In practical terms, however, you don't want your urbit to be an impotent
[brain in a vat](https://en.wikipedia.org/wiki/Brain_in_a_vat). That's why
Vere also serves as the intermediary between your urbit and Unix. Vere observes
the list of output events, and when external action is called for makes the
appropriate system calls itself. When external events relevant to your urbit
occur in the Unix layer, Vere encodes and delivers them as input events.

32
content/overview/arvo.md Normal file
View File

@ -0,0 +1,32 @@
+++
title = "Arvo"
weight = 10
template = "doc.html"
+++
Arvo is a purely functional,
[non-preemptive](https://en.wikipedia.org/wiki/Cooperative_multitasking) OS,
written in Hoon, that serves as the event manager of your urbit. It can upgrade
itself over the network without downtime. The Arvo kernel proper is quite
simple -- it's only about 600 lines of code, excluding its various modules.
The Urbit transition function is implemented in Arvo. Upon being 'poked' by Vere
with the pair of `<input event, state>`, Arvo directs the event to the
appropriate OS module. The result of each Vere 'poke' is a pair of
`<output events, new state>`. Events are typed, and each has an explicit call-stack
structure indicating the event's source module in Arvo.
For a more in-depth technical introduction, see [Arvo Overview](/docs/arvo/overview).
Arvo modules are also called 'vanes'. Arvo's vanes are:
- [Ames](/docs/arvo/ames/ames): defines and implements Urbit's encrypted P2P network protocol, as well
as Urbit's identity protocol.
- [Behn](/docs/arvo/behn/behn): manages timer events for other vanes.
- [Clay](/docs/arvo/clay/clay): global, version-controlled, and referentially-transparent file system.
Also includes our typed functional build system.
- [Dill](/docs/arvo/dill/dill): terminal driver.
- [Eyre](/docs/arvo/eyre/eyre): HTTP server.
- [Gall](/docs/arvo/gall/gall): application sandbox and manager.
- [Iris](/docs/arvo/iris/iris-api): HTTP client.
- [Jael](/docs/arvo/jael/jael-api): Public and private key storage.

View File

@ -0,0 +1,34 @@
+++
title = "Azimuth"
weight = 50
template = "doc.html"
+++
Azimuth is a general-purpose public-key infrastructure (PKI) on the Ethereum
blockchain, used as a decentralized ledger for what are known as **Urbit
identities**, or simply **identities**. Having an identity is necessary to use
the Urbit network, which makes it important to have a neutral ledger to
determine who owns what.
Azimuth is not, however, part of the Urbit stack. Azimuth is a parallel system
that can be used as a generalized identity system for other projects. Azimuth
"touches" the Urbit ecosystem when an Urbit identity is used to boot a virtual
computer on the Arvo network for the first time. When that happens, the identity
is considered **linked** to Azimuth and the identity's full powers are available
for use. Once an identity is linked, it cannot be unlinked.
A metaphor might make the relationship between these two systems easier to understand:
Azimuth is the bank vault that stores the deed to your house. The Urbit network
is the neighborhood that you live in.
## Further Reading
* [Azimuth](/docs/azimuth/azimuth): An overview of the Ethereum-based public
key infrastructure utilized by Urbit.
* [Advanced Azimuth Tools](/docs/azimuth/advanced-azimuth-tools):
Expert-level tooling for generating, signing, and sending Azimuth-related
transactions from within Urbit itself.

View File

@ -0,0 +1,124 @@
+++
title = "Cryptography"
weight = 50
template = "doc.html"
+++
Cryptography is central to the operation of Urbit. Here we give an overview of
how it is utilized.
There are two categories of keys and five components of the system involved with
cryptography on Urbit. We first summarize the two categories of keys and how
they are utilized by each ship type, then cover how different parts of Urbit are
involved in cryptography.
### Types of keys
The two categories of keys are your Azimuth/Ethereum keys and your networking
keys. In both cases, these are public/private key pairs utilized for public key
cryptography.
#### Azimuth keys
Your Urbit ID exists as an ERC-721 non-fungible token on the Ethereum
blockchain, and as such is contained in a wallet whose private key you possess.
If you are utilizing a [master
ticket](/docs/azimuth/azimuth#master-ticket), this private key is derived
from a seed, which is what you use to log in to
[Bridge](/docs/glossary/bridge). Otherwise, you have generated the key by
some other process, of which there are too many to list here. Besides the
private key which unlocks your ownership wallet address, you may have a few
other private keys which unlock a wallet that corresponds to your ship's
[proxies](/docs/glossary/proxies). We refer collectively to these keys as
your _Azimuth keys_.
Only [planets](/docs/glossary/planet), [stars](/docs/glossary/star), and
[galaxies](/docs/glossary/galaxy) have Azimuth keys.
[Moons](/docs/glossary/moon) and [comets](/docs/glossary/comet) do not,
as they do not exist on the Ethereum blockchain.
It is important to note that no Azimuth keys are stored anywhere within your
ship's [pier](/docs/glossary/pier) - Ethereum and Urbit ID are entirely
separate entities from Urbit itself, and so if you lose access to your Azimuth
private keys, there is no way to retrieve them from your ship.
For more information on the usage of these keys and the associated proxies, see
the [Azimuth documentation](/docs/azimuth/azimuth).
#### Networking keys
All communications in Urbit over the [Ames](/docs/glossary/ames) network
are end-to-end encrypted, and thus your ship stores its own public/private pair
of _networking keys_ utilized for encryption and authentication. Networking keys
for all ship types are stored within the ship's [Jael](/docs/glossary/jael)
[vane](/docs/glossary/vane).
For planets, stars, and galaxies, your networking public key is configured on
the Ethereum blockchain using one of your Azimuth private keys - either the one
associated to the wallet which owns the ship, or the one which holds the
[management proxy](/docs/glossary/proxies). This is typically accomplished
with Bridge. Your networking private key (in the
[keyfile](/docs/glossary/keyfile)) is a necessary input for the initial boot
sequence of your ship (sometimes called its `%dawn`), and this is also provided
by Bridge.
Public keys for moons are not tracked on the blockchain; instead they are
tracked by their parent ship. The networking private key is generated by the
parent ship and injected into the `%dawn` sequence for the moon, and its public
key is stored by the parent in its Jael vane. Networking keys for moons may be
changed by the parent ship, but not by the moon itself.
For comets, their 128-bit `@p` name is the hash of their networking public key, and
the "mining" process to generate a comet consists of guessing private keys until
the associated public key hashes to a `@p` whose last two bytes
match one of the stars on the comet sponsors list downloaded during boot.
Thus, comets cannot change their networking keys - to get a new private
networking key, a new comet must be generated. For a comet to perform an initial
handshake with another ship, it utilizes its networking private key stored in
Jael to sign an unencrypted attestation packet to verify that it is the owner of
the associated public key. Because of this, it is currently impossible for a
ship to initiate communication with a comet - the comet must always be the
initiator. This means that ultimately two comets cannot communicate with one
another unless they have somehow verified each others' public keys via some
other method. This is merely a technical limitation imposed by the design of the
system, not an intentional handicapping of comet abilities. A workaround to this
limitation is slated to be implemented as of May 2021.
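The "mining" loop described above can be sketched as a simple search. The Python below is a toy model under loud assumptions: SHA-256 stands in for both key derivation and name hashing, the sponsor list is made up, and none of the names (`SPONSOR_STARS`, `derive_public_key`, `comet_name`, `mine_comet`) reflect Urbit's actual implementation. It only shows why a comet's name is tied to its key.

```python
import hashlib

SPONSOR_STARS = {0x0100, 0x0200, 0x0300, 0x0400}  # hypothetical 16-bit star addresses


def derive_public_key(private_key: int) -> bytes:
    """Stand-in for real key derivation; NOT actual public-key cryptography."""
    return hashlib.sha256(private_key.to_bytes(16, "big")).digest()


def comet_name(public_key: bytes) -> int:
    """Toy 128-bit name: the first 16 bytes of the hash of the public key."""
    return int.from_bytes(hashlib.sha256(public_key).digest()[:16], "big")


def mine_comet(start: int = 0, max_tries: int = 10_000_000):
    """Guess private keys until the name's last two bytes match a sponsor."""
    for private_key in range(start, start + max_tries):
        name = comet_name(derive_public_key(private_key))
        if name & 0xFFFF in SPONSOR_STARS:
            return private_key, name
    raise RuntimeError("no matching key found within max_tries")


key, name = mine_comet()
print(hex(name & 0xFFFF))
```

Since the name is computed from the key, changing the key would change the name, which is the reason given above for why comets cannot rotate their networking keys.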
### System components
[Ames](/docs/arvo/ames/ames) is Arvo's networking vane. All packets sent by
Ames are encrypted utilizing a cryptosuite found in `zuse`. The only exceptions
to this are comet self-attestation packets utilized to transmit authentication
of ownership of the private networking key associated to their public key. Ames
is responsible for encryption, decryption, and authentication of all packets. By
default, this utilizes AES symmetric key encryption, whose shared secret key is
derived by elliptic curve Diffie-Hellman key exchange of the ships' networking keys.
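To illustrate how two parties can arrive at the same symmetric key from each other's public keys, here is a toy Diffie-Hellman exchange in Python. It deliberately swaps in classic finite-field Diffie-Hellman with small illustrative parameters rather than the elliptic curve variant Ames uses, and it stops at deriving the key rather than performing AES encryption. The parameters `P` and `G` and the function names are our own assumptions.

```python
import hashlib

P = 2**127 - 1  # a Mersenne prime; toy parameter, far too weak for real use
G = 3           # arbitrary base chosen for illustration


def public_from_private(private: int) -> int:
    """Public value that may be shared openly."""
    return pow(G, private, P)


def shared_key(my_private: int, their_public: int) -> bytes:
    """Derive a 32-byte symmetric key from my private and their public value."""
    secret = pow(their_public, my_private, P)
    return hashlib.sha256(secret.to_bytes(16, "big")).digest()


alice_private, bob_private = 123_456_789, 987_654_321
alice_key = shared_key(alice_private, public_from_private(bob_private))
bob_key = shared_key(bob_private, public_from_private(alice_private))
print(alice_key == bob_key)
```

Each side combines its own private value with the other's public value, so the key itself never has to travel over the network.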
[Jael](/docs/arvo/jael/jael-api) is primarily utilized for the safe storage
of private networking keys and retrieval of public networking keys utilized by
Ames. The Jael vanes of planets, stars, and galaxies are responsible for
distributing the public keys of their moons (ultimately via Ames).
`zuse` is part of the standard library. It contains cryptographic functions
which are utilized by Ames. All cryptographic primitives are
[jetted](/docs/vere/jetting) in Vere with standard vetted implementations of
cryptographic libraries.
[Vere](/docs/vere/) is Urbit's Nock runtime system, written in C. All
cryptographic functions implemented in Hoon are hinted to the interpreter,
causing it to utilize the [jet system](/docs/vere/jetting) to run standard vetted cryptographic
libraries.
[Azimuth](/docs/azimuth/) is an Ethereum-based public key
infrastructure utilized by Urbit. `azimuth-tracker` obtains networking public
keys for planets, stars, and galaxies from this store, which are then stored in
Jael and utilized by Ames for end-to-end encrypted communication.
### Additional documentation
The following pages contain more detailed information about the cryptography
utilized by each of the system components.
- [Ames](/docs/arvo/ames/cryptography)
- [Zuse](/docs/arvo/reference/cryptography)
- [Vere](/docs/vere/cryptography)

28
content/overview/hoon.md Normal file
View File

@ -0,0 +1,28 @@
+++
title = "Hoon"
weight = 20
template = "doc.html"
+++
Hoon is a strictly typed functional programming language that compiles itself
to Nock and is designed to support higher-order functional programming without
requiring knowledge of category theory or other advanced mathematics. Haskell
is fun but it isn't for everybody.
Hoon aspires to a concrete, imperative feel. To discourage the creation of
write-only code, Hoon forbids user-level macros and uses ASCII digraphs instead
of keywords. The type system infers only forward and does not use unification,
but is not much weaker than Haskell's. The compiler and inference engine is
about 3000 lines of Hoon.
## Further Reading
* [Hoon Overview](/docs/hoon/overview): Learn why we created a new language
to build Urbit in.
* [Hoon School](/docs/hoon/hoon-school/): A collection of tutorials
designed to teach you the Hoon language.
* [Guides](/docs/hoon/guides/): Guides to specific Hoon tasks,
including testing, command-line interface apps, and parsing.
* [Reference](/docs/hoon/reference/): Reference material primarily
intended for Hoon developers with some experience.

34
content/overview/nock.md Normal file
View File

@ -0,0 +1,34 @@
+++
title = "Nock"
weight = 30
template = "doc.html"
+++
Nock is a low-level [homoiconic](https://en.wikipedia.org/wiki/Homoiconicity)
combinator language. It's so simple that its [specification](/docs/nock/definition)
fits on a t-shirt. In some ways Nock resembles a nano-Lisp but its ambitions
are more narrow. Most Lisps are one-layer: a practical language is to be
created by extending a theoretically simple interpreter. The abstraction is
simple and the implementation is practical. Unfortunately it's far more difficult
to enforce both simplicity and practicality in an actual Lisp codebase. Hoon
and Nock are two layers: Hoon, the practical layer, compiles itself to Nock, the
simple layer. Your urbit runs in Vere, which includes a Nock interpreter, so it
can upgrade Hoon over the network without downtime.
The Nock data model is quite simple. Every piece of data is a 'noun'. A [noun](/docs/glossary/noun/)
is an [atom](/docs/glossary/atom/) or a cell. An atom is any unsigned integer. A cell is an ordered
pair of nouns. Nouns are acyclic and expose no pointer equality test.
## Further Reading
* [Nock Definition](/docs/nock/definition): The Nock specification.
* [An explanation of Nock](/docs/nock/explanation): A comprehensive
walkthrough of the Nock spec.
* [Nock by hand](/docs/nock/example): Learn Nock by example.
* [Nock Implementations](/docs/nock/implementations): The many ways that
Nock has been implemented.
* [Nock for Everyday Coders](https://blog.timlucmiptev.space/part1.html): A
comprehensive guide to Nock by community member `~timluc-miptev`.

content/overview/vere.md Normal file

@ -0,0 +1,45 @@
+++
title = "Vere"
weight = 40
template = "doc.html"
+++
Vere is the Nock runtime environment and Urbit VM. It's written in C, runs on
Unix, and is the intermediate layer between your urbit and Unix. As noted
earlier, Unix system calls are made by Vere, not Arvo; Vere must also encode
and deliver relevant external events to Arvo. Vere is also responsible for
implementing jets and maintaining the persistent state of each urbit.
In principle, Vere keeps a comprehensive log of every event from the time you
initially booted your urbit. What happens if the physical machine loses power
and your urbit's state is 'lost' from memory? When your urbit restarts it will
replay its entire event history and totally recover its latest state from
scratch.
In practice, event logs become large and unwieldy over time. Periodically a
snapshot of the permanent state is taken and the logs are pruned. You're still
able to rebuild your state in case of power outage, down to the last keystroke.
Vere is not essential to the Urbit stack; one can imagine using Urbit on a
hypervisor, or even bare metal. One member of the community is even working on
an independent implementation of Urbit using Graal/Truffle on the JVM.
The Urbit stack (compiler, standard library, kernel, modules, and applications,
but excluding Vere) is about 30,000 lines of Hoon. Urbit is patent-free and MIT
licensed.
## Further Reading
* [C runtime system](/docs/vere/runtime): The Urbit interpreter is built on
a Nock runtime system written in C, `u3`. This section is a relatively complete
description.
* [c3: C in Urbit](/docs/vere/c): Under `u3` is the simple `c3` layer, which
is just how we write C in Urbit.
* [u3: Land of nouns](/docs/vere/nouns): The division between `c3` and `u3`
is that you could theoretically imagine using `c3` as just a generic C
environment. Anything to do with nouns is in `u3`.
* [u3: API overview by prefix](/docs/vere/api): A walkthrough of each of the
`u3` modules.
* [How to write a jet](/docs/vere/jetting): A jetting guide for new Urbit
developers.


@ -0,0 +1,29 @@
+++
title = "Distribution"
weight = 900
sort_by = "weight"
template = "sections/docs/chapters.html"
insert_anchor_links = "right"
+++
Developer documentation for desk/app distribution and management.
## [Overview](/docs/userspace/dist/dist)
An overview of desk/app distribution and management.
## [Docket Files](/docs/userspace/dist/docket)
Documentation of `desk.docket` files.
## [Glob](/docs/userspace/dist/glob)
Documentation of `glob`s (client bundles).
## [Guide](/docs/userspace/dist/guide)
A walkthrough of creating, installing and publishing a new desk with a tile and front-end.
## [Dojo Tools](/docs/userspace/dist/tools)
Documentation of useful generators for managing and distributing desks.


@ -0,0 +1,105 @@
+++
title = "Overview"
weight = 1
template = "doc.html"
+++
Urbit allows peer-to-peer distribution and installation of applications. A user can click on a link to an app hosted by another ship to install that app. The homescreen interface lets users manage their installed apps and launch their interfaces in new tabs.
This document describes the architecture of Urbit's app distribution system. For a walkthrough of creating and distributing an app, see the [`Guide`](/docs/userspace/dist/guide) document.
## Architecture
The unit of software distribution is the desk. A desk is a lot like a git branch, but full of typed files, and designed to work with the Arvo kernel. In addition to files full of source code, a desk specifies the Kelvin version of the kernel that it's expecting to interact with, and it includes a manifest file describing which of the Gall agents it defines should be run by default.
Every desk is self-contained: the result of validating its files and building its agents is a pure function of its contents and the code in the specified Kelvin version of the kernel. A desk on one ship will build into the same files and programs, noun for noun, as on any other ship.
This symmetry is broken during agent installation, which can emit effects that might trigger other actions that cause the Arvo event to fail and be rolled back. An agent can ask the kernel to kill the Arvo event by using the new `%pyre` effect. Best practice, though, is for no desk to have a hard dependency on another desk.
If you're publishing an app that expects another app to be installed in order to function, the best practice is to check in `+on-init` for the presence of that other app's desk. If it's not installed, your app should display a message to the user and a link to the app that they should install in order to support your app. App-install links are well-supported in Tlon's Landscape, a suite of user-facing applications developed by Tlon.
For the moment, every live desk must have the same Kelvin version as the kernel. Future kernels that know how to maintain backward compatibility with older kernels will also allow older desks, but no commitment has yet been made to maintain backward compatibility across kernel versions, so for the time being, app developers should expect to update their apps accordingly.
Each desk defines its own filetypes (called `mark`s), in its `/mar` folder. There are no longer shared system marks that all userspace code knows, nor common libraries in `/lib` or `/sur` — each desk is completely self-contained.
It's common for a desk to want to use files that were originally defined in another desk, so that it can interact with agents on that desk. The convention is that if I'm publishing an app that I expect other devs to build client apps for (on other desks), I split out a "dev desk" containing just the external interface to my desk. Typically, both my app desk and clients' app desks will sync from this dev desk.
Tlon has done this internally. Most desks will want to sync the `%base-dev` desk so they can easily interact with the kernel and system apps in the `%base` desk. The `%base` desk includes agents such as `%dojo` and `%hood` (with Kiln as an informal sub-agent of `%hood` that manages desk installations).
A "landscape app", i.e. a desk that defines a tile that the user can launch from the home screen, should also sync from the `%garden-dev` desk. This desk includes the versioned `%docket-0` mark, which the app needs in order to include a `/desk/docket-0` file.
The `%docket` agent reads the `/desk/docket-0` file to display an app tile on the home screen and hook up other front-end functionality, such as downloading the app's client bundle ([glob](/docs/userspace/dist/glob)). Docket is a new agent, in the `%garden` desk, that manages app installations. Docket serves the home screen, downloads client bundles, and communicates with Kiln to configure the apps on your system.
For those of you familiar with the old `%glob` and `%file-server` agents, they have now been replaced by Docket.
### Anatomy of a Desk
Desks still contain helper files in `/lib` and `/sur`, generators in `/gen`, marks in `/mar`, threads in `/ted`, tests in `/tests`, and agents in `/app`. In addition, desks now also contain these files:
```
/sys/kelvin :: Kernel kelvin, e.g. [%zuse 418]
/desk/bill :: (optional, read by Kiln) list of agents to run
/desk/docket-0 :: (optional, read by Docket) app metadata
/desk/ship :: (optional, read by Docket) ship of original desk publisher, e.g. ~zod
```
Only the `%base` desk contains a `/sys` directory with the standard library, zuse, Arvo code and vanes. All other desks simply specify the kernel version with which they're compatible in the `/sys/kelvin` file.
### Updates
The main idea is that an app should only ever be run by a kernel that knows how to run it. For now, since there are not yet kernels that know how to run apps designed for an older kernel, this constraint boils down to ensuring that all live desks have the same kernel Kelvin version as the running kernel itself.
To upgrade your kernel to a new version, you need to make a commit to the `%base` desk. Clay will then check if any files in `/sys` changed in this commit. If so, Clay sends the new commit to Arvo, which decides if it needs to upgrade (or upgrade parts of itself, such as a vane). After Arvo upgrades (or decides not to), it wakes up Clay, which finalizes the commit to the `%base` desk and notifies the rest of the system.
That's the basic flow for upgrading the kernel. However, some kernel updates also change the Kelvin version. If the user has also installed apps, those apps are designed to work with the old Kelvin, so they won't work with the new Kelvin — at least, not at the commit that's currently running.
Kiln, part of the system app `%hood` in the `%base` desk, manages desk installations, including the `%base` desk. It can install an app in two ways: a local install, sourced from a desk on the user's machine, or a remote install, which downloads a desk from another ship. Both are performed using the same generator, `|install`.
A remote install syncs an upstream desk into a local desk by performing a merge into the local desk whenever the upstream desk changes.
The Kelvin update problem is especially thorny for remote installs, which are the most common. By default, a planet has its `%base` desk synced from its sponsor's `%kids` desk, and it will typically have app desks synced from their publishers' ships.
Kiln listens (through Clay, which knows how to query remote Clays) for new commits on a remote-installed app's upstream ship and desk. When Clay hears about a new commit, it downloads the files and stores them as a "foreign desk", without validating or building them. It also tells Kiln.
When Kiln learns of these new foreign files, it reads the new `/sys/kelvin`. If it's the same as the live kernel's, Kiln asks Clay to merge the new files into the local desk where the app is installed. If the new foreign Kelvin is further ahead (closer to zero) than the kernel's, Kiln does not merge it into the local desk yet. Instead, it enqueues it.
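The choice Kiln makes for each new foreign commit can be modeled in a few lines. The following Python sketch is illustrative only: the function name and return labels are hypothetical, not Kiln's actual Hoon code. This document does not say what happens to a commit at an older Kelvin (a larger number), so the sketch reports that case as unspecified.

```python
def classify_foreign_commit(kernel_kelvin: int, commit_kelvin: int) -> str:
    """Model of the merge-or-enqueue choice (hypothetical helper, not Kiln's API).

    Kelvin versions count down toward zero, so a smaller number is newer.
    """
    if commit_kelvin == kernel_kelvin:
        return "merge"        # same Kelvin as the live kernel: merge now
    if commit_kelvin < kernel_kelvin:
        return "enqueue"      # further ahead (closer to zero): hold for later
    return "unspecified"      # older Kelvin: not covered by this document


print(classify_foreign_commit(418, 418))  # merge
print(classify_foreign_commit(418, 417))  # enqueue
```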
Later, when Kiln hears of a new kernel Kelvin version on the upstream `%base` desk, it checks whether all the other live desks have a commit enqueued at that Kelvin. If so, it updates `%base` and then all the other desks, in one big Arvo event. This brings the system from fully at the old Kelvin, to fully at the new Kelvin, atomically — if any part of that fails, the Arvo event will abort and be rolled back, leaving the system back fully at the old Kelvin.
If not all live desks have an enqueued commit at the new kernel Kelvin, then Kiln notifies its clients that a kernel update is blocked on a set of desks. Docket, listening to Kiln, presents the user with a choice: either dismiss the notification and keep the old kernel, or suspend the blocking desks and apply the kernel update.
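As a rough model of the check described above, the sketch below decides whether a kernel update can proceed or is blocked. It is written in Python purely for illustration. The function name, the `enqueued` mapping and the result labels are assumptions made for this example and do not correspond to Kiln's real data structures.

```python
def plan_kernel_update(new_kelvin, enqueued):
    """Decide whether a kernel update to `new_kelvin` can go ahead.

    `enqueued` (hypothetical) maps each live, non-%base desk to the set of
    Kelvin versions at which it has a commit enqueued.
    """
    blocking = sorted(
        desk for desk, kelvins in enqueued.items() if new_kelvin not in kelvins
    )
    if blocking:
        # The user must either keep the old kernel or suspend these desks.
        return ("blocked", blocking)
    # Every live desk is ready: update %base and all desks in one event.
    return ("update-all", sorted(enqueued))


desks = {"garden": {417}, "bitcoin": set()}
print(plan_kernel_update(417, desks))  # ('blocked', ['bitcoin'])

# Suspending the blocking desk removes it from the set of live desks.
del desks["bitcoin"]
print(plan_kernel_update(417, desks))  # ('update-all', ['garden'])
```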
Suspending a desk turns off all its agents, saving their states in Gall. If there are no agents running from a desk, that desk doesn't force the kernel to be at the same Kelvin version. It's just inert data. If a later upstream update allows this desk to be run with a newer kernel, the user can revive the desk, and the system will migrate the old state into the new agent.
### Managing Apps and Desks in Kiln
Turning agents on and off is managed declaratively, rather than imperatively. Kiln maintains state for each desk about which agents should be forced on and which should be forced off. The set of running agents is now a function of the desk's `/desk/bill` manifest file and that user configuration state in Kiln. This means starting or stopping an agent is idempotent, both in Kiln and Gall.
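To make the declarative model concrete, here is a small Python sketch. The exact rule Kiln uses to combine the manifest with the user's settings is not spelled out here, so the formula below (manifest plus forced-on agents, minus forced-off agents) is an assumption, and all names are hypothetical.

```python
def running_agents(bill, forced_on, forced_off):
    """Assumed rule: agents in the bill or forced on, except those forced off."""
    return (set(bill) | set(forced_on)) - set(forced_off)


def force_off(forced_on, forced_off, agent):
    """Record that `agent` should be off; repeating the call changes nothing."""
    return forced_on - {agent}, forced_off | {agent}


bill = ["some-app", "another"]
on, off = force_off(set(), set(), "some-app")
again_on, again_off = force_off(on, off, "some-app")

print((on, off) == (again_on, again_off))        # True
print(sorted(running_agents(bill, on, off)))     # ['another']
```

Because the running set is recomputed from state rather than toggled, issuing the same request twice leaves the system unchanged, which is what idempotent means here.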
For details of the generators for managing desks and agents in Kiln, see the [`Dojo Tools`](/docs/userspace/dist/tools) document.
### Landscape apps
It's possible to create and distribute desks without a front-end, but typically you'll want to distribute an app with a user interface. Such an app has two primary components:
- Gall agents and associated backend code which reside in the desk.
- A client bundle called a [`glob`](/docs/userspace/dist/glob), which contains the front-end files like HTML, CSS, JS, images, and so forth.
When a desk is installed, Kiln will start up the Gall agents in the `desk.bill` manifest, and the `%docket` agent will read the `desk.docket-0` file. This file will specify the name of the app, various metadata, the appearance of the app's tile in the homescreen, and the source of the `glob` so it can serve the interface. For more details of the docket file, see the [Docket File](/docs/userspace/dist/docket) document.
### Globs
The reason to separate a glob from Clay is that Clay is a revision-controlled system. As in most revision control systems, deleting data from it is nontrivial because newer commits reference older commits. If Clay grows the ability to delete data, perhaps glob data could be moved into it. Until then, since client bundles tend to be updated frequently, it's best practice not to put your glob in your app host ship's Clay at all, so that it doesn't fill up your ship's "loom" memory arena.
If the glob is to be served over Ames, there is an HTTP-based glob uploader that allows you to use a web form to upload a folder into your ship, which will convert the folder to a glob and link to it in your app desk's docket manifest file.
Note that serving a glob over Ames might increase the install time for your app, since Ames is currently pretty slow compared to HTTP — but being able to serve a glob from your ship allows you to serve your whole app, both server-side and client-side, without setting up a CDN or any other external web tooling. Your ship can do it all on its own.
For further details of globs, see the [Glob](/docs/userspace/dist/glob) document.
## Sections
- [Glob](/docs/userspace/dist/glob) - Documentation of `glob`s (client bundles).
- [Docket Files](/docs/userspace/dist/docket) - Documentation of docket files.
- [Guide](/docs/userspace/dist/guide) - A walkthrough of creating, installing and publishing a new desk with a tile and front-end.
- [Dojo Tools](/docs/userspace/dist/tools) - Documentation of useful generators for managing and distributing desks.


@ -0,0 +1,296 @@
+++
title = "Docket File"
weight = 3
template = "doc.html"
+++
The docket file sets various options for desks with a tile and (usually) a browser-based front-end of some kind. Mainly it configures the appearance of an app's tile, the source of its [glob](/docs/userspace/dist/glob), and some additional metadata.
The docket file is read by the `%docket` agent when a desk is `|install`ed. The `%docket` agent will fetch the glob if applicable and create the tile as specified on the homescreen. If the desk is published with `:treaty|publish`, the information specified in the docket file will also be displayed for others who are browsing apps to install on your ship.
The docket file is _optional_ in the general case. If it is omitted, however, the app cannot have a tile in the homescreen, nor can it be published with the `%treaty` agent, so others will not be able to browse for it from their homescreens.
The docket file must be named `desk.docket-0`. The `%docket` `mark` is versioned to facilitate changes down the line, so the `-0` suffix may be incremented in the future.
The file must contain a `hoon` list with a series of clauses. The clauses are defined in `/sur/docket.hoon` as:
```hoon
+$  clause
  $%  [%title title=@t]
      [%info info=@t]
      [%color color=@ux]
      [%glob-http url=cord hash=@uvH]
      [%glob-ames =ship hash=@uvH]
      [%image =url]
      [%site =path]
      [%base base=term]
      [%version =version]
      [%website website=url]
      [%license license=cord]
  ==
```
The `%image` clause is optional. Exactly one of `%site`, `%glob-http` or `%glob-ames` must be present. All other clauses are mandatory.
Here's what a typical docket file might look like:
```hoon
:~
  title+'Foo'
  info+'An app that does a thing.'
  color+0xf9.8e40
  glob-ames+[~zod 0v0]
  image+'https://example.com/tile.svg'
  base+'foo'
  version+[0 0 1]
  license+'MIT'
  website+'https://example.com'
==
```
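The rules about which clauses are required can be sketched as a simple check. This is a Python illustration only, not how `%docket` validates the file. The function name and the error strings are made up for the example, and it looks only at clause tags, not at their values.

```python
GLOB_SOURCES = {"site", "glob-http", "glob-ames"}
REQUIRED = {"title", "info", "color", "base", "version", "website", "license"}
OPTIONAL = {"image"}


def check_clauses(tags):
    """Return a list of problems for a list of docket clause tags."""
    problems = []
    present = set(tags)
    for tag in sorted(REQUIRED - present):
        problems.append(f"missing required clause %{tag}")
    sources = [tag for tag in tags if tag in GLOB_SOURCES]
    if len(sources) != 1:
        problems.append(
            "expected exactly one of %site, %glob-http, %glob-ames; "
            f"found {len(sources)}"
        )
    for tag in sorted(present - REQUIRED - OPTIONAL - GLOB_SOURCES):
        problems.append(f"unknown clause %{tag}")
    return problems


typical = ["title", "info", "color", "glob-ames", "image",
           "base", "version", "license", "website"]
print(check_clauses(typical))  # []
```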
Details of each clause and their purpose are described below.
---
## `%title`
_required_
The `%title` field specifies the name of the app. The title will be the name shown on the app's tile, as well as the name of the app when others search for it.
#### Type
```hoon
[%title title=@t]
```
#### Example
```hoon
title+'Bitcoin'
```
---
## `%info`
_required_
The `%info` field is a brief summary of what the app does. It will be shown as the subtitle in _App Info_.
#### Type
```hoon
[%info info=@t]
```
#### Example
```hoon
info+'A Bitcoin Wallet that lets you send and receive Bitcoin directly to and from other Urbit users'
```
---
## `%color`
_required_
The `%color` field specifies the color of the app tile as an `@ux`-formatted hex value.
#### Type
```hoon
[%color color=@ux]
```
#### Example
```hoon
color+0xf9.8e40
```
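An `@ux` literal groups its hex digits with dots, so it does not look exactly like a CSS color. As a hedged illustration, the Python helper below converts one form to the other. The function name is hypothetical, and it assumes the value is a 24-bit RGB color.

```python
def ux_to_css(literal: str) -> str:
    """Convert a Hoon @ux literal such as '0xf9.8e40' to a CSS hex color."""
    if not literal.startswith("0x"):
        raise ValueError("expected a @ux literal starting with 0x")
    # Dots are only digit separators in @ux syntax, so drop them.
    value = int(literal[2:].replace(".", ""), 16)
    if value > 0xFFFFFF:
        raise ValueError("value does not fit in a 24-bit RGB color")
    return f"#{value:06x}"


print(ux_to_css("0xf9.8e40"))  # #f98e40
```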
---
## `%glob-http`
_exactly one of either this, [glob-ames](#glob-ames) or [site](#site) is required_
The `%glob-http` field specifies the URL and hash of the app's [glob](/docs/userspace/dist/glob) if it is distributed via HTTP.
#### Type
```hoon
[%glob-http url=cord hash=@uvH]
```
#### Example
```hoon
glob-http+['https://example.com/glob-0v1.s0me.h4sh.glob' 0v1.s0me.h4sh]
```
---
## `%glob-ames`
_exactly one of either this, [glob-http](#glob-http) or [site](#site) is required_
The `%glob-ames` field specifies the ship and hash of the app's [glob](/docs/userspace/dist/glob) if it is distributed from a ship over Ames. If the glob will be distributed from our ship, the hash can initially be `0v0` as it will be overwritten with the hash produced by the [Globulator](/docs/userspace/dist/glob#globulator).
#### Type
```hoon
[%glob-ames =ship hash=@uvH]
```
#### Example
```hoon
glob-ames+[~zod 0v0]
```
---
## `%site`
_exactly one of either this, [glob-ames](#glob-ames) or [glob-http](#glob-http) is required_
It's possible for an app to handle HTTP requests from the client directly rather than with a separate [glob](/docs/userspace/dist/glob). In that case, the `%site` field specifies the `path` of the Eyre endpoint the app will bind. If `%site` is used, clicking the app's tile will simply open a new tab with a GET request to the specified Eyre endpoint.
For more information on direct HTTP handling with a Gall agent or generator, see the [Eyre Internal API Reference](/docs/arvo/eyre/tasks) documentation.
#### Type
```hoon
[%site =path]
```
#### Example
```hoon
site+/foo/bar
```
---
## `%image`
_optional_
The `%image` field specifies the URL of an image to be displayed on the app's tile. This field is optional and may be omitted entirely.
The given image will be displayed on top of the [color](#color)ed tile. The app [title](#title) (and hamburger menu upon hover) will be displayed on top of the given image, in small rounded boxes with the same background color as the main tile. The given image will be displayed at 100% of the width of the tile. The image's corners will be hidden by the rounded corners of the tile, so the image itself needn't have rounded corners. The tile is a perfect square, so if the image should occupy the whole tile, it should also be a perfect square. If the image should be a smaller icon in the center of the tile (like the bitcoin tile), it should just have a square of transparent negative space around it.
It may be tempting to set the image URL as a root-relative path like `/apps/myapp/img/tile.svg` and bundle it in the glob. While this would work locally, it means the image would fail to load for those browsing apps to install. Therefore, the image should be hosted somewhere globally available.
#### Type
```hoon
[%image =url]
```
The `url` type is a simple `cord`:
```hoon
+$ url cord
```
#### Example
```hoon
image+'http://example.com/icon.svg'
```
---
## `%base`
_required_
The `%base` field specifies the base of the URL path of the glob resources. In the browser, the path will begin with `/apps`, then the specified base, then the rest of the path to the particular glob resource like `http://localhost:8080/apps/my-base/index.html`. Note the `path`s of the glob contents themselves should not include this base element.
#### Type
```hoon
[%base base=term]
```
#### Example
```hoon
base+'bitcoin'
```
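To illustrate how the base fits into the final URL, here is a minimal Python sketch. The helper name and its parameters are assumptions for the example. It only assembles the string pattern described above and does not talk to a ship.

```python
def glob_resource_url(host: str, base: str, resource: str) -> str:
    """Build the browser URL for a glob resource: host, /apps, base, path."""
    return f"{host.rstrip('/')}/apps/{base}/{resource.lstrip('/')}"


print(glob_resource_url("http://localhost:8080", "my-base", "index.html"))
# http://localhost:8080/apps/my-base/index.html
```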
---
## `%version`
_required_
The `%version` field specifies the current version of the app. It's a triple of `@ud` numbers representing the major version, minor version and patch version. In the client, `[1 2 3]` will be rendered as `1.2.3`. You would typically increase the appropriate number each time you publish a change to the app.
#### Type
```hoon
[%version =version]
```
The `version` type is just a triple of three numbers:
```hoon
+$  version
  [major=@ud minor=@ud patch=@ud]
```
#### Example
```hoon
version+[0 0 1]
```
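The rendering of a version triple in the client can be mimicked with a one-line Python function. This is only a sketch of the `[1 2 3]` to `1.2.3` mapping. The function name is hypothetical, and the check for non-negative integers stands in for the `@ud` type.

```python
def render_version(version):
    """Render a (major, minor, patch) triple the way the client displays it."""
    major, minor, patch = version
    for part in (major, minor, patch):
        if not isinstance(part, int) or part < 0:
            raise ValueError("each part must be a non-negative integer")
    return f"{major}.{minor}.{patch}"


print(render_version((1, 2, 3)))  # 1.2.3
```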
---
## `%website`
_required_
The `%website` field is for a link to a relevant website. This might be a link to the app's github repo, company website, or whatever is appropriate. This field will be displayed when people are browsing apps to install.
#### Type
```hoon
[%website website=url]
```
The `url` type is a simple `cord`:
```hoon
+$ url cord
```
#### Example
```hoon
website+'https://example.com'
```
---
## `%license`
_required_
The `%license` field specifies the license for the app in question. It would typically be a short name like `MIT`, `GPLv2`, or what have you. The field just takes a `cord` so any license can be specified.
#### Type
```hoon
[%license license=cord]
```
#### Example
```hoon
license+'MIT'
```


@ -0,0 +1,102 @@
+++
title = "Glob"
weight = 4
template = "doc.html"
+++
A `glob` contains the client bundle—client-side resources like HTML, JS, and CSS files—for a landscape app distributed in a desk. Globs are managed separately from other files in desks because they often contain large files that frequently change, and would therefore bloat a ship's state if they were subject to Clay's revision control mechanisms.
The hash and source of an app's glob is defined in a desk's [docket file](/docs/userspace/dist/docket). The `%docket` agent reads the docket file, obtains the glob from the specified source, and makes its contents available to the browser client. On a desk publisher's ship, if the glob is to be distributed over Ames, the glob is also made available to desk subscribers.
## The `glob` type
The `%docket` agent defines the type of a `glob` as:
```hoon
+$ glob (map path mime)
```
Given the following file hierarchy:
```
foo
├── css
│   └── style.css
├── img
│   ├── favicon.png
│   ├── foo.svg
│   └── bar.svg
├── index.html
└── js
    └── baz.js
```
...its `$glob` form would look like:
```hoon
{ [p=/img/foo/svg q=[p=/image/svg+xml q=[p=0 q=0]]]
[p=/css/style/css q=[p=/text/css q=[p=0 q=0]]]
[p=/img/favicon/png q=[p=/image/png q=[p=0 q=0]]]
[p=/js/baz/js q=[p=/application/javascript q=[p=0 q=0]]]
[p=/img/bar/svg q=[p=/image/svg+xml q=[p=0 q=0]]]
[p=/index/html q=[p=/text/html q=[p=0 q=0]]]
}
```
Note: The mime byte-length and data are 0 in this example because it was made with empty dummy files.
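The keys in the map above follow a simple pattern: the top-level folder is dropped and the file extension becomes the last path element. The Python sketch below reproduces that pattern for illustration. The function name is hypothetical, and it assumes each file name has a single extension.

```python
from pathlib import PurePosixPath


def glob_path(relative: str) -> str:
    """Map a file path relative to the globbed folder to its glob key."""
    path = PurePosixPath(relative)
    extension = path.suffix.lstrip(".")
    parts = list(path.parent.parts) + [path.stem]
    if extension:
        parts.append(extension)  # 'foo.svg' contributes 'foo' then 'svg'
    return "/" + "/".join(parts)


print(glob_path("img/foo.svg"))  # /img/foo/svg
print(glob_path("index.html"))   # /index/html
```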
A glob may contain any number of files and folders in any kind of hierarchy. The one requirement is that an `index.html` file is present in its root. The `index.html` file is automatically served when the app is opened in the browser, and the app will fail to load if it is missing.
In addition to the `$glob` type, a glob can also be output to Unix with a `.glob` file extension for distribution over HTTP. This file simply contains a [`jam`](/docs/hoon/reference/stdlib/2p#jam)med `$glob` structure.
## Docket file clause
The `desk.docket-0` file must include exactly one of the following clauses:
#### `site+/some/path`
If an app binds an Eyre endpoint and handles HTTP directly, for example with a [`%connect` task:eyre](/docs/arvo/eyre/tasks#connect), the `%site` clause is used, specifying the Eyre binding. In this case a glob is omitted entirely.
#### `glob-ames+[~zod 0vs0me.h4sh]`
If the glob is to be distributed over Ames, the `%glob-ames` clause is used, with a cell of the `ship` which has the glob and the `@uv` hash of the glob. If it's our ship, the hash can just be `0v0` and the glob can instead be created with the [Globulator](#globulator).
#### `glob-http+['https://example.com/some.glob' 0vs0me.h4sh]`
If the glob is to be distributed over HTTP, for example from an s3 instance, the `%glob-http` clause is used. It takes a cell of a `cord` with the URL serving the glob and the `@uv` hash of the glob.
## Making a glob
There are a couple of different methods depending on whether the glob will be distributed over HTTP or Ames.
### Globulator
For globs distributed over Ames from our ship, the client bundle can be uploaded directly with `%docket`'s Globulator tool, which is available in the browser at `http[s]://[host]/docket/upload`. It looks like this:
![Globulator](https://media.urbit.org/docs/userspace/dist/globulator.png)
Simply select the target desk, select the folder to be globulated, and hit `glob!`.
Note the target desk must have been `|install`ed before uploading its glob. When the desk is installed, `%docket` will print `docket: awaiting manual glob for %desk-name desk` in the terminal and wait for the upload. The hash in the `%glob-ames` clause of the docket file will be overwritten by the hash of the new glob. As a result, there's no need to specify the actual glob hash in `desk.docket-0`; you can just use any `@uv` like `0v0`. Once uploaded, the desk can then be published with `:treaty|publish %desk-name` and the glob will become available for download by subscribers.
### `-make-glob`
There's a different process for globs to be distributed over HTTP from a webserver rather than over Ames from a ship. For this purpose, the `%garden` desk includes a `%make-glob` thread. The thread takes a folder in a desk and produces a glob of the files it contains, which it then saves to Unix in a [`jam`](/docs/hoon/reference/stdlib/2p#jam) file with a `.glob` extension.
To begin, you'll need to spin up a ship (typically a fake ship) and `|mount` a desk for which to add the files. In order for Clay to add the files, the desk must contain `mark` files in its `/mar` directory for all file extensions your folder contains. The `%garden` desk is a good bet because it includes `mark` files for `.js`, `.html`, `.png`, `.svg`, `.woff2` and a couple of others. If there's no desk with a mark for a particular file type you want included in your glob, you may need to add a new mark file. A very rudimentary mark file like the `png.hoon` mark will suffice.
With the desk mounted, add the folder to be globbed to the root of the desk in Unix. It's important that it's in the root because the `%make-glob` thread will only strip the first level of the folder hierarchy.
Next, `|commit` the files to the desk, then run `-garden!make-glob %the-desk /folder-name`, where `%the-desk` is the desk containing the folder to be globbed and `/folder-name` is its name.
On Unix, if you look in `/path/to/pier/.urb/put`, you'll now see a file which looks like:
```
glob-0v1.7vpqa.r8pn5.6t0s1.rhc7r.5e9vo.glob
```
This file can be uploaded to your webserver and the `desk.docket-0` file of the desk you're publishing can be updated with:
```hoon
glob-http+['https://s3.example.com/glob-0v1.7vpqa.r8pn5.6t0s1.rhc7r.5e9vo.glob' 0v1.7vpqa.r8pn5.6t0s1.rhc7r.5e9vo]
```
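Since the hash is embedded in the file name, the clause can be derived mechanically from it. The following Python sketch shows one way to do that. The function name and the base URL argument are assumptions for the example, and it assumes the file name has the `glob-<hash>.glob` form shown above.

```python
def glob_http_clause(base_url: str, filename: str) -> str:
    """Build a %glob-http docket clause from a glob file name."""
    prefix, suffix = "glob-", ".glob"
    if not (filename.startswith(prefix) and filename.endswith(suffix)):
        raise ValueError("expected a name of the form glob-<hash>.glob")
    # The hash is whatever sits between the prefix and the extension.
    glob_hash = filename[len(prefix):-len(suffix)]
    return f"glob-http+['{base_url.rstrip('/')}/{filename}' {glob_hash}]"


print(glob_http_clause(
    "https://s3.example.com",
    "glob-0v1.7vpqa.r8pn5.6t0s1.rhc7r.5e9vo.glob",
))
```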


@ -0,0 +1,282 @@
+++
title = "Guide"
weight = 2
template = "doc.html"
+++
In this document we'll walk through an example of creating and publishing a desk that others can install. We'll create a simple "Hello World!" front-end with a "Hello" tile to launch it. For simplicity, the desk won't include an actual Gall agent, but we'll note everything necessary if there were one.
## Create desk
To begin, we'll need to clone the [urbit git repo](https://github.com/urbit/urbit) from the Unix terminal:
```
[user@host ~]$ git clone https://github.com/urbit/urbit urbit-git
```
Once that's done, we can navigate to the `pkg` directory in our cloned repo:
```
[user@host ~]$ cd urbit-git/pkg
[user@host pkg]$ ls .
arvo btc-wallet garden grid interface npm webterm
base-dev docker-image garden-dev herb landscape symbolic-merge.sh
bitcoin ent ge-additions hs libaes_siv urbit
```
Each desk defines its own `mark`s, in its `/mar` folder. There are no longer shared system marks that all userspace code knows, nor common libraries in `/lib` or `/sur`. Each desk is completely self-contained. This means any new desk will need a number of base files.
To make the creation of a new desk easier, `base-dev` and `garden-dev` contain symlinks to all `/sur`, `/lib` and `/mar` files necessary for interacting with the `%base` and `%garden` desks respectively. These dev desks can be copied and merged with the included `symbolic-merge.sh` script.
Let's create a new `hello` desk:
```
[user@host pkg]$ mkdir hello
[user@host pkg]$ ./symbolic-merge.sh base-dev hello
[user@host pkg]$ ./symbolic-merge.sh garden-dev hello
[user@host pkg]$ cd hello
[user@host hello]$ ls
lib mar sur
```
### `sys.kelvin`
Our desk must include a `sys.kelvin` file which specifies the kernel version it's compatible with. Let's create that:
```
[user@host hello]$ echo "[%zuse 418]" > sys.kelvin
[user@host hello]$ cat sys.kelvin
[%zuse 418]
```
### `desk.ship`
We can also add a `desk.ship` file to specify the original publisher of this desk. We'll try this on a fakezod so let's just add `~zod` as the publisher:
```
[user@host hello]$ echo "~zod" > desk.ship
[user@host hello]$ cat desk.ship
~zod
```
### `desk.bill`
If we had Gall agents in this desk which should be automatically started when the desk is installed, we'd add them to a `hoon` list in the `desk.bill` file. It would look something like this:
```hoon
:~  %some-app
    %another
==
```
In this example we're not adding any agents, so we'll simply omit the `desk.bill` file.
### `desk.docket-0`
The final file we need is `desk.docket-0`. This one's more complicated, so we'll open it in our preferred text editor:
```
[user@host hello]$ nano desk.docket-0
```
In the text editor, we'll add the following:
```hoon
:~  title+'Hello'
    info+'A simple hello world app.'
    color+0x81.88c9
    image+'https://media.urbit.org/docs/userspace/dist/wut.svg'
    base+'hello'
    glob-ames+[~zod 0v0]
    version+[0 0 1]
    website+'https://urbit.org/docs/userspace/dist/guide'
    license+'MIT'
==
```
You can refer to the [Docket File](/docs/userspace/dist/docket) documentation for more details of what is required. In brief, the `desk.docket-0` file contains a `hoon` list of [clauses](/docs/userspace/dist/docket) which configure the appearance of the app tile, the source of the [glob](/docs/userspace/dist/glob), and some other metadata.
We've given the app a [`%title`](/docs/userspace/dist/docket#title) of "Hello", which will be displayed on the app tile and will be the name of the app when others browse to install it. We've given the app tile a [`%color`](/docs/userspace/dist/docket#color) of `#8188C9`, and also specified the URL of an [`%image`](/docs/userspace/dist/docket#image) to display on the tile.
The [`%base`](/docs/userspace/dist/docket#base) clause specifies the base URL path for the app. We've specified "hello" so it'll be `http://localhost:8080/apps/hello/...` in the browser. For the [glob](/docs/userspace/dist/glob), we've used a clause of [`%glob-ames`](/docs/userspace/dist/docket#glob-ames), which means the glob will be served from a ship over Ames, as opposed to being served over HTTP with a [`%glob-http`](/docs/userspace/dist/docket#glob-http) clause or having an Eyre binding with a [`%site`](/docs/userspace/dist/docket#site) clause. You can refer to the [glob](/docs/userspace/dist/glob) documentation for more details of the glob options. In our case we've specified `[~zod 0v0]`. Since `~zod` is the fakeship we'll install it on, the `%docket` agent will await a separate upload of the `glob`, so we can just specify `0v0` here as it'll get overwritten later.
The [`%version`](/docs/userspace/dist/docket#version) clause specifies the version as a triple of major version, minor version and patch version. The rest is just some additional informative metadata which will be displayed in _App Info_.
So let's save that to the `desk.docket-0` file and have a look at our desk:
```
[user@host hello]$ ls
desk.docket-0 desk.ship lib mar sur sys.kelvin
```
That's everything we need for now.
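Before moving on, it can help to confirm the files are in place. This is a minimal sketch, assuming the layout described in this guide: `sys.kelvin` and `desk.docket-0` are treated as required, while `desk.ship` and `desk.bill` are treated as optional. `check_desk` is a hypothetical helper name.

```shell
# check_desk DIR: report whether the files this guide relies on are present.
# Hypothetical helper; the required/optional split follows this guide only.
check_desk() {
  dir=$1
  status=0
  # Files the guide treats as required for an app with a tile.
  for f in sys.kelvin desk.docket-0; do
    if [ ! -f "$dir/$f" ]; then
      echo "missing required file: $f" >&2
      status=1
    fi
  done
  # Optional files: desk.ship names the publisher, desk.bill lists agents.
  for f in desk.ship desk.bill; do
    [ -f "$dir/$f" ] || echo "note: optional file not present: $f"
  done
  return "$status"
}
```

From inside the `hello` directory, `check_desk .` should report only that `desk.bill` is absent, which is expected here.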
## Install
Let's spin up a fakezod in which we can install our desk. By default a fakezod will be out of date, so we need to bootstrap with a pill from our urbit-git repo. The pills are stored in git lfs and need to be pulled into our repo first:
```
[user@host hello]$ cd ~/urbit-git
[user@host urbit-git]$ git lfs install
[user@host urbit-git]$ git lfs pull
[user@host urbit-git]$ cd ~/piers/fake
[user@host fake]$ urbit -F zod -B ~/urbit-git/bin/multi-brass.pill
```
Once our fakezod is booted, we'll need to create a new `%hello` desk for our app and mount it. We can do this in the dojo like so:
```
> |merge %hello our %base
>=
> |mount %hello
>=
```
Now, back in the Unix terminal, we should see the new desk mounted:
```
[user@host fake]$ cd zod
[user@host zod]$ ls
hello
```
Currently it's just a clone of the `%base` desk, so let's delete its contents:
```
[user@host zod]$ rm -r hello/*
```
Next, we'll copy in the contents of the `hello` desk we created earlier. We must use `cp -LR` to resolve all the symlinks:
```
[user@host zod]$ cp -LR ~/urbit-git/pkg/hello/* hello/
```
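The delete-then-copy steps above can be wrapped in one function when the desk is re-copied often during development. This is a sketch under the same assumptions as the commands above; `sync_desk` is a hypothetical name, and the `${dest:?}` guard is an added safety measure so an empty argument cannot expand to `rm -rf /*`.

```shell
# sync_desk SRC DEST: replace DEST's contents with a symlink-resolved copy of SRC.
# Hypothetical helper wrapping the rm and cp -LR steps from this guide.
sync_desk() {
  src=$1
  dest=$2
  if [ ! -d "$src" ] || [ ! -d "$dest" ]; then
    echo "usage: sync_desk SRC_DIR DEST_DIR" >&2
    return 1
  fi
  # Clear the old contents (the clone of %base), keeping the mount point itself.
  rm -rf "${dest:?}"/*
  # -L follows symlinks, so real files rather than links land in the pier.
  cp -LR "$src"/* "$dest"/
}

# Usage, from inside the pier directory:
#   sync_desk ~/urbit-git/pkg/hello hello
```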
Back in the dojo we can commit the changes and install the desk:
```
> |commit %hello
> |install our %hello
kiln: installing %hello locally
docket: awaiting manual glob for %hello desk
```
The `docket: awaiting manual glob for %hello desk` message appears because our `desk.docket-0` file includes a [`%glob-ames`](/docs/userspace/dist/docket#glob-ames) clause which specifies our ship as the source, so it's waiting for us to upload the glob.
If we open a browser now, navigate to `http://localhost:8080` and log in with the default fakezod code `lidlut-tabwed-pillex-ridrup`, we'll see our tile has appeared, but it says "installing" with a spinner due to the missing glob:
![Installing Tile](https://media.urbit.org/docs/userspace/dist/local-install-1.png)
## Create files for glob
We'll now create the files for the glob. We'll use a very simple static HTML page that just displays "Hello World!" and an image. Typically we'd have a more complex JS web app that talks to apps on our ship through Eyre's channel system, but for the sake of simplicity we'll forgo that. Let's hop back into the Unix terminal:
```
[user@host zod]$ cd ~
[user@host ~]$ mkdir hello-glob
[user@host ~]$ cd hello-glob
[user@host hello-glob]$ mkdir img
[user@host hello-glob]$ wget -P img https://media.urbit.org/docs/userspace/dist/pot.svg
[user@host hello-glob]$ tree
.
└── img
    └── pot.svg

1 directory, 1 file
```
We've grabbed an image to use in our "Hello world!" page. The next thing we need to add is an `index.html` file in the root of the folder. The `index.html` file is mandatory; it's what will be loaded when the app's tile is clicked. Let's open our preferred editor and create it:
```
[user@host hello-glob]$ nano index.html
```
In the editor, paste in the following HTML and save it:
```html
<!DOCTYPE html>
<html>
  <head>
    <title>Hello World</title>
    <style>
      div {
        text-align: center;
      }
    </style>
  </head>
  <body>
    <div>
      <h1>Hello World!</h1>
      <img src="img/pot.svg" alt="pot" width="219" height="196" />
    </div>
  </body>
</html>
```
Our `hello-glob` folder should now look like this:
```
[user@host hello-glob]$ tree
.
├── img
│   └── pot.svg
└── index.html

1 directory, 2 files
```
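Since a missing `index.html` or a broken image path only shows up after the upload, a quick local check can save a round trip. This is a minimal sketch; `check_glob_dir` is a hypothetical helper, and it assumes relative `src="..."` attributes with no spaces in the paths.

```shell
# check_glob_dir DIR: succeed only if DIR has an index.html at its root
# and every relative src="..." it references exists. Hypothetical helper.
check_glob_dir() {
  dir=$1
  if [ ! -f "$dir/index.html" ]; then
    echo "error: no index.html in $dir" >&2
    return 1
  fi
  status=0
  # Pull out every src="..." value from the page.
  for ref in $(grep -o 'src="[^"]*"' "$dir/index.html" | sed 's/^src="//; s/"$//'); do
    case $ref in
      http://*|https://*) continue ;;  # absolute URLs are not part of the glob
    esac
    if [ ! -f "$dir/$ref" ]; then
      echo "error: $ref is referenced but missing" >&2
      status=1
    fi
  done
  return "$status"
}
```

Running `check_glob_dir ~/hello-glob` should succeed silently for the folder built above.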
## Upload to glob
We can now create a glob from the directory. To do so, navigate to `http://localhost:8080/docket/upload` in the browser. This will bring up the `%docket` app's [Globulator](/docs/userspace/dist/glob#globulator) tool:
![Globulator](https://media.urbit.org/docs/userspace/dist/globulator.png)
Select the `hello` desk from the drop-down, click `Choose file` and select the `hello-glob` folder in the file browser, then hit `glob!`.
Now if we return to our ship's homescreen, we should see the tile looks as we specified in the docket file:
![Installed Tile](https://media.urbit.org/docs/userspace/dist/local-install-2.png)
And if we click on the tile, it'll load the `index.html` in our glob:
![Hello World!](https://media.urbit.org/docs/userspace/dist/local-install-3.png)
Our app is working!
## Publish
The final step is publishing our desk with the `%treaty` agent so others can install it. To do this, there's a simple command in the dojo:
```
> :treaty|publish %hello
>=
```
Note: For desks without a docket file (and therefore without a tile and glob), treaty can't be used. Instead you can make the desk public with `|public %desk-name`.
## Remote install
Let's spin up another fake ship so we can try installing it:
```
[user@host hello-glob]$ cd ~/piers/fake
[user@host fake]$ urbit -F bus
```
Note: Desks without a docket file (and therefore without a tile and glob) cannot be installed through the web interface. Instead, remote users can install them from the dojo with `|install ~our-ship %desk-name`.
In the browser, navigate to `http://localhost:8081` and log in with `~bus`'s code `riddec-bicrym-ridlev-pocsef`. Next, type `~zod/` in the search bar, and it should pop up a list of `~zod`'s published apps, which in this case is our `Hello` app:
![Remote install search](https://media.urbit.org/docs/userspace/dist/remote-install-1.png)
When we click on the app, it'll show some of the information from the clauses in the docket file:
![Remote app info](https://media.urbit.org/docs/userspace/dist/remote-install-2.png)
Click `Get App` and it'll ask if we want to install it:
![Remote app install](https://media.urbit.org/docs/userspace/dist/remote-install-3.png)
Finally, click `Get "Hello"` and it'll be installed as a tile on `~bus` which can then be opened:
![Remote app finished](https://media.urbit.org/docs/userspace/dist/remote-install-4.png)
