mirror of https://github.com/ilyakooo0/urbit.git synced 2024-12-19 12:51:51 +03:00

Anton Dyudin bb1826ac60 mass mdy to md renaming

2015-12-01 17:43:42 -08:00

100 KiB

Raw Blame History

sort	title
0	Urbit whitepaper

Urbit: an operating function

This is Urbit whitepaper DRAFT 40K. Some small details remain at variance with the codebase.

Abstract

Urbit is a clean-slate, full-stack redesign of system software. In 25K lines of code, it's a packet protocol, a pure functional language, a deterministic OS, an ACID database, a versioned filesystem, a web server and a global PKI. Urbit runs on a frozen combinator interpreter specified in 200 words; the rest of the stack upgrades itself over its own network.

Architecturally, Urbit is an opaque computing and communication layer above Unix and the Internet. To the user, it's a new decentralized network where you own and control your own general-purpose personal server, or "planet." A planet is not a new way to host your old apps; it's a different experience.

Download

As .md As .pdf

Objective

How can we put users back in control of their own computing?

Most people still have a general-purpose home computer, but it's atrophying into a client. Their critical data is all in the cloud. Technically, of course, that's ideal. Data centers are pretty good at being data centers.

But in the cloud, all users have is a herd of special-purpose appliances, not one of which is a general-purpose computer. Do users want their own general-purpose personal cloud computer? If so, why don't they have one now? How might we change this?

Conventional cloud computing, the way the cloud works now, is "1:n". One application, hosted on one logical server by its own developer, serves "n" users. Each account on each application is one "personal appliance" - a special-purpose computer, completely under the developer's control.

Personal cloud computing, the way we wish the cloud worked, is "n:1": each user has one logical server, which runs "n" independent applications. This general-purpose computer is a "personal server," completely under the user's control.

"Personal server" is a phrase only a marketing department could love. We prefer to say: your planet. Your planet is your digital identity, your network address, your filesystem and your application server. Every byte on it is yours; every instruction it runs is under your control.

Most people should park their planets in the cloud, because the cloud works better. But a planet is not a planet unless it's independent. A host without contractual guarantees of absolute privacy and unconditional migration is not a host, but a trap. The paranoid and those with global adversaries should migrate to their own closets while home computing remains legal.

But wait: isn't "planet" just a fancy word for a self-hosted server? Who wants to self-host? Why would anyone want to be their own sysadmin?

Managing your own computing is a cost, not a benefit. Your planet should be as easy as possible to manage. (It certainly should be easier than herding your "n" personal appliances.) But the benefit of "n:1" is controlling your own computing.

The "n:1" cloud is not a better way to implement your existing user experience. It's a different relationship between human and computer. An owner is not just another customer. A single-family home is not just another motel room. Control matters. A lot.

Perhaps this seems abstract. It's hard to imagine the "n:1" world before it exists. Let's try a thought-experiment: adding impossible levels of ownership to today's "1:n" ecosystem.

Take the web apps you use today. Imagine you trust them completely. Imagine any app can use any data from any other app, just because both accounts are you. Imagine you can "sidegrade" any app by moving its data safely to a compatible competitor. Imagine all your data is organized into a personal namespace, and you can compute your own functions on that namespace.

In this world, no app developer has any way to hold your data hostage. Forget how this works technically (it doesn't). How does it change the apps?

Actually, the rules of this thought-experiment world are so different that few of the same apps exist. Other people's apps are fundamentally different from your own apps. They're not "yours" because you developed them -- they're your apps because you can fire the developer without any pain point. You are not a hostage, so the power dynamic changes. Which changes the app.

For example: with other people's apps, when you want to shop on the Internets, you point your browser at amazon.com or use the Google bar as a full-text store. With your own apps, you're more likely to point your browser at your own shopping assistant. This program, which works entirely for you and is not slipping anyone else a cut, uses APIs to sync inventory data and send purchase orders.

Could you write this web app today? Sure. It would be a store. The difference between apps you control and apps you don't is the difference between a shopping assistant and a store. It would be absurd if a shopping assistant paid its developer a percentage of all transactions. It would be absurd if a store didn't. The general task is the same, but every detail is different.

Ultimately, the planet is a different user experience because you trust the computer more. A program running on someone else's computer can promise it's working only for you. This promise is generally false and you can't enforce it. When a program on your computer makes the same promise, it's generally true and you can enforce it. Control changes the solution because control produces trust and trust changes the problem.

Could we actually add this level of user sovereignty to the "1:n" cloud? It'd take a lot of engineers, a lot of lawyers, and a whole lot of standards conferences. Or we could just build ourselves some planets.

Obstacles

If this is such a great product, why can't you already buy it?

In 2015, general-purpose cloud servers are easily available. But they are industrial tools, not personal computers. Most users (consumer and enterprise) use personal appliances. They choose "1:n" over "n:1". Why does "1:n" beat "n:1" in the real world?

Perhaps "1:n" is just better. Your herd of developer-hosted appliances is just a better product than a self-hosted planet. Or at least, this is what the market seems to be telling us.

Actually, the market is telling us something more specific. It's saying: the appliance herd is a better product than a self-hosted Unix server on the Internet.

The market hasn't invalidated the abstract idea of the planet. It's invalidated the concrete product of the planet we can actually build on the system software we actually have.

In 1978, a computer was a VAX. A VAX cost $50K and was the size of a fridge. By 1988, it would cost $5K and fit on your desk. But if a computer is a VAX, however small or cheap, there is no such thing as a PC. And if a planet is an AWS box, there is no such thing as a planet.

The system software stack that 2015 inherited -- two '70s designs, Unix and the Internet -- remains a viable platform for "1:n" industrial servers. Maybe it's not a viable platform for "n:1" personal servers? Just as VAX/VMS was not a viable operating system for the PC?

But what would a viable platform for a personal server look like? What exactly is this stack? If it's not a Unix server on the Internet, it's an X server on network Y. What are X and Y? Do they exist?

Clearly not. So all we have to replace is Unix and the Internet. In other words, all we have to replace is everything. Is this an obstacle, or an opportunity?

A clean-slate redesign seems like the obvious path to the levels of simplicity we'll need in a viable planet. Moreover, it's actually easier to redesign Unix and the Internet than Unix or the Internet. Computing and communication are not separate concerns; if we design the network and OS as one system, we avoid all kinds of duplications and impedance mismatches.

And if not now, when? Will there ever be a clean-slate redesign of '70s system software? A significant problem demands an ambitious solution. Intuitively, clean-slate computing and the personal cloud feel like a fit. Let's see if we can make the details work. We may never get another chance to do it right.

Principles of platform reform

To review: we don't have planets because our antique system software stack, Unix and the Internet, is a lousy platform on which to build a planet. If we care about the product, we need to start with a new platform.

How can we replace Unix and the Internet? We can't. But we can tile over them, like cheap bathroom contractors. We can use the Unix/Internet, or "classical layer," as an implementation substrate for our new layer. A platform on top of a platform. Under your Brazilian slate, there's pink Color Tile from 1976.

"Tiling over" is the normal way we replace system software. The Internet originally ran over phone circuits; under your transatlantic TCP packets, there's an ATM switch from 1996. And of course, under your browser there's an OS from 1976.

After a good job of tiling over, the obsolete layer can sometimes be removed. In some cases it's useless, but would cost too much to rip out. In some cases it's still used -- a Mac doesn't (yet) boot to Safari. But arguably, that's because the browser platform is anything but a perfect tiling job.

One property the browser got right was total opacity. The old platform implements the new platform, but can't be visible through it. If web apps could make Unix system calls or use Unix libraries, there would be no such thing as web apps.

(In fact, one easy way to think of a planet is as "the browser for the server side." The browser is one universal client that hosts "n" independent client applications; the planet is one universal server that hosts "n" independent server applications.)

And the bathroom remains a bathroom. The new platform does the same general job as the old one. So to justify the reform, it has to be much better at its new, specific job. For instance, the browser is easily two orders of magnitude better than Unix at installing untrusted transient applications (ie, web pages).

Abstract targets

Again, we have two problems: Unix and the Internet. What about each do we need to fix? What exactly is wrong with the classical layer? What qualities should our replacement have?

Since we're doing a clean-slate design, it's a mistake to focus too much on fixing the flaws of the old platform. The correct question is: what is the right way to build a system software stack? Besides cosmetic details like character sets, this exercise should yield the same results on Mars as Earth.

But we come from Earth and will probably find ourselves making normal Earth mistakes. So we can at least express our abstract design goals in normal Earth terms.

A simpler OS

One common reaction to the personal-server proposition: "my mother is not a Linux system administrator." Neither is mine. She does own an iPhone, however. Which is also a general-purpose computer. A usability target: a planet should be as easy to manage as an iPhone.

(To go into way too much detail: on a planet with the usability of iOS, the user as administrator has four system configuration tasks for the casual or newbie user: (a) deciding which applications to run; (b) checking resource usage; (c) configuring API authentication to your existing Web apps (your planet wants to access, rip or even sync your silo data); and (d) maintaining a reputation database (your friends, enemies, etc). (a) and (b) exist even on iOS; (c) and (d) are inherent in any planet; (d) is common on the Web, and (c) not that rare.)

Could a planet just run iOS? The complexity of managing a computer approximates the complexity of its state. iOS is a client OS. Its apps are not much more than webpages. There is much more state on a server; it is much more valuable, and much more intricate, and much more durable.

(Moreover, an iOS app is a black box. It's running on your physical hardware, but you don't have much more control over it than if it was remote. iOS apps are not designed to share data with each other or with user-level computing tools; there is no visible filesystem, shell, etc. This is not quite the ownership experience -- it's easy to get locked in to a black box.)

And while an Apple product is a good benchmark for any usability goal, it's the exception that proves the rule for architecture. iOS is Unix, after all -- Unix with a billion-dollar makeover. Unix is not a turd, and Cupertino could probably polish it even if it was.

Simplicity is the only viable path to usability for a new platform. It's not sufficient, but it is necessary. The computer feels simple to the user not because it's presenting an illusion of simplicity, but because it really is simple.

While there is no precise way to measure simplicity, a good proxy is lines of code -- or, to be slightly more sophisticated, compressed code size. Technical simplicity is not actually usability, just a force multiplier in the fight for usability. But usability proper can only be assessed once the UI is complete, and the UI is the top layer by definition. So we assume this equation and target simplicity.

A sane network

When we look at the reasons we can't have a nice planet, Unix is a small part of the problem. The main problem is the Internet.

There's a reason what we call "social networks" on the Internet are actually centralized systems -- social servers. For a "1:n" application, social integration - communication between two users of the same application - is trivial. Two users are two rows in the same database.

When we shift to a "n:1" model, this same integration becomes a distributed systems problem. If we're building a tictactoe app in a "1:n" design, our game is a single data structure in which moves are side effects. If we're building the same app on a network of "n:1" model, our game is a distributed system in which moves are network messages.

Building and managing distributed Internet systems is not easy. It's nontrivial to build and manage a centralized API. Deploying a new global peer-to-peer protocol is a serious endeavor.

But this is what we have to do for our tictactoe app. We have to build, for instance, some kind of identity model - because who are you playing against, an IP address? To play tictactoe, you find yourself building your own distributed social network.

While there are certainly asocial apps that a planet can run, it's not clear that an asocial planet is a viable product. If the cost of building a distributed social service isn't close to the cost of a building its centralized equivalent, the design isn't really successful.

Fortunately, while there are problems in "n:1" services that no platform can solve for you (mostly around consistency) there are set of problems with "1:n" services that no platform can solve for you (mostly around scaling). Scaling problems in "n:1" social services only arise when you're Justin Bieber and have a million friends, ie, a rare case even in a mature network. Mr. Bieber can probably afford a very nice computer.

Concrete requirements

Here are some major features we think any adequate planet needs. They're obviously all features of Urbit.

Repeatablе computing

Any non-portable planet is locked in to its host. That's bad. You can have all the legal guarantees you like of unconditional migration. Freedom means nothing if there's nowhere to run to. Some of today's silos are happy to give you a tarball of your own data, but what would you do with it?

The strongest way to ensure portability is a deterministic, frozen, non-extensible execution model. Every host runs exactly the same computation on the same image and input, for all time. When you move that image, the only thing it notices is that its IP address has changed.

We could imagine a planet with an unfrozen spec, which had some kind of backward-compatible upgrade process. But with a frozen spec, there is no state outside the planet itself, no input not input to the planet itself, and no way of building a planet on one host that another host can't compute correctly.

Of course every computer is deterministic at the CPU level, but CPU-level determinism can't in practice record and replay its computation history. A computer which is deterministic at the semantic level can. Call it "repeatable computing."

Orthogonal persistence

It's unclear why we'd expose the transience semantics of the hardware memory hierarchy to either the programmer or the user. When we do so, we develop two different models for managing data: "programming languages" and "databases." Mapping between these models, eg "ORM," is the quintessence of boilerplate.

A simple pattern for orthogonal persistence without a separate database is "prevalence": a checkpoint plus a log of events since the checkpoint. Every event is an ACID transaction. In fact, most databases use this pattern internally, but their state transition function is not a general-purpose interpreter.

Identity

In the classical layer, hosts, usernames, IP addresses, domain names and public keys are all separate concepts. A planet has one routable, memorable, cryptographically independent identity which serves all these purposes.

The "Zooko's triangle" impossibility result tells us we can't build a secure, meaningful, decentralized identity system. Rather than surrendering one of these three goals, the planet can retreat a little bit on two of them. It can be memorable, but not meaningful; it can start as a centralized system, but decentralize itself over time. 100-80-70 is often preferable to 100-100-0.

A simple typed functional language

Given the level of integration we're expecting in this design, it's silly to think we could get away without a new language. There's no room in the case for glue. Every part has to fit.

The main obstacle to functional language adoption is that functional programming is math, and most human beings are really bad at math. Even most programmers are bad at math. Their intuition of computation is mechanical, not mathematical.

A pure, higher-order, typed, strict language with mechanical intuition and no mathematical roots seems best positioned to defeat this obstacle. Its inference algorithm should be almost but not quite as strong as Hindley-Milner unification, perhaps inferring "forward but not backward."

We'd also like two other features from our types. One, a type should define a subset of values against a generic data model, the way a DTD defines a set of XML values. Two, defining a type should mean defining an executable function, whose range is the type, that verifies or normalizes a generic value. Why these features? See the next section...

High-level communication

A planet could certainly use a network type descriptor that was like a MIME type, if a MIME type was an executable specification and could validate incoming content automatically. After ORM, manual data validation must be the second leading cause of boilerplate. If we have a language in which a type is also a validator, the automatic validation problem seems solvable. We can get to something very like a typed RPC.

Exactly-once message delivery semantics are impossible unless the endpoints have orthogonal persistence. Then they're easy: use a single persistent session with monotonically increasing sequence numbers. It's nice not worrying about idempotence.

With an integrated OS and protocol, it's possible to fully implement the end-to-end principle, and unify the application result code with the packet acknowledgment -- rather than having three different layers of error code. Not only is every packet a transaction at the system level; every message is a transaction at the application level. Applications can even withhold an acknowledgment to signal internal congestion, sending the other end into exponential backoff.

It's also nice to support simple higher-level protocol patterns like publish/subscribe. We don't want every application to have to implement its own subscription queue backpressure. Also, if we can automatically validate content types, perhaps we can also also diff and patch them, making remote syncing practical.

The web will be around for a while. It would be great to have a web server that let a browser page with an SSO login authenticate itself as another planet, translating messages into JSON. This way, any distributed application is also a web application.

Finally, a planet is also a great web client. There is lots of interesting data behind HTTP APIs. A core mission of a planet is collecting and maintaining the secrets the user needs to manage any off-planet data. (Of course, eventually this data should come home, but it may take a while.) The planet's operating system should serve as client-side middleware that mediates the API authentication process, letting the programmer program against the private web as if it was public, using site-specific auth drivers with user-configured secrets.

Global namespace

The URL taught us that any global identity scheme is the root of a global namespace. But the URL also made a big mistake: mutability.

A global namespace is cool. An immutable (referentially transparent) global namespace, which can use any data in the universe as if it was a constant, is really cool. It's even cooler if, in case your data hasn't been created yet, your calculation can block waiting for it. Coolest of all is when the data you get back is typed, and you can use it in your typed functional language just like dereferencing a pointer.

Of course, an immutable namespace should be a distributed version control system. If we want typed data, it needs to be a typed DVCS, clearly again with type-specific diff and patch. Also, our DVCS (which already needs a subscription mechanism to support blocking) should be very good at reactive syncing and mirroring.

Semantic drivers

One unattractive feature of a pure interpreter is that it exacts an inescapable performance tax -- since an interpreter is always slower than native code. This violates the prime directive of OS architecture: the programmer must never pay for any service that doesn't get used. Impure interpreters partly solve this problem with a foreign-function interface, which lets programmers move inner loops into native code and also make system calls. An FFI is obviously unacceptable in a deterministic computer.

A pure alternative to the FFI is a semantic registry in which functions, system or application, can declare their semantics in a global namespace. A smart interpreter can recognize these hints, match them to a checksum of known good code, and run a native driver that executes the function efficiently. This separates policy (pure algorithm as executable specification) from mechanism (native code or even hardware).

Repository-driven updates

A planet owner is far too busy to manually update applications. They have to update themselves.

Clearly the right way to execute an application is to run it directly from the version-control system, and reload it when a new version triggers any build dependencies. But in a planet, the user is not normally the developer. To install an app is to mirror the developer's repository. The developer's commit starts the global update process by updating mirror subscriptions.

Of course, data structures may change, requiring the developer to include an adapter function that converts the old state. (Given orthogonal persistence, rebooting the app is not an option.)

Added fun is provided by the fact that apps use the content types defined in their own repository, and these types may change. Ideally these changes are backward compatible, but an updated planet can still find itself sending an un-updated planet messages that don't validate. This race condition should block until the update has fully propagated, but not throw an error up to the user level -- because no user has done anything wrong.

Why not a planet built on JS or JVM?

Many programmers might accept our reasoning at the OS level, but get stuck on Urbit's decision not to reuse existing languages or interpreters. Why not JS, JVM, Scheme, Haskell...? The planet is isolated from the old layer, but can't it reuse mature designs?

One easy answer is that, if we're going to be replacing Unix and the Internet, or at least tiling over them, rewriting a bit of code is a small price to pay for doing it right. Even learning a new programming language is a small price to pay. And an essential aspect of "doing it right" is a system of components that fit together perfectly; we need all the simplicity wins we can get.

But these are big, hand-waving arguments. It takes more than this kind of rhetoric to justify reinventing the wheel. Let's look at a few details, trying not to get ahead of ourselves.

In the list above, only JS and the JVM were ever designed to be isolated. The others are tools for making POSIX system calls. Isolation in JS and the JVM is a client thing. It is quite far from clear what "node.js with browser-style isolation" would even mean. And who still uses Java applets?

Let's take a closer look at the JS/JVM options - not as the only interpreters in the world, just as good examples. Here are some problems we'd need them to solve, but they don't solve - not, at least, out of the box.

First: repeatability. JS and the JVM are not frozen, but warm; they release new versions with backward compatibility. This means they have "current version" state outside the planet proper. Not lethal but not good, either.

When pure, JS and then JVM are at least nominally deterministic, but they are also used mainly on transient data. It's not clear that the the actual implementations and specifications are built for the lifecycle of a planet - which must never miscompute a single bit. (ECC servers are definitely recommended.)

Second: orthogonal persistence. Historically, successful OP systems are very rare. Designing the language and OS as one unit seems intuitively required.

One design decision that helps enormously with OP is an acyclic data model. Acyclic data structures are enormously easier to serialize, to specify and validate, and of course to garbage-collect. Acyclic databases are far more common than cyclic ("network" or "object") databases. Cyclic languages are more common than acyclic languages -- but pure functional languages are acyclic, so we know acyclic programming can work.

(It's worth mentioning existing image-oriented execution environments - like Smalltalk and its descendants, or even the Lisp machine family. These architectures (surely Urbit's closest relatives) could in theory be adapted to use as orthogonally persistent databases, but in practice are not designed for it. For one thing, they're all cyclic. More broadly, the assumption that the image is a GUI client in RAM is deeply ingrained.)

Third: since a planet is a server and a server is a real OS, its interpreter should be able to efficiently virtualize itself. There are two kinds of interpreter: the kind that can run an instance of itself as a VM, and the kind that can't.

JS can almost virtualize itself with eval, but eval is a toy. (One of those fun but dangerous toys -- like lawn darts.) And while it's not at all the same thing, the JVM can load applets -- or at least, in 1997 it could...

(To use some Urbit concepts we haven't met yet: with semantic drivers (which don't exist in JS or the JVM, although asm.js is a sort of substitute), we don't even abandon all the world's JS or JVM code by using Urbit. Rather, we can implement the JS or JVM specifications in Hoon, then jet-propel them with practical Unix JS or JVM engines, letting us run JS or Java libraries.)

Could we address any or all of these problems in the context of JS, the JVM or any other existing interpreter? We could. This does not seem likely to produce results either better or sooner than building the right thing from scratch.

Definition

An operating function (OF) is a logical computer whose state is a fixed function of its input history:

V(I) => T

where T is the state, V is the fixed function, I is the list of input events from first to last.

Intuitively, what the computer knows is a function of what it's heard. If the V function is identical on every computer for all time forever, all computers which hear the same events will learn the same state from them.

Is this function a protocol, an OS, or a database? All three. Like an OS, V specifies the semantics of a computer. Like a protocol, V specifies the semantics of its input stream. And like a database, V interprets a store of persistent state.

Advantages

This is a very abstract description, which doesn't make it easy to see what an OF is useful for.

Two concrete advantages of any practical OF (not just Urbit):

Orthogonal persistencе

An OF is inherently a single-level store. V(I) => T is an equation for the lifecycle of a computer. It doesn't make any distinction between transient RAM and persistent disk.

The word "database" has always conflated two concepts: data that isn't transient; structures specialized for search. A substantial percentage of global programmer-hours are spent on translating data between memory and disk formats.

Every practical modern database computes something like V(I) => T. I is the transaction log. An OF is an ACID database whose history function (log to state) is a general-purpose computer. A transaction is an event, an event is a transaction.

(Obviously a practical OF, though still defined as V(I) => T, does not recompute its entire event history on every event. Rather, it computes some incremental iterator U that can compute U(E T0) => T1, where E is each event in I.

In plain English, the next state T1 is a function U of the latest event E and the current state T0. We can assume that some kind of incrementality will fall out of any reasonable V.)

In any reasonable OF, you can write a persistent application by leaving your data in the structures you use it from. Of course, if you want to use specialized search structures, no one is stopping you.

Repeatable computing

Every computer is deterministic at the CPU level, in the sense that the CPU has a manual which is exactly right. But it is not actually practical to check the CPU's work.

"Repeatable computing" is high-level determinism. It's actually practical to audit the validity of a repeatable computer by re-executing its computation history. For an OF, the history is the event log.

Not every urbit is a bank; not every urbit has to store its full log. Still, the precision of a repeatable computer also affects the user experience. (It's appalling, for instance, how comfortable the modern user is with browser hangs and crashes.)

Requirements

V can't be updated for the lifecycle of the computer. Or, since V is also a protocol, for the lifetime of the network.

So V needs to be perfect, which means it needs to be small. It's also easier to eradicate ambiguity in a small definition.

But we're designing a general-purpose computer which is programmed by humans. Thus V is a programming language in some sense. Few practical languages are small, simple or perfect.

V is a practical interpreter and should be reasonably fast. But V is also a (nonpreemptive) operating system. One universal feature of an OS is the power to virtualize user-level code. So we need a small, simple, perfect interpreter which can efficiently virtualize itself.

V is also a system of axioms in the mathematical sense. Axioms are always stated informally. This statement succeeds iff it accurately communicates the same axioms to every competent reader. Compressing the document produces a rough metric of information content: the complexity of V.

(It's possible to cheat on this test. For instance, we could design a simple V that only executes another interpreter, W. W, the only program written in V's language, is simply encoded in the first event. Which could be quite a large event.

This design achieves the precision of V, but not its stability. For stability, the true extent of V is the semantic kernel that the computer can't replace during its lifetime from events in the input history. Thus V properly includes W.)

There are no existing interpreters that ace all these tests, so Urbit uses its own. Our V is defined in 350 bytes gzipped.

Nouns

A value in Urbit is a noun. A noun is an atom or a cell. An atom is an unsigned integer of any size. A cell is an ordered pair of nouns.

In the system equation V(I) => T, T is a noun, I is a list of nouns - where a list is either 0 or a cell [item list]. The V function is defined below.

Nock

Nock or N is a combinator interpreter on nouns. It's specified by these pseudocode reduction rules:

Nock(a)          *a
[a b c]          [a [b c]]

?[a b]           0
?a               1
+[a b]           +[a b]
+a               1 + a
=[a a]           0
=[a b]           1
=a               =a

/[1 a]           a
/[2 a b]         a
/[3 a b]         b
/[(a + a) b]     /[2 /[a b]]
/[(a + a + 1) b] /[3 /[a b]]
/a               /a

*[a [b c] d]     [*[a b c] *[a d]]

*[a 0 b]         /[b a]
*[a 1 b]         b
*[a 2 b c]       *[*[a b] *[a c]]
*[a 3 b]         ?*[a b]
*[a 4 b]         +*[a b]
*[a 5 b]         =*[a b]

*[a 6 b c d]     *[a 2 [0 1] 2 [1 c d] [1 0] 2 [1 2 3] [1 0] 4 4 b]
*[a 7 b c]       *[a 2 b 1 c]
*[a 8 b c]       *[a 7 [[7 [0 1] b] 0 1] c]
*[a 9 b c]       *[a 7 c 2 [0 1] 0 b]
*[a 10 [b c] d]  *[a 8 c 7 [0 3] d]
*[a 10 b c]      *[a c]

*a               *a

Note that operators 6-10 are macros and not formally essential. Note also that Nock reduces invalid reductions to self, thus specifying nontermination (bottom).

A valid Nock reduction takes a cell [subject formula]. The formula defines a function that operates on the subject.

A valid formula is always a cell. If the head of the formula is a cell, Nock reduces head and tail separately and produces the cell of their products ("autocons"). If the head is an atom, it's an operator from 0 to 10.

In operators 3, 4, and 5, the formula's tail is another formula, whose product is the input to an axiomatic function. 3 tests atom/cell, 4 increments, 5 tests equality.

In operator 0, the tail is a "leg" (subtree) address in the subject. Leg 1 is the root, 2n the left child of n, 2n+1 the right child.

In operator 1, the tail is produced as a constant. In operator 2, the tail is a formula, producing a [subject formula] cell which Nock reduces again (rendering the system Turing complete).

In the macro department, 6 is if-then-else; 7 is function composition; 8 is a stack push; 9 is a function call; 10 is a hint.

Surprisingly, Nock is quite a practical interpreter, although it poses unusual implementation challenges. For instance, the only arithmetic operation is increment (see the implementation issues section for how this is practical). Another atypical feature is that Nock can neither test pointer equality, nor create cycles.

The fixed function

So is V just Nock? Almost. Where N is Nock, V is:

V(I) == N(I [2 [0 3] [0 2]])

If Nock was not [subject formula] but [formula subject], not N(S F) but N(F S), V would be Nock.

In either case, the head I0 of I is the boot formula; the tail is the boot subject. Intuitively: to interpret the event history, treat the first event as a program; run that program on the rest of history.

More abstractly: the event sequence I begins with a boot sequence of non-uniform events, which do strange things like executing each other. Once this sequence completes, we end up in a main sequence of uniform events (starting at I5), which actually look and act like actual input.

`I0`: The lifecycle formula

I0 is special because we run it in theory but not in practice. Urbit is both a logical definition and a practical interpreter. Sometimes there's tension between these goals. But even in practice, we check that I0 is present and correct.

I0 is a nontrivial Nock formula. Let's skip ahead and write it in our high-level language, Hoon:

=>  [load rest]=.
=+  [main step]=.*(rest load)
|-  ?~  main  step
    $(main +.main, step (step -.main))

Or in pseudocode:

from pair $load and $rest
let pair $main and $step be:
  the result of Nock $load run on $rest
loop
  if $main is empty
    return $step
  else
    continue with
      $main set to tail of $main,
      $step set to:
        call function $step on the head of $main

In actual Nock (functional machine code, essentially):

[8 [2 [0 3] [0 2]] 8 [ 1 6 [5 [1 0] 0 12] [0 13] 9 2
[0 2] [[0 25] 8 [0 13] 9 2 [0 4] [0 56] 0 11] 0 7] 9 2 0 1]

I0 is given the sequence of events from I1 on. It takes I1, the boot loader, and runs it against the rest of the sequence. I1 consumes I2, I3, and I4, and then it produces the true initial state function step and the rest of the events from I5 on.

I0 then continues on to the main sequence, iteratively computing V(I) by calling the step function over and over on each event. Each call to step produces the next state, incorporating the changes resulting from the current event.

`I1`: the language formula

The language formula, I1, is another nontrivial Nock formula. Its job is to compile I3, the Hoon compiler as source, with I2, the Hoon compiler as a Nock formula.

If I3, compiled with I2, equals I2, I1 then uses I2 to load I4, the Arvo source. The main event sequence begins with I5. The only events which are Nock formulas are I0, I1, and I2; the only events which are Hoon source are I3 and I4.

I1, in pseudo-Hoon:

=>  [fab=- src=+< arv=+>- seq=+>+]
=+  ken=(fab [[%atom %ud] 164] src)
?>  =(+>.fab +.ken)
[seq (fab ken arv)]

In pseudocode:

with tuple as [$fab, $src, $arv, $seq], // these are I2 I3 I4 and I5...
  let $ken be:
    call $fab on type-and-value "uint 164" and Hoon $src
  assert environment from $fab is equal to the value from $ken
  return pair of the remaining $seq and:
    call $fab on environment $ken and Hoon $arv

In actual Nock:

[7 [0 2] 8 [8 [0 2] 9 2 [0 4] [7 [0 3] [1 [1.836.020.833
25.717] 164] 0 6] 0 11] 6 [5 [0 27] 0 5] [[0 31] 8 [0 6]
9 2 [0 4] [7 [0 3] [0 2] 0 30] 0 11] 0 0]

(25.717? Urbit uses the German dot notation for large atoms.)

We compile the compiler source, check that it produces the same compiler formula, and then compile the OS with it.

Besides I0 and I1, there's no Nock code in the boot sequence that doesn't ship with Hoon source.

`I2` and `I3`: the Hoon compiler

Again, I2 is the Hoon compiler (including basic libraries) as a Nock formula. I3 is the same compiler as source.

Hoon is a strict, typed, higher-order pure functional language which compiles itself to Nock. It avoids category theory and aspires to a relatively mechanical, concrete style.

The Hoon compiler works with three kinds of noun: nock, a Nock formula; twig, a Hoon AST; and span, a Hoon type, which defines semantics and constraints on some set of nouns.

There are two major parts of the compiler: the front end, which parses a source file (as a text atom) to a twig; and the back end, ut, which accepts a subject span and a twig, and compiles them down to a product span and a Nock formula.

Back end and type system

The Hoon back end, (ut), about 1700 lines of Hoon, performs type inference and (Nock) code generation. The main method of ut is mint, with signature $+([span twig] [span Nock]) (i.e., in pseudocode, mint(span, twig) -> [span Nock]).

Just as a Nock formula executes against a subject noun, a Hoon expression is always compiled against a subject span. From this [span twig], mint produces a cell [span Nock] . The Nock formula generates a useful product from any noun in the subject span. The span describes the set of product nouns.

Type inference in Hoon uses only forward tracing, not unification (tracing backward) as in Hindley-Milner (Haskell, ML). Hoon needs more user annotation than a unification language, but it's easier for the programmer to follow what the algorithm is doing - just because Hoon's algorithm isn't as smart.

But the Hoon type system can solve most of the same problems as Haskell's, notably including typeclasses / genericity. For instance, it can infer the type of an AST by compiling a grammar in the form of an LL combinator parser (like Hoon's own grammar).

The Hoon type system, slightly simplified for this overview:

++  span  $|  $?  %noun
                  %void
              ==
          $%  [%atom p=cord]
              [%cell p=span q=span]
              [%core p=span q=(map term twig)]
              [%cube p=noun q=span]
              [%face p=term q=span]
              [%fork p=span q=span]
              [%hold p=span q=twig]
          ==

In pseudocode:

define $span as one of:
  'noun'
  'void'
  'atom' with text aura
  'cell' of $span and $span
  'core' of environment $span and map of name to code AST
  'cube' of constant value and $span
  'face' of name wrapping $span
  'fork' of $span and alternate $span
  'hold' of subject $span and continuation AST

(The syntax %string means a cord, or atom with ASCII bytes in LSB first order. Eg, %foo is also 0x6f.6f66 or 7.303.014.)

The simplest spans are %noun, the set of all nouns, and %void, the empty set.

Otherwise, a span is one of %atom, %cell, %core, %cube, %face, %fork or %hold, each of which is parameterized with additional data.

An %atom is any atom, plus an aura cord that describes both what logical units the atom represents, and how to print it. For instance, %ud is an unsigned decimal, %ta is ASCII text, %da is a 128-bit Urbit date. %n is nil (~).

A %cell is a recursively typed cell. %cube is a constant. %face wraps another span in a name for symbolic access. %fork is the union of two spans.

A %core is a combination of code and data - Hoon's version of an object. It's a cell [(map term formula) payload], i.e. a set of named formulas (essentially "computed attributes") and a "payload", which includes the context the core is defined in. Each arm uses the whole core as its subject.

One common core pattern is the gate, Hoon's version of a lambda. A gate is a core with a single formula, and whose payload is a cell [arguments context]. To call the function, replace the formal arguments with your actual argument, then apply. The context contains any data or code the function may need.

(A core is not exactly an object, nor a map of names to formulas a vtable. The formulas are computed attributes. Only a formula that produces a gate is the equivalent of a method.)

Syntax, text

Hoon's front end (vast) is, at 1100 lines, quite a gnarly grammar. It's probably the most inelegant section of Urbit.

Hoon is a keyword-free, ideographic language. All alphabetic strings are user text; all punctuation is combinator syntax. The content of your code is always alphabetic; the structure is always punctuation. And there are only 80 columns per line.

A programming language is a UI for programmers. UIs should be actually usable, not apparently usable. Some UI tasks are harder than they appear; others are easier. Forming associative memories is easier than it looks.

Languages do need to be read out loud, and the conventional names for punctuation are clumsy. So Hoon replaces them:

ace [1 space]   gal <               pel (
bar |           gap [>1 space, nl]  per )
bas \           gar >               sel [
buc $           hax #               sem ;
cab _           hep -               ser ]
cen %           kel {               soq '
col :           ker }               tar *
com ,           ket ^               tec `
doq "           lus +               tis =
dot .           pam &               wut ?
fas /           pat @               zap !

For example, %= sounds like "centis" rather than "percent equals." Since even a silent reader will subvocalize, the length and complexity of the sound is a tax on reading the code.

A few digraphs also have irregular sounds:

==  stet
--  shed
++  slus
->  dart
-<  dusk
+>  lark
+<  lush

Hoon defines over 100 digraph ideograms (like |=). Ideograms (or runes) have consistent internal structure; for instance, every rune with the | prefix produces a core. There are no user-defined runes. These facts considerably mitigate the memorization task.

Finally, although it's not syntax, style merits a few words. While names in Hoon code can have any internal convention the programmer wants (as long as it's lowercase ASCII plus hyphen), one convention we use a lot: face names which are random three-letter syllables, arm names which are four-letter random Scrabble words.

Again, forming associative memories is easier than it looks. Random names quickly associate to their meanings, even without any mnemonic. A good example is the use of Greek letters in math. As with math variables, consistency helps too - use the same name for the same concept everywhere.

This "lapidary" convention is appropriate for code that's nontrivial, but still clean and relatively short. But code can't be clean unless it's solving a clean problem. For dirty problems, long informative names and/or kebab-case are better. For trivial problems (like defining add), we use an even more austere "hyperlapidary" style with one-letter names.

Twig structure

A Hoon expression compiles to a twig, or AST node. A twig is always a cell.

Like Nock formulas, Hoon twigs "autocons." A twig whose head is a cell is a pair of twigs, which compiles to a pair of formulas, which is a formula producing a pair. ("autocons" is an homage to Lisp, whose (cons a (cons b c)) becomes Urbit's [a b c].)

Otherwise, a twig is a tagged union. Its tag is always a rune, stored by its phonetic name as a cord. Because it's normal for 31-bit numbers to be direct (thus, somewhat more efficient), we drop the vowels. So =+ is not %tislus, but %tsls.

There are too many kinds of twig to enumerate here. Most are implemented as macros; only about 25 are directly translated to Nock. There is nothing sophisticated about twigs - given the syntax and type system, probably anyone would design the same twig system.

Syntax geometry

Just visually, one nice feature of imperative languages is the distinction between statements, which want to grow vertically down the page, and expressions, which prefer to grow horizontally.

Most functional languages, inherently lacking the concept of statements, have a visual problem: their expressions grow best horizontally, and often besiege the right margin.

Hoon has two syntax modes: "tall," which grows vertically, and "wide," which grows horizontally. Tall code can contain wide code, but not vice versa. In general, any twig can be written in either mode.

Consider the twig [%tsls a b], where a and b are twigs. =+ defines a variables, roughly equivalent to Lisp's let. a defines the variable for use in b. In wide mode it looks like an expression: =+(a b). In tall mode, the separator is two spaces or a newline, without parentheses:

=+  a
b

which is the same code, but looks like a "statement." The deindentation is reasonable because even though b is, according to the AST, "inside" the =+, it's more natural to see it separated out.

For twigs with a constant number of components, like [%tsls a b], tall mode relies on the parser to stop itself. With a variable fanout we need a == terminator:

:*  a
    b
==

The indentation rules of constant-fanout twigs are unusual. Consider the stem ?:, which happens to have the exact same semantics as C ?: (if-then-else). Where in wide form we'd write ?:(a b c), in tall form the convention is

?:  a
  b
c

This is backstep indentation. Its motivation: defend the right margin. Ideally, c above is a longer code block than a or b. And in c, we have not lost any margin at all. Ideally, we can write an arbitrarily tall and complex twig that always grows vertically and never has a margin problem. Backstep indentation is tail call optimization for syntax.

Finally, the wide form with rune and parentheses (like =+(a b)) is the normal wide form. Hoon has a healthy variety of irregular wide forms, for which no principles at all apply. But using normal forms everywhere would get quite cumbersome.

`I4`: Arvo

I4 is the source code for Arvo: about 600 lines of Hoon. This is excessive for what Arvo does and can probably be tightened. Of course, it does not include the kernel modules (or vanes).

I4 formally produces the kernel step function, step for I0. But in a practical computer, we don't run I0, and there are other things we want to do besides step it. Inside step is the Arvo core (leg 7, context), which does the actual work.

Arvo's arms are keep, peek, poke, wish, load.

wish compiles a Hoon string. keep asks Arvo when it wants to be woken up next. peek exposes the Arvo namespace. These three are formally unnecessary, but useful in practice for the Unix process that runs Arvo.

The essential Arvo arm is poke, which produces a gate which takes as arguments the current time and an event. (Urbit time is a 128-bit atom, starting before the universe and ending after it, with 64 bits for subseconds and 64 bits for seconds, ignoring any post-2015 leap seconds.)

The product of the poke gate is a cell [action-list Arvo]. So Arvo, poked with the time and an event, produces a list of actions and a new Arvo state.

Finally, load is used to replace the Arvo core. When we want to change Hoon or Arvo proper, we build a new core with our new tools, then start it by passing load the old core's state. (State structures in the new core need not match the old, so the new core may need adapter functions.) load produces its new core wrapped in the canonical outer step function.

Since the language certainly needs to be able to change, the calling convention to the next generation is a Nock convention. load is leg 54 of the core battery, producing a gate whose formal argument at leg 6 is replaced with the Arvo state, then applied to the formula at leg 2. Of course, future cores can use a different upgrade convention. Nothing in Arvo or Urbit is technically frozen, except for Nock.

State

Arvo has three pieces of dynamic state: its network address, or plot; an entropy pool; the kernel modules, or vanes.

Configuration overview

The first main-sequence event, I5, is [%init plot], with some unique Urbit address (plot) whose secrets this urbit controls. The plot, an atom under 2^128, is both a routing address and a cryptographic name. Every Urbit event log starts by setting its own unique and permanent plot.

From I6 we begin to install the vanes: kernel modules. Vanes are installed (or reloaded) by a [%veer label code] event. Vane labels by convention are cords with zero or one letter.

The vane named %$ (zero, the empty string) is the standard library. This contains library functions that aren't needed in the Hoon kernel, but are commonly useful at a higher level.

The standard library is compiled with the Hoon kernel (I2) as subject. The other vanes are compiled with the standard library (which now includes the kernel) as subject. Vane %a handles networking; %c, storage; %f, build logic; %g, application lifecycle; %e, http interface; %d, terminal interface; %b, timers; and %j, storage of secrets.

Mechanics

Vanes are stored as a cell of a noun with its span. At least in Urbit, there is no such thing as a dynamic type system - only a static type system executed at runtime.

Storing vanes with their spans has two benefits. One, it lets us type-check internal event dispatch. Two, it lets us do typed upgrades when we replace a vane. It makes routine dispatch operations incur compiler cost, but this caches well.

Input and output

From the Unix process's perspective, Arvo consumes events (which Unix generates) and produces actions (which Unix executes). The most common event is hearing a UDP packet; the most common action is sending a UDP packet. But Arvo also interacts (still in an event-driven pattern) with other Unix services, including HTTP, the console, and the filesystem.

Events and actions share the ovum structure. An ovum is a cell [wire card]. A wire is a path - a list of cords, like [%foo %bar %baz ~], usually written with the irregular syntax /foo/bar/baz. A card is a tagged union of all the possible types of events and effects.

A wire is a cause. For example, if the event is an HTTP request %thus, the wire will contain (printed to cords) the server port that received the request, the IP and port of the caller, and the socket file descriptor.

The data in the wire is opaque to Urbit. But an Urbit poke, in response to this event or to a later one, can answer the request with an action %this - an HTTP response. Unix parses the action wire it sent originally and responds on the right socket file descriptor.

Moves and ducts

Event systems are renowned for "callback hell". As a purely functional environment, Arvo can't use callbacks with shared state mutation, a la node. But this is not the only common pitfall in the event landscape.

A single-threaded, nonpreemptive event dispatcher, like node or Arvo, is analogous to a multithreaded preemptive scheduler in many ways. In particular, there's a well-known duality between event flow and control flow.

One disadvantage of many event systems is unstructured event flow, often amounting to "event spaghetti". Indeed, the control-flow dual of an unstructured event system is goto.

Arvo is a structured event system whose dual is a call stack. An internal Arvo event is called a "move". A move has a duct, which is a list a wires, each of which is analagous to a stack frame. Moves come in two kinds: a %pass move calls upward, pushing a frame to the duct, while a %give move returns a result downward, popping a frame off the duct.

%pass contains a target vane name; a wire, which is a return pointer that defines this move's cause within the source vane; and a card, the event data. %give contains just the card.

The product of a vane event handler is a cell [moves new-state], a list of moves and the vane's new state.

On receiving a Unix event, Arvo turns it into a %pass by picking a vane from the Unix wire (which is otherwise opaque), then pushes it on a move stack.

The Arvo main loop pops a move off the move stack, dispatches it, replaces the result vane state, and pushes the new moves on the stack. The Unix event terminates when the stack is empty.

To dispatch a %pass move sent by vane %x, to vane %y, with wire /foo/bar, duct old-duct, and card new-card, we pass [[/x/foo/bar old-duct] new-card], a [duct card] cell, to the call method on vane %y. In other words, we push the given wire (adding the vane name onto the front) on to the old duct to create the new duct, and then pass that along with the card to the other vane.

To dispatch a %give move returned by vane %x, we check if the duct is only one wire deep. If so, return the card as an action to Unix. If not, pull the original calling vane from the top wire on the duct (by destructuring the duct as [[vane wire] plot-duct]), and call the take method on vane with [plot-duct wire card].

Intuitively, a pass is a service request and a give is a service response. The wire contains the information (normalized to a list of strings) that the caller needs to route and process the service result. The effect always gets back to its cause.

A good example of this mechanism is any internal service which is asynchronous, responding to a request in a different system event than its cause - perhaps many seconds later. For instance, the %c vane can subscribe to future versions of a file.

When %c gets its service request, it saves it with the duct in a table keyed by the subscription path. When the future version is saved on this path, the response is sent on the duct it was received from. Again, the effect gets back to its cause. Depending on the request, there may even be multiple responses to the same request, on the same duct.

In practice, the structures above are slightly simplified - Arvo manages both vanes and moves as vases, [span noun] cells. Every dispatch is type-checked.

One card structure that Arvo detects and automatically unwraps is [%meta vase] - where the vase is the vase of a card. %meta can be stacked up indefinitely. The result is that vanes themselves can operate internally at the vase level - dynamically executing code just as Arvo itself does.

The %g vane uses %meta to expose the vane mechanism to user-level applications. The same pattern, a core which is an event transceiver, is repeated across four layers: Unix, Arvo, the %g vane, and the :dojo shell application.

Event security

Arvo is a "single-homed" OS which defines one plot, usually with the label our, as its identity for life. All vanes are fully trusted by our. But internal security still becomes an issue when we execute user-level code which is not fully trusted, or work with data which is not trusted.

Our security model is causal, like our event system. Every event is a cause and an effect. Event security is all about who/whom: who (other than us) caused this event; whom (other than us) it will affect. our is always authorized to do everything and hear anything, so strangers alone are tracked.

Every event has a security mask with two fields who and hum, each a (unit (set plot)). (A unit is the equivalent of Haskell's Maybe - (unit x) is either [~ x] or ~, where ~ is nil.)

If who is ~, nil, anyone else could have caused this move -- in other words, it's completely untrusted. If who is [~ ~], the empty set, no one else caused this move -- it's completely trusted. Otherwise, the move is "tainted" by anyone in the set.

If hum is ~, nil, anyone else can be affected by this move -- in other words, it's completely unfiltered. If hum is [~ ~], the empty set, no one else can hear the move -- it's completely private. Otherwise, the move is permitted to "leak" to anyone in the set.

Obviously, in most moves the security mask is propagated from cause to effect without changes. It's the exceptions that keep security exciting.

Namespace

Besides call and take, each vane exports a scry gate whose argument is a path - a list of strings, like a wire.

scry implements a global monotonic namespace - one that (a) never changes its mind, for the lifetime of the urbit; and (b) never conflicts across well-behaved urbits.

This invariant is semantic - it's not enforced by the type system. Getting it right is up to the vane's developer. Installing kernel modules is always a high-trust operation.

The product of the scry gate is (unit (unit value)).

So scry can produce ~, meaning the path is not yet bound; [~ ~], meaning the path is bound to empty; or [~ ~ value], an actual value is bound. This value is of the form [mark span noun], where span is the type of the noun and mark is a higher-level "type" label best compared to a MIME type or filename extension. Marks are discussed under the %f vane.

Vane activation and namespace invariant

The vane that Arvo stores is not the core that exports call etc, but a gate that produces the core. The argument to this gate is a cell [@da $+(path (unit (unit value)))]. Note that $+(path (unit (unit value))) is just the function signature of the scry namespace.

So, the head of the argument is the current time; the tail is the system namespace. The general namespace is constructed from the scry method of the vanes. The first letter of the head of the path identifies the vane; the rest of the head, with the rest of the path, is passed to the vane. The vane core is activated just this way for all its methods, including scry itself.

This answers the question of how to expose dynamic state without violating our referential transparency invariants. If we require the query to include exactly the current time (which is guaranteed to change between events) in the path, then we are able to respond with the current state of the dynamic data. If the path doesn't contain the current time, then fail with [~ ~]. This technique is only needed, of course, when you don't know what the state was in the past.

Additionally, since each vane knows the plot it's on, a scry query with the plot in the path is guaranteed to be globally unique.

But vanes do not have to rely on these safe methods to maintain monotonicity - for instance, the %c revision-control vane binds paths around the network and across history.

A tour of the vanes

The vanes are separately documented, because that's the entire point of vanes. Let's take a quick tour through the system as it currently stands, however.

`%a` `%ames`: networking

%a is the network vane, currently %ames (2000 lines). %ames implements an encrypted P2P network over UDP packets.

As a vane, %ames provides a simple %pass service: send a message to another plot. The message is an arbitrary [wire noun] cell: a message channel and a message body. %ames will respond with a %give that returns either success or failure and an error dump.

%ames messages maintain causal integrity across the network. The sender does not send the actual duct that caused the message, of course, but an opaque atom mapped to it. This opaque bone, plus the channel itself, are the "socket" in the RFC sense.

Messages are ordered within this socket and delivered exactly once. (Yes, as normally defined this is impossible. Urbit can do exactly-once because %ames messaging is not a tool to build a consistency layer, but a tool built on the consistency layer. Thus, peers can have sequence numbers that are never reset.)

More subtly, local and remote causes treated are exactly the same way, improving the sense of network abstraction. CORBA taught us that it's not possible to abstract RPC over local function calls without producing a leaky abstraction. The Arvo structured event model is designed to abstract over messages.

One unusual feature is that %ames sends end-to-end acks. Acknowledging the last packet received in a message acknowledges that the message itself has been fully accepted. There is no separate message-level result code. Like its vane interface, %ames acks are binary: a positive ack has no data, a negative ack has a dump. Thus, Arvo is transactional at the message level as well as the packet/event level.

Another %ames technique is dropping packets. For instance, it is always a mistake to respond to a packet with bad encryption: it enables timing attacks. Any packet the receiver doesn't understand should be dropped. Either the sender is deranged or malicious, or the receiver will receive a future upgrade that makes a later retransmission of this packet make sense.

Messages are encrypted in a simple PKI (currently RSA, soon curve/ed25519) and AES keys are exchanged. Certificate transfer and key exchange are aggressively piggybacked, so the first packet to a stranger is always signed but not encrypted. The idea is that if you've never talked to someone before, probably the first thing you say is "hello." Which is not interesting for any realistic attacker. In the rare cases where this isn't true, it's trivial for the application to work around. (See the network architecture section for more about the PKI approach.)

(For packet geeks only: %ames implements both "STUN" and "TURN" styles of peer-to-peer NAT traversal, using parent plots (see below) as supernodes. If the sending plot lacks a current IP/port (Urbit uses random ports) for the destination, it forwards across the parent hierarchy. As we forward a packet, we attach the sender's address, in case we have a full-cone NAT that can do STUN.

When these indirect addresses are received, they are regarded with suspicion, and duplicate packets are sent - one forwarded, one direct. Once a direct packet is received, the indirect address is dropped. Forwarding up the hierarchy always succeeds, because galaxies (again, see below) are bound into the DNS at galaxy.urbit.org.)

`%c` `%clay`: filesystem

%c is the filesystem, currently %clay (3000 lines). %clay is a sort of simplified, reactive, typed git.

The %clay filesystem uses a uniform inode structure, ankh. An ankh contains: a Merkle hash of this subtree, a set of named children, and possibly file data. If present, the file data is typed and marked (again, see the %f vane for more about marked data).

A path within the tree always contains the plot and desk (lightweight branch) where the file is located, which version of the file to use, and what path within the desk the file is at.

Paths can be versioned in three ways: by change number (for changes within this desk); by date; or by label.

Where git has multiple parallel states, %clay uses separate desks. For instance, where git creates a special merge state, %clay just uses a normal scratch desk. Don't detach your head, merge an old version to a scratch desk. Don't have explicit commit messages, edit a log file. %clay is a RISC git.

%clay is a typed filesystem; not only does it save a type with each file, but it uses mark services from %f to perform typed diff and patch operations that match the mark types. Obviously, Hunt-McIlroy line diff only works for text files. %clay can store and revise arbitrary custom data structures, not a traditional revision-control feature.

The scry namespace exported by clay uses the mode feature to export different views of the filesystem. For instance, [%cx plot desk version path], which passes %x as the mode to clay's scry, produces the file (if any) at that path. %y produces the directory, essentially - the whole ankh with its subtrees trimmed. %z produces the whole ankh.

%clay is reactive; it exports a subscription interface, with both local and network access. A simple use is waiting for a file or version which has not yet been committed.

A more complex use of %clay subscription is synchronization, which is actually managed at the user level within a %gall application (:hood). It's straightforward to set up complex, multistep or even cyclical chains of change propagation. Intuitively, git only pulls; %clay both pulls and pushes.

Finally, the easiest way to use %clay is to edit files directly from Unix. %clay can mount and synchronize subtrees in the Urbit home directory. There are two kinds of mount: export and sync. While an export mount marks the Unix files read-only, sync mounts support "dropbox" style direct manipulation. The Unix process watches the subtree with inotify() or equivalent, and generates filesystem change events automagically. A desk which is synced with Unix is like your git working tree; a merge from this working desk to a more stable desk is like a git commit.

Access control in %clay is an attribute of the desk, and is either a blacklist or whitelist of plots. Creating and synchronizing desks is easy enough that the right way to create complex access patterns is to make a desk for the access pattern, and sync only what you want to share into it.

`%f` `%ford`: builder

%f is the functional build system, currently %ford (1800 lines). It might be compared to make et al, but loosely.

%ford builds things. It builds applications, resources, content type conversions, filter pipelines, more or less any functional computation specified at a high level. Also, it caches repeated computations and exports a dependency tracker, so that %ford users know when they need to rebuild.

%ford does all its building and execution in the virtual Nock interpreter, mock. mock is a Nock interpreter written in Hoon, and of course executed in Nock. (See the implementation issues section for how this is practical.)

mock gives us two extra affordances. One, when code compiled with tracing hints crashes deterministically, we get a deterministic stack trace. Two, we add a magic operator 11 (.^ in Hoon) which dereferences the global namespace. This is of course referentially transparent, and mock remains a functional superset of Nock.

%ford is passed one card, [%exec silk], which specifies a computation - essentially, a makefile as a noun. The silk structure is too large to enumerate here, but some highlights:

The simplest task of %ford is building code, by loading sources and resources from %clay. For kernel components, a source file is just parsed directly into a twig. %ford can do a lot of work before it gets to the twig.

The Hoon parser for a %ford build actually parses an extended build language wrapped around Hoon. Keeping the build instructions and the source in one file precludes a variety of exciting build mishaps.

For any interesting piece of code, we want to include structures, libraries, and data resources. Some of these requests should be in the same beak (plot/desk/version) as the code we're building; some are external code, locked to an external version.

%ford loads code from %clay with a role string that specifies its use, and becomes the head of the spur. (Ie, the first path item after plot, desk and version.) Roles: structures are in /sur, libraries in /lib, marks in /mar, applications in /app, generators in /gen, fabricators in /fab, filters in /tip, and anything not a Hoon file in /doc.

Data resources are especially interesting, because we often want to compile whole directory trees of resources into a single noun, a typed constant from the point of view of the programmer. Also, requests to %clay may block - %ford will have to delay giving its result, and wait until a subscription for the result returns.

Moreover, for any interesting build, the builder needs to track dependencies. With every build, %ford exports a dependency key that its customers can subscribe to, so that they get a notification when a dependency changes. For example, %gall uses this feature to auto-reload applications.

Finally, both for loading internal resources in a build and as a vane service (notably used by the web vane %eyre), %ford defines a functional namespace which is mapped over the static document space in /doc.

If we request a resource from %ford that does not correspond to a file in /doc, we search in /fab for the longest matching prefix of the path. The fabricator receives as its sole argument that part of the path not part of the matching prefix.

Thus, if we search for /a/b/c/d, and there's no /doc/a/b/c/d, then we first check for /fab/a/b/c/d, then /fab/a/b/c, and so on. If we find, for example, a /fab/a/b, then we run that fabricator with the argument /c/d. Thus, a fabricator can present an arbitrary virtual document tree.

The %ford vane also exports its namespace through scry, but scry has no way to block if a computation waits. It will just produce ~, binding unknown.

Another major %ford feature is the mark system. While the Hoon type system works very well, it does not fill the niche that MIME types fill on the Internets. Marks are like MIME types, if MIME types came with with executable specifications which defined how to validate, convert, diff, and patch content objects.

The mark %foo is simply the core built from /mar/foo. If there is no /mar/foo, all semantics are defaulted - %foo is treated as %noun. If there is a core, it can tell %ford how to validate/convert from other marks (validation is always equivalent to a conversion from %noun, and seldom difficult, since every Hoon structure is defined as a fixpoint normalizer); convert to other marks; patch, diff, and merge. Obviously %clay uses these revision control operations, but anyone can.

For complex conversions, %ford has a simple graph analyzer that can convert, or at least try to convert, any source mark to any target mark.

There are two ford request modes, %c and %r - cooked and raw. A cooked request, %fc in scry, has a target mark and converts the result to it. A raw request, %fr, returns whatever %ford finds easiest to make.

Generators and filters, /gen and /tip, are actually not used by any other vane at present, but only by the user-level :dojo app, which constructs dataflow calculations in Unix pipe style. Applications, /app, are used only by the %g vane.

`%g` `%gall`: application driver

%g is the application system, currently %gall (1300 lines) %gall is half process table and half systemd, sort of.

%gall runs user-level applications much as Arvo runs vanes. Why this extra layer? Vanes are kernel components written by kernel hackers, for whom expressiveness matters more than convenience. A bad kernel module can disable the OS; a bad application has to be firewalled.

But the basic design of an application and a vane is the same: a stateful core that's called with events and produces moves. Only the details differ, and there are too many to cover here.

One example is that %gall apps don't work with Arvo ducts directly, but opaque integers that %gall maps to ducts; the consequences of a bad app making a bad duct would be odd and potentially debilitating, so we don't let them touch ducts directly. Also, %gall does not like to crash, so to execute user-level code it uses the virtual interpeter mock. At the user level, the right way to signal an error is to crash and let %gall pick up the pieces - in a message handler, for instance, this returns the crash dump in a negative ack.

%gall uses %ford to build cores and track dependencies. Like vanes, %gall applications update when new code is available, sometimes using adapter functions to translate old state types to new ones. Unlike Arvo, %gall triggers updates automatically when a dependency changes.

Other than convenience and sanity checks, the principal value add of %gall is its inter-application messaging protocol. %gall messaging supports two communication patterns: a one-way message with a success/error result (%poke), and classic publish and subscribe (%peer).

%peer is designed for synchronizing state. The subscriber specifies a path which defines an application resource, and receives a stream of %full and %diff cards, total and incremental updates. (A simple "get" request is a special case in which there is a single full and no diffs.) Backpressure breaks the subscription if the queue of undelivered updates grows too deep. Broken subscriptions should not cause user-level errors; the user should see an error only if the subscription can't be reopened.

Recall that %ames routes messages which are arbitrary nouns between vanes on different urbits. %gall uses these %ames messages for remote urbits, and direct moves for local IPC. Thus, from the app's perspective, messaging an app on another Urbit is the same as messaging another app on the same Urbit. The abstraction does not leak.

Messages are untyped at the %ames level, but all %gall messages carry a mark and are validated by the receiver. Marks are not absolute and timeless - %ford needs a beak (plot, desk, version) to load the mark source.

When a message fails to validate, the packet is dropped silently. This puts the sender into exponential backoff retransmission. The general cause of a message that doesn't validate is that the sender's mark has received a backward-compatible update (mark updates that aren't backward compatible are a bad idea - use a new name), and the receiver doesn't have this update yet. The retransmitted packet will be processed correctly once the update has propagated.

With this mechanism, Urbit can update a distributed system (such as our own :talk network), upgrading both applications and protocols, silently without user intervention or notification. The general pattern of application distribution is that the app executes from a local desk autosynced to a remote urbit, which performs the function of an app store or distro. As this server fills its subscriptions, a period of network heterogeneity is inevitable; and so is transient unavailability, as new versions try to talk to old ones. But it resolves without hard errors as all clients are updated.

`%e` `%eyre`: web server/client

%e is the web system, currently %eyre (1600 lines).

%eyre has three purposes. First, it is an HTTP and HTTPS client - or rather, it interfaces via actions/events to HTTP client handlers in the Unix layer.

Second, %eyre serves the %ford namespace over HTTP. The URL extension is parsed as a mark, but the default mark for requests is urb, which creates HTML and injects an autoupdate script that long-polls on the dependencies, reloading the page in the background when they change.

The normal way of publishing static or functional content in Urbit is to rely on %ford for format translation. Most static content is in markdown, the %md mark. Dynamic content is best generated with the sail syntax subsystem in Hoon, which is is essentially an integrated template language that reduces to XML.

Third, %eyre acts as a client for %gall apps, translating the %gall message flow into JSON over HTTP. %poke requests become POST requests, %peer becomes a long-poll stream. Our urb.js framework is a client-side wrapper for these requests.

The abstraction does not leak. On the server side, all it takes to support web clients is ensuring that outbound marks print to %json and inbound marks parse from %json. (The standard library includes JSON tools.) The %gall application does not even know it's dealing with a web client, all it sees are messages and subscriptions, just like it would receive from another Urbit app.

The dataflow pattern of %gall subscriptions is ideal for React and similar "one-way data binding" client frameworks.

Of course, %gall apps expect to be talking to an urbit plot, so web clients need to (a) identify themselves and (b) talk over HTTPS. %eyre contains a single sign-on (SSO) flow that authenticates an urbit user either to her own server or her friends'.

Ideally, an urbit named ~plot is DNSed to plot.urbit.org. If you use your urbit through the web, you'll have an insecure cookie on *.urbit.org. Other urbits read this and drive the SSO flow; if you're ~tasfyn-partyv, you can log in as yourself to ~talsur-todres.urbit.org. The SSO confirmation message is sent directly as an %ames message between the %eyre vanes on the respective urbits.

Urbit also has an nginx configuration and node cache server, which (a) let a relatively slow Urbit server drive reasonably high request bandwidth, and (b) serve HTTPS by proxy.

Note that while external caching helps %ford style functional publishing, it does not help %gall style clients, which use server resources directly. Urbit is a personal server; it can handle an HN avalanche on your blog, but it's not designed to power your new viral startup.

`%j` `%jael`: secret storage

%j, currently %jael (200 lines), saves secrets in a tree. Types of secrets that belong in %jael: Urbit private keys, Urbit symmetric keys, web API user keys and/or passwords, web API consumer (application) keys.

%jael has no fixed schema and is simply a classic tree registry. Besides a simpler security and type model, the main difference between secrets and ordinary %clay data is that secrets expire - sometimes automatically, sometimes manually. When another vane uses a %jael secret, it must register to receive an expiration notice.

`%d` `%dill`: console and Unix

%d, currently %dill (450 lines) handles the terminal and miscellaneous Unix interfaces, mainly for initialization and debugging.

The console terminal abstraction, implemented directly with terminfo in raw mode, gives Urbit random-access control over the input line. Keystrokes are not echoed automatically. Output lines are write-only and appear above the input line.

%dill also starts default applications, measures system memory consumption, dumps error traces, etc.

`%b` `%behn`: timers

%b, currently %behn (250 lines), is a timer vane that provides a simple subscription service to other vanes.

Base applications

Arvo ships with three major default applications: :hood (1600 lines), :dojo (700 lines), and :talk (1600 lines).

`:dojo`: a shell

:dojo is a shell for ad-hoc computation. A dojo command is a classic filter pipeline, with sources, filters, and sinks.

Sources can be shell variables, %clay files, immediate expressions, source files in /gen, or uploads (Unix files in ~/urbit/$plot/.urb/get). There are three kinds of generator: simple expressions, dialog cores, and web scrapers. Generators are executed by %ford through mock, and can read the Urbit namespace via virtual operator 11.

Urbit does not use short-lived applications like Unix commands. A %gall application is a Unix daemon. A generator is not like a Unix process at all; it cannot send moves. A dialog generator, waiting for user input, does not talk to the console; it tells :dojo what to say to the console. A web scraper does not talk to %eyre; it tells :dojo what resources it needs (GET only). This is POLA (principle of least authority); it makes the command line less powerful and thus less scary.

:dojo filters are immediate expressions or /tip source files. Sinks are console output, shell variables, %clay, or downloads (Unix files in ~/urbit/$plot/.urb/put). (The upload/download mechanism is a console feature, not related to %clay sync - think of browser uploading and downloading.)

`:talk`: a communication bus

:talk is a distributed application for sharing telegrams, or typed social messages. Currently we use :talk as a simple chat service, but telegram distribution is independent of type.

Every urbit running :talk can create any number of "stations." Every telegram has a unique id and a target audience, to which its sender uploads it. But :talk stations also subscribe to each other to construct feeds.

Mutually subscribing stations will mirror, though not preserving message order. A :talk station is not a decentralized entity; but urbits can use subscription patterns to federate into "global" stations - very much as in NNTP (Usenet).

Stations have a flexible admission control model, with a blanket policy mode and an exception list, that lets them serve as "mailboxes" (public write), "journals" (public read), "channels" (public read/write), or "chambers" (private read/write).

Subscribers are also notified of presence and configuration changes, typing indicator, etc. Also, telegrams contain a "flavor", which senders can use to indicate categories of content that other readers may prefer not to see.

:talk is usable both a command-line application and a reactive web client.

`:hood`: a system daemon

:hood is a classic userspace system daemon, like Unix init. It's a system component, but it's in userspace because it can be.

:hood is actually a compound application made of three libraries, /helm, /drum, and /kiln. /helm manages the PKI; /drum multiplexes the console; /kiln controls %clay sync.

/drum routes console activity over %gall messages, and can connect to both local and foreign command-line interfaces - ssh, essentially.

One pattern in Urbit console input is that an application doesn't just parse its input after a line is is entered, but on each character. This way, we can reject or correct syntax errors as they happen. It's a much more pleasant user experience.

But since both sides of a conversation (the user and the application) are making changes to a single shared state (the input line), we have a reconciliation problem. The Urbit console protocol uses operational transformation (like Google Wave or git replot) for eventual consistency.

/helm manages public keys and (in future) hosted urbits. See the network architecture section below.

/kiln implements mirroring and synchronization. Any desk can mirror any other desk (given permission). Mirrors can form cycles -- a two-way mirror is synchronization.

Implementation issues

We've left a couple of knotty implementation issues unresolved up until now. Let's resolve them.

Jets

How can an interpreter whose only arithmetic operator is increment compute efficiently? For instance, the only way to decrement n is to count up to n - 1, which is O(n).

Obviously, the solution is: a sufficiently smart optimizer.

A sufficiently smart optimizer doesn't need to optimize every Nock formula that could calculate a decrement function. It only needs to optimize one: the one we actually run.

The only one we run is the one compiled from the decrement function dec in the Hoon standard library. So there's no sense in which our sufficiently smart optimizer needs to analyze Nock formulas to see if they're decrement formulas. It only needs to recognize the standard dec.

The easiest way to do this is for the standard dec to declare, with a hint (Nock 10), in some formalized way, that it is a decrement formula. The interpreter implementation can check this assertion by simple comparison - it knows what formula the standard dec compiles to. Our sufficiently smart optimizer isn't very smart at all!

The C module that implements the efficient decrement is called a "jet." The jet system should not be confused with an FFI: a jet has no reason to make system calls, and should never be used to produce side effects. Additionally, a jet is incorrect unless it accurately duplicates an executable specification (the Hoon code). Achieving jet correctness is difficult, but we can spot-check it easily by running both soft and hard versions.

Jets separate mechanism and policy in Nock execution. Except for perceived performance, neither programmer nor user has any control over whether any formula is jet-propelled. A jet can be seen as a sort of "software device driver," although invisible integration of exotic hardware (like FPGAs) is another use case. And jets do not have to be correlated with built-in or low-level functionality; for instance, Urbit has a markdown parser jet.

Jets are of course the main fail vector for both computational correctness and security intrusion. Fortunately, jets don't make system calls, so sandboxing policy issues are trivial, but the sandbox transition needs to be very low-latency. Another approach would be a safe systems language, such as Rust.

The correctness issue is more interesting, because errors happen. They are especially likely to happen early in Urbit's history. A common scenario will be that the host audits an urbit by re-executing all the events, and produces a different state. In this case, the urbit must become a "bastard" - logically instantiated at the current state. The event log is logically discarded as meaningless. Hosting a bastard urbit is not a huge problem, but if you have one you want to know.

In the long run, jet correctness is an excellent target problem for fuzz testers. A sophisticated implementation might even put 10% of runtime into automatic jet testing. While it's always hard for implementors to match a specification perfectly, it's much easier with an executable specification that's only one or two orders of magnitude slower.

Event planning

How do we actually process Urbit events? If Urbit is a database, how does our database execute and maintain consistency in practice, either on a personal computer or a normal cloud server? How does orthogonal persistence work? Can we use multiple cores?

An event is a transaction. A transaction either completes or doesn't, and we can't execute its side effects until we have committed it. For instance, if an incoming packet causes an outgoing packet, we can't send the outgoing packet until we've committed the incoming one to durable storage.

Unfortunately, saving even a kilobyte of durable storage on a modern PC, with full write-through sync, can take 50 to 100ms. Solid-state storage improves this, but the whole machine is just not designed for low-latency persistence.

In the cloud the situation is better. We can treat consensus on a nontrivial Raft cluster in a data center as persistence, even though the data never leaves RAM. Urbit is highly intolerant of computation error, for obvious reasons, and should be run in an EMP shielded data center on ECC memory.

There are a number of open-source reliable log processors that work quite well. We use Apache Kafka.

With either logging approach, the physical architecture of an Urbit implementation is clear. The urbit is stored in two forms: an event log, and a checkpoint image (saved periodically, always between events). The log can be pruned at the last reliably recorded checkpoint, or not. This design (sometimes called "prevalence") is widely used and supported by common tools.

The checkpointing process is much easier because Urbit has no cycles and needs no tracing garbage collector. Without careful tuning, a tracing GC tends to turn all memory available to it into randomly allocated memory soup. The Urbit interpreter uses a region system with a copy step to deallocate the whole region used for processing each event.

Events don't always succeed. How does a functional operating system deal with an infinite loop? The loop has to be interrupted, as always. How this happens depends on the event. If the user causes an infinite loop from a local console event, it's up to the user to interrupt it with ^C. Network packets have a timeout, currently a minute.

When execution fails, we want a stack trace. But Urbit is a deterministic computer and the stack trace of an interrupt is inherently nondeterministic. How do we square this circle?

Urbit is deterministic, but it's a function of its input. If an event crashes, we don't record the event in I. Instead, we record a %crud card that contains the stack trace and the failed event. To Urbit, this is simply another external event.

Processing of %crud depends on the event that caused it. For keyboard input, we print the error to the screen. For a packet, we send a negative ack on the packet, with the trace. For instance, if you're at the console of urbit A and logged in over the network to B, and you run an infinite loop, the B event loop will time out; the network console message that A sent to B will return a negative ack; the console application on A will learn that its message failed, and print the trace.

Finally, the possibilities of aggressive optimization in event execution haven't been explored. Formally, Urbit is a serial computer - but it's probable that a sophisticated, mature implementation would find a lot of ways to cheat. As always, a logically simple system is the easiest system to hack for speed.

Network and PKI architecture

Two more questions we've left unanswered: how you get your urbit plot, and how packets get from one plot to another.

Again, a plot is both a digital identity and a routing address. Imagine IPv4 if you owned your own address, converted it to a string that sounded like a foreign name, and used it as your Internet handle.

Bases are parsed and printed with the %p aura, which is designed to make them as human-memorable as possible. %p uses phonemic plot-256 and renders smaller plots as shorter strings:

8 bits    galaxy  ~syd
16 bits   star    ~delsym
32 bits   planet  ~mighex-forfem
64 bits   moon    ~dabnev-nisseb-nomlec-sormug
128 bits  comet   ~satnet-rinsyr-silsec-navhut--bacnec-todmeb-sarseb-pagmul

Of course, not everyone who thinks he's Napoleon is. For the urbit to actually send and receive messages under the plot it assigns itself in the %init event (I5), later events must convey the secrets that authenticate it. In most cases this means a public key, though some urbits are booted with a symmetric session key that only works with the parent plot.

How do you get a plot? Comets are hashes of a random public key. "Civil" non-comets are issued by their parent plot - the numerical prefix in the next rank up: moons by planet, planets by star, stars by galaxy. The fingerprints of the initial galactic public keys (currently test keys) are hardcoded in %ames.

The comet network is fully decentralized and "free as in beer," so unlikely to be useful. Any network with disposable identities can only be an antisocial network, because disposable identities cannot be assigned default positive reputation. Perhaps comets with nontrivial hashcash in their plots could be an exception, but nonmemorable plots remain a serious handicap.

The civil network is "semi-decentralized" - a centralized system designed to evolve into a decentralized one. Urbit does not use a blockchain - address space is digital land, not digital money.

The initial public key of a civil plot is signed by its parent. But galaxies, stars and planets are independent once they've been created - they sign their own updates. (Moons are dependent; their planet signs updates.)

The certificate or will is a signing chain; only the last key is valid; longer wills displace their prefixes. Thus update is revocation, and revocation is a pinning race. To securely update a key is to ensure that the new will reaches any peer before any messages signed by an attacker who has stolen the old key.

The engineering details of this race depend on the actual threat model, which is hard to anticipate. If two competing parties have the same secret, no algorithm can tell who is in the right. Who pins first should win, and there will always be a time window during which the loser can cause problems. Aggressively flooding and expiring certificates reduces this window; caching and lazy distribution expands it. There is no tradeoff-free solution.

Broadly, the design difference between Urbit and a blockchain network is that blockchains are "trust superconductors" - they eliminate any dependency on social, political or economic trust. Urbit is a "trust conductor" - engineered to minimize, but not eliminate, dependencies on trust.

For instance, bitcoin prevents double-spending with global mining costs in multiple dollars per transaction (as of 2015). Trusted transaction intermediation is an easily implemented service whose cost is a tiny fraction of this. And the transaction velocity of money is high; transactions in land are far more rare.

Another trust engineering problem in Urbit is the relationship between a plot and its parent hierarchy. The hierarchy provides a variety of services, starting with peer-to-peer routing. But children need a way to escape from bad parents.

Urbit's escape principle: (a) any planet can depend on either its star's services, or its galaxy's; (b) any star can migrate to any galaxy; (c) stars and galaxies should be independently and transparently owned.

Intuitively, an ordinary user is a planet, whose governor is its star, and whose appeals court is its galaxy. Since a star or galaxy should have significant reputation capital, it has an economic incentive to follow the rules. The escape system is a backup. And the comet network is a backup to the backup.

From a privacy perspective, a planet is a personal server; a star or galaxy is a public service. Planet ownership should be private and secret; star or galaxy ownership should be public and transparent. Since, for the foreseeable future, individual planets have negligible economic value, Urbit is not a practical money-laundering tool. This is a feature, not a bug.

Finally, a general principle of both repositories and republics is that the easier it is (technically) to fork the system, the harder it is (politically) to fork. Anyone could copy Urbit and replace the galactic fingerprint block. Anyone can also fork the DNS. If the DNS was mismanaged badly enough, someone could and would; since it's competently managed, everyone can't and won't. Forkability is an automatic governance corrector.

From personal server to digital republic

Republics? Any global system needs a political design. Urbit is designed as a digital republic.

The word republic is from Latin, "res publica" or "public thing." The essence of republican goverment is its constitutional quality - government by law, not people.

A decentralized network defined by a deterministic interpreter comes as close to a digital constitution as we can imagine. Urbit is not the only member of this set - bitcoin is another; ethereum is even a general-purpose computer.

The "law" of bitcoin and ethereum is self-enforcing; the "law" of Urbit is not. Urbit is not a blockchain, and no urbit can assume that any other urbit is computing the Nock function correctly.

But at least the distinction between correct and incorrect computing is defined. In this sense, Nock is Urbit's constitution. It's not self-enforcing like Ethereum, but it's exponentially more efficient.

Non-self-verifying rules are useful, too. Defining correctness is not enforcing correctness. But Urbit doesn't seek to eliminate its dependency on conventional trust. Correctness precisely defined is easily enforced with social tools: either the computation is tampered with or it isn't.

Conclusion

On the bottom, Urbit is an equation. In the middle it's an operating system. On the top it's a civilization -- or at least, a design for one.

When we step back and look at that civilization, what we see isn't that surprising. It's the present that the past expected. The Internet is what the Arpanet of 1985 became; Urbit is what the Arpanet of 1985 wanted to become.

In 1985 it seemed completely natural and inevitable that, by 2015, everyone in the world would have a network computer. Our files, our programs, our communication would all go through it.

When we got a video call, our computer would pick up. When we had to pay a bill, our computer would pay it. When we wanted to listen to a song, we'd play a music file on our computer. When we wanted to share a spreadsheet, our computer would talk to someone else's computer. What could be more obvious? How else would it work?

(We didn't anticipate that this computer would live in a data center, not on our desk. But we didn't appreciate how easily a fast network can separate the server from its UI.)

2015 has better chips, better wires, better screens. We know what the personal cloud appliance does with this infrastructure. We can imagine our 1985 future ported to it. But the most interesting thing about our planets: we don't know what the world will do with them.

There's a qualitative difference between a personal appliance and a personal server; it's the difference between a social "network" and a social network. A true planet needs to work very hard to make social programming easier. Still, distributed social applications and centralized social applications are just different. A car is not a horseless carriage.

We know one thing about the whole network: by default, a social "network" is a monarchy. It has one corporate dictator, its developer. By default, a true network is a republic; its users govern it. And more important: a distributed community cannot coerce its users. Perhaps there are cases where monarchy is more efficient and effective -- but freedom is a product people want.

But no product is inevitable. Will we even get there? Will Urbit, or any planet, or any personal cloud server, actually happen?

It depends. The future doesn't just happen. It happens, sometimes. When people work together to make it happen. Otherwise, it doesn't.

The Internet didn't scale into an open, high-trust network of personal servers. It scaled into a low-trust network that we use as a better modem -- to talk to walled-garden servers that are better AOLs. We wish it wasn't this way. It is this way.

If we want the network of the future, even the future of 1985, someone has to build it. Once someone's built it, someone else has to move there. Otherwise, it won't happen.

And for those early explorers, the facilities will be rustic. Not at all what you're used to. More 1985 than 2015, even. These "better AOLs", the modern online services that are our personal appliances, are pretty plush in the feature department.

Could they just win? Easily. Quite easily. It's a little painful herding multiple appliances, but the problem is easily solved by a few more corporate mergers. We could all just have one big appliance. If nothing changes, we will.

To build a new world, we need the right equation and the right pioneers. Is Urbit the right equation? We think so. Check our work, please. Are you the right pioneer? History isn't over -- it's barely gotten started.

100 KiB Raw Blame History Unescape Escape