Concrete advantages of urbit
2015-7-18
Philip Monk
urbit is a cloud operating system and network designed to replace the traditional Unix stack and permit the creation of a personal cloud computing experience. It consists of an operating system, arvo, written in a purely functional language, hoon, run on a precisely defined virtual machine, nock. Instances of arvo communicate over a P2P network, ames. Unix was designed to solve the problems of the 1970s. Modern web apps require a large set of services and abstractions not provided by Unix. These are generally provided by the deep stack of third-party software that every app needs. urbit is a clean-slate layer over the internet, and it provides services and abstractions suitable for the development of modern web services.
A clean-slate programming platform must provide a series of concrete wins over the current unix/internet platform, and we enumerate some of them here. For technical overviews and motivation, look elsewhere.
It is very difficult to build useful, truly decentralized apps, which is why it is so rarely done. We contend that a large portion of this difficulty is accidental rather than essential. Specifically, it is the result of building these services on an operating system (Unix) and network (TCP/IP) designed and created in the 1970s to solve 1970s problems. A multitude of good ideas have arisen in the last forty years, but none of them were considered when our current software stack was designed.
Consider a simple chat service built to modern standards. The building blocks it needs are: (1) a browser front end, (2) a protocol for sending data to and receiving data from the server, (3) a transactional, ACID database, (4) mappings from the data structures in memory to the database format, (5) a way to verify identity, and (6) some way to manage reputation and prevent spam.
Unix itself provides little help to the developer of this service. These problems are solved by mixing and matching a variety of third-party userspace solutions, each of which provides a useful mostly-nonleaky abstraction. Putting together and maintaining this stack is a company-sized problem.
More perniciously, even when the stack is put together and the service is launched, this is a completely noncomposable, centralized service. None of the parts of this service can be used by other services, and if anyone wants to do something similar, they'll need to wrangle the same type of stack. Additionally, all the data for the app is on the company's servers, so (1) the users have no privacy, (2) a security breach at the company compromises everyone's data at once, and (3) the users are unable to do anything with their data except whatever the service's public interface allows.
The services and abstractions that Unix provides are far too low level to properly service the modern web. urbit provides much higher level services, and it provides them with decentralized applications in mind.
Unix itself provides none of these higher-level services, but there exist userspace components for each. In general, the solutions up to this point have been piecemeal -- a stack component here, a service there, with a generous helping of glue to keep it all together.
We present urbit as a clean-slate solution to all of these problems. Our thesis is that the best way to solve all these problems is to do so in a coherent, cohesive whole. Rather than a thin OS with a thick stack, we've built a thick OS to provide high-level abstractions and services. By attacking these problems from an OS perspective, we can deal with them thoroughly.
Identities
Nearly every application needs a concept of identity. For many cloud services, the only reason it needs to know the user's identity is to show them their own data. In a decentralized application, the user has their data on their own server, so this problem goes away completely.
Many services, though, need a concept of identity that spans the network. Unix has usernames, and the internet has domains, but these have nothing to do with each other. Each service ends up having to construct its own notion of identity.
urbit has cryptographic identity built into the network. An essentially infinite number of anonymous identities may be created, but there are a finite number of paracentrally-issued identities -- 2^32, to be exact. Because of the finite nature of these identities (technically, addresses), they have nonzero value, and thus may have positive default reputation.
Of course, if there are a finite number of identities, they must be allocated somehow. In urbit, this is accomplished through a simple hierarchy, with 256 top-level identity-issuing authorities, each of which may issue tickets for 255 second-tier urbits, each of which may issue tickets for 65,535 human-sized urbits. Once an identity is issued, it cannot be revoked. Lower-level urbits are independent entities, and the higher-level urbit has no control over them. Even though identities are issued paracentrally, the identity system is ultimately decentralized.
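The arithmetic of this hierarchy can be checked with a short sketch. It assumes, as an illustration not stated above, that each tier corresponds to an address width (8, 16, and 32 bits); the function names are hypothetical and this is not urbit's own code.

```python
# Assumption: tiers are distinguished by address width (8, 16, 32 bits).
# The essay gives only the per-issuer counts; the widths are illustrative.

def tier(address):
    """Classify an issued address under the assumed width-based scheme."""
    if not 0 <= address < 2**32:
        raise ValueError("outside the 32-bit issued address space")
    if address < 2**8:
        return "top-level"
    if address < 2**16:
        return "second-tier"
    return "human-sized"

def tier_sizes():
    """Number of addresses in each tier under the same assumption."""
    return {
        "top-level": 2**8,               # 256 issuing authorities
        "second-tier": 2**16 - 2**8,     # 256 * 255
        "human-sized": 2**32 - 2**16,    # 65,536 * 65,535
    }
```

Under this assumption the three tiers sum to exactly 2^32 addresses, and the human-sized tier works out to 65,535 addresses for each 16-bit address.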
Transactional events
urbit is a functional operating system, which means its state is a deterministic function of its event history. We persist this log before applying any effects, so events are fully transactional. Either they happen or they don't, and we are never left in an inconsistent state when a failure happens mid-event. In the same way that databases guarantee consistency of data storage, urbit guarantees the consistency of its state. urbit is an ACID operating system.
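As a rough illustration of the idea (not urbit's implementation), the sketch below models state as the result of folding a pure transition function over a persisted event log. The names `apply_event` and `LoggedMachine`, and the JSON-lines log format, are assumptions made for the example.

```python
import json
import os

def apply_event(state, event):
    """Pure transition: returns a new state and never mutates the old one."""
    new_state = dict(state)
    new_state[event["key"]] = event["value"]
    return new_state

class LoggedMachine:
    """State is a deterministic function of the events in the log file."""

    def __init__(self, log_path):
        self.log_path = log_path
        self.state = {}
        # Recovery is just replaying every logged event in order.
        if os.path.exists(log_path):
            with open(log_path) as log:
                for line in log:
                    self.state = apply_event(self.state, json.loads(line))

    def handle(self, event):
        # Compute the next state first; if this raises, nothing is
        # logged and the current state is left untouched.
        next_state = apply_event(self.state, event)
        # Persist the event before the new state becomes visible.
        with open(self.log_path, "a") as log:
            log.write(json.dumps(event) + "\n")
            log.flush()
            os.fsync(log.fileno())
        self.state = next_state
```

An event that fails partway leaves both the log and the state unchanged, and a fresh `LoggedMachine` opened on the same log arrives at the same state by replay.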
Thus, urbit is a single-level store. There's no need for a separate step to persist state. There is no need to write out data to a filesystem to get persistence -- you can just leave your data in the structures you use it in. All memory is persistent by default.
Global, reactive, typed, revision-controlled filesystem
urbit's filesystem is typed, so your data structures can be stored directly -- no need to serialize. It's also revision-controlled, so no need to lose file history ever again. Even the revision-control is type-aware, so branching and merging of arbitrary typed data structures works as it should. The filesystem is global, so referring to files on other urbits is easy. It's reactive, so changes on one machine can automatically cause various effects, like recompiling code on other urbits and auto-updating web pages.
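To make "revision-controlled" and "reactive" concrete, here is a minimal in-memory sketch. It is a toy under stated assumptions: the class `RevisionedStore` and its methods are hypothetical, typing and networking are left out, and it says nothing about how urbit's filesystem is actually built.

```python
class RevisionedStore:
    """Toy store that keeps every revision and notifies subscribers."""

    def __init__(self):
        self._history = {}    # path -> list of values, one per revision
        self._watchers = {}   # path -> list of callbacks

    def write(self, path, value):
        """Append a new revision and tell every subscriber about it."""
        revisions = self._history.setdefault(path, [])
        revisions.append(value)
        revision = len(revisions)  # revisions are numbered from 1
        for callback in self._watchers.get(path, []):
            callback(path, revision, value)
        return revision

    def read(self, path, revision=None):
        """Read the latest revision, or a specific earlier one."""
        revisions = self._history[path]
        if revision is None:
            return revisions[-1]
        if not 1 <= revision <= len(revisions):
            raise KeyError(f"{path} has no revision {revision}")
        return revisions[revision - 1]

    def subscribe(self, path, callback):
        """Register callback(path, revision, value) for future writes."""
        self._watchers.setdefault(path, []).append(callback)
```

A subscriber could, for instance, re-render a web page each time a post changes, while earlier revisions stay readable by number.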
Type validation and conversion
urbit has a built-in framework for dealing with typed data intelligently. This framework is used by the networking system to validate typed data from the network. It's used by the filesystem to revision-control typed data. It's used by the web server to convert your data structures to JSON or HTML before sending them to a browser. With the built-in markdown-to-html converter, combined with urbit's filesystem and web server, hosting a blog on urbit can be as simple as keeping a directory of markdown posts.
The only blanks you need to fill in are the specific conversion functions to and from your application-defined types to more standard ones (like JSON).
In a centralized system, it's usually the same software sending and receiving data structures over the network. Thus, the protocol is defined in English if you're lucky, else it's defined by the implementation of the serializer and deserializer. In urbit, since types are baked into the system, they are executable specifications and are sent over the network in a consistent way, so that as long as both sides have the same type definitions, they can communicate.
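The conversion functions an application supplies, and the validation of data arriving off the wire, might look roughly like the following. This is a sketch in Python rather than hoon; the `ChatMessage` type, the type names, and the `CONVERTERS` registry are all hypothetical.

```python
import json
from dataclasses import dataclass, asdict

@dataclass(frozen=True)
class ChatMessage:
    author: str
    text: str

# Hypothetical registry: (source type name, target type name) -> function.
CONVERTERS = {}

def converter(src, dst):
    """Decorator that registers a conversion between two named types."""
    def register(fn):
        CONVERTERS[(src, dst)] = fn
        return fn
    return register

@converter("chat-message", "json")
def message_to_json(msg):
    return json.dumps(asdict(msg), sort_keys=True)

@converter("json", "chat-message")
def json_to_message(raw):
    # Validate untrusted input before it becomes a typed value.
    data = json.loads(raw)
    if not isinstance(data, dict) or set(data) != {"author", "text"}:
        raise ValueError("not a chat-message")
    if not all(isinstance(data[k], str) for k in ("author", "text")):
        raise ValueError("chat-message fields must be strings")
    return ChatMessage(**data)

def convert(value, src, dst):
    """Look up and run the registered conversion."""
    return CONVERTERS[(src, dst)](value)
```

In this sketch the application writes only the two conversion functions; the generic `convert` step stands in for what the system would do on its behalf.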
Updating code and data structures over time
Because urbit's filesystem is reactive, even across the network, when an update is made to an app, it is usually pushed to everyone's machines immediately. A commonly-cited benefit to centralized systems is the ability to update everyone's experience at once and without user intervention. urbit allows a similar experience even in decentralized apps. Of course, since the user has control over their own personal cloud server, they may stop listening for code updates if they wish.
Besides distributing code, there is the problem of updating running apps. In urbit, when the app's code changes we automatically update the running app, saving the state and moving it over to the new version of the app. If the app's data structures have changed, then the programmer must write a state adapter function, which gets automatically run. Thus, urbit provides a sort of automatic hot code swapping with the ability to simultaneously perform a "schema migration".
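A state adapter can be pictured as a function from the old state shape to the new one, applied when new code is loaded. The sketch below is illustrative only: the version tags, the `ADAPTERS` table, and the example schema change are assumptions, not urbit's mechanism.

```python
CURRENT_VERSION = 2

def migrate_v1_to_v2(old):
    # Hypothetical change: v1 stored messages as plain strings,
    # v2 stores (author, text) pairs.
    return {
        "version": 2,
        "messages": [("unknown", text) for text in old["messages"]],
    }

# Programmer-supplied adapters, keyed by the version they upgrade from.
ADAPTERS = {1: migrate_v1_to_v2}

def upgrade(state):
    """Run adapters until the saved state matches the current version."""
    while state["version"] != CURRENT_VERSION:
        adapter = ADAPTERS.get(state["version"])
        if adapter is None:
            raise ValueError(f"no adapter from version {state['version']}")
        state = adapter(state)
    return state
```

State already at the current version passes through unchanged, so the same `upgrade` call can run on every code reload.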
Pulling data from other services
Since urbit is typed all the way through, communicating with other applications and urbits is straightforward -- it's just passing typed messages back and forth. It's a much more pleasant experience than traditional web APIs. Remember, of course, that the identity problem has already been solved, so it's just a matter of asking for what you want, and the whole conversation happens over a formally-defined protocol (a type, really).
Often, pulling data just means reading it from the filesystem, which, as previously mentioned, is global, revision-controlled, and referentially transparent.
Overall, a lot less glue is required for apps to communicate. Of course, the most important thing that eases interapp communication is that urbit is a personal cloud computer, so all the user's data is usually on their own urbit. No need to ask some other company's servers for data and hope they're willing to give it to you. Decentralization makes some things harder, but it brings its own set of technical advantages.
Decentralized ownership and computation
Building truly decentralized apps and services is very difficult in Unix, so it's not often attempted. Most cloud services are centralized, with all the data on a single logical node (usually spread across many physical servers). For most social "networks", the networking is between rows in a database, not computers on an actual network.
urbit eases the creation of decentralized apps. On the modern web, the building blocks described above are supplied by centralized systems. Identity is verified by keeping all the identities in a single, centralized database. Developers avoid validating typed data off the wire by keeping all the data together in the same logical server. They avoid the need for a global namespace by centralization as well. Sane updating happens by keeping all the software on a single set of servers managed by the company. When they really can't get around the need to get data from another service, they bite the bullet and deal with HTTP APIs.
Since urbit solves all these problems in a decentralized manner, there's no longer nearly so much accidental difficulty in creating decentralized apps.
In a world where personal cloud computers host decentralized cloud services, the user finally owns their data. Having all their data in one trusted place allows their apps to mix and match it in hitherto unimagined ways. The privacy and security benefits are immense as well.
But perhaps greatest of all are the new frontiers this opens up. When Unix was built, a few rudimentary tools were built into the system, and with them programmers have built incredible pieces of software and accomplished things the creators couldn't have dreamed of. Unix provided the best tools the 1970s could offer, and it changed the world.
Imagine what will happen when the current generation of programmers gets a new set of tools, built to incorporate the last four decades of research and hard-won experience. We believe it will usher in an era of unprecedented increase in the power, privacy, and security available to the end user.