Restructure the developer documentation (#751)

This commit is contained in:
Ara Adkins 2020-05-15 11:41:26 +01:00 committed by GitHub
parent 7025918553
commit d2f93488b3
No known key found for this signature in database
GPG Key ID: 4AEE18F83AFDEB23
89 changed files with 7981 additions and 6995 deletions

8
.github/ISSUE_TEMPLATE/config.yml vendored Normal file
View File

@ -0,0 +1,8 @@
blank_issues_enabled: false
contact_links:
- name: Issues with the IDE?
url: https://github.com/luna/ide/issues/new/choose
about: Please report problems with the IDE here.
- name: Found a vulnerability?
url: https://github.com/luna/enso/docs/SECURITY.md#reporting-a-vulnerability
about: Please report security vulnerabilities here.

View File

@ -14,6 +14,5 @@
Please include the following checklist in your PR:
- [ ] The documentation has been updated if necessary.
- [ ] All code conforms to the [Scala](https://github.com/luna/enso/blob/master/doc/style-guides/scala.md), [Java](https://github.com/luna/enso/blob/master/doc/style-guides/java.md), [Rust](https://github.com/luna/enso/blob/master/doc/style-guides/rust.md) or [Haskell](https://github.com/luna/enso/blob/master/doc/style-guides/haskell.md) style guides as appropriate.
- [ ] All code conforms to the [Scala](https://github.com/luna/enso/blob/master/docs/style-guide/scala.md) and [Java](https://github.com/luna/enso/blob/master/docs/style-guide/java.md) style guides.
- [ ] All code has been tested where possible.

View File

@ -54,11 +54,9 @@ Enso consists of several sub projects, including the
You can also check out the [Enso Website](https://enso.org) for more
information.
This repository contains [Enso Engine](engine/), which consists of the compiler,
type-checker, runtime and language server. These components implement Enso the
language in its entirety, and are usable in isolation.
<br/>
This repository contains [Enso Engine](./engine/), which consists of the
compiler, type-checker, runtime and language server. These components implement
Enso the language in its entirety, and are usable in isolation.
### Getting Started
Enso is distributed as [pre-built packages](https://github.com/luna/enso/releases)
@ -66,49 +64,41 @@ for MacOS, Linux and Windows, as well as universal `.jar` packages that can run
anywhere that [GraalVM](https://graalvm.org) can. See the
[documentation](http://enso.org) for more.
<br/>
If you want to read more about the internals of Enso, please take a look at the
developer documentation rendered [here](https://dev.enso.org), or in the
repository [here](./docs).
### Building
The project builds on any platform where [GraalVM](https://graalvm.org) can run.
You will need the source code, and [`sbt`](https://www.scala-sbt.org/). For more
information, please read the detailed instructions in
[CONTRIBUTING.md](CONTRIBUTING.md).
<br/>
[CONTRIBUTING.md](./docs/CONTRIBUTING.md).
### Enso's Design
If you would like to gain a better understanding of the principles on which Enso
is based, or just delve into the why's and what's of Enso's design, please take
a look in the [`doc/` folder](./doc/). It is split up into subfolders for each
component of Enso, and then further subdivided into:
- `specification`: Specification of elements of the language.
- `design`: Documents detailing the design process and how decisions were made.
- `implementation`: Documentation detailing complexities, or design decisions
made at the implementation level.
a look in the [`docs/` folder](./docs/). It is split up into subfolders for each
component of Enso. You can view this same documentation in a rendered form at
[the developer docs website](https://dev.enso.org).
This folder also contains a document on Enso's
[design philosophy](./doc/enso-philosophy.md), that details the thought process
[design philosophy](./docs/enso-philosophy.md), that details the thought process
that we use when contemplating changes or additions to the language.
This documentation will evolve as Enso does, both to help newcomers to the
project understand the reasoning behind the code, but also to act as a record of
the decisions that have been made through Enso's evolution.
<br/>
### License
This repository is licensed under the
[Apache 2.0](https://opensource.org/licenses/apache-2.0), as specified in the
[LICENSE](https://github.com/luna/luna/blob/master/LICENSE) file.
[LICENSE](https://github.com/luna/enso/blob/master/LICENSE) file.
This license set was choosen to both provide you with a complete freedom to use
Enso, create libraries, and release them under any license of your choice, while
also allowing us to release commercial products on top of the platform,
including Enso Cloud and Enso Enterprise server managers.
<br/>
### Contributing to Enso
Enso is a community-driven open source project which is and will always be open
and free to use. We are committed to a fully transparent development process and
@ -118,11 +108,14 @@ implement new features, improve the documentation or spread the word!
If you'd like to help us make this vision a reality, please feel free to join
our [chat](http://chat.luna-lang.org/), and take a look at our
[development and contribution guidelines](CONTRIBUTING.md). The latter describes
all the ways in which you can help out with the project, as well as provides
detailed instructions for building and hacking on Enso.
[development and contribution guidelines](./docs/CONTRIBUTING.md). The latter
describes all the ways in which you can help out with the project, as well as
provides detailed instructions for building and hacking on Enso.
If you believe that you have found a security vulnerability in Enso, or that
you have a bug report that poses a security risk to Enso's users, please take
a look at our [security guidelines](./docs/SECURITY.md) for a course of action.
<a href="https://github.com/luna/enso/graphs/contributors">
<img src="https://opencollective.com/enso-language/contributors.svg?width=890&button=false">
</a>

View File

@ -1,24 +0,0 @@
# Enso Documentation
This folder contains the documentation for the implementation of the Enso
programming language. The documentation is generally broken up into three
categories: design, specification, and implementation.
- **Design:** Documents in this category deal with the reasoning behind the
language specification.
- **Specification:** The language specification.
- **Implementation:** Details of how the specification is _implemented_ in the
compiler and the runtime.
It is broken up into categories as follows:
- **[Language Server](language-server/):** Documentation pertaining to the Enso
language server.
- **[Runtime](runtime/):** Documentation pertaining to the Enso runtime.
- **[Semantics](semantics/):** Documentation pertaining to the Enso language
semantics, insofar as it is seperable from the other categories.
- **[Style Guides](style-guides/):** Style guides for the various languages with
which we work.
- **[Syntax](syntax/):** Documentation pertaining to the syntax of Enso.
- **[Types](types/):** Documentation pertaining to Enso's type system.
- **[Enso Philosophy](enso-philosophy.md):** Information on the design
philosophy that underlies the development of Enso.

View File

@ -1,21 +0,0 @@
# Library Packaging Design
This document deals with the design for the packaging of libraries in Enso. We
want to strike a balance between providing an excellent user experience and
allowing power users to get the most out of the platform.
<!-- MarkdownTOC levels="2,3" autolink="true" -->
- [Build Reproducibility](#build-reproducibility)
<!-- /MarkdownTOC -->
## Build Reproducibility
It is crucial for any good development environment to provide reproducible
builds, such that it is impossible for it to go wrong by mismatching library
versions.
> The actionables for this section are:
> - Decide on the strategies of ensuring consistent library resolution. This
> may include hashing the downloaded versions of libraries and publishing
> stack-style resolvers for sets of libraries that are proven to work well
> together.

View File

@ -1,17 +0,0 @@
# Enso Distribution Implementation Notes
This document deals with the implementation of various distribution-related
functionalities for Enso.
<!-- MarkdownTOC levels="2,3" autolink="true" -->
- [Deeply Nested Directory Trees on Windows](#deeply-nested-directory-trees-on-windows)
<!-- /MarkdownTOC -->
## Deeply Nested Directory Trees on Windows
Given that we use a fairly deeply nested structure for our directories,
we must take extra care to support it properly on Windows, due to the 256
characters in a path limitation.
There is no special action required as long as we're using the JVM for handling
paths, as it automatically inserts the `\\?\ ` prefix for Windows paths.

File diff suppressed because it is too large Load Diff

View File

@ -1,141 +0,0 @@
# Caching and Evaluation
It is not uncommon for users in data-analysis jobs to work with data on the
order of _gigabytes_ or even _terabytes_. As fast as computers have become, and
as efficient as programming languages can be, you still don't want to compute on
such large amounts of data unless you absolutely have to.
This wouldn't usually be an issue, with such data-analysis tasks being able to
run in a 'batch mode', where the user starts their job in a fire-and-forget
fashion. Enso, however, is a highly _interactive_ environment for working with
data, where waiting _seconds_, let alone _hours_, would severely hamper the user
experience.
To that end, Enso's runtime includes an in-built value caching mechanism. This
mechanism is used to remember values of computations such that they can be
re-used as down-stream code is changed. This means that only the _necessary_
parts of the computation are re-evaluated when code is changed, providing users
with a far more interactive and responsive experience.
<!-- MarkdownTOC levels="2,3" autolink="true" -->
- [Cache Candidates](#cache-candidates)
- [Partial-Evaluation and Side-Effects](#partial-evaluation-and-side-effects)
- [Side Effects in the Initial Version](#side-effects-in-the-initial-version)
- [In The Future](#in-the-future)
- [Cache Eviction Strategies](#cache-eviction-strategies)
- [Initial Eviction Strategies](#initial-eviction-strategies)
- [Future Eviction Strategies](#future-eviction-strategies)
<!-- /MarkdownTOC -->
## Cache Candidates
The key use of the Enso value cache is to support the interactive editing of
user code. This means that it caches all bindings within the scope of a given
function, including the function arguments. This means that, as users edit their
code, we can ensure that the minimal amount of their program is recomputed.
Consider the following example:
```ruby
foo a b =
c = a.frob b
d = c.wibble b
a.quux d
```
The cache is active for the _currently visible scope_ in Enso Studio, so when a
user enters the function `foo`, the cache stores the intermediate results in
this function (in this case `c` and `d`), as well as the inputs to the function
(in this case `a`, and `b`).
All intermediate results and inputs are considered as candidates, though as the
cache design evolves, the _selected_ candidates may be refined. Once the values
for each candidate are stored, changes to the code will re-use these cached
values as much as possible.
## Partial-Evaluation and Side-Effects
The more theoretically-minded people among those reading this document may
instantly realise that there is a _problem_ with this approach. In the presence
of caching, it becomes _entirely_ unpredictable as to when side effects are
executed. This is problematic in that side-effecting computations are rarely
idempotent, and problems might be caused by executing them over and over again.
Furthermore, the nature of the interpreter's support for entering functions
inherently requires that it recompute portions of that function in a different
context, thereby potentially re-evaluating side-effecting computations as well.
In general, it is clear that many kinds of side effect have _problems_ in the
presence of caching and partial-evaluation.
### Side Effects in the Initial Version
Many of the mechanisms required to deal with this kind of issue properly are
complex and require deep type-level support in the compiler. To that end, the
initial version of the interpreter is going to pretend that the problem doesn't
really exist.
- All intermediate values will be cached.
- Cached values will be recomputed as necessary as described in the section on
[initial eviction strategies](#initial-eviction-strategies).
This can and _will_ recompute side-effecting computations indiscriminately, but
we cannot initially do much better.
#### A Stopgap
While the compiler won't have the machinery in place to properly track
information about side-effects, we can implement a stop-gap solution that at
least allows the GUI to allow users to make the decision about whether or not to
recompute a side-effecting value. This is very similar to the initial approach
used for functions with arguments marked as `Suspended`, and works as follows:
- We provide explicit signatures (containing `IO`) for functions that perform
side-effects.
- Whenever the runtime wants to recompute a value that performs side-effects, it
can use this information to ask for user input.
- We can also display a box on these types `always_reevaluate` that lets users
opt in to automatic re-evaluation of these values.
### In The Future
As the compiler evolves, however, we can do better than this. In particular, we
can employ type-system information to determine which functions are
side-effecting (in absence of annotations), and to class some kinds of said
functions as safe for either caching, re-evaluation, or both. What follows is a
brief sketch of how this might work:
- Rather than having a single type capturing side effects (like `IO` in Haskell)
we divide the type up into fine-grained descriptions of side-effects that let
us better describe the particular behaviours of given functions (e.g.
`IO.Read`, `IO.Write`), all of which are more-specific versions of the base
`IO` type.
- We provide a set of interfaces that determine whether a given kind of side
effect can be safely cached or re-evaluated (e.g. `No_Cache` or
`No_Reevaluate`).
- We can use this information to ask the user about recomputation in far less
situations.
> The actionables for this section are:
>
> - Evolve the strategy for handling side effects as the compiler provides more
> capabilities that will be useful in doing so.
## Cache Eviction Strategies
The cache eviction strategy refers to the method by which we determine which
entries in the cache are invalidated (if any) after a given change to the code.
### Initial Eviction Strategies
In the initial version of the caching mechanism, the eviction strategies are
intended to be fairly simplistic and conservative to ensure correctness.
- The compiler performs data-dependency analysis between expressions.
- If an expression is changed, all cached values for expressions that depend on
it are evicted from the cache.
- These evicted values must be computed.
### Future Eviction Strategies
In the future, however, the increasing sophistication of the front-end compiler
for Enso will allow us to do better than this by accounting for more granular
information in the eviction decisions.
> The actionables for this section are:
>
> - Evolve the cache eviction strategy by employing more granular information
> as the compiler evolves to provide it.

View File

@ -1,47 +0,0 @@
# GraalVM
The notes contained below are out of date, but still may e relevant.
The following are notes based on questions that we have about using GraalVM to
provide a backend for Enso.
- C-API is a potentially useful tool for combining the Java/Truffle side and the
Haskell side.
- The polyglot API is the bindings to Graal itself, with access from both C and
Java. This is _not_ the same thing as the truffle framework for building new
programming languages.
- You can define your own C-API (as C-entry points) for your hosted language,
which then allows for you to call into the hosted language.
- There are no current plans to allow for compilation of the hosted language to
native code via SubstrateVM. You can store part of the JVM heap in the native
image, allowing for standalone executables. Serialisation of heap into the
image.
- JavaScript and LLVM are possible backends for SubstrateVM, so we can compile
Enso code indirectly to JS. The LLVM backend is on the roadmap, while the
former is possible.
- No explicit control of memory layout for the hosted language, instead with a
high level (JS-style) API for working with objects. There is no
non-encapsulated mechanism for accessing objects due to optimisation reasons.
If we wanted explicit memory layout control. Hybrid systems are supported.
- GraalPython is at an early stage of development, with a critical focus on the
scientific libraries. It is very much a first-class language on the Graal
platform.
- The Graal garbage collector is evolving to be a Java port of G1. Fairly high
priority to improve the GC, and its support for immutability.
- Graal doesn't currently have support for saving the heap except via something
like SubstrateVM.
- Caching: Truffle is designed for live specialisation, and has support for
inline caching. No support for retention or eviction at this stage.
- Truffle provides some support for reusable front-end optimisations in the form
of AST specialisation. Strong support for AST re-writing as an optimisation
technique. You can walk the tree at creation time, or at execution time, in
order to implement more global optimisations. Separation between allocation
and re-writing.
- Commercial support for bugfixes, but not necessarily for feature requests yet.
- Introspection/Debugging framework lets you introspect all portions of the code
during execution. We can dynamically add those introspection hooks in order to
do this optimally.
- There is no way to force compilation, but could modify truffle to add some
support for an API that allows this control. There is currently only one tier
of compilation, but it is _possible_ to write additional JIT phases.
- There is the polyglot REPL, but it is definitely possible to build a proper
polyglot repl based around Enso. Take a look at `polyglot - shell`.

View File

@ -1,19 +0,0 @@
# Enso Polyglot at Runtime
This document deals with runtime implementation details of the polyglot
functionality of Enso.
<!-- MarkdownTOC levels="2,3" autolink="true" -->
- [Finding Polyglot Bindings](#finding-polyglot-bindings)
<!-- /MarkdownTOC -->
## Finding Polyglot Bindings
Polyglot objects for various languages are found in the `polyglot` subdirectory
of an Enso project. This folder is subdivided into directories based on the
polyglot language. The name of each subdirectory must match the language
identifier used in the source code.
Inside each directory is an implementation-defined structure, with the polyglot
implementation for that particular language needing to specify it. Please see
the language-specific documentation for details.

View File

@ -1,210 +0,0 @@
# Caching
Given that Enso is a highly interactive language and platform, we want to take
every measure to ensure that we provide a highly responsive experience to our
users. To that end, one of the key tenets of the new runtime's featureset for
aiding in this is the inclusion of a _caching_ mechanism.
Caching, in this case, refers to the runtime's ability to 'remember' the values
computed in the currently observed scopes. In combination with the data
dependency analysis performed by the compiler, this allows the runtime to
recompute the _minimal_ set of expressions when the user makes a change, rather
than having to recompute the entire program.
<!-- MarkdownTOC levels="2,3" autolink="true" -->
- [Dataflow Analysis](#dataflow-analysis)
- [Identifying Expressions](#identifying-expressions)
- [Specifying Dataflow](#specifying-dataflow)
- [Cache Eviction Strategy](#cache-eviction-strategy)
<!-- /MarkdownTOC -->
## Dataflow Analysis
Dataflow analysis is the process by which the compiler discovers the
relationships between program expressions. The output of the process is a data
dependency graph that can be queried for an expression, and returns the set of
all expressions that depended on that expression.
Internally we represent this as a directed graph:
- An edge from `a` to `b` indicates that the expression `a` is depended on by
the expression `b`.
- These dependencies are _direct_ dependencies on `a`.
- We reconstruct transitive dependencies from this graph.
An expression `a` can be any Enso expression, including definitions of dynamic
symbols. Given that dynamic symbols need not be in scope, care has to be taken
with registering them properly.
Each expression in the compiler IR is annotated with both the set of expressions
that depend on it, and the set of expressions that it depends on.
### Identifying Expressions
Expressions are identified, for the purposes of dataflow analysis, by unique
identifiers on every IR node. The dataflow analysis process creates a dependency
graph between these identifiers.
However, at runtime, the IDE uses a different and separate concept of
identifiers. Translating between these external identifiers and the internal
identifiers is left to the runtime and is not the responsibility of the dataflow
analysis pass.
### Specifying Dataflow
Dataflow analysis takes place on the core set of language constructs, defined
as those that extend `IRKind.Primitive`. Their dataflow is specified as follows,
with arrows representing edges in the graph.
#### Atom
An atom is dependent on the definitions of its arguments, particularly with
regard to any defaults.
```
atom <- arguments
```
#### Method
A method is dependent on the definition of its body. Methods at the point that
dataflow analysis runs are 'bare' methods, meaning that they are defined as
functions.
```
method <- body
```
#### Block
The value of a block is dependent purely on the value of its return expression.
The return expression may depend on other values.
```
block <- returnValue
```
#### Binding
The value of a binding is dependent both on the name of the binding and the
expression being assigned in the binding.
```
binding <- name
binding <- expression
```
#### Lambda
The value of a lambda is dependent on the return value from its body, as well as
the definitions of any defaults for its arguments.
```
lambda <- body
lambda <- argumentDefaults
```
#### Definition Argument
The value of a function definition argument is dependent purely on the value of
its default, if that default is present.
```
defArgument <- defaultExpression
```
#### Prefix Application
The value of a prefix application is dependent on the values of both the
function expression being called, and the arguments.
```
prefix <- function
prefix <- arguments
```
#### Call Argument
The value of a call argument is dependent both on the value that it's wrapping,
as well as the name it has, if it exists.
```
callArgument <- argumentValue
callArgument <- argumentName
```
#### Forced Term
A forced term is purely dependent on the value of the term being forced (the
`target`).
```
force <- target
```
#### Typeset Members
A typeset member is dependent on the definition of its label, as well as the
possibly present definitions of its type and value.
```
typesetMember <- label
typesetMember <- memberType
typesetMember <- memberValue
```
#### Typing Operators
All typing operators in Enso (`IR.Type`) are dependent on their constituent
parts:
```
typingExpr <- expressionChildren
```
#### Name
An occurrence of a name is dependent on the definition site of that name. This
means that it is broken down into two options:
1. **Static Dependency:** The definition site for a given usage can be
statically resolved.
```
name <- staticUseSite
```
2. **Dynamic Dependency:** The definition site for a given usage can only be
determined to be a symbol resolved dynamically.
```
name <- dynamicSymbol
```
Under these circumstances, if any definition for `dynamicSymbol` changes,
then _all_ usages of that symbol must be invalidated, whether or not they
used the changed definition in particular.
#### Case Expressions
The value of a case expression depends on the value of its scrutinee, as well
as the definitions of its branches.
```
case <- scrutinee
case <- branches
case <- fallback
```
#### Case Branches
The value of a case branch depends on both the pattern expression and the result
expression.
```
caseBranch <- casePattern
caseBranch <- expression
```
#### Comments
The value of a comment is purely dependent on the value of the commented entity.
```
comment <- commented
```
## Cache Eviction Strategy
The cache eviction strategy refers to the process by which, for a given change,
we decide which elements should be evicted from the cache. In the current form,
the following rules are applied when an expression identified by some key `k` is
changed:
1. All expressions that depend on the result of `k` are evicted from the cache.
2. If `k` is a dynamic symbol, all expressions that depend on _any instance_ of
the dynamic symbol are evicted from the cache.
Expressions that have been evicted from the cache subsequently have to be
recomputed by the runtime.

View File

@ -1,41 +0,0 @@
# Scoping
In addition to the scoping rules defined in the language semantics document,
the implementation makes some additions or alterations not visible in the
surface language that nevertheless change the scoping rules as implemented in
the runtime.
<!-- MarkdownTOC levels="2,3" autolink="true" -->
- [Function Call Arguments](#function-call-arguments)
- [Collapsing Scopes](#collapsing-scopes)
<!-- /MarkdownTOC -->
## Function Call Arguments
In order to support suspended function arguments in the interpreter in a
performant way, we implicitly wrap _all_ function arguments in a suspension. In
conjunction with making the function itself responsible for when its arguments
are evaluated, this lets us have incredibly performant suspended computations in
Enso.
However, it _does_ require creating a small hack in the Alias Analysis process:
- In order for an expression to be a suspension, it must occur in its own scope
(the suspended scope).
- Alias analysis must account for this, otherwise the code generator will get
frame accesses incorrect.
To this end, we account for this implementation detail in Alias analysis.
## Collapsing Scopes
Another quirk of the internal alias analysis process is down to the fact that
the Enso IR represents Methods, functions, and blocks as separate constructs.
This means that if you had a method containing a function containing a block, a
naive implementation of alias analysis would allocate three scopes here.
This is incorrect, according to the semantic specification of the language, so
the alias analysis process needs to handle the collapsing of these scopes as it
allocates them. The rules are as follows:
- If you have a method whose body is a function, they share a scope.
- If you have a function whose body is a block, they share a scope.

View File

@ -1,67 +0,0 @@
# Polyglot Java Semantics
This document deals with the Enso-level semantics of polyglot interop with
Java. It talks about the issues with (and solutions for) matching the runtime
semantics of two quite different languages.
<!-- MarkdownTOC levels="2,3" autolink="true" -->
- [Impedance Mismatch](#impedance-mismatch)
- [The Polyglot FFI](#the-polyglot-ffi)
- [Importing Polyglot Bindings](#importing-polyglot-bindings)
- [Using Polyglot Bindings](#using-polyglot-bindings)
<!-- /MarkdownTOC -->
## Impedance Mismatch
Polyglot interoperation in Enso has a significant impedance mismatch. In
essence, this means that there is a mismatch between Enso's language semantics
and the semantics of the foreign languages that are being worked with.
While some of this mismatch can be worked around by manually wrapping the
foreign constructs in Enso, there are still concepts that can't easily be
represented by Enso.
> The actionables for this section are:
>
> - Expand on the impedance mismatch and how it leads to the defined semantics.
## The Polyglot FFI
The low-level polyglot FFI mechanism refers to a way to use polyglot objects
directly in Enso code. This can be used to underlie a library implementaion in
Enso, or to interoperate with code running in other languages.
### Importing Polyglot Bindings
When importing a polyglot binding into scope in an Enso file, this introduces a
_polyglot object_ into scope. This object will have appropriate fields and/or
methods defined on it, as described by the foreign language implementation.
> The actionables for this section are:
>
> - Expand greatly on the detail of this as the semantics of the imports become
> clearer.
### Using Polyglot Bindings
With a polyglot object in scope, the user is free to call methods on it
directly. These polyglot objects are inherently dynamically typed, meaning that
any operation may _fail_ at runtime.
Enso implements a generic variadic syntax for calling polyglot functions using
vectors of arguments. In essence, this is necessary due to the significant
impedance mismatch between Enso's runtime semantics (let alone the type system)
and the runtime semantics of many of the polyglot languages.
We went the way of the variadic call for multiple reasons:
- It allows us to match up with a wide range of language semantics (such as
subtyping and overloading).
- It is flexible and easy to expand in the future.
- We can easily build a more Enso-feeling interface on top of it.
- It can still be typed due to our plans for dependent vector types.
By way of illustrative example, Java supports method overloading and subtyping,
two things which have no real equivalent in the Enso type system.
> The actionables for this section are:
>
> - Expand greatly on the runtime semantics of working with polyglot bindings.
> - Determine how to make the inherent 'failability' of polyglot objects safer.

View File

@ -1,57 +0,0 @@
# Enso: The Syntax
When working with a programming language, the syntax is the first thing that a
user encounters. This makes it _utterly integral_ to how users experience the
language, and, in the case of Enso, the tool as a whole.
Enso is a truly novel programming language in that it doesn't have _one_ syntax,
but instead has two. These syntaxes are dual: visual and textual. Both are
first-class, and are truly equivalent ways to represent and manipulate the
program. To that end, the design of the language's syntax requires careful
consideration, and this document attempts to explain both the _what_, of Enso's
syntax, but also the _why_.
This document exists to expand on some of the design philosophy behind the
language's syntactic [specification](../specification/syntax.md).
<!-- MarkdownTOC levels="2,3" autolink="true" -->
- [Naming](#naming)
- [Identifiers](#identifiers)
- [Operators](#operators)
- [This vs. Self](#this-vs-self)
<!-- /MarkdownTOC -->
## Naming
This section discusses decisions made around the naming of various language
constructs.
### Identifiers
While, through much of the language's history, we have used `camelCase` (with
its disambiguating cousin `CamelCase`), this has been decided against for one
primary reason:
- Names using snake case are far easier to read, and optimising code for
readability is _overwhelmingly_ important in a context where novice users are
involved.
### Operators
While some languages allow use of unicode characters for naming operators, we
will not. The reasoning behind this is simple, and is best explained by
paraphrasing the [Idris wiki](https://github.com/idris-lang/Idris-dev/wiki/Unofficial-FAQ#will-there-be-support-for-unicode-characters-for-operators).
- Unicode operators are hard to type, making it far more difficult to use other
peoples' code. Even if some editors provide input methods for such symbols,
they do not provide a good UX.
- Not every piece of software has good support for Unicode. Even though this is
changing, it is not there yet, thus raising barriers to entry.
- Many Unicode characters are hard to distinguish.
In essence, while the use of Unicode operators can make code look pretty, a font
with well-defined ligatures can do the same.
### This vs. Self
Though it varies greatly between programming languages, we have chosen `this` to
be the name of the 'current type' rather than `self`. This is a purely aesthetic
decision, and the final clincher was the ability to write `this` and `that`, as
opposed to `self` and `that`.

View File

@ -1,99 +0,0 @@
# Polyglot Syntax
Enso is a language with first-class polyglot support. In essence, this means
that Enso has first-class support for high-level interoperation with a variety
of programming languages.
It supports this through two main mechanisms:
1. **Polyglot FFI:** The low-level polyglot support provides a fairly low-level
syntax sugar system for working with values from foreign languages.
2. **Embedded Syntax:** This system allows users to write code from other
languages directly in their `.enso` files, and to seamlessly share values
between Enso and that foreign code.
<!-- MarkdownTOC levels="2,3" autolink="true" -->
- [Polyglot FFI](#polyglot-ffi)
- [Importing Polyglot Bindings](#importing-polyglot-bindings)
- [Using Polyglot Bindings](#using-polyglot-bindings)
- [Embedded Syntax](#embedded-syntax)
<!-- /MarkdownTOC -->
## Polyglot FFI
The polyglot FFI is a low-level multi-language foreign function interface. It
allows users to import bindings from other languages and call them through a
generic mechanism.
### Importing Polyglot Bindings
Polyglot bindings can be imported using a polyglot import directive. This is
constructed as follows:
- The `polyglot` keyword
- A language identifier (e.g. `java`).
- The keyword `import`.
- Optionally (where the language supports it), an identifier for the type of
language entity being imported (e.g. `struct` for `c`).
- A path that uniquely identifies the polyglot object to import.
- Optionally, the keyword `as`, followed by a new name.
For example:
```ruby
polyglot java import org.example.MyClass as MyClassJava
polyglot c import struct NetworkPacket as NetworkPacketC
```
### Using Polyglot Bindings
A polyglot binding is a polyglot object that has methods and/or fields defined
on it. Due to an impedance mismatch between languages, Enso implements a
variadic syntax for calling these polyglot bindings using vectors.
In essence, we have a primitive function as follows:
```ruby
Polyglot.method : Polyglot.Object -> [Any] -> Any
```
It works as follows:
- It is a method called `method` defined on the `Polyglot` type. The name
`method` is, however, a stand-in for the name of the method in question.
- It takes an object instance of the polyglot object.
- It takes a vector of arguments (and is hence variadic).
- And it returns some value.
By way of example, the following is a valid usage of a polyglot binding:
```ruby
polyglot java import com.example.MyClass as MyClassJava
main =
x = MyClassJava.foo [1, 2, 3] # a static method
inst = MyClassJava.new [a, b, c] # a constructor
bar = inst.metod [x, y] # a static method
```
## Embedded Syntax
This high-level polyglot functionality takes the form of using the syntax of
other languages _directly_ embedded in `.enso` files. A polyglot block is
introduced as follows:
- The `polyglot` keyword starts a block.
- This must be followed by a language identifier (e.g. `java`).
- After the language identifier, the remaining syntax behaves like it is an
Enso function definition until the `=`.
- After the `=`, the user may write their foreign code.
```ruby
polyglot python concat a b =
def concat(a, b):
str(a) + str(b)
```
In the above example, this defines a function `concat` that takes two arguments
`a` and `b`, implemented in Python.
> The actionables for this section are:
>
> - Greatly flesh out the syntax for the high-level polyglot functionality.

File diff suppressed because it is too large Load Diff

File diff suppressed because it is too large Load Diff

View File

@ -1,3 +1,11 @@
---
layout: developer-doc
title: The Enso Code of Conduct
category: summary
tags: [summary, code-of-conduct]
order: 3
---
# The Enso Code of Conduct
## Conduct

View File

@ -1,3 +1,11 @@
---
layout: developer-doc
title: Contributing to Enso
category: summary
tags: [summary, contributing]
order: 2
---
# Contributing to Enso
Thank you for your interest in contributing to Enso! We believe that only
through community involvement can Enso be the best it can be! There are a whole
@ -53,7 +61,7 @@ about, so report as many bugs as you can! If you're not sure whether something
is a bug, file it anyway!
**If you are concerned that your bug publicly presents a security risk to the
users of Enso, please contact [security@enso.org](mailto:security@enso.org).**
users of Enso, please look at our [security guidelines](./security.md).**
Even though GitHub search can be a bit hard to use sometimes, we'd appreciate if
you could

41
docs/README.md Normal file
View File

@ -0,0 +1,41 @@
---
layout: docs-index
title: Enso Developer Documentation
category: summary
tags: [summary, readme]
order: 0
---
# Enso Developer Documentation
This folder contains the documentation for the implementation of the Enso
programming language. The documentation is broken up by subject, within the
below-listed categories, and each subject combines information about the
design, specification, and implementation of the feature to which it pertains.
We provide a number of useful resources for getting a quick understanding of
the Enso project:
- [**The Enso Philosophy:**](./enso-philosophy.md) Information on the design
philosophy behind Enso, and why we build things in the way we do.
- [**Contributing Guidelines:**](./CONTRIBUTING.md) Information for people
wanting to contribute to Enso (in many different ways).
- [**Community Code of Conduct:**](./CODE_OF_CONDUCT.md) The code of conduct for
members of our community, developers and users alike.
- [**Security Guidelines:**](./SECURITY.md) Security guidelines for the Enso
project, including supported versions and our vulnerability reporting process.
It is broken up into categories as follows:
- [**Distribution:**](./distirbution) Information on how we distribute Enso to
our users, and how Enso packages themselves work.
- [**Language Server:**](./language-server) Information on the Enso language
server, its protocol, and how it integrates with the runtime.
- [**Polyglot:**](./polyglot) Information on Enso's polyglot functionality, and
how it is integrated into the surface Enso language.
- [**Runtime:**](./runtime) Specification and documentation of the way that the
Enso runtime is designed and implemented.
- [**Semantics:**](./semantics) A specification of Enso's semantics.
- [**Style Guides:**](./style-guide) Style guides for the code written as part
of the Enso project.
- [**Syntax:**](./syntax) A specification of Enso's syntax.
- [**Types:**](./types) A specification of Enso's type system and type theory.

54
docs/SECURITY.md Normal file
View File

@ -0,0 +1,54 @@
---
layout: developer-doc
title: Security Policy
category: summary
tags: [summary, security, vulnerability, report]
order: 4
---
# Security Policy
This document outlines the security policy for Enso and its libraries.
> **If you believe that you have found a vulnerability in Enso or one of its
> libraries, please see the section on
> [reporting a vulnerability](#reporting-a-vulnerability) below.**
<!-- MarkdownTOC levels="2" autolink="true" -->
- [Supported Versions](#supported-versions)
- [Reporting a Vulnerability](#reporting-a-vulnerability)
<!-- /MarkdownTOC -->
## Supported Versions
Security updates for Enso are provided for the versions shown below with a
:white_check_mark: next to them. No other versions have security updates
provided.
| Version | Supported |
|---------------|--------------------|
| `master@HEAD` | :white_check_mark: |
| `wip/*` | :x: |
## Reporting a Vulnerability
If you believe that you've found a security vulnerability in the Enso codebase
or one of the libraries maintained in this repository, please contact
[security@enso.org](mailto:security@enso.org) and provide details of the bug.
You can expect an update on a reported vulnerability within one business day,
and the timeline works as follows:
1. We analyse your report to determine the risk posed by the vulnerability, and
our further steps forward. This may involve asking for more information.
2. We will email the submitter with our verdict as to whether it is, or isn't a
vulnerability, as well as the severity if it is.
3. We plan and outline any steps necessary to fixing the bug, including the
timeline for fixing the vulnerability within 90 days.
4. We will communicate the planned fix with the person who submitted the
vulnerability report.
5. We will fix the bug and communicate with the submitter when the fix has
landed on `master`, and when it has been backported to the above supported
versions.
6. The submitted may then disclose the bug publicly.
All communication will take place via email with a member of our team.

View File

@ -0,0 +1,16 @@
---
layout: section-summary
title: Enso Distribution
category: distribution
tags: [distribution, readme]
order: 0
---
# Enso Distribution
Documents in this section deal with the process of packaging both Enso and its
dependencies, and Enso projects for use by our users.
- [**Distribution:**](./distribution.md) Information on how we distribute Enso
to our users, and for use by the [IDE](https://github.com/luna/ide).
- [**Packaging:**](./packaging.md) Information on the structure of an Enso
project/package.

View File

@ -1,20 +1,25 @@
---
layout: developer-doc
title: The Enso Distribution
category: distribution
tags: [distribution, layout]
order: 1
---
# The Enso Distribution
This document describes the behaviour of the first prototype of an Enso
distribution, together with some future expansion plans.
This document provides a specification of how the Enso distribution should
be structured and how it should behave.
<!-- MarkdownTOC levels="2,3" autolink="true" -->
- [Installation Home Layout](#installation-home-layout)
- [Enso Home Layout](#enso-home-layout)
- [Universal Launcher Script](#universal-launcher-script)
- [Layout of a Distribution of a Single Version of Enso](#layout-of-a-distribution-of-a-single-version-of-enso)
- [Layout of an Enso Version Package](#layout-of-an-enso-version-package)
- [Package Sets](#package-sets)
<!-- /MarkdownTOC -->
## Installation Home Layout
*Please keep in mind that this system is just a proposal and none of it is
implemented yet.*
## Enso Home Layout
All of Enso's binaries, packages, etc. are installed into a directory in
the user's home directory. For macOS and linux distributions that's `~/.enso`,
by default. The distribution is fully portable, so it never makes any
@ -50,7 +55,7 @@ or use the default version if none specified.
> - Determine the proper technology to implement the script (i.e. Java/Bash).
> - Determine the features needed in the launcher script.
## Layout of a Distribution of a Single Version of Enso
## Layout of an Enso Version Package
This section describes the structure of a single version distribution. This
system is intended to be implemented first and used e.g. for the Enso nightly
builds / releases.
@ -90,6 +95,13 @@ enso-1.0.0
└── my-package-set.yaml
```
> **Implementation Note:**
> This structure makes use of deep nesting, which may give some with knowledge
> of Windows' path-name limits pause (windows paths are historically limited to
> 256 characters). However, there is no special action required to handle this
> limit as long as we are building on top of the JVM. The JVM automatically
> inserts the `\\?\` prefix required to bypass the windows path length limit.
### Package Sets
A package set is a manifest containing library versions that are made available
for importing when no `package.yaml` is available (e.g. when using the CLI to

View File

@ -1,7 +1,16 @@
---
layout: developer-doc
title: Enso Libraries Packaging
category: distribution
tags: [packaging, distribution, layout]
order: 2
---
# Enso Libraries Packaging
Given the open-community model of Enso as a programming language, it is crucial
to provide an extensible package management system. This document describes the
behaviour of the first prototype of such a system.
Given the nature of Enso as an open-source programming language and platform,
it is crucial that we provide users with an extensible package management
system. This document describes the current state of our packaging efforts, as
well as future directions and enhancements to it.
<!-- MarkdownTOC levels="2,3" autolink="true" -->
@ -9,6 +18,7 @@ behaviour of the first prototype of such a system.
- [The `src` Directory](#the-src-directory)
- [The `polyglot` Directory](#the-polyglot-directory)
- [The `package.yaml` File](#the-packageyaml-file)
- [Build Reproducibility](#build-reproducibility)
<!-- /MarkdownTOC -->
@ -37,7 +47,7 @@ are imported in all of Enso code.
Note that all files and directories in this subtree must be named according
to the Enso style for referent names (i.e. `Upper_Snake_Case`, see
[Syntax Specification](../../syntax/specification/syntax.md#naming)).
[the syntax specification](../syntax/naming.md#naming-constructs)).
A file located at the path `My_Package/src/Sub_Module/Helper.enso` will be
imported like so:
@ -65,7 +75,7 @@ The exact transformation is as follows:
The `polyglot` directory contains per-language subdirectories containing files
used by the supported polyglot languages. The contents of each subdirectory is
specified on a per-language basis, in the
[polyglot documentation](../../runtime/implementation/polyglot/).
[polyglot documentation](../polyglot/README.md).
### The `package.yaml` File
`package.yaml` describes certain package metadata, such as its name, authors
@ -93,7 +103,7 @@ The following is the specification of the manifest fields.
`None`, meaning the package is not safe for use by third parties.
#### version
**Required** *String*: The [Semantic Versioning](https://semver.org/) string,
**Required** *String*: The [semantic versioning](https://semver.org/) string,
in the `major.minor.patch` format.
#### author
@ -126,3 +136,15 @@ version: <semver string of the required library version>
> The actionables for this section are:
>
> - Extend the library version field to handle version bounds.
## Build Reproducibility
It is crucial for any good development environment to provide reproducible
builds, such that it is impossible for it to go wrong by mismatching library
versions.
> The actionables for this section are:
>
> - Decide on the strategies of ensuring consistent library resolution. This
> may include hashing the downloaded versions of libraries and publishing
> stack-style resolvers for sets of libraries that are proven to work well
> together.

View File

@ -1,3 +1,11 @@
---
layout: developer-doc
title: Enso Developer Documentation
category: summary
tags: [summary, design]
order: 1
---
# Enso: Simplicity and Correctness
Enso is an award-winning, general-purpose, purely functional programming
language with equivalent, first-class visual and textual representations and a
@ -111,7 +119,8 @@ correctness and provide as much valuable information to the user as possible in
as automated and intuitive a fashion as possible.
## Connections to your Data
Software creation is a very creative process. However, while using conventional languages, programmers are like chess players trying to play with a blindfold on
Software creation is a very creative process. However, while using conventional
languages, programmers are like chess players trying to play with a blindfold on
- so much of their mental energy is spent just trying to picture where the
pieces are that theres hardly any left over to think about the game itself.

View File

@ -1,3 +1,11 @@
---
layout: section-summary
title: Language Server Documentation
category: language-server
tags: [language-server, readme]
order: 0
---
# Language Server Documentation
The Enso Language Server is responsible for providing language services to the
Enso IDE (and other clients). This mainly involves speaking the Enso protocol
@ -20,5 +28,17 @@ and orchestrating the runtime in response to this. It is responsible for:
- **Type Interactions:** Features for type-driven-development that allow users
to interact with the types of their programs.
This folder contains all documentation pertaining to the Language Server.
This folder contains all documentation pertaining to the Language Server, which
is broken up as follows:
- [**The Enso Protocol Architecture:**](./protocol-architecture.md) The
architecture of the Enso protocol.
The protocol messages are broken up into documents as follows:
- [**Common Message Specification:**](./protocol-common.md) The common messages
and types
- [**Project Manager Message Specification:**](./protocol-project-manager.md)
The messages and types pertaining to the project manager component.
- [**Language Server Message Specification:**](./protocol-language-server.md)
The messages and types pertaining to the language server component.

View File

@ -0,0 +1,661 @@
---
layout: developer-doc
title: The Enso Protocol
category: language-server
tags: [language-server, protocol, architecture]
order: 1
---
# The Enso Protocol
Enso is a sophisticated language, but in order to provide a great user
experience to our users we also need the ability to provide great tooling. This
tooling means a language server, but it also means a set of extra peripheral
components that ensure we can run Enso in a way that the product requires.
These services are responsible for providing the whole-host of language- and
project-level tooling to the IDE components, whether they're hosted in the cloud
or locally on a user's machine.
This document contains the architectural and functional specification of the
Enso protocol.
For a detailed specification of all of the messages that make up the protocol,
please see [the protocol message specifications](./README.md).
<!-- MarkdownTOC levels="2,3,4" autolink="true" -->
- [Architecture](#architecture)
- [The Project Manager](#the-project-manager)
- [Language Server](#language-server)
- [Textual Protocol](#textual-protocol)
- [Textual Protocol Communication Patterns](#textual-protocol-communication-patterns)
- [Textual Protocol Transport](#textual-protocol-transport)
- [The Protocol Format](#the-protocol-format)
- [Textual Protocol Functionality](#textual-protocol-functionality)
- [Textual Diff Management](#textual-diff-management)
- [Handling Multiple Clients](#handling-multiple-clients)
- [Project State Management](#project-state-management)
- [File Management and Storage](#file-management-and-storage)
- [Execution Management](#execution-management)
- [Caching](#caching)
- [Progress Reporting](#progress-reporting)
- [Completion](#completion)
- [Analysis Operations](#analysis-operations)
- [Functionality Post 2.0](#functionality-post-20)
- [Binary Protocol](#binary-protocol)
- [Binary Protocol Communication Patterns](#binary-protocol-communication-patterns)
- [Binary Protocol Transport](#binary-protocol-transport)
- [Binary Protocol Functionality](#binary-protocol-functionality)
- [Displaying Visualisations](#displaying-visualisations)
- [Service Connection Setup](#service-connection-setup)
- [Service Connection Teardown](#service-connection-teardown)
<!-- /MarkdownTOC -->
## Architecture
The divisions of responsibility between the backend engine services are dictated
purely by necessity. As multi-client editing necessitates careful
synchronisation and conflict resolution, between the actions of multiple
clients. This section deals with the intended architecture for the Engine
Services.
The engine services are divided into two main components:
1. **The Project Manager:** This component is responsible for listing and
managing user projects, as well as spawning the language server for a given
project when it is opened.
2. **The Language Server:** This component is responsible for dealing with
incoming connections and resolving conflicts between multiple clients. It is
also responsible for servicing all of the requests from the clients.
Both components will be implemented as akka actors such that we can defer the
decision as to run them in different processes until the requirements become
more clear.
### The Project Manager
The project manager service is responsible for both allowing users to work with
their projects but also the setup and teardown of the language server itself.
Its responsibilities can be summarised as follows:
- Allowing users to manage their projects.
- Starting up the language server for a given project upon user selection.
- Notifying the language server of a pending shutdown on project exit to allow
it to persist any state that it needs to disk.
### Language Server
The language server is responsible for managing incoming connections and
communicating with the clients, as well as resolving any potential conflicts
between the clients. It is responsible for the following:
- Negotiating and accepting connections from multiple clients.
- Resolving conflicts between messages from multiple clients.
- Optimising incoming requests wherever possible.
It is also responsible for actually servicing all of the incoming requests,
which includes but isn't limited to:
- **Completion Information:** It should be able to provide a set of candidate
completions for a given location in the code.
- **Introspection Information:** It should be able to provide introspection
information from the running interpreter, which consists primarily of types
and values.
- **Textual Diff Management:** It needs to be able to accept and publish diffs
of the program source code. As part of this, it needs to keep the node
metadata up to date.
- **Analysis Operations:** It should be able to service various IDE-style
analysis requests (e.g. jump-to-definition or find usages)
- **Arbitrary Code Execution:** It should be able to execute arbitrary Enso code
on values in scope.
- **Refactoring:** Common refactoring operations for Enso programs, including
renaming, code formatting, and so on.
- **IO Management:** Though this is arguably a feature of the runtime rather
than the language server itself, this refers to the ability to watch files and
monitor IO in order to recompute minimal subsets of the program.
It should be noted that the language server _explicitly does not_ talk using
LSP. The component is solely responsible for _servicing_ requests, instead of
dealing with the minutiae of connection negotiation and request handling.
Additionally, it is _very important_ to note that the language server _must not_
depend directly on the runtime (as a compile-time dependency). This would
introduce significant coupling between the runtime implementation and the
language server. Instead, the LS should only depend on `org.graalvm.polyglot` to
interface with the runtime.
## Textual Protocol
The protocol refers to the communication format that all of the above services
speak between each other and to the GUI. This protocol is not specialised only
to language server operations, as instead it needs to work for all of the
various services in this set.
The protocol we are using intends to be fully compatible with the Microsoft LSP
[specification](https://microsoft.github.io/language-server-protocol/specifications/specification-3-145)
(version 3.15). In essence, we will operate as follows:
- Where our use case matches with a function provided by LSP, we will use the
specified LSP message (e.g. completions).
- Where our use-case does not match a message provided by LSP, we will use the
following process:
1. If we can implement this on top of one of LSP's extensible mechanisms (e.g.
commands) we will do so.
2. If this is not possible, we will specify an _extension_ to the protocol.
This extension will be well-specified within this document, and should be
in the spirit of the existing protocol. If relevant, we may propose it as
a future extension to the specification.
Aside from the language server protocol-based operations, we will definitely
need a protocol extension to support Enso's custom language functionality.
### Textual Protocol Communication Patterns
Whatever protocol we decide on will need to have support for a couple of main
communication patterns:
- **Pub/Sub:** A standard publisher/subscriber model, the server will need to be
able to support this kind of connection to deal with events that do not occur
strictly in response to client actions (e.g. updates to observed values).
- **Req/Res:** A standard request/response model, the server will need to be
able to support this kind of connection to deal with one-off requests from
the client, and potentially to make requests to the client (e.g. list modules
in the current project, please refresh your file state).
There are also certain messages that follow the request/response model but where
the responses are trivial acknowledgements. For simplicity's sake these are
currently subsumed by the generic request-response model.
As we have decided to remain compatible with LSP, we can use any communication
pattern that we desire, either by employing existing LSP messages, or writing
our own protocol extensions. Both of the above-listed patterns are supported by
LSP.
We can support additional patterns through LSP's mechanisms:
- Asynchronous responses can be sent as notifications.
- Protocol-level acknowledgements is supported directly in LSP.
### Textual Protocol Transport
The transport of the protocol refers to the underlying layer over which its
messages (discussed in [the protocol format](#the-protocol-format) below) are
sent. As we are maintaining compatibility with LSP, the protocol transport
format is already defined for us.
- Textual messages are sent using
[JSON-RPC](https://en.wikipedia.org/wiki/JSON-RPC) over a WebSocket connection
(as defined in the LSP spec).
- As a protocol extension we also negotiate a secondary binary WebSocket
connection for sending visualisation data. This transport is independent of
the LSP spec, and hence is defined entirely by us.
> The actionables for this section are:
>
> - Determine the details for the binary WebSocket, including how we want to
> encode messages and pointers into the stream, as well as how we set it up
> and tear it down.
### The Protocol Format
Protocol messages are defined by LSP. Any extensions to the messages defined in
the standard should use similar patterns such that they are not incongruous with
LSP messages. The following notes apply:
- Textual messages should be sent as LSP messages or extensions to them.
- We have a hybrid extension to the protocol to allow us to send binary data
(for visualisations) over a second WebSocket connection.
This means that we have two pipes: one is the textual WebSocket defined by LSP,
and the other is a binary WebSocket.
## Textual Protocol Functionality
This entire section deals with the _functional_ requirements placed upon the
protocol used by the engine services. These requirements are overwhelmingly
imposed by the IDE, but also include additional functionality for the future
evolution of the language.
All of the following pieces of functionality that are explained in detail are
those _expected_ for the 2.0 release. Any additional functionality beyond this
milestone is described in a [dedicated section](#functionality-post-20).
### Textual Diff Management
The engine services need to support robust handling of textual diffs. This is
simply because it is the primary form of communication for synchronising source
code between the IDE and the engine. It will need to support the following
operations:
- Synchronisation requests to ensure that the engine and IDE have the same view
of the files in the project.
- Diff update requests, that send a textual diff between client and server (or
vice versa).
Both of these are supported natively within the LSP, and we will be using those
messages to implement this.
It should be noted that we _explicitly_ do not intend to handle updates to node
metadata within the language server.
- These updates should be sent as part of each diff the client provides (as a
separate segment in the `didChange` message).
- We may support this _in the future_, but we do not for now.
We place the following requirements upon the implementation of this:
- We must be able to handle diffs of _any size_, even though we prefer that the
client sends us minimal diffs. We do not know what _all_ clients will do, and
hence to remain compatible we must handle all diffs.
- Diffs may require some AST-based or semantic minimisation in order to assist
in the compiler's incremental pipeline.
- The gateway must handle diffs from multiple clients properly.
The implementation is as follows:
- Support for the LSP messages `didOpen`, `didChange`, `willSaveWaitUntil`,
`didSave`, `didClose`, and support for informing the runtime on each of these.
- It must track which files are currently open in the editor.
- The language server and runtime should watch the project folder in order to
track updates as necessary.
### Handling Multiple Clients
Multiple-client support will be implemented while remaining compatible with the
LSP specification.
- In the initial implementation, we will work on the principle of 'write lock',
where only one of the multiple connected clients has the ability to write to
the file.
- In the future we will work on true conflict resolution of edits, mediated by
the language server.
It will work as follows:
- We will use as much of the LSP initialisation and connection flow as possible
to connect additional clients. The extensions should be minimal and should
_not_ break compatibility with LSP.
- We will extend the protocol (or use extension mechanisms) to implement
write-lock negotiation. The first client to connect is granted write-lock, but
this may be changed later via negotiation.
- If any client sends a `didChange` message without holding the write lock, it
should receive an `applyEdit` message that reverts the change, as well as a
notification of the error.
- The language server / gateway is responsible for synchronising state with new
clients as they connect. As part of initialisation it should receive client
state, and then `applyEdit` to synchronise views of the code.
### Project State Management
One of the most important functionalities for this service set is the ability to
manage the state of a project in general. The project state refers to the whole
set of the project files and metadata and needs to support the following
functionalities:
- Get project metadata (name, maintainer, version, dependencies, and so on)
- Change requests for the above
All file-based operations in the project can be handled by the editor directly,
or the language server (when doing refactoring operations), and as such need no
support in this section of the protocol.
At the current time, the language server has a 1:1 correspondence with a
project. In the future, however, we may want to add LSP support for multiple
projects in a single engine, as this would allow users to work with multiple
related projects in a single instance of the IDE.
### File Management and Storage
The nature of LSP means that file management and storage is _not_ handled by the
language server, and is instead handled by the editor. The protocol makes a
distinction between:
- **Open Files:** These are the ones currently open in the editor, and are
'owned' by the editor. The language server has to work with an in-memory
representation for these files.
- **Closed Files:** These are the ones not open in the editor, and can be
accessed and modified _directly_ by the language server.
The language server must have _direct_ access to the project directory on the
machine where it is running (either the local machine or in the Enso cloud), and
file operations between the IDE and that machine are handled _indepdendently_ of
the language server.
### Execution Management
The language server process will need to be able to respond to requests for
various kinds of execution of Enso code. Furthermore, it needs to be able to
respond to requests to 'listen' to the execution of various portions of code.
This implies that the following functionalities are needed:
- Execution of a function with provided arguments.
- Execution of a function from a given call site (stack position and code
position).
- Attach an execution listener to an arbitrary code span.
- Detach an execution listener by ID.
- Implement heartbeat messages for execution listeners. If a heartbeat response
isn't received before some time-out, the language server should detach the
listener.
- Force cache invalidation for arbitrary code spans.
- Attach an automatic execution request.
- Detach an automatic execution request.
- Redirect `stdout`/`stdin`/`stderr` to and from the IDE.
All of these functionalities will need to take the form of custom extensions to
the LSP, as they do not fit well into any of the available extension points. To
that end, these extensions should fit well with the LSP.
A subscription (execution listener) is applied to an arbitrary span of code at a
given position in the call stack.
- A subscription may encompass multiple nodes or a single node. Information is
received for _all_ nodes covered by the provided span.
- A subscription will ensure that the client receives information on changes in:
+ Execution state (whether the node is being computed or is cached)
+ Profiling information
+ Values
+ Types
+ Where we are in the call stack (useful for recursive execution)
- Such subscriptions _must_ be accompanied by heartbeat messages in order to
allow the language server to cull unused subscriptions.
- Additionally, it will be important for each subscription to be able to
configure a _rate limit_, such that the update messages do not overwhelm the
client. If unspecified this should be set to a sensible default.
#### Caching
One of the most important elements of execution management for the language
server is the ability to control and interact with the execution cache state in
the runtime.
- This cache stores intermediate values, and every value can be in one of three
states: invalid, valid but evicted, and valid but present.
- The cache works based on dependencies between data, such that if `foo` is used
by `bar`, then changing `foo` must recompute `bar`.
The cache eviction strategy is one that will need to evolve. This comes down to
the simple fact that we do not yet have the tools to implement sophisticated
strategies, but we need to be correct.
- In the initial version we will invalidate _all_ call sites for a given method
name when a name is changed. Internally this is implemented as the
invalidation of all occurrences of a dynamic symbol by name, while ignoring
the type it was defined on.
- We also need to account for dependencies between data such that if there is a
dependency `b => a`, then a change to `a` must invalidate the cache result of
`b`.
- In future, the typechecker will be able to help constrain the set of evicted
methods by exploiting dependencies between values and types with more
information.
- There may also be non-obvious data dependencies that can be exploited to make
better cache-eviction decisions.
#### Progress Reporting
In the future it will be desirable for long running computations to provide
real-time progress information (e.g. for training a neural network it would be
great to know which epoch is running).
- This could be achieved by a special kind of Monadic context (similar to
writer, but mutable buffer based).
- This would allow the function to log values without needing to return.
- These would be sent as visualisations for use in the IDE.
LSP provides an inbuilt mechanism for reporting progress, but that will not work
with visualisations. As a result that should be reserved for reporting progress
of long-running operations within the _language server_ rather than in user
code.
### Completion
The IDE needs the ability to request completions for some target point (cursor
position) in the source code. In essence, this boils down to _some_ kind of
smart completion. The completion should provide the following:
- Sensible suggestions at the cursor position, ranked by relevance.
- Local variables, where relevant.
- Suggestions for symbols that would make sense (e.g. by type) but are not
imported. To support this, selection of such a symbol will trigger the
automatic addition of the relevant import on the language server.
- Searching in tags and documentation. This metadata-based search functionality
should be used to refine suggestions and help suggest functionality relevant
to user tasks.
- Browsing the symbol hierarchy. The user should be able to click through
modules to browse the various symbols contained within.
- Import Completion for when a user has typed `import` and hits `<tab>`. This
feature should suggest libraries that are available, along with provide their
top-level documentation to give users an idea of what they can be used for.
Hints should be gathered by the runtime in an un-ranked fashion based upon the
above criteria. This will involve combining knowledge from both the compiler and
the interpreter to deliver a sensible set of hints.
- Hints should be scored on a type match. For example, if we have a type `5`,
`foo : 5 -> String` scores higher than `bar : Nat -> Dynamic`, scores higher
than `baz : Any -> Any`. This should be done by heuristics initially, and
later by querying the typechecker for subsumption relationships (the notion of
specificity discussed in the types design document).
- Information contained in the `tags` section of the documentation should also
be used to rank candidates.
- Please note that this ranking algorithm will be required to get more complex
in the future, so please design it for extensibility _and_ high performance.
- Local variables should rank higher than global symbols.
From an implementation perspective, the following notes apply:
- This will be implemented on top of the `completion` and `completionResolve`
messages provided by the LSP spec.
- We will extend these messages with an _optional_ field that specifies the type
being queried upon. This is a stop-gap solution until inference can determine
the type.
- The request does _not_ contain the query string, as text matching is handled
by the IDE. The language server only handles candidate completions.
- We should determine if there are any sensible ways in which this process can
be optimised, as we have the potential to return very large completion sets.
It is probably worth waiting to see if this is necessary before implementing
any optimisations here.
### Analysis Operations
We also want to be able to support a useful set of semantic analysis operations
to help users navigate their code. As these rely on knowledge of the language
semantics, they must be explicitly supported by the language server:
- **List Symbols in Scope:** The scope should be specified by a code span.
- **Insert Import for Symbol:** This should use an `applyEdit` message to ask
the IDE to insert an import for the symbol specified in the request. If the
file is closed, then the edit should be made directly, as the LSP specifies.
### Functionality Post 2.0
In addition to the functionality discussed in detail above, there are further
augmentations that could sensibly be made to the Engine services to support a
much better editing and user-experience for Enso. These are listed briefly below
and will be expanded upon as necessary in the future.
- **Refactoring Operations:** As all of these operations rely on a semantic
analysis of the source program, they must be performed by the language server.
These should include (but may not be limited to) the renaming, moving,
extraction and inlining of entities. In future this could be expanded to
include refactoring hints a la IntelliJ.
- **Arbitrary Visualisation Code:** Visualisations should be able to be defined
using Enso code and will require additional support.
- **IO Manager:** The ability to do sophisticated IO monitoring, such as
watching for file changes, in order to support minimal re-execution of
analysis pipelines.
- **Enhanced Type-Manipulation:** Get fits for holes, case splitting, insert
type, refine type, solve type, and so on. Inspiration for these operations can
be taken from programs that provide for interactive type-driven development.
- **REPL:** Protocol messages to support a REPL-style of interactive
development. This should include, at a minimum, the ability to execute
arbitrary code statements in a REPL, but could be enhanced by the ability to
execute code from the editor in the REPL, and send changes back from the REPL
to the file.
- **Debugging:**: The user should be able to place break-points, and easily
inspect values during execution of a program. This debugging functionality
should allow for hot-reloading of code, changing of values within a live
program, and various other debugger functionality (step over, step in, step
out, continue, etc). Future debugger functionality should be based on the
standard [debug adapter protocol](https://microsoft.github.io/debug-adapter-protocol/specification).
- **Profiling Information:** Profiling information for the executing code, able
to be displayed visually in Enso Studio.
- **Code Formatting:** Automatic formatting of Enso code using the One True
Style ™.
- **Server-Side Metadata Management:** The lack of node metadata management in
the language server currently means that any language client other than Enso
Studio is guaranteed to corrupt the node metadata when editing Enso code. This
will reset the node layout and can be quite annoying.
- **True Multi-Client Support:** The initial release will only support multiple
connected clients through the use of a write lock. This is not a great user
experience, and in future we should instead use proper conflict resolution for
true collaborative editing. This will use a combination of `didChange` and
`applyEdit` messages to reconcile all clients' views of the files. This is
also why `willSaveWaitUntil` is important, as it can ensure that no client
editor saves until it has the authority to do so (all changes are reconciled).
- **Enhanced Semantic Analysis:** Enhanced semantic analysis operations that
rely on compiler analysis and typechecking. This includes things like "find
usages", "jump to definition" and "find symbol", as these can greatly enhance
a user's development experience.
- **LSP Spec Completeness:** We should also support all LSP messages that are
relevant to our language. Currently we only support a small subset thereof.
## Binary Protocol
The binary protocol refers to the auxiliary protocol used to transport raw
binary data between the engine and the client. This functionality is _entirely_
extraneous to the operation of the [textual protocol](#textual-protocol), and is
used for transferring large amounts of data between Enso components.
As the protocol is a binary transport, it is _mediated and controlled_ by
messages that exist as part of the textual protocol.
In order to deserialize a family of messages and correlate responses with
requests, each request/response/notification is wrapped in an envelope
structure. There is a separate envelope for incoming and outgoing messages:
```idl
namespace org.enso.languageserver.protocol.binary;
//A mapping between payload enum and inbound payload types.
union InboundPayload {
INIT_SESSION_CMD: InitSessionCommand,
WRITE_FILE_CMD: WriteFileCommand,
READ_FILE_CMD: ReadFileCommand
}
//An envelope for inbound requests and commands.
table InboundMessage {
//A unique id of the message sent to the server.
messageId: EnsoUUID (required);
//An optional correlation id used to correlate a response with a request.
correlationId: EnsoUUID;
//A message payload that carries requests sent by a client.
payload: InboundPayload (required);
}
```
```idl
namespace org.enso.languageserver.protocol.binary;
//A mapping between payload enum and outbound payload types.
union OutboundPayload {
ERROR: Error,
SUCCESS: Success,
VISUALISATION_UPDATE: VisualisationUpdate,
FILE_CONTENTS_REPLY: FileContentsReply
}
//An envelope for outbound responses.
table OutboundMessage {
//A unique id of the message sent from the server.
messageId: EnsoUUID (required);
//An optional correlation id used to correlate a response with a request.
correlationId: EnsoUUID;
//A message payload that carries responses and notifications sent by a server
payload: OutboundPayload (required);
}
```
```idl
namespace org.enso.languageserver.protocol.binary;
//This message type is used to indicate failure of some operation performed.
table Error {
//A unique error code identifying error type.
code: int;
//An error message.
message: string;
}
//Indicates an operation has succeeded.
table Success {}
```
### Binary Protocol Communication Patterns
The binary protocol currently only supports a single type of communication
pattern:
- **Push:** Messages containing data are pushed in response to operations
performed using the textual protocol.
### Binary Protocol Transport
The binary protocol uses [flatbuffers](https://github.com/google/flatbuffers)
for the protocol transport format. This choice has been made for a few reasons:
- Robust multi-language support, including Rust and Java on the JVM.
- High performance, including support for zero-copy data handling and streaming
data.
- Robust, schema-based messages.
## Binary Protocol Functionality
The binary protocol exists in order to serve the high-bandwidth data transfer
requirements of the engine and the GUI.
### Displaying Visualisations
A major part of Enso Studio's functionality is the rich embedded visualisations
that it supports. This means that the following functionality is necessary:
- Execution of an arbitrary Enso expression on a cached value designated by
a source location.
- The ability to create and destroy visualisation subscriptions with an
arbitrary piece of Enso code as the preprocessing function.
- The ability to update _existing_ subscriptions with a new preprocessing
function.
Visualisations in Enso are able to output arbitrary data for display in the GUI,
which requires a mechanism for transferring arbitrary data between the engine
and the GUI. These visualisations can output data in common formats, which will
be serialised by the transport (e.g. text), but they can also write arbitrary
binary data that can then be interpreted by the visualisation component itself
in any language that can be used from within the IDE.
From the implementation perspective:
- This will need to be an entirely separate set of protocol messages that should
be specified in detail in this document.
- Visualisations should work on a pub/sub model, where an update is sent every
time the underlying data is recomputed.
- Protocol responses must contain a pointer into the binary pipe carrying the
visualisation data to identify an update.
## Service Connection Setup
As these services need to support multiple clients in future, there is some
rigmarole around setting up the various connections needed by each client. The
process for spawning and connecting to an engine instance is as follows:
1. **Spawn the Server:** The project manager spawns the language server,
passing the socket information as part of the initialisation flow.
2. **Client ID Generation:** The client generates and stores a UUID that will
be used to identify the client while it is connected.
3. **Protocol Connection Initialisation:** The client performs the init for the
textual protocol connection, passing its client identifier as it does so.
See [`session/initProtocolConnection`](./protocol-language-server.md#sessioninitprotocolconnection)
for more information.
4. **Data Connection Initialisation:** The client performs the init for the
data connection, passing its client identifier as it does so. See
[`session/initDataConnection`](./protocol-language-server.md#sessioninitdataconnection) below more
information.
## Service Connection Teardown
As the engine performs sophisticated caching and persisting of data where
possible, it is very important that the client informs the engine of the end of
its session. In contrast to the initialisation flow above, this is not an
involved process.
1. **Notify the Engine:** _Prior_ to disconnecting from the sockets, the client
must send `session/end` to the server.
2. **Disconnect:** Once that message has been sent, the client may disconnect
at any time.

View File

@ -0,0 +1,113 @@
---
layout: developer-doc
title: Enso Protocol Common Message Specification
category: language-server
tags: [language-server, protocol, specification]
order: 2
---
# Enso Protocol Common Message Specification
This document contains the specification of the Enso protocol messages that are
common to multiple sub-components of the protocol.
For information on the design and architecture of the protocol, as well as its
transport formats, please look [here](./protocol-architecture).
<!-- MarkdownTOC levels="2,3" autolink="true" -->
- [Protocol Message Specification](#protocol-message-specification)
- [Common Types](#common-types)
- [`Path`](#path)
- [`IPWithSocket`](#ipwithsocket)
- [`EnsoUUID`](#ensouuid)
<!-- /MarkdownTOC -->
## Protocol Message Specification
The message specification for protocol messages must include the following
fields:
- **Type:** The type of the message (e.g. Request or Notification).
- **Direction:** The direction in which the _originating_ message is sent
(either `Client -> Server` or `Server -> Client`).
- **Connection:** Which connection the message should be sent on. Write
'Protocol' for the textual connection and 'Data' for the binary connection.
- **Visibility:** Whether the method should be used by the public or is an
internal / implementation detail ('Public' or 'Private').
They must also contain separate sections specifying their parameters, result (if
it has one), and any errors that may occur. These specifications should be
either in typescript or flatbuffers syntax, depending on the connection on
which the message occurs.
The capability specifications must include the following fields, as well as a
section 'Enables' stating which protocol messages are gated by the capability.
- **method:** The name of the capability.
- **registerOptions:** The options that must be provided to register the
capability, described using typescript type syntax.
## Common Types
There are a number of types that are shared between many of the protocol
messages. They are specified below.
### `Path`
A path is a representation of a path relative to a specified content root.
#### Format
Please note that segments can only be ordinary file names, `..` and `.` may not
be supported.
```typescript
interface Path {
rootId: UUID;
segments: [String];
}
```
```idl
namespace org.enso.languageserver.protocol.binary;
//A representation of a path relative to a specified content root.
table Path {
//a content root id that the path is relative to
rootId: EnsoUUID;
//path segments
segments: [string];
}
```
### `IPWithSocket`
A IPWithSocket is an endpoint for communication between machines.
#### Format
```typescript
interface IPWithSocket {
host: String;
port: Int;
}
```
### `EnsoUUID`
An EnsoUUID is a value object containing 128-bit universally unique identifier.
#### Format
```idl
namespace org.enso.languageserver.protocol.binary;
//A binary representation of universally unique identifiers.
struct EnsoUUID {
//The most significant bits of the UUID.
leastSigBits:uint64;
//The most significant bits of the UUID.
mostSigBits:uint64;
}
```

File diff suppressed because it is too large Load Diff

View File

@ -0,0 +1,358 @@
---
layout: developer-doc
title: Enso Protocol Project Manager Message Specification
category: language-server
tags: [language-server, protocol, specification]
order: 3
---
# Enso Protocol Project Manager Message Specification
This document contains the specification of the Enso protocol messages that
pertain to the project manager component. Please familiarise yourself with the
[common](./protocol-common.md) features of the protocol before reading this
document.
For information on the design and architecture of the protocol, as well as its
transport formats, please look [here](./protocol-architecture).
<!-- MarkdownTOC levels="2,3" autolink="true" -->
- [Types](#types)
- [`ProjectMetadata`](#projectmetadata)
- [Project Management Operations](#project-management-operations)
- [`project/open`](#projectopen)
- [`project/close`](#projectclose)
- [`project/listRecent`](#projectlistrecent)
- [`project/create`](#projectcreate)
- [`project/delete`](#projectdelete)
- [`project/listSample`](#projectlistsample)
- [Language Server Management](#language-server-management)
- [Errors](#errors)
- [`ProjectNameValidationError`](#projectnamevalidationerror)
- [`ProjectDataStoreError`](#projectdatastoreerror)
- [`ProjectExistsError`](#projectexistserror)
- [`ProjectNotFoundError`](#projectnotfounderror)
- [`ProjectOpenError`](#projectopenerror)
- [`ProjectCloseError`](#projectcloseerror)
- [`ProjectNotOpenError`](#projectnotopenerror)
- [`ProjectOpenByOtherPeersError`](#projectopenbyotherpeerserror)
- [`CannotRemoveOpenProjectError`](#cannotremoveopenprojecterror)
<!-- /MarkdownTOC -->
## Types
There are a number of types that are used only within the project server's
protocol messages. These are specified here.
### `ProjectMetadata`
This type represents information about a project.
#### Format
```typescript
interface ProjectMetadata {
name: String;
id: UUID;
lastOpened: UTCDateTime;
}
```
## Project Management Operations
The primary responsibility of the project managers is to allow users to manage
their projects.
### `project/open`
This message requests that the project manager open a specified project. This
operation also includes spawning an instance of the language server open on the
specified project.
- **Type:** Request
- **Direction:** Client -> Server
- **Connection:** Protocol
- **Visibility:** Public
#### Parameters
```typescript
interface ProjectOpenRequest {
projectId: UUID;
}
```
#### Result
```typescript
interface ProjectOpenResult {
languageServerJsonAddress: IPWithSocket;
languageServerBinaryAddress: IPWithSocket;
}
```
#### Errors
- [`ProjectNotFoundError`](#projectnotfounderror) to signal that the project
doesn't exist.
- [`ProjectDataStoreError`](#projectdatastoreerror) to signal problems with
underlying data store.
- [`ProjectOpenError`](#projectopenerror) to signal failures during server boot.
### `project/close`
This message requests that the project manager close a specified project. This
operation includes shutting down the language server gracefully so that it can
persist state to disk as needed.
- **Type:** Request
- **Direction:** Client -> Server
- **Connection:** Protocol
- **Visibility:** Public
#### Parameters
```typescript
interface ProjectCloseRequest {
projectId: UUID;
}
```
#### Result
```typescript
{}
```
#### Errors
- [`ProjectNotFoundError`](#projectnotfounderror) to signal that the project
doesn't exist.
- [`ProjectDataStoreError`](#projectdatastoreerror) to signal problems with
underlying data store.
- [`ProjectCloseError`](#projectcloseerror) to signal failures that occurred
during language server stoppage.
- [`ProjectNotOpenError`](#projectnotopenerror) to signal cannot close a project
that is not open.
- [`ProjectOpenByOtherPeersError`](#projectopenbyotherpeerserror) to signal
that cannot close a project that is open by other clients.
### `project/listRecent`
This message requests that the project manager lists the user's most recently
opened projects.
- **Type:** Request
- **Direction:** Client -> Server
- **Connection:** Protocol
- **Visibility:** Public
#### Parameters
```typescript
interface ProjectListRecentRequest {
numberOfProjects: Int;
}
```
#### Result
```typescript
interface ProjectListRecentResponse {
projects: [ProjectMetadata];
}
```
#### Errors
- [`ProjectDataStoreError`](#projectdatastoreerror) to signal problems with
underlying data store.
### `project/create`
This message requests the creation of a new project.
- **Type:** Request
- **Direction:** Client -> Server
- **Connection:** Protocol
- **Visibility:** Public
#### Parameters
```typescript
interface ProjectCreateRequest {
name: String;
}
```
#### Result
```typescript
interface ProjectOpenResponse {
projectId: UUID;
}
```
#### Errors
- [`ProjectNameValidationError`](#projectnamevalidationerror) to signal
validation failures.
- [`ProjectDataStoreError`](#projectdatastoreerror) to signal problems with
underlying data store.
- [`ProjectExistsError`](#projectexistserror) to signal that the project
already exists.
### `project/delete`
This message requests the deletion of a project.
- **Type:** Request
- **Direction:** Client -> Server
- **Connection:** Protocol
- **Visibility:** Public
#### Parameters
```typescript
interface ProjectDeleteRequest {
projectId: UUID;
}
```
#### Result
```typescript
{}
```
#### Errors
- [`ProjectDataStoreError`](#projectdatastoreerror) to signal problems with
underlying data store.
- [`ProjectNotFoundError`](#projectnotfounderror) to signal that the project
doesn't exist.
- [`CannotRemoveOpenProjectError`](#cannotremoveopenprojecterror) to signal that
the project cannot be removed, because is open by at least one user.
### `project/listSample`
This request lists the sample projects that are available to the user.
- **Type:** Request
- **Direction:** Client -> Server
- **Connection:** Protocol
- **Visibility:** Public
#### Parameters
```typescript
interface ProjectListSampleRequest {
numProjects: Int;
}
```
#### Result
```typescript
interface ProjectListSampleResponse {
projects: [ProjectMetadata];
}
```
#### Errors
TBC
## Language Server Management
The project manager is also responsible for managing the language server. This
means that it needs to be able to spawn the process, but also tell the process
when to shut down.
> The actionables for this section are:
>
> - Fill it in when we have more of an idea about exactly how this spawning
> relationship is going to work.
## Errors
The project manager component has its own set of errors. This section is not a
complete specification and will be updated as new errors are added.
### `ProjectNameValidationError`
Signals validation failures.
```typescript
"error" : {
"code" : 4001,
"message" : "Cannot create project with empty name"
}
```
### `ProjectDataStoreError`
Signals problems with underlying data store.
```typescript
"error" : {
"code" : 4002,
"message" : "Cannot load project index"
}
```
### `ProjectExistsError`
Signals that the project already exists.
```typescript
"error" : {
"code" : 4003,
"message" : "Project with the provided name exists"
}
```
### `ProjectNotFoundError`
Signals that the project doesn't exist.
```typescript
"error" : {
"code" : 4004,
"message" : "Project with the provided id does not exist"
}
```
### `ProjectOpenError`
Signals that the project cannot be open due to boot failures.
```typescript
"error" : {
"code" : 4005,
"message" : "A boot failure."
}
```
### `ProjectCloseError`
Signals failures during shutdown of a server.
```typescript
"error" : {
"code" : 4009,
"message" : "A shutdown failure."
}
```
### `ProjectNotOpenError`
Signals that cannot close project that is not open.
```typescript
"error" : {
"code" : 4006,
"message" : "Cannot close project that is not open"
}
```
### `ProjectOpenByOtherPeersError`
Signals that cannot close a project that is open by other clients.
```typescript
"error" : {
"code" : 4007,
"message" : "Cannot close project because it is open by other peers"
}
```
### `CannotRemoveOpenProjectError`
Signals that cannot remove open project.
```typescript
"error" : {
"code" : 4008,
"message" : "Cannot remove open project"
}
```

25
docs/polyglot/README.md Normal file
View File

@ -0,0 +1,25 @@
---
layout: section-summary
title: Enso Polyglot Support
category: polyglot
tags: [polyglot, readme]
order: 0
---
# Enso Polyglot Support
Enso supports robust polyglot interoperation with other programming languages
that are supported on its platform. This section of the design documentation
deals with
It is broken down into the following sections:
- [**Polyglot Bindings:**](./polyglot-bindings.md) A document providing an
overview of the mechanisms provided to work with polyglot bindings in Enso.
- [**Typing Polyglot Bindings:**](./typing-polyglot-bindings.md) An exploration
of how we can provide a modicum of type safety for the polyglot bindings in
Enso.
It also provides language-specific documentation for the various supported
polyglot languages. These are as follows:
- [**Java:**](./java.md) Information specific to the Java polyglot bindings.

View File

@ -1,6 +1,15 @@
# Polyglot Java Runtime
---
layout: developer-doc
title: Polyglot Java
category: polyglot
tags: [polyglot, java]
order: 3
---
# Polyglot Java
This document deals with the implementation of polyglot interoperation with
Java in the runtime.
Java in the runtime. Please familiarise yourself with the general operation of
[polyglot bindings](./polyglot-bindings.md).
<!-- MarkdownTOC levels="2,3" autolink="true" -->

View File

@ -0,0 +1,180 @@
---
layout: developer-doc
title: Polyglot Bindings
category: polyglot
tags: [polyglot, bindings, interop]
order: 1
---
# Polyglot Bindings
This document deals with the specification and design for the polyglot interop
system provided in the Enso runtime. This system allows users to connect Enso to
other supported programming languages, to both provide access to a wealth of
libraries, and to integrate Enso into existing systems.
The polyglot support in Enso is best-in class, and it supports this through two
main mechanisms:
1. **Polyglot FFI:** The low-level polyglot support provides a fairly low-level
syntax sugar system for working with values from foreign languages.
2. **Embedded Syntax:** This system allows users to write code from other
languages directly in their `.enso` files, and to seamlessly share values
between Enso and that foreign code.
<!-- MarkdownTOC levels="2,3" autolink="true" -->
- [Impedance Mismatch](#impedance-mismatch)
- [The Polyglot FFI](#the-polyglot-ffi)
- [Importing Polyglot Bindings](#importing-polyglot-bindings)
- [Using Polyglot Bindings](#using-polyglot-bindings)
- [Importing Polyglot Bindings \(Syntax\)](#importing-polyglot-bindings-syntax)
- [Using Polyglot Bindings \(Syntax\)](#using-polyglot-bindings-syntax)
- [Finding Polyglot Bindings](#finding-polyglot-bindings)
- [Embedded Syntax](#embedded-syntax)
- [Embedded Syntax Usage \(Syntax\)](#embedded-syntax-usage-syntax)
<!-- /MarkdownTOC -->
## Impedance Mismatch
Polyglot interoperation in Enso has a significant impedance mismatch. In
essence, this means that there is a mismatch between Enso's language semantics
and the semantics of the foreign languages that are being worked with.
While some of this mismatch can be worked around by manually wrapping the
foreign constructs in Enso, there are still concepts that can't easily be
represented by Enso.
> The actionables for this section are:
>
> - Expand on the impedance mismatch and how it leads to the defined semantics.
## The Polyglot FFI
The low-level polyglot FFI mechanism refers to a way to use polyglot objects
directly in Enso code. This can be used to underlie a library implementaion in
Enso, or to interoperate with code running in other languages.
The mechanism provides users with the facilities to import bindings from other
languages and call them via a generic mechanism.
### Importing Polyglot Bindings
When importing a polyglot binding into scope in an Enso file, this introduces a
_polyglot object_ into scope. This object will have appropriate fields and/or
methods defined on it, as described by the foreign language implementation.
> The actionables for this section are:
>
> - Expand greatly on the detail of this as the semantics of the imports become
> clearer.
### Using Polyglot Bindings
With a polyglot object in scope, the user is free to call methods on it
directly. These polyglot objects are inherently dynamically typed, meaning that
any operation may _fail_ at runtime.
Enso implements a generic variadic syntax for calling polyglot functions using
vectors of arguments. In essence, this is necessary due to the significant
impedance mismatch between Enso's runtime semantics (let alone the type system)
and the runtime semantics of many of the polyglot languages.
We went the way of the variadic call for multiple reasons:
- It allows us to match up with a wide range of language semantics (such as
subtyping and overloading).
- It is flexible and easy to expand in the future.
- We can easily build a more Enso-feeling interface on top of it.
- It can still be typed due to our plans for dependent vector types.
By way of illustrative example, Java supports method overloading and subtyping,
two things which have no real equivalent in the Enso type system.
> The actionables for this section are:
>
> - Expand greatly on the runtime semantics of working with polyglot bindings.
> - Determine how to make the inherent 'failability' of polyglot objects safer.
### Importing Polyglot Bindings (Syntax)
Polyglot bindings can be imported using a polyglot import directive. This is
constructed as follows:
- The `polyglot` keyword
- A language identifier (e.g. `java`).
- The keyword `import`.
- Optionally (where the language supports it), an identifier for the type of
language entity being imported (e.g. `struct` for `c`).
- A path that uniquely identifies the polyglot object to import.
- Optionally, the keyword `as`, followed by a new name.
For example:
```ruby
polyglot java import org.example.MyClass as MyClassJava
polyglot c import struct NetworkPacket as NetworkPacketC
```
### Using Polyglot Bindings (Syntax)
A polyglot binding is a polyglot object that has methods and/or fields defined
on it. Due to an impedance mismatch between languages, Enso implements a
variadic syntax for calling these polyglot bindings using vectors.
In essence, we have a primitive function as follows:
```ruby
Polyglot.method : Polyglot.Object -> [Any] -> Any
```
It works as follows:
- It is a method called `method` defined on the `Polyglot` type. The name
`method` is, however, a stand-in for the name of the method in question.
- It takes an object instance of the polyglot object.
- It takes a vector of arguments (and is hence variadic).
- And it returns some value.
By way of example, the following is a valid usage of a polyglot binding:
```ruby
polyglot java import com.example.MyClass as MyClassJava
main =
x = MyClassJava.foo [1, 2, 3] # a static method
inst = MyClassJava.new [a, b, c] # a constructor
bar = inst.metod [x, y] # a static method
```
### Finding Polyglot Bindings
Polyglot objects for various languages are found in the `polyglot` subdirectory
of an Enso project. This folder is subdivided into directories based on the
polyglot language. The name of each subdirectory must match the language
identifier used in the source code.
Inside each directory is an implementation-defined structure, with the polyglot
implementation for that particular language needing to specify it. Please see
the language-specific documentation for details.
## Embedded Syntax
The term "Embedded Syntax" is our terminology for the ability to use foreign
language syntaxes from directly inside `.enso` files. This system builds upon
the more generic mechanisms used by the [polyglot FFI](#the-polyglot-ffi) to
provide a truly seamless user experience.
### Embedded Syntax Usage (Syntax)
A polyglot block is introduced as follows:
- The `polyglot` keyword starts a block.
- This must be followed by a language identifier (e.g. `java`).
- After the language identifier, the remaining syntax behaves like it is an
Enso function definition until the `=`.
- After the `=`, the user may write their foreign code.
```ruby
polyglot python concat a b =
def concat(a, b):
str(a) + str(b)
```
In the above example, this defines a function `concat` that takes two arguments
`a` and `b`, implemented in Python.
> The actionables for this section are:
>
> - Greatly flesh out the syntax for the high-level polyglot functionality.

View File

@ -1,3 +1,11 @@
---
layout: developer-doc
title: Typing the Polyglot Bindings
category: polyglot
tags: [polyglot, types]
order: 2
---
# Typing the Polyglot Bindings
The polyglot bindings inherently provide a problem for the Enso type system.
When many of the languages with which we can interoperate are highly dynamic and

View File

@ -1,3 +1,11 @@
---
layout: section-summary
title: Enso Runtime
category: runtime
tags: [runtime, readme]
order: 0
---
# Enso Runtime
The Enso runtime refers to both the compiler and the optimising JIT runtime for
Enso. While this might seem like a strange choice, it makes sense internally as
@ -17,5 +25,18 @@ encompasses the following functionality:
- **Introspection Hooks:** Providing hooks into the running code to allow the
language server to inspect information about the code as it runs.
This folder contains all of the documentation related to the runtime.
This folder contains all of the documentation related to the runtime, which is
broken up as follows:
- [**Caching:**](./caching.md) A description of the runtime's value caching
mechanism.
- [**Demand Analysis:**](./demand-analysis.md) A specification for the demand
analysis process in the Enso compiler that assists users with working with
suspended computations.
- [**Function Calling Flow:**](./function-call-flow.md) A description of the
involved logic that goes into a calling a function performantly in the Enso
runtime, while also supporting the flexible function syntax.
- [**Runtime Features:**](./runtime-features.md) A description of (and plan for)
the features of the Enso runtime.
- [**Unbounded Recursion:**](./unbounded-recursion.md) An exploration of
techniques for achieving unbounded recursion on the JVM.

348
docs/runtime/caching.md Normal file
View File

@ -0,0 +1,348 @@
---
layout: developer-doc
title: Caching
category: runtime
tags: [runtime, caching, execution]
order: 1
---
# Caching
It is not uncommon for users in data-analysis jobs to work with data on the
order of _gigabytes_ or even _terabytes_. As fast as computers have become, and
as efficient as programming languages can be, you still don't want to compute on
such large amounts of data unless you absolutely have to.
This wouldn't usually be an issue, with such data-analysis tasks being able to
run in a 'batch mode', where the user starts their job in a fire-and-forget
fashion. Enso, however, is a highly _interactive_ environment for working with
data, where waiting _seconds_, let alone _hours_, would severely hamper the user
experience.
Given that Enso is a highly interactive language and platform, we want to take
every measure to ensure that we provide a highly responsive experience to our
users. To that end, one of the key tenets of the new runtime's featureset for
aiding in this is the inclusion of a _caching_ mechanism.
Caching, in this case, refers to the runtime's ability to 'remember' the values
computed in the currently observed scopes. In combination with the data
dependency analysis performed by the compiler, this allows the runtime to
recompute the _minimal_ set of expressions when the user makes a change, rather
than having to recompute the entire program.
<!-- MarkdownTOC levels="2,3" autolink="true" -->
- [Cache Candidates](#cache-candidates)
- [Partial-Evaluation and Side-Effects](#partial-evaluation-and-side-effects)
- [Side Effects in the Initial Version](#side-effects-in-the-initial-version)
- [In The Future](#in-the-future)
- [Cache Eviction Strategies](#cache-eviction-strategies)
- [Initial Eviction Strategies](#initial-eviction-strategies)
- [Future Eviction Strategies](#future-eviction-strategies)
- [Dataflow Analysis](#dataflow-analysis)
- [Identifying Expressions](#identifying-expressions)
- [Specifying Dataflow](#specifying-dataflow)
- [Cache Eviction Strategy](#cache-eviction-strategy)
<!-- /MarkdownTOC -->
## Cache Candidates
The key use of the Enso value cache is to support the interactive editing of
user code. This means that it caches all bindings within the scope of a given
function, including the function arguments. This means that, as users edit their
code, we can ensure that the minimal amount of their program is recomputed.
Consider the following example:
```ruby
foo a b =
c = a.frob b
d = c.wibble b
a.quux d
```
The cache is active for the _currently visible scope_ in Enso Studio, so when a
user enters the function `foo`, the cache stores the intermediate results in
this function (in this case `c` and `d`), as well as the inputs to the function
(in this case `a`, and `b`).
All intermediate results and inputs are considered as candidates, though as the
cache design evolves, the _selected_ candidates may be refined. Once the values
for each candidate are stored, changes to the code will re-use these cached
values as much as possible.
## Partial-Evaluation and Side-Effects
The more theoretically-minded people among those reading this document may
instantly realise that there is a _problem_ with this approach. In the presence
of caching, it becomes _entirely_ unpredictable as to when side effects are
executed. This is problematic in that side-effecting computations are rarely
idempotent, and problems might be caused by executing them over and over again.
Furthermore, the nature of the interpreter's support for entering functions
inherently requires that it recompute portions of that function in a different
context, thereby potentially re-evaluating side-effecting computations as well.
In general, it is clear that many kinds of side effect have _problems_ in the
presence of caching and partial-evaluation.
### Side Effects in the Initial Version
Many of the mechanisms required to deal with this kind of issue properly are
complex and require deep type-level support in the compiler. To that end, the
initial version of the interpreter is going to pretend that the problem doesn't
really exist.
- All intermediate values will be cached.
- Cached values will be recomputed as necessary as described in the section on
[initial eviction strategies](#initial-eviction-strategies).
This can and _will_ recompute side-effecting computations indiscriminately, but
we cannot initially do much better.
#### A Stopgap
While the compiler won't have the machinery in place to properly track
information about side-effects, we can implement a stop-gap solution that at
least allows the GUI to allow users to make the decision about whether or not to
recompute a side-effecting value. This is very similar to the initial approach
used for functions with arguments marked as `Suspended`, and works as follows:
- We provide explicit signatures (containing `IO`) for functions that perform
side-effects.
- Whenever the runtime wants to recompute a value that performs side-effects, it
can use this information to ask for user input.
- We can also display a box on these types `always_reevaluate` that lets users
opt in to automatic re-evaluation of these values.
### In The Future
As the compiler evolves, however, we can do better than this. In particular, we
can employ type-system information to determine which functions are
side-effecting (in absence of annotations), and to class some kinds of said
functions as safe for either caching, re-evaluation, or both. What follows is a
brief sketch of how this might work:
- Rather than having a single type capturing side effects (like `IO` in Haskell)
we divide the type up into fine-grained descriptions of side-effects that let
us better describe the particular behaviours of given functions (e.g.
`IO.Read`, `IO.Write`), all of which are more-specific versions of the base
`IO` type.
- We provide a set of interfaces that determine whether a given kind of side
effect can be safely cached or re-evaluated (e.g. `No_Cache` or
`No_Reevaluate`).
- We can use this information to ask the user about recomputation in far less
situations.
> The actionables for this section are:
>
> - Evolve the strategy for handling side effects as the compiler provides more
> capabilities that will be useful in doing so.
## Cache Eviction Strategies
The cache eviction strategy refers to the method by which we determine which
entries in the cache are invalidated (if any) after a given change to the code.
### Initial Eviction Strategies
In the initial version of the caching mechanism, the eviction strategies are
intended to be fairly simplistic and conservative to ensure correctness.
- The compiler performs data-dependency analysis between expressions.
- If an expression is changed, all cached values for expressions that depend on
it are evicted from the cache.
- These evicted values must be computed.
### Future Eviction Strategies
In the future, however, the increasing sophistication of the front-end compiler
for Enso will allow us to do better than this by accounting for more granular
information in the eviction decisions.
> The actionables for this section are:
>
> - Evolve the cache eviction strategy by employing more granular information
> as the compiler evolves to provide it.
## Dataflow Analysis
Dataflow analysis is the process by which the compiler discovers the
relationships between program expressions. The output of the process is a data
dependency graph that can be queried for an expression, and returns the set of
all expressions that depended on that expression.
Internally we represent this as a directed graph:
- An edge from `a` to `b` indicates that the expression `a` is depended on by
the expression `b`.
- These dependencies are _direct_ dependencies on `a`.
- We reconstruct transitive dependencies from this graph.
An expression `a` can be any Enso expression, including definitions of dynamic
symbols. Given that dynamic symbols need not be in scope, care has to be taken
with registering them properly.
Each expression in the compiler IR is annotated with both the set of expressions
that depend on it, and the set of expressions that it depends on.
### Identifying Expressions
Expressions are identified, for the purposes of dataflow analysis, by unique
identifiers on every IR node. The dataflow analysis process creates a dependency
graph between these identifiers.
However, at runtime, the IDE uses a different and separate concept of
identifiers. Translating between these external identifiers and the internal
identifiers is left to the runtime and is not the responsibility of the dataflow
analysis pass.
### Specifying Dataflow
Dataflow analysis takes place on the core set of language constructs, defined
as those that extend `IRKind.Primitive`. Their dataflow is specified as follows,
with arrows representing edges in the graph.
#### Atom
An atom is dependent on the definitions of its arguments, particularly with
regard to any defaults.
```
atom <- arguments
```
#### Method
A method is dependent on the definition of its body. Methods at the point that
dataflow analysis runs are 'bare' methods, meaning that they are defined as
functions.
```
method <- body
```
#### Block
The value of a block is dependent purely on the value of its return expression.
The return expression may depend on other values.
```
block <- returnValue
```
#### Binding
The value of a binding is dependent both on the name of the binding and the
expression being assigned in the binding.
```
binding <- name
binding <- expression
```
#### Lambda
The value of a lambda is dependent on the return value from its body, as well as
the definitions of any defaults for its arguments.
```
lambda <- body
lambda <- argumentDefaults
```
#### Definition Argument
The value of a function definition argument is dependent purely on the value of
its default, if that default is present.
```
defArgument <- defaultExpression
```
#### Prefix Application
The value of a prefix application is dependent on the values of both the
function expression being called, and the arguments.
```
prefix <- function
prefix <- arguments
```
#### Call Argument
The value of a call argument is dependent both on the value that it's wrapping,
as well as the name it has, if it exists.
```
callArgument <- argumentValue
callArgument <- argumentName
```
#### Forced Term
A forced term is purely dependent on the value of the term being forced (the
`target`).
```
force <- target
```
#### Typeset Members
A typeset member is dependent on the definition of its label, as well as the
possibly present definitions of its type and value.
```
typesetMember <- label
typesetMember <- memberType
typesetMember <- memberValue
```
#### Typing Operators
All typing operators in Enso (`IR.Type`) are dependent on their constituent
parts:
```
typingExpr <- expressionChildren
```
#### Name
An occurrence of a name is dependent on the definition site of that name. This
means that it is broken down into two options:
1. **Static Dependency:** The definition site for a given usage can be
statically resolved.
```
name <- staticUseSite
```
2. **Dynamic Dependency:** The definition site for a given usage can only be
determined to be a symbol resolved dynamically.
```
name <- dynamicSymbol
```
Under these circumstances, if any definition for `dynamicSymbol` changes,
then _all_ usages of that symbol must be invalidated, whether or not they
used the changed definition in particular.
#### Case Expressions
The value of a case expression depends on the value of its scrutinee, as well
as the definitions of its branches.
```
case <- scrutinee
case <- branches
case <- fallback
```
#### Case Branches
The value of a case branch depends on both the pattern expression and the result
expression.
```
caseBranch <- casePattern
caseBranch <- expression
```
#### Comments
The value of a comment is purely dependent on the value of the commented entity.
```
comment <- commented
```
## Cache Eviction Strategy
The cache eviction strategy refers to the process by which, for a given change,
we decide which elements should be evicted from the cache. In the current form,
the following rules are applied when an expression identified by some key `k` is
changed:
1. All expressions that depend on the result of `k` are evicted from the cache.
2. If `k` is a dynamic symbol, all expressions that depend on _any instance_ of
the dynamic symbol are evicted from the cache.
Expressions that have been evicted from the cache subsequently have to be
recomputed by the runtime.

View File

@ -1,3 +1,11 @@
---
layout: developer-doc
title: Demand Analysis
category: runtime
tags: [runtime, demand-analysis, execution]
order: 2
---
# Demand Analysis
Demand analysis is the process of deciding when to compute the value of a
suspended computation in a language which supports suspended computation.

View File

@ -1,19 +1,27 @@
# Function Call Logic in the Enso Interpreter
---
layout: developer-doc
title: Function Calling Flow
category: runtime
tags: [runtime, execution, interpreter]
order: 3
---
# Function Calling Flow
With Enso being a functional language, it is crucial for function calls to be
efficient and easily inlined.
At the same time, the logic of calling functions is involved there's
a plethora of features this system needs to support, making the functions the
most involved and somewhat convoluted part of the interpreter's design.
The following is an overview of the logic, outlining the most important
features supported by this system and a diagram of how they fit into the
Truffle framework.
efficient and easily inlined. At the same time, the logic of calling functions
is involved there's a plethora of features this system needs to support,
making the functions the most involved and somewhat convoluted part of the
interpreter's design.
The following is an overview of the logic, outlining the most important features
supported by this system and a diagram of how they fit into the Truffle
framework.
<!-- MarkdownTOC levels="2,3" autolink="true" -->
- [Tail Call Optimization](#tail-call-optimization)
- [Named Application Arguments](#named-application-arguments)
- [Definition-Site Arguments Laziness](#definition-site-arguments-laziness)
- [Definition-Site Argument Suspension](#definition-site-argument-suspension)
- [Currying and Eta-Expansion](#currying-and-eta-expansion)
- [Dynamic Dispatch](#dynamic-dispatch)
- [Defaulted Arguments and Application](#defaulted-arguments-and-application)
@ -22,30 +30,37 @@ Truffle framework.
<!-- /MarkdownTOC -->
## Tail Call Optimization
It is very important for Enso to perform TCO well, since there are no intrinsic
looping constructs in the language.
The idea behind Enso's TCO is simple, even if the implementation is confusing.
Whenever we call a possibly-tail-recursive function in a tail position, a tail
call exception containing the function and its arguments is thrown.
This exception is then caught in a loop, effectively translating recursion into
a loop. With the use of Truffle's `ControlFlowException`, this code is
optimized like a builtin language loop construct.
call exception containing the function and its arguments is thrown. This
exception is then caught in a loop, effectively translating recursion into a
loop. With the use of Truffle's `ControlFlowException`, this code is optimized
like a builtin language loop construct.
In pseudocode, a tail recursive function, like:
```
```ruby
foo x = if x == 0 then print "foo" else foo (x - 1)
```
becomes:
```
```ruby
foo x = if x == 0 then Y else throw (TailCallException foo [x-1])
```
Then, any non-tail call site like:
```
```ruby
z = foo 1000000
```
is translated into
```
```ruby
z = null
_fun = foo
_args = [1000000]
@ -57,15 +72,17 @@ while true
_fun = e.function
_args = e.args
```
This logic is encapsulated in the subclasses of `CallOptimiserNode`.
This logic is encapsulated in the various subclasses of `CallOptimiserNode`.
## Named Application Arguments
Enso allows applying function arguments by name, e.g.
```
```ruby
pow base exp = x ** y
z = foo exp=10 base=3
```
While certainly useful for the user, it requires some non-trivial facilitation
in the interpreter. An easy solution, would be to simply pass arguments in
a dictionary-like structure (e.g. a HashMap), but that has unacceptable
@ -80,21 +97,21 @@ Based on the knowledge of the call-site schema (i.e. a description like
the third is applied positionally") and the definition-site schema ("the
function has 3 parameters, named [baz, foo, bar]"), a mapping ("move the first
argument to the second position, the second becomes third and the third becomes
the first") is computed, that is then memoized and used to efficiently reorder
arguments on each execution, without the need to employ any more involved data
structures. A standard Truffle Polymorphic Inline Cache is used to store these
mappings, therefore it may be subject to the standard problems storing rarely
accessed paths in the cache, as well as cache misses for highly polymorphic
call sites.
the first") is computed.
This mapping is then memoized and used to efficiently reorder arguments on each
execution, without the need to employ any more involved data structures. A
standard Truffle Polymorphic Inline Cache is used to store these mappings,
therefore it may be subject to the standard problems storing rarely accessed
paths in the cache, as well as cache misses for highly polymorphic call sites.
This logic is encapsulated in the `ArgumentSorterNode`.
## Definition-Site Arguments Laziness
Enso allows functions to define certain arguments as lazy, so that when these
are passed to a function, the corresponding expressions are not evaluated at
call site, but rather passed to the function as closures and evaluated at the
function's discretion.
## Definition-Site Argument Suspension
Enso allows functions to define certain arguments as `Suspended`, so that when
these are passed to a function, the corresponding expressions are not evaluated
at the call side, but are instead passed to the function as closures for
evaluation at the function's discretion.
Therefore, all application arguments are actually treated as thunks and only
evaluated at call-site when the function signature defines them as eager.
@ -102,7 +119,6 @@ evaluated at call-site when the function signature defines them as eager.
Argument execution is happening inside the `ArgumentSorterNode`.
## Currying and Eta-Expansion
Functions can also be applied partially (i.e. with less arguments than required
by the signature, in which case the result is a function with certain arguments
fixed) or over-saturated (i.e. if a function returns another function, in which
@ -112,7 +128,6 @@ expression).
This logic is handled inside the `CurryNode`.
## Dynamic Dispatch
Functions can be dispatched dynamically, meaning a name can denote different
functions, based on the (runtime) type of the first argument.
@ -132,9 +147,8 @@ function is a function, and if it is fully saturated with defaults it will call
it.
## Flow Diagram
The following diagram summarizes all the nodes participating in a function
call. The entry points to this system are `ApplicationNode` (for in-language
function calls) and `InteropLibrary<Function>` (for polyglot function calls).
![diagram](function-call-diagram.png)
![diagram](resources/function-call-diagram.png)

View File

Before

Width:  |  Height:  |  Size: 363 KiB

After

Width:  |  Height:  |  Size: 363 KiB

View File

@ -1,4 +1,12 @@
# Enso Runtime
---
layout: developer-doc
title: Runtime Features
category: runtime
tags: [runtime, design]
order: 4
---
# Runtime Features
This document contains a detailed specification of Enso's runtime. It includes
a description of the technologies on which it is built, as well as the features
and functionality that it is required to support. In addition, the document aims
@ -261,7 +269,7 @@ used by Enso programs.
The typechecker is the portion of the runtime that handles the type-inference
and type-checking of Enso code. This is a sophisticated piece of machinery, with
the primary theory under which it operates being described in the specification
of [the type-system](../../types/design/types.md).
of [the type system](../types/README.md).
<!-- TODO
- A description of the typechecker's architecture as graph transformations.

View File

@ -1,3 +1,11 @@
---
layout: developer-doc
title: Unbounded Recursion
category: runtime
tags: [runtime, recursion, execution]
order: 5
---
# Unbounded Recursion
The JVM (and hence, GraalVM) do not have support for segmented stacks, and hence
do not allow for computation of unbounded recursion - if you make too many

30
docs/semantics/README.md Normal file
View File

@ -0,0 +1,30 @@
---
layout: section-summary
title: Enso's Semantics
category: semantics
tags: [semantics, readme]
order: 0
---
# Enso's Semantics
Much like we have specifications for the [syntax](../syntax/README.md) and the
[type system](../types/README.md) in Enso, we also need a specification of the
language semantics. These documents specify the executable semantics of the
Enso language.
> The actionables for this section are:
>
> - As we make more semantic determinations about the language these should be
> written down here.
This specification is broken down into the following sections:
- [**Bindings:**](./bindings.md) The semantics of Enso's binding expressions.
- [**Diagnostics:**](./diagnostics.md) The semantics of Enso's executable
compiler diagnostics.
- [**Dispatch:**](./dispatch.md) The semantics of Enso's dispatch system.
- [**Errors:**](./errors.md) The semantics of Enso's various error systems.
- [**Evaluation:**](./evaluation.md) Enso's evaluation semantics, including
those of suspended computations.
- [**Modules:**](./modules.md) The semantics of Enso's modules.
- [**Scoping:**](./scoping.md) Enso's scoping and identifier resolution rules.

View File

@ -0,0 +1,33 @@
---
layout: developer-doc
title: Bindings
category: semantics
tags: [semantics, bindings]
order: 1
---
# Bindings
A "binding" is a portion of and Enso program that creates a new name and binds
a value to that name.
<!-- MarkdownTOC levels="2,3" autolink="true" -->
- [Binding Return Value](#binding-return-value)
<!-- /MarkdownTOC -->
## Binding Return Value
While some expression-based languages with bindings have the binding return the
value assigned to the binding, we feel that this is far too error prone.
Consider the following code as a demonstration:
```ruby
if x = someExprEvaluatingToBool then foo else bar
```
This is the perennially-discussed C++ bug where you fail to type `==` in an
if-statement.
Enso, instead, takes the approach where a binding expression returns the
singleton value of the type `Nothing`, making the above-written code a type
error.

View File

@ -0,0 +1,38 @@
---
layout: developer-doc
title: Diagnostics
category: semantics
tags: [semantics, diagnostics, runtime]
order: 2
---
# Diagnostics
Due to the highly interactive and always-online nature of a running Enso
program, it is very important that compile diagnostics (such as lints, errors,
and warnings) are surfaced in the _running_ language. This is particularly
important for visual mode.
To this end, Enso provides mechanisms by which diagnostics are scoped to the
smallest possible program component, and are even translated into executable
nodes in the interpreter so that they can be interacted with at runtime.
<!-- MarkdownTOC levels="2,3" autolink="true" -->
- [Warnings at Runtime](#warnings-at-runtime)
- [Errors at Runtime](#errors-at-runtime)
<!-- /MarkdownTOC -->
## Warnings at Runtime
> The actionables for this section are:
>
> - Specify how warnings behave at runtime.
## Errors at Runtime
> The actionables for this section are:
>
> - Specify how errors behave at runtime.
> - Talk about their limited scope and how they can be interacted with at
> runtime.

View File

@ -0,0 +1,31 @@
---
layout: developer-doc
title: Dispatch
category: semantics
tags: [semantics, dispatch, runtime]
order: 3
---
# Dispatch
Enso has support for functions that are both statically and dynamically
dispatched. We call the first ones _functions_, while the latter are referred to
as _methods_.
<!-- MarkdownTOC levels="2,3" autolink="true" -->
- [Static Dispatch](#static-dispatch)
- [Dynamic Dispatch](#dynamic-dispatch)
<!-- /MarkdownTOC -->
## Static Dispatch
> The actionables for this section are:
>
> - Specify the semantics of static dispatch in Enso.
## Dynamic Dispatch
> The actionables for this section are:
>
> - Specify the semantics of dynamic dispatch in Enso.

30
docs/semantics/errors.md Normal file
View File

@ -0,0 +1,30 @@
---
layout: developer-doc
title: Errors
category: semantics
tags: [semantics, errors, runtime]
order: 4
---
# Errors
Due to its hybrid nature, Enso supports two main exception systems that help
users to deal with errors at runtime in the language.
<!-- MarkdownTOC levels="2,3" autolink="true" -->
- [Asynchronous Exceptions](#asynchronous-exceptions)
- [Broken Values](#broken-values)
<!-- /MarkdownTOC -->
## Asynchronous Exceptions
> The actionables for this section are:
>
> - Specify the semantics of Enso's async exceptions.
## Broken Values
> The actionables for this section are:
>
> - Specify the semantics of Enso's broken values.

View File

@ -0,0 +1,57 @@
---
layout: developer-doc
title: Evaluation Semantics in Enso
category: semantics
tags: [semantics, evaluation, suspension]
order: 5
---
# Evaluation Semantics in Enso
Enso's evaluation semantics can be succinctly described as 'strict, but with
optional laziness'. By default, expressions in Enso are evaluated strictly, but
the programmer may choose to 'suspend' computations, and instead evaluate them
at the point they are needed.
<!-- MarkdownTOC levels="2,3" autolink="true" -->
- [Strict Evaluation](#strict-evaluation)
- [Optional Suspension](#optional-suspension)
<!-- /MarkdownTOC -->
## Strict Evaluation
Though Enso shares many syntactic similarities with Haskell, the most famous
example of a lazily evaluated language, Enso is not lazy. Instead, Enso is a
language that is strict.
- Statements in Enso are evaluated as soon as they are bound to a name.
- This means that arguments to a function are always evaluated before the
function is applied.
- Statements are _only_ evaluated when they contain fully-applied function
applications. Otherwise they return curried functions.
> The actionables for this section are:
>
> - Make this far better specified.
## Optional Suspension
Laziness, however, can often be quite useful for defining certain kinds of API.
To that end, Enso provides support for optional laziness, more specifically
optional _suspension_, through the built-in `Suspended` type.
- When a type `a` is wrapped in a `Suspended`, it is turned into a thunk.
- A value of type `Suspended a` may be forced to execute the suspended
computation and thereby obtain an `a`.
This forcing can take place in two ways:
- The user calls the standard library function `force : Suspended a -> a` on the
value.
- Automatically, at a site where its evaluation is demanded. The algorithm for
this is simple. If a value of type `Suspended a` is provided in a location
that expects a value of type `a`, the compiler will insert an implicit call to
`force` to produce the `a`.
> The actionables for this section are:
>
> - Make this far better specified.

29
docs/semantics/modules.md Normal file
View File

@ -0,0 +1,29 @@
---
layout: developer-doc
title: Modules
category: semantics
tags: [semantics, modules, runtime]
order: 6
---
# Modules
Enso has a rudimentary module system in order to aid code modularity.
<!-- MarkdownTOC levels="2,3" autolink="true" -->
- [Enso Modules](#enso-modules)
- [Future Directions](#future-directions)
<!-- /MarkdownTOC -->
## Enso Modules
> The actionables for this section are:
>
> - Specify the current state of the Enso module system.
## Future Directions
> The actionables for this section are:
>
> - Specify what we'd like to do with modules in the future.

View File

@ -1,34 +1,29 @@
# Enso: The Semantics
Much like we have specifications for the
[syntax](../../syntax/specification/syntax.md) and the
[type system](../../types/design/types.md), we also need a document where we can
specify the executable semantics of portions of Enso.
---
layout: developer-doc
title: Scoping Rules
category: semantics
tags: [semantics, scoping]
order: 7
---
This is that document, and it contains descriptions of key semantic portions of
Enso.
> The actionables for this section are:
>
> - As we make more semantic determinations about the language these should be
> written down here.
<!-- MarkdownTOC levels="2,3" autolink="true" -->
- [Scoping](#scoping)
- [Scoping Rules](#scoping-rules)
- [Scoping of Type Signatures](#scoping-of-type-signatures)
- [Strict Evaluation](#strict-evaluation)
- [Optional Suspension](#optional-suspension)
- [Bindings](#bindings)
<!-- /MarkdownTOC -->
## Scoping
# Scoping Rules
Enso's scoping rules should be fairly familiar to those coming from other
languages that are immutable (or make heavy use of immutability). In essence,
Enso is a lexically-scoped language where bindings may be shadowed in child
scopes.
<!-- MarkdownTOC levels="2,3" autolink="true" -->
- [Scopes](#scopes)
- [Introducing New Scopes](#introducing-new-scopes)
- [Scoping of Type Signatures](#scoping-of-type-signatures)
- [Implementation Notes](#implementation-notes)
- [Function Call Arguments](#function-call-arguments)
- [Collapsing Scopes](#collapsing-scopes)
<!-- /MarkdownTOC -->
## Scopes
A scope is the span in the code within which a set of accessible identifiers
occurs. A nested scope may:
@ -58,7 +53,7 @@ valid entity," and hence implies "can have its value used."
> - Once we are capable of supporting `fix` and recursive pure bindings in
> contexts, we need to revisit the above rules.
### Scoping Rules
## Introducing New Scopes
The following constructs introduce new scopes in Enso:
- **Modules:** Each module (file) introduces a new scope.
@ -111,55 +106,35 @@ Currently, type signatures in Enso obey a simple set of typing rules:
> - Do we actually want to support this?
> - What complexities does this introduce wrt typechecking?
## Strict Evaluation
Though Enso shares many syntactic similarities with Haskell, the most famous
example of a lazily evaluated language, Enso is not lazy. Instead, Enso is a
language that is strict.
## Implementation Notes
This section contains notes on the implementation of the Enso scoping rules in
the interpreter.
- Statements in Enso are evaluated as soon as they are bound to a name.
- This means that arguments to a function are always evaluated before the
function is applied.
- Statements are _only_ evaluated when they contain fully-applied function
applications. Otherwise they return curried functions.
### Function Call Arguments
In order to support suspended function arguments in the interpreter in a
performant way, we implicitly wrap _all_ function arguments in a suspension. In
conjunction with making the function itself responsible for when its arguments
are evaluated, this lets us have incredibly performant suspended computations in
Enso.
> The actionables for this section are:
>
> - Make this far better specified.
However, it _does_ require creating a small hack in the Alias Analysis process:
### Optional Suspension
Laziness, however, can often be quite useful for defining certain kinds of API.
To that end, Enso provides support for optional laziness, more specifically
optional _suspension_, through the built-in `Suspended` type.
- In order for an expression to be a suspension, it must occur in its own scope
(the suspended scope).
- Alias analysis must account for this, otherwise the code generator will get
frame accesses incorrect.
- When a type `a` is wrapped in a `Suspended`, it is turned into a thunk.
- A value of type `Suspended a` may be forced to execute the suspended
computation and thereby obtain an `a`.
To this end, we account for this implementation detail in alias analysis.
This forcing can take place in two ways:
### Collapsing Scopes
Another quirk of the internal alias analysis process is down to the fact that
the Enso IR represents Methods, functions, and blocks as separate constructs.
This means that if you had a method containing a function containing a block, a
naive implementation of alias analysis would allocate three scopes here.
- The user calls the standard library function `force : Suspended a -> a` on the
value.
- Automatically, at a site where its evaluation is demanded. The algorithm for
this is simple. If a value of type `Suspended a` is provided in a location
that expects a value of type `a`, the compiler will insert an implicit call to
`force` to produce the `a`.
This is incorrect, according to the semantic specification of the language, so
the alias analysis process needs to handle the collapsing of these scopes as it
allocates them. The rules are as follows:
> The actionables for this section are:
>
> - Make this far better specified.
## Bindings
While some expression-based languages with bindings have the binding return the
value assigned to the binding, we feel that this is far too error prone.
Consider the following code as a demonstration:
```ruby
if x = someExprEvaluatingToBool then foo else bar
```
This is the perennially-discussed C++ bug where you fail to type `==` in an
if-statement.
Enso, instead, takes the approach where a binding expression returns the
singleton value of the type `Nothing`, making the above-written code a type
error.
- If you have a method whose body is a function, they share a scope.
- If you have a function whose body is a block, they share a scope.

View File

@ -0,0 +1,16 @@
---
layout: section-summary
title: Enso Developer Style Guides
category: style-guide
tags: [style-guide, readme]
order: 0
---
# Enso Developer Style Guides
This folder contains the style guides for developers working on the various
languages that make up the Enso project.
- [Scala](./scala.md)
- [Java](./java.md)
- [Rust](./rust.md)
- [Haskell](./haskell.md)

View File

@ -1,3 +1,11 @@
---
layout: style-guide
title: Haskell Style Guide
category: style-guide
tags: [style-guide]
order: 4
---
# Haskell Style Guide
Like many style guides, this Haskell style guide exists for two primary reasons.
The first is to provide guidelines that result in a consistent code style across

View File

@ -1,3 +1,11 @@
---
layout: style-guide
title: Java Style Guide
category: style-guide
tags: [style-guide]
order: 2
---
# Java Style Guide
Like many style guides, this Java style guide exists for two primary reasons.
The first is to provide guidelines that result in a consistent code style across

View File

@ -1,3 +1,11 @@
---
layout: style-guide
title: Rust Style Guide
category: style-guide
tags: [style-guide]
order: 3
---
# Rust Style Guide
Like many style guides, this Rust style guide exists for two primary reasons.
The first is to provide guidelines that result in a consistent code style across
@ -19,6 +27,7 @@ programmer burden; there is usually only _one way_ to lay out code correctly.
- [Spacing](#spacing)
- [Impl Definitions](#impl-definitions)
- [Getters and Setters](#getters-and-setters)
- [Trait Exports](#trait-exports)
- [Naming](#naming)
- [Package Structure and Naming](#package-structure-and-naming)
- [The Public API](#the-public-api)

View File

@ -1,3 +1,11 @@
---
layout: style-guide
title: Scala Style Guide
category: style-guide
tags: [style-guide]
order: 1
---
# Scala Style Guide
Like many style guides, this Scala style guide exists for two primary reasons.
The first is to provide guidelines that result in a consistent code style across

45
docs/syntax/README.md Normal file
View File

@ -0,0 +1,45 @@
---
layout: section-summary
title: Enso's Syntax
category: syntax
tags: [syntax, readme]
order: 0
---
# Enso's Syntax
When working with a programming language, the syntax is the first thing that a
user encounters. This makes it _utterly integral_ to how users experience the
language, and, in the case of Enso, the tool as a whole.
Enso is a truly novel programming language in that it doesn't have _one_ syntax,
but instead has two. These syntaxes are dual: visual and textual. Both are
first-class, and are truly equivalent ways to represent and manipulate the
program. To that end, the design of the language's syntax requires careful
consideration, and this document attempts to explain both the _what_, of Enso's
syntax, but also the _why_.
Furthermore, Enso is novel in the fact that it does not enforce any artificial
restriction between the syntaxes of its type and value levels: they are one and
the same. This enables a staggering level of uniformity when programming in the
language, allowing arbitrary computations on types, because in a
dependently-typed world, they are just values.
The various components of Enso's syntax are described below:
- [**Encoding:**](./encoding.md) The source encoding of Enso files.
- [**Naming:**](./naming.md) The naming of Enso language constructs.
- [**Layout Rules:**](./layout.md) The layout rules for Enso code.
- [**Literals:**](./literals.md) The syntax for Enso literals.
- [**Assignment:**](./assignment.md) The syntax for various forms of assignment
expression in Enso.
- [**Types and Type Signatures:**](./types.md) The syntax for writing types and
type signatures in Enso.
- [**Macros:**](./macros.md) The syntax for writing macros in Enso.
- [**Top-Level Syntax:**](./top-level.md) The syntax at the top-level of an Enso
file.
- [**Functions:**](./functions.md) The syntax for writing functions in Enso.
- [**Function Arguments:**](./function-arguments.ms) The syntax for function
arguments in Enso.
- [**Field Access:**](./projections.md) The syntax for working with fields of
data types in Enso.
- [**Comments:**](./comments.md) The syntax for writing comments in Enso.

140
docs/syntax/assignment.md Normal file
View File

@ -0,0 +1,140 @@
---
layout: developer-doc
title: Assignment Expressions
category: syntax
tags: [syntax, assignment]
order: 5
---
# Assignment Expressions
Assignment syntax in Enso is fairly magical, given that it is the language's
syntax for monadic bind.
<!-- MarkdownTOC levels="2,3" autolink="true" -->
- [How Assignment Works](#how-assignment-works)
- [Function Definitions](#function-definitions)
- [Pattern Match Bindings](#pattern-match-bindings)
- [Extension Methods](#extension-methods)
- [Method Syntax](#method-syntax)
- [Function Syntax](#function-syntax)
- [Top-Level Assignments](#top-level-assignments)
<!-- /MarkdownTOC -->
## How Assignment Works
Assignment in Enso operates as follows:
- An assignment is an _expression_.
- The left-hand-side introduces a pattern context.
- The pattern on the left-hand-side is matched against (unified with) the value
that occurs on its right-hand-side.
- A single line must contain at most one assignment.
- An assignment may only appear as the _root expression_ of a line of code in a
file.
- An assignment returns the value `Nothing`, and does not return the value that
is assigned to it.
The assignment operator has myriad uses, and is used to define variables,
functions, extension methods, and to perform pattern matching. Each different
case will see an appropriate desugaring applied (see below).
Please note that not _all_ occurrences of the `=` operator are assignments in
the general sense. The above rules do not apply when using said operator to
pass arguments by name.
## Function Definitions
If the left hand side of an assignment is syntactically a prefix application
chain, where the left-most name is a _variable_ name, the assignment is
considered to introduce a _function definition_ (the syntax sugared version).
For a prefix chain `a b c = ...`, this operates as follows:
- The name `a` is bound in the enclosing scope, and is called the 'function
name'.
- The names `b` and `c` (the 'function arguments') are converted into nested
lambda arguments in the function body.
In essence, the above example is equivalent to:
```ruby
a = b -> c -> ...
```
Please note that by the rules of naming specified previously, if an operator
occurs in the same position as `a` it will also be defined.
## Pattern Match Bindings
If the left hand side of an assignment is syntactically a prefix application
chain, where the left-most name is a _type_ name, the assignment is considered
to introduce a pattern match binding.
It operates as follows for code consisting of a prefix chain `A b c = expr` and
trailing code `tail...`.
```ruby
A b c = expr
tail...
```
- A case expression is created with scrutinee `expr`.
- The matching names `A`, `b`, and `c` are used in a case expression branch's
pattern. The branch's expression is `tail...`.
- A catch-all branch is created that has expression `error`.
As each branch in a case expression has its own scope, this desugaring means
that the names `b` and `c` are made visible in the scope where the pattern match
binding occurs. This is due to the fact that pattern match branches are lambda
expressions, and reuse the same scoping rules.
This also works for operators in an infix position, where its operands will be
matched against.
## Extension Methods
There are two cases where an assignment creates an extension method:
1. **Method Syntax:** If the left-hand-side of an assignment is syntactically a
prefix application chain where the left-most expression is an infix
application of `.`, this assignment is considered to introduce an extension
method.
2. **Function Syntax:** If the left hand side of an assignment is syntactically
a prefix application chain where the left-most expression is a variable
identifier and the second expression from the left is a variable named `this`
with an explicit type ascription, this is also considered to introduce an
extension method.
### Method Syntax
This syntax for extension methods works as follows:
- The target of the method syntax (left argument to `.`) defines the type on
which the extension method is created.
- An implicit `this` argument is inserted with that type at the start of the
arguments list.
- All arguments are desugared to lambda arguments.
```ruby
My_Type.method_name a b c = ...
```
### Function Syntax
This syntax for extension methods works as follows:
- The `this` argument type is used to define the type on which the extension
method is created.
- `this` and all remaining arguments are desugared to lambda arguments.
```ruby
method_name (this : My_Type) a b c = ...
```
## Top-Level Assignments
In order to aid with disambiguation, any binding made in the root scope without
an explicit target is implicitly defined on a type representing the current
module. For example, a binding `main = ...` is implicitly `here.main = ...`.
This works as follows:
- All non-extension methods defined at the top level are augmented with an
implicit first parameter `here`.
- They are callable by `name` if not ambiguous, but can be disambiguated by
using `here.name` where necessary.

160
docs/syntax/comments.md Normal file
View File

@ -0,0 +1,160 @@
---
layout: developer-doc
title: Comments
category: syntax
tags: [syntax, comments]
order: 12
---
# Comments
Enso supports a variety of types of comments:
- **Disable Comments:** TODO
- **Documentation Comments:** Documentation comments allow users to attach
documentation to language constructs. This documentation can later be rendered
to produce user-accessible HTML documentation, similar to tools included with
most programming languages.
> The actionables for this section are:
>
> - Solidify exactly how each type of comment should behave.
<!-- MarkdownTOC levels="2,3" autolink="true" -->
- [Disable Comments](#disable-comments)
- [Documentation Comments](#documentation-comments)
- [Tags](#tags)
- [Sections](#sections)
- [Links](#links)
- [Lists](#lists)
- [Code](#code)
- [Text Formatting](#text-formatting)
<!-- /MarkdownTOC -->
## Disable Comments
Disable comments are the standard form of comment seen in a programming language
in that they prevent a given piece of code from executing. In Enso, they are
created by prefixing the expression to disable with the `#` character.
These aren't _exactly_ like most language's disable comments however:
- When you disable a line in Enso, it is still run through the parser to see if
it is syntactically valid.
- No identifiers in it are resolved, however.
- This is important as it allows the disabled expression to still be displayed
in the visual syntax.
## Documentation Comments
Documentation comments allow users to attach documentation to Enso language
constructs that can later be displayed in a rich format for users of the API.
Such comments are automatically connected to the language construct, and can be
used both for displaying static documentation as well as providing dynamic help
to the user in Enso Studio itself.
The tool that generates this documentation aims to be fairly robust, and tries
to assign produce sensible results even if the user makes a mistake. Such
mistakes will be highlighted to the user.
The documentation syntax is broken down into the following elements.
### Tags
Tags allow users to annotate their construct with information about its usage
state. The documentation syntax supports the following tags:
- `DEPRECATED`: Used for constructs that should no longer be used and that may
be removed in the future.
- `MODIFIED`: Used for constructs that have had their behaviour change after a
certain version of the library.
- `ADDED`: Used to describe when a given construct was added to the library.
- `UPCOMING`: Used to describe constructs that will be added in future versions
of the library.
- `REMOVED`: Used to describe constructs that have been removed and are no
longer functional.
Tags are added at the _top_ of the documentation block, and may also be
accompanied by a description. This description directly follows the tag
declaration with one space.
```ruby
# DEPRECATED Use `seeFoo` instead
```
If the user provides an unknown tag the documentation will contain that tag, but
it will be undefined.
### Sections
Documentation comments can be broken up into sections, with each section
delineated by significant whitespace.
The first section that the user writes will be attributed to the 'synopsis' part
of the documentation, and the second section becomes the 'body'. They should be
used as follows:
- **Synopsis:** A brief summary of the function's behaviour.
- **Body:** More in-depth documentation where details of usage can be provided.
Sections may also have a title. If the whitespace before the section is _three_
newlines instead of _two_, then the first line of the section will be understood
to be a title.
The body can be broken down into multiple sections, with support for four
different types of section:
- **Raw:** A block of text, delineated purely by two blank lines before it.
- **Important:** A block of text describing important details about the
functionality of the construct. To create an important section, prefix the
title with `!`.
- **Info:** An information section that should be used to provide non-crucial
details about the construct's usage. To create an info section, prefix the
title with `?`.
- **Example:** For providing usage examples to the user. To create an example
section, prefix the title with `>`.
### Links
Users are able to embed links and images into their documentation. These links
can serve to provide access to external resources or demonstrations, and also
link between various program constructs.
- **URLs:** `[Link title](URI)`
- **Images:** `![Image name](URI)`
Linked images are rendered in the generated documentation, and URLs will be
displayed like standard hyperlinks.
> The actionables for this section are:
>
> - We probably want a construct that lets you reference other API constructs.
### Lists
The Enso documentation syntax also supports ordered and unordered lists. These
can be nested, and the nesting may swap the types. Both list types must be
intended some multiple of 2 spaces from the left margin of the documentation
comment.
- **Unordered:** List items are indicated by the `-` character.
- **Ordered:** List items are indicated by the `*` character.
To nest a list inside another list, add another 2-character indent to the nested
list.
### Code
The Enso documentation syntax allows users to write code that will be displayed
as code rather than prose. It supports two types of code.
- **Inline Code:** Text enclosed in `` ` `` will be formatted as inline code.
- **Multi-Line Code:** A block that is indented from the baseline of the current
section will be formatted as a code block.
### Text Formatting
Enso's documentation syntax also supports some basic syntax for adding rich
text formatting to the documentation.
- **Italics:** Enclosing text in `_` (e.g. `_Italics_`).
- **Bold:** Enclosing text in `*` (e.g. `*Bold*`).
- **Strikethrough:** Enclosing text in `~` (e.g. `~Strikethrough~`).
These syntaxes may be combined, and the order of opening need not equal the
order of closing. However, if the formatting syntaxes are not closed, this will
result in an error.

29
docs/syntax/encoding.md Normal file
View File

@ -0,0 +1,29 @@
---
layout: developer-doc
title: Source File Encoding
category: syntax
tags: [syntax, encoding]
order: 1
---
# Source File Encoding
While many modern programming languages are moving in a direction of being
liberal with the input they accept, we find that this often leads to the
resultant code being more difficult to use.
This file describes the source file encoding for Enso files.
<!-- MarkdownTOC levels="2,3" autolink="true" -->
- [Source Encoding](#source-encoding)
- [Indentation](#indentation)
<!-- /MarkdownTOC -->
## Source Encoding
All input source in Enso is UTF-8 encoded.
## Indentation
Indentation in Enso is performed using spaces. An indent level is 4 spaces.
This is intentionally not made configurable.

View File

@ -0,0 +1,139 @@
---
layout: developer-doc
title: Function Arguments
category: syntax
tags: [syntax, functions]
order: 10
---
# Function Arguments
One of the biggest usability innovations of Enso is the set of argument types
that it supports. The combination of named and defaulted arguments with a
curried language creates a tool in which it is very clear to express even
complex APIs.
<!-- MarkdownTOC levels="2,3" autolink="true" -->
- [Positional Arguments](#positional-arguments)
- [Named Arguments](#named-arguments)
- [Defaulted Arguments](#defaulted-arguments)
- [Optional Arguments](#optional-arguments)
- [Splats Arguments \(Variadics\)](#splats-arguments-variadics)
- [Type Applications](#type-applications)
- [Underscore Arguments](#underscore-arguments)
<!-- /MarkdownTOC -->
## Positional Arguments
Much like most programming languages, functions in Enso can be called with their
arguments provided positionally. This is the simple case that everybody is
familiar with.
## Named Arguments
All arguments in Enso are defined with a name. Like all programming languages,
this is necessary for that argument to be used. However, what Enso allows is for
users to then _call_ those arguments by name.
- An argument is called by name using the syntax `(name = value)` (or one may
also take advantage of the operator precedence to write `name=value`).
- Named arguments are applied in the order they are given. This means that if
you positionally apply to an argument `foo` and then try to later apply to it
by name, this will fail due to currying of functions.
- Named arguments _cannot_ be used while using operator syntax. This means that
an expression of the form `a + b` cannot apply arguments by name. However,
when calling the operator as a method (`a.+ b`), the call-by-name syntax may
indeed be used (`a.+ (that = b)`).
This is a great usability boon as in complex APIs it can often be difficult to
remember the order or arguments.
## Defaulted Arguments
Enso also allows users to define their functions with _defaults_ for the
function's arguments. This is very useful for complex APIs as it allows users to
experiment and iterate quickly by only providing the arguments that they want to
customise.
- An argument is defined with a default using the syntax `(name = default_val)`,
which, as above, accounts for precedence rules.
- Argument defaults are applied to the function if no argument value is provided
by position or name for that argument.
- Argument defaults are evaluated lazily if the function is lazy in that
argument.
- We provide a `...` operator which suspends application of the default
arguments for the purposes of currying.
## Optional Arguments
There are certain cases where the type information for an argument may be able
to be inferred by the compiler. This is best explained by example. Consider the
implementation of a `read` function that reads text and outputs a value of a
particular type.
```ruby
read : Text -> t -> t
read text this = t.fromText text
```
You can use this function by explicitly providing the type information in either
of the following ways:
```ruby
val1 = read '5' Int
val2 = Int.read '5'
```
This, however, is often tedious, especially in contexts where this information
could be inferred by the compiler. We can re-write `read` as follows:
```ruby
read : Text -> (t=t) -> t
read text (this=this) = t.fromText text
```
This allows users both to provide the argument explicitly or leave it out. In
the case where it is not provided, the compiler will attempt to infer it from
usage. If this is impossible, an error would be raised.
Enso provides a syntactic sugar for the `t=t` syntax. The above code can be
written instead using `?`.
```ruby
read : Text -> t? -> t
read text this? = t.fromText text
```
## Splats Arguments (Variadics)
Enso provides users with the ability to define variadic functions, or _splats_
functions in our terminology. These are very useful for defining expressive APIs
and flexible code.
- These work for both positional and keyword arguments.
- They are defined using the syntax `name...`, where `name` is an arbitrary
argument name.
> The actionables for this section are:
>
> - Work out how (and if) this can interact with currying.
> - Do we even want this?
## Type Applications
There are sometimes cases where the user wants to explicitly refine the type of
an argument at the _call_ site of a function. This can be useful for debugging,
and for writing ad-hoc code. Much like the named-arguments in applications
above, Enso also provides a syntax for refining types at the application site.
- To refine an argument type by name at the application site, use the `:=`
operator (e.g. `arg_name := T`).
- This _will_ be type-checked by the compiler, and so `T` must be a valid
subtype for the type inferred for (or defined for) the function being called.
## Underscore Arguments
Enso provides the `_` argument as a quick way to create a lambda from a function
call. It obeys the following rules.
- Replacing any function argument with `_` will create a lambda that accepts an
argument and passes it in the place of the underscore. All other function
arguments are applied as normal.
- This works both by name and positionally.
- When a function is provided multiple `_` arguments, they are desugared left to
right as the arguments would be applied to the function definition, creating
nested lambdas.

316
docs/syntax/functions.md Normal file
View File

@ -0,0 +1,316 @@
---
layout: developer-doc
title: Defining Functions
category: syntax
tags: [syntax, functions]
order: 9
---
# Defining Functions
Enso is a purely-functional programming language. As a result it has support for
[first-class and higher-order functions](https://en.wikipedia.org/wiki/Functional_programming#First-class_and_higher-order_functions),
meaning that you can pass functions as arguments to other functions, return
functions from functions, assign them to variables, store them in data
structures and so on.
Functions in Enso are curried by default, meaning that all functions are
actually functions in one argument, but may return functions accepting further
arguments.
<!-- MarkdownTOC levels="2,3" autolink="true" -->
- [Lambdas](#lambdas)
- [Defining Functions](#defining-functions)
- [Methods](#methods)
- [Universal Call Syntax](#universal-call-syntax)
- [Code Blocks](#code-blocks)
- [Operators](#operators)
- [Precedence](#precedence)
- [Sections](#sections)
- [Mixfix Functions](#mixfix-functions)
<!-- /MarkdownTOC -->
## Lambdas
The most primitive non-atom construct in Enso is the lambda. This is an
anonymous function in one argument. A lambda is defined using the `->` operator,
where the left hand side is an argument, and the right hand side is the body of
the function (containing arbitrary code).
Some functional languages such as Haskell allow for the definition of a lambda
with multiple arguments, but in Enso the type signature use of `-> `and the
lambda use of `->` are one and the same. We do not want to have to put the
components of a type signature in parentheses, so we only allow one argument
before each arrow.
- Lambdas can close over variables in their surrounding scope.
- If you want to define a multi-argument lambda, you can do it by having a
lambda return another lambda (e.g. `a -> b -> a + b`).
Additionally, lambdas in Enso have the following properties:
- The lambda introduces a new scope shared by the left and right operands.
- The left operand introduces a pattern context.
- If a lambda occurs in a pattern context, its left-hand-side identifiers are
introduced into the scope targeted by the outer pattern context. For example,
the following is valid `(a -> b) -> a.default + b`.
- Lambdas cannot currently occur in a matching context.
Please note that if a later lambda in a chain shadows an earlier lambda (e.g.
`a -> a -> a`), the shadowed arguments by that name are inaccessible. If you
want to unify later arguments with previous ones, you must employ the scope
reference rule and write (in this case) `a -> A -> a`.
> The actionables for this section are:
>
> - In the future we want to be able to match on function types, so this
> restriction should be relaxed.
> - Do we want any automated unification to take place in the shadowing case?
## Defining Functions
A function definition is just syntactic sugar for the definition of a lambda,
and hence has all the properties that a lambda does. Syntactically, functions
are defined in a similar way to variables. The only difference is that the
function name is followed by one or more parameters.
```ruby
sum x y = x + y
```
Under the hood, functions are desugared to a lambda assigned to a variable that
binds the function name. This means that:
- Like any variable, you can use the `:` type ascription operator to provide a
user-defined type for the function.
```ruby
sum : (a: Monoid) -> a -> a
sum : x -> y -> x + y
sum x y = x + y
```
- Functions have an _arity_. Unlike a single lambda which always has an arity of
one, function arity refers to the number of arguments in the function
definition, which may not always be deduced from the type signature, but may
still be inferred.
## Methods
Enso makes a distinction between functions and methods. In Enso, a method is a
function where the first argument (known as the `this` argument) is associated
with a given atom. Methods are dispatched dynamically based on the type of the
`this` argument, while functions are not.
Methods can be defined in Enso in two ways:
1. **In the Body of a Type:** A function defined in the body of a `type`
definition is automatically converted to a method on all the atoms defined in
the body of that type definition.
```ruby
type Maybe a
Nothing
type Just (value : a)
isJust = case this of
Nothing -> False
Just _ -> True
```
2. **As an Extension Method:** A function defined _explicitly_ on an atom counts
as an extension method on that atom. It can be defined on a typeset to apply
to all the atoms within that typeset.
```ruby
Number.floor = case this of
Integer -> ...
...
```
3. **As a Function with an Explicit `this` Argument:** A function defined with
the type of the `this` argument specified to be a type.
```ruby
floor (this : Number) = case this of
Integer -> ...
```
If the user does not explicitly specify the `this` argument by name when
defining a method (e.g. they use the `Type.name` syntax), it is implicitly added
to the start of the argument list.
## Universal Call Syntax
Calling a function or method is, in general, as simple as applying it to some
arguments. However, as Enso supports both methods and functions, it is very
important that users do not have to think about which of the two they are using
when calling it. To that end, Enso supports what is known as Uniform Call Syntax
(UCS).
- Where the syntax for calling methods differs from the syntax for calling
functions, there are needless constraints on writing generic code.
- This is a needless constraint as both notations have their advantages.
- Enso has two notations, but one unified semantics.
The rules for the uniform syntax call translation in Enso are as follows:
1. For an expression `t.fn <args>`, this is equivalent to `fn t <args>`.
2. For an expression `fn t <args>`, this is equivalent to `t.fn <args>`.
## Code Blocks
Top-level blocks in the language are evaluated immediately. This means that the
layout of the code has no impact on semantics of the code:
- This means that the following `a` and `b` are equivalent.
```ruby
a = foo x y
b =
foo x y
```
- To suspend blocks, we provide a `suspend` function in the standard library.
- This function takes any expression as an argument (including a block), and
suspends the execution of that expression such that it is not evaluated until
forced later.
```ruby
susp = suspend
x = foo x y z
x.do_thing
```
The following rules apply to code blocks:
- Code blocks are desugared into in-order applications of monadic bind (as in
keeping with the fact that all blocks are monadic contexts).
- If an expression that returns a value is not assigned to an identifier, this
will issue a warning.
- To suppress this warning you can assign it to a blank (`_`).
```ruby
test =
_ = expr1
expr2
# Becomes
test =
expr1 >>= (_ -> expr2)
# Equivalent to
test =
expr1 >> expr2
```
- If the trailing line of the block (the return value) is an assignment, it will
return `Nothing` as all assignments do.
```ruby
foo =
pat1 = expr1
# Becomes
foo =
expr1 >>= (pat1 -> Nothing)
```
## Operators
In Enso, an operator is a function with a non-alphanumeric name (e.g. `+`). We
only support binary operators, with left and right arguments.
Enso provides a significant amount of flexibility for developers who want to
define custom operators. Formally, any sequence of the following characters
forms an operators `.!$%&*+-/<>?^~\`. Operator definitions have three main
parts:
- **Definition:** This defines a function that is called on the arguments
provided to the operator.
- **Precedence:** This is an optional block that defines the
[precedence relation](https://en.wikipedia.org/wiki/Order_of_operations) for
the operator. Precedence in Enso is specified _in relation_ to existing
operators. If you do not provide this information, no precedence relations
will be defined.
- **Associativity:** This is an optional block that defines the
[operator associativity](https://en.wikipedia.org/wiki/Operator_associativity)
to be either `left`, `right`, or `none`. If you do not provide this, the
operator's associativity will default to `left`.
```ruby
@prec [> *, < $]
@assoc left
^ a n = a * a ^ (n-1)
```
### Precedence
Operator precedence in Enso is a collection of rules that reflect conventions
about which operations to perform first in order to evaluate a given expression
that contains operators. However, operator precedence in Enso differs from
many other programming languages.
- Precedence is not set at fixed levels, but is instead defined in relation to
the precedence of other operators.
- Precedence of an operator in Enso depends on whether a particular operator is
surrounded by spaces or not. This means that the precedence of _any_ operator
not surrounded by spaces is always higher than the precedence of any operator
surrounded by spaces. The only exception to this rule is the `,` operator,
which retains the same precedence level regardless of whether it is surrounded
by spaces or not.
This space-based precedence may seem strange coming from other languages, but
it allows for writing _far_ cleaner code than other functional languages. This
is best demonstrated by example. Consider the following code:
```ruby
list = 1 .. 100
randomList = each random list
headOfList = take 10 randomList
result = sort headOfList
```
This could easily be refactored to the following one-liner, and then transformed
using UCS to an expression that reads left to right:
```ruby
result = sort (take 10 (each random (1 .. 100)))
result = (((1 .. 100).each random).take 10).sort
```
This is still quite noisy, however, so using the whitespace-sensitive operator
precedence rules, combined with the fact that the operator `.` is a regular
operator, we get the following.
```ruby
result = 1..100 . each random . take 10 . sort
```
### Sections
An operator section is a nice shorthand for partially applying an operator. It
works as follows.
- Where an argument is not applied to an operator, the missing argument is
replaced by an implicit `_`.
- The application is then translated based upon the rules for
[underscore arguments](./function-arguments.md#underscore-arguments) described
later.
- The whitespace-based precedence rules discussed above also apply to operator
sections.
## Mixfix Functions
A mixfix function is a function that is made up of multiple sections. They are
defined using a special syntax, and operate as follows:
- They are defined using a 'split snake case'. The first section is written as
normal, but subsequent sections are prefixed by an underscore (`if c _then a`,
for example).
- The layout rules applied to mixfix functions operate as if each section was a
separate operator, allowing you to write an indented block of code after each
section.
Probably the best-known example of a mixfix function is `if-then-else`, which
is indeed defined in the Enso standard library.
```ruby
if foo == bar then frob else
thing1
thing2
```

117
docs/syntax/layout.md Normal file
View File

@ -0,0 +1,117 @@
---
layout: developer-doc
title: Layout Rules
category: syntax
tags: [syntax, layout]
order: 3
---
# Layout Rules
Enso is a layout-aware programming language, in that layout rules are used to
determine code structure. The layout rules in Enso are intended to provide for
an intuitive way to format code.
This document describes the layout rules for Enso's white-space.
<!-- MarkdownTOC levels="2,3" autolink="true" -->
- [Maximum Line Length](#maximum-line-length)
- [Indented Blocks](#indented-blocks)
- [Trailing Operator on the Parent Line](#trailing-operator-on-the-parent-line)
- [Leading Operator on All Child Lines](#leading-operator-on-all-child-lines)
- [No Leading or Trailing Operators](#no-leading-or-trailing-operators)
- [Debug Line Breaks](#debug-line-breaks)
<!-- /MarkdownTOC -->
## Maximum Line Length
The maximum length of a line in an Enso source file is restricted to 80
characters outside of text blocks. If your code exceeds this limit, the compiler
will emit a warning message.
There is no option to change this limit in order to enforce visual consistency
in user code. The reasoning behind this is as follows:
- The default soft-wrapping of long lines in editors is highly disruptive to the
visual structure of code, making it harder to understand.
- Code should still be understandable on smaller screens or with multiple-column
views.
## Indented Blocks
Indentation in Enso is used to start a block. Every indented line is considered
to be a sub-structure of the nearest previous line with lower indentation. We
refer to these as the 'child' and the 'parent' lines respectively. This means
that any region at the same indentation is considered to be part of the same
block, and blocks may contain blocks.
```ruby
block =
x = 2 . replicate 7 . map show . intercalate ","
IO.println x
```
In addition, we have a set of custom layout rules that impact exactly how blocks
are defined. These are described in the following subsections.
### Trailing Operator on the Parent Line
If a line ends with an operator then all of its child lines form a
[_code_ block](./functions.md/#code-blocks). The most common usage of this kind
of indented block is a function definition body (following the `=` or `->`).
```ruby
test = a -> b ->
sum = a + b
```
### Leading Operator on All Child Lines
If all the child lines in a block begin with an operator, the lines in the block
are considered to form a single expression.
This expression is built as follows:
1. Every line in the block is built as a standard inline expression, ignoring
the leading operator.
2. The final expression is built top to bottom, treating the leading operators
as left-associative with the lowest possible precedence level.
Please note that the operator at the _beginning_ of each child line is used
_after_ the line expression is formed.
```ruby
nums = 1..100
. each random
. sort
. take 100
```
### No Leading or Trailing Operators
In the case where neither the parent line ends with a trailing operator, or the
child lines begin with an operator, every child line is considered to form a
separate expression passed as an argument to the parent expression. The most
common usage of this is to split long expressions across multiple lines.
```ruby
geo1 = sphere (radius = 15) (position = vector 10 0 10) (color = rgb 0 1 0)
geo2 = sphere
radius = 15
position = vector 10 0 10
color = rgb 0 1 0
```
### Debug Line Breaks
In certain cases it may be useful to debug line breaks in code. To this end, we
provide a debug line-break operator `\\` which, when placed at the beginning of
a child line tells Enso to glue that line to the end of the previous one.
This should be avoided in production code and its use will issue a warning.
```ruby
debugFunc = v -> v2 ->
print (v2.normalize * ((v.x * v.x) + (v.y * v.y)
\\ + (v.z * v.z)).sqrt)
validFunc = v -> v2 ->
len = ((v.x * v.x) + (v.y * v.y) + (v.z * v.z)).sqrt
v2 = v2.normalize * len
print v2
```

127
docs/syntax/literals.md Normal file
View File

@ -0,0 +1,127 @@
---
layout: developer-doc
title: Literals
category: syntax
tags: [syntax, literals]
order: 4
---
# Literals
Enso supports a small set of literals that allow the expression of some common
types in literal form in the source code.
<!-- MarkdownTOC levels="2,3" autolink="true" -->
- [Numeric Literals](#numeric-literals)
- [Text Literals](#text-literals)
- [Inline Text Literals](#inline-text-literals)
- [Text Block Literals](#text-block-literals)
- [Vector Literals](#vector-literals)
<!-- /MarkdownTOC -->
## Numeric Literals
Enso provides rich support for numeric literals, including literals that use
different numeric bases. It does, of course, support floating point numerals as
well.
A numeric literal takes the form:
```ebnf
digit = "0" | "1" | "2" | "3" | "4" | "5" | "6" | "7" | "8" | "9" ;
hex = "a" | "b" | "c" | "d" | "e" | "f";
number-digit = digit | hex;
decimal-point = ".";
float-digit = number-digit | decimal-point;
base-specifier = { digit };
numeric-literal = [base-specifier, "_"], { number-digit };
```
If no base is specified, it is inferred to be a standard base-10 numeral.
Some examples of numeric literals follow:
```ruby
decimal = 12345.39
decimal_explicit = 10_1029301
octal = 8_122137
hex = 16_ae2f14
binary = 2_10011101010
```
> Actionables for this section are:
>
> - Think about whether we want to support explicit fractional and complex
> literals, or whether these should be relegated to type constructors.
## Text Literals
Enso provides rich support for textual literals in the language, supporting both
raw and interpolated strings natively.
- **Raw Strings:** Raw strings are delimited using the standard double-quote
character (`"`). Raw strings have support for escape sequences.
```ruby
raw_string = "Hello, world!"
```
- **Interpolated Strings:** Interpolated strings support the splicing of
executable Enso expressions into the string. Such strings are delimited using
the single-quote (`'`) character, and splices are delimited using the backtick
(`` ` ``) character. Splices are run, and then the result is converted to a
string using `show`. These strings also have support for escape sequences.
```ruby
fmt_string = 'Hello, my age is `time.now.year - person.birthday.year`'
```
### Inline Text Literals
In Enso, inline text literals are opened and closed using the corresponding
quote type for the literal. They may contain escape sequences but may _not_ be
broken across lines.
```ruby
inline_raw = "Foo bar baz"
inline_interpolated = 'Foo `bar` baz'
```
### Text Block Literals
In Enso, text block literals rely on _layout_ to determine the end of the block,
allowing users to only _open_ the literal. Block literals are opened with three
of the relevant quote type, and the contents of the block are determined by the
following layout rules:
- The first child line of the block sets the baseline left margin for the block.
Any indentation up to this margin will be removed.
- Any indentation further than this baseline will be retained as part of the
text literal.
- The literal is _closed_ by the first line with a _lower_ level of indentation
than the first child lineand will not contain the final blank line.
```
block_raw = '''
part of the string
still part of the string
also part of the string
not_string_expr = foo bar
```
## Vector Literals
Enso also supports vector literals, which allow users to create literal vectors
of elements.
```ruby
literal = [elem_1, elem_2, elem_3, ...]
```
A vector literal works as follows:
- It is begun by the `[` character.
- It is ended by the `]` character.
- Elements in vector literals are concatenated using the `,` operator, which
acts as `cons` on vectors.

71
docs/syntax/macros.md Normal file
View File

@ -0,0 +1,71 @@
---
layout: developer-doc
title: The Enso Macro Syntax
category: syntax
tags: [syntax, macro]
order: 7
---
# The Enso Macro Syntax
Enso provides a macro system that allows users to perform AST to AST
transformations on the provided pieces of code. While many languages' macros
provide their users with access to the compilation and type-checking phases
(scala, for example), there are a few reasons that we don't want to:
- The power of a dependently-typed language obviates the need for the ability to
manipulate types at compile time.
- Syntactic macros are far more predictable than those that can perform type
manipulation and compute values.
- We do not want to introduce a metaprogramming system that is too complex.
> The actionables for this section are:
>
> - Fully specify the macro system.
> - Fully specify the interactions between the parser-based macro system and the
> runtime.
<!-- MarkdownTOC levels="2,3" autolink="true" -->
- [Annotations](#annotations)
- [Automatic Deriving](#automatic-deriving)
<!-- /MarkdownTOC -->
## Annotations
Much like annotations on the JVM, annotations in Enso are tags that perform a
purely syntactic transformation on the entity to which they are applied. The
implementation of this requires both parser changes and support for
user-defined macros, but for now it would be possible to work only with a set
of hard-coded annotation macros.
Annotations can be arbitrarily nested, so a set of annotation macros become
implicitly nested inside each other:
```ruby
@derive Eq Debug
@make_magic
type Maybe a
use Nothing
type Just
```
The above example is logically translated to:
```ruby
derive Eq Debug
make_magic
type Maybe a
use Nothing
type Just (value : a)
```
In the presence of annotations and macros, it becomes more and more important
that we are able to reserve words such as `type` to ensure that users can
always have a good sense of what the most common constructs in the language
mean, rather than allowing them to be overridden outside of the stdlib.
## Automatic Deriving
In order to make the language easier to debug, we have all types automatically
derive an interface `DebugShow`. This interface provides a function that will
print all the significant information about the value (e.g. locations, types,
source information, etc).

153
docs/syntax/naming.md Normal file
View File

@ -0,0 +1,153 @@
---
layout: developer-doc
title: Naming
category: syntax
tags: [syntax, naming]
order: 2
---
# Naming
This file describes the syntax for naming language constructs in Enso, as well
as the various rules that names follow.
Names in Enso are restricted to using ASCII characters. This arises from the
simple fact that all names should be easy to type without less common input
methods. Furthermore, we enforce a rigid style for naming. This is in aid of
giving Enso code a uniform identity.
<!-- MarkdownTOC levels="2,3" autolink="true" -->
- [Naming Constructs](#naming-constructs)
- [Pattern Contexts](#pattern-contexts)
- [Localised Naming](#localised-naming)
- [Operator Naming](#operator-naming)
- [Reserved Names](#reserved-names)
<!-- /MarkdownTOC -->
## Naming Constructs
Given that Enso is dependently-typed, with no artificial separation between the
type and value-level syntaxes, an arbitrary name can refer to both types and
values. This means that naming itself can become a bit of a concern. At first
glance, there is no meaningful syntax-based disambiguation in certain contexts
(such as patterns and type signatures) between introducing a fresh variable, or
an occurrence of one already in scope.
As we still want to have a minimal syntax for such use-cases, Enso enforces the
following rules around naming:
- All identifiers are named as follows. This is known as 'variable' form.
+ Each 'word' in the identifier must be lower-case or a number.
+ Words in the identifier are separated using `_`.
+ Numbers may not occur as the first 'word' in an identifier.
- An identifier named as above can be referred to by capitalizing the first
letter of each 'word' in the identifier. This is known as 'referent' form.
- No mixed-format names are allowed (e.g. `HTTP`, `foO`, `make_New`, or
`Make_new`). These should be rejected by the compiler.
- We _strongly encourage_ using capitalised identifiers to refer to atoms.
Name resolution obeys the following rules:
- Contexts where it is _ambiguous_ as to whether a name is fresh or should refer
to an identifier in scope are known as _pattern contexts_.
- In a [pattern context](#pattern-contexts), an identifier in referent form will
_always_ refer to a name in scope, whereas an identifier in variable form is
interpreted as the creation of a fresh identifier.
- This behaviour _only_ occurs in pattern contexts. In all other contexts,
both conventions refer to that name already in scope.
- Operator names behave as variable names when placed in a prefix position
(e.g. `+ a b`).
- Operator names behave as referent names when placed in an infix position (e.g.
`a + b`).
- All literals (e.g. `1` and `"Hello"`) are treated as referent names.
Identifiers are introduced by:
- Naming them in a binding (assignments and function arguments).
- Using them in a pattern matching context (free variables).
- Using them in a type ascription (free variables).
## Pattern Contexts
A pattern context is a span in the code where variable identifiers (as described
above) can be used to introduce new identifiers into the scope containing the
pattern context. The following spans are pattern contexts:
- The left-hand-side of the assignment operator (`=`).
- The right-hand-side of the ascription operator (`:`).
- The left-hand-side of the arrow operator (`->`).
The following behaviours occur within a pattern context:
- Variable names are matched against corresponding portions of the expression
and are introduced into scope.
- Type names require that the matched value is of a given structure (be that
matching a typeset, atom, or some combination thereof), and allows for
matching these fields recursively.
- Any literals (e.g. numbers) behave as type names.
- In any place where a variable identifier may be introduced in a pattern
context, an `_` (known as an ignore) may be substituted. This does _not_ bind
a new name, and hence cannot be used later.
In the core language, it should be noted that all non-trivial constructs are
desugared into the set of above constructs plus `case ... of` expressions. This
means that these are the _only_ constructs which introduce pattern contexts.
> Actionables for this section are:
>
> - Clarify exactly what "corresponding portions of the expression" actually
> means in a formal sense.
## Localised Naming
We do, however, recognise that there is sometimes a need for unicode characters
in names (e.g. designing a high-level visual library that targets a narrow
domain in a specific country). To that end, Enso allows users to specify
optional localised names as part of a function's documentation.
Special support is provided for providing completions based on these localised
names in the language server, and in Enso Studio.
## Operator Naming
Operator names are those built solely from operator symbols (e.g. `+` or `<*>`).
Operator symbols are defined as characters in the following set.
```
!$%&*+-/<>?^~|:\,.()[]{}=
```
Please note that not every sequence that can be created from the above is a
_valid_ operator name, as some may collide with built-in language constructs
(e.g. `[` and `]`, which start and end a vector literal respectively).
## Reserved Names
Even though we do not intend to reserve any names at the level of the lexer or
parser, there are a number of constructs so core to the operation of Enso as a
language that we do not want to let them be overridden or redefined by users.
These constructs are known as reserved names, and these restrictions are
enforced in the compiler.
We reserve these names because allowing their redefinition would severely hinder
the readability and consistency of Enso code. They are as follows:
- `type`: This reserved name is used to define new atoms and typesets.
- `->`: This reserved name is the 'function' type, and represents a mapping from
the type of its first operand to the type of its second operand.
- `:`: This reserved name is the type attribution operator. It ascribes the type
described by its right operand to its left operand.
- `=`: This reserved name is the assignment operator, and assigns the value of
its right operand to the name on its left. Under the hood this desugars to the
relevant implementation of monadic bind.
- `.`: This is the forward function chaining operator.
- `case ... of`: This reserved name is the case expression that is fundamental
to the operation of control flow in the language.
- `this`: This reserved name is the one used to refer to the enclosing type in
a method or type definition.
- `here`: This reserved name is the one used to refer to the enclosing module.
- `in`: Used to specify the monadic context(s) in a type signature.
Many of these reserved words are implemented as macros in the parser, but these
macros are always in scope and cannot be overridden, hidden, or redefined.
> The actionables for this section are as follows:
>
> - In the future, we need to determine if we need `all` and `each` explicit
> keywords in the case of dependency. Explicit examples are required.

146
docs/syntax/projections.md Normal file
View File

@ -0,0 +1,146 @@
---
layout: developer-doc
title: Projections and Field Access
category: syntax
tags: [syntax, projections, field-access, pattern-matching]
order: 11
---
# Projections and Field Access
Enso provides multiple ways for users to access data from their types. It has
the old functional stalwart of pattern matching, but it also has an inbuilt
notion of accessors based on lenses (field projections).
<!-- MarkdownTOC levels="2,3" autolink="true" -->
- [Pattern Matching](#pattern-matching)
- [The Underscore in Pattern Matching](#the-underscore-in-pattern-matching)
- [Projections](#projections)
<!-- /MarkdownTOC -->
## Pattern Matching
Pattern matching in Enso works similarly to as you would expect in various other
functional languages. Typing information is _always_ refined in the branches of
a case expression, which interacts well with dependent typing and type-term
unification. There are a few main ways you can pattern match:
1. **Positional Matching:** Matching on the scrutinee by structure. This works
both for atoms and typesets (for typesets it is a subsumption judgement).
```ruby
type Vector a
V2 x:a y:a
V3 x:a y:a z:a
v = Vector.V3 x y z
case v of
Vector.V3 x y z -> print x
```
2. **Type Matching:** Matching purely by the types involved, and not matching
on structure.
```ruby
case v of
Vector.V3 -> print v.x
```
3. **Name Matching on Labels:** Matching on the labels defined within a type
for both atoms and typesets, with renaming.
```ruby
case v of
Vector.V3 {x y} -> print x
{x} -> print x
```
4. **Naming Scrutinees in Branches:** Ascribing a name of a scrutinee is done
using the standard typing judgement. This works due to the type-term
unification present in Enso.
```ruby
case _ of
v : Vector.V3 -> print v,x
```
> The actionables for this section :
>
> - Refine the syntax for the name-based case.
> - Provide code examples for why the renaming use-case is important (e.g.
> cases where there are clashing field names).
> - Function-resolution matching.
### The Underscore in Pattern Matching
An underscore `_` passed as an argument to a syntactic pattern does not behave
like the function argument shorthand. Instead it acts as a positional match that
is given no name.
## Projections
Unlike the simple accessors defined by most programming language, Enso's
accessors are far more powerful. This is because they are based on lenses.
- Field accessors are standard lenses. As such, they can be used for both the
getting and setting of properties.
- As a lens (e.g. `.field`) is a first-class function, they can be curried and
passed around like any other function.
```ruby
type Engine
type Combustion
power: Int
cylinder_count: Int
type Electric
power: Int
is_blue: Bool
type Vehicle
type Car
color: String
max_speed: Int
engine: Engine
type Bike
color: String
type Person
type Cons
name: String
vehicle: Vehicle
main =
p1 = Person.Cons "Joe" (Vehicle.Car 'pink' 300 (Engine.Combustion 500 8))
print $ p1.name # -> Joe
print $ p1.vehicle.color # -> pink
print $ p1.vehicle.max_speed # -> Some 300
print $ p1.vehicle.engine.power # -> Some 500
print $ p1.vehicle.engine.is_blue # -> None
p1.vehicle.color = 'red' # OK
p1.vehicle.max_speed = 310 # FAIL: security reasons. Allowing this
# in Haskell was the worst decision
# ever. After refactoring it
# silently does nothing there.
p2 = p1.vehicle.max_speed ?= 310 # OK
p3 = p1.vehicle.engine.power = 510 # FAIL
p4 = p1.vehicle.engine.power ?= 510 # OK
lens_name = .name
lens_color = .vehicle.color
lens_max_speed = .vehicle.max_speed
lens_power = .vehincle.engine.power
## Function like usage:
print $ lens_name p1
print $ lens_color p1
print $ lens_max_speed p1
print $ lens_power p1
p1 . at lens_name = ... # OK
```
> The actionables for this section are:
>
> - Fix the example above. It isn't correct.

76
docs/syntax/top-level.md Normal file
View File

@ -0,0 +1,76 @@
---
layout: developer-doc
title: Enso Top-Level Syntax
category: syntax
tags: [syntax, top-level, file]
order: 8
---
# Enso Top-Level Syntax
Like almost all statically-typed programming languages, the top-level of an Enso
file is non-executable. The top level may contain the following constructs:
- Type definitions (both complex and simple)
- Method definitions
- Function definitions
> The actionables for this section are as follows:
>
> - Fully specify the top-level syntax for Enso.
<!-- MarkdownTOC levels="2,3" autolink="true" -->
- [Main](#main)
- [Against Top-Level Evaluation](#against-top-level-evaluation)
<!-- /MarkdownTOC -->
## Main
The entry point for an Enso program is defined in a special top-level binding
called `main` in the file `Main.enso`. However, we also provide for a scripting
workflow in the form of `enso run`, which will look for a definition of `main`
in the file it is provided.
```ruby
main = IO.println "Hello, World!"
```
## Against Top-Level Evaluation
At points during Enso's development it was up for debate as to whether we wanted
the language to have an executable top-level (akin to scripting languages like
Python). In order to make a decision, we listed the following use cases, and the
corresponding analysis is provided here for posterity.
| Label | Meaning |
| --------| ------- |
| `[?,_]` | We don't know how to implement it, but it may be possible.
| `[-,_]` | Not possible to implement using purely syntactic macros.
| `[M,_]` | Possible to implement using purely syntactic macros.
| `[_,H]` | High priority. This will be used often.
| `[_,M]` | Medium priority. This will be used with a medium frequency.
| `[_,L]` | Low priority. Nice to have, but we can likely live without it.
| `[_,!]` | Something that we never want to have in the language.
The use-cases we have considered are as follows:
| Label | Description |
| ------- | ----------- |
| `[-,L]` | Creating top-level constructs in `IO`, such as `IORef`. This is, in general, considered to be bad style, but can sometimes be useful. |
| `[-,L]` | Using enso files like python is able to be for scripting work. The ability to write constructs at the top-level and just evaluate them. |
| `[M,H]` | The ability to generate structures and / types for a dataframe at compilation time, or the automatic generation of an API for a library. A key recognition is that dependent types and type-level execution replace much of the need to be able to query the type-checker and runtime while writing a syntactic macro. |
| `[M,H]` | Static metaprogramming (transformations from `AST -> AST`) to let users generate types and functions based on existing AST. There is the potential to want to be able to evaluate actions in `IO` while doing this, but it may not be necessary. |
| `[-,!]` | Dynamic metaprogramming to let users mutate program state at runtime (e.g. changing atom shapes, function definitions), also known as 'monkey patching'. This is not something we want in the language, but we do perhaps want the ability to do so on values of type `Dynamic`. |
| `[M,H]` | 'Remembering' things when compiling a file, such as remembering all structures marked by an `AST` annotation. An example use case for a mechanism like this is to generate pattern matches for all possible `AST` types. This can be done by letting macros write to a per-file peristent block of storage that could be serialised during precompilation. |
| `[M,H]` | Grouping of macros (e.g. `deriveAll = derive Ord Debug Show`). This can be easily handled by doing discovery on functions used as macros, and treating it as a macro as well. |
| `[?,M]` | Method-missing magic, akin to ruby. This is likely able to be handled using other, existing language mechanisms. |
In summary and when considering the above use-cases, it seems that there is
little need for top-level expression evaluation in Enso. We can support all of
the above-listed important use-cases using syntactic (`AST -> AST`) macros,
while allowing for top-level evaluation would enable users to write a lot of
overly-magical code, which will always be a code-smell.
Syntactic macros, however, do not easily support a scripting workflow, but the
solution to this problem is simple. We can just provide an `enso run <file>`
command which will search for and execute the `main` function in the provided
file.

184
docs/syntax/types.md Normal file
View File

@ -0,0 +1,184 @@
---
layout: developer-doc
title: Types and Type Signatures
category: syntax
tags: [syntax, types]
order: 6
---
# Types and Type Signatures
Enso is a statically typed language, meaning that every variable is associated
with information about the possible values it can take. In Enso, the type
language is the same as the term language, with no artificial separation. For
more information on the type system, please see the [types](../types/README.md)
design document.
This section will refer to terminology that has not been defined in _this_
document. This is as this document is a specification rather than a guide, and
it is expected that you have read the above-linked document on the type-system
design as well.
Additionally, this document will colloquially refer to the left and right hand
sides of the type ascription operator `:` as the 'term' and 'type' levels,
respectively. In reality, there is no separation between the two in Enso, but it
is a useful way of thinking about things when discussing type signatures.
<!-- MarkdownTOC levels="2,3" autolink="true" -->
- [Type Signatures](#type-signatures)
- [Operations on Types](#operations-on-types)
- [Type Definitions](#type-definitions)
- [Visibility and Access Modifiers](#visibility-and-access-modifiers)
<!-- /MarkdownTOC -->
## Type Signatures
Enso allows users to provide explicit type signatures for values through use of
the type ascription operator `:`. The expression `a : b` says that the value
`a` has the type `b` attributed to it.
```ruby
foo : (m : Monoid) -> m.this
```
Type signatures in Enso have some special syntax:
- The reserved name `in` is used to specify the monadic context in which a value
resides. The Enso expression `a in IO` is equivalent to the Haskell
`MonadIO a`.
```ruby
foo : Int -> Int in IO
```
- The operator `!` is used to specify the potential for an _error_ value in a
type. The expression `a ! E` says that the type is either an `a` or an error
value of type `E`.
```ruby
/ : Number -> Number -> Number ! ArithError
```
In Enso, a type signature operates to constrain the values that a given variable
can hold. Type signatures are _always_ checked, but Enso may maintain more
specific information in the type inference and checking engines about the type
of a variable. This means that:
- Enso will infer constraints on types that you haven't necessarily written.
- Type signatures can act as a sanity check in that you can encode your
intention as a type.
- If the value is of a known type (distinguished from a dynamic type),
constraints will be introduced on that type.
- Where the type of the value is known, ascription can be used to constrain that
type further.
- It is legal to add constraints to an identifier using `:` in any scope in
which the identifier is visible.
From a syntactic perspective, the type ascription operator `:` has the following
properties:
- The right hand operand to the operator introduces a pattern context.
- The right hand side may declare fresh variables that occur in that scope.
- It is not possible to ascribe a type to an identifier without also assigning
to that identifier.
- Introduced identifiers will always shadow other identifiers in scope due to
the fact that `:` introduces a new scope on its RHS.
- Constraint implication is purely feed-forward. The expression `b:A` only
introduces constraints to `b`.
> The actionables for this section are:
>
> - How do signatures interact with function bodies in regards to scoping?
> - Does this differ for root and non-root definitions?
## Operations on Types
Enso also provides a set of rich operations on its underlying type-system notion
of typesets. Their syntax is as follows:
- **Union - `|`:** The resultant typeset may contain a value of the union of its
arguments.
- **Intersection - `&`:** The resultant typeset may contain values that are
members of _both_ its arguments.
- **Subtraction - `\`:** The resultant typeset may contain values that are in
the first argument's set but not in the second.
> The actionables for this section are:
>
> - Unify this with the types document at some point. The types document
> supersedes this section while this actionable exists.
## Type Definitions
Types in Enso are defined by using the `type` reserved name. This works in a
context-dependent manner that is discussed properly in the
[type system design document](../types/README.md), but is summarised
briefly below.
- **Name and Fields:** When you provide the keyword with only a name and some
field names, this creates an atom.
```ruby
type Just value
```
- **Body with Atom Definitions:** If you provide a body with atom definitions,
this defines a smart constructor that defines the atoms and related functions
by returning a typeset.
```ruby
type Maybe a
Nothing
type Just (value : a)
isJust = case this of
Nothing -> False
Just _ -> True
nothing = not isJust
```
Please note that the `type Foo (a : t)` is syntax only allowable inside a type
definition. It defines an atom `Foo`, but constrains the type variable of the
atom _in this usage_.
- **Body Without Atom Definitions:** If you provide a body and do not define any
atoms within it, this creates an interface that asserts no atoms as part of
it.
```ruby
type HasName
name: String
```
In addition, users may write explicit `this` constraints in a type definition,
using the standard type-ascription syntax:
```ruby
type Semigroup
<> : this -> this
type Monoid
this : Semigroup
use Nothing
```
### Visibility and Access Modifiers
While we don't usually like making things private in a programming language, it
sometimes the case that it is necessary to indicate that certain fields should
not be touched (as this might break invariants and such like). To this end, we
propose an explicit mechanism for access modification that works as follows:
- We provide explicit access modifiers that, at the definition site, start an
indented block. These are `private` and `unsafe`.
- All members in the block have the access modifier attributed to them.
- By default, accessing any member under an access modifier will be an error.
- To use members under an access modifier, you use the syntax `use <mod>`, where
`<mod>` is a modifier. This syntax 'takes' an expression, including blocks,
within which the user may access members qualified by the modifier `<mod>`.
While `private` works as you might expect, coming from other languages, the
`unsafe` annotation has additional restrictions:
- It must be explicitly imported from `Std.Unsafe`.
- When you use `unsafe`, you must write a documentation comment on its usage
that contains a section `Safety` that describes why this usage of unsafe is
valid.

72
docs/types/README.md Normal file
View File

@ -0,0 +1,72 @@
---
layout: section-summary
title: Enso's Type System
category: types
tags: [types, readme]
order: 0
---
# Enso's Type System
On the spectrum of programming type systems ranging from dynamically typed to
statically typed, one likes to think that there is a happy medium between the
two. A language that _feels_ dynamic, with high levels of type inference, but
lets the users add more type information as they want more safety and
compile-time checking.
Enso aims to be that language, providing a statically-typed language with a type
system that makes it feel dynamic. It will infer sensible types in many cases,
but as users move from exploratory pipelines to production systems, they are
able to add more and more type information to their programs, proving more and
more properties using the type system. This is based on a novel type-inference
engine, and a fusion of nominal and structural typing, combined with dependent
types.
All in all, the type system should stay out of the users' ways unless they make
a mistake, but give more experienced users the tools to build the programs that
they require.
This document contains discussion and designs for the type-system's behaviour,
as well as formal specifications where necessary. It discusses the impact of
many syntactic language features upon inference and type checking, and is
instrumental for ensuring that we build the right language.
> #### A Note on Syntax
> In the aid of precision, this document will use syntax that _may not_ be
> exposed to users. The appearance of a piece of syntax here that is not
> described in the [syntax](../syntax/README.md) document makes no promises as
> to whether said syntax will be exposed in the surface language.
> **Please Note:**
> The designs in this section are currently very exploratory as the type system
> is not slated from implementation until after 2.0.
Information on the type system is broken up into the following sections:
- [**Goals for the Type System:**](./goals.md) The goals for the
Enso type system, particularly around usability and user experience.
- [**The Type Hierarchy:**](./hierarchy.md) The type hierarchy in Enso.
- [**Function Types:**](./function-types.md) Function types in Enso.
- [**Access Modification:**](./access-modifiers.md) Access modifiers in Enso
(e.g. `private` and `unsafe`),
- [**Pattern Matching:**](./pattern-matching.md) The typing of pattern match
expressions in Enso.
- [**Dynamic Dispatch:**](./dynamic-dispatch.md) How dynamic dispatch interacts
with the type system in Enso.
- [**Modules:**](./modules.md) A description of the Enso module system.
- [**Monadic Contexts:**](./contexts.md) A description of the typing and
functionality of Enso's monadic contexts.
- [**Strictness and Suspension:**](./evaluation.md) A description of how the
type system interacts with Enso's evaluation semantics.
- [**Analysing Parallelism:**](./parallelism.md) A description of how the type
system interacts with the planned automated parallelism analysis.
- [**Type-Directed Programming:**](./type-directed-programming.md) A description
of how the type system aids type-directed programming, and the features it
has to support this approach.
- [**Errors:**](./errors.md) The interaction between Enso's errors and the type
system.
- [**Type Inference and Checking:**](./inference-and-checking.md) A description
of Enso's type inference and checking system.
- [**Dependent Typing:**](./dependent-typing.md) A description of the dependent
nature of Enso's type system.
- [**References:**](./references.md) Useful references for working on the type
system and its theory.

View File

@ -0,0 +1,56 @@
---
layout: developer-doc
title: Access Modifiers
category: types
tags: [types, access-modifiers]
order: 4
---
# Access Modifiers
While we don't usually like making things private in a programming language, it
sometimes the case that it is necessary to indicate that certain fields should
not be touched (as this might break invariants and such like). To this end, Enso
provides an explicit mechanism for access modification.
<!-- MarkdownTOC levels="2,3" autolink="true" -->
- [Access Modification](#access-modification)
- [Private](#private)
- [Unsafe](#unsafe)
<!-- /MarkdownTOC -->
## Access Modification
Access modifiers in Enso work as follows:
- We provide explicit access modifiers that, at the definition site, start an
indented block.
- All members in the block have the access modifier attributed to them.
- By default, accessing any member under an access modifier will be an error.
- To use members under an access modifier, you use the syntax `use <mod>`, where
`<mod>` is a modifier. This syntax 'takes' an expression, including blocks,
within which the user may access members qualified by the modifier `<mod>`.
> The actionables for this section are:
>
> - How do we type this?
## Private
The `private` modifier acts to hide implementation details from clients of the
API. It is:
- Available by default in the `Base` library.
- Able to be overridden using the above-described mechanism.
## Unsafe
While `private` works as you might expect, coming from other languages, the
`unsafe` annotation has additional restrictions:
- It must be explicitly imported from `Std.Unsafe`.
- When you use `unsafe`, you must write a documentation comment on its usage
that contains a section `Safety` that describes why this usage of unsafe is
valid.
> The actionables for this section are:
>
> - Specify `unsafe` properly.

66
docs/types/contexts.md Normal file
View File

@ -0,0 +1,66 @@
---
layout: developer-doc
title: Monadic Contexts
category: types
tags: [types, context, monad, effect]
order: 8
---
# Monadic Contexts
Coming from a Haskell background, we have found that Monads provide a great
abstraction with which to reason about program behaviour, but they have some
severe usability issues. The main one of these is the lack of automatic lifting,
requiring users to explicitly lift computations through their monad transformer
stack.
For a language as focused on usability as Enso is this really isn't feasible. To
that end, we have created the notion of a 'Monadic Context', which is a monad
transformer based on Supermonads (see
[references](./references.md#monadic-contexts)). These have special support in
the compiler, and hence can be automatically lifted to aid usability.
> The actionables for this section are:
>
> - Think about subsumption for contexts.
> - Contexts (e.g. IO) are represented using `T in IO`. Multiple contexts are
> combined as standard `(IO | State Int)`, and it is written the same in arg
> position.
> - Do we definitely want to use monads, or can we use arrows or other
> interpreter-based effects systems? These may aid with parallelism analysis.
<!-- MarkdownTOC levels="2,3" autolink="true" -->
- [Context Syntax](#context-syntax)
- [Monadic Bind](#monadic-bind)
- [Context Definitions](#context-definitions)
- [Context Lifting](#context-lifting)
<!-- /MarkdownTOC -->
## Context Syntax
There are three main notes about the syntax of contexts:
1. Monadic contexts are defined using the `in` keyword (e.g. `Int in IO`).
2. We have a symbol `!`, which is short-hand for putting something into the
`Exception` monadic context. This is related to broken values.
3. Contexts can be combined by using the standard typeset operators, or nested
through repeated uses of `in`.
## Monadic Bind
It is also important to note that Enso has no equivalent to `<-` in Haskell.
Instead, pure computations are implicitly placed in the `Pure` monadic context,
and `=` acts to 'peel off' the outermost layer of contexts. As such, this means
that `=` _always_ acts as `bind`, greatly simplifying how the type-checker has
to work.
## Context Definitions
Contexts can be defined by users.
> The actionables for this section are:
>
> - How, what, when and why?
## Context Lifting
> The actionables for this section are:
>
> - Specify and explain how automated lifting of monadic contexts works.

View File

@ -0,0 +1,97 @@
---
layout: developer-doc
title: Dependent Typing
category: types
tags: [types, inference, checking, dependent, dependency]
order: 14
---
# Dependent Typing
Enso is a [dependently typed](https://en.wikipedia.org/wiki/Dependent_type)
programming language. This means that types are first-class values in the
language, and hence can be manipulated and computed upon just like any other
value. To the same end, there is no distinction between types and kinds, meaning
that Enso obeys the 'Type in Type' axiom of dependent types. While this does
make the language unsound as a logic, the type safety properties do not depend
on this fact.
In essence, values are types and types are values, and all kinds are also types.
This means that, in Enso, you can run _arbitrary_ code to compute with types.
All in all, dependency in Enso is not about being a proof system, but is instead
about enabling practical usage without hampering usability. To that end we
combine a powerful and non-traditional dependently-typed system with the ability
to automate much of the proof burden through SMT solvers.
> The actionables for this section are:
>
> - Do we want the ability to explicitly quantify type variables for visibility,
> dependency, relevance and requiredness (forall, foreach, etc).
> - How do we infer as much of these above properties as possible?
> - Based on QTT and RAE's thesis.
<!-- MarkdownTOC levels="2,3" autolink="true" -->
- [Proving Program Properties](#proving-program-properties)
- [Automating the Proof Burden](#automating-the-proof-burden)
<!-- /MarkdownTOC -->
## Proving Program Properties
Some notes:
- Dependent types as constructing proofs through combining types. Combining
types provides evidence which can be discharged to create a proof. A value can
then only be constructed by discharging a proof.
- To provide more predictable inference in dependent contexts, this system will
draw on the notion of _matchability polymorphism_ as outlined in the
higher-order type-level programming paper. The key recognition to make,
however is that where that paper was required to deal with
backwards-compatibility concerns, we are not, and hence can generalise all
definitions to be matchability polymorphic where appropriate.
- Provide some keyword (`prove`) to tell the compiler that a certain property
should hold when typechecking. It takes an unrestricted expression on types,
and utilises this when typechecking. It may also take a string description of
the property to prove, allowing for nicer error messages:
```
append : (v1 : Vector a) -> (v2 : Vector a) -> (v3 : Vector a)
append = vec1 -> vec2 ->
prove (v3.size == v1.size + v2.size) "appending augments length"
...
````
Sample error:
```
[line, col] Unable to prove that "appending augments length":
Required Property: v3.size == v1.size + v2.size
Proof State: <state>
<caret diagnostics>
```
This gives rise to the question as to how we determine which properties (or
data) are able to be reasoned about statically.
- Dependent types in Enso will desugar to an application of Quantitative Type
Theory.
> The actionables for this section are:
>
> - Specify how we want dependency to behave in a _far more rigorous_ fashion.
## Automating the Proof Burden
Even with as capable and simple a dependently-typed system as that provided by
Enso, there is still a burden of proof imposed on our users that want to use
these features. However, the language [F*](https://www.fstar-lang.org/) has
pioneered the combination of a dependently-typed system with an SMT solver to
allow the automation of many of the simpler proofs.
- The Enso typechecker will integrate an aggressive proof rewrite engine to
automate as much of the proof burden as possible.
> The actionables for this section are:
>
> - What is the impact of wanting this on our underlying type theory?
> - Where and how can we draw the boundary between manual and automated proof?
> - How much re-writing can we do (as aggressive as possible) to assist the SMT
> solver and remove as much proof burden as possible.

View File

@ -0,0 +1,98 @@
---
layout: developer-doc
title: Dynamic Dispatch
category: types
tags: [types, dispatch]
order: 6
---
# Dynamic Dispatch
Enso is a language that supports pervasive dynamic dispatch. This is a big boon
for usability, as users can write very flexible code that still plays nicely
with the GUI.
The current implementation of Enso supports single dispatch (dispatch purely on
the type of `this`), but there are broader visions afoot for the final
implementation of dynamic dispatch in Enso.
> The actionables for this section include:
>
> - Determining whether we want to support proper multiple dispatch in the
> future. This is important to know as it has implications for the type
> system, and the design of the dispatch algorithm.
<!-- MarkdownTOC levels="2,3" autolink="true" -->
- [Specificity](#specificity)
- [Multiple Dispatch](#multiple-dispatch)
<!-- /MarkdownTOC -->
## Specificity
In order to determine which of the potential dispatch candidates is the correct
one to select, the compiler needs to have a notion of _specificity_, which is
effectively an algorithm for determining which candidate is more specific than
another.
- Always prefer a member function for both `x.f y` and `f y x` notations.
- Only member functions, current module's functions, and imported functions are
considered to be in scope. Local variable `f` could not be used in the `x.f y`
syntax.
- Selecting the matching function:
1. Look up the member function. If it exists, select it.
2. If not, find all functions with the matching name in the current module and
all directly imported modules. These functions are the _candidates_.
3. Eliminate any candidate `X` for which there is another candidate `Y` whose
`this` argument type is strictly more specific. That is, `Y` this type is a
substitution of `X` this type but not vice versa.
4. If not all of the remaining candidates have the same this type, the search
fails.
5. Eliminate any candidate `X` for which there is another candidate `Y` which
type signature is strictly more specific. That is, `Y` type signature is a
substitution of `X` type signature.
6. If exactly one candidate remains, select it. Otherwise, the search fails.
> The actionables for this section are as follows:
>
> - THE ABOVE VERSION IS OLD. NEEDS UPDATING.
> - The definition of specificity for dispatch candidates (including how it
> interacts with the subsumption relationship on typesets and the ordering of
> arguments).
## Multiple Dispatch
It is an open question as to whether we want to support proper multiple dispatch
in Enso. Multiple dispatch refers to the dynamic dispatch target being
determined based not only on the type of the `this` argument, but the types of
the other arguments to the function.
To do multiple dispatch properly, it is very important to get a rigorous
specification of the specificity algorithm. It must account for:
- The typeset subsumption relationship.
- The ordering of arguments.
- How to handle defaulted and lazy arguments.
- Constraints in types. This means that for two candidates `f` and `g`, being
dispatched on a type `t` with constraint `c`, the more specific candidate is
the one that explicitly matches the constraints. An example follows:
```ruby
type HasName
name : String
greet : t -> Nothing in IO
greet _ = print "I have no name!"
greet : (t : HasName) -> Nothing in IO
greet t = print 'Hi, my name is `t.name`!'
type Person
Pers (name : String)
main =
p1 = Person.Pers "Joe"
greet p1 # Hi, my name is Joe!
greet 7 # I have no name
```
Here, because `Person` conforms to the `HasName` interface, the second `greet`
implementation is chosen because the constraints make it more specific.

67
docs/types/errors.md Normal file
View File

@ -0,0 +1,67 @@
---
layout: developer-doc
title: Errors
category: types
tags: [types, errors]
order: 12
---
# Errors
Enso supports two notions of errors. One is the standard asynchronous exceptions
model, while the other is a theory of 'broken values' that propagate through
computations.
> The actionables for this section are:
>
> - Greatly expand on the reasoning and theory behind the two exception models.
> - Explain why broken values serve the GUI well.
> - Explain how this can all be typed.
<!-- MarkdownTOC levels="2,3" autolink="true" -->
- [Async Exceptions](#async-exceptions)
- [Broken Values](#broken-values)
<!-- /MarkdownTOC -->
## Async Exceptions
> The actionables for this section are:
>
> - Formalise the model of async exceptions as implemented.
## Broken Values
In Enso we have the notion of a 'broken' value: one which is in an invalid state
but not an asynchronous error. While these may initially seem a touch useless,
they are actually key for the display of errors in the GUI.
Broken values can be thought of like checked monadic exceptions in Haskell, but
with an automatic propagation mechanism:
- Broken values that aren't handled explicitly are automatically promoted
through the parent scope. This is trivial inference as no evidence discharge
will have occurred on the value.
```ruby
open : String -> String in IO ! IO.Exception
open = ...
test =
print 'Opening the gates!'
txt = open 'gates.txt'
print 'Gates were opened!'
7
```
In the above example, the type of test is inferred to
`test : Int in IO ! IO.Exception`, because no evidence discharge has taken
place as the potential broken value hasn't been handled.
- This allows for very natural error handling in the GUI.
> The actionables for this section are:
>
> - Determine what kinds of APIs we want to use async exceptions for, and which
> broken values are more suited for.
> - Ensure that we are okay with initially designing everything around async
> exceptions as broken values are very hard to support without a type checker.
> - Initially not supported for APIs.

38
docs/types/evaluation.md Normal file
View File

@ -0,0 +1,38 @@
---
layout: developer-doc
title: Evaluation and Typing
category: types
tags: [types, evaluation]
order: 9
---
# Evaluation and Typing
Enso is a language that has strict semantics by default, but it can still be
very useful to be able to opt-in to suspended computations (thunks) for the
design of certain APIs.
To that end, Enso provides a mechanism for this through the type system. The
standard library defines a `Suspend a` type which, when used in explicit type
signatures, will cause the corresponding expression to be suspended.
- The explicit calls to `Suspend` and `force` are inserted automatically by the
compiler doing demand analysis.
- This demand analysis process will also ensure that there are not polynomial
chains of suspend and force being inserted to ensure performance.
> The actionables for this section are as follows:
>
> - Specify this much better.
<!-- MarkdownTOC levels="2,3" autolink="true" -->
- [Specifying Suspension in the Type System](#specifying-suspension-in-the-type-system)
<!-- /MarkdownTOC -->
## Specifying Suspension in the Type System
> The actionables for this section are:
>
> - Actually specify how the type system interacts with eager and lazy
> evaluation.

View File

@ -0,0 +1,127 @@
---
layout: developer-doc
title: Function Types
category: types
tags: [types, functions]
order: 3
---
# Function Types
As a functional programming language, the type of functions in Enso (`->`) is
key. There are a few things that should be noted about functions in Enso.
> The actionables for this section:
>
> - Work out a full specification for function typing and behaviour.
> - Calling a function with an upper-case letter instantiates all of its type
> arguments to free type variables.
<!-- MarkdownTOC levels="2,3" autolink="true" -->
- [Scoping](#scoping)
- [Structural Type Shorthand](#structural-type-shorthand)
- [Function Composition](#function-composition)
<!-- /MarkdownTOC -->
## Scoping
Enso's scoping rules should be fairly familiar to most other programming
languages, as it uses standard lexical scoping with shadowing. However, there
are a few unique points to keep in mind:
- There is no separation between the type and term levels. This means that, for
a given lexical scope, type, function and lambda variables exist in the same
name space.
- This means that if different names are used on the type and value level to
refer to the same entity, both names are valid.
- Name clashes between different entities are not allowed.
- Shadowing between different lexical scopes occurs as standard.
- Shadowing in the same lexical scope may only occur in a chain of lambdas, and
in such a case the shadowed variables must unify.
- Variables from the body are accessible in the type signature.
- Variables from the type signature are accessible in the body.
Scope lookup proceeds from the function body outwards, as is standard for
lexical scoping. For a detailed discussion of scoping, please see
[the scoping documentation](../semantics/scoping.md).
## Structural Type Shorthand
In Enso, we want to be able to write a type-signature that represents types in
terms of the operations that take place on the input values. A classical example
is `add`:
```ruby
add : a -> b -> b + a
add = a -> b -> b + a
```
There are a few things to note about this signature from a design standpoint:
- `a` and `b` are not the same type. This may, at first glance, seem like a
signature that can't work, but the return type, in combination with our
integrated `Convertible` mechanism gives us the tools to make it work.
- The return type is `a + b`. This is a shorthand expression for a detailed
desugaring. The desugaring provided below is what the typechecker would infer
based on such a signature.
```ruby
add : forall a b c d. ({+ : Convertible b c => a -> c -> d} <: a) => a -> b -> d
```
This may look fairly strange at first, but we can work through the process as
follows:
1. The expression `b + a` is syntactic sugar for a method call on a: `a.+ b`.
2. This means that there must be a `+` method on a that takes both an `a` and a
`b`, with return-type unspecified as of yet: `+ : a -> b -> ?`
3. However, as `Convertible` is built into the language, we have to consider
that for `a.+ b` to work, the `+` method can actually take any type to which
`b` converts. This introduces the constraint `Convertible b c`, and we get
`+ : a -> c -> ?`
4. The return type from a function need not be determined by its arguments, so
hence in the general case we have to default to an unrestricted type variable
giving `+ a -> c -> d`.
5. This method must exist on the type `a`, resulting in the constraint that the
row `{+ : a -> c -> d} <: a` must conform to that interface.
6. We now know the return type of `a + b`, and can rewrite it in the signature
as `d`.
Please note that `a <: b` (which can be flipped as `:>`) means that `a` is a row
that is a sub-row contained within the row `b`. The containment relation allows
for the possibility that `a == b`.
The inferred type for any function should, in general, match the given type
signature. Cases where this break down should only exist where the type
signature acts to constrain the function type further than would be inferred.
> The actionables for this section are:
>
> PLEASE NOTE. THIS IS OUT OF DATE.
>
> - Clean it up and work out how it applies in light of the new developments of
> typesets.
## Function Composition
Enso introduces a function composition operator which composes functions after
all arguments have been applied. This operator is `>>` (and its backwards cousin
`<<`). It takes a function `f` with `n` arguments, and a function `g` with `m`
arguments, and the result consumes `n` arguments, applies them to `f`, and then
applies the result of that plus any additional arguments to `g`.
```ruby
computeCoeff = (+) >> (*5)
doThing = (+) >> (*)
```
In addition, we have the operator `.`, which acts as standard forward function
chaining in Enso, and its backwards chaining cousin `<|`.
> The actionables from this section are:
>
> - Examples for the more advanced use-cases of `>>` to decide if the type
> complexity is worth it.
> - Otherwise, standardise on using `>>` and `<<` for standard function
> composition:
> + `<< : (b -> c) -> (a -> b) -> a -> c` - backwards composition (standard)
> + `>> : (a -> b) -> (b -> c) -> a -> c` - forwards composition

38
docs/types/goals.md Normal file
View File

@ -0,0 +1,38 @@
---
layout: developer-doc
title: Goals for the Enso Type System
category: types
tags: [types, goals, design]
order: 1
---
# Goals for the Enso Type System
In our design for Enso, we firmly believe that the type system should be able to
aid the user in writing correct programs, far and above anything else. However,
with so much of our targeted user-base being significantly non-technical, it
needs to be as unobtrusive as possible.
<!-- MarkdownTOC levels="2,3" autolink="true" -->
- [High-Level Goals](#high-level-goals)
<!-- /MarkdownTOC -->
## High-Level Goals
The high-level goals for the Enso type system are as follows:
- Inference should have maximal power. We want users to be _forced_ to write
type annotations in as few situations as possible. This means that, ideally,
we are able to infer higher-rank types and make impredicative instantiations
without annotations.
- Error messages must be informative. This is usually down to the details of the
implementation, but we'd rather not employ an algorithm that discards
contextual information that would be useful for crafting useful errors.
- Dependent types are a big boon for safety in programming languages, allowing
the users that _want to_ to express additional properties of their programs
in their types. We would like to introduce dependent types in future, but
would welcome insight on whether it is perhaps easier to do so from the get
go. If doing so, we would prefer to go with `Type : Type`.
- Our aim is to create a powerful type system to support development, rather
than turn Enso into a research language. We want users to be able to add
safety gradually.

515
docs/types/hierarchy.md Normal file
View File

@ -0,0 +1,515 @@
---
layout: developer-doc
title: The Enso Type Hierarchy
category: types
tags: [types, hierarchy, typeset, atom]
order: 2
---
# The Enso Type Hierarchy
Enso is a statically typed language based upon a theory of set-based typing,
what we call `typesets`. This is a novel approach, and it is key to our intent
for Enso to _feel_ like a dynamic language while still bringing enhanced safety.
<!-- MarkdownTOC levels="2,3" autolink="true" -->
- [Typeset Theory](#typeset-theory)
- [Atoms](#atoms)
- [Typesets](#typesets)
- [Typeset Operators](#typeset-operators)
- [Typeset Subsumption](#typeset-subsumption)
- [Unsafe Typeset Field Mutation](#unsafe-typeset-field-mutation)
- [Interfaces](#interfaces)
- [Special Interfaces](#special-interfaces)
- [Type Ascription](#type-ascription)
- [Scoping in Type Ascription](#scoping-in-type-ascription)
- [Projections](#projections)
- [Special Fields](#special-fields)
<!-- /MarkdownTOC -->
## Typeset Theory
- All types are denoted by a set of constructors, which represent the atomic
values of that type. We call these 'atoms'. For example, the typeset `Nat` is
made up of the atoms `1, 2, 3, ...` and so on.
- Constructors are grouped into typesets.
- These typesets are arranged into a modular lattice:
+ The type `Any` is the typeset of all typesets.
+ The type `Void` is the empty typeset.
+ All atoms are typesets, but not all typesets are atoms.
+ This lattice is ordered using the `<:` subsumption judgement. For more
information please see [typeset subsumption](#typeset-subsumption).
All in all, this means that a value in Enso can have myriad different types
attributed to it, even though these may vary greatly in the level of
specificity.
```ruby
7 : 7 : Natural : Integer : Number : Any : Any : ...
```
A brief note on naming (for more, please see the
[naming syntax](../syntax/naming.md):
- Naming in Enso is case-insensitive.
- In contexts where it is ambiguous as to whether the user would be referring to
a fresh name or a name already in scope, capitalisation is used to determine
which is meant.
- An uncapitalised identifier is assumed to be fresh, while a capitalised
identifier is assumed to be in scope.
## Atoms
Atoms are the fundamental building blocks of types in Enso, so named because
they are small units of 'type', but nonetheless may be separated further. In
Enso's type hierarchy, atoms are the typesets that are unified _nominally_,
meaning that every atom has a discrete identity.
- Atoms can be thought of as the 'values' of Enso's type system. Formally an
atom is a product type with named fields, where the fields are polymorphic.
- Atoms have _unique identity_. This means that while a typeset may subsume an
atom, an atom can never be subsumed by another atom, even if it has the same
number of fields with the same names.
- An atom, thus, can only unify with a site expecting that atom, or some
anonymous typeset.
The following are all examples of atoms with a usage example:
```ruby
type Nothing
type Just value
atom Vec3 x y z
atom Vec2 x y
v = V3 1 2 3 : V3 1 2 3 : V3 Int Int Int : V3 Any Any Any : Any
```
## Typesets
Typesets in Enso are an entity unique to Enso's type system. They are a
fundamental recognition of types as 'sets of values' in Enso, and while they
share some similarities with records they are a far more general notion.
- A typeset is an entity that contains one or more labels.
- Each label has a type, which _may_ be explicitly ascribed to it.
- Each label may have a default value provided for it.
- A label may be _duplicated_, as long as the types of the duplicate labels are
different.
The other key notion of typesets is that typesets are matched _structurally_,
subject to the rules for nominal typing of atoms discussed above.
The definition for a typeset member uses the syntax `label : type = val`, where
the following rules apply:
- If a default is explicitly requested it becomes part of the subsumption
judgement (see [below](#typeset-subsumption)).
- If the type of a label is omitted it is inferred from a default if present,
and is otherwise inferred to be `Any`.
- If only the type is present, auto-generated labels are provided using the
index of the member in the typeset (e.g `{ Int, String }.1` has type `Int`).
These labels act specially for unification.
- Labels are syntactically _names_, but internally may be other entities (e.g.
types).
- A typeset member is itself a typeset.
Typesets themselves can be declared in two ways:
1. **Anonymous:** An anonymous typeset can be declared using the curly-braces
syntax `{}`. Such a definition must contain zero or more typeset fields (see
above).
2. **Atoms:** An atom definition declares a typeset with a discrete identity,
using atom definition syntax. Atom fields must only be names.
Typesets can be combined using the [typeset operators](#typeset-operators)
defined below.
In addition, we provide syntactic sugar for defining typesets as described in
the syntax document. This syntax has a special desugaring that obeys the
following rules:
> Actually fill in these rules.
> Note that because things desugar to functions we can place arbitrary
constraints on initialisation (partial type constructors style).
```ruby
type Maybe a
Nothing
type Just (value : a)
isJust = case this of
Nothing -> False
Just _ -> True
nothing = not isJust
```
Becomes
```ruby
atom Just value
maybe a =
{ (Nothing | Just a), isJust: IsJust = isJust, nothing : Nothing = nothing }
Nothing.isJust : Maybe a -> Bool
Nothing.isJust this = case this of
Nothing -> False
Just _ -> True
Just.isJust : Maybe a -> Bool
Just.isJust this = case this of
Nothing -> False
Just _ -> True
Nothing.nothing : Maybe a -> Bool
Nothing.nothing = not isJust
Just.nothing : Maybe a -> Bool
Just.nothing = not isJust
```
> The actionables for this section are as follows:
>
> - Simplify this definition if we decide _not_ to support multiple dispatch.
### Typeset Operators
Enso defines a set of operations on typesets that can be used to combine and
manipulate them:
- **Subsumption - `<:`:** This operator expresses the relation between typesets
defined below in the section on [typeset subsumption](#typeset-subsumption).
- **Equality - `~`:** This operator expresses the relation that two typesets `a`
and `b` are equal (such that `a <: b && b <: a`).
- **Concatenation - `,`:** This operator combines multiple typesets, and is most
often used to combine typeset fields. It expresses product types.
- **Union - `|`:** This operator creates a typeset that contains all the types
in the union of its operand typesets.
- **Intersection - `&`:** This operator creates a typeset that contains all the
types in the intersection of its operand typesets.
- **Subtraction - `\`:** This operator creates a typeset that contains all the
types in its first operand that are _not_ also contained in its second
operand.
- **Function - `->`:** This operator creates a mapping from one typeset to
another, and its result is a typeset containing all the possible types of that
mapping.
Any use of these operators introduces typing evidence which may later be
discharged through pattern matching.
> The actionables for this section are:
>
> - When necessary, we need to _explicitly formalise_ the semantics of all of
> these operators.
> - When do applications of these constructors create matchable (injective and
> generative) types?
### Typeset Subsumption
For two typesets `a` and `b`, `a` is said to be subsumed by `b` (written using
the notation `a <: b`) if the following hold recursively. This can be thought of
as a 'can behave as' relation.
1. `a` contains a subset of the labels in `b`. It should be noted that `a` is
not _limited to_ being a subset of the labels in `b`.
2. For each label in `a`, the type of that label `t` is subsumed by the type
`q` of the corresponding label in `b`. That is, `t <: q`, defined as
follows:
1. If both `t` and `q` are atoms, then it holds only if `t` and `q` are the
same atom (have the same identity).
2. If `t` is an atom, then it holds only if the fields in `t` are subsumed
by `q`.
3. If either `t` or `q` is a function type but not _both_ `t` and q are
function types, then the relation does not hold.
4. If both `t` and `q` are function types, then the relation holds if:
- If `t` contains defaulted arguments, not present in `q`, then these
can be ignored for the purposes of determining whether `t <: q`. For
example, `f : a -> b = x -> c` is subsumed by `f : a -> c`.
- For the _argument_ position of both `t` and `q`, `t.arg <: q.arg`
(the argument position is covariant).
- For the _return_ position of both `t` and `q`, if it is not a function
type, then `t.ret <: q.ret` (the return position is covariant). If it
is a function type then recurse.
5. If the types have constraints then the constraints must match. A
constraint is simply an application of the `<:` relation.
6. The types both have the same relevance and visibility (in the dependent
sense).
3. For any typeset `a`, the relation `a <: Any` always holds, and the converse
`Any <: a` never holds.
4. For any typeset `a`, the relation `a <: Void` never holds, and the converse
`Void <: a` always holds.
Two typesets `A` and `B` are defined to be structurally equal if `A <: B` and
`B <: A`.
> The actionables for this section are as follows:
>
> - Fix the above. It isn't 100% correct, but should convey a general gist. Use
> examples including all the operators.
> - Ensure that co- and contra-variance are handled properly. They are a bit
> odd under this theory.
> - Do we need explicit variance annotations?
> - How do constraints factor in?
> - We want users not to have to think about the difference between `~`, `:` and
> `<:` so we need to work out if we can elide them from the surface language.
> This requires considering polymorphic function arguments, partial data,
> qualified function types and variable definitions.
> - Reformulate this in terms of row polymorphism. We really want to avoid a
> _real_ subtyping relationship as it doesn't play at all well with global
> inference.
> - To that end, it is an open question as to whether we can have type unions
> without subtyping. Conventionally we wouldn't be able to, but with our
> theory we may.
### Unsafe Typeset Field Mutation
For performance it is sometimes necessary to have the ability to _directly_
_mutate_ the field of a typeset. This is most often necessary for atoms
themselves, but as atoms are typesets this also applies.
- We define a method `setField : (field : Name) -> (value : Any) -> Nothing`
that performs in-place field mutation of the field `field` to set its value to
`any`.
- In order to prevent this from being used flippantly, this functionality is
marked `unsafe` (see [access modifiers](./access-modifiers.md) for more).
## Interfaces
Because typesets can be matched _structurally_, all typesets implicitly define
interfaces. A type `t` conforming to an interface `i` in Enso is as simple as
the relation `i <: t` (as in [typeset subsumption](#typeset-subsumption))
holding.
This means that types need not _explicitly_ implement interfaces, which can be
thought of as a form of static duck typing. However, when defining a new type,
users may choose to explicitly state that it defines an interface. This has
two main benefits:
- We can include default implementations from the interface definition.
- We can provide better diagnostics in the compiler as we can point to the
definition site instead of the use site.
```ruby
type HasName
name: String
name = "unnamed"
type Vector a
this: HasName
V2 x:a y:a
V3 x:a y:a z:a
name (this:Int) = "IntName"
greet (t:HasName) = print 'Hi, my name is `t.name`'
main =
greet (V3 1 2 3)
greet 8
```
As an aside, it should be noted that the nature of Enso's typesets means that it
is easy to express far more general interfaces than Haskell's typeclasses can.
### Special Interfaces
In order to aid usability we include a few special interfaces in the standard
library that have special support in the compiler.
#### Wrapper
In a language where composition is queen and inheritance doesn't exist there
needs to be an easy way for users to compose typesets without having to define
wrappers for the contained types. This is a big usability bonus for Enso.
```ruby
type Wrapper
wrapped : (lens s t a b) this.unwrapped
unwrapped : t
unwrapped = t # Default implementation based on inferred type.
```
`Wrapper` is an interface implemented implicitly for all typesets, and boils
down to delegating to the contained members if a given label is not found on
the top typeset. This delegation only occurs on the this type.
A usage example is as follows:
```ruby
type HasName a
this:Wrapper # The field 'unwrapped' uses default implementation.
type Cons
name : String
wrapped : a
test i:Int = i + 1
main =
p1 = HasName.Cons "Zenek" 7
p2 = p1 + 1 # OK, uses `wrapped` lens.
print $ p2.name # OK
print $ test p1 # OK, uses `wrapped` lens.
```
#### Convertible
Also useful in a language for data science is the ability to have the compiler
help you by automatically converting between types that have sensible coercions
between them. This interface is known as `Convertible`, and defines a one-way
conversion between a type `a` and a type `b`.
```ruby
Convertible t
to : t -> t
```
There are a few key points of this design that must be considered carefully:
- This interface _only_ applies when implemented explicitly by the type. The
compiler will not automatically generate implementations for `Convertible t`.
- It is very important for the conversions to be inserted automatically in the
correct place. If a conversion is required in the body of a block, the point
at which the conversion takes place should propagate outwards as far as
possible. This is very important for proper definition of controls in the GUI.
- `Convertible t` can also be implemented by a function that accepts arguments
_iff_ all of the arguments have default values associated with them. In such
a case, the GUI should display conversion controls with a checkbox that, when
checked, can be converted to an explicit conversion call.
- We will need some limited mechanism for doing this even without type inference
as it forms the backbone of good API design for the graphical interface. This
is because polymorphic functions are much harder to support with graphical
controls.
An example use-case is as follows:
```ruby
type Vector a
type V3 x:a y:a z:a
this : Convertible String
to = 'V3 `this.x` `this.y` `this.z`'
this : Convertible (a: Semigroup)
to = this.x + this.y + this.z
test: Int -> String
test i = print 'I got the magic value `i`!'
main =
test 7 # OK!
test 'hi' # FAIL: type mismatch, no definition `Convertible Int` for String.
test (Vector.V3 1 2 3) # OK, prints 'I got the magic value 6'.
```
> The actionables for this section are:
>
> - Work out the semantics for working with this only for `this` in the dynamic
> case.
> - Work out how best the type system can expose the conversion so that you can
> see the additional defaulted parameters and conversion types in the UI.
#### Destruct
While it is a common idiom in functional languages to implement the `bracket`
pattern for acquiring and releasing resources, but this isn't such a good fit
for a language where many users aren't going to be used to thinking about
resources.
Instead, we have the final of our special traits, called `Destruct`, defined as
follows:
```ruby
type Destruct
destroy : This -> Nothing
```
For those coming from Rust, C++, or another language which uses the RAII
principle, this should be a familiar sight. It works as follows:
1. All types automatically provide a trivial implementation for `destroy`. This
will recursively call the destructors of the type's members.
2. A type definition may provide a non-default implementation for `destroy`,
such that it implements more complex behaviour.
3. When a type goes out of scope, its `destroy` method is called, allowing it
to clean up any resources that it owns.
Initially, going out of scope will be defined as the point at which the
instance is garbage collected, while later, once we are able to perform more
sophisticated analysis, it will instead be defined as the point at which the
instance's lexical lifetime ends.
It should be noted, however, that a type that implements an explicit `destroy`
method should still implement explicit methods for resource handling as lexical
lifetimes are not always sufficient (e.g. a socket that you may want to close
and re-open in the same block).
> The actionables for this section are:
>
> - Determine how this interacts with copying and moving.
## Type Ascription
Like all statically-typed programming languages, Enso provides the means for the
user to ascribe a type to a value. In Enso, this is done using the `:` operator.
An expression `a : b` says that the expression denoted by `a` has the type
denoted by the expression `b`. However, unlike in many programming languages,
the types ascribed to values in Enso are not the be-all and end-all.
- The Enso compiler is free to infer a _more specific_ type than the one
provided in a type signature. If this is the case, then the actual type of the
value is the inferred type rather than the provided type.
```ruby
a = 17 : Int | Text
b = a + 1 # This is not an error
```
- If the Enso compiler would infer a _more general_ type than the one provided
in a type signature, then the signature _constrains_ the allowable type of the
value.
- Type signatures must be subsumed by the inferred type of the value, otherwise
the compiler will raise an error. This includes
Additionally, as Enso is a dependently-typed language, the expression `b` may
contain arbitrary Enso expressions. The type-checking of such signatures is
discussed further in the section on [dependency](./dependent-typing.md).
### Scoping in Type Ascription
Enso intends to support some form of mutual scoping between the left and right
sides of the type ascription operator. This introduces some complexity into the
typechecker but brings some significant benefits.
- It is common in dependently-typed languages for the signature to balloon
significantly as you add more constraints to your program.
- To this end, Enso wants to support a sharing of scopes between the top-level
scope of the type signature's right-hand-side, and the top-level scope of the
signature's left hand side.
- This will allow users to move complexity _out_ of the type signature and into
the body of the expression, aiding code clarity.
- It does, however, introduce some significant complexity around recursive
bindings in groups, and the desugaring needs to depend on combinations of
`>>=` and `fix`.
## Projections
In order to work efficiently with typesets, we need the ability to seamlessly
access and modify (immutably) their properties. In the context of our type
theory, this functionality is known as a _projection_, in that it projects a
value from (or into) a typeset.
Coming from Haskell, we are well-versed with the flexibility of lenses, and
more generally _optics_. To that end, we base our projection operations on
standard theories of optics. While we _do_ need to formalise this, for now we
provide examples of the expected basic usage. This only covers lenses, while in
the future we will likely want prisms and other more-advanced optics.
A projection is generated for each field of a typeset.
> Actionables for this section:
>
> - Work out whether standard optics theory with custom types is sufficient for
> us. We may want to support side effects.
> - Determine how much of the above we can support without a type-checker. There
> are likely to be a lot of edge-cases, so it's important that we make sure we
> know how to get as much of it working as possible.
> - How (if at all) do lenses differ for atoms and typesets?
### Special Fields
We also define special projections from typesets:
- `index`: The expression `t.n`, where `n` is of type `Number` is translated to
`t.index n`.
- `field`: The expression `t.s` where `s` is of type `Text` is translated to
`t.fieldByName s`.

View File

@ -0,0 +1,71 @@
---
layout: developer-doc
title: Inference and Checking
category: types
tags: [types, inference, checking, compiler]
order: 13
---
# Inference and Checking
As a statically-typed language, Enso is built with a sophisticated type checker
capable of reasoning about a fully dependently-typed system. However, a type
checker on its own is quite useless. For Enso to truly be usable, it must also
have a powerful type inference engine.
> The actionables for this section are:
>
> - Work out how on earth we do inference and how we maximise inference power.
> - Do we want to provide a way to reason about the _runtime representation_ of
> types? This is 'Levity Polymorphism' style.
> - We want error messages to be as informative as possible, and are willing to
> retain significant extra algorithmic state in the typechecker to ensure that
> they are. This means both _formatting_ and _useful information_.
> - It is going to be important to retain as much information as possible in
> order to provide informative error messages. This means that the eventual
> algorithm is likely to combine techniques from both W and M
> (context-insensitive and context-sensitive respectively).
<!-- MarkdownTOC levels="2,3" autolink="true" -->
- [Maximal Inference Power](#maximal-inference-power)
- [Type Inference Algorithm](#type-inference-algorithm)
- [Inferring Dependency](#inferring-dependency)
- [Type Checking Algorithm](#type-checking-algorithm)
<!-- /MarkdownTOC -->
## Maximal Inference Power
In order to make Enso's type inference as helpful and friendly as possible to
our users, we want the ability to infer the _maximal subset_ of the types that
Enso can express.
> The actionables for this section are:
>
> - How do we do inference for higher-rank and impredicative instantiations.
> - How do we infer contexts, and how do we make that inference granular (e.g.
> `IO.Read`, `IO.Write`, rather than just `IO`).
> - How do we propagate inference information as far as possible?
> - If it comes to a tension between typechecker speed and inference capability,
> Enso will err on the side of inference capability in order to promote ease
> of use. Speed will be increased by performing incremental type-checking
> where possible on subsequent changes.
> - Where are we okay requiring annotations? Polymorphic recursion, higher rank
> function parameters, constrained data and dependency?
## Type Inference Algorithm
> The actionables for this section are:
>
> - Specify the inference algorithm.
### Inferring Dependency
> The actionables for this section are:
>
> - Specify how (if at all) we can infer dependent quantifiers.
## Type Checking Algorithm
> The actionables for this section are:
>
> - Specify the type checking algorithm.

79
docs/types/modules.md Normal file
View File

@ -0,0 +1,79 @@
---
layout: developer-doc
title: Modules
category: types
tags: [types, modules]
order: 7
---
# Modules
With such a flexible type system in Enso, the need for making modules
first-class is obviated. Instead, a module is very much its own entity, being
simply a container for bindings (whether they be functions, methods, atoms, or
more generic typesets).
> The actionables for this section are:
>
> - Characterise modules in more depth as we need them.
<!-- MarkdownTOC levels="2,3" autolink="true" -->
- [Resolving Name Clashes](#resolving-name-clashes)
- [Scoping and Imports](#scoping-and-imports)
<!-- /MarkdownTOC -->
## Resolving Name Clashes
Enso modules employ the following rules in order to avoid name clashes:
- Where the module name clashes with a member contained in the module, the
member is preferred. If you need the module you must import it qualified under
another name.
- We provide the alias `here` as a way to access the name of the current module.
## Scoping and Imports
To use the contents of a module we need a way to bring them into scope. Like
most languages, Enso provides an _import_ mechanism for this. Enso has four
different kinds of imports that may be combined freely, all of which take a
module path as their first argument.
1. **Unqualified Imports:** These import all symbols from the module into the
current scope (`import M`).
2. **Qualified Imports:** These import all symbols from the module into the
current scope with symbols qualified under a name _different_ from the
module name (`import M as T`).
3. **Restricted Imports:** These import only the specific symbols from the
module into the current scope (`import M only sym1 sym2`).
4. **Hiding Imports:** These are the inverse of restricted imports, and import
_all_ symbols other than the named ones into the current scope
(`import M hiding sym1 sym2`),
Imports may introduce ambiguous symbols, but this is not an error until one of
the ambiguous symbols is used in user code.
When importing a module `X` into the current module `Y`, the bindings in `X`
become available in `Y` (modified by the import type). However, these bindings
are _not_ available in `Y` externally. This means that we need a re-export
mechanism. Similarly to imports, this has four kinds, all of which take a module
path as their first argument, and all of which _may_ introduce the module it
exports into scope (if it is not already imported).
1. **Unqualified Exports:** These export all symbols from the module as if they
were defined in the exporting module (`export X`).
2. **Qualified Exports:** These export all symbols from the module as if they
were defined in another module accessible in the exporting module
(`export X as Y`).
3. **Restricted Exports:** These export only the specified symbols from the
module as if they were defined in the exporting module (`export X only sym`)
4. **Hiding Exports:** These export all symbols from the module except those
explicitly specified (`export X hiding sym1 sym2`).
Exports effectively act to 'paste' the contents of the exported module into the
module declaring the export. This means that exports that create name clashes
must be resolved at the source.
> The actionables for this section are:
>
> - Are we _really, really_ sure we want unqualified by default?
> - Think about how to handle imports properly in the type checker. What, if
> they have any, are the impacts of imports on inference and checking?

33
docs/types/parallelism.md Normal file
View File

@ -0,0 +1,33 @@
---
layout: developer-doc
title: Automated Parallelism Analysis
category: types
tags: [types, parallelism]
order: 10
---
# Automated Parallelism Analysis
The nature of Enso's type system provides the compiler with a significant
amount of information about the program as it runs. This information can be
exploited by the compiler to automatically parallelise some sections of Enso
programs.
<!-- MarkdownTOC levels="2,3" autolink="true" -->
- [Supporting Parallelism Analysis with Types](#supporting-parallelism-analysis-with-types)
- [Constructs That Can Be Parallelised](#constructs-that-can-be-parallelised)
<!-- /MarkdownTOC -->
## Supporting Parallelism Analysis with Types
> The actionables for this section are:
>
> - Work out how the type checker can support parallelism analysis.
## Constructs That Can Be Parallelised
> The actionables for this section are:
>
> - Provide an analysis of the language constructs that could automatically be
> parallelised, and the typing predicates that make them so.

View File

@ -0,0 +1,73 @@
---
layout: developer-doc
title: Pattern Matching
category: types
tags: [types, pattern-matching]
order: 5
---
# Pattern Matching
Pattern matching in Enso works similarly to as you would expect in various other
functional languages. Typing information is _always_ refined in the branches of
a case expression, which interacts well with dependent typing and type-term
unification.
> The actionables for this section :
>
> - How do we type pattern matching?
> - Exactly how (and what) evidence is discharged during matching?
> - How can we use bijective applications of the
> [typeset operators](/hierarchy.md#typeset-operators) to perform pattern
> matching?
> - How do we extend this to structural matching in general on typesets?
<!-- MarkdownTOC levels="2,3" autolink="true" -->
- [Positional Matching](#positional-matching)
- [Type Matching](#type-matching)
- [Name Matching on Labels](#name-matching-on-labels)
- [Naming Scrutinees](#naming-scrutinees)
<!-- /MarkdownTOC -->
## Positional Matching
Matching on the scrutinee by structure. This works both for atoms and typesets
(for typesets it is a subsumption judgement).
```ruby
type Vector a
V2 x:a y:a
V3 x:a y:a z:a
v = Vector.V3 x y z
case v of
Vector.V3 x y z -> print x
```
## Type Matching
Matching purely by the types involved, and not matching on structure.
```ruby
case v of
Vector.V3 -> print v.x
```
## Name Matching on Labels
Matching on the labels defined within a type for both atoms and typesets, with
renaming.
```ruby
case v of
Vector.V3 {x y} -> print x
{x} -> print x
```
## Naming Scrutinees
Ascribing a name of a scrutinee is done using the standard typing judgement.
This works due to the type-term unification present in Enso.
```ruby
case _ of
v : Vector.V3 -> print v,x
```

81
docs/types/references.md Normal file
View File

@ -0,0 +1,81 @@
---
layout: developer-doc
title: References
category: types
tags: [types, references]
order: 15
---
# References
This file contains a variety of useful references for those wanting to
understand the concepts and research that underlie Enso's type system and type
theory.
<!-- MarkdownTOC levels="2,3" autolink="true" -->
- [Dependent Types](#dependent-types)
- [Gradual Typing](#gradual-typing)
- [Maximum Inference Power](#maximum-inference-power)
- [Monadic Contexts](#monadic-contexts)
- [Refinement Typing and Compiler Assistance](#refinement-typing-and-compiler-assistance)
- [Rows](#rows)
- [Types and Performance](#types-and-performance)
- [Usability](#usability)
<!-- /MarkdownTOC -->
## Dependent Types
- [A Classical Sequent Calculus for Dependent Types](https://hal.inria.fr/hal-01519929/document)
- [Dependent Information Flow Types](http://ctp.di.fct.unl.pt/~luisal/resources/popl15-paper187.pdf)
- [Dependent Intersection: A New Way of Defining Records in Type Theory](https://ieeexplore.ieee.org/document/1210048)
- [Dependent Types and Monadic Effects in F*](https://www.fstar-lang.org/papers/mumon/)
- [Dependent Types in Haskell: Theory and Practice](https://cs.brynmawr.edu/~rae/papers/2016/thesis/eisenberg-thesis.pdf)
- [Dynamic Typing with Dependent Types](https://link.springer.com/chapter/10.1007/1-4020-8141-3_34)
- [Handling Delimited Continuations with Dependent Types](https://dl.acm.org/doi/10.1145/3236764)
- [Integrating Linear and Dependent Types](https://www.cl.cam.ac.uk/~nk480/dlnl-paper.pdf)
- [Irrelevance, Heterogeneous Equality, and Call-by-Value Dependent Type Systems](https://www.cis.upenn.edu/~sweirich/papers/msfp12prog.pdf)
- [Lightweight Invariants with Full Dependent Types](https://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.150.5717)
- [Normnalisation through Evaluation for Sized Dependent Types](https://core.ac.uk/display/94336601)
- [Practical Erasure in Dependently-Typed Languages](https://eb.host.cs.st-andrews.ac.uk/drafts/dtp-erasure-draft.pdf)
- [Resourceful Dependent Types](http://www.cse.chalmers.se/~abela/types18.pdf)
- [Syntax and Semantics of Quantitative Type Theory](https://bentnib.org/quantitative-type-theory.pdf)
- [Verifying Higher-Order Programs with the Dijkstra Monad](https://www.microsoft.com/en-us/research/publication/verifying-higher-order-programs-with-the-dijkstra-monad/)
## Gradual Typing
- [Approximate Normalisation for Gradual Dependent Types](https://arxiv.org/abs/1906.06469)
- [Gradual Type Theory](https://www.ccs.neu.edu/home/amal/papers/gtt.pdf)
- [Gradual Type-and-Effect Systems](https://pdfs.semanticscholar.org/fedf/ccecaa94d4bc502e9a7557b89a503fcb4b95.pdf)
## Maximum Inference Power
- [A Theory of Qualified Types](https://github.com/sdiehl/papers/blob/master/A_Theory_Of_Qualified_Types.pdf)
- [Boxy Type-Inference for Higher-Rank Types and Impredicativity](https://www.microsoft.com/en-us/research/publication/boxy-type-inference-for-higher-rank-types-and-impredicativity/)
- [Complete and Easy Bidirectional Typechecking for Higher-Rank Polymorphism](https://www.cl.cam.ac.uk/~nk480/bidir.pdf)
- [Flexible Types: Robust Type Inference for First-class Polymorphism](https://www.microsoft.com/en-us/research/publication/flexible-types-robust-type-inference-for-first-class-polymorphism/)
- [FPH: First-Class Polymorphism for Haskell](https://www.microsoft.com/en-us/research/publication/fph-first-class-polymorphism-for-haskell/)
- [MLF: Raising ML to the Power of System-F](http://gallium.inria.fr/~remy/work/mlf/icfp.pdf)
- [Practical Type Inference for Arbitrary-Rank Types](https://www.microsoft.com/en-us/research/publication/practical-type-inference-for-arbitrary-rank-types/)
- [QML: Explicit, First-Class Polymorphism for ML](https://www.microsoft.com/en-us/research/wp-content/uploads/2009/09/QML-Explicit-First-Class-Polymorphism-for-ML.pdf)
- [Wobbly Types: Type Inference for GADTs](https://www.microsoft.com/en-us/research/publication/wobbly-types-type-inference-for-generalised-algebraic-data-types/)
## Monadic Contexts
- [Supermonads](http://eprints.nottingham.ac.uk/36156/1/paper.pdf)
## Refinement Typing and Compiler Assistance
- [Abstract Refinement Types](http://goto.ucsd.edu/~rjhala/papers/abstract_refinement_types.pdf)
- [Liquid Types](https://patrickrondon.com/research/papers/rondon-liquid-types.pdf)
- [Towards a Provably Correct Encoding from F* to SMT](https://prosecco.gforge.inria.fr/personal/hritcu/students/alejandro/report.pdf)
## Rows
- [Abstracting Extensible Data Types](http://ittc.ku.edu/~garrett/pubs/morris-popl2019-rows.pdf)
- [Algebraic Subtyping](https://www.cl.cam.ac.uk/~sd601/thesis.pdf)
- [Extensible Records with Scoped Labels](https://www.microsoft.com/en-us/research/wp-content/uploads/2016/02/scopedlabels.pdf)
## Types and Performance
- [A Lazy Language Needs a Lazy Type System](https://www.researchgate.net/publication/311648324_A_Lazy_Language_Needs_a_Lazy_Type_System_Introducing_Polymorphic_Contexts)
- [Higher-Order Type-Level Programming in Haskell](https://www.microsoft.com/en-us/research/publication/higher-order-type-level-programming-in-haskell/)
- [Levity Polymorphism](https://cs.brynmawr.edu/~rae/papers/2017/levity/levity-extended.pdf)
- [Partial Type-Constructors](https://cs.brynmawr.edu/~rae/papers/2019/partialdata/partialdata.pdf)
- [Theory and Practice of Demand Analysis in Haskell](https://www.microsoft.com/en-us/research/wp-content/uploads/2017/03/demand-jfp-draft.pdf)
## Usability
- [Explaining Type Errors](https://repository.brynmawr.edu/compsci_pubs/80/)

View File

@ -0,0 +1,53 @@
---
layout: developer-doc
title: Type-Directed Programming
category: types
tags: [types, tooling]
order: 11
---
# Type-Directed Programming
When you have a language with as powerful a type system as Enso, you can
leverage the type system to provide users with tools to write their programs
based on types. This is an _advanced_ feature and is not expected to be used by
the novice, but it is nonetheless an important feature for working with a
powerful type system.
> The actionables for this section are:
>
> - Examine what other ways we can exploit type information to aid development.
<!-- MarkdownTOC levels="2,3" autolink="true" -->
- [Typed Holes](#typed-holes)
- [Case Splitting](#case-splitting)
- [Row Manipulation](#row-manipulation)
- [Dependent Sum Manipulation](#dependent-sum-manipulation)
<!-- /MarkdownTOC -->
## Typed Holes
> The actionables for this section are:
>
> - Determine how we want to support typed holes.
> - Determine the syntax for typed holes.
## Case Splitting
> The actionables for this section are:
>
> - Determine how we want to support case splitting.
> - Determine the tooling for case splitting.
## Row Manipulation
> The actionables for this section are:
>
> - Determine how we want to support row manipulation.
## Dependent Sum Manipulation
> The actionables for this section are:
>
> - Determine how we want to support dependent sum manipulation.