Design for Unison documentation and comments
This is a rough design of a way to supply commentary and formal documentation for Unison code. Discuss here and also be sure to view the raw markdown file for some embedded comments.
Comments
Comments in Unison can be either line comments or block comments. It’s probably only necessary to implement one of these for a first release of Unison, but ultimately we may want to offer both.
Line comments
Line comments can be introduced in code with a special token. For example, if we want Haskell-like syntax, the -- token introduces a comment:
foo x y =
  -- This is a comment
  x + y
Line comments follow these syntactic rules:
- A line comment must occupy the whole line. For simplicity, it’s a syntax error to put a comment at the end of a line that contains anything other than whitespace.
- The comment is attached to the abstract syntax tree node that is BEGUN by the token following the comment.
- When rendering comments, the indentation should be the same as the token that follows the comment.
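The rules above can be sketched in a few lines of Python. This is only an illustration, not a proposed implementation: the function name `attach_line_comments` is made up, "the token following the comment" is approximated by the next non-blank line, and the check for trailing comments is naive (it would also reject a `--` inside a string literal).

```python
from typing import List, Tuple


def attach_line_comments(source: str, marker: str = "--") -> List[Tuple[int, int, str]]:
    """Return (line number of the following code, its indentation, comment text)."""
    attached = []
    pending = []  # comments seen but not yet attached to anything
    for number, line in enumerate(source.splitlines(), start=1):
        stripped = line.strip()
        if stripped.startswith(marker):
            # A whole-line comment: hold it until we see the next token.
            pending.append(stripped[len(marker):].strip())
        elif marker in line:
            # Rule 1: a comment may not share a line with code.
            raise SyntaxError(f"line {number}: a comment must occupy the whole line")
        elif stripped:
            # Rules 2 and 3: attach to what follows and reuse its indentation.
            indent = len(line) - len(line.lstrip())
            attached.extend((number, indent, text) for text in pending)
            pending = []
    return attached
```

For the example above, the comment ends up attached to the line holding `x + y`, with that line's indentation. A comment at the very end of the input has nothing to attach to and is silently dropped in this sketch; a real parser would have to decide what to do there.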
Block comments
Block comments can be introduced with special brackets. For example, if we want Haskell-like syntax, the {- and -} brackets delimit a block comment:
foo x y =
  {- This is a comment. -} x + y

foo x y = {- comment -} (x + y)

foo x y =
  {- comment -}
  (x + y)

foo x y =
  {- comment -}
  x + y
Block comments follow these syntactic rules:
- A block comment can appear anywhere.
- The comment is attached to the abstract syntax tree node that is BEGUN by the token following the comment. If no such node exists, this could be a syntax error, or we could fall back on an ad hoc heuristic to find the "nearest" AST node.
- When rendering comments, the indentation should be the same as the token that follows the comment.
Comments and code structure
Comments should not have any effect on the hash of a Unison term or type. I propose that comments be kept as an annotation on the AST rather than as part of the AST itself. This way, comments can be edited, added, or removed, without touching the AST.
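A minimal sketch of this separation, in Python. The `Node` type and the SHA-256 based `term_hash` are stand-ins invented for illustration; they are not how Unison actually represents or hashes terms. The point is only that the hash function never sees the comments.

```python
import hashlib
from dataclasses import dataclass
from typing import Dict, Tuple


@dataclass(frozen=True)
class Node:
    """A toy AST node: a label plus child nodes. It has no comment field."""
    label: str
    children: Tuple["Node", ...] = ()


def term_hash(node: Node) -> str:
    """Hash the structure of the tree only; comments are never consulted."""
    digest = hashlib.sha256(node.label.encode("utf-8"))
    for child in node.children:
        digest.update(term_hash(child).encode("ascii"))
    return digest.hexdigest()


# Comments live in a side table keyed by node index (an assumed numbering).
term = Node("+", (Node("x"), Node("y")))
comments: Dict[int, str] = {0: "This is a comment"}

before = term_hash(term)
comments[0] = "A reworded comment"  # edit the annotation only
unchanged = term_hash(term) == before  # the hash cannot change
```

Because the comments are held outside the tree, adding, editing, or removing them cannot change the hash by construction.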
Comments and the codebase
Comments should be stored in the codebase as annotations on the syntax tree. For example, under the hash for the term (or type), we could add a new file comments.ub that contains the comments in pairs of (AST node index, comment text).
A future version might allow for multiple comment sets (commentary with different purposes or audiences) by adding e.g. a tag field to the comments, or having a whole comments directory instead of just one file.
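To make the pair representation concrete, here is a hedged Python sketch of writing and reading such a file. The on-disk format of comments.ub is not specified in this design, so JSON is used purely as a placeholder, and the function names are hypothetical.

```python
import json
from typing import List, Tuple

Comment = Tuple[int, str]  # (AST node index, comment text)


def encode_comments(pairs: List[Comment]) -> str:
    """Serialize the pairs, sorted by node index (JSON is a placeholder format)."""
    return json.dumps([[index, text] for index, text in sorted(pairs)])


def decode_comments(data: str) -> List[Comment]:
    """Read the pairs back as (node index, comment text) tuples."""
    return [(index, text) for index, text in json.loads(data)]
```

Supporting multiple comment sets would then amount to adding a third field (the tag) to each entry, or to keeping one such file per set.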
API documentation
Any hash in the codebase can have formal API documentation associated with it. This might include basic usage, free-text explanations, examples, links to further reading, and links to related hashes.
Probably some flavor of Markdown is ideal for API docs.
The Unison CLI and API docs
Ultimately we’ll want to have a more visual codebase editor (see e.g. Pharo Smalltalk), but for now we have the Unison CLI. So there ought to be a special syntax for indicating that you want to associate API docs with a definition when you add it to the codebase (or update it). This syntax should be light-weight and easy to type.
For example:
{| `foo x y` adds `x` to `y` |}
foo x y = x + y
The rule here would be that the documentation block gets associated with the definition that immediately follows.
Alternatively, something like:
{foo| `foo x y` adds `x` to `y`|}.
This would associate the documentation block to the hash named foo even if that hash isn’t being otherwise edited in the file.
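Both forms can be recognized with one pattern. The following Python sketch is only meant to show the association rule; `collect_docs` is a hypothetical name, and "the definition that immediately follows" is approximated by the first whitespace-separated token after the block.

```python
import re
from typing import Dict

# Matches {| ... |} as well as {name| ... |}.
DOC_BLOCK = re.compile(
    r"\{(?P<name>[A-Za-z_][\w.]*)?\|(?P<body>.*?)\|\}", re.DOTALL
)


def collect_docs(source: str) -> Dict[str, str]:
    """Map each documented name to the text of its doc block."""
    docs = {}
    for match in DOC_BLOCK.finditer(source):
        name = match.group("name")
        if name is None:
            # Anonymous block: attach to the definition that follows it.
            following = source[match.end():].split()
            if not following:
                raise SyntaxError("doc block is not followed by a definition")
            name = following[0]
        docs[name] = match.group("body").strip()
    return docs
```

With the first example above, the doc text is stored under foo because foo is the first token after the block; with the second, it is stored under foo because the block names it explicitly.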
Semantic content of API docs
Wherever docs have code (in Markdown between fences or backticks), Unison should parse that code, resolve names, and substitute hashes for names.
E.g., the doc might have a usage example:
{|
Usage: `foo x y` adds `x` to `y`.
|}
When this doc block gets processed by Unison, it should parse foo x y and recognize that foo, x, and y are free. It should replace foo with a hyperlink to the hash of foo. It should do this for every name that exists in the codebase.
There should be some syntax to exclude a code block from this processing.
Alternatively, we could have special syntax to indicate that something should be parsed as a Unison name. E.g.
{|
Usage: `@foo x y` adds `x` to `y`.
|}
Here @foo indicates that foo is a Unison name. We’d like an error if it isn’t; otherwise it should be replaced in the rendered docs with a hyperlink to foo.
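A small Python sketch of the @-prefixed variant, under some loud assumptions: the codebase is modeled as a plain dictionary from names to hash strings, the hash value in the usage example is invented, and the Markdown-style link target is a placeholder for whatever the renderer would really emit.

```python
import re
from typing import Dict

# An @ followed by a (possibly dotted) identifier.
NAME_REF = re.compile(r"@([A-Za-z_]\w*(?:\.[A-Za-z_]\w*)*)")


def link_names(doc: str, codebase: Dict[str, str]) -> str:
    """Replace each @name with a link to its hash; unknown names are an error."""
    def replace(match):
        name = match.group(1)
        if name not in codebase:
            raise KeyError(f"unknown Unison name: {name}")
        return f"[{name}](#{codebase[name]})"

    return NAME_REF.sub(replace, doc)
```

For instance, with `{"foo": "a1b2c3"}` as the codebase, the usage line above would become ``Usage: `[foo](#a1b2c3) x y` adds `x` to `y`.``, while a doc mentioning `@bar` would be rejected.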
Opinionated doc format
It’s possible that we’ll want to be very opinionated about what goes into API documentation, for uniformity across libraries and ease of use.
For example, we might have API docs support the following fields for a function definition:
- Usage: How to call the function. E.g. “foo x y adds x to y”. We should maintain the invariant that the usage is correct (that it matches the name of the function and its arity).
- Examples: discussed above.
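The usage invariant could be checked mechanically. Here is a deliberately naive Python sketch: `usage_matches` is a hypothetical helper, and it treats every whitespace-separated token after the function name as one argument, so a parenthesized argument such as `(g z)` would be miscounted. A real check would reuse the Unison parser.

```python
def usage_matches(usage: str, name: str, arity: int) -> bool:
    """True if the usage starts with the function name and has `arity` arguments."""
    tokens = usage.split()
    return bool(tokens) and tokens[0] == name and len(tokens) - 1 == arity
```

Under these assumptions, `usage_matches("foo x y", "foo", 2)` holds, while a usage with a missing argument or a stale function name does not.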
Note that author name, time stamp, etc, can be inferred from the codebase. These are data that can be displayed in the API docs when rendered, but don’t need to be written by the author.
Docbase/Wiki
Separately from API documentation, it would be good to be able to write tutorials or long-form explanations of Unison libraries, with links into the codebase API docs.
We’d need to write a tool that can process e.g. GitHub-flavoured Markdown together with a Unison codebase. The Markdown format would have Unison-specific extensions to allow hyperlinking Unison hashes as well as Tut-style evaluation of examples.
Ideally, the documentation would be kept automatically up to date in the face of renames, etc.
Processing has to have two distinct phases, authoring and rendering.
- Authoring: you write the markdown document and use Unison human-readable names in your code. When you add your document to the docbase, all the names get replaced with Unison hashes before being stored.
- Rendering: A document stored in the docbase could then be rendered as e.g. HTML (or Markdown) where Unison hashes are turned back to human-readable names from the codebase, and hyperlinked to the API documentation for the hashes.
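The two phases can be sketched as a round trip. This Python illustration makes several assumptions that are not part of the design: names and hashes live in plain dictionaries, stored hashes are written with a leading `#`, only single-backtick code spans are processed (not fenced blocks), and hyperlinking is left out of the rendering step.

```python
import re
from typing import Dict

CODE_SPAN = re.compile(r"`([^`]*)`")


def author(doc: str, names_to_hashes: Dict[str, str]) -> str:
    """Authoring: replace known names inside code spans with #hash before storing."""
    def to_hashes(match):
        words = [
            f"#{names_to_hashes[w]}" if w in names_to_hashes else w
            for w in match.group(1).split(" ")
        ]
        return "`" + " ".join(words) + "`"

    return CODE_SPAN.sub(to_hashes, doc)


def render(stored: str, hashes_to_names: Dict[str, str]) -> str:
    """Rendering: turn #hash back into whatever the codebase calls it now."""
    def to_names(match):
        words = [
            hashes_to_names.get(w[1:], w) if w.startswith("#") else w
            for w in match.group(1).split(" ")
        ]
        return "`" + " ".join(words) + "`"

    return CODE_SPAN.sub(to_names, stored)
```

If a document is authored while the hash is named foo and rendered after it has been renamed to plus, the rendered text shows plus without the document being edited, which is the sense in which the docs stay up to date in the face of renames.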
Transclusion
A particularly useful feature for this kind of documentation tool would be transclusion of code. E.g. with a syntax like…
{:transclude MyLibrary.myFun}
The tool could render that as a code block containing the definition of MyLibrary.myFun. Ideally that would register this document as a dependency of MyLibrary.myFun and propagation of updates could work the same way as for code.
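A rough Python sketch of the expansion step, assuming the tool can look up the source text of a definition by name (modeled here as a dictionary) and renders code as an indented Markdown block. The function name is hypothetical, and the returned set of names stands in for whatever dependency registration the codebase would actually do.

```python
import re
from typing import Dict, Set, Tuple

TRANSCLUDE = re.compile(r"\{:transclude\s+([A-Za-z_][\w.]*)\}")


def expand_transclusions(doc: str, definitions: Dict[str, str]) -> Tuple[str, Set[str]]:
    """Inline each transcluded definition and collect the names depended on."""
    dependencies = set()

    def expand(match):
        name = match.group(1)
        dependencies.add(name)
        # Render the definition as an indented code block.
        # An unknown name raises KeyError in this sketch.
        return "\n".join("    " + line for line in definitions[name].splitlines())

    return TRANSCLUDE.sub(expand, doc), dependencies
```

Re-running the expansion whenever one of the collected names changes is one simple way the propagation described above could be driven.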
It would be good to also have a way (as in Elm) of transcluding the API docs of individual types and functions in a document.
This is a way of keeping documentation automatically up to date, at least partially.