unison/docs/comments-and-docs.markdown
2019-04-16 11:58:17 -04:00

Design for Unison documentation and comments

This is a rough design of a way to supply commentary and formal documentation for Unison code. Discuss here and also be sure to view the raw markdown file for some embedded comments.

Comments

Comments in Unison can be either line comments or block comments. It's probably only necessary to implement one of these for a first release of Unison, but ultimately we may want to offer both.

Line comments

Line comments can be introduced in code with a special token. For example, if we want Haskell-like syntax, the -- token introduces a comment:

foo x y = 
  -- This is a comment
  x + y

Line comments follow these syntactic rules:

  1. A line comment must occupy the whole line. For simplicity, it's a syntax error to put a comment at the end of a line that contains anything other than whitespace.
  2. The comment is attached to the abstract syntax tree node that is BEGUN by the token following the comment.
  3. When rendering comments, the indentation should be the same as the token that follows the comment.
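The three rules above can be sketched as a small line-based pass. This is only an illustration in Python, not a proposal for the real parser: the function name `attach_line_comments` is made up, the "following token" is approximated by the first whitespace-separated word of the next non-blank line, and any `--` on a code line is treated as a trailing comment (so string literals containing `--` are not handled).

```python
def attach_line_comments(source):
    """Return (comment_text, indent, following_token) triples.

    Assumptions of this sketch: comments start with `--`, and the token
    that follows a comment is the first word of the next non-blank line.
    """
    attached = []
    pending = []  # comments seen but not yet attached to a token
    for number, line in enumerate(source.splitlines(), start=1):
        stripped = line.strip()
        if not stripped:
            continue
        if stripped.startswith("--"):
            pending.append(stripped[2:].strip())
            continue
        if "--" in stripped:
            # Rule 1: a comment must occupy the whole line.
            raise SyntaxError(f"line {number}: comment must occupy the whole line")
        # Rules 2 and 3: attach to the next token and reuse its indentation.
        indent = len(line) - len(line.lstrip())
        token = stripped.split()[0]
        for text in pending:
            attached.append((text, indent, token))
        pending = []
    return attached


example = "foo x y =\n  -- This is a comment\n  x + y\n"
result = attach_line_comments(example)
```

A comment with no token after it (for instance at the end of the file) is silently dropped here; a real implementation would have to decide whether that is an error.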

Block comments

Block comments can be introduced with special brackets. For example, if we want Haskell-like syntax, the {- and -} brackets delimit a block comment:

foo x y = 
  {- This is a comment. -} x + y

foo x y = {- comment -} (x + y)

foo x y = 
  {- comment -}
  (x + y)

foo x y = 
  {- comment -}
  x + y

Block comments follow these syntactic rules:

  1. A block comment can appear anywhere.
  2. The comment is attached to the abstract syntax tree node that is BEGUN by the token following the comment. If no token follows the comment, this could be a syntax error, or we could use some ad hoc heuristic to find the "nearest" AST node.
  3. When rendering comments, the indentation should be the same as the token that follows the comment.
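As a rough illustration of rule 2, the following Python sketch finds each block comment and the token that follows it. Everything here is an assumption made for the example: the function name, the simplified token regex, and the choice to return `None` when no token follows. Nested block comments and comment brackets inside string literals are not handled.

```python
import re

BLOCK_COMMENT = re.compile(r"\{-(.*?)-\}", re.DOTALL)
# Simplified tokens: identifiers, integers, or any single non-space character.
TOKEN = re.compile(r"[A-Za-z_][A-Za-z0-9_']*|\d+|\S")


def attach_block_comments(source):
    """Return (comment_text, following_token) pairs; token is None at end of input."""
    attached = []
    for match in BLOCK_COMMENT.finditer(source):
        position = match.end()
        # Skip whitespace and any further comments to reach the next real token.
        while True:
            while position < len(source) and source[position].isspace():
                position += 1
            following_comment = BLOCK_COMMENT.match(source, position)
            if following_comment is None:
                break
            position = following_comment.end()
        token = TOKEN.match(source, position)
        attached.append((match.group(1).strip(), token.group(0) if token else None))
    return attached


inline = attach_block_comments("foo x y = {- comment -} (x + y)")
own_line = attach_block_comments("foo x y =\n  {- This is a comment. -} x + y")
```

The `None` case is where the "error or heuristic" decision from rule 2 would have to be made.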

Comments and code structure

Comments should not have any effect on the hash of a Unison term or type. I propose that comments be kept as an annotation on the AST rather than as part of the AST itself. This way, comments can be edited, added, or removed, without touching the AST.

Comments and the codebase

Comments should be stored in the codebase as annotations on the syntax tree. For example, under the hash for the term (or type), we could add a new file comments.ub that contains the comments in pairs of (AST node index, comment text).
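A minimal sketch of the idea, in Python, assuming a toy AST made of nested lists and SHA-256 over a JSON encoding as a stand-in for Unison's real hashing (neither is how the codebase actually works). The point is only that the comment table is keyed by node index and never passes through the hash function.

```python
import hashlib
import json


def term_hash(ast):
    """Hash only the AST structure; comments never enter this function."""
    encoded = json.dumps(ast).encode("utf-8")
    return hashlib.sha256(encoded).hexdigest()


def preorder(node):
    """Yield nodes in pre-order, so each node gets a stable index."""
    yield node
    if isinstance(node, list):
        for child in node:
            yield from preorder(child)


# Hypothetical AST for the expression `x + y`.
ast = ["+", "x", "y"]
nodes = list(preorder(ast))  # index 0 is the whole expression

# What a comments file might hold: (AST node index, comment text) pairs.
comments = [(0, "This is a comment")]

hash_before = term_hash(ast)
comments.append((2, "the left operand"))  # editing comments...
hash_after = term_hash(ast)  # ...leaves the hash unchanged
```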

A future version might allow for multiple comment sets (commentary with different purposes or audiences) by adding e.g. a tag field to the comments, or having a whole comments directory instead of just one file.

API documentation

Any hash in the codebase can have formal API documentation associated with it. This might include basic usage, free-text explanations, examples, links to further reading, and links to related hashes.

Probably some flavor of Markdown is ideal for API docs.

The Unison CLI and API docs

Ultimately we'll want to have a more visual codebase editor (see e.g. Pharo Smalltalk), but for now we have the Unison CLI. So there ought to be a special syntax for indicating that you want to associate API docs with a definition when you add it to the codebase (or update it). This syntax should be lightweight and easy to type.

For example:

{| `foo x y` adds `x` to `y` |}

foo x y = x + y

The rule here would be that the documentation block gets associated with the definition that immediately follows.

Alternatively, something like:

{foo| `foo x y` adds `x` to `y`|}.

This would associate the documentation block with the hash named foo even if that hash isn't being otherwise edited in the file.

Semantic content of API docs

Wherever docs have code (in Markdown between fences or backticks), Unison should parse that code, resolve names, and substitute hashes for names.

E.g., the doc might have a usage example:

{|
Usage: `foo x y` adds `x` to `y`.
|}

When this doc block gets processed by Unison, it should parse foo x y and recognize that foo, x, and y are free. It should replace foo with a hyperlink to the hash of foo. It should do this for every name that exists in the codebase.

There should be some syntax to exclude a code block from this processing.
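To make the name-resolution step concrete, here is a small Python sketch. It assumes a hypothetical `codebase` dictionary from names to hashes, handles only backtick spans (not fenced blocks), renders to HTML without escaping, and does not implement any exclusion syntax. A real implementation would use the Unison parser rather than a regex.

```python
import re

CODE_SPAN = re.compile(r"`([^`]+)`")
IDENTIFIER = re.compile(r"[A-Za-z_][A-Za-z0-9_.]*")


def link_names(doc, codebase):
    """Replace names known to the codebase with links to their hashes."""

    def link_identifier(match):
        name = match.group(0)
        if name in codebase:
            return f'<a href="#{codebase[name]}">{name}</a>'
        return name  # free variables such as x and y are left alone

    def link_span(match):
        return "<code>" + IDENTIFIER.sub(link_identifier, match.group(1)) + "</code>"

    return CODE_SPAN.sub(link_span, doc)


# Hypothetical codebase entry: the hash value is made up.
linked = link_names("Usage: `foo x y` adds `x` to `y`.", {"foo": "a1b2c3"})
```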

Alternatively, we could have special syntax to indicate that something should be parsed as a Unison name. E.g.

{| 
Usage: `@foo x y` adds `x` to `y`.
|}

Here @foo indicates that foo is a Unison name: we'd like an error if it isn't, and in the rendered docs it should be replaced with a hyperlink to foo.
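The error behaviour could look roughly like the following Python sketch. The function name `check_at_names` and the `codebase` dictionary are hypothetical, and the regex would also match things like email addresses, which a real parser would need to rule out.

```python
import re

# An @ followed by a possibly dotted identifier, e.g. @foo or @MyLibrary.myFun.
AT_NAME = re.compile(r"@([A-Za-z_][A-Za-z0-9_]*(?:\.[A-Za-z_][A-Za-z0-9_]*)*)")


def check_at_names(doc, codebase):
    """Return the hashes referenced via @name, or raise if any name is unknown."""
    names = AT_NAME.findall(doc)
    unknown = sorted({name for name in names if name not in codebase})
    if unknown:
        raise LookupError("unknown Unison names: " + ", ".join(unknown))
    return [codebase[name] for name in names]


# Hypothetical codebase entry: the hash value is made up.
hashes = check_at_names("Usage: `@foo x y` adds `x` to `y`.", {"foo": "a1b2c3"})
```

Compared with parsing every code span, this makes the author's intent explicit, at the cost of a little extra syntax in the docs.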

Opinionated doc format

It's possible that we'll want to be very opinionated about what goes into API documentation, for uniformity across libraries and ease of use.

For example, we might have API docs support the following fields for a function definition:

  • Usage: How to call the function. E.g. “foo x y adds x to y”. We should maintain the invariant that the usage is correct (that it matches the name of the function and its arity).
  • Examples: discussed above.
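The usage invariant could be checked mechanically. The Python sketch below assumes the usage line writes the call in backticks and that arguments are bare words with no parentheses; `usage_matches` is a made-up name, and a real check would parse the call with the Unison parser.

```python
import re

CODE_SPAN = re.compile(r"`([^`]+)`")


def usage_matches(usage, name, arity):
    """True if the first backticked call is `name` applied to `arity` arguments."""
    span = CODE_SPAN.search(usage)
    if span is None:
        return False
    words = span.group(1).split()
    if not words:
        return False
    return words[0] == name and len(words) - 1 == arity


ok = usage_matches("`foo x y` adds `x` to `y`", "foo", 2)
wrong_arity = usage_matches("`foo x y` adds `x` to `y`", "foo", 3)
```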

Note that author name, timestamp, etc. can be inferred from the codebase. These are data that can be displayed in the API docs when rendered, but don't need to be written by the author.

Docbase/Wiki

Separately from API documentation, it would be good to be able to write tutorials or long-form explanations of Unison libraries, with links into the codebase API docs.

We'd need to write a tool that can process e.g. GitHub-flavoured Markdown together with a Unison codebase. The Markdown format would have Unison-specific extensions to allow hyperlinking Unison hashes as well as Tut-style evaluation of examples.

Ideally, the documentation would be kept automatically up to date in the face of renames, etc.

Processing has two distinct phases: authoring and rendering.

  • Authoring: you write the markdown document and use Unison human-readable names in your code. When you add your document to the docbase, all the names get replaced with Unison hashes before being stored.
  • Rendering: A document stored in the docbase could then be rendered as e.g. HTML (or Markdown) where Unison hashes are turned back to human-readable names from the codebase, and hyperlinked to the API documentation for the hashes.
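The two phases can be sketched as a round trip. This Python example makes several assumptions: names only appear in backtick spans, stored hashes are written as `#` followed by hex digits, and the two lookup tables are plain dictionaries. Hyperlinking is left out to keep the sketch short.

```python
import re

CODE_SPAN = re.compile(r"`([^`]+)`")
IDENTIFIER = re.compile(r"[A-Za-z_][A-Za-z0-9_.]*")
HASH_REF = re.compile(r"#([0-9a-f]+)")


def author(doc, name_to_hash):
    """Authoring: replace known names inside code spans with #hash references."""

    def to_hash(match):
        name = match.group(0)
        return "#" + name_to_hash[name] if name in name_to_hash else name

    def rewrite_span(span):
        return "`" + IDENTIFIER.sub(to_hash, span.group(1)) + "`"

    return CODE_SPAN.sub(rewrite_span, doc)


def render(stored, hash_to_name):
    """Rendering: turn #hash references back into current names."""
    return HASH_REF.sub(lambda m: hash_to_name.get(m.group(1), m.group(0)), stored)


# The hash value is made up for the example.
stored = author("Call `foo x y`.", {"foo": "a1b2c3"})
rendered = render(stored, {"a1b2c3": "foo"})
# After a rename in the codebase, the same stored document picks up the new name.
renamed = render(stored, {"a1b2c3": "plus"})
```

Because the stored form holds hashes rather than names, a rename only changes the lookup table used at render time, not the document itself.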

Transclusion

A particularly useful feature for this kind of documentation tool would be transclusion of code. E.g. with a syntax like…

{:transclude MyLibrary.myFun}

The tool could render that as a code block containing the definition of MyLibrary.myFun. Ideally that would register this document as a dependent of MyLibrary.myFun, so that propagation of updates could work the same way as for code.
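A rough Python sketch of transclusion with dependency tracking follows. The `definitions` dictionary and the body of `myFun` are invented for the example, and the rendered code block is shown as a four-space indented block; how dependencies would actually be stored in the codebase is left open.

```python
import re

TRANSCLUDE = re.compile(r"\{:transclude ([A-Za-z_][A-Za-z0-9_.]*)\}")


def expand_transclusions(doc, definitions):
    """Return (rendered_doc, dependencies) for a document with transclusions."""
    dependencies = set()

    def expand(match):
        name = match.group(1)
        dependencies.add(name)  # the document now depends on this definition
        source = definitions[name]  # raises KeyError for unknown names
        return "\n".join("    " + line for line in source.splitlines())

    return TRANSCLUDE.sub(expand, doc), dependencies


# Hypothetical definition text for MyLibrary.myFun.
definitions = {"MyLibrary.myFun": "myFun x = x + 1"}
rendered, dependencies = expand_transclusions(
    "See:\n\n{:transclude MyLibrary.myFun}\n", definitions
)
```

The returned dependency set is what an update-propagation step would consult to decide which documents to re-render when a definition changes.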

It would be good to also have a way (as in Elm) of transcluding the API docs of individual types and functions in a document.

This is a way of keeping documentation automatically up to date, at least partially.