It is now completely[^1] independent of the Unison language. The parser takes a few parsers as arguments: one for
identifiers, one for code, and one to indicate the end of the Doc block.
[^1]: There is one last bit to be removed in the next commit – Doc still looks for `type` or `ability` to identify type
links.
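
As a rough illustration of a parser that receives its language-specific pieces as arguments, here is a minimal sketch using `ReadP` from `base`. The `Doc` type, the `docSketch` name, and the `{…}` / `@eval{…}` syntax are simplified assumptions made for this example, not the actual implementation.

```haskell
import Data.Char (isAlphaNum, isSpace)
import Text.ParserCombinators.ReadP

-- Hypothetical, heavily simplified Doc element; the real types have many
-- more constructors.
data Doc ident code
  = Word String
  | Link ident
  | Eval code
  deriving (Eq, Show)

-- The Doc parser knows nothing about the host language. It is handed a
-- parser for identifiers, one for embedded code, and one that recognizes
-- the end of the Doc block.
docSketch :: ReadP ident -> ReadP code -> ReadP () -> ReadP [Doc ident code]
docSketch ident code end = skipSpaces *> manyTill (element <* skipSpaces) end
  where
    -- Left-biased choice: try a link, then an eval block, then a plain word.
    element = link <++ eval <++ word
    link = Link <$> between (string "{") (string "}") ident
    eval = Eval <$> between (string "@eval{") (string "}") code
    word = Word <$> munch1 (\c -> not (isSpace c) && c `notElem` "{}")

-- A host language plugs in its own parsers; these three are stand-ins.
example :: [([Doc String String], String)]
example =
  readP_to_S
    (docSketch (munch1 isAlphaNum) (munch1 (/= '}')) (() <$ string "}}"))
    "see {foo} now }}"
```

Because the three host-specific parsers are ordinary arguments, a different language could reuse `docSketch` unchanged by passing different ones.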
In general, they map to the constructors of the Doc types, with some wiggle room for now.
It’s probably beneficial to review this commit by ignoring whitespace.
These are needed for the new Doc types, but had been stubbed out. Moving
the Doc types to their own module forced the changes that got in the way
of generating these with Template Haskell.
We now build the stanzas at the same time as the tree, and don’t discard them after reordering.
This also changes the closing element of `Block` to be `Maybe` instead of `[]`.
This removes the need to pad the lexer stream with trailing `Close` lexemes. If
EOF is reached, the parser will automatically close any layout blocks (but not
context-free blocks).
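
To make the `Maybe` closing element concrete, here is a small sketch of building a block tree where reaching EOF closes any open blocks. All names (`Lexeme`, `Item`, `Block`, `items`) are hypothetical simplifications, and the sketch models only layout blocks, not the context-free ones.

```haskell
-- Hypothetical lexemes: a block opener, an explicit closer, or a plain word.
data Lexeme = Open String | Close | Word String
  deriving (Eq, Show)

-- A tree item is either a single lexeme or a nested block.
data Item = Leaf Lexeme | Nested Block
  deriving (Eq, Show)

-- The closing lexeme is optional: Nothing means EOF closed the block.
data Block = Block
  { blockOpen :: Lexeme
  , blockBody :: [Item]
  , blockClose :: Maybe Lexeme
  }
  deriving (Eq, Show)

-- Collect items until a Close or the end of input. Returns the items, the
-- closing lexeme (if any), and the remaining input.
items :: [Lexeme] -> ([Item], Maybe Lexeme, [Lexeme])
items [] = ([], Nothing, []) -- EOF: close implicitly, no padding needed
items (Close : rest) = ([], Just Close, rest)
items (o@(Open _) : rest) =
  let (inner, c, rest') = items rest
      (siblings, c', rest'') = items rest'
   in (Nested (Block o inner c) : siblings, c', rest'')
items (l : rest) =
  let (siblings, c, rest') = items rest
   in (Leaf l : siblings, c, rest')
```

With this shape, an input that ends inside two nested blocks yields two `Block` values whose `blockClose` is `Nothing`, so the stream never needs trailing `Close` lexemes.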
After running the core of the lexer, the `lexer` function then does some
work to turn the stream into a tree, and reorder some lexemes. It then
throws away the tree structure.
This is the first step of preserving the tree structure for the parser.
It extracts the “pre-parser” from `lexer` so
that it can eventually be used _after_ the lexer, rather than internally.
This also moves `fixup` to be applied on each block as we reorder it,
rather than across the entire stream at the end (since the goal is to
not _have_ an entire stream any more).
This removes the layer that makes the `Doc` parser look like a lexer and
replaces it with a function that converts the `Doc` structure directly
into Unison Terms.
`doc2` was a parser in lexer’s clothing. It would parse recursively, but
then return the result as a flat list of tokens.
This separates the parsing from the “unparsing” (which returns the
tokens), so now we have a parser that produces a recursive `Doc`
structure. For now, the unparser is applied immediately, which should
yield the same stream of tokens as the previous version. Eventually, we
should be able to avoid unparsing the `Doc` structure.
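
The split between a recursive structure and a flat token stream can be sketched as follows. The `Doc` and `Token` types here are hypothetical stand-ins with only a few constructors, and `unparse` is an assumed name for the flattening step.

```haskell
-- Hypothetical recursive Doc structure (the parser's output).
data Doc
  = Word String
  | Bold [Doc]
  | Paragraph [Doc]
  deriving (Eq, Show)

-- Hypothetical flat tokens, as a lexer would emit them.
data Token
  = Open String
  | Close
  | Textual String
  deriving (Eq, Show)

-- Flatten the tree back into the token stream that downstream code
-- expects: each nested node becomes an Open/Close pair around its children.
unparse :: Doc -> [Token]
unparse (Word w) = [Textual w]
unparse (Bold ds) = Open "bold" : concatMap unparse ds ++ [Close]
unparse (Paragraph ds) = Open "paragraph" : concatMap unparse ds ++ [Close]
```

Keeping `unparse` as a separate function means it can later be dropped without touching the parser, once consumers work on the tree directly.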
`doc2` is a Unison lexer that traverses a `Doc`.
`docBody` is the actual `Doc` lexer that is ignorant of the fact that Unison wraps `Doc` blocks in `{{`/`}}`.