2020-01-17 00:55:41 +03:00
This transcript explains a few minor details of doc parsing and pretty-printing, both from a user's point of view and with some implementation notes. The later sections are meant more as unit tests than for human consumption. (The ucm `add` commands and their output are hidden for brevity.)
Docs can be used as inline code comments.
```unison
foo : Nat -> Nat
foo n =
  [: do the thing :]
  n + 1
```
```ucm
I found and typechecked these definitions in scratch.u. If you
do an `add` or `update`, here's how your codebase would
change:
⍟ These new definitions are ok to `add`:
foo : Nat -> Nat
Now evaluating any watch expressions (lines starting with
`>`)... Ctrl+C cancels.
```
```ucm
.> view foo
foo : Nat -> Nat
foo n =
  use Nat +
  [: do the thing :]
  n + 1
```
Note that `@` and `:]` must be escaped within docs.
```unison
escaping = [: Docs look [: like \@this \:] :]
```
```ucm
I found and typechecked these definitions in scratch.u. If you
do an `add` or `update`, here's how your codebase would
change:
⍟ These new definitions are ok to `add`:
escaping : Doc
Now evaluating any watch expressions (lines starting with
`>`)... Ctrl+C cancels.
```
```ucm
.> view escaping
escaping : Doc
escaping = [: Docs look [: like \@this \:] :]
```
(Alas, a doc cannot contain a literal `\@` or `\:]`, because there's currently no way to 'unescape' them.)
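The escaping rule can be modelled in a few lines. This is only an illustrative Python sketch of the rule as stated above, not the Unison lexer; the name `escape_doc_text` is hypothetical.

```python
def escape_doc_text(text: str) -> str:
    """Escape plain text so it can sit inside a [: ... :] doc literal."""
    # A literal backslash-escape sequence cannot be represented, since
    # there is no way to 'unescape' it, so reject it up front.
    for unrepresentable in ("\\@", "\\:]"):
        if unrepresentable in text:
            raise ValueError(f"cannot represent {unrepresentable!r} in a doc")
    # Escape the two sequences that are special inside a doc literal.
    return text.replace("@", "\\@").replace(":]", "\\:]")


print(escape_doc_text("Docs look [: like @this :]"))
# prints: Docs look [: like \@this \:]
```

Note that the opening `[:` needs no escaping, which matches the `escaping` example above.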
```unison
-- Note that -- comments are preserved within doc literals.
commented = [:
  example:
    -- a comment
    f x = x + 1
:]
```
```ucm
I found and typechecked these definitions in scratch.u. If you
do an `add` or `update`, here's how your codebase would
change:
⍟ These new definitions are ok to `add`:
commented : Doc
Now evaluating any watch expressions (lines starting with
`>`)... Ctrl+C cancels.
```
```ucm
.> view commented
commented : Doc
commented =
  [:
  example:
    -- a comment
    f x = x + 1
  :]
```
### Indenting, and paragraph reflow
The handling of indentation in docs, which is split between the parser and the pretty-printer, is a bit fiddly.
```unison
-- The leading and trailing spaces are stripped from the stored Doc by the
-- lexer, and one leading and trailing space is inserted again on view/edit
-- by the pretty-printer.
doc1 = [: hi :]
```
```ucm
I found and typechecked these definitions in scratch.u. If you
do an `add` or `update`, here's how your codebase would
change:
⍟ These new definitions are ok to `add`:
doc1 : Doc
Now evaluating any watch expressions (lines starting with
`>`)... Ctrl+C cancels.
```
```ucm
.> view doc1
doc1 : Doc
doc1 = [: hi :]
```
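The strip-then-pad round trip for a single-line doc can be sketched as follows. This is a hedged Python model of the behaviour described in the comments above; `store_single_line` and `render_single_line` are hypothetical names, and the empty-doc case is an assumption chosen to match the `empty` example later in this transcript.

```python
def store_single_line(body: str) -> str:
    """Model of the lexer: drop leading and trailing spaces before storing."""
    return body.strip(" ")


def render_single_line(stored: str) -> str:
    """Model of the pretty-printer: put one space back on each side."""
    # Assumption: an empty doc is shown with a single space, as `[: :]`.
    return f"[: {stored} :]" if stored else "[: :]"


print(render_single_line(store_single_line("   hi   ")))
# prints: [: hi :]
```

However many spaces surround the text in the source, the rendered literal always has exactly one on each side.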
```unison
-- Lines (apart from the first line, i.e. the bit between the [: and the
-- first newline) are unindented until at least one of them hits the left
-- margin (by a post-processing step in the parser). You may not notice
-- this because the pretty-printer indents them again on view/edit.
doc2 = [: hello
            - foo
            - bar
          and the rest. :]
```
```ucm
I found and typechecked these definitions in scratch.u. If you
do an `add` or `update`, here's how your codebase would
change:
⍟ These new definitions are ok to `add`:
doc2 : Doc
Now evaluating any watch expressions (lines starting with
`>`)... Ctrl+C cancels.
```
```ucm
.> view doc2
doc2 : Doc
doc2 =
  [: hello
    - foo
    - bar
  and the rest. :]
```
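The unindenting step can be approximated like this. It is a minimal Python sketch, not the parser's actual code: `unindent_doc` is a hypothetical name, and ignoring blank lines when computing the shift is an assumption.

```python
def unindent_doc(text: str) -> str:
    """Shift every line after the first left until one hits the margin."""
    first, *rest = text.split("\n")
    # Assumption: blank lines do not count when finding the smallest indent.
    indents = [len(line) - len(line.lstrip(" ")) for line in rest if line.strip()]
    shift = min(indents, default=0)
    # The first line (between `[:` and the first newline) is left alone.
    return "\n".join([first] + [line[shift:] for line in rest])


print(unindent_doc("hello\n    - foo\n    - bar\n  and the rest."))
```

For the `doc2` text, "and the rest." is the least-indented line, so it lands on the left margin and the two list items keep their extra indent relative to it.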
```unison
doc3 = [: When Unison identifies a paragraph, it removes any newlines from it before storing it, and then reflows the paragraph text to fit the display window on display/view/edit.

For these purposes, a paragraph is any sequence of non-empty lines that have zero indent (after the unindenting mentioned above.)

 - So this is not a paragraph, even
   though you might want it to be.

   And this text  | as a paragraph
   is not treated | either.

Note that because of the special treatment of the first line mentioned above, where its leading space is removed, it is always treated as a paragraph.
:]
```
```ucm
I found and typechecked these definitions in scratch.u. If you
do an `add` or `update`, here's how your codebase would
change:
⍟ These new definitions are ok to `add`:
doc3 : Doc
Now evaluating any watch expressions (lines starting with
`>`)... Ctrl+C cancels.
```
```ucm
.> view doc3
doc3 : Doc
doc3 =
  [: When Unison identifies a paragraph, it removes any newlines
  from it before storing it, and then reflows the paragraph text
  to fit the display window on display/view/edit.

  For these purposes, a paragraph is any sequence of non-empty
  lines that have zero indent (after the unindenting mentioned
  above.)

   - So this is not a paragraph, even
     though you might want it to be.

     And this text  | as a paragraph
     is not treated | either.

  Note that because of the special treatment of the first line
  mentioned above, where its leading space is removed, it is always
  treated as a paragraph.
  :]
```
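The paragraph rule above (a run of non-empty, zero-indent lines) can be modelled as a normalization step on storage plus a reflow step on display. This is a hedged Python sketch: the function names are hypothetical, and joining lines with a single space is an assumption about what "removes any newlines" means.

```python
import textwrap


def is_para_line(line: str) -> bool:
    """A paragraph line is non-empty and has zero indent."""
    return bool(line) and not line.startswith(" ")


def normalize_paragraphs(text: str) -> str:
    """Join each run of paragraph lines into one line; keep other lines."""
    out, para = [], []
    for line in text.split("\n"):
        if is_para_line(line):
            para.append(line)
            continue
        if para:
            out.append(" ".join(para))
            para = []
        out.append(line)
    if para:
        out.append(" ".join(para))
    return "\n".join(out)


def reflow(text: str, width: int) -> str:
    """Wrap paragraph lines to `width`; leave indented and empty lines alone."""
    return "\n".join(
        textwrap.fill(line, width=width) if is_para_line(line) else line
        for line in text.split("\n")
    )
```

With this model, an indented list item that wraps onto a second line is stored and displayed exactly as typed, which is the "not a paragraph, even though you might want it to be" case.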
```unison
doc4 = [: Here's another example of some paragraphs.

          All these lines have zero indent.

            - Apart from this one. :]
```
```ucm
I found and typechecked these definitions in scratch.u. If you
do an `add` or `update`, here's how your codebase would
change:
⍟ These new definitions are ok to `add`:
doc4 : Doc
Now evaluating any watch expressions (lines starting with
`>`)... Ctrl+C cancels.
```
```ucm
.> view doc4
doc4 : Doc
doc4 =
  [: Here's another example of some paragraphs.

  All these lines have zero indent.

    - Apart from this one. :]
```
```unison
-- The special treatment of the first line does mean that the following
-- is pretty-printed not so prettily. To fix that we'd need to get the
-- lexer to help out with interpreting doc literal indentation (because
-- it knows which column the `[:` was in).
doc5 = [:   - foo
            - bar
          and the rest. :]
```
```ucm
I found and typechecked these definitions in scratch.u. If you
do an `add` or `update`, here's how your codebase would
change:
⍟ These new definitions are ok to `add`:
doc5 : Doc
Now evaluating any watch expressions (lines starting with
`>`)... Ctrl+C cancels.
```
```ucm
.> view doc5
doc5 : Doc
doc5 =
  [: - foo
    - bar
  and the rest. :]
```
```unison
-- You can do the following to avoid that problem.
doc6 = [:
  - foo
  - bar
and the rest.
:]
```
```ucm
I found and typechecked these definitions in scratch.u. If you
do an `add` or `update`, here's how your codebase would
change:
⍟ These new definitions are ok to `add`:
doc6 : Doc
Now evaluating any watch expressions (lines starting with
`>`)... Ctrl+C cancels.
```
```ucm
.> view doc6
doc6 : Doc
doc6 =
  [:
    - foo
    - bar
  and the rest.
  :]
```
### More testing
```unison
-- Check empty doc works.
empty = [::]
expr = foo 1
```
```ucm
I found and typechecked these definitions in scratch.u. If you
do an `add` or `update`, here's how your codebase would
change:
⍟ These new definitions are ok to `add`:
empty : Doc
expr : Nat
Now evaluating any watch expressions (lines starting with
`>`)... Ctrl+C cancels.
```
```ucm
.> view empty
empty : Doc
empty = [: :]
```
```unison
test1 = [:
The internal logic starts to get hairy when you use the \@ features, for example referencing a name like @List.take. Internally, the text between each such usage is its own blob (blob ends here --> @List.take), so paragraph reflow has to be aware of multiple blobs to do paragraph reflow (or, more accurately, to do the normalization step where newlines with a paragraph are removed.)

Para to reflow: lorem ipsum dolor lorem ipsum dolor lorem ipsum dolor lorem ipsum dolor lorem ipsum dolor lorem ipsum dolor lorem ipsum dolor ending in ref @List.take

@List.take starting para lorem ipsum dolor lorem ipsum dolor lorem ipsum dolor lorem ipsum dolor lorem ipsum dolor lorem ipsum dolor lorem ipsum dolor.

Middle of para: lorem ipsum dolor lorem ipsum dolor lorem ipsum dolor @List.take lorem ipsum dolor lorem ipsum dolor lorem ipsum dolor lorem ipsum dolor.

  - non-para line (@List.take) with ref @List.take
  Another non-para line
  @List.take starting non-para line

  - non-para line with ref @List.take
before a para-line lorem ipsum dolor lorem ipsum dolor lorem ipsum dolor lorem ipsum dolor lorem ipsum dolor lorem ipsum dolor lorem ipsum dolor lorem ipsum dolor.

  - non-para line followed by a para line starting with ref
@List.take lorem ipsum dolor lorem ipsum dolor lorem ipsum dolor lorem ipsum dolor lorem ipsum dolor lorem ipsum dolor lorem ipsum dolor lorem ipsum dolor.

a para-line ending with ref lorem ipsum dolor lorem ipsum dolor lorem ipsum dolor lorem ipsum dolor lorem ipsum dolor lorem ipsum dolor lorem ipsum dolor lorem ipsum dolor @List.take
  - non-para line

para line lorem ipsum dolor lorem ipsum dolor lorem ipsum dolor lorem ipsum dolor lorem ipsum dolor lorem ipsum dolor lorem ipsum dolor lorem ipsum dolor
  @List.take followed by non-para line starting with ref.

@[signature] List.take

@[source] foo

@[evaluate] expr

@[include] doc1

-- note the leading space below
  @[signature] List.take

:]
```
```ucm
I found and typechecked these definitions in scratch.u. If you
do an `add` or `update`, here's how your codebase would
change:
⍟ These new definitions are ok to `add`:
test1 : Doc
Now evaluating any watch expressions (lines starting with
`>`)... Ctrl+C cancels.
```
```ucm
.> view test1
test1 : Doc
test1 =
  [:
  The internal logic starts to get hairy when you use the \@ features,
  for example referencing a name like @List.take. Internally,
  the text between each such usage is its own blob (blob ends here
  --> @List.take), so paragraph reflow has to be aware of multiple
  blobs to do paragraph reflow (or, more accurately, to do the
  normalization step where newlines with a paragraph are removed.)

  Para to reflow: lorem ipsum dolor lorem ipsum dolor lorem ipsum
  dolor lorem ipsum dolor lorem ipsum dolor lorem ipsum dolor lorem
  ipsum dolor ending in ref @List.take

  @List.take starting para lorem ipsum dolor lorem ipsum dolor
  lorem ipsum dolor lorem ipsum dolor lorem ipsum dolor lorem ipsum
  dolor lorem ipsum dolor.

  Middle of para: lorem ipsum dolor lorem ipsum dolor lorem ipsum
  dolor @List.take lorem ipsum dolor lorem ipsum dolor lorem ipsum
  dolor lorem ipsum dolor.

    - non-para line (@List.take) with ref @List.take
    Another non-para line
    @List.take starting non-para line

    - non-para line with ref @List.take
  before a para-line lorem ipsum dolor lorem ipsum dolor lorem
  ipsum dolor lorem ipsum dolor lorem ipsum dolor lorem ipsum dolor
  lorem ipsum dolor lorem ipsum dolor.

    - non-para line followed by a para line starting with ref
  @List.take lorem ipsum dolor lorem ipsum dolor lorem ipsum dolor
  lorem ipsum dolor lorem ipsum dolor lorem ipsum dolor lorem ipsum
  dolor lorem ipsum dolor.

  a para-line ending with ref lorem ipsum dolor lorem ipsum dolor
  lorem ipsum dolor lorem ipsum dolor lorem ipsum dolor lorem ipsum
  dolor lorem ipsum dolor lorem ipsum dolor @List.take
    - non-para line

  para line lorem ipsum dolor lorem ipsum dolor lorem ipsum dolor
  lorem ipsum dolor lorem ipsum dolor lorem ipsum dolor lorem ipsum
  dolor lorem ipsum dolor
    @List.take followed by non-para line starting with ref.

  @[signature] List.take

  @[source] foo

  @[evaluate] expr

  @[include] doc1

  -- note the leading space below
    @[signature] List.take

  :]
:]
```
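The multi-blob normalization exercised by `test1` can be sketched as follows. This is an illustrative Python model, not the actual Unison data structures: a doc is assumed to be a list of text blobs and `Link` values, and `Link`, `split_lines`, and `normalize` are hypothetical names. The point it shows is that whether a line is a paragraph line must be decided across blob boundaries, since a line may start or end with a reference.

```python
from dataclasses import dataclass
from typing import List, Union


@dataclass(frozen=True)
class Link:
    name: str


Segment = Union[str, Link]


def split_lines(doc: List[Segment]) -> List[List[Segment]]:
    """Regroup the segments by line, splitting text blobs at newlines."""
    lines: List[List[Segment]] = [[]]
    for seg in doc:
        if isinstance(seg, Link):
            lines[-1].append(seg)
            continue
        first, *rest = seg.split("\n")
        if first:
            lines[-1].append(first)
        for piece in rest:
            lines.append([piece] if piece else [])
    return lines


def is_para_line(line: List[Segment]) -> bool:
    """Non-empty and not starting with a space; a leading link counts as text."""
    if not line:
        return False
    head = line[0]
    return isinstance(head, Link) or not head.startswith(" ")


def normalize(doc: List[Segment]) -> List[Segment]:
    """Replace newlines between adjacent paragraph lines with spaces."""
    out: List[Segment] = []
    prev_para = False
    for i, line in enumerate(split_lines(doc)):
        para = is_para_line(line)
        if i > 0:
            out.append(" " if (para and prev_para) else "\n")
        out.extend(line)
        prev_para = para
    # Merge neighbouring text blobs back together.
    merged: List[Segment] = []
    for seg in out:
        if isinstance(seg, str) and merged and isinstance(merged[-1], str):
            merged[-1] += seg
        else:
            merged.append(seg)
    return merged
```

For example, a paragraph line ending in a reference followed by a paragraph line starting with one is joined with a space, while the newline before an indented list item is kept.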