mirror of
https://github.com/unisonweb/unison.git
synced 2024-09-17 13:27:30 +03:00
#936 pretty-print doc literals - spec out modified paragraph reflow/wrapping
Currently Unison takes doc literals like [: example usage: - note the call to bar - we pass it 1 foo = bar 1 baz :] and renders them as [: example usage: - note the call to bar - we pass it 1 foo = bar 1 baz :] due to the paragraph reflow/wrapping function. Also, this function only works the first time you use it on a doc: if you `edit` a doc and mess with it, you won't get the wrapping refreshed the next time you view. That's because the pretty-printing works by inserting newlines, which are treated as paragraph breaks by Unison the next time it sees the code. I'm planning to fix this by - only doing paragraph reflow/wrapping for lines with zero indent (where zero is the left-most non-whitespace column of the doc) - normalizing docs during parse, including removing newlines that are within a paragraph. See unison-src/transcripts/doc-formatting.md in this commit for more details of how this shakes out. Not the .md.output because that's showing the current behavior without this change implemented. Hopefully this is just a few more lines of code to implement, and then #936/#994 is good to merge.
This commit is contained in:
parent
8f284d902c
commit
776af61064
@ -303,7 +303,7 @@ docBlock = do
|
||||
segs <- many segment
|
||||
closeTok <- closeBlock
|
||||
let a = ann openTok <> ann closeTok
|
||||
pure . docUnIndent $ Term.app a (Term.constructor a DD.docRef DD.docJoinId) (Term.seq a segs)
|
||||
pure . docNormalize $ Term.app a (Term.constructor a DD.docRef DD.docJoinId) (Term.seq a segs)
|
||||
where
|
||||
segment = blob <|> linky
|
||||
blob = do
|
||||
@ -344,10 +344,17 @@ docBlock = do
|
||||
|
||||
-- Operates on the text of the Blobs within a doc (as parsed by docBlock):
|
||||
-- - reduces the whitespace after all newlines so that at least one of the
|
||||
-- non-initial lines has zero indent.
|
||||
-- The pretty-printer will add indentation when rendering - we're undoing that here.
|
||||
docUnIndent :: (Ord v, Show v) => AnnotatedTerm v a -> AnnotatedTerm v a
|
||||
docUnIndent tm = case tm of
|
||||
-- non-initial lines has zero indent (important because the pretty-printer adds
|
||||
-- indenting every time the doc passes through it)
|
||||
-- TODO implement the following
|
||||
-- - remove trailing whitespace from each line
|
||||
-- - remove newlines between any sequence of non-empty zero-indent lines
|
||||
-- (i.e. undo manual line-breaking within paragraphs.)
|
||||
-- Should be understood in tandem with Util.Pretty.paragraphyText, which
|
||||
-- outputs doc text for display/edit/view.
|
||||
-- See also unison-src/transcripts/doc-formatting.md.
|
||||
docNormalize :: (Ord v, Show v) => AnnotatedTerm v a -> AnnotatedTerm v a
|
||||
docNormalize tm = case tm of
|
||||
-- This pattern is just `DD.DocJoin seqs`, but exploded in order to grab
|
||||
-- the annotations. The aim is just to map `go` over it.
|
||||
a@(Term.App' c@(Term.Constructor' DD.DocRef DD.DocJoinId) s@(Term.Sequence' seqs))
|
||||
|
@ -144,9 +144,15 @@ wrapImpl (p:ps) = wrap_ . Seq.fromList $
|
||||
wrapString :: (LL.ListLike s Char, IsString s) => String -> Pretty s
|
||||
wrapString s = wrap (lit $ fromString s)
|
||||
|
||||
-- 0. Preserve all leading and trailing whitespace
|
||||
-- TODO - replace this spec...
|
||||
-- 0. Preserve all leading and trailing whitespace
|
||||
-- 1. Preserve all newlines
|
||||
-- 2. Wrap all text in between newlines
|
||||
-- TODO ... with this one:
|
||||
-- Wrap all text on lines that don't start with a space.
|
||||
-- Preserve other whitespace.
|
||||
-- Should be understood in tandem with TermParser.docNormalize.
|
||||
-- See also unison-src/transcripts/doc-formatting.md.
|
||||
paragraphyText :: (LL.ListLike s Char, IsString s) => Text -> Pretty s
|
||||
paragraphyText t = text start <> inner <> text end where
|
||||
inner = sep "\n" . fmap (wrap . text) . Text.splitOn "\n" $ t'
|
||||
|
@ -1,17 +1,18 @@
|
||||
This transcript verifies a few minor details about doc parsing and pretty-printing.
|
||||
This transcript explains a few minor details about doc parsing and pretty-printing, both from a user point of view and with some implementation notes. (The ucm `add` commands and their output are hidden for brevity.)
|
||||
|
||||
Docs can be used as inline code comments.
|
||||
|
||||
```unison
|
||||
foo : Nat -> Nat
|
||||
foo n =
|
||||
[: do the
|
||||
thing :]
|
||||
[: do the thing :]
|
||||
n + 1
|
||||
```
|
||||
|
||||
```ucm
|
||||
```ucm:hide
|
||||
.> add
|
||||
```
|
||||
```ucm
|
||||
.> view foo
|
||||
```
|
||||
|
||||
@ -21,48 +22,16 @@ Note that `@` and `:]` must be escaped within docs.
|
||||
escaping = [: Docs look [: like \@this \:] :]
|
||||
```
|
||||
|
||||
```ucm
|
||||
```ucm:hide
|
||||
.> add
|
||||
```
|
||||
```ucm
|
||||
.> view escaping
|
||||
```
|
||||
|
||||
(Alas you can't have `\@` or `\:]` in your doc, as there's currently no way to 'unescape' them.)
|
||||
|
||||
### Indenting
|
||||
|
||||
Handling of indenting in docs between the parser and pretty-printer is a bit fiddly.
|
||||
|
||||
```unison
|
||||
|
||||
-- The leading and trailing spaces are stripped from the stored Doc, and one
|
||||
-- leading and trailing space is inserted again on view/edit
|
||||
-- by the pretty-printer.
|
||||
doc1 = [: hi :]
|
||||
|
||||
-- The above treatment is only applied to the first line of a Doc.
|
||||
-- For storing subsequent lines, they are unindented until at least one of
|
||||
-- them hits the left margin. You don't notice this because the
|
||||
-- pretty-printer indents them again on view/edit.
|
||||
doc2 = [: hello
|
||||
- foo
|
||||
- bar
|
||||
and the rest. :]
|
||||
|
||||
-- That does mean that the following is pretty-printed not so prettily.
|
||||
-- To fix that we'd need to get the lexer to help out with interpreting
|
||||
-- doc literal indentation (because it knows what columns the `[:` was in.)
|
||||
doc3 = [: - foo
|
||||
- bar
|
||||
and the rest. :]
|
||||
|
||||
-- You can do the following to avoid that problem.
|
||||
-- TODO this hashes the same as 3
|
||||
doc4 = [:
|
||||
- foo
|
||||
- bar
|
||||
and the rest.
|
||||
:]
|
||||
|
||||
-- Note that -- comments are preserved within doc literals.
|
||||
commented = [:
|
||||
example:
|
||||
@ -72,15 +41,116 @@ commented = [:
|
||||
:]
|
||||
```
|
||||
|
||||
```ucm
|
||||
```ucm:hide
|
||||
.> add
|
||||
.> view doc1 doc2 doc3 doc4 commented
|
||||
```
|
||||
```ucm
|
||||
.> view commented
|
||||
```
|
||||
|
||||
If we edit `foo` starting from the pretty-printed version above, and add it again, we end up with the same thing.
|
||||
### Indenting, and paragraph reflow
|
||||
|
||||
Handling of indenting in docs between the parser and pretty-printer is a bit fiddly.
|
||||
|
||||
```unison
|
||||
-- The leading and trailing spaces are stripped from the stored Doc by the
|
||||
-- lexer, and one leading and trailing space is inserted again on view/edit
|
||||
-- by the pretty-printer.
|
||||
doc1 = [: hi :]
|
||||
```
|
||||
|
||||
```ucm:hide
|
||||
.> add
|
||||
```
|
||||
```ucm
|
||||
.> view doc1
|
||||
```
|
||||
|
||||
```unison
|
||||
-- Lines (apart from the first line) are unindented until at least one of
|
||||
-- them hits the left margin (by a post-processing step in the parser).
|
||||
-- You may not notice this because the pretty-printer indents them again on
|
||||
-- view/edit.
|
||||
doc2 = [: hello
|
||||
- foo
|
||||
- bar
|
||||
and the rest. :]
|
||||
```
|
||||
|
||||
```ucm:hide
|
||||
.> add
|
||||
```
|
||||
```ucm
|
||||
.> view doc2
|
||||
```
|
||||
|
||||
```unison
|
||||
doc3 = [: When Unison identifies a paragraph, it removes any newlines from it before storing it, and then reflows the paragraph text to fit the display window on display/view/edit.
|
||||
|
||||
For these purposes, a paragraph is any sequence of non-empty lines that have zero indent (after the unindenting mentioned above.)
|
||||
|
||||
- So this is not a paragraph, even
|
||||
though you might want it to be.
|
||||
|
||||
And this text | as a paragraph
|
||||
is not treated | either.
|
||||
|
||||
Note that because of the special treatment of the first line mentioned above, where its leading space is removed, it is always treated as a paragraph.
|
||||
:]
|
||||
```
|
||||
|
||||
```ucm:hide
|
||||
.> add
|
||||
```
|
||||
```ucm
|
||||
.> view doc3
|
||||
```
|
||||
|
||||
```unison
|
||||
doc4 = [: Here's another example of some paragraphs.
|
||||
|
||||
All these lines have zero indent.
|
||||
|
||||
- Apart from this one. :]
|
||||
```
|
||||
|
||||
```ucm:hide
|
||||
.> add
|
||||
```
|
||||
```ucm
|
||||
.> view doc4
|
||||
```
|
||||
|
||||
```unison
|
||||
-- The special treatment of the first line does mean that the following
|
||||
-- is pretty-printed not so prettily. To fix that we'd need to get the
|
||||
-- lexer to help out with interpreting doc literal indentation (because
|
||||
-- it knows what columns the `[:` was in.)
|
||||
doc5 = [: - foo
|
||||
- bar
|
||||
and the rest. :]
|
||||
```
|
||||
|
||||
```ucm:hide
|
||||
.> add
|
||||
```
|
||||
```ucm
|
||||
.> view doc5
|
||||
```
|
||||
TODO fence
|
||||
```
|
||||
-- You can do the following to avoid that problem.
|
||||
doc6 = [:
|
||||
- foo
|
||||
- bar
|
||||
and the rest.
|
||||
:]
|
||||
```
|
||||
TODO
|
||||
|
||||
### Reflowing paragraphs
|
||||
|
||||
```
|
||||
.> add
|
||||
```
|
||||
TODO
|
||||
```
|
||||
.> view doc6
|
||||
```
|
||||
|
@ -1,12 +1,11 @@
|
||||
This transcript verifies a few minor details about doc parsing and pretty-printing.
|
||||
This transcript explains a few minor details about doc parsing and pretty-printing, both from a user point of view and with some implementation notes. (The ucm `add` commands and their output are hidden for brevity.)
|
||||
|
||||
Docs can be used as inline code comments.
|
||||
|
||||
```unison
|
||||
foo : Nat -> Nat
|
||||
foo n =
|
||||
[: do the
|
||||
thing :]
|
||||
[: do the thing :]
|
||||
n + 1
|
||||
```
|
||||
|
||||
@ -25,19 +24,12 @@ foo n =
|
||||
|
||||
```
|
||||
```ucm
|
||||
.> add
|
||||
|
||||
⍟ I've added these definitions:
|
||||
|
||||
foo : Nat -> Nat
|
||||
|
||||
.> view foo
|
||||
|
||||
foo : Nat -> Nat
|
||||
foo n =
|
||||
use Nat +
|
||||
[: do the
|
||||
thing :]
|
||||
[: do the thing :]
|
||||
n + 1
|
||||
|
||||
```
|
||||
@ -62,12 +54,6 @@ escaping = [: Docs look [: like \@this \:] :]
|
||||
|
||||
```
|
||||
```ucm
|
||||
.> add
|
||||
|
||||
⍟ I've added these definitions:
|
||||
|
||||
escaping : Doc
|
||||
|
||||
.> view escaping
|
||||
|
||||
escaping : Doc
|
||||
@ -76,40 +62,7 @@ escaping = [: Docs look [: like \@this \:] :]
|
||||
```
|
||||
(Alas you can't have `\@` or `\:]` in your doc, as there's currently no way to 'unescape' them.)
|
||||
|
||||
### Indenting
|
||||
|
||||
Handling of indenting in docs between the parser and pretty-printer is a bit fiddly.
|
||||
|
||||
```unison
|
||||
-- The leading and trailing spaces are stripped from the stored Doc, and one
|
||||
-- leading and trailing space is inserted again on view/edit
|
||||
-- by the pretty-printer.
|
||||
doc1 = [: hi :]
|
||||
|
||||
-- The above treatment is only applied to the first line of a Doc.
|
||||
-- For storing subsequent lines, they are unindented until at least one of
|
||||
-- them hits the left margin. You don't notice this because the
|
||||
-- pretty-printer indents them again on view/edit.
|
||||
doc2 = [: hello
|
||||
- foo
|
||||
- bar
|
||||
and the rest. :]
|
||||
|
||||
-- That does mean that the following is pretty-printed not so prettily.
|
||||
-- To fix that we'd need to get the lexer to help out with interpreting
|
||||
-- doc literal indentation (because it knows what columns the `[:` was in.)
|
||||
doc3 = [: - foo
|
||||
- bar
|
||||
and the rest. :]
|
||||
|
||||
-- You can do the following to avoid that problem.
|
||||
-- TODO this hashes the same as 3
|
||||
doc4 = [:
|
||||
- foo
|
||||
- bar
|
||||
and the rest.
|
||||
:]
|
||||
|
||||
-- Note that -- comments are preserved within doc literals.
|
||||
commented = [:
|
||||
example:
|
||||
@ -128,27 +81,13 @@ commented = [:
|
||||
⍟ These new definitions are ok to `add`:
|
||||
|
||||
commented : Doc
|
||||
doc1 : Doc
|
||||
doc2 : Doc
|
||||
doc3 : Doc
|
||||
doc4 : Doc
|
||||
|
||||
Now evaluating any watch expressions (lines starting with
|
||||
`>`)... Ctrl+C cancels.
|
||||
|
||||
```
|
||||
```ucm
|
||||
.> add
|
||||
|
||||
⍟ I've added these definitions:
|
||||
|
||||
commented : Doc
|
||||
doc1 : Doc
|
||||
doc2 : Doc
|
||||
doc3 : Doc
|
||||
doc4 : Doc
|
||||
|
||||
.> view doc1 doc2 doc3 doc4 commented
|
||||
.> view commented
|
||||
|
||||
commented : Doc
|
||||
commented =
|
||||
@ -156,28 +95,219 @@ commented = [:
|
||||
|
||||
-- a comment
|
||||
f x = x + 1 :]
|
||||
|
||||
```
|
||||
### Indenting, and paragraph reflow
|
||||
|
||||
Handling of indenting in docs between the parser and pretty-printer is a bit fiddly.
|
||||
|
||||
```unison
|
||||
-- The leading and trailing spaces are stripped from the stored Doc by the
|
||||
-- lexer, and one leading and trailing space is inserted again on view/edit
|
||||
-- by the pretty-printer.
|
||||
doc1 = [: hi :]
|
||||
```
|
||||
|
||||
```ucm
|
||||
|
||||
I found and typechecked these definitions in scratch.u. If you
|
||||
do an `add` or `update`, here's how your codebase would
|
||||
change:
|
||||
|
||||
⍟ These new definitions are ok to `add`:
|
||||
|
||||
doc1 : Doc
|
||||
|
||||
Now evaluating any watch expressions (lines starting with
|
||||
`>`)... Ctrl+C cancels.
|
||||
|
||||
```
|
||||
```ucm
|
||||
.> view doc1
|
||||
|
||||
doc1 : Doc
|
||||
doc1 = [: hi :]
|
||||
|
||||
```
|
||||
```unison
|
||||
-- Lines (apart from the first line) are unindented until at least one of
|
||||
-- them hits the left margin (by a post-processing step in the parser).
|
||||
-- You may not notice this because the pretty-printer indents them again on
|
||||
-- view/edit.
|
||||
doc2 = [: hello
|
||||
- foo
|
||||
- bar
|
||||
and the rest. :]
|
||||
```
|
||||
|
||||
```ucm
|
||||
|
||||
I found and typechecked these definitions in scratch.u. If you
|
||||
do an `add` or `update`, here's how your codebase would
|
||||
change:
|
||||
|
||||
⍟ These new definitions are ok to `add`:
|
||||
|
||||
doc2 : Doc
|
||||
|
||||
Now evaluating any watch expressions (lines starting with
|
||||
`>`)... Ctrl+C cancels.
|
||||
|
||||
```
|
||||
```ucm
|
||||
.> view doc2
|
||||
|
||||
doc2 : Doc
|
||||
doc2 =
|
||||
[: hello
|
||||
- foo
|
||||
- bar
|
||||
and the rest. :]
|
||||
|
||||
```
|
||||
```unison
|
||||
doc3 = [: When Unison identifies a paragraph, it removes any newlines from it before storing it, and then reflows the paragraph text to fit the display window on display/view/edit.
|
||||
|
||||
For these purposes, a paragraph is any sequence of non-empty lines that have zero indent (after the unindenting mentioned above.)
|
||||
|
||||
- So this is not a paragraph, even
|
||||
though you might want it to be.
|
||||
|
||||
And this text | as a paragraph
|
||||
is not treated | either.
|
||||
|
||||
Note that because of the special treatment of the first line mentioned above, where its leading space is removed, it is always treated as a paragraph.
|
||||
:]
|
||||
```
|
||||
|
||||
```ucm
|
||||
|
||||
I found and typechecked these definitions in scratch.u. If you
|
||||
do an `add` or `update`, here's how your codebase would
|
||||
change:
|
||||
|
||||
⍟ These new definitions are ok to `add`:
|
||||
|
||||
doc3 : Doc
|
||||
|
||||
Now evaluating any watch expressions (lines starting with
|
||||
`>`)... Ctrl+C cancels.
|
||||
|
||||
```
|
||||
```ucm
|
||||
.> view doc3
|
||||
|
||||
doc3 : Doc
|
||||
doc3 =
|
||||
[:
|
||||
When Unison identifies a paragraph, it removes any newlines
|
||||
from it before storing it, and then reflows the paragraph
|
||||
text to fit the display window on display/view/edit.
|
||||
|
||||
For these purposes, a paragraph is any sequence of non-empty
|
||||
lines that have zero indent (after the unindenting mentioned
|
||||
above.)
|
||||
|
||||
- So this is not a paragraph, even
|
||||
though you might want it to be.
|
||||
|
||||
And this text | as a paragraph
|
||||
is not treated | either.
|
||||
|
||||
Note that because of the special treatment of the first line
|
||||
mentioned above, where its leading space is removed, it is
|
||||
always treated as a paragraph.
|
||||
:]
|
||||
|
||||
```
|
||||
```unison
|
||||
doc4 = [: Here's another example of some paragraphs.
|
||||
|
||||
All these lines have zero indent.
|
||||
|
||||
- Apart from this one. :]
|
||||
```
|
||||
|
||||
```ucm
|
||||
|
||||
I found and typechecked these definitions in scratch.u. If you
|
||||
do an `add` or `update`, here's how your codebase would
|
||||
change:
|
||||
|
||||
⍟ These new definitions are ok to `add`:
|
||||
|
||||
doc4 : Doc
|
||||
|
||||
Now evaluating any watch expressions (lines starting with
|
||||
`>`)... Ctrl+C cancels.
|
||||
|
||||
```
|
||||
```ucm
|
||||
.> view doc4
|
||||
|
||||
doc4 : Doc
|
||||
doc4 =
|
||||
[: Here's another example of some paragraphs.
|
||||
|
||||
All these lines have zero indent.
|
||||
|
||||
- Apart from this one. :]
|
||||
|
||||
```
|
||||
```unison
|
||||
-- The special treatment of the first line does mean that the following
|
||||
-- is pretty-printed not so prettily. To fix that we'd need to get the
|
||||
-- lexer to help out with interpreting doc literal indentation (because
|
||||
-- it knows what columns the `[:` was in.)
|
||||
doc5 = [: - foo
|
||||
- bar
|
||||
and the rest. :]
|
||||
```
|
||||
|
||||
```ucm
|
||||
|
||||
I found and typechecked these definitions in scratch.u. If you
|
||||
do an `add` or `update`, here's how your codebase would
|
||||
change:
|
||||
|
||||
⍟ These new definitions are ok to `add`:
|
||||
|
||||
doc5 : Doc
|
||||
|
||||
Now evaluating any watch expressions (lines starting with
|
||||
`>`)... Ctrl+C cancels.
|
||||
|
||||
```
|
||||
```ucm
|
||||
.> view doc5
|
||||
|
||||
doc5 : Doc
|
||||
doc5 =
|
||||
[: - foo
|
||||
- bar
|
||||
and the rest. :]
|
||||
|
||||
```
|
||||
If we edit `foo starting from the pretty-printed version above, and add it again, we end up with the same thing.
|
||||
TODO fence
|
||||
```
|
||||
-- You can do the following to avoid that problem.
|
||||
doc6 = [:
|
||||
- foo
|
||||
- bar
|
||||
and the rest.
|
||||
:]
|
||||
|
||||
```
|
||||
|
||||
TODO
|
||||
```
|
||||
.> add
|
||||
|
||||
### Reflowing paragraphs
|
||||
```
|
||||
|
||||
TODO
|
||||
```
|
||||
.> view doc6
|
||||
|
||||
```
|
||||
|
||||
|
@ -1,3 +1,6 @@
|
||||
|
||||
|
||||
|
||||
# Documenting Unison code
|
||||
|
||||
Unison documentation is written in Unison. Documentation is a value of the following type:
|
||||
@ -50,7 +53,7 @@ use .builtin
|
||||
|
||||
docs.List.take = [:
|
||||
|
||||
`@List.take n xs` returns the first `n` elements of `xs`. (No need to add line breaks manually. The display command will do wrapping of text for you.)
|
||||
`@List.take n xs` returns the first `n` elements of `xs`. (No need to add line breaks manually. The display command will do wrapping of text for you. Indent any lines where you don't want it to do this.)
|
||||
|
||||
## Examples:
|
||||
|
||||
|
@ -1,3 +1,6 @@
|
||||
|
||||
|
||||
|
||||
# Documenting Unison code
|
||||
|
||||
Unison documentation is written in Unison. Documentation is a value of the following type:
|
||||
@ -92,7 +95,7 @@ use .builtin
|
||||
|
||||
docs.List.take = [:
|
||||
|
||||
`@List.take n xs` returns the first `n` elements of `xs`. (No need to add line breaks manually. The display command will do wrapping of text for you.)
|
||||
`@List.take n xs` returns the first `n` elements of `xs`. (No need to add line breaks manually. The display command will do wrapping of text for you. Indent any lines where you don't want it to do this.)
|
||||
|
||||
## Examples:
|
||||
|
||||
@ -149,7 +152,8 @@ Now that documentation is linked to the definition. We can view it if we like:
|
||||
|
||||
`builtin.List.take n xs` returns the first `n` elements of
|
||||
`xs`. (No need to add line breaks manually. The display
|
||||
command will do wrapping of text for you.)
|
||||
command will do wrapping of text for you. Indent any lines
|
||||
where you don't want it to do this.)
|
||||
|
||||
## Examples:
|
||||
|
||||
@ -170,7 +174,8 @@ Or there's also a convenient function, `docs`, which shows the `Doc` values that
|
||||
|
||||
`builtin.List.take n xs` returns the first `n` elements of
|
||||
`xs`. (No need to add line breaks manually. The display
|
||||
command will do wrapping of text for you.)
|
||||
command will do wrapping of text for you. Indent any lines
|
||||
where you don't want it to do this.)
|
||||
|
||||
## Examples:
|
||||
|
||||
@ -194,7 +199,8 @@ Note that if we view the source of the documentation, the various references are
|
||||
[:
|
||||
`@builtin.List.take n xs` returns the first `n` elements of
|
||||
`xs`. (No need to add line breaks manually. The display
|
||||
command will do wrapping of text for you.)
|
||||
command will do wrapping of text for you. Indent any lines
|
||||
where you don't want it to do this.)
|
||||
|
||||
## Examples:
|
||||
|
||||
|
Loading…
Reference in New Issue
Block a user