Previously, if an operator had preceding comments attached to its second
argument, they would end up printed right after the operator:

  a
  + -- b comment
  b

On the second run, however, the comment would be interpreted as attached to
‘(+)’ and the result would be:

  a
  + b -- b comment

This broke the idempotence guarantee.
The solution that this commit implements includes several steps:
* Introduce the concept of a “dirty line”. A line is dirty if it has
  something on it that can have a comment attached to it.
* ‘txt’ is supposed to output fixed bits of syntax that cannot have comments
  attached to them (at least in Ormolu's model).
* ‘atom’, on the other hand, outputs things that mark the current line dirty.
* When we are about to print preceding comments for the second argument, we
  check if the current line is dirty. If it is, we output an extra newline to
  prevent the first comment from changing “hosts”.
* Now there is another problem: trailing whitespace after the operator in
  that case. We solve that by making spaces a bit “lazy”. When the ‘space’
  combinator is used (which is now the recommended way to separate different
  constructs), it just guarantees that the next thing we output on the same
  line will be separated from the previous output by a single space. So,
  using ‘space’ twice still results in a single space in the output. This
  has the extra benefit of simplifying all the logic that made sure that we
  have exactly one space, and not 0 or 2 spaces, when spaces are inserted
  conditionally and independently.
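To make the “dirty line” and “lazy space” ideas concrete, here is a minimal
sketch. It is not Ormolu's actual printer: the ‘Cmd’ and ‘St’ types and the
‘render’ and ‘lineDirty’ functions are hypothetical names invented for this
illustration, and the real combinators live in a monad rather than in a list
of commands.

```haskell
import Data.List (foldl')

-- Hypothetical printer commands (NOT Ormolu's real API).
data Cmd
  = Txt String  -- fixed syntax; cannot host a comment
  | Atom String -- can host a comment; marks the current line dirty
  | Space       -- request a single space before the next output
  | Newline

-- Hypothetical printer state.
data St = St
  { stOut :: [String]      -- emitted chunks, most recent first
  , stLineEmpty :: Bool    -- nothing printed on the current line yet
  , stPendingSpace :: Bool -- a space was requested but not printed yet
  , stDirty :: Bool        -- the current line has something that can host a comment
  }

initial :: St
initial = St [] True False False

-- Print a chunk, materializing a pending space only if there is previous
-- output on the same line.
emit :: String -> St -> St
emit s st =
  st { stOut = s : sep : stOut st, stLineEmpty = False, stPendingSpace = False }
  where
    sep = if stPendingSpace st && not (stLineEmpty st) then " " else ""

step :: St -> Cmd -> St
step st cmd = case cmd of
  Txt s -> emit s st
  Atom s -> (emit s st) { stDirty = True }
  -- Requesting a space twice is the same as requesting it once.
  Space -> st { stPendingSpace = True }
  -- A pending space is dropped at a newline, so no trailing whitespace.
  Newline ->
    st { stOut = "\n" : stOut st
       , stLineEmpty = True
       , stPendingSpace = False
       , stDirty = False
       }

run :: [Cmd] -> St
run = foldl' step initial

render :: [Cmd] -> String
render = concat . reverse . stOut . run

-- Is the line we are currently on dirty after running the commands?
lineDirty :: [Cmd] -> Bool
lineDirty = stDirty . run
```

In this sketch a space is only a request, so a ‘Space’ followed by ‘Newline’
leaves no trailing whitespace, and the dirty flag is what a comment printer
would consult before deciding to emit an extra newline.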
There has been a lot of good, intense work lately, and as a result some
examples have grown considerably. The problem is that we do not show diffs
when something is not formatted as expected; we show the entire
"expected/got" files. That works well when files are small, but not so well
when they are huge (some of our examples are well beyond 100 lines). It can
be hard to understand where the problem is.
This commit splits long examples into smaller ones to make it easier to see
what went wrong when a test fails.
Attach the comment if the next element is not a sibling. I think this is
quite often what we want, since if we put a comment inside a construct, we
prefer it to stay inside the same element.
This change adds an ad-hoc parser for module pragmas to handle
OPTIONS_* pragmas. I did not want to use an existing tokenizer,
because I felt that tokenizing and pretty-printing the GHC options
would be more error-prone without providing much benefit.
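As a rough illustration of the ad-hoc approach, the sketch below recognizes
an OPTIONS_* pragma and keeps the option text verbatim instead of tokenizing
it. This is not the parser added by the change: `Pragma`, `PragmaOptions`,
`parseOptionsPragma` and the helpers are hypothetical names, and the sketch
assumes the whole pragma is given as one string.

```haskell
import Data.Char (isSpace, toUpper)
import Data.List (isPrefixOf, stripPrefix)

-- Hypothetical result type: pragma name plus the untouched option text.
data Pragma = PragmaOptions String String
  deriving (Eq, Show)

-- Recognize "{-# OPTIONS_xxx ... #-}" and return the name (upper-cased)
-- together with the options exactly as written, minus outer whitespace.
parseOptionsPragma :: String -> Maybe Pragma
parseOptionsPragma input = do
  afterOpen <- stripPrefix "{-#" (trim input)
  inner <- stripSuffix' "#-}" afterOpen
  let (name, rest) = break isSpace (trim inner)
      name' = map toUpper name
  if "OPTIONS_" `isPrefixOf` name'
    then Just (PragmaOptions name' (trim rest))
    else Nothing

-- Strip a suffix by stripping the reversed prefix of the reversed string.
stripSuffix' :: String -> String -> Maybe String
stripSuffix' suffix s = reverse <$> stripPrefix (reverse suffix) (reverse s)

-- Remove leading and trailing whitespace.
trim :: String -> String
trim = dropWhile isSpace . reverse . dropWhile isSpace . reverse
```

Because the options are never split into tokens, printing the pragma back is
just a matter of re-wrapping the stored text.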
The issue was simply indenting the closing `|]` one more level. However
there were a few more issues around them, which led me to a slightly
bigger refactor.
The main reason is that all `p_*Decl` functions used to print a trailing
newline. This makes sense for top-level constructs, but it made printing
something like `[d|data Foo = Foo|]` impossible.
In this commit, I removed trailing newlines from individual printers
and gave that responsibility to `p_hsDecls`, and inserted an additional
trailing newline when printing modules.
While doing that, I noticed a few bugs/inconsistencies, and I had to
fix them in the process:
* Warning pragmas used not to print a trailing newline, so they were
  always attached to the next expression. I made them behave like the other
  pragmas, where we attach the pragma to a neighbouring function if the name
  matches, and otherwise separate it with a newline.
* We used to print single-line GADTs and single-line `do` notation
  using multiple lines, which breaks idempotency. I tweaked them to prefer
  single-line layout if possible (sometimes it is not possible because
  of the semicolon syntax).
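The shift in responsibility for trailing newlines can be sketched as
follows. This is a simplified model on plain strings, not the actual
printers: `Decl`, `printDecl`, `printDecls`, `printModule` and
`printDeclQuote` are hypothetical stand-ins for the `p_*Decl` functions and
`p_hsDecls`.

```haskell
import Data.List (intercalate)

-- Hypothetical stand-in for a declaration; it just carries its rendering.
newtype Decl = Decl String

-- An individual printer no longer emits a trailing newline.
printDecl :: Decl -> String
printDecl (Decl src) = src

-- The list printer is the one that separates declarations with newlines.
printDecls :: [Decl] -> String
printDecls = intercalate "\n" . map printDecl

-- Printing a module adds the final trailing newline.
printModule :: [Decl] -> String
printModule ds = printDecls ds ++ "\n"

-- A declaration quote can reuse the same printers without gaining a
-- stray newline before the closing bracket.
printDeclQuote :: [Decl] -> String
printDeclQuote ds = "[d|" ++ printDecls ds ++ "|]"
```

With this split, `printDeclQuote [Decl "data Foo = Foo"]` yields
`[d|data Foo = Foo|]`, which was not expressible while every declaration
printer ended its output with a newline.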
The function id obtained through pattern matching on ‘FunBind’ should not be
used to print the actual equations, because the different ‘RdrName’s used in
the equations may have different “decorations” (such as backticks and
parentheses) associated with them. It is necessary to use per-equation names
obtained from ‘m_ctxt’ of ‘Match’.
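For illustration, here is the kind of source where this matters. The
functions ‘plus’ and ‘(|+|)’ are made-up examples, not taken from the
original issue; each is a single function binding whose equations spell the
name differently, so printing every equation with one shared name would
change the source.

```haskell
-- One binding, two equations, two spellings of the same name.
plus :: Int -> Int -> Int
0 `plus` y = y   -- infix spelling, decorated with backticks
plus x y = x + y -- prefix spelling, no decoration

-- The same situation with an operator.
(|+|) :: Int -> Int -> Int
(|+|) 0 y = y     -- prefix spelling, decorated with parentheses
x |+| y = x + y   -- infix spelling, no decoration
```

A printer that took the name once from the binding would have to pick a
single spelling for all equations, which is why the per-equation name is
needed.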
This puts out the fire, but I'm not fully content with the solution. I also
do not understand why it fails in the original issue but succeeds for e.g.:
foo = do
1
+
2