Problem: Since GitHub and GitLab do not render symlinks as the files
they point to, we are considering implementing a new scanner for
symlinks that verifies them to some extent.
Solution: A scanner that validates the reference from a symlink has been
implemented in the same style as the markdown scanner.
Problem: Currently, Xrefcheck can follow redirects with an absolute
location link, but it cannot handle relative ones.
Solution: After parsing the location link, obtain the corresponding
absolute link by resolving it against the original request link.
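The resolution step described above can be sketched as follows. This is
an illustrative Python example, not Xrefcheck's own Haskell code; the
function name `resolve_redirect_location` is hypothetical.

```python
from urllib.parse import urljoin


def resolve_redirect_location(request_url: str, location: str) -> str:
    """Return the absolute URL that a redirect Location value refers to."""
    # urljoin keeps an absolute location unchanged and resolves a
    # relative one against the URL of the original request.
    return urljoin(request_url, location)
```

An absolute location passes through unchanged, so the same code path
serves both kinds of redirect.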
Problem: We currently have golden tests for anchor case-sensitivity, but
not for path case-sensitivity.
Solution: A golden test for path case-sensitivity has been added and the
previous one has been renamed.
Problem: When a file name contains unusual characters, its output line
in git ls-files is quoted and Xrefcheck parses the file name incorrectly.
Solution: Use the git ls-files -z option to output lines verbatim with
null character line terminators. The file not found message has also
been improved for the case in which its link contains a backslash.
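A minimal sketch of consuming NUL-terminated output, written in Python
for illustration only (Xrefcheck itself is written in Haskell). The
helper names are hypothetical, and UTF-8 file names are an assumption.

```python
import subprocess
from typing import List


def parse_ls_files_z(output: bytes) -> List[str]:
    """Split `git ls-files -z` output into verbatim file names."""
    # With -z every entry ends with a NUL byte and is never quoted, so
    # splitting on NUL is enough; the final terminator leaves an empty
    # chunk that is dropped here.
    return [chunk.decode("utf-8") for chunk in output.split(b"\0") if chunk]


def list_tracked_files(repo_dir: str) -> List[str]:
    """Run git in repo_dir and return the tracked file names."""
    result = subprocess.run(
        ["git", "-C", repo_dir, "ls-files", "-z"],
        check=True,
        capture_output=True,
    )
    return parse_ls_files_z(result.stdout)
```

Because entries are split on NUL rather than on newlines, names that
contain tabs, quotes, or non-ASCII characters come through unmodified.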
Problem: After changing the progress bar interface to require progress
unit witnesses, some functions related to an anonymous task timestamp
also started to require a progress unit witness, which complicates their
usage unnecessarily.
Solution: Do not require a progress unit witness for getTaskTimestamp.
Problem: After refactoring the FilePath usages in the codebase to have a
canonical representation of them, we noticed that further improvements
could be applied, such as clarifying whether the path is system
dependent and avoiding absolute file system paths.
Solution: We now use POSIX relative paths during the analysis, and
system dependent ones for reading file contents in the scan phase.
Problem: We noticed that output was not being colorized in GitLab CI
because of the heuristics currently used to decide whether to show
colors.
Solution: On the one hand, we extend the current guesses to also enable
coloring by default when the CI env var is set to true. On the other
hand, we also add a new flag, color, which avoids these guesses and
enables colors.
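The decision order described above might look roughly like this Python
sketch. It is illustrative only: the function name is hypothetical, and
the terminal check merely stands in for the pre-existing guesses, which
this message does not describe.

```python
from typing import Mapping


def should_colorize(color_flag: bool, env: Mapping[str, str], is_tty: bool) -> bool:
    """Decide whether the output should be colorized."""
    # The explicit flag bypasses every guess.
    if color_flag:
        return True
    # New guess: CI systems that set CI=true get colors by default.
    if env.get("CI") == "true":
        return True
    # Assumption: placeholder for the earlier heuristics.
    return is_tty
```

In a real program the caller would pass something like `os.environ` and
`sys.stdout.isatty()` for the last two arguments.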
Problem: The current progress bar interface is quite low-level and
error-prone.
Solution: The proposed solution here modifies both the progress bar
internal model and interface by representing progress units by
witnesses. It does not require great changes, but the resulting
interface is clearer and less error-prone.
Problem: When the repository contains a symlink to a markdown file, it
is processed by xrefcheck as if it were the same markdown file but in
the symlink's location. This leads to broken references and can be
avoided because neither GitHub nor GitLab tries to render symlinks as
the file they point to.
Solution: Treat symlinks as non-scannable files. In the future, we may
add a dedicated scanner for symlinks if that proves workable.
Problem: There is a duplicated line in the changelog unreleased section,
which seems to be an accidental copy-paste typo or a bad merge conflict
resolution.
Solution: Delete the aforementioned changelog line.
Problem: After recent work on Xrefcheck redirect behavior, we still had
to decide how to handle 304 redirects.
Solution: With the current default configuration, 304 redirects are
considered as valid, which seems appropriate taking into account that
304 responses usually mean that you previously received a successful
response for that same request and there is no need to retransmit the
resource. We add a test case for this default config and update
the FAQ section.
Problem: We previously changed the default behaviour of Xrefcheck when
following link redirects, but did not provide a way to configure it.
Solution: We are adding a new field in the configuration file to allow
writing a list of redirect rules that will be applied to links that
match them.
Problem: Currently, a response timeout immediately results in a
failure. It should be possible to configure retries on timeouts.
Solution: A new ExternalHttpTimeout error is added, which is treated
in a similar way to the ExternalHttpTooManyRequests error.
A new config field specifies how many timeouts are allowed. Its default
value is 1.
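As a rough illustration of a timeout budget, here is a Python sketch
(not the Haskell implementation). The function name, the use of Python's
`TimeoutError`, and the reading of "allowed" as "tolerated before
failing" are all assumptions.

```python
from typing import Callable, TypeVar

T = TypeVar("T")


def check_with_timeout_budget(attempt: Callable[[], T], max_timeouts: int = 1) -> T:
    """Call attempt(), retrying while the number of timeouts stays within budget."""
    timeouts = 0
    while True:
        try:
            return attempt()
        except TimeoutError:
            timeouts += 1
            # Assumption: max_timeouts timeouts are tolerated; the next
            # one is reported as a failure.
            if timeouts > max_timeouts:
                raise
```

Under this reading, the default of 1 means a link is retried once after
its first timeout and reported only if it times out again.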
Problem: some arrays from the package.yaml file seemed to be almost
alphabetically sorted, but not completely.
Solution: sort the default-extensions and dependencies arrays from the
package.yaml file.
Problem: xrefcheck does not allow printing the config to stdout instead
of writing it to a file. Also, it is easy to overwrite your changes by
mistake by executing the command again.
Solution: provide a --stdout flag to print the config to stdout, and do
not write it to a file unless a --force flag has been included.
Problem: the current usage of filepaths is error-prone and can be
simplified.
Solution: canonicalize filepaths at the boundaries, so that handling
them is safer and the codebase is simpler.
Problem: the Danger checks were failing because CI was configured to
fetch only the current PR branch, and only partially.
Solution: force the Danger checks CI to get all the repository branches.
Problem: the Danger checks that we copied had a license header, and
REUSE now complains that those headers refer to unknown licenses.
Solution: copy the license file.
Problem: it would be helpful to have checks for common rules on commit
style and similar conventions, to avoid checking those manually every
time.
Solution: add Danger checks, mostly the same ones we had in Morley.
Problem: There is currently some problem in stack or cabal
that produces a warning when building this project on
case-insensitive systems.
Solution: The current workaround for it is to add the GHC
option '-optP-Wno-nonportable-include-path'.
Problem: We have a Golden test that expects an output in English
and fails if a different language is configured.
Solution: Explicitly configure the language before running the
corresponding test.
Problem: Some Markdown flavours such as the GitHub one are case
insensitive regarding anchors, but our analysis is currently
case sensitive and it produces false positives.
Solution: Support case-insensitivity depending on the configured
Markdown flavour. Apply this also to ambiguous and similar anchors
detection.
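The lookup change can be illustrated with a small Python sketch (not
Xrefcheck's Haskell code). The function name and the boolean parameter
are hypothetical; how a flavour maps to that boolean is left to the
configuration.

```python
from typing import Iterable


def anchor_exists(anchor: str, known_anchors: Iterable[str], case_insensitive: bool) -> bool:
    """Check whether an anchor refers to one of the anchors found in a file."""
    if case_insensitive:
        # Compare case-folded forms when the flavour ignores case.
        folded = {a.casefold() for a in known_anchors}
        return anchor.casefold() in folded
    return anchor in set(known_anchors)
```

The same folding would apply when grouping anchors to detect ambiguous
or similar ones, so that both checks agree on what counts as equal.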
Problem: Xrefcheck currently always follows redirect links.
Solution: We are changing its default behaviour regarding redirect
links to fail and report permanent redirects, and to pass for temporary
redirects. Further PRs will allow the user to configure other policies.
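A hedged Python sketch of such a default policy follows. The grouping of
status codes (301 and 308 as permanent; 302, 303 and 307 as temporary)
follows the usual HTTP semantics and is an assumption about which codes
Xrefcheck places in each group; the names are hypothetical.

```python
from typing import Optional

PERMANENT_REDIRECTS = {301, 308}
TEMPORARY_REDIRECTS = {302, 303, 307}


def default_redirect_verdict(status: int) -> Optional[str]:
    """Return "fail" or "pass" for redirect codes, or None for other codes."""
    if status in PERMANENT_REDIRECTS:
        # The link should be updated to its new location, so report it.
        return "fail"
    if status in TEMPORARY_REDIRECTS:
        # The original link is still the right one to reference.
        return "pass"
    return None
```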
Problem: Currently, there are some invalid links in the repository,
mostly for tests, so one has to add some CLI options
for `xrefcheck` to succeed.
Solution: Add a new minimal `.xrefcheck.yaml` config file
so that `xrefcheck` succeeds without any options.
Problem: We have found that the current tag for local file references, current file, may lead to ambiguities.
Solution: Rename the tag that we use for local file references to be file-local instead.
Problem: We are using unicode symbols as visual clues in the program output that are not commonly supported and are therefore not always displayed as intended.
Solution: Remove the usage of these symbols, as the program output is already using other visual clues and the result will remain understandable for the user.
Problem: we are not testing the behavior of xrefcheck on Windows.
Solution: add a workflow to run
golden and tasty tests on CI
via the GitHub Actions Windows runner.
Some subproblems appear:
1.
Problem: CI build fails because it needs the `pcre` package
Solution: add it (somehow), see `install pacman dependencies`
in ci.yml
2.
Problem: Network errors are displayed differently on different platforms
Solution: collect output from both and use
`assert_diff expected_linux.gold || assert_diff expected_windows.gold`
3.
Problem: "Config matches" test is failing because the checkout action
clones files with CRLF, and the test asserts equality of two ByteStrings
Solution: manually remove CR
Problem: in markdown links, '/' should always be used as path separator,
e.g. the GitHub renderer does not treat the link `[x](a\b.md)` as a link
to a file in dir `a`.
We use OS-dependent file separators everywhere, so
e.g. `canonizeLocalRef "./a.md"="./a.md`"` on Windows
Solution: replace '\' with '/' while constructing the repo tree if needed
and then use only '/', preferring functions from `System.FilePath.Posix`.
We can do this, since `a/b` and `a\b` are equivalent paths on Windows.
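The normalization step can be sketched in Python as below. This is an
illustration rather than the Haskell change itself; the function name
and the `from_windows` parameter are hypothetical.

```python
import posixpath


def to_posix_path(path: str, from_windows: bool) -> str:
    """Normalize a file system path to use '/' as its only separator."""
    # Only rewrite paths that came from a Windows file system: on POSIX
    # systems a backslash is an ordinary file name character.
    if from_windows:
        return path.replace("\\", "/")
    return path


# After normalization, POSIX path functions give consistent results.
parent_dir = posixpath.dirname(to_posix_path("a\\b.md", from_windows=True))
```

Here `parent_dir` is `a`, matching how a Markdown renderer would read
the link `a/b.md`.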
Problem: xrefcheck uses UTF-8 symbols in reports, which are not supported
by most Windows shells by default.
Sometimes they are printed as question marks (which causes golden tests
to fail) and sometimes printing them raises an error.
Solution: use the function `withCP65001` from the `code-page` package,
which sets the correct code page on Windows and does nothing on other OSs.