Commit Graph

45 Commits

Author SHA1 Message Date
Sergey Gulin
95d5bad3cd
[#133] Refactor golden tests
Problem: We're using a common pattern in our bats tests:
  Run xrefcheck, redirect output to a temp file
  Check the temp file matches some .gold file using `diff`
  Delete temp file
We could encapsulate this pattern and make it easier to reuse.

Solution: In the `setup` function, create a temp directory. In the
`teardown` function, delete the temp directory. Create a `to_temp`
function that runs xrefcheck with desired options, pipes its output
through the `prepare` helper function and saves it in a file inside
the temp directory. Create an `assert_diff` function that reads the temp
file, and uses `diff` to compare it against some expected output.
2022-09-24 23:47:24 +10:00
Sergey Gulin
a99005d731
[#156] Make all config options optional
Problem: In #126 we made the `ignoreRefs` option required (to match the
other options). However, having it optional is better for
backwards-compatibility and to help users migrate to newer xrefcheck
versions.

Solution: Make all config options optional.
2022-09-24 05:51:39 +10:00
Sergey Gulin
c8d19a3f98
[#56] Dump all the errors from different files
Problem: Currently, xrefcheck fails immediately after the first
observed error because `die` is called directly in `markdownScanner`.
What we want is to collect the errors from all Markdown files and
print them as part of xrefcheck's final result, together with the
broken links. Also, although the `makeError` function has 4 error
messages, 2 of them are never reported, and the test case that should
check this only checks that at least one of the four files throws an
error.

Solution: Make xrefcheck report all errors. Add a `ScanError` type
and propagate errors so that all of them are reported, rather than
failing immediately after the first error is detected.
2022-09-23 17:13:50 +10:00
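A minimal sketch of the idea behind [#56]: collect scan failures next to successful results instead of exiting on the first one. The record fields, the `FileInfo` type, the toy `scanFile` rule and `scanAll` are hypothetical stand-ins, not xrefcheck's actual definitions; only the `ScanError` name comes from the commit message.

```haskell
import Data.Either (partitionEithers)

-- Hypothetical, simplified stand-ins for the real types.
data ScanError = ScanError
  { seFile :: FilePath
  , seDescription :: String
  } deriving (Show, Eq)

newtype FileInfo = FileInfo { fiLinks :: [String] }
  deriving (Show, Eq)

-- Toy scanner (assumption): a file containing the line "<!-- bad -->"
-- fails to scan; any other file contributes its lines as "links".
scanFile :: FilePath -> String -> Either ScanError FileInfo
scanFile path contents
  | "<!-- bad -->" `elem` lines contents =
      Left (ScanError path "unrecognised annotation")
  | otherwise = Right (FileInfo (lines contents))

-- Scan every file and keep both the failures and the successes,
-- instead of stopping at the first failure.
scanAll :: [(FilePath, String)] -> ([ScanError], [(FilePath, FileInfo)])
scanAll files = partitionEithers
  [ fmap (\info -> (path, info)) (scanFile path contents)
  | (path, contents) <- files
  ]
```

With this shape, the caller can print every `ScanError` together with the broken links found in the files that did scan.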
Sergey Gulin
943d0c881b
[#76] Add tests for virtualFiles glob patterns
Problem: The `virtualFiles` config allows the user to use glob patterns
to specify files that do not physically exist in the repository but
should be treated as existing nevertheless. However, we do not yet
have any tests around our usage of glob patterns. We should write some
to 1) ensure it behaves in a sensible way even in corner cases and 2)
document the behaviour.

Solution: Add tests that document how the `virtualFiles` glob patterns work.
2022-09-19 17:32:24 +10:00
Sergey Gulin
db77aaa9e4
[#137] Remove checkLocalhost option
Problem: In #85, we added the `checkLocalhost` option to decide
whether to verify links to localhost. However, upon further
reflection, it seems like this could have been subsumed by the
existing `ignoreRefs` option instead.

Solution: Remove `check-localhost` CLI option and `checkLocalhost`
config option. Add a regex matching localhost links to the
`ignoreRefs` field of the default config.
2022-09-16 03:00:53 +10:00
Sergey Gulin
332da5569e
[#77] Add support for glob patterns to ignored and notScanned
Problem: The `virtualFiles` config option supports glob patterns. On the
other hand, `ignored` only supports exact matches and `notScanned`
matches on prefixes. There is also a bug where `ignored` does not
ignore files if they contain broken xrefcheck annotations.

Solution: Add support for glob patterns to `ignored` and
`notScanned`. Filter ignored files before parsing their contents.
2022-09-08 22:49:27 +10:00
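To illustrate the three matching modes mentioned in [#77], here is a small sketch. `matchGlob` is a deliberately tiny, hypothetical matcher written for this example (only `*` and `?`, neither crossing a `/`); it is not the glob implementation xrefcheck uses, whose syntax and corner cases may differ.

```haskell
import Data.List (isPrefixOf)

-- Exact matching, roughly the old behaviour described for `ignored`.
matchExact :: FilePath -> FilePath -> Bool
matchExact = (==)

-- Prefix matching, roughly the old behaviour described for `notScanned`.
matchPrefix :: FilePath -> FilePath -> Bool
matchPrefix = isPrefixOf

-- Simplified glob matching (assumption): `?` matches one character and
-- `*` matches any run of characters; neither of them matches '/'.
matchGlob :: String -> FilePath -> Bool
matchGlob [] [] = True
matchGlob [] _ = False
matchGlob ('*' : ps) path =
  matchGlob ps path || case path of
    (c : cs) | c /= '/' -> matchGlob ('*' : ps) cs
    _ -> False
matchGlob ('?' : ps) (c : cs) = c /= '/' && matchGlob ps cs
matchGlob (p : ps) (c : cs) = p == c && matchGlob ps cs
matchGlob _ [] = False
```

For example, `matchGlob "docs/*.md" "docs/a.md"` holds, while the same pattern does not match `docs/sub/a.md` under these simplified rules.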
Sergey Gulin
a3f2d28216
[#125] Display URL parsing errors
Problem: We use a 2-step process to parse a URL: we use `parseURI` and
then `mkURIBs`. Both of these functions can fail. At the moment, we
discard their errors, simply throw an `ExternalResourceInvalidUri`,
and then display a generic error message to the user.

Solution: Catch errors from `parseURI` and `mkURIBs` and use them to
tell the user why the URL was invalid.
2022-09-08 22:31:12 +10:00
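A rough sketch of the error-propagation pattern described in [#125]. `parseStep` and `convertStep` are hypothetical stand-ins for the two real parsing steps, and their checks and error strings are invented for illustration; the point is only that each step's own error is kept and shown, rather than being replaced by one generic message.

```haskell
import Data.Bifunctor (first)
import Data.Char (isSpace)
import Data.List (isInfixOf)

-- Which of the two steps failed, and why.
data UrlError
  = ParseFailed String
  | ConversionFailed String
  deriving (Show, Eq)

-- Hypothetical stand-in for the first parsing step.
parseStep :: String -> Either String String
parseStep raw
  | any isSpace raw = Left "URL contains whitespace"
  | otherwise = Right raw

-- Hypothetical stand-in for the second, conversion step.
convertStep :: String -> Either String String
convertStep parsed
  | "://" `isInfixOf` parsed = Right parsed
  | otherwise = Left "URL has no scheme"

-- Run both steps, tagging each failure with the step it came from.
checkUrl :: String -> Either UrlError String
checkUrl raw = do
  parsed <- first ParseFailed (parseStep raw)
  first ConversionFailed (convertStep parsed)

-- Turn the preserved error into a message for the user.
describe :: UrlError -> String
describe (ParseFailed reason) = "invalid URL: " ++ reason
describe (ConversionFailed reason) = "unsupported URL: " ++ reason
```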
Sergey Gulin
36a1da6473
[#120] Fix bug with ignoring checks for relative anchors
Problem: When a file contains a reference to another file, and that
reference contains an anchor, that anchor is not checked.

Solution: Normalise relative anchor links before check.
2022-09-07 21:48:13 +10:00
Constantine Ter-Matevosian
80b5edd1c7
[#49] Allow certain reserved characters in the URLs
Problem: The current version of xrefcheck doesn't allow the square
brackets and some other special characters, like the angle brackets and
the curly brackets, to be present in the URLs, even in the query
strings, as they need to be percent-encoded first.

Solution: Allow some of the reserved characters, like the brackets, to
be present in the query strings of the URLs.
There exist two main standards of URL parsing: RFC 3986 and the Web
Hypertext Application Technology Working Group's URL standard. Ideally,
we want to be able to parse the URLs in accordance with the latter
standard, because it provides a much less ambiguous set of rules for
percent-encoding special characters, and is essentially a living
standard that gets updated constantly.
We allow these characters to be present in the query strings by using
the `parseURI` function from the `uri-bytestring` library with
`laxURIParseOptions`.
2022-09-06 04:39:40 +10:00
Nurlan Alkuatov
86e17eb3a4
[#99] Support Retry-After headers with dates
Problem: We currently support `Retry-After` header values only as a
number of seconds. However, the HTTP specification states that the
header value can also be a date, e.g. `Wed, 21 Oct 2015 07:28:00 GMT`.

Solution: Support `Retry-After` headers with dates.
2022-09-05 13:17:05 +06:00
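One possible way to handle both forms of `Retry-After` from [#99], sketched with the `time` library. The function names are hypothetical, and only the date format from the commit message's example is handled, not the obsolete date formats that HTTP also permits.

```haskell
import Data.Time (UTCTime, defaultTimeLocale, diffUTCTime, parseTimeM)
import Text.Read (readMaybe)

-- Parse a date in the format of the example above,
-- e.g. "Wed, 21 Oct 2015 07:28:00 GMT".
parseHttpDate :: String -> Maybe UTCTime
parseHttpDate = parseTimeM True defaultTimeLocale "%a, %d %b %Y %H:%M:%S GMT"

-- Seconds to wait, given the current time and the raw header value.
-- A non-negative integer is taken as a delay in seconds; otherwise the
-- value is parsed as a date and the delay is the time left until it
-- (rounded up, and zero if the date is already in the past).
retryAfterSeconds :: UTCTime -> String -> Maybe Int
retryAfterSeconds now raw =
  case readMaybe raw of
    Just secs | secs >= 0 -> Just secs
    Just _ -> Nothing
    Nothing -> do
      date <- parseHttpDate raw
      pure (max 0 (ceiling (diffUTCTime date now)))
```

Passing the current time as an argument keeps the function pure; a caller would typically obtain it with `getCurrentTime`.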
Sergey Gulin
c4f0bfb9fc
[#94] Add support for the id attribute in anchors
Problem: The `name` attribute was deprecated, and web devs are now
encouraged to use the `id` attribute instead. We should add support
for the `id` attribute, while retaining support for the `name`
attribute.

Solution: Add support for the `id` attribute.
2022-08-29 18:41:20 +10:00
Sergey Gulin
114a6138a3
[#126] Remove note ignoreRefs is optional from config
Problem: We made `ignoreRefs` a required option. But the config file
generated with the `dump-config` option still contains a note that
`ignoreRefs` can be omitted.

Solution: Remove the note that says `ignoreRefs` can be omitted.
2022-08-27 04:54:56 +10:00
Sergey Gulin
f198777f57
[#126] Make ignoreRefs a required parameter
Problem: The `ignoreRefs` parameter, in the config file, was
optional. This is in stark contrast with all the other parameters, all
of which are required.

Solution: Make `ignoreRefs` a required parameter in the config file.
2022-08-26 21:48:33 +10:00
Nurlan Alkuatov
96960f7ebf
[#90] Forbid verifying a single file
Problem: Verifying a single file using the `-r` option misbehaves
when there are absolute links present in the file. Since the `-r`
option expects a directory, we can just forbid verifying a single
file.

Solution: Fail with an error message when the user tries to
specify a file as the repository's root directory.
2022-07-24 18:59:53 +06:00
Andrei Borzenkov
a6b4513587
[#95] Support HTML tag parsing compatible with HTML spec
Problem: We had a hardcoded HTML tag parser that doesn't work with all valid HTML tags.

Solution: Replace it with the `tagsoup` library, which takes care of all the parsing.
2022-07-17 20:48:51 +04:00
Andrei Borzenkov
2c41713578
[#104] Add maxRetries to configurable options
Problem: The `maxRetries` option was hardcoded in the source.

Solution: Add it to the verify options in the config and add a new CLI option to override the number of retries.
2022-07-14 17:25:52 +03:00
Constantine Ter-Matevosian
032395007b
[#31] Handle the "429 too many requests" errors
Problem: The current version of xrefcheck handles HTTP responses
with the 429 status code just like any other error, even though the
program itself could try to recover from such errors.

Solution: Each time a request to a given link results in a 429 error,
retrieve the delay (in seconds) from the Retry-After header of the HTTP
response or, if that header is absent, use a configurable default value,
and rerun the request once that delay has passed. Only after the number
of retries has reached its limit, which is currently hardcoded and not
configurable, is the 429 error considered 'unfixable', and no further
attempts to recover from it are made.

Additionally, the progress bar has been upgraded and the following
elements are supplied:
1. an extra color -- Blue -- indicating the errors that might get
   eliminated during the verification;
2. a timer showing the number of seconds left to wait before the
   request is restarted; if, during the verification, a new 429 error
   emerges with a Retry-After value greater than or equal to the
   elapsed time, the timer is immediately updated with that value and
   starts counting down again from scratch.
2022-07-14 17:25:52 +03:00
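The retry policy described in [#31] could be sketched roughly as follows. All names here (`Response`, `RetryConfig`, `verifyWithRetries`) are hypothetical, and the waiting and requesting actions are passed in as parameters so that the sketch does not depend on any HTTP library; the progress bar is left out entirely.

```haskell
import Data.Maybe (fromMaybe)

data Response
  = Success
  | TooManyRequests (Maybe Int) -- Retry-After delay in seconds, if present
  | OtherFailure String
  deriving (Show, Eq)

data RetryConfig = RetryConfig
  { rcMaxRetries :: Int   -- how many times a request may be rerun
  , rcDefaultDelay :: Int -- seconds to wait when Retry-After is absent
  }

-- Run `request` and, while it answers 429 and retries remain, wait for
-- the advertised (or default) delay and run it again. `request` receives
-- the zero-based attempt number; `wait` receives a delay in seconds.
verifyWithRetries
  :: Monad m
  => RetryConfig
  -> (Int -> m ())
  -> (Int -> m Response)
  -> m Response
verifyWithRetries cfg wait request = go 0
  where
    go attempt = do
      response <- request attempt
      case response of
        TooManyRequests retryAfter
          | attempt < rcMaxRetries cfg -> do
              wait (fromMaybe (rcDefaultDelay cfg) retryAfter)
              go (attempt + 1)
        -- Anything else, including a 429 with no retries left, is final.
        _ -> pure response
```

In `IO`, `wait` could be something like `threadDelay . (* 1000000)`; keeping it a parameter also makes the policy easy to exercise without real delays.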
Andrei Borzenkov
6cdaef2412
[#103] Specify port in golden test links instead of default one
Problem: We used default ports to test error reports when checking localhost links, but such a port may be in use by another program, in which case xrefcheck reports a different message.

Solution: Specify a port that is unlikely to be used by other programs.
2022-07-14 14:40:45 +04:00
Andrei Borzenkov
55268fee2c
[#106] Fix using ./ in paths
Problem: The `--ignored` and `--root` CLI options misbehaved when `./` was used in a path.

Solution: Add path normalisation and use path equality from `System.FilePath` instead of general-purpose string functions.
2022-07-13 16:13:38 +04:00
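A small illustration of why [#106] matters, using `normalise` and `equalFilePath` from `System.FilePath`. The helper names `isIgnored` and `normaliseAll` are hypothetical and only show the difference between string equality and path equality.

```haskell
import System.FilePath (equalFilePath, normalise)

-- Comparing raw strings treats "./docs/a.md" and "docs/a.md" as different.
sameAsString :: FilePath -> FilePath -> Bool
sameAsString = (==)

-- `equalFilePath` normalises both sides before comparing them.
samePath :: FilePath -> FilePath -> Bool
samePath = equalFilePath

-- Hypothetical helper: is `path` one of the ignored paths?
isIgnored :: [FilePath] -> FilePath -> Bool
isIgnored ignored path = any (`equalFilePath` path) ignored

-- Hypothetical helper: normalise user-supplied paths once, up front.
normaliseAll :: [FilePath] -> [FilePath]
normaliseAll = map normalise
```

Here `sameAsString "./docs/a.md" "docs/a.md"` is `False`, whereas `samePath` on the same arguments is `True`.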
Andrei Borzenkov
654d143113
[#105] Add hlint support, enable -Weverything
Problem: We had a lot of redundant dependencies and no linter to catch obvious errors.

Solution: Add hlint support and enable the `-Weverything` flag, fix all the hints they produce, and add hlint to the CI pipeline.
2022-07-13 11:08:01 +04:00
Andrei Borzenkov
e460301275
[#107] Replace file-embed library with inline config
Problem: With the new resolver version we received an obscure error when trying to cross-compile the project to Windows on CI. Downgrading file-embed to the old version did not help.

Solution: Inline the content of this file into the Haskell source using the raw-string-qq library, which lets us avoid escaping and typing newline characters.
2022-07-08 13:01:13 +04:00
Андреев Кирилл
878775ce33
Add regression tests for ignore link bug
2021-11-04 18:20:00 +04:00
Андреев Кирилл
d644a95734
[#71] Separate concerns in Node traversal.
Problem:  The tree traversal uses explicit recursion and
          does several not closely related things at once.

Solution: Separate different actions.
2021-11-01 15:25:34 +04:00
Kirill Andreev
92c3de5587
Improve readability of imports
Problem:  In
          ```
          import qualified Foo.Bar as Bar
          import Foo.Bar (Bar)
          ```
          names of the imported modules are on different
          vertical lines, which disables autosorting,
          and makes it harder to read.

Solution: Use `ImportQualifiedPost`
2021-11-01 15:25:29 +04:00
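For comparison, this is what the import pair looks like with `ImportQualifiedPost` enabled (GHC 8.10 or later). `Data.Map.Strict` stands in for the placeholder `Foo.Bar` module so that the snippet compiles, and the `anchors` value is purely illustrative.

```haskell
{-# LANGUAGE ImportQualifiedPost #-}

import Data.Map.Strict (Map)
import Data.Map.Strict qualified as Map

-- Both imports now start with the module name in the same column,
-- so they sort naturally and are easier to scan.
anchors :: Map FilePath [String]
anchors = Map.fromList [("README.md", ["install", "usage"])]
```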
Kirill Andreev
2f9a9d8599
Remove mixins
Problem:  Stack cannot build projects with mixins, and they were only
          used to splice universum instead of base.

Solution: Use `NoImplicitPrelude` and import `Universum`
          everywhere explicitly.
2021-11-01 15:24:25 +04:00
Constantine Ter-Matevosian
b9e7ffb99d
[#75] Fix the root with an appended slash support
Problem: The results of the repository analysis will always contain
invalid references if the root contains a trailing forward slash.

Solution: Strip the root's trailing slash (if present) before passing
it to the System.FilePath.Posix.takeDirectory function.
2021-10-26 15:16:47 +03:00
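The effect that [#75] works around can be seen with `takeDirectory` itself: with a trailing slash it only removes the slash, while without one it removes the last path component. The `rootParent` helper below is a hypothetical illustration of stripping the slash first, not the actual code from the fix.

```haskell
import System.FilePath.Posix (dropTrailingPathSeparator, takeDirectory)

-- takeDirectory "repo/docs/" == "repo/docs"  (only the slash is dropped)
-- takeDirectory "repo/docs"  == "repo"       (the last component is dropped)

-- Hypothetical helper: give the same answer with or without the slash.
rootParent :: FilePath -> FilePath
rootParent = takeDirectory . dropTrailingPathSeparator
```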
Andrey Demidenko
c67ee9bd52
[#47] Handle ftp links
Problem:
Currently we support only http and https links. If there is an `ftp://`
link, you will get an exception.

Solution:
Use `ftp-client` to check the connection to the FTP server, inspect
response statuses and check that the file exists. This requires adding
new error types and a small refactoring.
Provide a test as a separate executable, which takes the FTP host as a
command-line argument.

Co-authored-by: Alexander Bantyev <alexander.bantyev@serokell.io>
2021-10-08 13:59:14 +03:00
Andrey Demidenko
24226ac2d8
[#81] Add config option for protected links
Problem:
We do not know what to do with protected links, because we cannot check
them. So we have two options: assume that these links are valid, or
assume that they are not.

Solution:
Provide config option for user to decide what to do - assume protected
links valid or not.
2021-10-08 13:29:19 +03:00
Andrey Demidenko
29d4f12c61
[#81] Add config option for localhost links
Problem:
Almost all the time we can't validate localhost links, so we just skip them.
But to run ftp links tests (#47) we need to refer to localhost.

Solution:
Add a config option that controls whether localhost links are ignored,
and provide bats tests for this new feature.

Co-authored-by: Alexander Bantyev <alexander.bantyev@serokell.io>
2021-10-08 13:29:19 +03:00
Андреев Кирилл
80a8c7dbfd
[#62] Fix tests failing after rebase
Problem:  The "edge-case" is now a normal case, and should be
          recognised.
Solution: Make the test expect it to be recognised.
2021-10-04 14:27:47 +04:00
Kirill Andreev
a61defd917
[#62] [INT-141] Add to changelog, update version
2021-10-04 14:12:09 +04:00
Kirill Andreev
7a2dce17fd
[#62] [INT-141] Add anchor recognition in headers
Problem:  Subtrees of header are ignored.

Solution: add subtrees of header to the traversal.
2021-10-04 14:12:02 +04:00
Constantine Ter-Matevosian
be09345c72
[#72] Fix anchor generation of headers with spaces
Problem: A header name with its custom anchor appended after the
octothorpe symbol(s) is parsed with the leading spaces present, so an
anchor generated from it starts with a hyphen.

Solution: Strip the header name so that the generated anchor has no
leading hyphen.
2021-09-03 09:19:45 +03:00
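A toy reproduction of the bug from [#72]. `simpleAnchor` is a deliberately simplified, hypothetical anchor generator (lower-case, drop most punctuation, turn spaces into hyphens); the real rules used by GitHub, GitLab and xrefcheck are more involved.

```haskell
import Data.Char (isAlphaNum, isSpace, toLower)
import Data.List (dropWhileEnd)

-- Remove leading and trailing whitespace.
strip :: String -> String
strip = dropWhileEnd isSpace . dropWhile isSpace

-- Simplified anchor generation (assumption, not the real algorithm):
-- lower-case the text, keep alphanumerics, spaces, '-' and '_',
-- then replace each remaining space with a hyphen.
simpleAnchor :: String -> String
simpleAnchor = map replaceSpace . filter keep . map toLower
  where
    keep c = isAlphaNum c || c `elem` " -_"
    replaceSpace c = if c == ' ' then '-' else c

-- Stripping first avoids the leading hyphen.
anchorFor :: String -> String
anchorFor = simpleAnchor . strip
```

With a header parsed as `" My header"`, `simpleAnchor` yields `-my-header`, while `anchorFor` yields `my-header`.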
Andrey Demidenko
91db52dd8d
Fix indentions
Problem:
Our style guide requires two-space indentation, but in some places we have four or even more spaces.

Solution:
Remove the extra spaces where the indentation contradicts our style guide.
2021-07-26 12:58:26 +01:00
martoon
045f4146cf
[#55] Generate config depending on the repo type
Problem: now that we include the repository type in the config, it
makes sense to generate the config differently depending on the
repository type, especially since some fields currently mix GitHub- and
GitLab-specific contents.

Solution:

Leave placeholders in the default config and later fill them from the
code depending on the required repository type.

Add a mandatory repository type parameter to `dump-config` CLI command.

Along with a test checking for config validity, add a golden test on the
produced config so that we can assess how sane it looks.
2021-05-03 17:01:20 +03:00
martoon
aac3d5aec7
[#55] Account for Markdown flavor type
Problem: it turned out that GitHub and GitLab render anchors for headers
differently.

Solution: make it possible to specify flavor in config, adjust
headers conversion respectively. Added some fixes for GitHub as well.

The flavor field is mandatory, so this change is breaking, but I
think it is worth it: otherwise other subtle bugs are possible, and
it is better if users specify the flavor as early as possible.
2021-05-03 16:57:28 +03:00
Zhenya Vinogradov
69f45d769a
Remove trailing whitespace everywhere
2021-03-11 14:47:34 +03:00
Alyona Antonova
bc9e497efb
[#135] Add tests and markdowns to check ignoring regex performance
Problem: There are no tests checking how ignoring
by regex behaves.

Solution: Add a test checking that broken links
matched by the regexes are not verified, and a
test checking that broken links that are not matched
are verified and reported as errors.
2021-03-09 22:18:07 +03:00
Alyona Antonova
41c6dbb3af
[#135] Add tests and markdowns to check ignoring modes performance
Problem: There are no tests checking the performance
of different ignoring modes.

Solution: Add such tests to `Xrefcheck.Test.IgnoringSpec`.
Also add markdowns to test on real files.
2021-03-09 22:18:07 +03:00
martoon
0c1d453e87
Rename module names
Problem: module names are prefixed with `Crv` which does not suit the
new project naming.

Solution: following the new `xrefcheck` name, rename modules so that
they start with `Xrefcheck`.
2020-01-14 19:41:47 +03:00
Ivan Gromakovskii
8060c7187b
[INT-128] Make the repository REUSE compliant
Problem: nowadays we want all files to store licensing information in
a machine-readable format and to use the `reuse` tool to check that.
But the repo is not REUSE compliant.

Solution: add `LICENSES` folder and licensing information for each
file.
2019-12-19 16:19:27 +03:00
martoon
0671986bcb
Make providing config optional
Problem: currently the `crossref-verify` executable does not work
without configuration, and users are expected to take the config from
the repository if they do not have their own. This is very inconvenient.

Solution: make the executable carry a built-in config and use it by default.
2019-12-03 23:19:37 +03:00
martoon
77cba6ccf2
Fix ./ local refs
Problem: for references starting with `./`, anchors are not checked.

Solution: the lookup for such references in the gathered repo info
failed; stripping `./` resolves the issue.
2019-05-08 02:12:44 +03:00
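A minimal sketch of the lookup fix described above, assuming (hypothetically) that the gathered repo info is a map from file paths without a `./` prefix to the anchors found in each file. The `RepoInfo` shape and the function names are illustrative only.

```haskell
import Data.List (stripPrefix)
import Data.Map.Strict (Map)
import qualified Data.Map.Strict as Map
import Data.Maybe (fromMaybe)

-- Hypothetical shape of the gathered repo info: file path -> anchors.
type RepoInfo = Map FilePath [String]

-- Drop a leading "./" if there is one; otherwise leave the path as is.
dropDotSlash :: FilePath -> FilePath
dropDotSlash path = fromMaybe path (stripPrefix "./" path)

-- Look a reference up after stripping "./", so that "./docs/a.md"
-- and "docs/a.md" find the same entry.
lookupAnchors :: RepoInfo -> FilePath -> Maybe [String]
lookupAnchors repo ref = Map.lookup (dropDotSlash ref) repo
```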
martoon
f229ca6e6d
Fix 'headerToAnchor'
Handle some unusual cases, such as emoji or punctuation in headers.
2019-04-09 03:12:13 +03:00
martoon
2f4101c361
Add tests on header to anchor conversion
2019-03-11 17:16:06 +03:00