daml/ci/cron
Gary Verhaegen 9da40ea426
ci/cron: fix retry policy (#8985)
I think the retry is clobbering the files. Here is my theory:
- The HTTP request is lazy, i.e. it starts producing a byte stream
  before it has finished downloading.
- The connection somehow crashes in the middle of that lazy handling,
  possibly because the Haskell code blocks for too long on something
  else and GCP thus closes the connection. (If this is true, making sure
  we download the entire thing before we start writing may make the
  download more reliable.) This explains why we get a "resource vanished"
  and not a plain 404 to start with.
- The retry policy doesn't know anything about HTTP requests; it just
  sees an IO action throwing an exception and restarts the whole thing.
- Because the IO action opens the file in Append mode, we thus end up
  with a file that is too big and has its "starting bytes" multiple
  times. That obviously fails to sign-check.

If this is what happens then the retry does not help at all, which does
seem to be what we've been observing (though I haven't tracked the exact
error rate too closely). The fix would likely be as simple as changing
`IO.AppendMode` to `IO.WriteMode` (which truncates, per [documentation]).
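
To illustrate the theory, here is a minimal sketch using only `base`. It is
not the actual code in `ci/cron`: `flakyDownload`, `retry` and `downloadWith`
are hypothetical names, and the "download" is simulated by an action that
writes a prefix and then throws on its first attempt only. It only shows how
the `IOMode` decides what a retried write leaves on disk.

```haskell
import Control.Exception (SomeException, throwIO, try)
import Data.IORef (IORef, modifyIORef', newIORef, readIORef)
import qualified System.IO as IO

-- Hypothetical stand-in for the lazy HTTP download: it always writes the
-- "starting bytes", then fails on the first attempt only.
flakyDownload :: IORef Int -> IO.Handle -> IO ()
flakyDownload attempts h = do
  n <- readIORef attempts
  modifyIORef' attempts (+ 1)
  IO.hPutStr h "start-"
  if n == 0
    then throwIO (userError "resource vanished (simulated)")
    else IO.hPutStr h "end"

-- Naive retry: knows nothing about HTTP or files, it just re-runs the whole
-- IO action on any exception, up to n more times.
retry :: Int -> IO a -> IO a
retry n action = do
  r <- try action
  case r of
    Right a -> pure a
    Left e
      | n > 0 -> retry (n - 1) action
      | otherwise -> throwIO (e :: SomeException)

-- Run the flaky download under retry with the given mode and return what
-- ended up in the file.
downloadWith :: IO.IOMode -> FilePath -> IO String
downloadWith mode path = do
  writeFile path "" -- start from an empty file
  attempts <- newIORef 0
  retry 3 (IO.withFile path mode (flakyDownload attempts))
  contents <- readFile path
  -- Force the lazy read so the handle is released before we return.
  length contents `seq` pure contents

main :: IO ()
main = do
  appended <- downloadWith IO.AppendMode "append-demo.txt"
  truncated <- downloadWith IO.WriteMode "write-demo.txt"
  putStrLn ("AppendMode: " ++ appended)
  putStrLn ("WriteMode:  " ++ truncated)
```

With `IO.AppendMode` the bytes written by the failed attempt stay in the file
and the retry adds to them, so the prefix appears twice. With `IO.WriteMode`
each attempt reopens and truncates the file, so only the output of the
successful attempt remains.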

[documentation]: https://hackage.haskell.org/package/base-4.14.1.0/docs/System-IO.html

CHANGELOG_BEGIN
CHANGELOG_END
2021-03-02 13:51:00 +00:00
perf Bump perf test for scalafmt update (#8444) 2021-01-09 13:12:54 +01:00
src ci/cron: fix retry policy (#8985) 2021-03-02 13:51:00 +00:00
BUILD.bazel Retry asset downloads in check_releases (#8730) 2021-02-03 12:04:59 +01:00
daily-compat.yml run PR builds on NOTICES updates (#8931) 2021-02-24 13:36:51 +00:00
tuesday.yml ci: pin down Ubuntu versions (#8388) 2021-01-05 10:39:59 +01:00
wednesday.yml trigger PRs job on generated PRs (#8489) 2021-01-13 11:30:17 +00:00