5 Commits

Author SHA1 Message Date
Gary Verhaegen
df0086d26f
ci/linux: kill machines if they fail to clean up (#8835)
It does not seem like CI machines recover from a failed clean-up. This
is not the most elegant solution possible, but it's a cheap one that
should work.

Note: shutting down the machine in the middle of the build will not
provide an error message to Slack for main branch builds (because the
`tell_slack_failed` step would need to run on the same machine) but will
correctly report failure for PRs (that was the original purpose of the
`collect_build_data` step).

An alternative here would be to give a delay to the shutdown command,
and try to calibrate it so that it's long enough for this job to
correctly report its failure to both Azure and Slack, while making it
short enough that no other job gets assigned to the machine. I'm not
clear enough on how often Azure assigns jobs to try and bet on that.
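
As a rough illustration of the two options discussed above (immediate shutdown versus a delayed one), here is a minimal Python sketch. It is not the actual CI script: the function names, the use of `sudo`, and the injected `run` callable are all assumptions made for the example. Only the `shutdown -h now` and `shutdown -h +N` forms are standard.

```python
import subprocess
from typing import Callable, List


def shutdown_command(delay_minutes: int = 0) -> List[str]:
    """Build a shutdown invocation; a delay of 0 means 'right now'."""
    if delay_minutes < 0:
        raise ValueError("delay_minutes must be >= 0")
    when = "now" if delay_minutes == 0 else f"+{delay_minutes}"
    # Assumption: the CI user is allowed to call shutdown through sudo.
    return ["sudo", "shutdown", "-h", when]


def clean_up_or_shut_down(
    clean_up: Callable[[], None],
    run: Callable[[List[str]], object] = subprocess.run,
    delay_minutes: int = 0,
) -> bool:
    """Run clean_up; if it raises, schedule a shutdown of the machine.

    Returns True when the clean-up succeeded, False otherwise.
    """
    try:
        clean_up()
        return True
    except Exception:
        # A failed clean-up is treated as unrecoverable for this machine.
        run(shutdown_command(delay_minutes))
        return False
```

A non-zero `delay_minutes` corresponds to the alternative described above: it leaves a window for the job to report its failure, at the risk of another job being assigned to the machine in the meantime.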

CHANGELOG_BEGIN
CHANGELOG_END
2021-02-12 19:32:14 +01:00
Gary Verhaegen
95c2184dcc
bump disk cleanup threshold (#8807)
I've seen [a build] failing with "disk full" after starting with 41GB
free.

[a build]: https://dev.azure.com/digitalasset/daml/_build/results?buildId=69270&view=logs&j=870bb40c-6da0-5bff-67ed-547f10fa97f2&t=deecee86-545a-596e-8b0d-fb7d606fe9f2
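
For readers unfamiliar with this kind of check, a threshold-based clean-up trigger might look like the sketch below. The function names are hypothetical, the threshold is deliberately left as a parameter because the real value is not stated here, and the unit (GiB rather than GB) is an assumption.

```python
import shutil

GIB = 1024 ** 3  # assumption: sizes are measured in GiB


def free_gib(path: str = "/") -> float:
    """Return the free space on the filesystem containing path, in GiB."""
    return shutil.disk_usage(path).free / GIB


def needs_cleanup(free: float, threshold: float) -> bool:
    """Clean up when free space has dropped below the threshold."""
    return free < threshold
```

With this shape, a machine that starts a build with 41 units free only triggers a clean-up if the threshold is set above 41, which is the motivation for raising it.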

CHANGELOG_BEGIN
CHANGELOG_END
2021-02-10 13:39:29 +00:00
Gary Verhaegen
ab5f62abac
ci/clean-up: s/lsof/pgrep (#8748)
CHANGELOG_BEGIN
CHANGELOG_END
2021-02-04 12:21:55 +01:00
Gary Verhaegen
5734730d50
tweak local cache cleanup (#8738)
For various reasons, my attempts at improving the cache cleanup process
have been delayed. There are, however, two simple, non-controversial
changes I can "backport" without having to wait for consensus on the
whole thing:

1. Increase the threshold. At least for the compat jobs, we have seen
   builds failing after starting with ~32GB free.
2. Kill dangling Bazel processes, which keep some files open and
   sometimes cause the clean-up process to crash.
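
The second item can be sketched as follows. This is an illustrative Python version, not the script from the commit: the helper names and the match pattern passed to `pgrep` are assumptions, and the signal is left configurable.

```python
import os
import signal
import subprocess
from typing import Iterable, List


def parse_pids(pgrep_output: str) -> List[int]:
    """Turn pgrep's one-PID-per-line output into a list of integers."""
    return [int(token) for token in pgrep_output.split()]


def find_pids(pattern: str) -> List[int]:
    """List processes whose full command line matches pattern."""
    # pgrep exits with status 1 and prints nothing when there is no match,
    # so the return code is not checked here.
    result = subprocess.run(
        ["pgrep", "-f", pattern], capture_output=True, text=True
    )
    return parse_pids(result.stdout)


def kill_all(pids: Iterable[int], sig: int = signal.SIGTERM) -> None:
    """Send sig to every PID, ignoring processes that already exited."""
    for pid in pids:
        try:
            os.kill(pid, sig)
        except ProcessLookupError:
            pass  # the process went away between listing and killing
```

Running something like `kill_all(find_pids("bazel"))` before the clean-up would release the open files mentioned above; the `"bazel"` pattern is a placeholder and may need to be narrower in practice.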

CHANGELOG_BEGIN
CHANGELOG_END
2021-02-03 19:47:52 +01:00
Gary Verhaegen
532f996f12
ci: clean-up hard drive more often (#8582)
CHANGELOG_BEGIN
CHANGELOG_END
2021-01-20 17:22:53 +01:00