Commit Graph

8 Commits

Author SHA1 Message Date
Gary Verhaegen
7c4b32aee9
use coreutils date on macos (#9228)
macOS uses BSD date by default which has slightly different options.

CHANGELOG_BEGIN
CHANGELOG_END
2021-03-24 13:35:02 +01:00
Gary Verhaegen
691edeacf2
ci: fix cache cleanup (#9137)
This is a continuation of #8595 and #8599. I somehow had missed that
`/etc/fstab` can be used to tell `mount` to let users mount some
filesystems with preset options.

This relies on the full history of `mount` hardening, so it should be safe
enough. The option `user` in `/etc/fstab` automatically disables any kind
of `setuid` feature on the mounted filesystem, which is the main attack
vector I know of.
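
As a rough illustration of the `user` option described above: per mount(8), `user` implies `noexec`, `nosuid`, and `nodev` unless a later option in the same list overrides them. The Python sketch below resolves a comma-separated `/etc/fstab` option string under that rule. The function name `effective_flags` and the sample option strings are hypothetical, and real `mount` handles more options (for example `owner` and `group`) than this simplified version.

```python
def effective_flags(mntops: str) -> dict:
    """Resolve exec/suid/dev for an fstab option string, left to right.

    Simplified sketch: only `defaults`, `user`/`users`, `nouser`, and the
    exec/suid/dev toggles are interpreted; everything else is ignored.
    """
    flags = {"exec": True, "suid": True, "dev": True}
    user_mountable = False
    for opt in (o.strip() for o in mntops.split(",")):
        if opt in ("user", "users"):
            # `user` implies noexec, nosuid, nodev (later options may override).
            user_mountable = True
            flags = {name: False for name in flags}
        elif opt == "nouser":
            user_mountable = False
        elif opt == "defaults":
            # `defaults` includes suid, dev, exec and nouser.
            user_mountable = False
            flags = {name: True for name in flags}
        elif opt in flags:
            flags[opt] = True
        elif opt.startswith("no") and opt[2:] in flags:
            flags[opt[2:]] = False
    return {"user_mountable": user_mountable, **flags}


# Hypothetical option strings, not taken from the actual CI machines.
print(effective_flags("user,noauto"))
print(effective_flags("user,suid"))
```

The second call shows why option order matters: an explicit `suid` after `user` re-enables setuid binaries, so an entry meant to be safe should not list it.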

This works flawlessly on my local VM, so hopefully this time's the
charm. (It also happens to be my third PR specifically targeted at this
issue, so, who knows, it may even work.)

CHANGELOG_BEGIN
CHANGELOG_END
2021-03-16 17:51:38 +01:00
Gary Verhaegen
c556db48ed
ci/clean-up: remove poweroff (#9108)
It's not working and I can't make it work (see #9096), so I'd rather
just remove it.

CHANGELOG_BEGIN
CHANGELOG_END
2021-03-12 10:48:34 +01:00
Gary Verhaegen
df0086d26f
ci/linux: kill machines if they fail to clean up (#8835)
It does not seem like CI machines recover from a failed clean-up. This
is not the most elegant solution possible, but it's a cheap one that
should work.

Note: shutting down the machine in the middle of the build will not
provide an error message to Slack for main branch builds (because the
`tell_slack_failed` step would need to run on the same machine) but will
correctly report failure for PRs (that was the original purpose of the
`collect_build_data` step).

An alternative here would be to give a delay to the shutdown command,
and try to calibrate it so that it's long enough for this job to
correctly report its failure to both Azure and Slack, while making it
short enough that no other job gets assigned to the machine. I'm not
clear enough on how often Azure assigns jobs to try and bet on that.

CHANGELOG_BEGIN
CHANGELOG_END
2021-02-12 19:32:14 +01:00
Gary Verhaegen
95c2184dcc
bump disk cleanup threshold (#8807)
I've seen [a build] failing with "disk full" after starting with 41GB
free.

[a build]: https://dev.azure.com/digitalasset/daml/_build/results?buildId=69270&view=logs&j=870bb40c-6da0-5bff-67ed-547f10fa97f2&t=deecee86-545a-596e-8b0d-fb7d606fe9f2

CHANGELOG_BEGIN
CHANGELOG_END
2021-02-10 13:39:29 +00:00
Gary Verhaegen
ab5f62abac
ci/clean-up: s/lsof/pgrep (#8748)
CHANGELOG_BEGIN
CHANGELOG_END
2021-02-04 12:21:55 +01:00
Gary Verhaegen
5734730d50
tweak local cache cleanup (#8738)
For various reasons, my attempts at improving the cache cleanup process
have been delayed. There are, however, two simple, non-controversial
changes I can "backport" without having to wait for consensus on the
whole thing:

1. Increase the threshold. At least for the compat jobs, we have seen
   builds failing after starting with ~32GB free.
2. Kill dangling Bazel processes, which keep some files open and
   sometimes cause the clean-up process to crash.
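
The two changes listed above could look roughly like the following Python sketch. This is not the actual CI script: the 80 GB threshold, the function names, and the `bazel` match pattern are assumptions made for illustration. Only `pgrep -f` and `os.kill` are used, both of which exist as shown.

```python
import os
import signal
import subprocess
from typing import List

GB = 10**9
THRESHOLD_GB = 80  # hypothetical value; the real threshold is not stated here


def needs_cleanup(free_bytes: int, threshold_gb: int = THRESHOLD_GB) -> bool:
    """Return True when free disk space is below the threshold."""
    return free_bytes < threshold_gb * GB


def parse_pids(pgrep_output: str) -> List[int]:
    """Turn pgrep's one-PID-per-line output into a list of ints."""
    return [int(token) for token in pgrep_output.split()]


def kill_dangling(pattern: str = "bazel") -> List[int]:
    """Send SIGTERM to processes whose command line matches `pattern`."""
    result = subprocess.run(
        ["pgrep", "-f", pattern], capture_output=True, text=True
    )
    pids = parse_pids(result.stdout)  # empty when pgrep finds nothing
    for pid in pids:
        try:
            os.kill(pid, signal.SIGTERM)
        except ProcessLookupError:
            pass  # the process exited between pgrep and kill
    return pids
```

A clean-up script built on this would call `needs_cleanup(shutil.disk_usage("/").free)` and, when it returns true, run `kill_dangling()` before deleting any cache directories, so that no lingering process still holds files open.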

CHANGELOG_BEGIN
CHANGELOG_END
2021-02-03 19:47:52 +01:00
Gary Verhaegen
532f996f12
ci: clean-up hard drive more often (#8582)
CHANGELOG_BEGIN
CHANGELOG_END
2021-01-20 17:22:53 +01:00