Mirror of https://github.com/digital-asset/daml.git, synced 2024-09-20 01:07:18 +03:00
631ed3e891
This bumps the timeout of the compat tests on PRs to 360 minutes, matching other jobs on a PR (we mainly hit this if ghc-lib is rebuilt), and the timeout on the daily jobs to 720 minutes (we hit this if _everything_ is rebuilt).

I am slightly worried about the timeout on the daily job. After having taken a look at it, there are a few reasons why we ended up here:

1. We started including more tests, e.g., sandbox-classic. Not much we can do here, those tests are useful.
2. We have a very large number of snapshots for 1.3.0. There are a few reasons for this:
   1. Timing: We branched off early for the 1.2.0 release so the first snapshot for 1.3 was on June 3rd. For 1.4 it looks like the first snapshot will be on July 15th so that’s roughly 2 extra snapshots just due to timing.
   2. Additional snapshots: We had one broken snapshot due to a broken VSCode extension that we didn’t delete (probably not worth doing at this point). We also had to backport to an old snapshot which resulted in another extra snapshot. We also had one extra snapshot which was supposed to be the RC but wasn’t since the ANF revert needed to go in.

   The only thing that is clearly useless is the one broken snapshot but that doesn’t change things that much. I see 2 orthogonal options for improving this, assuming we agree that the current runtime is worryingly high.
   1. Prune snapshots more aggressively, e.g., only include the last 3 snapshots. That’s a pretty arbitrary decision but it would enforce a hard limit.
   2. Reduce test combinations. E.g., only test snapshots vs stable releases but not snapshots vs snapshots.
3. We end up forcing a full build quite frequently. Here are just 2 examples of how we’ve done that so far.
   1. Upgrade rules_haskell. Basically all tests are run by a Haskell binary so this forces a full rebuild.
   2. Change runfiles of `daml`.

I don’t think there is much we can do about 1 or 3, which leaves us with 2. One not entirely unreasonable option is to just do nothing. We did have periods where things went pretty smoothly for the most part, and each month we reset to a much smaller number of releases (we also have to start throwing out old stable releases at some point). Otherwise, reducing the number of test combinations seems the most promising option to me.

changelog_begin
changelog_end

..
cron
da-ghc-lib
docker/daml-sdk
patch_bazel_windows
build-unix.yml
build-windows.yml
check-changelog.sh
clear-shared-segments-macos.yml
compatibility_ts_libs.yml
compatibility-windows.yml
compatibility.yml
configure-bazel.sh
daily_tell_slack.yml
dev-env-install.sh
dev-env-push.py
postgresql.conf
report-end.yml
report-start.yml
slack_user_ids
tell-slack-failed.yml
windows-diagnostics.ps1
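
The commit message above weighs two ways of shrinking the compatibility test matrix: keeping only the last few snapshots, and skipping snapshot vs snapshot combinations. A rough sketch of how the two options change the number of version pairs is given below. It is not how the repository computes its test matrix; the function `compat_pairs`, its parameters, and the version labels are all hypothetical and chosen only for illustration, and the sketch assumes every ordered pair of versions is one test combination.

```python
from itertools import product

# Hypothetical version lists, oldest first (not the real release history).
STABLE = ["1.0.0", "1.1.1", "1.2.0"]
SNAPSHOTS = [f"1.3.0-snapshot-{i}" for i in range(1, 7)]


def compat_pairs(stable, snapshots, keep_snapshots=None, snapshot_vs_snapshot=True):
    """Return the ordered (version_a, version_b) pairs that would be tested.

    keep_snapshots: if set, keep only the most recent N snapshots.
    snapshot_vs_snapshot: if False, drop pairs where both sides are snapshots
    (this also drops a snapshot tested against itself).
    """
    if keep_snapshots is not None:
        # Guard against N == 0, since snapshots[-0:] would keep everything.
        snapshots = snapshots[-keep_snapshots:] if keep_snapshots > 0 else []

    stable_set = set(stable)
    versions = list(stable) + list(snapshots)

    pairs = []
    for a, b in product(versions, repeat=2):
        both_snapshots = a not in stable_set and b not in stable_set
        if both_snapshots and not snapshot_vs_snapshot:
            continue
        pairs.append((a, b))
    return pairs


print("all combinations:        ", len(compat_pairs(STABLE, SNAPSHOTS)))
print("last 3 snapshots only:   ", len(compat_pairs(STABLE, SNAPSHOTS, keep_snapshots=3)))
print("no snapshot vs snapshot: ", len(compat_pairs(STABLE, SNAPSHOTS, snapshot_vs_snapshot=False)))
print("both options:            ",
      len(compat_pairs(STABLE, SNAPSHOTS, keep_snapshots=3, snapshot_vs_snapshot=False)))
```

With these made-up inputs (3 stable releases, 6 snapshots), the full matrix has 81 pairs, pruning to 3 snapshots leaves 36, and dropping snapshot vs snapshot pairs leaves 45. The two options are independent, which matches the commit message calling them orthogonal.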