The "output was not created" errors seem to have become very
frequent. While taking out nodes seems to work as a bandaid, I’d like
to see if resetting the cache buys us a few days of not having to deal
with this. Admittedly, I don’t really have an explanation for why
resetting the cache should help if taking out the machines seems to do
something (suggesting that it hasn’t propagated fully).
changelog_begin
changelog_end
* Upgrade nixpkgs revision
* Remove unused minio
It used to be used as a gateway to push the Nix cache to GCS, but has
since been replaced by nix-store-gcs-proxy.
* Update Bazel on Windows
changelog_begin
changelog_end
* Fix hlint warnings
The nixpkgs update implied an hlint update which enabled new warnings.
* Fix "Error applying patch"
Since Bazel 2.2.0 the order of generating `WORKSPACE` and `BUILD` files
and applying patches has been reversed. This allows users to define
patches to these files that will not be immediately overwritten.
However, it also means that patches on another repository's original
`WORKSPACE` file will likely become invalid.
* a948eb7255
* https://github.com/bazelbuild/bazel/issues/10681
Hint: If you're generating a patch with `git` then you can use the
following command to exclude the `WORKSPACE` file.
```
git diff ':(exclude)WORKSPACE'
```
* Update rules_nixpkgs
* nixpkgs location expansion escaping
* Drop --noincompatible_windows_native_test_wrapper
* client_server_test using sh_inline_test
client_server_test used to produce an executable shell script in the
form of a text file output. However, since the removal of
`--noincompatible_windows_native_test_wrapper` this no longer works on
Windows since `.sh` files are not directly executable on Windows.
This change fixes the issue by producing the script file in a dedicated
rule and then wrapping it in a `sh_test` rule which also works on
Windows.
* daml_test using sh_inline_test
* daml_doc_test using sh_inline_test
* _daml_validate_test using sh_inline_test
* damlc_compile_test using sh_inline_test
* client_server_test find .exe on Windows
* Bump Windows cache for Bazel update
Remove `clean --expunge` after merge.
Co-authored-by: Andreas Herrmann <andreas.herrmann@tweag.io>
We've seen a series of failures of the form
```
ERROR: D:/a/1/s/daml-assistant/integration-tests/BUILD.bazel:162:1: output 'daml-assistant/integration-tests/create-daml-app-tests.exe' was not created
ERROR: D:/a/1/s/daml-assistant/integration-tests/BUILD.bazel:162:1: not all outputs were created or valid
```
across multiple machines. We suspect cache poisoning as the cause. This
increments the cache URL to effectively clear the cache.
changelog_begin
changelog_end
Co-authored-by: Andreas Herrmann <andreas.herrmann@tweag.io>
We are seeing
ERROR: D:/a/2/s/compiler/scenario-service/protos/BUILD.bazel:67:1: output 'compiler/scenario-service/protos/_obj/scenario_service_haskell_proto/ScenarioService.o' was not created
again, so following our experiments, let’s reset the cache to see if it
fixes anything.
changelog_begin
changelog_end
For some reason, platform_suffix doesn’t seem to provide enough
isolation to fix the “undeclared inclusion” errors even though it does
fix the issues for me locally.
This PR tries to address the problem by switching from
`platform_suffix` to modifying the actual URL of the cache.
To avoid leaking stuff from the local cache, I’ve added a
`clean --expunge` for now. We should be able to remove this once nodes have
been reset tomorrow. It will slow down nodes but that is clearly
better than having everything fail.
changelog_begin
changelog_end
When bumping the cache url on Windows, I accidentally also changed the
URL we push to on Linux and macOS. This is obviously a bad idea so
this PR fixes it.
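For illustration, a minimal sketch of keeping the cache bump scoped to Windows. The base URL, the version string, and the `cache_url` function are hypothetical placeholders, not the values or code used in this repository; `--remote_cache` is the regular Bazel flag.

```shell
#!/usr/bin/env bash
# Sketch only: URL, version, and function name are hypothetical.
set -euo pipefail

CACHE_BASE_URL="https://cache.example.com"  # hypothetical base URL
WINDOWS_CACHE_VERSION="v3"                  # hypothetical; bump to reset the Windows cache

# Print the remote cache URL for the given OS name.
# Only Windows gets the versioned path; Linux and macOS keep the base URL,
# so bumping the version cannot change where they push.
cache_url() {
  local os="$1"
  case "$os" in
    windows) printf '%s/%s\n' "$CACHE_BASE_URL" "$WINDOWS_CACHE_VERSION" ;;
    linux|macos) printf '%s\n' "$CACHE_BASE_URL" ;;
    *) echo "unknown OS: $os" >&2; return 1 ;;
  esac
}

# Emit a .bazelrc line for a Windows node.
echo "build --remote_cache=$(cache_url windows)"
```

Keeping the OS switch in one function makes it harder to bump the Windows URL and touch the other platforms by accident.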
changelog_begin
changelog_end
* Bump cache suffix
As discussed, we are going to bump this every time we feel like
resetting the cache might help. This is a temporary measure to get
some metrics on how often things break and if resetting the cache
helps.
changelog_begin
changelog_end
* Update configure-bazel as well
changelog_begin
changelog_end
* Include rules_haskell revision in platform suffix
Hopefully this makes CI a bit less of a dumpster fire. I’ve also
followed the comment and made the suffix actually 3 characters long
instead of 2 since that makes me worry less about collisions and
should hopefully still be short enough to not hit MAX_PATH.
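A rough sketch of how such a suffix could be derived, assuming it is a hash of the sources directory and the rules_haskell revision truncated to 3 characters. The `platform_suffix` function name, the use of `sha256sum`, and the placeholder revision are assumptions for illustration and not necessarily what ci/configure-bazel.sh does; `--platform_suffix` is the regular Bazel flag.

```shell
#!/usr/bin/env bash
# Sketch only: function name, hash choice, and revision value are assumptions.
set -euo pipefail

# Derive a 3-character hex suffix from the inputs that should isolate
# cache entries. Assumes sha256sum is on PATH (coreutils, Git Bash).
platform_suffix() {
  local src_dir="$1" rules_haskell_rev="$2"
  printf '%s\n%s\n' "$src_dir" "$rules_haskell_rev" | sha256sum | cut -c1-3
}

# "0123abcd" stands in for the real rules_haskell revision.
suffix="$(platform_suffix "$PWD" "0123abcd")"
echo "build --platform_suffix=-$suffix"
```

Three hex characters give 4096 possible values, which keeps the output path short while making accidental collisions between a handful of configurations unlikely.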
changelog_begin
changelog_end
* Update ci/configure-bazel.sh
Co-authored-by: Gary Verhaegen <gary.verhaegen@digitalasset.com>
* Include sources directory in the Bazel cache key
This should hopefully fix the “undeclared inclusion” errors we have
been getting daily on CI.
The details are in a comment but the short summary is that
the daily cron job is running in D:\a\1 whereas jobs on the same
machine afterwards run in D:\a\2. Because absolute paths leak in some
places, cache entries from one directory break builds in the other.
changelog_begin
changelog_end
* Add debugging output to the cache
* Apply platform_suffix on all Windows pipelines
To distinguish action keys between the compatibility and the main
workspace and avoid the "undeclared input(s)" error. We also modify the
main workspace's action cache keys to avoid poisoned cache items.
CHANGELOG_BEGIN
CHANGELOG_END
* Avoid exceeding MAX_PATH on Windows
Co-authored-by: Andreas Herrmann <andreas.herrmann@tweag.io>
This is a first step towards testing cross-version
compatibility. It doesn’t actually do much yet but hopefully it should
be easier to parallelize once we have the initial boilerplate in place
so ideally I’d like to address most missing things and issues in
separate PRs.
changelog_begin
changelog_end
This is a first step towards improving our docs release process. The
goal here is to get rid of the manual "publish docs" step. This is done
as a periodic check because we only want to run this for "published"
releases, i.e. the ones that are not marked as prerelease. Because the
act of publishing a release is a manual step that Azure cannot trigger
on, we instead opt for a periodic check.
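As a sketch of the selection step in such a periodic check: the input format (one release per line, `<tag> <prerelease>`), the `published_releases` function, and the example tags are all hypothetical and only illustrate filtering out prereleases.

```shell
#!/usr/bin/env bash
# Sketch only: input format, function name, and tags are hypothetical.
set -euo pipefail

# Read lines of the form "<tag> <prerelease>" on stdin, where <prerelease>
# is "true" or "false", and print the tags of published releases only.
published_releases() {
  awk '$2 == "false" { print $1 }'
}

# Example input; a real check would get this list from the release API.
printf '%s\n' "v1.0.0 false" "v1.1.0-snapshot true" "v1.1.0 false" \
  | published_releases
```

The periodic job would then publish docs only for the tags this step prints.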
Not included in this piece of work:
- Any change to the docs themselves; the goal here is to automate the
current process as a first step. Future plans for the docs themselves
include adding links to older versions of the docs.
- A better way to detect docs are already up-to-date, and abort if so.
- Including older versions of the docs.
- Switching the DNS record from the current AWS S3 bucket to this new
GCS bucket. That will be a manual step once we're happy with how the
new bucket works.