One of the outputs of our brainstorming about how to make CI better was
that it is annoying to have to "babysit" pull requests. This PR attempts
to introduce a notification mechanism by which Azure will notify people
on Slack when a build finishes, so they know they need to go and rerun
or merge the corresponding PR.
This commit also changes the existing $Slack.URL variable to
$Slack.team-daml, to make more explicit where the Slack message is being
sent to (Slack works with one token per destination channel). Both
$Slack.URL and $Slack.team-daml are currently defined as the same token
in Azure.
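As a rough illustration of the notification described above, the
following sketch builds the JSON body for a Slack incoming webhook. The
function name, its arguments and the message wording are assumptions
made for illustration; the only thing relied upon is that Slack
incoming webhooks accept a JSON object with a `text` field.

```python
import json


def slack_payload(pr_number: int, status: str, build_url: str) -> str:
    """Return the JSON body for a Slack incoming-webhook message.

    The message wording is hypothetical, not what the pipeline sends.
    """
    text = f"Build for PR #{pr_number} finished: {status}. Details: {build_url}"
    return json.dumps({"text": text})
```

The resulting string would then be POSTed to the webhook behind
`$Slack.team-daml`.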
CHANGELOG_BEGIN
CHANGELOG_END
Originally we ran the release step on both Linux and MacOS to handle
platform-dependent artifacts, in particular damlc.jar. However, we no
longer upload any platform-dependent artifacts as part of the release
script, and I hope we will never have to add any in the future.
So this PR removes the code for handling platform-dependent artifacts
in the release step and disables the release step on MacOS (while
still setting the variables like we do on Windows).
Currently the release step still costs us ~2 minutes on MacOS, which is
already our slowest platform, so hopefully this will speed things up a
bit.
changelog_begin
changelog_end
Currently, pretty much all of our builds are bottlenecked on
MacOS (mainly because the builders are slower and have worse
caching). The release step adds > 3min to each build which is a bit
annoying. It turns out that removing the calls to `bazel query` which
are used to check for missing Maven dependencies speeds this up by >
5x. Given that the check is platform independent anyway, we can just
disable it on MacOS.
changelog_begin
changelog_end
* make packages public
This uploads the typescript npm packages of the language support to the
npm registry in the release process.
CHANGELOG_BEGIN
CHANGELOG_END
* address moritz/gary's review
* generate the .npmrc file
* adding debug output
just in case the upload fails in the next release.
* reverse package order
This patch changes the call to the GitHub API that translates the
release notes from markdown to HTML to use gfm instead of plain
markdown. gfm is a superset of markdown that adds the following:
- GitHub usernames (`@`-mentions) are turned into links to the user's
profile page.
- Issues and PR numbers (`#1234`) are turned into links to the
corresponding issue or PR.
- Existing git shas are turned into links to the corresponding commit.
An example of this feature missing is the release notes for
[v0.13.42](https://blog.daml.com/release-notes/0.13.42-1), where
intended links such as
> - Rename argument in active contract to payload. See #3826.
are not rendered.
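For reference, a minimal sketch of the request body for GitHub's
`POST /markdown` endpoint in gfm mode. The helper name is made up for
illustration; the `text`, `mode` and `context` fields are the ones the
endpoint documents, with `context` naming the repository against which
references such as `#3826` are resolved.

```python
import json


def markdown_request_body(text: str, repo: str) -> str:
    """Build the JSON body for GitHub's `POST /markdown` endpoint.

    `repo` is an "owner/name" string, used as the gfm rendering context.
    """
    return json.dumps({"text": text, "mode": "gfm", "context": repo})
```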
CHANGELOG_BEGIN
CHANGELOG_END
We have recently added the option for this script to not build some
versions (because they are too old and external dependencies have
changed from under them). We have also changed the GitHub call to get
the full history of releases.
This PR changes the logic to generate the `versions.json` file so that
it only contains versions that we have either built or copied over.
Consequently, it also changes the logic to decide whether this job
should run to depend only on the latest version, rather than the whole
list of versions.
CHANGELOG_BEGIN
CHANGELOG_END
The GitHub API is paginated (30 items by default). This creates two
problems:
1. At the moment, older versions silently drop from the docs website,
without us having made any explicit decision about it.
2. When we prepare a new version, it gets created as a pre-release
version. Our script filters that out, but the filtering happens on our
end, so we end up with 29 published versions and the list is different
from the existing one. If the prerelease then gets dropped, the oldest
version comes back.
It is possible that we will sometime decide we do not want to keep old
documentation around forever, but that should be an explicit decision.
This patch changes the logic to fetch the list of versions from GitHub
so that we always get all the published versions (barring race
conditions inherent to that kind of paginated API).
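The idea can be sketched as follows: keep requesting pages until one
comes back empty, and only then filter out prereleases. This is an
illustration rather than the actual script; `fetch_page` is a
hypothetical callback standing in for the HTTP call, while `tag_name`
and `prerelease` are fields of GitHub's release objects.

```python
from typing import Callable, Dict, List


def fetch_all_releases(fetch_page: Callable[[int], List[Dict]]) -> List[Dict]:
    """Collect releases from every page, stopping at the first empty page."""
    releases: List[Dict] = []
    page = 1
    while True:
        batch = fetch_page(page)
        if not batch:
            break
        releases.extend(batch)
        page += 1
    return releases


def published_versions(releases: List[Dict]) -> List[str]:
    """Keep only the tags of releases not marked as prerelease."""
    return [r["tag_name"] for r in releases if not r["prerelease"]]
```

Filtering after all pages have been collected means a prerelease no
longer pushes an older published version out of the list.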
CHANGELOG_BEGIN
CHANGELOG_END
The docs build is currently not reproducible as it includes
to-the-minute time-of-build information. It also includes some Sphinx
binary caches which I suppose will also not be reproducible (though I
have not checked the details there).
This commit attempts to remove all sources of non-reproducibility from
the docs build, though this is hard to test without having a stable,
older release to compare with.
CHANGELOG_BEGIN
CHANGELOG_END
The latest changes to the docs cron have introduced a bug whereby the
"latest" version is determined including prereleases.
CHANGELOG_BEGIN
CHANGELOG_END
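A minimal sketch of the intended behaviour, assuming tags of the form
`v0.13.41` and release objects with the `tag_name` and `prerelease`
fields returned by GitHub. The function names are made up for
illustration and do not come from the cron script.

```python
from typing import Dict, List, Tuple


def version_key(tag: str) -> Tuple[int, ...]:
    """Turn a tag such as 'v0.13.41' into (0, 13, 41) for comparison."""
    return tuple(int(part) for part in tag.lstrip("v").split("."))


def latest_published(releases: List[Dict]) -> str:
    """Return the highest version among releases that are not prereleases."""
    stable = [r["tag_name"] for r in releases if not r["prerelease"]]
    return max(stable, key=version_key)
```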
This commit makes two conceptually independent changes:
1. It adds a checksum file to each version folder. This allows the
script to detect when a version has not been correctly uploaded.
2. It changes the script to first download the entire docs website, and
then reuse existing version folders where appropriate (i.e. when a
folder matches its checksum).
The hope is that this will reduce the time it takes to deploy a new
version, as only the current version should be rebuilt (in addition to
previous, failed versions).
The first time this cron runs (upon next release as per the current
setup), however, it will still rebuild all existing versions as they do
not currently have a checksum.
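One possible shape for such a check is sketched below. The checksum
file name, the choice of SHA-256 and the hashing scheme are all
assumptions made for illustration; the actual script may compute its
checksum differently.

```python
import hashlib
from pathlib import Path

CHECKSUM_FILE = "checksum"  # hypothetical name; the source does not give one


def folder_checksum(folder: Path) -> str:
    """Hash the relative paths and contents of all files, in sorted order."""
    digest = hashlib.sha256()
    files = sorted(p for p in folder.rglob("*") if p.is_file())
    for path in files:
        relative = path.relative_to(folder).as_posix()
        if relative == CHECKSUM_FILE:
            continue  # the checksum file does not cover itself
        digest.update(relative.encode("utf-8"))
        digest.update(b"\0")
        digest.update(path.read_bytes())
    return digest.hexdigest()


def is_intact(folder: Path) -> bool:
    """True if the folder has a checksum file that matches its contents."""
    checksum_path = folder / CHECKSUM_FILE
    if not checksum_path.is_file():
        return False
    return checksum_path.read_text().strip() == folder_checksum(folder)
```

A folder without a checksum file is reported as not intact, which
matches the first-run behaviour described above: every existing version
gets rebuilt once.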
CHANGELOG_BEGIN
CHANGELOG_END
This commit aims at mitigating two issues we have noticed with the
0.13.41 release:
1. The initial cron run for that release got interrupted at the
50-minute mark, which happened to be right in the middle of the s3 upload.
This means it had already changed the versions.json file, but had not
finished updating the actual html files. Right now, the docs.daml.com
website shows version 0.13.41 in the drop-down, but actually displays
the content for 0.13.40. Additionally, trying to explicitly visit the
website for 0.13.41 (https://docs.daml.com/0.13.41) yields a 404. Note
that this also means the cron job did not reach the "tell HubSpot"
point, so 0.13.41 did not get announced.
2. As the script also did not reach the "clear cache" step, subsequent
runs have been rebuilding the documentation for no reason as the
sequence of steps was: check versions.json through HTTP, get cached one,
see it's not up-to-date, build docs, check versions.json through s3 API,
bypassing the cache, see it's up-to-date, stop.
To address those issues, this PR changes the cron to:
1. Increase the timeout to 2h instead of 50 minutes.
2. Always check the versions.json file through s3, rather than go
through the HTTP cache first.
These are not complete solutions but I'm not sure how to do better given
that s3 does not have atomic operations.
* re-add cleanup for /tmp to remove 700ish mb of unneeded temp files made by the sdk installer
* Set WORKDIR to daml user home dir so that sdk tools can create files
* add daml sdk config defaults for auto-install and update-check sdk install RUN command
* add --no-cache to apk add to reduce size a little
* add line return to end of daml-config.yaml
Currently if the docs script fails, the Slack message we get mentions the commit title of the docs version that failed to build, which is not super useful. This ensures we get back to the current commit regardless of what happens with the Haskell script.
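The pattern is the usual "restore in a finally block" one. The sketch below is only an illustration: `build` and `checkout` are hypothetical callbacks standing in for the Haskell script and the git checkout respectively.

```python
from typing import Callable


def build_then_restore(
    build: Callable[[], None],
    checkout: Callable[[str], None],
    current_commit: str,
) -> None:
    """Run the build, then return to `current_commit` even if it fails."""
    try:
        build()
    finally:
        # Runs on success and on failure, so later steps (such as the
        # Slack message) see the current commit again.
        checkout(current_commit)
```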
* bazel: 0.28.1 --> 1.1.0
* bazel-watcher sha256
* Fix missing line in patch
* proto_source_root --> strip_import_prefix
See https://github.com/bazelbuild/bazel/issues/7153 for details.
* Update rules_nixpkgs
Required to avoid errors of the form
```
ERROR: An error occurred during the fetch of repository 'node_nix':
parameter 'sep' may not be specified by name, for call to method split(sep, maxsplit = None) of 'string'
```
and
```
ERROR: An error occurred during the fetch of repository 'node_nix':
Traceback (most recent call last):
File "/private/var/tmp/_bazel_runner/17d2b3954f1c6dcf5414d5453467df9a/external/io_tweag_rules_nixpkgs/nixpkgs/nixpkgs.bzl", line 149
_execute_or_fail(repository_ctx, <3 more arguments>)
File "/private/var/tmp/_bazel_runner/17d2b3954f1c6dcf5414d5453467df9a/external/io_tweag_rules_nixpkgs/nixpkgs/nixpkgs.bzl", line 318, in _execute_or_fail
fail(<1 more arguments>)
Cannot build Nix attribute 'nodejs'.
Command: [/Users/runner/.nix-profile/bin/nix-build, /private/var/tmp/_bazel_runner/17d2b3954f1c6dcf5414d5453467df9a/external/node_nix/nix/bazel.nix, "-A", "nodejs", "--out-link", "bazel-support/nix-out-link", "-I", "nixpkgs=/private/var/tmp/_bazel_runner/17d2b3954f1c6dcf5414d5453467df9a/external/nixpkgs/nixpkgs"]
Return code: 1
Error output:
src/main/tools/process-tools.cc:173: "setitimer": Invalid argument
```
* Update rules_scala
* .proto has been removed, use [ProtoInfo] instead
See
https://docs.bazel.build/versions/1.1.0/be/protocol-buffer.html#proto_library
* python3_nix add nix_file attribute
To avoid the following error
```
ERROR: /home/aj/tweag.io/da/da-bazel-1.1/BUILD:66:1: //:nix_python3_runtime depends on @python3_nix//:bin/python in repository @python3_nix which failed to fetch. no such package '@python3_nix//': Traceback (most recent call last):
File "/home/aj/.cache/bazel/_bazel_aj/5f825ad28f8e070f999ba37395e46ee5/external/io_tweag_rules_nixpkgs/nixpkgs/nixpkgs.bzl", line 149
_execute_or_fail(repository_ctx, <3 more arguments>)
File "/home/aj/.cache/bazel/_bazel_aj/5f825ad28f8e070f999ba37395e46ee5/external/io_tweag_rules_nixpkgs/nixpkgs/nixpkgs.bzl", line 318, in _execute_or_fail
fail(<1 more arguments>)
Cannot build Nix attribute 'python3'.
Command: [/home/aj/.nix-profile/bin/nix-build, "-E", "import <nixpkgs> { config = {}; overlays = []; }", "-A", "python3", "--out-link", "bazel-support/nix-out-link", "-I", "nixpkgs=/home/aj/.cache/bazel/_bazel_aj/5f825ad28f8e070f999ba37395e46ee5/external/nixpkgs/nixpkgs"]
Return code: 1
Error output:
error: anonymous function at /home/aj/.cache/bazel/_bazel_aj/5f825ad28f8e070f999ba37395e46ee5/external/nixpkgs/nixpkgs.nix:3:1 called with unexpected argument 'config', at (string):1:1
```
* rules_haskell unnamed string.split(_, maxsplit = _)
The keyword argument may no longer be named.
* string.replace(_, _, maxsplit = _) may not be named
* Move proto sources from deps to data
Fixes
```
ERROR: /home/aj/tweag.io/da/da-bazel-1.1/daml-lf/archive/BUILD.bazel:150:1: in deps attribute of scala_test rule //daml-lf/archive:daml_lf_archive_reader_tests_test_suite_src_test_scala_com_digitalasset_daml_lf_archive_DecodeV1Spec.scala: '//daml-lf/archive:daml_lf_1.6_archive_proto_srcs' does not have mandatory providers: 'JavaInfo'. Since this rule was created by the macro 'da_scala_test_suite', the error might have been caused by the macro implementation
```
* Define sha256 for haskell_ghc__paths
Bazel 1.1.0 fails on missing hashes.
* Disable --incompatible_windows_native_test_wrapper
* //compiler/daml-extension don't modify sources
Modifying sources in-place can cause issues on Windows, where build
actions are not sandboxed and changes on sources can affect other build
steps.
* bazel-genfiles --> bazel-bin
The bazel-genfiles symlink has been removed since Bazel 1.0.
See https://github.com/bazelbuild/bazel/issues/8651
* Mark dev_env_tool repository rule as configure
See
https://docs.bazel.build/versions/1.1.0/skylark/lib/globals.html#repository_rule
* Move data deps into data attribute
* Mark dev_env_tool as local = True
* Manually fetch @makensis_dev_env
* Shrink the docker image for the SDK by 57%
Wiping out the `/tmp` dir after installing the SDK does wonders.
@associahedron I wonder if we should do this in the assistant?
* Update release notes
Previously, we were installing the SDK as root which is probably not a
good idea. This PR adds a new `daml` user and fixes PATH (`$HOME` and
`~` both don’t work in this context).
* Fixes #1725: Correct Maven credential variables in CI release script.
Update documentation that referred explicitly to the old version to
refer to the new version.
* Fixes #1204: Release bindings and codegens to Maven Central.
Upload the Java and Scala Bindings with the respective code
generator binaries to Sonatype Open Source Repository
Host for synchronization with Maven Central.
* webide: build webide image when sdk releases
* add scripts which check the latest version of the sdk. If the webide
docker image version does not exist or is older than the sdk version,
they will kick off a build of the webide docker image
* add job to azure cron
* webide: minor response to review
* windows: fixed daml-lf tests for Windows by using Bazel's rlocation
* more consistent logging on CI; publishing Windows test logs on failure
* windows: fix daml-lf engine tests
* windows: add diff tool to msys
This is a first step towards improving our docs release process. The
goal here is to get rid of the manual "publish docs" step. This is done
as a periodic check because we only want to run this for "published"
releases, i.e. the ones that are not marked as prerelease. Because the
act of publishing a release is a manual step that Azure cannot trigger
on, we instead opt for a periodic check.
Not included in this piece of work:
- Any change to the docs themselves; the goal here is to automate the
current process as a first step. Future plans for the docs themselves
include adding links to older versions of the docs.
- A better way to detect docs are already up-to-date, and abort if so.
- Including older versions of the docs.
- Switching the DNS record from the current AWS S3 bucket to this new
GCS bucket. That will be a manual step once we're happy with how the
new bucket works.
This reverts commit 3d8acde916.
For some reason that commit seems to have resulted in a lot of
"unexpected end of file" errors during cache downloads. I do not know
what is going on here or how to fix it so let’s revert it for now.
* release: make 'ci/release.sh' runnable for dry runs.
release-dry-run.sh is outdated and duplicates logic from ci/release.sh, so it
got deleted.
* ledger-api-test-tool: release the tool together with the SDK components.
* ledger-api-test-tool: update docs to reflect distribution mechanism.
* ledger-api-test-tool: further docs refinements.
* Add Ledger API Test Tool mention into release notes.
ci/release.sh fails if the BUILD_SOURCEBRANCHNAME environment
variable is not set. This variable is normally set by the CI system,
but it is sometimes useful to run the script manually. Simply adding
an 'invalid' default to the check of the environment variable means
that the script still works if the variable is unbound.
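The same idea, expressed in Python rather than in the shell script
itself. The function name is made up for illustration; only the
variable name and the 'invalid' default come from the description
above.

```python
import os
from typing import Mapping


def source_branch(env: Mapping[str, str] = os.environ) -> str:
    """Return the branch name set by CI, or 'invalid' when it is unset."""
    return env.get("BUILD_SOURCEBRANCHNAME", "invalid")
```

With this default, any comparison against a real branch name simply
evaluates to false instead of aborting on an unbound variable.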
The newer version seems to segfault on MacOS quite often so let’s
downgrade for now. We should also try to see if we can find a
reasonable way of reproducing this and report it upstream.
As multiple platforms will create different annotated tags (because an
annotated tag includes a tag time), they will conflict on trying to
push. Therefore, we go for a lightweight tag for now, as those are
simple pointers to a commit and git will recognize that they are the
same and there is no conflict.
This rewrites the release script to be a lot simpler and significantly
faster:
- The artifacts are now declared in a separate yaml file which should
make it easier for people to modify and doesn’t clutter the actual
code.
- There is only a constant number of calls to Bazel which speeds up
the script quite a bit.
I verified that the release artifacts are the same as the ones we got
before. I also traced the calls to the jfrog binary in a fake release;
ignoring order, they are identical.
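To illustrate the "constant number of calls" point: instead of invoking
Bazel once per artifact, all targets can be passed to a single
`bazel build` invocation. The sketch below assumes each declared
artifact carries a `target` field; that field name and the helper are
hypothetical, and the real yaml schema may differ.

```python
from typing import Dict, List


def bazel_build_command(artifacts: List[Dict[str, str]]) -> List[str]:
    """Build one `bazel build` command line covering every artifact."""
    # De-duplicate and sort so the command is stable across runs.
    targets = sorted({a["target"] for a in artifacts})
    return ["bazel", "build", *targets]
```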
This adds `ci/azure-cleanup`, containing a script that talks to Azure Pipelines and removes agents older than 25 hours in a specific pool.
Machines are meant to be killed after 24 hours anyway; this makes sure they're properly unregistered from Azure Pipelines, too.
By doing this, we don't need to unregister nodes manually on shutdown.
The idea is to execute this every time a new agent is provisioned, once it has cloned the repo. We intend to clone the repo and pre-warm the caches there anyhow.
WIP until the repo fetching and cache pre-warming is present, too.
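The selection logic can be sketched as below. This is not the script itself: the `created_on` field is a hypothetical stand-in for whatever timestamp the Azure Pipelines API reports for an agent.

```python
from datetime import datetime, timedelta
from typing import Dict, List

MAX_AGE = timedelta(hours=25)  # machines are killed after 24 hours


def stale_agents(agents: List[Dict], now: datetime) -> List[Dict]:
    """Return the agents whose age exceeds MAX_AGE."""
    return [a for a in agents if now - a["created_on"] > MAX_AGE]
```

The one-hour margin over the 24-hour machine lifetime avoids unregistering an agent that is still finishing a job.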
cc @zimbatm
### Pull Request Checklist
- [x] Read and understand the [contribution guidelines](https://github.com/digital-asset/daml/blob/master/CONTRIBUTING.md)
- [x] Include appropriate tests
- [x] Set a descriptive title and thorough description
- [x] Add a reference to the [issue this PR will solve](https://github.com/digital-asset/daml/issues), if appropriate
- [x] Add a line to the [release notes](https://github.com/digital-asset/daml/blob/master/docs/source/support/release-notes.rst), if appropriate
NOTE: CI is not automatically run on non-members pull-requests for security
reasons. The reviewer will have to comment with `/AzurePipelines run` to
trigger the build.
Azure Pipelines has direct integration with GitHub, so we're just using
that. Releases on GitHub have to target a tag, so we also need to push
the tag as an intermediate step; we also need to include the platform
name in the artifact to avoid overwriting from different builds.
The two "GitHub release" steps depend on two Azure variables that are
not defined in the pipeline script. This may look like it should not
work, but in fact it does, because these variables are set by the
release script.
In Azure Pipelines, any build step can set variables for the next build
steps by outputting specially-formatted text to stdout. This text will
not appear in the build output displayed by Azure Pipelines. For example,
```
echo '##vso[task.setvariable variable=sauce]tomatoes'
```
would define the Azure variable `sauce` to have the string `tomatoes` as
its value for the next build steps.
See [0] for details.
[0]: https://docs.microsoft.com/en-us/azure/devops/pipelines/process/variables?view=azure-devops&tabs=yaml%2Cbatch#set-in-script
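For a step written in Python rather than shell, the same logging
command can be produced as sketched below. The helper name is made up
for illustration; the string format is the one shown in the `echo`
example above.

```python
def set_variable_command(name: str, value: str) -> str:
    """Format the Azure Pipelines logging command that sets a variable."""
    return f"##vso[task.setvariable variable={name}]{value}"
```

Printing the returned string to stdout from within a build step, e.g.
`print(set_variable_command("sauce", "tomatoes"))`, has the same effect
as the `echo` above.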
* nix: add more providers to terraform
* docs: make tarballs more reproducible
* ci: use the linux-pool pool
* ci: tweak the nix installation
handle the case where the user is root and on ubuntu
* infra: terraform fmt
* infra: add Azure Pipeline agents
* ci: only enable linux-pool for internal PRs
Without this PR, this variable ends up being set to the string
"variables['System.PullRequest.IsFork']", which means that we never
upload to the Bazel cache.