As we strive for more inclusiveness, we are becoming less comfortable
with historically-charged terms being used in our everyday work.
This is targeted for merge on Dec 26, _after_ the necessary
corresponding changes at both the GitHub and Azure Pipelines levels.
CHANGELOG_BEGIN
- DAML Connect development is now conducted from the `main` branch,
rather than the `master` one. If you had any dependency on the
digital-asset/daml repository, you will need to update this parameter.
CHANGELOG_END
This reverts commit 61e9df3eaf.
This interacts very badly with the fact that we check out old commits
for releases. While we could fix it for this particular issue, I don’t
think this buys us enough to make this worth doing and it makes it
easy to introduce issues in the future if we modify lib.sh
changelog_begin
changelog_end
trigger all releases from master
The 1.1.0 release went wrong and we had to trash it and release 1.1.1
instead. This is an attempt at identifying and correcting the root
cause behind that incident.
To understand the situation, we need to know how releases worked before
1.0. We had a one-line file called `LATEST` that specifies the git SHA and
version tag for the latest release. A change to that file triggered a
release with the specified release tag, built from the source tree of
the specified commit. The `LATEST` file looked something like:
```
f050da78c9 1.0.0-snapshot.20200411.3905.0.f050da78
```
To mark a release as stable, we would change it to look like this:
```
f050da78c9 1.0.0
```
i.e. simply drop the `-snapshot...` suffix. Even though the commit (and
thus the entire source tree we build from) is the same, we would need to
rebuild almost all of our release artifacts, as they embed the version
tag in various places and ways. That worked well as long as we could
assume we were doing trunk-based development, i.e. all releases would
always come from the same (`master`) branch.
When we released 1.0, and started work on 1.1, we had a few bug reports
for 1.0 that we decided should be resolved in a point release. We
decided that the best way to handle that would be to have a branch
starting on the release commit for 1.0, and then backport patches from
`master` to that branch. We adapted our build process to also watch the
`release/1.0.x` branch and, in particular, trigger a new release build if
the `LATEST` file in that branch changed. That worked well.
The plan going forward was to keep doing regular snapshot releases from
the `master` branch, and create support, point releases ("patch" releases
in semver) from dedicated branches.
On April 30, we made a snapshot release as an RC for 1.1.0, by changing
the `LATEST` file in the `master` branch. That release was built on commit
681c862d. On May 6, we decided to take a new snapshot as the RC for
1.1.0; we changed `LATEST` in `master` to designate 7e448d81 as the new
latest release.
On May 11, we noticed an issue that broke our builds. Without going into
details, an external artifact we depend on had changed in incompatible
ways. After fixing that on `master`, we reasoned that this would also
break the build of the final 1.1.0 release if we just tried to build
7e448d81 again. But as the target release date was May 13, we did not
want to take a new snapshot after that fix, as that would have included
one more week of work in the release, and given us no time to test it.
So we did what we did for the 1.0 branch, as it had worked well: we
created a branch that forked from `master` at commit 7e448d81 and called
it `release/1.1.x`, then cherry-picked the one fix to our build process to
work around the broken download. When the time came to make the final
1.1.0 build on May 13, we naturally picked the `LATEST` file from the
`release/1.1.x` branch and dropped the `-snapshot...` suffix. Importantly,
we did not need to update the target commit to include the "broken
download" fix as, in the meantime, the internet had fixed itself, and we
thus reasoned we should go for the exact code of the RC rather than
include an unnecessary, albeit seemingly harmless, change.
Everything went well with the release process. Tests went well too. Then
we got a report that an application that worked against the latest RC
broke with the final 1.1.0. The issue was that we had built the wrong
commit: by branching off at the point of the _target_ commit for the
latest snapshot, we did not have the change to the `LATEST` file that
designated that commit as the target. So the `LATEST` file in
`release/1.1.x` was still pointing to 681c862d.
I believe the root cause for this issue is the fact that we have
scattered our release process over multiple branches, meaning there is
no linear history of what was released and we are relying on people
being able to mentally manage multiple timelines. Therefore, I propose
to fix our release process so this should not happen again by
linearizing the release process, i.e. getting back to a situation where
all releases are made from a single branch, `master`.
Because we do want to be able to release _for_ multiple release branches
(to provide backports and bugfixes), we still need some way to
accommodate that. Having a single `LATEST` file in the same format as
before would not really work well: keeping track of interleaved release
streams on a single file would not really be easier than keeping track
of multiple branches.
My proposed solution is to instead have a multiline LATEST file, so that
all the release branch "tips" can be observed at the same time, and, as
long as we take care to only advance one release branch at a time, we
can easily keep track of each of them. This is what this PR does.
This required a few changes to our release process. Most notably:
- Obviously, as this is the main point of this PR, the build process has
once again been restricted to only trigger new releases from the
`master` branch.
- As our CI machinery cannot easily be made to produce multiple releases
from a single build, the `check_for_release` step will only recognize
a commit as a release trigger if it changes a single line in the
`LATEST` file. This restriction comes in addition to the existing one
that a release commit is only allowed to change either just the
`LATEST` file or both the `LATEST` and
`docs/source/support/release-notes.rst` files.
- The docs publication process has been changed to update _all_
published versions to display the _latest_ release notes page. This
means that the release notes page will always show you all published
versions, regardless of which version of the documentation you're
looking at. This also means that interleaving release notes correctly on
that page is a manual exercise.
- As per the intention of the new process, the `LATEST` file has been
updated to contained all existing post-1.0 stable releases. It should
also include all existing snapshot releases should we have more than one
at a time (say, should we discover an issue with 1.1.1 that required us
to work on a 1.1.2).
- The `release.sh` script has been dramatically simplified as I felt it
was trying to do too much and porting its existing functionality to a
multi-line `LATEST` file would be too hard.
CHANGELOG_BEGIN
CHANGELOG_END
This commit aims at enabling future patch releases; it is the
master-branch equivalent of #5569 (applied to the 1.0 release branch).
The only change between the two changelogs should be that this one also
changes the docs cron so it can find the trigger commits for patch
releases.
CHANGELOG_BEGIN
CHANGELOG_END
During the latest attempt at making a snapshot release (#4749),
everything seemingly went well except for the step of signing the
Windows installer.
Based on a very obscure erorr message (Access Denied) and a bit of
Google search, my current hypothesis is that the signing fails because
the artifact produced by Bazel is read-only. This was not an issue in
the previous setup because we were getting the installer from an Azure
internal download between different jobs. We are now getting the binary
directly from Bazel.
This also adds a `set -e` to the relevant Bash snippet, because ideally
they should always have one.
CHANGELOG_BEGIN
CHANGELOG_END
Somehow, in the current setup, the publish steps do not get executed on
master. This is what Azure reports:
```
Evaluating: and(succeeded(), eq('$(is_release)', 'true'),
eq(variables['Build.SourceBranchName'], 'master'), eq('linux', 'linux'))
Expanded: and(True, eq('$(is_release)', 'true'),
eq(variables['Build.SourceBranchName'], 'master'), eq('linux', 'linux'))
Result: False
```
So it looks like, in the condition, `${{parameters.is_release}}`
evaluates to the literal string `$(is_release)`. If we look at the point
of invocation of the ~function~ template, we can see:
```
- template: ci/build-unix.yml
parameters:
release_tag: $(release_tag)
name: 'linux'
is_release: $(is_release)
```
so it does not seem completely crazy. However, according to the
documentation, we should expect that to be replaced by the value of the
corresponding variable, as per:
```
variables:
release_sha: $[ dependencies.check_for_release.outputs['out.release_sha'] ]
release_tag: $[ coalesce(dependencies.check_for_release.outputs['out.release_tag'], '0.0.0') ]
trigger_sha: $[ dependencies.check_for_release.outputs['out.trigger_sha'] ]
is_release: $[ dependencies.check_for_release.outputs['out.is_release'] ]
```
What's interesting here is that, within `build-unix.yml`, we are also
using `release_tag` in the exact same way:
```
- bash: ./build.sh "_$(uname)"
displayName: 'Build'
env:
DAML_SDK_RELEASE_VERSION: ${{parameters.release_tag}}
```
and this time output from the build seems to show the value being
correctly substituted:
```
damlc - Compiler and IDE backend for the Digital Asset Modelling
Language
SDK Version: 0.13.55-snapshot.20200226.3266.d58bb459
Usage: <interactive> COMMAND
Invoke the DAML compiler. Use -h for help.
```
My current guess is that the (undocumented, as far as I can tell)
evaluation order is as follows:
1. In the template, syntactically replace all the parameters.
2. In the job definition, replace the call to the template with the code
of the template. So it is as if we had written the template directly in
the `azure-pipelines.yml` file, with `$(release_tag)` and
`$(is_release)`.
3. Run the build. When we reach the time to run this specific job,
we can evaluate the expressions for the variables and replace them in
the rest of the job.
So what is going wrong? I believe the issue is with the quotes,
preventing the substitution of `is_release`. They came directly from the
[documented
syntax](https://docs.microsoft.com/en-us/azure/devops/pipelines/process/conditions?view=azure-devops&tabs=yaml#use-a-template-parameter-as-part-of-a-condition),
but if the above evaluation order is correct, they should not be there.
There are actually two things going wrong here. The first one is that
the syntax `$()` is used to substitute a value in what Azure considers a
string. This is the case for `env` keys. However, the `condition` key
is not a string, it is an Azure "expression". Expressions have their own
evaluation rules and syntax, and in particular, `$()` is not a
substitution rule there, so when it sees `$()` in a string in an
expression (due to the quoptes), it leaves it alone.
Removing the quotes does not directly help, though, as we then end with
```
condition: eq($(is_release), 'true')
```
and `$()` is not valid syntax in an expression. The way to use variables
in an expression is `variables.name` (or `variables["name"]`, because
why have only one?).
So that means we have to pass variables to the template in different
ways depending on how they will be used. So much fun.
CHANGELOG_BEGIN
CHANGELOG_END
The existing approach is a historical accident. The reason for the
additional tarball/install jobs was that, in my original attempt, the
build steps would still build the current commit, as opposed to the
target commit.
This is not such an issue on Linux, but setting up the build environment
on Windows and macOS _again_ for no good reason is a pure waste of time
(and effort in getting it right). Now that the build steps build the
target commit (with the env var set), we can go back to the way things
were previously: just take the build products directly from the build
step.
CHANGELOG_BEGIN
CHANGELOG_END
Context
=======
After multiple discussions about our current release schedule and
process, we've come to the conclusion that we need to be able to make a
distinction between technical snapshots and marketing releases. In other
words, we need to be able to create a bundle for early adopters to test
without making it an officially-supported version, and without
necessarily implying everyone should go through the trouble of
upgrading. The underlying goal is to have less frequent but more stable
"official" releases.
This PR is a proposal for a new release process designed under the
following constraints:
- Reuse as much as possible of the existing infrastructure, to minimize
effort but also chances of disruptions.
- Have the ability to create "snapshot"/"nightly"/... releases that are
not meant for general public consumption, but can still be used by savvy
users without jumping through too many extra hoops (ideally just
swapping in a slightly-weirder version string).
- Have the ability to promote an existing snapshot release to "official"
release status, with as few changes as possible in-between, so we can be
confident that the official release is what we tested as a prerelease.
- Have as much of the release pipeline shared between the two types of
releases, to avoid discovering non-transient problems while trying to
promote a snapshot to an official release.
- Triggerring a release should still be done through a PR, so we can
keep the same approval process for SOC2 auditability.
The gist of this proposal is to replace the current `VERSION` file with
a `LATEST` file, which would have the following format:
```
ef5d32b7438e481de0235c5538aedab419682388 0.13.53-alpha.20200214.3025.ef5d32b7
```
This file would be maintained with a script to reduce manual labor in
producing the version string. Other than that, the process will be
largely the same, with releases triggered by changes to this `LATEST`
and the release notes files.
Version numbers
===============
Because one of the goals is to reduce the velocity of our published
version numbers, we need a different version scheme for our snapshot
releases. Fortunately, most version schemes have some support for that;
unfortunately, the SDK sits at the intersection of three different
version schemes that have made incompatible choices. Without going into
too much detail:
- Semantic versioning (which we chose as the version format for the SDK
version number) allows for "prerelease" version numbers as well as
"metadata"; an example of a complete version string would be
`1.2.3-nightly.201+server12.43`. The "main" part of the version string
always has to have 3 numbers separated by dots; the "prerelease"
(after the `-` but before the `+`) and the "metadata" (after the `+`)
parts are optional and, if present, must consist of one or more segments
separated by dots, where a segment can be either a number or an
alphanumeric string. In terms of ordering, metadata is irrelevant and
any version with a prerelease string is before the corresponding "main"
version string alone. Amongst prereleases, segments are compared in
order with purely numeric ones compared as numbers and mixed ones
compared lexicographically. So 1.2.3 is more recent than 1.2.3-1,
which is itself less recent than 1.2.3-2.
- Maven version strings are any number of segments separated by a `.`, a
`-`, or a transition between a number and a letter. Version strings
are compared element-wise, with numeric segments being compared as
numbers. Alphabetic segments are treated specially if they happen to be
one of a handful of magic words (such as "alpha", "beta" or "snapshot"
for example) which count as "qualifiers"; a version string with a
qualifier is "before" its prefix (`1.2.3` is before `1.2.3-alpha.3`,
which is the same as `1.2.3-alpha3` or `1.2.3-alpha-3`), and there is a
special ordering amongst qualifiers. Other alphabetic segments are
compared alphabetically and count as being "after" their prefix
(`1.2.3-really-final-this-time` counts as being released after `1.2.3`).
- GHC package numbers are comprised of any number of numeric segments
separated by `.`, plus an optional (though deprecated) alphanumeric
"version tag" separated by a `-`. I could not find any official
documentation on ordering for the version tag; numeric segments are
compared as numbers.
- npm uses semantic versioning so that is covered already.
After much more investigation than I'd care to admit, I have come up
with the following compromise as the least-bad solution. First,
obviously, the version string for stable/marketing versions is going to
be "standard" semver, i.e. major.minor.patch, all numbers, which works,
and sorts as expected, for all three schemes. For snapshot releases, we
shall use the following (semver) format:
```
0.13.53-alpha.20200214.3025.ef5d32b7
```
where the components are, respectively:
- `0.13.53`: the expected version string of the next "stable" release.
- `alpha`: a marker that hopefully scares people enough.
- `20200214`: the date of the release commit, which _MUST_ be on
master.
- `3025`: the number of commits in master up to the release commit
(included). Because we have a linear, append-only master branch, this
uniquely identifies the commit.
- `ef5d32b7ù : the first 8 characters of the release commit sha. This is
not strictly speaking necessary, but makes it a lot more convenient to
identify the commit.
The main downsides of this format are:
1. It is not a valid format for GHC packages. We do not publish GHC
packages from the SDK (so far we have instead opted to release our
Haskell code as separate packages entirely), so this should not be an
issue. However, our SDK version currently leaks to `ghc-pkg` as the
version string for the stdlib (and prim) packages. This PR addresses
that by tweaking the compiler to remove the offending bits, so `ghc-pkg`
would see the above version number as `0.13.53.20200214.3025`, which
should be enough to uniquely identify it. Note that, as far as I could
find out, this number would never be exposed to users.
2. It is rather long, which I think is good from a human perspective as
it makes it more scary. However, I have been told that this may be
long enough to cause issues on Windows by pushing us past the max path
size limitation of that "OS". I suggest we try it and see what
happens.
The upsides are:
- It clearly indicates it is an unstable release (`alpha`).
- It clearly indicates how old it is, by including the date.
- To humans, it is immediately obvious which version is "later" even if
they have the same date, allowing us to release same-day patches if
needed. (Note: that is, commits that were made on the same day; the
release date itself is irrelevant here.)
- It contains the git sha so the commit built for that release is
immediately obvious.
- It sorts correctly under all schemes (modulo the modification for
GHC).
Alternatives I considered:
- Pander to GHC: 0.13.53-alpha-20200214-3025-ef5d32b7. This format would
be accepted by all schemes, but will not sort as expected under semantic
versioning (though Maven will be fine). I have no idea how it will sort
under GHC.
- Not having any non-numeric component, e.g. `0.13.53.20200214.3025`.
This is not valid semantic versioning and is therefore rejected by
npm.
- Not having detailed info: just go with `0.13.53-snapshot`. This is
what is generally done in the Java world, but we then lose track of what
version is actually in use and I'm concerned about bug reports. This
would also not let us publish to the main Maven repo (at least not more
than once), as artifacts there are supposed to be immutable.
- No having a qualifier: `0.13.53-3025` would be acceptable to all three
version formats. However, it would not clearly indicate to humans that
it is not meant as a stable version, and would sort differently under
semantic versioning (which counts it as a prerelease, i.e. before
`0.13.53`) than under maven (which counts it as a patch, so after
`0.13.53`).
- Just counting releases: `0.13.53-alpha.1`, where we just count the
number of prereleases in-between `0.13.52` and the next. This is
currently the fallback plan if Windows path length causes issues. It
would be less convenient to map releases to commits, but it could still
be done via querying the history of the `LATEST` file.
Release notes
=============
> Note: We have decided not to have release notes for snapshot releases.
Release notes are a bit tricky. Because we want the ability to make
snapshot releases, then later on promote them to stable releases, it
follows that we want to build commits from the past. However, if we
decide post-hoc that a commit is actually a good candidate for a
release, there is no way that commit can have the appropriate release
notes: it cannot know what version number it's getting, and, moreover,
we now track changes in commit messages. And I do not think anyone wants
to go back to the release notes file being a merge bottleneck.
But release notes need to be published to the releases blog upon
releasing a stable version, and the docs website needs to be updated and
include them.
The only sensible solution here is to pick up the release notes as of
the commit that triggers the release. As the docs cron runs
asynchronously, this means walking down the git history to find the
relevant commit.
> Note: We could probably do away with the asynchronicity at this point.
> It was originally included to cover for the possibility of a release
> failing. If we are releasing commits from the past after they have been
> tested, this should not be an issue anymore. If the docs generation were
> part of the synchronous release step, it would have direct access to the
> correct release notes without having to walk down the git history.
>
> However, I think it is more prudent to keep this change as a future step,
> after we're confident the new release scheme does indeed produce much more
> reliable "stable" releases.
New release process
===================
Just like releases are currently controlled mostly by detecting
changes to the `VERSION` file, the new process will be controlled by
detecting changes to the `LATEST` file. The format of that file will
include both the version string and the corresponding SHA.
Upon detecting a change to the `LATEST` file, CI will run the entire
release process, just like it does now with the VERSION file. The main
differences are:
1. Before running the release step, CI will checkout the commit
specified in the LATEST file. This requires separating the release
step from the build step, which in my opinion is cleaner anyway.
2. The `//:VERSION` Bazel target is replaced by a repository rule
that gets the version to build from an environment variable, with a
default of `0.0.0` to remain consistent with the current `daml-head`
behaviour.
Some of the manual steps will need to be skipped for a snapshot release.
See amended `release/RELEASE.md` in this commit for details.
The main caveat of this approach is that the official release will be a
different binary from the corresponding snapshot. It will have been
built from the same source, but with a different version string. This is
somewhat mitigated by Bazel caching, meaning any build step that does
not depend on the version string should use the cache and produce
identical results. I do not think this can be avoided when our artifact
includes its own version number.
I must note, though, that while going through the changes required after
removing the `VERSION` file, I have been quite surprised at the sheer number of
things that actually depend on the SDK version number. I believe we should
look into reducing that over time.
CHANGELOG_BEGIN
CHANGELOG_END
They can weigh close to 1GB, and the internal Azure networks are
unreliable enough that this adds up significant amounts of time to our
already slow CI pipeline (usually around a minute, but I've seen at
least one case where it took almost 28 minutes).
Given that we pretty much never look a them, 🔥.
CHANGELOG_BEGIN
CHANGELOG_END
* bazel: 0.28.1 --> 1.1.0
* bazel-watcher sha256
* Fix missing line in patch
* proto_source_root --> strip_import_prefix
See https://github.com/bazelbuild/bazel/issues/7153 for details.
* Update rules_nixpkgs
Required to avoid errors of the form
```
ERROR: An error occurred during the fetch of repository 'node_nix':
parameter 'sep' may not be specified by name, for call to method split(sep, maxsplit = None) of 'string'
```
and
```
ERROR: An error occurred during the fetch of repository 'node_nix':
Traceback (most recent call last):
File "/private/var/tmp/_bazel_runner/17d2b3954f1c6dcf5414d5453467df9a/external/io_tweag_rules_nixpkgs/nixpkgs/nixpkgs.bzl", line 149
_execute_or_fail(repository_ctx, <3 more arguments>)
File "/private/var/tmp/_bazel_runner/17d2b3954f1c6dcf5414d5453467df9a/external/io_tweag_rules_nixpkgs/nixpkgs/nixpkgs.bzl", line 318, in _execute_or_fail
fail(<1 more arguments>)
Cannot build Nix attribute 'nodejs'.
Command: [/Users/runner/.nix-profile/bin/nix-build, /private/var/tmp/_bazel_runner/17d2b3954f1c6dcf5414d5453467df9a/external/node_nix/nix/bazel.nix, "-A", "nodejs", "--out-link", "bazel-support/nix-out-link", "-I", "nixpkgs=/private/var/tmp/_bazel_runner/17d2b3954f1c6dcf5414d5453467df9a/external/nixpkgs/nixpkgs"]
Return code: 1
Error output:
src/main/tools/process-tools.cc:173: "setitimer": Invalid argument
```
* Update rules_scala
* .proto has been removed, use [ProtoInfo] instead
See
https://docs.bazel.build/versions/1.1.0/be/protocol-buffer.html#proto_library
* python3_nix add nix_file attribute
To avoid the following error
```
ERROR: /home/aj/tweag.io/da/da-bazel-1.1/BUILD:66:1: //:nix_python3_runtime depends on @python3_nix//:bin/python in repository @python3_nix which failed to fetch. no such package '@python3_nix//': Traceback (most recent call last):
File "/home/aj/.cache/bazel/_bazel_aj/5f825ad28f8e070f999ba37395e46ee5/external/io_tweag_rules_nixpkgs/nixpkgs/nixpkgs.bzl", line 149
_execute_or_fail(repository_ctx, <3 more arguments>)
File "/home/aj/.cache/bazel/_bazel_aj/5f825ad28f8e070f999ba37395e46ee5/external/io_tweag_rules_nixpkgs/nixpkgs/nixpkgs.bzl", line 318, in _execute_or_fail
fail(<1 more arguments>)
Cannot build Nix attribute 'python3'.
Command: [/home/aj/.nix-profile/bin/nix-build, "-E", "import <nixpkgs> { config = {}; overlays = []; }", "-A", "python3", "--out-link", "bazel-support/nix-out-link", "-I", "nixpkgs=/home/aj/.cache/bazel/_bazel_aj/5f825ad28f8e070f999ba37395e46ee5/external/nixpkgs/nixpkgs"]
Return code: 1
Error output:
error: anonymous function at /home/aj/.cache/bazel/_bazel_aj/5f825ad28f8e070f999ba37395e46ee5/external/nixpkgs/nixpkgs.nix:3:1 called with unexpected argument 'config', at (string):1:1
```
* rules_haskell unnamed string.split(_, maxsplit = _)
The keyword argument may no longer be named.
* string.replace(_, _, maxsplit = _) may not be named
* Move proto sources from deps to data
Fixes
```
ERROR: /home/aj/tweag.io/da/da-bazel-1.1/daml-lf/archive/BUILD.bazel:150:1: in deps attribute of scala_test rule //daml-lf/archive:daml_lf_archive_reader_tests_test_suite_src_test_scala_com_digitalasset_daml_lf_archive_DecodeV1Spec.scala: '//daml-lf/archive:daml_lf_1.6_archive_proto_srcs' does not have mandatory providers: 'JavaInfo'. Since this rule was created by the macro 'da_scala_test_suite', the error might have been caused by the macro implementation
```
* Define sha256 for haskell_ghc__paths
Bazel 1.1.0 fails on missing hashes.
* Disable --incompatible_windows_native_test_wrapper
* //compiler/daml-extension don't modify sources
Modifying sources in-place can cause issues on Windows, where build
actions are not sandboxed and changes on sources can affect other build
steps.
* bazel-genfiles --> bazel-bin
The bazel-genfiles symlink has been removed since Bazel 1.0.
See https://github.com/bazelbuild/bazel/issues/8651
* Mark dev_env_tool repository rule as configure
See
https://docs.bazel.build/versions/1.1.0/skylark/lib/globals.html#repository_rule
* Move data deps into data attribute
* Mark dev_env_tool as local = True
* Manually fetch @makensis_dev_env
* windows: fixed daml-lf tests for Windows by using Bazel's rlocation
* more consistent logging on CI; publishing Windows test logs on failure
* windows: fix daml-lf engine tests
* windows: add diff tool to msys