For the used-to-be-rare-but-not-so-much-anymore case where the job fails
after having pushed its logs (without this the push fails as we can't
overwrite artifacts).
This was spurred by the fact that the "report_end" task sometimes fails
on m1 with the "install Bash lib" step just never finishing (and the
whole job then times out after 6h).
Hopefully by running fewer things we get fewer chances of these kinds of
weird issues.
Note that it's unclear if anything actually crashes on the m1 machines
or if this is a loss of connection between Azure Pipelines and the
machine. From what I've seen as soon as that job times out the machine
is able to successfully pick up other jobs. Speaking of, I've also
reduced the 6h timeouts to a more reasonable 3h.
We routinely have upwards of 3GB of logs. They are very rarely
downloaded, most people don't even know they're there. Uploading 3GB
takes time. This should make it faster, hopefully.
I also changed CI config so we run this on every build but only upload
on releases. That should hopefully make sure we catch this immediately
next time. The script is fast enough that this shouldn’t slow this
down meaningfully.
changelog_begin
changelog_end
New year, new copyright, new expected unknown issues with various files
that won't be covered by the script and/or will be but shouldn't change.
I'll do the details on Jan 1, but would appreciate this being
preapproved so I can actually get it merged by then.
CHANGELOG_BEGIN
CHANGELOG_END
They migrated our account so hopefully this should work without any
other changes and fix our publishing issues.
I’m keeping the long timeout for now since I don’t know what an
appropriate timeout is.
changelog_begin
changelog_end
This has now screwed us over for two releases (1.14 and currently
blocking 1.15) because we didn’t backport the change. While we could
backport this, it is annoying and provides little to no benefit given
that a failure here is harmless so let’s just ignore failures here.
changelog_begin
changelog_end
uname is the name for Linux and Linux_scala_2_12 which causes builds
to override each other and it looks like that might even break in case
of concurrent uploads although that could also be general flakiness in Azure.
changelog_begin
changelog_end
Even with the cache retries something still doesn’t seem to be cached
quite like I expect. I can’t really debug this without exec logs so
this PR starts publishing those.
changelog_begin
changelog_end
* Move artifact publishing out of yaml files
The current publishing process pretty much hardcodes the set of
artifacts we publish in the yaml config. This is a problem because we
always release from `main` so the yaml files are always
identical. However, we will add new artifacts over time and this
starts falling apart. This PR changes this such that the process
described in the yaml files is very generic and just uploads and
downloads everything in a directory whereas the details are handled in
bash scripts that will come from the respective release branch and are
therefore version-dependent.
As usual for these type of changes, I don’t have a great way to test
this. I did do some due diligence to test that at least the artifacts
are published correctly and I can download them but I can’t test the
actual publishing.
changelog_begin
changelog_end
* Update ci/copy-unix-release-artifacts.sh
Co-authored-by: Gary Verhaegen <gary.verhaegen@digitalasset.com>
* Update ci/copy-windows-release-artifacts.sh
Co-authored-by: Gary Verhaegen <gary.verhaegen@digitalasset.com>
* Update ci/publish-artifactory.sh
Co-authored-by: Gary Verhaegen <gary.verhaegen@digitalasset.com>
Co-authored-by: Gary Verhaegen <gary.verhaegen@digitalasset.com>
* Merge Maven uploads for different Scala versions
It turns out Maven will abort an existing staging operation if you
create a new one. This means our jobs race against each other. We
could try to fix that by either sequencing the jobs in a clever
way (annoying and can break things like rerunning if only parts
failed), or by creating more profiles (unclear if you can even have
two profiles for the same group id, even if you do, it’s annoying to
merge).
So in this PR I (grudgingly) merged both uploads into the Haskell
script. This isn’t all bad:
1. It moves some logic from bash embedded in yaml string literals into
Haskell code.
2. It duplicates some versions but it removes duplication in other
places so overall not too much worse.
3. It does however, make things slower. We don’t run this stuff in
parallel. That said, the release step is relatively small (< 5min) and
it only runs on Linux.
We could add CLI arguments to make the Scala versions configurable for
local development. Given that this is blocking releases, I wanted to
get something in that works first and then see what we need in that regard.
changelog_begin
changelog_end
* .
changelog_begin
changelog_end
* .
changelog_begin
changelog_end
* .
changelog_begin
changelog_end
* Fixup condition for running publish_mvn_npm
This needs to run for both linux and linux-scala-2.13
changelog_begin
changelog_end
* Update ci/build-unix.yml
Co-authored-by: Samir Talwar <samir.talwar@digitalasset.com>
Co-authored-by: Samir Talwar <samir.talwar@digitalasset.com>
My goal here is to investigate the new warning Azure has been showing
for the past few days:
> ##[warning]%25 detected in ##vso command. In March 2021, the agent command parser will be updated to unescape this to %. To opt out of this behavior, set a job level variable DECODE_PERCENTS to false. Setting to true will force this behavior immediately. More information can be found at https://github.com/microsoft/azure-pipelines-agent/blob/master/docs/design/percentEncoding.md
As far as I'm aware we are not deliberately passing in any `%25` in any
of our `vso` commands, so I was a bit surprised by this.
CHANGELOG_BEGIN
CHANGELOG_END
* include oauth2 logback config in release tarball
overlooked in https://github.com/digital-asset/daml/pull/8611
* Release trigger-service and oauth2-middleware JARs
changelog_begin
changelog_end
* drop from artifacts.yaml
Co-authored-by: Andreas Herrmann <andreas.herrmann@tweag.io>
Current reports look like:
```
Disk cache small enough:\n20G/home/vsts/.bazel-cache
```
because `echo` does not convert `\n`. An alternative would be to replace
`echo` with `printf`, but I have had enough issues with
subshells-in-strings lately that I prefer just avoiding them when
possible.
CHANGELOG_BEGIN
CHANGELOG_END
This is the equivalent of #8515 for Linux. There was some concern that
`bazel` would be upset at having that cache removed, so I spent a fair
amount of time trying to break it (on a Linux VM, as for some reason
`bazel` chooses not to use `~/.cache` on macOS). I could not make
`bazel` unhappy by deleting the whole thing. Deleting random files,
however, did end up producing error messages along the lines of:
```
$ bazel build //...
FATAL: corrupt installation: file '/home/vagrant/.cache/bazel/_bazel_vagrant/install/73d06d52dbf3a8e6ed43f5bf5f115eb0/embedded_tools/src/BUILD' is missing or modified. Please remove '/home/vagrant/.cache/bazel/_bazel_vagrant/install/73d06d52dbf3a8e6ed43f5bf5f115eb0' and try again.
```
which suggest busting the entire thing as a solution, so I think we're
safe here.
CHANGELOG_BEGIN
CHANGELOG_END
Hopefully this works around our recent CI disk space issues, while 80GB
should be large enough that it only happens once per machine per day, so
perf shouldn't be impacted too much.
CHANGELOG_BEGIN
CHANGELOG_END
As we strive for more inclusiveness, we are becoming less comfortable
with historically-charged terms being used in our everyday work.
This is targeted for merge on Dec 26, _after_ the necessary
corresponding changes at both the GitHub and Azure Pipelines levels.
CHANGELOG_BEGIN
- DAML Connect development is now conducted from the `main` branch,
rather than the `master` one. If you had any dependency on the
digital-asset/daml repository, you will need to update this parameter.
CHANGELOG_END
I have created the corresponding user and repositories on Artifactory,
and tested the `curl` command manually. I'll add the corresponding
credentials to Azure once this is approved.
CHANGELOG_BEGIN
CHANGELOG_END
* Move public code into daml-integration-api
CHANGELOG_BEGIN
[DAML Integration Kit]: Removed sandbox specific code from the API intended to be used by ledger integrations. Use the maven coordinates ``com.daml:participant-integration-api:VERSION`` instead of ``com.daml:ledger-api-server`` or ``com.daml:sandbox``.
CHANGELOG_END
This reverts commit 61e9df3eaf.
This interacts very badly with the fact that we check out old commits
for releases. While we could fix it for this particular issue, I don’t
think this buys us enough to make this worth doing and it makes it
easy to introduce issues in the future if we modify lib.sh
changelog_begin
changelog_end
Version is taken from the env var (or defaulted to 0.0.0) at build-time.
Since those two packages are not build by default by Bazel, we need to
add the env var to the Bash step where they do get explicitly built.
Fixes#6090.
CHANGELOG_BEGIN
- sandbox and http-api fatjars will now display correct version number.
CHANGELOG_END
trigger all releases from master
The 1.1.0 release went wrong and we had to trash it and release 1.1.1
instead. This is an attempt at identifying and correcting the root
cause behind that incident.
To understand the situation, we need to know how releases worked before
1.0. We had a one-line file called `LATEST` that specifies the git SHA and
version tag for the latest release. A change to that file triggered a
release with the specified release tag, built from the source tree of
the specified commit. The `LATEST` file looked something like:
```
f050da78c9 1.0.0-snapshot.20200411.3905.0.f050da78
```
To mark a release as stable, we would change it to look like this:
```
f050da78c9 1.0.0
```
i.e. simply drop the `-snapshot...` suffix. Even though the commit (and
thus the entire source tree we build from) is the same, we would need to
rebuild almost all of our release artifacts, as they embed the version
tag in various places and ways. That worked well as long as we could
assume we were doing trunk-based development, i.e. all releases would
always come from the same (`master`) branch.
When we released 1.0, and started work on 1.1, we had a few bug reports
for 1.0 that we decided should be resolved in a point release. We
decided that the best way to handle that would be to have a branch
starting on the release commit for 1.0, and then backport patches from
`master` to that branch. We adapted our build process to also watch the
`release/1.0.x` branch and, in particular, trigger a new release build if
the `LATEST` file in that branch changed. That worked well.
The plan going forward was to keep doing regular snapshot releases from
the `master` branch, and create support, point releases ("patch" releases
in semver) from dedicated branches.
On April 30, we made a snapshot release as an RC for 1.1.0, by changing
the `LATEST` file in the `master` branch. That release was built on commit
681c862d. On May 6, we decided to take a new snapshot as the RC for
1.1.0; we changed `LATEST` in `master` to designate 7e448d81 as the new
latest release.
On May 11, we noticed an issue that broke our builds. Without going into
details, an external artifact we depend on had changed in incompatible
ways. After fixing that on `master`, we reasoned that this would also
break the build of the final 1.1.0 release if we just tried to build
7e448d81 again. But as the target release date was May 13, we did not
want to take a new snapshot after that fix, as that would have included
one more week of work in the release, and given us no time to test it.
So we did what we did for the 1.0 branch, as it had worked well: we
created a branch that forked from `master` at commit 7e448d81 and called
it `release/1.1.x`, then cherry-picked the one fix to our build process to
work around the broken download. When the time came to make the final
1.1.0 build on May 13, we naturally picked the `LATEST` file from the
`release/1.1.x` branch and dropped the `-snapshot...` suffix. Importantly,
we did not need to update the target commit to include the "broken
download" fix as, in the meantime, the internet had fixed itself, and we
thus reasoned we should go for the exact code of the RC rather than
include an unnecessary, albeit seemingly harmless, change.
Everything went well with the release process. Tests went well too. Then
we got a report that an application that worked against the latest RC
broke with the final 1.1.0. The issue was that we had built the wrong
commit: by branching off at the point of the _target_ commit for the
latest snapshot, we did not have the change to the `LATEST` file that
designated that commit as the target. So the `LATEST` file in
`release/1.1.x` was still pointing to 681c862d.
I believe the root cause for this issue is the fact that we have
scattered our release process over multiple branches, meaning there is
no linear history of what was released and we are relying on people
being able to mentally manage multiple timelines. Therefore, I propose
to fix our release process so this should not happen again by
linearizing the release process, i.e. getting back to a situation where
all releases are made from a single branch, `master`.
Because we do want to be able to release _for_ multiple release branches
(to provide backports and bugfixes), we still need some way to
accommodate that. Having a single `LATEST` file in the same format as
before would not really work well: keeping track of interleaved release
streams on a single file would not really be easier than keeping track
of multiple branches.
My proposed solution is to instead have a multiline LATEST file, so that
all the release branch "tips" can be observed at the same time, and, as
long as we take care to only advance one release branch at a time, we
can easily keep track of each of them. This is what this PR does.
This required a few changes to our release process. Most notably:
- Obviously, as this is the main point of this PR, the build process has
once again been restricted to only trigger new releases from the
`master` branch.
- As our CI machinery cannot easily be made to produce multiple releases
from a single build, the `check_for_release` step will only recognize
a commit as a release trigger if it changes a single line in the
`LATEST` file. This restriction comes in addition to the existing one
that a release commit is only allowed to change either just the
`LATEST` file or both the `LATEST` and
`docs/source/support/release-notes.rst` files.
- The docs publication process has been changed to update _all_
published versions to display the _latest_ release notes page. This
means that the release notes page will always show you all published
versions, regardless of which version of the documentation you're
looking at. This also means that interleaving release notes correctly on
that page is a manual exercise.
- As per the intention of the new process, the `LATEST` file has been
updated to contained all existing post-1.0 stable releases. It should
also include all existing snapshot releases should we have more than one
at a time (say, should we discover an issue with 1.1.1 that required us
to work on a 1.1.2).
- The `release.sh` script has been dramatically simplified as I felt it
was trying to do too much and porting its existing functionality to a
multi-line `LATEST` file would be too hard.
CHANGELOG_BEGIN
CHANGELOG_END
Note: this is beta-level software. See documentation for the precise
guarantees this does and does not come with. (Documentation does not
exist at the time of opening this PR, but should exist by the time the
first version of this gets published.)
CHANGELOG_BEGIN
- We now publish Sandbox Next as an **ALPHA** standalone jar.
- We now publish the HTTP JSON API as a standalone jar.
CHANGELOG_END
bazel configuration does two things:
It modifies .bazelrc.local and it writes a temp file.
I’ve run `git clean` on a PR. This caused the temp file to be
removed. However `.bazelrc.local` stayed since it is in
`.gitignore`. This meant that the next time the formatting check ran
the `.bazelrc.local` pointed to the temp file but the temp file was no
longer there.
changelog_begin
changelog_end
This commit aims at enabling future patch releases; it is the
master-branch equivalent of #5569 (applied to the 1.0 release branch).
The only change between the two changelogs should be that this one also
changes the docs cron so it can find the trigger commits for patch
releases.
CHANGELOG_BEGIN
CHANGELOG_END
In https://github.com/digital-asset/daml/pull/5464 we removed the
hack for 0.13.55 but sadly also removed the condition to only run this
on Linux which I sadly missed during review. This PR adds it back.
changelog_begin
changelog_end
Calling a zip archive .tar.gz is rather confusing. The good news is
that apart from this, the release did work successfully this time so
not going to attempt another one until the proper 1.0 release.
changelog_begin
changelog_end
Last release attempt failed because when we tried to publish it from
macos it was already published from Linux.
Also includes a fix for the name of the zip.
changelog_begin
changelog_end
The formatting check used to be git-repo-dependent (see #4985), which is
preventing our candidate 0.13.55 from building.
This PR introduces a temporary hack to disable format checking on
release PRs & commits. It should be reverted once 0.13.55 is released.
CHANGELOG_BEGIN
CHANGELOG_END