Commit Graph

118 Commits

Author SHA1 Message Date
Moritz Kiefer
1b533561b4
Only publish JSON API to GH releases (#6620)
daml-on-sql isn’t quite ready

changelog_begin
changelog_end
2020-07-06 14:43:09 +00:00
Moritz Kiefer
bc3f485b9a
Update maven_install.json in compatibility tests (#6555)
We take our own libraries from latest_stable_version, which changed, but
we did not rerun pinning, so maven_install.json did not get updated.
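
For reference, a sketch of the repinning step that was skipped (assuming the
conventional rules_jvm_external target name; the exact target in the
compatibility workspace may differ):

```
# Regenerate maven_install.json from the current dependency versions.
bazel run @unpinned_maven//:pin
```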

changelog_begin
changelog_end
2020-07-01 00:49:53 +00:00
Gary Verhaegen
beb33f2ab1
add explanation for clearing shared segments (#6545)
As requested on #6530.

CHANGELOG_BEGIN
CHANGELOG_END
2020-06-30 13:21:32 +00:00
Gary Verhaegen
55776f92ba
clear shared memory segment on macOS (#6530)
For a while now we've had errors along the lines of

```
FATAL:  could not create shared memory segment: No space left on device
DETAIL:  Failed system call was shmget(key=5432001, size=56, 03600).
HINT:  This error does *not* mean that you have run out of disk space.
It occurs either if all available shared memory IDs have been taken, in
which case you need to raise the SHMMNI parameter in your kernel, or
because the system's overall limit for shared memory has been reached.
        The PostgreSQL documentation contains more information about
shared memory configuration.
child process exited with exit code 1
```

on macOS CI nodes, which we were not able to reproduce locally. Today I
managed to reproduce it, more or less by accident, and that allowed me to
dig a bit further.

The root cause seems to be that PostgreSQL, as run by Bazel, does not
always properly unlink the shared memory segment it uses to
communicate with itself. On my machine, running:

```
bazel test -t- --runs_per_test=100 //ledger/sandbox:conformance-test-wall-clock-postgresql
```

and eyeballing the results of

```
watch ipcs -mcopt
```

I would say about one in three runs leaks its memory segment. After much
googling and some head scratching trying to figure out the C APIs for
managing shared memory segments on macOS, I kind of stumbled on a
reference to `ipcrm` in a comment on some low-ranking StackOverflow
answer. It looks like it's working very well on my machine, even if I
run it while a test (and therefore an instance of pg) is running. I
believe this is because the command does not actually remove the shared
memory segments, but simply marks them for removal once the last process
stops using them. (At least that's what the manpage describes.)
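
For reference, a rough sketch of the manual cleanup described above (the loop
blindly marks every segment, which matches the blunt approach that worked
here; adjust to taste):

```
# Inspect shared memory segments (creator, owner, size, etc.).
ipcs -mcopt

# Mark all shared memory segments for removal; they only disappear once the
# last attached process exits, so this is safe to run while pg is up.
for id in $(ipcs -m | awk '/^m/ {print $2}'); do
  ipcrm -m "$id"
done
```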

CHANGELOG_BEGIN
CHANGELOG_END
2020-06-30 01:40:16 +02:00
Gary Verhaegen
c7ea0a8b08
automatically run update-versions on release (#6479)
This PR adds an extra post-release job to CI that will run the
[`compatibility/update-versions.sh`][0] script and open a PR with the
result.

[0]: cb82a8d6be/compatibility/update-versions.sh

CHANGELOG_BEGIN
CHANGELOG_END
2020-06-24 17:02:12 +02:00
Moritz Kiefer
416a568cbd
Release daml-on-sql and JSON API to GH releases (#6397)
fixes #6384

For now this keeps the GCP bucket as well. I would suggest keeping
that for 1.3 and dropping it in 1.4, but I don’t feel particularly strongly
about this, so I’m also happy to drop it now.

changelog_begin

- [SDK] The JSON API and DAML on SQL (sandbox-classic) are now
  published as fat JARs to github releases. The GCP bucket that
  contained the fat JARs will not receive releases > 1.3.

changelog_end
2020-06-18 13:08:18 +02:00
Moritz Kiefer
2c1d4cb805
Fix nix installation (#6400)
Nix now requires -L; I’ve gone ahead and normalized everything to
use -sfL, which we were already using in one place.
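
For reference, a minimal sketch of the normalized invocation (installer URL
shown as an example):

```
# -s: silent, -f: fail on HTTP errors, -L: follow redirects (now required).
curl -sfL https://nixos.org/nix/install | sh
```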

changelog_begin
changelog_end
2020-06-18 10:34:08 +02:00
Gary Verhaegen
7735acb833
add sha256sums to releases (#6263)
Based on discussion on #6258.
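
A minimal sketch of producing and verifying such a checksum file (artifact
paths are placeholders):

```
# Generate a checksum file covering the release artifacts, then verify it.
sha256sum release-artifacts/* > sha256sums
sha256sum -c sha256sums
```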

CHANGELOG_BEGIN
CHANGELOG_END
2020-06-08 14:19:04 +00:00
Gary Verhaegen
2fe320fe48
automated ghc-lib build (#6188)
automated ghc-lib build

This PR aims at automating the build of ghc-lib. The current process
still has a few manual steps; it needs to be updated because Bintray is
going away, so this seemed like a good opportunity to fully automate it.

This works like the "patch bazel on Windows" jobs: the filename will
contain a hash of the `ci/da-ghc-lib` folder, and the job will run only
if the corresponding filename does not yet exist on the GCS bucket. PRs
aiming at changing the ghc-lib version will need to run twice: once to
create the artifacts, and once to change the `stack-snapshot.yaml` file
to match.
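
A hedged sketch of the "skip if already published" pattern described above
(bucket name and artifact layout are placeholders, not the actual CI values):

```
# Derive the artifact name from a hash of the ci/da-ghc-lib folder and only
# build if that name is not yet present in the GCS bucket.
DIR_HASH=$(git ls-files ci/da-ghc-lib | xargs sha256sum | sha256sum | cut -d' ' -f1)
ARTIFACT="gs://example-bucket/ghc-lib/ghc-lib-$DIR_HASH.tar.gz"
if gsutil -q stat "$ARTIFACT"; then
  echo "ghc-lib for $DIR_HASH already published; skipping build."
else
  echo "building ghc-lib for $DIR_HASH..."
fi
```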

CHANGELOG_BEGIN
CHANGELOG_END
2020-06-04 12:05:03 +02:00
Gary Verhaegen
595f1e278d
fix fat-jar publish (#6046)
CHANGELOG_BEGIN
CHANGELOG_END
2020-05-20 15:08:53 +02:00
Gary Verhaegen
4882327db5
fix release diff (#6042)
CHANGELOG_BEGIN
CHANGELOG_END
2020-05-20 11:58:33 +02:00
Gary Verhaegen
fb6dc904a4
trigger all releases from master (#6016)
trigger all releases from master

The 1.1.0 release went wrong and we had to trash it and release 1.1.1
instead. This is an attempt at identifying and correcting the root
cause behind that incident.

To understand the situation, we need to know how releases worked before
1.0. We had a one-line file called `LATEST` that specifies the git SHA and
version tag for the latest release. A change to that file triggered a
release with the specified release tag, built from the source tree of
the specified commit. The `LATEST` file looked something like:

```
f050da78c9 1.0.0-snapshot.20200411.3905.0.f050da78
```

To mark a release as stable, we would change it to look like this:

```
f050da78c9 1.0.0
```

i.e. simply drop the `-snapshot...` suffix. Even though the commit (and
thus the entire source tree we build from) is the same, we would need to
rebuild almost all of our release artifacts, as they embed the version
tag in various places and ways. That worked well as long as we could
assume we were doing trunk-based development, i.e. all releases would
always come from the same (`master`) branch.

When we released 1.0, and started work on 1.1, we had a few bug reports
for 1.0 that we decided should be resolved in a point release. We
decided that the best way to handle that would be to have a branch
starting on the release commit for 1.0, and then backport patches from
`master` to that branch. We adapted our build process to also watch the
`release/1.0.x` branch and, in particular, trigger a new release build if
the `LATEST` file in that branch changed. That worked well.

The plan going forward was to keep doing regular snapshot releases from
the `master` branch, and create support, point releases ("patch" releases
in semver) from dedicated branches.

On April 30, we made a snapshot release as an RC for 1.1.0, by changing
the `LATEST` file in the `master` branch. That release was built on commit
681c862d. On May 6, we decided to take a new snapshot as the RC for
1.1.0; we changed `LATEST` in `master` to designate 7e448d81 as the new
latest release.

On May 11, we noticed an issue that broke our builds. Without going into
details, an external artifact we depend on had changed in incompatible
ways. After fixing that on `master`, we reasoned that this would also
break the build of the final 1.1.0 release if we just tried to build
7e448d81 again. But as the target release date was May 13, we did not
want to take a new snapshot after that fix, as that would have included
one more week of work in the release, and given us no time to test it.

So we did what we did for the 1.0 branch, as it had worked well: we
created a branch that forked from `master` at commit 7e448d81 and called
it `release/1.1.x`, then cherry-picked the one fix to our build process to
work around the broken download. When the time came to make the final
1.1.0 build on May 13, we naturally picked the `LATEST` file from the
`release/1.1.x` branch and dropped the `-snapshot...` suffix. Importantly,
we did not need to update the target commit to include the "broken
download" fix as, in the meantime, the internet had fixed itself, and we
thus reasoned we should go for the exact code of the RC rather than
include an unnecessary, albeit seemingly harmless, change.

Everything went well with the release process. Tests went well too. Then
we got a report that an application that worked against the latest RC
broke with the final 1.1.0. The issue was that we had built the wrong
commit: by branching off at the point of the _target_ commit for the
latest snapshot, we did not have the change to the `LATEST` file that
designated that commit as the target. So the `LATEST` file in
`release/1.1.x` was still pointing to 681c862d.

I believe the root cause for this issue is the fact that we have
scattered our release process over multiple branches, meaning there is
no linear history of what was released and we are relying on people
being able to mentally manage multiple timelines. Therefore, I propose
to fix our release process so this should not happen again by
linearizing the release process, i.e. getting back to a situation where
all releases are made from a single branch, `master`.

Because we do want to be able to release _for_ multiple release branches
(to provide backports and bugfixes), we still need some way to
accommodate that. Having a single `LATEST` file in the same format as
before would not really work well: keeping track of interleaved release
streams in a single file would be no easier than keeping track of
multiple branches.

My proposed solution is to instead have a multiline LATEST file, so that
all the release branch "tips" can be observed at the same time, and, as
long as we take care to only advance one release branch at a time, we
can easily keep track of each of them. This is what this PR does.
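
To illustrate, a multi-line `LATEST` might look like the following (the SHAs
and the snapshot entry are placeholders, not real release commits):

```
<sha-of-1.0.1-release>      1.0.1
<sha-of-1.1.1-release>      1.1.1
<sha-of-latest-snapshot>    <latest-snapshot-version>
```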

This required a few changes to our release process. Most notably:

- Obviously, as this is the main point of this PR, the build process has
  once again been restricted to only trigger new releases from the
  `master` branch.
- As our CI machinery cannot easily be made to produce multiple releases
  from a single build, the `check_for_release` step will only recognize
  a commit as a release trigger if it changes a single line in the
  `LATEST` file. This restriction comes in addition to the existing one
  that a release commit is only allowed to change either just the
  `LATEST` file or both the `LATEST` and
  `docs/source/support/release-notes.rst` files.
- The docs publication process has been changed to update _all_
  published versions to display the _latest_ release notes page. This
  means that the release notes page will always show you all published
  versions, regardless of which version of the documentation you're
  looking at. This also means that interleaving release notes correctly on
  that page is a manual exercise.
- As per the intention of the new process, the `LATEST` file has been
  updated to contain all existing post-1.0 stable releases. It would
  also include all existing snapshot releases, should we have more than one
  at a time (say, if we discovered an issue with 1.1.1 that required us
  to work on a 1.1.2).
- The `release.sh` script has been dramatically simplified as I felt it
  was trying to do too much and porting its existing functionality to a
  multi-line `LATEST` file would be too hard.

CHANGELOG_BEGIN
CHANGELOG_END
2020-05-19 19:18:10 +02:00
Gary Verhaegen
af939a7ee4
provisional beta deployment for daml-on-sql (#6024)
Note: this is beta-level software. See documentation for the precise
guarantees this does and does not come with. (Documentation does not
exist at the time of opening this PR, but should exist by the time the
first version of this gets published.)

CHANGELOG_BEGIN
- We now publish Sandbox Next as an **ALPHA** standalone jar.
- We now publish the HTTP JSON API as a standalone jar.
CHANGELOG_END
2020-05-19 18:11:26 +02:00
Moritz Kiefer
294c881a2a
Fix standard change check (#5958)
This check never triggered for changes to LATEST due to the trailing
slash in `has_changed`.

changelog_begin
changelog_end
2020-05-13 13:58:43 +02:00
Moritz Kiefer
4916a28682
Include create-daml-app tests in compatibility tests (#5945)
This is the first part of #5700

It adds tests that build create-daml-app using `daml build` and then
run the codegen and build the UI. Contrary to our main tests, these
also run on Windows. This turns out to be reasonably simple: we first
build the TypeScript libraries on Linux and then download them
on Windows.

There are two parts that are still missing from the tests in the main
workspace:

1. Building the extra feature. This should be fairly easy to add.
2. Running the puppeteer tests. At least macOS and Linux should be
   reasonably easy. I don’t know what horrors Windows will throw at
   us. This step is what actually makes this a compatibility
   test. Currently it doesn’t actually launch Sandbox and the JSON API.

Since this PR is already pretty large, I’d like to tackle those things
separately.

changelog_begin
changelog_end
2020-05-13 10:39:51 +02:00
Gary Verhaegen
bda565fa44
patching Bazel on Windows (infra bits, no patch yet) (#5918)
patch Bazel on Windows (ci setup)

We have a weird, intermittent bug on Windows where Bazel gets into a
broken state. To investigate, we need to patch Bazel to add more debug
output than present in the official distribution. This PR adds the basic
infrastructure we need to download the Bazel source code, apply a patch,
compile it, and make that binary available to the rest of the build.
This is for Windows only as we already have the ability to do similar
things on Linux and macOS through Nix.

This PR does not contain any interesting patch to Bazel, just the minimum
needed to check that we are actually using the patched version.

CHANGELOG_BEGIN
CHANGELOG_END
2020-05-12 23:16:04 +02:00
Gary Verhaegen
3899a59a11
switch back to hosted macOS nodes (#5935)
CHANGELOG_BEGIN
CHANGELOG_END
2020-05-11 22:59:33 +02:00
Gary Verhaegen
9b476416b8
switch back to Azure-provided macos nodes (#5920)
This is temporary. It looks like the macOS nodes are dead; @nycnewman is
looking into it, but in case he doesn't fix it in time, at least we
have a backup plan so we're not completely blocked on Monday.

CHANGELOG_BEGIN
CHANGELOG_END
2020-05-11 09:28:40 +02:00
Gary Verhaegen
4a6ab84b69
add default machine capability (#5912)
add default machine capability

We semi-regularly need to do work that has the potential to disrupt a
machine's local cache, rendering it broken for other streams of work.
This can include upgrading nix, upgrading Bazel, debugging caching
issues, or anything related to Windows.

Right now we do not have any good solution for these situations. We can
either not do those streams of work, or we can proceed with them and
just accept that all other builds may get affected depending on which
machine they get assigned to. Debugging broken nodes is particularly
tricky as we do not have any way to force a build to run on a given
node.

This PR aims at providing a better alternative by (ab)using an Azure
Pipelines feature called
[capabilities](https://docs.microsoft.com/en-us/azure/devops/pipelines/agents/agents?view=azure-devops&tabs=browser#capabilities).
The idea behind capabilities is that you assign a set of tags to a
machine, and then a job can express its
[demands](https://docs.microsoft.com/en-us/azure/devops/pipelines/process/demands?view=azure-devops&tabs=yaml),
i.e. specify a set of tags machines need to have in order to run it.

Support for this is fairly badly documented. We can gather from the
documentation that a job can specify two things about a capability
(through its `demands`): that a given tag exists, and that a given tag
has an exact specified value. In particular, a job cannot specify that a
capability should _not_ be present, meaning we cannot rely on, say,
adding a "broken" tag to broken machines.

Documentation on how to set capabilities for an agent is basically
nonexistent, but [looking at the
code](https://github.com/microsoft/azure-pipelines-agent/blob/master/src/Microsoft.VisualStudio.Services.Agent/Capabilities/UserCapabilitiesProvider.cs)
indicates that they can be set by using a simple `key=value`-formatted
text file, provided we can find the right place to put this file.

This PR adds this file to our Linux, macOS and Windows node init scripts
to define an `assignment` capability and adds a demand for a `default`
value on each job. From then on, when we hit a case where we want a PR
to run on a specific node, and to prevent other PRs from running on that
node, we can manually override the capability from the Azure UI and
update the demand in the relevant YAML file in the PR.
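
A hedged sketch of the init-script addition (the exact file name and location
are whatever the linked agent code reads; the path below is an assumption):

```
# Declare the agent's capability as a key=value line; jobs then demand
# "assignment -equals default" to be scheduled on this machine.
echo "assignment=default" > "$AGENT_DIR/.capabilities"
```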

CHANGELOG_BEGIN
CHANGELOG_END
2020-05-09 18:21:42 +02:00
Gary Verhaegen
2b0c59f8af
detect cancellation in notify-user (#5895)
This PR changes the notify_user job to not run when the job has been
canceled, which happens mostly when we push new code.

Not sure how I failed to see the `canceled` function in the past, but
this does seem to do exactly what we want.

CHANGELOG_BEGIN
CHANGELOG_END
2020-05-07 19:35:45 +02:00
Gary Verhaegen
11a2fc3c2d
more flexible perf test check (#5891)
This PR separates the "last known valid perf test" commit from the
"baseline speedy implementation" commit. For the perf test to be
meaningful, the changes between those two commits need to be benign,
say minor API adjustments, so that the measurement remains comparable.

This also adds a check on merging to master that tells Slack if the perf
test has changed and the `test_sha` file needs updating. The Slack
message is conditional on the current commit to avoid excessive noise.
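
Roughly the kind of check described (the `test_sha` path and the perf-test
location are assumptions for illustration, not the actual file names):

```
# If the perf-test sources at HEAD differ from the recorded last-known-valid
# commit, emit the reminder that CI would forward to Slack.
if ! git diff --quiet "$(cat test_sha)" HEAD -- "$PERF_TEST_DIR"; then
  echo "perf test changed since the recorded commit; please update test_sha"
fi
```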

CHANGELOG_BEGIN
CHANGELOG_END
2020-05-07 13:53:22 +02:00
Moritz Kiefer
b291e96ce1
Publish execution logs from Windows compatibility jobs (#5834)
Hopefully, this helps diagnose the Windows CI failures.

changelog_begin
changelog_end
2020-05-05 12:23:11 +02:00
Gary Verhaegen
49b4a8dad8
tweak release process for more reliable labeling (#5823)
Currently, there are quite a few releases that are lacking the
Standard-Change label, even though they did publish artifacts. This
makes our SOC2-compliance tracking a bit harder. For the past two
months, I have manually added the label after-the-fact while preparing
the monthly compliance report, but that doesn't seem like a great
solution.

This PR changes the release process to be more optimistic: assume the
release is going to succeed by putting in the label immediately, and
then (optionally) removing it if the release fails.

Note that the label should only be removed in the rare case where the
release was merged into master but somehow did not produce any artifact.
This can only happen if the Linux build fails quite early, which as far
as I know only happened once over the past two months when we had the
release notes race condition.

CHANGELOG_BEGIN
CHANGELOG_END
2020-05-04 15:35:58 +02:00
Andreas Herrmann
4c99f67814
Publish Bazel logs (#5821)
CHANGELOG_BEGIN
CHANGELOG_END

Co-authored-by: Andreas Herrmann <andreas.herrmann@tweag.io>
2020-05-04 11:14:40 +02:00
Gary Verhaegen
32fbf040aa
fail collect_build_data for windows_compat (#5779)
At the moment, collect_build_data will wait for the Windows
compatibility test to have "finished", but doesn't check its return
status. This means two things:

1. Should the compatibility test end without a success or error (e.g.
   communication broken between Azure and the node), the option to rerun
   failed jobs will not appear, as there will be no failed job.
2. The subsequent notify_user step will ignore failures in the
   compatibility_windows job when reporting to Slack, making for
   confusing reports.

CHANGELOG_BEGIN
CHANGELOG_END
2020-04-30 15:03:03 +02:00
Moritz Kiefer
49e19ebed1
Make compat tests work on windows (#5732)
* Make compat tests work on windows

This required some changes to the daml_sdk rule since the read-only
installation by the assistant breaks Bazel completely. We could have
applied those changes only on Windows, but I prefer the consistency
across platforms here over trying to stay close to how the SDK is
installed on user machines, given that the SDK installation is not
something we’ve had issues with.

I’ve excluded the postgresql tests for now. I don’t expect them to be
particularly hard to fix but I’ve already spent almost 2 days on this
and having some tests run on Windows seems like a clear improvement
over running no tests on Windows :)

changelog_begin
changelog_end

* Remove todo

changelog_begin
changelog_end
2020-04-28 16:06:36 +02:00
Gary Verhaegen
54d6782be3
drop v from release titles (#5742)
This is a minor, cosmetic change. Note that all our references to
releases are based on tags, and do not depend on the release title. This
is evidenced by the fairly random titles we used to have before the
title was set by CI, see e.g.
[0.13.53](https://github.com/digital-asset/daml/releases/tag/v0.13.53).

CHANGELOG_BEGIN
CHANGELOG_END
2020-04-28 11:51:16 +02:00
Gary Verhaegen
7ceda5678a
run compatibility tests on macos (#5723)
This PR extends the existing Linux compatibility tests to run on macOS
too. Fixes #5692.

CHANGELOG_BEGIN
CHANGELOG_END

Co-authored-by: Moritz Kiefer <moritz.kiefer@purelyfunctional.org>
2020-04-27 14:55:16 +02:00
Moritz Kiefer
0d1f21e4a2
Extend compatibility tests to test against HEAD (#5714)
fixes #5691

changelog_begin
changelog_end
2020-04-24 14:43:35 +02:00
Moritz Kiefer
df16cf7094
Extend compatibility tests to DAML on SQL (#5705)
* Extend compatibility tests to DAML on SQL

This feels a bit hacky since the runfiles don’t work quite like I
would expect them to but it’s at least not more hacky than what we do
for the head-based tests we currently have.

Progress towards #5695

changelog_begin
changelog_end

* Fix runfiles with more bash

changelog_begin
changelog_end

* remove redundant port options

changelog_begin
changelog_end

* Create fewer sandbox targets

changelog_begin
changelog_end

* Apply suggestions from code review

Co-Authored-By: Andreas Herrmann <42969706+aherrmann-da@users.noreply.github.com>

* Fix runfiles snippet

changelog_begin
changelog_end

Co-authored-by: Andreas Herrmann <42969706+aherrmann-da@users.noreply.github.com>
2020-04-24 12:08:32 +02:00
Moritz Kiefer
7d36402412
Initial boilerplate for cross-version compatibility testing (#5665)
This is a first step towards testing cross-version
compatibility. It doesn’t actually do much yet, but hopefully it should
be easier to parallelize once we have the initial boilerplate in place,
so ideally I’d like to address most missing things and issues in
separate PRs.

changelog_begin
changelog_end
2020-04-23 12:58:11 +02:00
Gary Verhaegen
88c389c17a
enable patch releases (fix) (#5634)
This is applying, on `master`, the same patch as #5605 applied on
`release/1.0.x`.

CHANGELOG_BEGIN
CHANGELOG_END
2020-04-20 17:01:08 +02:00
Gary Verhaegen
7fb0c8c3ac
fix check_standard_change_label (#5600)
As currently written, the check will compare the current commit with the
base commit the branch was forked from. The intention was to only list
the changes from the current PR, and to make the check work for PRs
against release branches (the previous version of the check always
compared to master, regardless of what branch the PR was targeting).

However, this does not work as expected because the "current commit" is
not the tip of the PR, but the merge commit supplied by GitHub.
Therefore, the diff here will include not only the changes in this PR,
but also all the changes that happened on the target branch since
forking. This is not an issue if the PR is properly rebased, but that's
hard to control in a world where other people work too.

This PR corrects this by explicitly computing the diff between the fork
point on the target branch and the tip of the PR, ignoring the
currently-checked-out commit.
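
In git terms, the computation described here is roughly (branch and tip
variables assumed to be provided by CI):

```
# Diff from the fork point on the target branch to the PR tip, ignoring the
# GitHub-provided merge commit that is currently checked out.
FORK_POINT=$(git merge-base "origin/$TARGET_BRANCH" "$PR_TIP_SHA")
git diff --name-only "$FORK_POINT" "$PR_TIP_SHA"
```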

CHANGELOG_BEGIN
CHANGELOG_END
2020-04-17 13:54:18 +02:00
Gary Verhaegen
a1fab2d9af
enable patch releases (#5584)
This commit aims at enabling future patch releases; it is the
master-branch equivalent of #5569 (applied to the 1.0 release branch).

The only change between the two changelogs should be that this one also
changes the docs cron so it can find the trigger commits for patch
releases.

CHANGELOG_BEGIN
CHANGELOG_END
2020-04-16 17:50:55 +02:00
Gary Verhaegen
466fe1b518
switch to home-hosted macos nodes (#5543)
CHANGELOG_BEGIN
CHANGELOG_END
2020-04-14 18:18:30 +02:00
Gary Verhaegen
033b798009
cleanup collect_build_data job (#5548)
I recently noticed that the `check_for_release` and
`check_standard_change_label` jobs do not currently report their
runtime, so including them in the build data is a bit moot (we always
get `""` for all three values). Given that they usually run in under 3
seconds, I've decided the best way to fix this is to remove them from
the build data, rather than add the required steps to collect their
build times.

CHANGELOG_BEGIN
CHANGELOG_END
2020-04-14 15:33:08 +00:00
Gary Verhaegen
7d55095f17
reenable collect_build_data and notify_user (#5545)
Looking at the behaviour of `succeededOrFailed`, it looks like it does
not do what we want at all: both steps now only run on failures. My
current hypothesis is that `write_ledger_dump` being skipped switches
the state of the last job to something that is neither success nor
failure.

It would be really nice if Azure had a way to detect cancellation. :(

CHANGELOG_BEGIN
CHANGELOG_END
2020-04-14 11:29:12 +00:00
Gary Verhaegen
1780466330
cleanup collect_build_data & notify_user steps (#5491)
Over the past three weeks or so, I have not seen a single case where the
"get commit" step failed erroneously, i.e. all failures were genuine
pushes, which we don't care about. Therefore, I have decided to remove
the `tell_gary` code, as it is long, a bit hairy, and duplicated.

In the meantime I've also discovered there actually is a way to tell
Azure not to run these steps on a canceled build, which I believe makes
sense, so I've added that.

CHANGELOG_BEGIN
CHANGELOG_END
2020-04-08 17:13:36 +02:00
Gary Verhaegen
8261af86d2
correct commit title in Slack msg (#5471)
Currently, on a release commit on master, if the commit fails, we get
the message from the target PR, which is confusing. This should
(hopefully; it's a bit hard to test as it would require setting up a
release PR that succeeds but fails on master) get us the title of the
release commit, which hopefully will be less confusing.

CHANGELOG_BEGIN
CHANGELOG_END
2020-04-08 13:01:42 +02:00
Moritz Kiefer
be3d8bc301
Publish protobuf zip to github releases (#5418)
I cannot test this without actually making a release but it is all
copy-pasted from other targets so hopefully it works.

changelog_begin
changelog_end
2020-04-03 15:36:41 +02:00
Gary Verhaegen
1872c668a5
replace DAML Authors with DA in copyright headers (#5228)
Change requested by Manoj.

CHANGELOG_BEGIN
CHANGELOG_END
2020-03-27 01:26:10 +01:00
Moritz Kiefer
0ed3bbf2ce
Bump nix version (#4934)
Using the newest version seems like a good idea, and the previous one has
network errors.

changelog_begin
changelog_end
2020-03-11 12:26:29 +00:00
Gary Verhaegen
ef931e0b72
skip testing release script after making a release (#4911)
Currently, on Linux, after the normal build, we try running the release
script (in "dry run" mode). This is to check that the release script not
only compiles, but actually runs. To be honest I'm not entirely sure why
we do that as a separate step (i.e. why does `bazel test //...` not give
us confidence about this script?), but the point of this PR is that,
while there may be some benefit in running this script on normal PRs to
check that we have not broken the release step, there is absolutely no
point in running it _on a release build_, i.e. right after we've used
the same script in "real" ("wet run"? 🤔) mode.

CHANGELOG_BEGIN
CHANGELOG_END
2020-03-10 12:45:14 +01:00
Gary Verhaegen
41643315ac
hopefully fix Azure pipelines github tag release (#4912)
Maybe. If I'm reading
[that](https://docs.microsoft.com/en-us/azure/devops/pipelines/tasks/utility/github-release?view=azure-devops)
right.

CHANGELOG_BEGIN
CHANGELOG_END
2020-03-09 19:39:08 +01:00
Moritz Kiefer
52bf9b2a5c
Fix git tag of releases (#4862)
Previously, we tagged the commit that made the release instead of the
commit we are building the release from.
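
Conceptually (variable names are illustrative, not the actual script’s):

```
# Tag the commit the release is built from, not the trigger commit.
git tag "v$RELEASE_TAG" "$RELEASE_SHA"
git push origin "v$RELEASE_TAG"
```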

changelog_begin
changelog_end
2020-03-06 10:54:07 +01:00
Gary Verhaegen
950d8c3501
remove perf tests from CI (#4851)
CHANGELOG_BEGIN
CHANGELOG_END
2020-03-05 17:29:28 +01:00
Gary Verhaegen
3a7dca3286
remove exponential backoff for getting sha (#4748)
This has been running for a few days now and while I have seen a bunch
of these cases, I have not once received a message with a BACKOFF value
different from 512. This means that, likely due to some sort of internal
caching in Azure, retrying in this case is useless and just makes the
build failure take more time, i.e. more time before we can rerun.

Rerunning does usually solve it, though.

I have also noticed that we still get these notifications when the job
has been canceled, which usually means the user has force-pushed (in
which case it makes sense that the commit is no longer available). I'm
not sure we can detect this, but I take this opportunity to print the
JobStatus just in case.

CHANGELOG_BEGIN
CHANGELOG_END
2020-02-27 15:14:21 +01:00
Gary Verhaegen
86bce50b9a
fix passing is_release through (#4745)
Somehow, in the current setup, the publish steps do not get executed on
master. This is what Azure reports:

```
Evaluating: and(succeeded(), eq('$(is_release)', 'true'),
eq(variables['Build.SourceBranchName'], 'master'), eq('linux', 'linux'))
Expanded: and(True, eq('$(is_release)', 'true'),
eq(variables['Build.SourceBranchName'], 'master'), eq('linux', 'linux'))
Result: False
```

So it looks like, in the condition, `${{parameters.is_release}}`
evaluates to the literal string `$(is_release)`. If we look at the point
of invocation of the ~function~ template, we can see:

```
      - template: ci/build-unix.yml
        parameters:
          release_tag: $(release_tag)
          name: 'linux'
          is_release: $(is_release)
```

so it does not seem completely crazy. However, according to the
documentation, we should expect that to be replaced by the value of the
corresponding variable, as per:

```
    variables:
      release_sha: $[ dependencies.check_for_release.outputs['out.release_sha'] ]
      release_tag: $[ coalesce(dependencies.check_for_release.outputs['out.release_tag'], '0.0.0') ]
      trigger_sha: $[ dependencies.check_for_release.outputs['out.trigger_sha'] ]
      is_release: $[ dependencies.check_for_release.outputs['out.is_release'] ]
```

What's interesting here is that, within `build-unix.yml`, we are also
using `release_tag` in the exact same way:

```
  - bash: ./build.sh "_$(uname)"
    displayName: 'Build'
    env:
      DAML_SDK_RELEASE_VERSION: ${{parameters.release_tag}}
```

and this time output from the build seems to show the value being
correctly substituted:

```
damlc - Compiler and IDE backend for the Digital Asset Modelling
Language
SDK Version: 0.13.55-snapshot.20200226.3266.d58bb459

Usage: <interactive> COMMAND
  Invoke the DAML compiler. Use -h for help.
```

My current guess is that the (undocumented, as far as I can tell)
evaluation order is as follows:

1. In the template, syntactically replace all the parameters.
2. In the job definition, replace the call to the template with the code
of the template. So it is as if we had written the template directly in
the `azure-pipelines.yml` file, with `$(release_tag)` and
`$(is_release)`.
3. Run the build. When we reach the time to run this specific job,
we can evaluate the expressions for the variables and replace them in
the rest of the job.

So what is going wrong? I believe the issue is with the quotes,
preventing the substitution of `is_release`. They came directly from the
[documented
syntax](https://docs.microsoft.com/en-us/azure/devops/pipelines/process/conditions?view=azure-devops&tabs=yaml#use-a-template-parameter-as-part-of-a-condition),
but if the above evaluation order is correct, they should not be there.

There are actually two things going wrong here. The first one is that
the syntax `$()` is used to substitute a value in what Azure considers a
string. This is the case for `env` keys. However, the `condition` key
is not a string, it is an Azure "expression". Expressions have their own
evaluation rules and syntax, and in particular, `$()` is not a
substitution rule there, so when it sees `$()` in a string in an
expression (due to the quotes), it leaves it alone.

Removing the quotes does not directly help, though, as we then end up with

```
condition: eq($(is_release), 'true')
```

and `$()` is not valid syntax in an expression. The way to use variables
in an expression is `variables.name` (or `variables["name"]`, because
why have only one?).

So that means we have to pass variables to the template in different
ways depending on how they will be used. So much fun.

CHANGELOG_BEGIN
CHANGELOG_END
2020-02-27 14:33:20 +01:00
Gary Verhaegen
1c2b921c14
fix tarball generation (#4738)
The existing approach is a historical accident. The reason for the
additional tarball/install jobs was that, in my original attempt, the
build steps would still build the current commit, as opposed to the
target commit.

This is not such an issue on Linux, but setting up the build environment
on Windows and macOS _again_ for no good reason is a pure waste of time
(and effort in getting it right). Now that the build steps build the
target commit (with the env var set), we can go back to the way things
were previously: just take the build products directly from the build
step.

CHANGELOG_BEGIN
CHANGELOG_END
2020-02-27 10:48:38 +01:00
Moritz Kiefer
1bab818dea
Prefix release jobs with 'release_' (#4737)
changelog_begin
changelog_end
2020-02-26 20:46:45 +00:00