Commit Graph

211 Commits

Author SHA1 Message Date
Gary Verhaegen
9c8c1fa909
lightly safer docs cron: fail instead of error (#6288)
See @cocreature's comment on #6285.

CHANGELOG_BEGIN
CHANGELOG_END
2020-06-10 19:18:14 +02:00
Gary Verhaegen
485069f017
fix docs cron for release notes (#6285)
Thinking about the upcoming release, I realized our current docs cron
has somehow lost the step of taking the release notes from the
triggering commit, probably in all the back-and-forth about which
release notes version to use to overwrite all the other ones.

This restores that, and adapts the algorithm for the new, multi-line
LATEST file format.

This _should_ work for all the current history, including releases made
on `release/*` branches and the unifying commit that turned the LATEST
file multiline (it adds more than one line so won't be matched as a
trigger commit).

CHANGELOG_BEGIN
CHANGELOG_END
2020-06-10 14:43:23 +02:00
Gary Verhaegen
664df64e13
fix daily perf Slack notification (#6267)
This PR fixes the Slack notification on daily perf runs. It also updates
the perf sha.

CHANGELOG_BEGIN
CHANGELOG_END
2020-06-09 06:45:58 +00:00
Gary Verhaegen
445f6467d9
daily run: warn on master only (#6177)
Currently the message to Slack is always triggered by running the daily
checks. This means that it gets very noisy to:

1. Run the check on PRs affecting the check (like this one),
2. Rerun the check multiple times to ascertain that a given failure is
   flaky.

With this PR, the message to Slack is replaced with a simple `echo` when
these checks are not run from the `master` branch, so whoever (manually)
triggered them can still get feedback on the result, but other people
don't get spurious `@here` mentions.
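
The branching described above can be sketched as follows. This is a minimal Python illustration, not the actual CI script; the names `notify` and `post_to_slack` are hypothetical.

```python
from typing import Callable


def notify(branch: str, message: str, post_to_slack: Callable[[str], None]) -> str:
    """Report a daily-check result (hypothetical helper, not from the repo).

    Returns which channel was used so the behaviour is easy to check.
    """
    if branch == "master":
        # Only runs on master are allowed to ping the Slack channel.
        post_to_slack(message)
        return "slack"
    # Off master, the person who triggered the run still sees the result
    # in the job output, but nobody receives an @here mention.
    print(message)
    return "echo"
```

Passing the Slack call in as a parameter keeps the sketch runnable without any network access.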

CHANGELOG_BEGIN
CHANGELOG_END
2020-06-03 16:36:05 +02:00
Gary Verhaegen
90547e6ab4
build old docs with their release notes (#6128)
In light of #6127, I kept wondering why rebuilding 1.1.1 would fail. The
problem addressed by #6127 is that we tried to rebuild it, which we
shouldn't, but the reason I noticed it is because the build failed, and
there is no good reason for the 1.1.1 docs to not build anymore. Looking
at the logs confused me even more as it failed with (elided):

```
docs/source/support/new-assistant.rst:
WARNING: document isn't included in any toctree
```

and that change happened _after_ 1.1.1. So I went back to the code, and
discovered I somehow had gotten confused as I changed the approach
mid-way through editing the file. If we're overwriting the
`release-notes.html` file post-build, which we are now doing (and is the
reason for ignoring it when checking checksums), then we should not be
touching the `release-notes.rst` file pre-build.

CHANGELOG_BEGIN
CHANGELOG_END
2020-05-27 22:19:18 +00:00
Gary Verhaegen
e2d416e335
fix docs cron not ignoring release-notes (#6127)
The docs cron is supposed to ignore the release-notes.html page when
checking whether a docs folder is corrupted, because we manually
override it. However, that currently doesn't work, either because the
`sed` version we are using does not support changing the delimiters, or
because no version of `sed` does and I just imagined it.

CHANGELOG_BEGIN
CHANGELOG_END
2020-05-27 18:47:40 +02:00
Gary Verhaegen
ccb496ee0d
update perf test sha (#6125)
Changed by #6123, relevant part of the diff is:

```
           ledger.lookupGlobalContract(ParticipantView(committers.head),
effectiveAt, acoid) match {
-            case LookupOk(_, result) =>
+            case LookupOk(_, result, _) =>
               cachedContract = cachedContract + (step -> result)
```

which seems benign enough.

CHANGELOG_BEGIN
CHANGELOG_END
2020-05-27 15:10:11 +00:00
Gary Verhaegen
6e48abc793
update perf benchmark following #6080 (#6120)
This should be merged after #6080. This PR adds a patch (and
consequently updates the `ci/cron/perf/compare.sh` script) to apply the
same logical change as #6080 on top of the baseline commit, so our
performance comparison remains "apples to apples".

I am well aware that managing patches is not going to be a great way
forward. The rate of changes on the benchmark seems to be slow enough
that this is good enough for now, but should we change the benchmark
more often and/or want to add new benchmarks, a better approach would be
to handle the changes at the Scala level. That is:

- Create a "rest of the world" (world = Speedy, its compiler, and all of
  the associated types) interface that benchmarks would depend on,
  rather than depend directly on the rest of the codebase.
- Create two implementations of that interface, one that compiles
  against the current state of the world, and one that compiles against
  the baseline.
- Change the script to load the relevant implementation, and then run
  all the benchmarks as-is, with no patch necessary.

CHANGELOG_BEGIN
CHANGELOG_END
2020-05-27 13:34:08 +02:00
nickchapman-da
fb6cafa311
Bump the sha for CI perf (#6078)
changelog_begin
changelog_end
2020-05-22 16:24:18 +00:00
Moritz Kiefer
629ec732dd
Include puppeteer tests in compat tests (#6018)
* Include puppeteer tests in compat tests

This PR adds the puppeteer based tests to the compatibility
tests. This also means that they are now actually compatibility
tests. Before, we only tested the SDK side.

Apart from process management being a nightmare on Windows as usual,
there are two things that might stick out here:

1. I’ve replaced the `sh_binary` wrapper by a `cc_binary`. There is a
   lengthy comment explaining why. I think at the moment, we could
   actually get rid of the wrapper completely and add JAVA to path in
   the tests that need it but at least for now, I’d like to keep it
   until we are sure that we don’t need to add more to it (and then
   it’s also in the git history if we do need to resurrect it).
2. These tests are duplicated now similar to the `daml ledger *`
   tests. The reasoning here is different. They depend on the SDK
   tarball either way so performance wise there is no reason to keep
   them. However, we reference the other file in the docs which means
   we cannot change it freely. What we could do is to make this
   sufficiently flexible to handle both the `daml start` case and
   separate `daml sandbox`/`daml json-api` processes and then we can
   reference it in the docs. There is still added complexity for
   Windows but that’s necessary for users as well that want to run
   this on Windows so that seems unavoidable. (I should probably also
   remove my snarky comments 😇) I’d like to keep it duplicated
   for this PR and then we can clean it up afterwards.

changelog_begin
changelog_end

* Bump timeouts

changelog_begin
changelog_end
2020-05-22 14:02:59 +02:00
Gary Verhaegen
957a74c325
fix trailing newline in docs cron (#6053)
CI currently errors with:

```
Subprocess:
git checkout efe6545c2c
 -- docs/source/support/release-notes.rst
failed with exit code 127; output:
---

---
err:
---
Previous HEAD position was 2af134c... WIP: Draft version constraint
generation (#5472)
HEAD is now at efe6545... 1.2.0-snapshot.20200520.4224.0.2af134ca
(#6040)
/bin/sh: 2: --: not found

---

```

because the line

```
latest_release_notes_sha <- shell "git log -n1 --format=%H HEAD -- LATEST"
```

will assign a string that ends in a newline, and then when we try to
construct the shell command:

```
(shell_ $ "git checkout " <> latest_sha <> " -- docs/source/support/release-notes.rst")
```

we actually get two lines for Bash to execute.
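
The failure mode can be reproduced outside the docs cron. The following is a Python sketch of the same string handling, not the Haskell code of the cron itself; `checkout_command` is a hypothetical helper, and the sha is the abbreviated one from the log above.

```python
def checkout_command(sha_output: str, path: str) -> str:
    """Build a one-line `git checkout` command from raw command output."""
    sha = sha_output.strip()  # drop the trailing newline that git prints
    return f"git checkout {sha} -- {path}"


# Raw output of a command such as `git log -n1 --format=%H` ends in "\n".
raw = "efe6545c2c\n"
path = "docs/source/support/release-notes.rst"

# Naive concatenation yields two lines, so the shell runs `--` as a command.
broken = "git checkout " + raw + " -- " + path

# Stripping first keeps everything on one line.
fixed = checkout_command(raw, path)
```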

CHANGELOG_BEGIN
CHANGELOG_END
2020-05-20 18:26:27 +02:00
Gary Verhaegen
94122ec561
fix docs cron (#6049)
Current version yields:

```
Subprocess:
git log -n1 --format=%H master -- LATEST
failed with exit code 128; output:
---
---
err:
---
fatal: bad revision 'master'
---
```

so apparently we can't trust a CI run on master to have a master branch
defined. `HEAD` should work, though.

CHANGELOG_BEGIN
CHANGELOG_END
2020-05-20 16:37:47 +02:00
Gary Verhaegen
fb6dc904a4
trigger all releases from master (#6016)
The 1.1.0 release went wrong and we had to trash it and release 1.1.1
instead. This is an attempt at identifying and correcting the root
cause behind that incident.

To understand the situation, we need to know how releases worked before
1.0. We had a one-line file called `LATEST` that specifies the git SHA and
version tag for the latest release. A change to that file triggered a
release with the specified release tag, built from the source tree of
the specified commit. The `LATEST` file looked something like:

```
f050da78c9 1.0.0-snapshot.20200411.3905.0.f050da78
```

To mark a release as stable, we would change it to look like this:

```
f050da78c9 1.0.0
```

i.e. simply drop the `-snapshot...` suffix. Even though the commit (and
thus the entire source tree we build from) is the same, we would need to
rebuild almost all of our release artifacts, as they embed the version
tag in various places and ways. That worked well as long as we could
assume we were doing trunk-based development, i.e. all releases would
always come from the same (`master`) branch.

When we released 1.0, and started work on 1.1, we had a few bug reports
for 1.0 that we decided should be resolved in a point release. We
decided that the best way to handle that would be to have a branch
starting on the release commit for 1.0, and then backport patches from
`master` to that branch. We adapted our build process to also watch the
`release/1.0.x` branch and, in particular, trigger a new release build if
the `LATEST` file in that branch changed. That worked well.

The plan going forward was to keep doing regular snapshot releases from
the `master` branch, and create support, point releases ("patch" releases
in semver) from dedicated branches.

On April 30, we made a snapshot release as an RC for 1.1.0, by changing
the `LATEST` file in the `master` branch. That release was built on commit
681c862d. On May 6, we decided to take a new snapshot as the RC for
1.1.0; we changed `LATEST` in `master` to designate 7e448d81 as the new
latest release.

On May 11, we noticed an issue that broke our builds. Without going into
details, an external artifact we depend on had changed in incompatible
ways. After fixing that on `master`, we reasoned that this would also
break the build of the final 1.1.0 release if we just tried to build
7e448d81 again. But as the target release date was May 13, we did not
want to take a new snapshot after that fix, as that would have included
one more week of work in the release, and given us no time to test it.

So we did what we did for the 1.0 branch, as it had worked well: we
created a branch that forked from `master` at commit 7e448d81 and called
it `release/1.1.x`, then cherry-picked the one fix to our build process to
work around the broken download. When the time came to make the final
1.1.0 build on May 13, we naturally picked the `LATEST` file from the
`release/1.1.x` branch and dropped the `-snapshot...` suffix. Importantly,
we did not need to update the target commit to include the "broken
download" fix as, in the meantime, the internet had fixed itself, and we
thus reasoned we should go for the exact code of the RC rather than
include an unnecessary, albeit seemingly harmless, change.

Everything went well with the release process. Tests went well too. Then
we got a report that an application that worked against the latest RC
broke with the final 1.1.0. The issue was that we had built the wrong
commit: by branching off at the point of the _target_ commit for the
latest snapshot, we did not have the change to the `LATEST` file that
designated that commit as the target. So the `LATEST` file in
`release/1.1.x` was still pointing to 681c862d.

I believe the root cause for this issue is the fact that we have
scattered our release process over multiple branches, meaning there is
no linear history of what was released and we are relying on people
being able to mentally manage multiple timelines. Therefore, I propose
to fix our release process so this should not happen again by
linearizing the release process, i.e. getting back to a situation where
all releases are made from a single branch, `master`.

Because we do want to be able to release _for_ multiple release branches
(to provide backports and bugfixes), we still need some way to
accommodate that. Having a single `LATEST` file in the same format as
before would not really work well: keeping track of interleaved release
streams on a single file would not really be easier than keeping track
of multiple branches.

My proposed solution is to instead have a multiline LATEST file, so that
all the release branch "tips" can be observed at the same time, and, as
long as we take care to only advance one release branch at a time, we
can easily keep track of each of them. This is what this PR does.
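
To make the format concrete, here is a small Python sketch of reading such a file. It assumes each line keeps the `<sha> <version>` shape of the one-line examples above; `Release` and `parse_latest` are hypothetical names, and the sample content simply stacks the two example lines quoted earlier for illustration.

```python
from typing import List, NamedTuple


class Release(NamedTuple):
    sha: str
    version: str

    @property
    def is_snapshot(self) -> bool:
        # Stable versions are the snapshot string with the suffix dropped.
        return "-snapshot" in self.version


def parse_latest(text: str) -> List[Release]:
    """Parse a multi-line LATEST file, one `<sha> <version>` pair per line."""
    releases = []
    for line in text.splitlines():
        if not line.strip():
            continue  # tolerate blank lines
        sha, version = line.split()
        releases.append(Release(sha, version))
    return releases


# Illustrative content only; not the real file.
example = (
    "f050da78c9 1.0.0-snapshot.20200411.3905.0.f050da78\n"
    "f050da78c9 1.0.0\n"
)
releases = parse_latest(example)
```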

This required a few changes to our release process. Most notably:

- Obviously, as this is the main point of this PR, the build process has
  once again been restricted to only trigger new releases from the
  `master` branch.
- As our CI machinery cannot easily be made to produce multiple releases
  from a single build, the `check_for_release` step will only recognize
  a commit as a release trigger if it changes a single line in the
  `LATEST` file. This restriction comes in addition to the existing one
  that a release commit is only allowed to change either just the
  `LATEST` file or both the `LATEST` and
  `docs/source/support/release-notes.rst` files.
- The docs publication process has been changed to update _all_
  published versions to display the _latest_ release notes page. This
  means that the release notes page will always show you all published
  versions, regardless of which version of the documentation you're
  looking at. This also means that interleaving release notes correctly on
  that page is a manual exercise.
- As per the intention of the new process, the `LATEST` file has been
  updated to contain all existing post-1.0 stable releases. It should
  also include all existing snapshot releases should we have more than one
  at a time (say, should we discover an issue with 1.1.1 that required us
  to work on a 1.1.2).
- The `release.sh` script has been dramatically simplified as I felt it
  was trying to do too much and porting its existing functionality to a
  multi-line `LATEST` file would be too hard.
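
The two restrictions on release-trigger commits can be sketched as a predicate. This is a Python illustration, not the actual `check_for_release` step. It assumes that "changes a single line" means exactly one added line in the diff of `LATEST`, optionally replacing one removed line; the function and parameter names are hypothetical.

```python
from typing import Set

RELEASE_NOTES = "docs/source/support/release-notes.rst"


def is_release_trigger(changed_files: Set[str], latest_added: int, latest_removed: int) -> bool:
    """Decide whether a commit on master should trigger a release (sketch)."""
    # Existing rule: only LATEST, or LATEST plus the release notes, may change.
    only_release_files = changed_files in ({"LATEST"}, {"LATEST", RELEASE_NOTES})
    # New rule (assumed reading): one line of LATEST is added or replaced.
    single_line = latest_added == 1 and latest_removed <= 1
    return only_release_files and single_line
```

Under this reading, a commit that rewrites several lines of `LATEST` at once is never treated as a trigger.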

CHANGELOG_BEGIN
CHANGELOG_END
2020-05-19 19:18:10 +02:00
Remy
b1c09ce1a0
Update perf tests SHA (#5971)
* update Perfs test sha

* changelog

CHANGELOG_BEGIN
CHANGELOG_END
2020-05-13 19:08:53 +00:00
Moritz Kiefer
4916a28682
Include create-daml-app tests in compatibility tests (#5945)
This is the first part of #5700

It adds tests that build create-daml-app using `daml build` and then
run the codegen and build the UI. Contrary to our main tests these
also run on Windows. This is actually reasonably simple by first
building the typescript libraries on Linux and then downloading them
on Windows.

There are two parts that are still missing from the tests in the main
workspace:

1. Building the extra feature. This should be fairly easy to add.
2. Running the puppeteer tests. At least MacOS and Linux should be
   reasonably easy. I don’t know what horrors Windows will throw at
   us. This step is what actually makes this a compatibility
   test. Currently it doesn’t actually launch Sandbox and the JSON API.

Since this PR is already pretty large, I’d like to tackle those things
separately.

changelog_begin
changelog_end
2020-05-13 10:39:51 +02:00
Gary Verhaegen
3899a59a11
switch back to hosted macOS nodes (#5935)
CHANGELOG_BEGIN
CHANGELOG_END
2020-05-11 22:59:33 +02:00
Gary Verhaegen
9b476416b8
switch back to Azure-provided macos nodes (#5920)
This is temporary. It looks like the macOS nodes are dead; @nycnewman is
looking into it, but in case he doesn't fix it in time, at least we
have a backup plan so we're not completely blocked on Monday.

CHANGELOG_BEGIN
CHANGELOG_END
2020-05-11 09:28:40 +02:00
Gary Verhaegen
4a6ab84b69
add default machine capability (#5912)
We semi-regularly need to do work that has the potential to disrupt a
machine's local cache, rendering it broken for other streams of work.
This can include upgrading nix, upgrading Bazel, debugging caching
issues, or anything related to Windows.

Right now we do not have any good solution for these situations. We can
either not do those streams of work, or we can proceed with them and
just accept that all other builds may get affected depending on which
machine they get assigned to. Debugging broken nodes is particularly
tricky as we do not have any way to force a build to run on a given
node.

This PR aims at providing a better alternative by (ab)using an Azure
Pipelines feature called
[capabilities](https://docs.microsoft.com/en-us/azure/devops/pipelines/agents/agents?view=azure-devops&tabs=browser#capabilities).
The idea behind capabilities is that you assign a set of tags to a
machine, and then a job can express its
[demands](https://docs.microsoft.com/en-us/azure/devops/pipelines/process/demands?view=azure-devops&tabs=yaml),
i.e. specify a set of tags machines need to have in order to run it.

Support for this is fairly badly documented. We can gather from the
documentation that a job can specify two things about a capability
(through its `demands`): that a given tag exists, and that a given tag
has an exact specified value. In particular, a job cannot specify that a
capability should _not_ be present, meaning we cannot rely on, say,
adding a "broken" tag to broken machines.

Documentation on how to set capabilities for an agent is basically
nonexistent, but [looking at the
code](https://github.com/microsoft/azure-pipelines-agent/blob/master/src/Microsoft.VisualStudio.Services.Agent/Capabilities/UserCapabilitiesProvider.cs)
indicates that they can be set by using a simple `key=value`-formatted
text file, provided we can find the right place to put this file.

This PR adds this file to our Linux, macOS and Windows node init scripts
to define an `assignment` capability and adds a demand for a `default`
value on each job. From then on, when we hit a case where we want a PR
to run on a specific node, and to prevent other PRs from running on that
node, we can manually override the capability from the Azure UI and
update the demand in the relevant YAML file in the PR.
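
The matching semantics described above (a demand can require that a tag exists, or that it has an exact value, but never that it is absent) can be modelled in a few lines. This is a Python sketch of the idea, not Azure's implementation; `satisfies` and the `debug-pr` value are hypothetical.

```python
from typing import Dict, Optional


def satisfies(capabilities: Dict[str, str], demands: Dict[str, Optional[str]]) -> bool:
    """Check an agent's capabilities against a job's demands (sketch).

    `demands` maps a capability name to the required value, or to None
    when the capability only needs to exist.
    """
    for name, wanted in demands.items():
        if name not in capabilities:
            return False
        if wanted is not None and capabilities[name] != wanted:
            return False
    return True


# Every job demands assignment=default, so overriding the capability on
# one node removes that node from the pool for all ordinary builds.
normal_job = {"assignment": "default"}
healthy_node = {"assignment": "default"}
reserved_node = {"assignment": "debug-pr"}  # hypothetical override value
```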

CHANGELOG_BEGIN
CHANGELOG_END
2020-05-09 18:21:42 +02:00
Gary Verhaegen
11a2fc3c2d
more flexible perf test check (#5891)
This PR separates the "last known valid perf test" commit from the
"baseline speedy implementation" commit. For the perf measurement to
remain meaningful, the changes between those two commits must be
benign, say minor API adjustments.

This also adds a check on merging to master that tells Slack if the perf
test has changed and the `test_sha` file needs updating. The Slack
message is conditional on the current commit to avoid excessive noise.

CHANGELOG_BEGIN
CHANGELOG_END
2020-05-07 13:53:22 +02:00
Martin Huschenbett
6642b1fc8b
Report speedup in daily perf report cron job (#5885)
Also track against both targets, 5x and 10x.

CHANGELOG_BEGIN
CHANGELOG_END
2020-05-07 10:52:33 +02:00
Gary Verhaegen
204c8b0657
add daily perf report (#5843)
This PR adds a simple daily job that runs the performance test on a
chosen "baseline" commit and then runs the same benchmark on latest
master. This should allow us to track overall performance improvements.

CHANGELOG_BEGIN
CHANGELOG_END
2020-05-06 13:50:35 +02:00
Moritz Kiefer
b291e96ce1
Publish execution logs from Windows compatibility jobs (#5834)
Hopefully, this helps diagnose the Windows CI failures.

changelog_begin
changelog_end
2020-05-05 12:23:11 +02:00
Gary Verhaegen
ed13afc56a
fix overeager docs cron (#5797)
Currently the docs cron _always_ decides it has something new to
publish. This PR fixes that.

CHANGELOG_BEGIN
CHANGELOG_END
2020-04-30 15:31:38 +02:00
Moritz Kiefer
49e19ebed1
Make compat tests work on windows (#5732)
* Make compat tests work on windows

This required some changes to the daml_sdk rule since the read-only
installation by the assistant breaks Bazel completely. We could only
apply those changes on Windows but I think I prefer the consistency
across platforms here over trying to stay close to how the SDK is
installed on user machines given that the SDK installation is not
something we’ve had issues with.

I’ve excluded the postgresql tests for now. I don’t expect them to be
particularly hard to fix but I’ve already spent almost 2 days on this
and having some tests run on Windows seems like a clear improvement
over running no tests on Windows :)

changelog_begin
changelog_end

* Remove todo

changelog_begin
changelog_end
2020-04-28 16:06:36 +02:00
Gary Verhaegen
cfae3df7fa
report compat status every day (#5744)
I believe the compatibility check is important enough, and should fail
rarely enough, that it is worth reporting even on success. This will
mean (once we support Windows) 3 messages a day, sent while presumably
nobody is working, so the disruption should be minimal.

The issue with reporting only on failures is that, if we don't
proactively check (which we do for the state of master for different
reasons, but would likely not keep doing for a job that doesn't block
PRs), we may get into a state where it is so broken that it doesn't even
report.

CHANGELOG_BEGIN
CHANGELOG_END
2020-04-28 12:30:45 +02:00
Moritz Kiefer
d1db5c1c96
Set buffering of the docs cron job to LineBuffering (#5735)
I am hoping this will make the Azure output a bit more chatty. At the
moment, you don’t get any output until the job has finished which is a
bit annoying (although not really an issue).

changelog_begin
changelog_end
2020-04-27 15:29:08 +02:00
Gary Verhaegen
7ceda5678a
run compatibility tests on macos (#5723)
This PR extends the existing Linux compatibility tests to run on macOS
too. Fixes #5692.

CHANGELOG_BEGIN
CHANGELOG_END

Co-authored-by: Moritz Kiefer <moritz.kiefer@purelyfunctional.org>
2020-04-27 14:55:16 +02:00
Moritz Kiefer
0d1f21e4a2
Extend compatibility tests to test against HEAD (#5714)
fixes #5691

changelog_begin
changelog_end
2020-04-24 14:43:35 +02:00
Moritz Kiefer
f61aadc422
Add dade-assist step to compat cron job (#5712)
changelog_begin
changelog_end
2020-04-24 11:25:14 +02:00
Gary Verhaegen
ec1a9326ea
fix daily-compat.yml (#5686)
For some reason the Azure Pipelines folks thought it was a good idea to
search for templates starting with the current YAML file's path. Other
steps (e.g. the bash script here) start from the root of the repo,
though.

CHANGELOG_BEGIN
CHANGELOG_END
2020-04-23 13:38:57 +02:00
Moritz Kiefer
7d36402412
Initial boilerplate for cross-version compatibility testing (#5665)
This is a first step towards testing cross-version
compatibility. It doesn’t actually do much yet but hopefully it should
be easier to parallelize once we have the initial boilerplate in place
so ideally I’d like to address most missing things and issues in
separate PRs.

changelog_begin
changelog_end
2020-04-23 12:58:11 +02:00
Gary Verhaegen
644d4c7512
add empty daily CI run (#5675)
This is meant to be filled once we have a better idea of what exactly we
want to test. See #5665 for current thinking about it.

CHANGELOG_BEGIN
CHANGELOG_END
2020-04-23 10:36:20 +02:00
Moritz Kiefer
2ea4fbd850
Fix docs cron (#5612)
--branches='*' seems to only include local branches not branches in
the remote. It looks like --branches='*' --remotes='*' would work but
--all seems simpler. I struggled to find any docs on this but this
matched what I got when testing locally.

changelog_begin
changelog_end
2020-04-17 20:59:49 +02:00
Gary Verhaegen
a1fab2d9af
enable patch releases (#5584)
This commit aims at enabling future patch releases; it is the
master-branch equivalent of #5569 (applied to the 1.0 release branch).

The only change between the two changelogs should be that this one also
changes the docs cron so it can find the trigger commits for patch
releases.

CHANGELOG_BEGIN
CHANGELOG_END
2020-04-16 17:50:55 +02:00
Gary Verhaegen
1872c668a5
replace DAML Authors with DA in copyright headers (#5228)
Change requested by Manoj.

CHANGELOG_BEGIN
CHANGELOG_END
2020-03-27 01:26:10 +01:00
Gary Verhaegen
fd185ed22e
publish prerelease documentation (#4976)
This PR changes the documentation release process to publish the
documentation for releases tagged "prerelease" on GitHub, while
discarding them when deciding on the latest version (the one that shows
on `/` on the docs site) and omitting them from the `versions.json` file
(meaning they do not appear on the dropdown).
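
As an illustration of the selection rules, here is a Python sketch. It is not the docs cron code; `plan_docs` is a hypothetical name, the version numbers in the usage are made up, and the real shape of `versions.json` is not shown here, so the dropdown is modelled as a plain list.

```python
from typing import Dict, List, Tuple


def plan_docs(releases: List[Tuple[str, bool]]) -> Dict[str, object]:
    """Given (version, is_prerelease) pairs, decide what to do with each."""
    # Docs are built and published for every release, prereleases included.
    publish = [version for version, _ in releases]
    # Prereleases are ignored when picking the latest version and are
    # left out of the dropdown.
    stable = [version for version, prerelease in releases if not prerelease]
    latest = max(
        stable,
        key=lambda v: tuple(int(part) for part in v.split(".")),
        default=None,
    )
    return {"publish": publish, "latest": latest, "dropdown": stable}


# Hypothetical release list, for illustration only.
plan = plan_docs([("0.13.54", False), ("0.13.55", False), ("0.13.56-alpha.1", True)])
```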

This PR also makes a bit of cleanup/bug fixing:
- The change in `nix` toolset name (#4724) needs to be protected by a
  version check, as we checkout older versions of the repo during docs
  build.
- The data types BlogSubmit and BlogId seem to have survived the "dead
  code detection" in #4956.
- The documentation build step had not been updated to pass down the
  correct version string (#4513).

CHANGELOG_BEGIN
CHANGELOG_END
2020-03-12 18:54:47 +00:00
Gary Verhaegen
872a5fc0df
do not send release notes to hubspot (#4956)
@bame-da wants a more manual process where he can control exactly when
release notes are posted, possibly in advance.

CHANGELOG_BEGIN
CHANGELOG_END
2020-03-11 21:41:46 +01:00
Gary Verhaegen
80652bd51f
report-std-change: handle GitHub errors (#4814)
There have been a few GitHub glitches last week that resulted in a few
commits on master not being associated with a PR (though they really
were created from merging a PR, and the correct PR number is in their
title).

This makes the report script crash on not finding the PR, so this PR
fixes that. And a comment.

CHANGELOG_BEGIN
CHANGELOG_END
2020-03-05 10:06:58 +01:00
Gary Verhaegen
c2240df083
change standard change report time (#4804)
In the current setup, the report generation starts at 4. However, the
daily killing of all machines also starts at 4, giving the report little
chance of ever finishing.

CHANGELOG_BEGIN
CHANGELOG_END
2020-03-04 18:22:32 +01:00
Moritz Kiefer
8c14d16718
Disable pdf docs builds on macos (#4724)
This disables the PDF docs builds on MacOS on CI (they are still built
locally by default) and removes them from the Nix closure by
introducing a separate ci-cached attribute that filters out texlive.

Since we built `nix-build nix -A tools -A cached` on CI, I’ve also
removed all the Tex stuff from tools which only means that it ends up
in PATH which nobody seems to care about.

changelog_begin
changelog_end
2020-02-26 14:52:08 +00:00
Gary Verhaegen
5a117dc358
introduce new release process (#4513)
Context
=======

After multiple discussions about our current release schedule and
process, we've come to the conclusion that we need to be able to make a
distinction between technical snapshots and marketing releases. In other
words, we need to be able to create a bundle for early adopters to test
without making it an officially-supported version, and without
necessarily implying everyone should go through the trouble of
upgrading. The underlying goal is to have less frequent but more stable
"official" releases.

This PR is a proposal for a new release process designed under the
following constraints:

- Reuse as much as possible of the existing infrastructure, to minimize
  effort but also chances of disruptions.
- Have the ability to create "snapshot"/"nightly"/... releases that are
  not meant for general public consumption, but can still be used by savvy
  users without jumping through too many extra hoops (ideally just
  swapping in a slightly-weirder version string).
- Have the ability to promote an existing snapshot release to "official"
  release status, with as few changes as possible in-between, so we can be
  confident that the official release is what we tested as a prerelease.
- Have as much of the release pipeline shared between the two types of
  releases, to avoid discovering non-transient problems while trying to
  promote a snapshot to an official release.
- Triggering a release should still be done through a PR, so we can
  keep the same approval process for SOC2 auditability.

The gist of this proposal is to replace the current `VERSION` file with
a `LATEST` file, which would have the following format:

```
ef5d32b7438e481de0235c5538aedab419682388 0.13.53-alpha.20200214.3025.ef5d32b7
```

This file would be maintained with a script to reduce manual labor in
producing the version string. Other than that, the process will be
largely the same, with releases triggered by changes to this `LATEST`
and the release notes files.

Version numbers
===============

Because one of the goals is to reduce the velocity of our published
version numbers, we need a different version scheme for our snapshot
releases. Fortunately, most version schemes have some support for that;
unfortunately, the SDK sits at the intersection of three different
version schemes that have made incompatible choices. Without going into
too much detail:

- Semantic versioning (which we chose as the version format for the SDK
  version number) allows for "prerelease" version numbers as well as
  "metadata"; an example of a complete version string would be
  `1.2.3-nightly.201+server12.43`. The "main" part of the version string
  always has to have 3 numbers separated by dots; the "prerelease"
  (after the `-` but before the `+`) and the "metadata" (after the `+`)
  parts are optional and, if present, must consist of one or more segments
  separated by dots, where a segment can be either a number or an
  alphanumeric string. In terms of ordering, metadata is irrelevant and
  any version with a prerelease string is before the corresponding "main"
  version string alone. Amongst prereleases, segments are compared in
  order with purely numeric ones compared as numbers and mixed ones
  compared lexicographically. So 1.2.3 is more recent than 1.2.3-1,
  which is itself less recent than 1.2.3-2.
- Maven version strings are any number of segments separated by a `.`, a
  `-`, or a transition between a number and a letter. Version strings
  are compared element-wise, with numeric segments being compared as
  numbers. Alphabetic segments are treated specially if they happen to be
  one of a handful of magic words (such as "alpha", "beta" or "snapshot"
  for example) which count as "qualifiers"; a version string with a
  qualifier is "before" its prefix (`1.2.3-alpha.3`, which is the same as
  `1.2.3-alpha3` or `1.2.3-alpha-3`, is before `1.2.3`), and there is a
  special ordering amongst qualifiers. Other alphabetic segments are
  compared alphabetically and count as being "after" their prefix
  (`1.2.3-really-final-this-time` counts as being released after `1.2.3`).
- GHC package numbers are comprised of any number of numeric segments
  separated by `.`, plus an optional (though deprecated) alphanumeric
  "version tag" separated by a `-`. I could not find any official
  documentation on ordering for the version tag; numeric segments are
  compared as numbers.
- npm uses semantic versioning so that is covered already.
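
The semantic versioning ordering summarized above can be expressed as a sort key. This is a minimal Python sketch for illustration, not code from this PR; `semver_key` is a hypothetical name, and the sketch does no validation of its input.

```python
def semver_key(version: str):
    """Return a tuple that sorts version strings by semver precedence."""
    version = version.split("+", 1)[0]  # metadata is irrelevant for ordering
    main, _, prerelease = version.partition("-")
    main_key = tuple(int(part) for part in main.split("."))
    if not prerelease:
        # A version without a prerelease part sorts after all of its prereleases.
        return (main_key, 1, ())
    segments = []
    for segment in prerelease.split("."):
        if segment.isdigit():
            segments.append((0, int(segment), ""))  # numeric: compared as numbers
        else:
            segments.append((1, 0, segment))  # mixed: compared lexicographically
    return (main_key, 0, tuple(segments))
```

For example, `sorted(["1.2.3", "1.2.3-2", "1.2.3-1"], key=semver_key)` puts `1.2.3-1` first and `1.2.3` last.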

After much more investigation than I'd care to admit, I have come up
with the following compromise as the least-bad solution. First,
obviously, the version string for stable/marketing versions is going to
be "standard" semver, i.e. major.minor.patch, all numbers, which works,
and sorts as expected, for all three schemes. For snapshot releases, we
shall use the following (semver) format:

```
0.13.53-alpha.20200214.3025.ef5d32b7
```

where the components are, respectively:

- `0.13.53`: the expected version string of the next "stable" release.
- `alpha`: a marker that hopefully scares people enough.
- `20200214`: the date of the release commit, which _MUST_ be on
  master.
- `3025`: the number of commits in master up to the release commit
  (included). Because we have a linear, append-only master branch, this
  uniquely identifies the commit.
- `ef5d32b7`: the first 8 characters of the release commit sha. This is
  not strictly speaking necessary, but makes it a lot more convenient to
  identify the commit.
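
As an illustration of the format above, here is a minimal Python sketch that splits a snapshot version string into its components. The `SnapshotVersion` type and `parse_snapshot` function are hypothetical names used only for this example; they are not part of the release tooling.

```python
import re
from typing import NamedTuple


class SnapshotVersion(NamedTuple):
    """Components of a snapshot version string (hypothetical helper type)."""
    target: str        # expected version of the next stable release
    qualifier: str     # e.g. "alpha"
    date: str          # date of the release commit, YYYYMMDD
    commit_count: int  # number of commits on master up to the release commit
    short_sha: str     # first 8 characters of the release commit sha


_SNAPSHOT_RE = re.compile(
    r"^(\d+\.\d+\.\d+)-([a-z]+)\.(\d{8})\.(\d+)\.([0-9a-f]{8})$"
)


def parse_snapshot(version: str) -> SnapshotVersion:
    """Parse a string shaped like 0.13.53-alpha.20200214.3025.ef5d32b7."""
    match = _SNAPSHOT_RE.match(version)
    if match is None:
        raise ValueError(f"not a snapshot version: {version!r}")
    target, qualifier, date, count, sha = match.groups()
    return SnapshotVersion(target, qualifier, date, int(count), sha)
```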

The main downsides of this format are:

1. It is not a valid format for GHC packages. We do not publish GHC
  packages from the SDK (so far we have instead opted to release our
  Haskell code as separate packages entirely), so this should not be an
  issue. However, our SDK version currently leaks to `ghc-pkg` as the
  version string for the stdlib (and prim) packages. This PR addresses
  that by tweaking the compiler to remove the offending bits, so `ghc-pkg`
  would see the above version number as `0.13.53.20200214.3025`, which
  should be enough to uniquely identify it. Note that, as far as I could
  find out, this number would never be exposed to users.
2. It is rather long, which I think is good from a human perspective as
  it makes it more scary. However, I have been told that this may be
  long enough to cause issues on Windows by pushing us past the max path
  size limitation of that "OS". I suggest we try it and see what
  happens.
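
To make the `ghc-pkg` transformation from point 1 concrete, here is a rough Python sketch of the mapping. The actual change lives in the compiler; the function name `ghc_pkg_version` is hypothetical, and the sketch assumes the prerelease part always has the four components described above.

```python
def ghc_pkg_version(sdk_version: str) -> str:
    """Map an SDK version to a purely numeric, dot-separated version.

    Assumption: snapshot versions look like
    <major.minor.patch>-<qualifier>.<date>.<commit count>.<short sha>.
    """
    base, _, prerelease = sdk_version.partition("-")
    if not prerelease:
        # Stable versions are already numeric and dot-separated.
        return base
    parts = prerelease.split(".")
    if len(parts) != 4:
        raise ValueError(f"unexpected snapshot format: {sdk_version!r}")
    _qualifier, date, commit_count, _short_sha = parts
    # Drop the qualifier and the sha, keep the numeric date and commit count.
    return f"{base}.{date}.{commit_count}"
```

The sha is dropped by position rather than by checking whether a segment is numeric, because a short sha can happen to consist of digits only.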

The upsides are:

- It clearly indicates it is an unstable release (`alpha`).
- It clearly indicates how old it is, by including the date.
- To humans, it is immediately obvious which version is "later" even if
  they have the same date, allowing us to release same-day patches if
  needed. (Note: that is, commits that were made on the same day; the
  release date itself is irrelevant here.)
- It contains the git sha so the commit built for that release is
  immediately obvious.
- It sorts correctly under all schemes (modulo the modification for
  GHC).

Alternatives I considered:

- Pander to GHC: `0.13.53-alpha-20200214-3025-ef5d32b7`. This format would
  be accepted by all schemes, but will not sort as expected under semantic
  versioning (though Maven will be fine). I have no idea how it will sort
  under GHC.
- Not having any non-numeric component, e.g. `0.13.53.20200214.3025`.
  This is not valid semantic versioning and is therefore rejected by
  npm.
- Not having detailed info: just go with `0.13.53-snapshot`. This is
  what is generally done in the Java world, but we then lose track of what
  version is actually in use and I'm concerned about bug reports. This
  would also not let us publish to the main Maven repo (at least not more
  than once), as artifacts there are supposed to be immutable.
- Not having a qualifier: `0.13.53-3025` would be acceptable to all three
  version formats. However, it would not clearly indicate to humans that
  it is not meant as a stable version, and would sort differently under
  semantic versioning (which counts it as a prerelease, i.e. before
  `0.13.53`) than under Maven (which counts it as a patch, so after
  `0.13.53`).
- Just counting releases: `0.13.53-alpha.1`, where we just count the
  number of prereleases in-between `0.13.52` and the next. This is
  currently the fallback plan if Windows path length causes issues. It
  would be less convenient to map releases to commits, but it could still
  be done via querying the history of the `LATEST` file.
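
The semver side of these comparisons can be checked with a small Python sketch of the semantic versioning precedence rules. This is a simplified illustration, not the comparison code used by npm: `semver_key` is a hypothetical helper, build metadata (`+...`) is not handled, and the version strings in the usage lines other than the ones quoted above are made up for the example.

```python
from typing import Tuple


def semver_key(version: str) -> Tuple:
    """Return a sort key following semver precedence (no build metadata)."""
    core, _, prerelease = version.partition("-")
    numbers = tuple(int(part) for part in core.split("."))
    if not prerelease:
        # A version without prerelease sorts after any prerelease of the
        # same major.minor.patch.
        return (numbers, 1, ())
    identifiers = tuple(
        # Numeric identifiers compare as numbers and sort before
        # alphanumeric ones, which compare as ASCII strings.
        (0, int(ident), "") if ident.isdigit() else (1, 0, ident)
        for ident in prerelease.split(".")
    )
    return (numbers, 0, identifiers)


# The proposed format orders snapshots by date, then by commit count.
assert semver_key("0.13.53-alpha.20200214.3025.ef5d32b7") < semver_key(
    "0.13.53-alpha.20200214.3026.abcdef12"
)
# Without a qualifier, semver still treats the version as a prerelease.
assert semver_key("0.13.53-3025") < semver_key("0.13.53")
# With dashes, the whole suffix is one identifier compared as a string,
# so commit count 999 wrongly sorts after 3025.
assert semver_key("0.13.53-alpha-20200214-999-aaaaaaaa") > semver_key(
    "0.13.53-alpha-20200214-3025-bbbbbbbb"
)
```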

Release notes
=============

> Note: We have decided not to have release notes for snapshot releases.

Release notes are a bit tricky. Because we want the ability to make
snapshot releases, then later on promote them to stable releases, it
follows that we want to build commits from the past. However, if we
decide post-hoc that a commit is actually a good candidate for a
release, there is no way that commit can have the appropriate release
notes: it cannot know what version number it's getting, and, moreover,
we now track changes in commit messages. And I do not think anyone wants
to go back to the release notes file being a merge bottleneck.

But release notes need to be published to the releases blog upon
releasing a stable version, and the docs website needs to be updated and
include them.

The only sensible solution here is to pick up the release notes as of
the commit that triggers the release. As the docs cron runs
asynchronously, this means walking down the git history to find the
relevant commit.

> Note: We could probably do away with the asynchronicity at this point.
> It was originally included to cover for the possibility of a release
> failing. If we are releasing commits from the past after they have been
> tested, this should not be an issue anymore. If the docs generation were
> part of the synchronous release step, it would have direct access to the
> correct release notes without having to walk down the git history.
>
> However, I think it is more prudent to keep this change as a future step,
> after we're confident the new release scheme does indeed produce much more
> reliable "stable" releases.

New release process
===================

Just like releases are currently controlled mostly by detecting
changes to the `VERSION` file, the new process will be controlled by
detecting changes to the `LATEST` file. The format of that file will
include both the version string and the corresponding SHA.
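
As a rough sketch of how such a file could be read, the following Python snippet assumes each non-empty line holds a commit SHA followed by a version string, separated by whitespace. That layout is an assumption made for this example; the text above only states that both pieces of information are present. The names `Release` and `parse_latest` are hypothetical.

```python
from typing import List, NamedTuple


class Release(NamedTuple):
    sha: str
    version: str


def parse_latest(content: str) -> List[Release]:
    """Parse LATEST-style content, assuming '<sha> <version>' per line."""
    releases = []
    for line in content.splitlines():
        if not line.strip():
            continue  # ignore blank lines
        # Raises ValueError if a line does not have exactly two fields.
        sha, version = line.split()
        releases.append(Release(sha, version))
    return releases
```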

Upon detecting a change to the `LATEST` file, CI will run the entire
release process, just like it does now with the `VERSION` file. The main
differences are:

1. Before running the release step, CI will check out the commit
  specified in the `LATEST` file. This requires separating the release
  step from the build step, which in my opinion is cleaner anyway.
2. The `//:VERSION` Bazel target is replaced by a repository rule
  that gets the version to build from an environment variable, with a
  default of `0.0.0` to remain consistent with the current `daml-head`
  behaviour.
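
The fallback behaviour in point 2 amounts to the following, shown here as a Python sketch rather than as the actual Bazel repository rule. The environment variable name `SDK_RELEASE_VERSION` is a placeholder chosen for this example, and treating an empty value like an unset one is also an assumption.

```python
import os
from typing import Mapping


def sdk_version(env: Mapping[str, str] = os.environ) -> str:
    """Return the version to build, defaulting to 0.0.0 when unset.

    SDK_RELEASE_VERSION is a hypothetical variable name.
    """
    return env.get("SDK_RELEASE_VERSION") or "0.0.0"
```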

Some of the manual steps will need to be skipped for a snapshot release.
See amended `release/RELEASE.md` in this commit for details.

The main caveat of this approach is that the official release will be a
different binary from the corresponding snapshot. It will have been
built from the same source, but with a different version string. This is
somewhat mitigated by Bazel caching, meaning any build step that does
not depend on the version string should use the cache and produce
identical results. I do not think this can be avoided when our artifact
includes its own version number.

I must note, though, that while going through the changes required after
removing the `VERSION` file, I have been quite surprised at the sheer number of
things that actually depend on the SDK version number. I believe we should
look into reducing that over time.

CHANGELOG_BEGIN
CHANGELOG_END
2020-02-25 17:01:23 +01:00
Gary Verhaegen
064b26c75e
standard change extract script (#4416)
As part of our SOC2/ISO certification, we need to be able to evidence a
list of "Standard Changes" for the DAML SDK project. This commit adds a
script that extracts, for a given month, the list of Standard Changes
that happened along with relevant information (author, reviewer, etc.).

It also adds a definition for a monthly cron job on Azure to run the
script and send the result to Slack, @-mentioning Martin.

CHANGELOG_BEGIN
CHANGELOG_END
2020-02-07 15:16:03 +01:00
Gary Verhaegen
8a1b46f4fd
docs cron: use GitHub-Flavoured Markdown (#4141)
This patch changes the call to the GitHub API that translates the
release notes from markdown to HTML to use gfm instead of plain
markdown. gfm is a superset of markdown that adds the following:

- GitHub usernames (`@`-mentions) are turned into links to the user's
  profile page.
- Issues and PR numbers (`#1234`) are turned into links to the
  corresponding issue or PR.
- Existing git shas are turned into links to the corresponding commit.

An example of this feature missing is the release notes for
[v0.13.42](https://blog.daml.com/release-notes/0.13.42-1), where
intended links such as

> - Rename argument in active contract to payload. See #3826.

are not rendered.
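
For illustration, a request to GitHub's markdown rendering endpoint in gfm mode could be built as in the Python sketch below (the docs cron itself is written in Haskell). The `owner/repo` context value is a placeholder, `markdown_request` is a hypothetical helper, and the request is only constructed here, not sent.

```python
import json
import urllib.request


def markdown_request(text: str, repo: str) -> urllib.request.Request:
    """Build a POST request asking GitHub to render `text` as gfm.

    `repo` ("owner/repo") is the context GitHub uses to resolve
    references such as #1234 in gfm mode.
    """
    payload = {"text": text, "mode": "gfm", "context": repo}
    return urllib.request.Request(
        "https://api.github.com/markdown",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = markdown_request(
    "- Rename argument in active contract to payload. See #3826.",
    "owner/repo",  # placeholder repository context
)
```

Passing the request to `urllib.request.urlopen` would return the rendered HTML, subject to the API's usual authentication and rate limits.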

CHANGELOG_BEGIN
CHANGELOG_END
2020-01-21 14:46:02 +01:00
Gary Verhaegen
f2827e0207
docs cron: remove links to missing versions (#4123)
We have recently added the option for this script to not build some
versions (because they are too old and external dependencies have
changed from under them). We also have changed the GitHub call to get
all the history of releases.

This PR changes the logic to generate the `versions.json` file so that
it only contains versions that we have either built or copied over.

Consequently, it also changes the logic to decide whether this job
should run to depend only on the latest version, rather than the whole
list of versions.

CHANGELOG_BEGIN
CHANGELOG_END
2020-01-21 13:50:40 +01:00
Gary Verhaegen
760f9d4d37
docs cron: follow github pagination links (#4115)
The GitHub API is paginated (30 items by default). This creates two
problems:

1. At the moment, older versions silently drop from the docs website,
  without us having made any explicit decision about it.
2. When we prepare a new version, it gets created as a pre-release
  version. Our script filters that out, but that happens on our end so
  we end up with 29 published versions and the list is different from the
  existing one. If the prerelease then gets dropped, the oldest version
  comes back.

It is possible that we will sometime decide we do not want to keep old
documentation around forever, but that should be an explicit decision.
This patch changes the logic to fetch the list of versions from GitHub
so that we always get all the published versions (barring race
conditions inherent to that kind of paginated API).
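
The general technique can be sketched in Python as follows (the cron itself is written in Haskell, so this is an illustration of the idea rather than the code in this patch). GitHub advertises the next page in the `Link` response header with `rel="next"`. The `fetch_page` callback is a hypothetical stand-in for the HTTP call: it takes a URL and returns the page's items together with the raw `Link` header value, or `None` if there is none.

```python
import re
from typing import Callable, List, Optional, Tuple

# Matches the URL of the entry marked rel="next" in a Link header such as
# '<https://...&page=2>; rel="next", <https://...&page=5>; rel="last"'.
_NEXT_LINK = re.compile(r'<([^>]+)>\s*;\s*rel="next"')


def next_url(link_header: Optional[str]) -> Optional[str]:
    """Return the URL of the next page, or None on the last page."""
    if not link_header:
        return None
    match = _NEXT_LINK.search(link_header)
    return match.group(1) if match else None


def fetch_all(
    first_url: str,
    fetch_page: Callable[[str], Tuple[list, Optional[str]]],
) -> List:
    """Collect the items of every page by following rel="next" links."""
    items: List = []
    url: Optional[str] = first_url
    while url is not None:
        page_items, link_header = fetch_page(url)
        items.extend(page_items)
        url = next_url(link_header)
    return items
```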

CHANGELOG_BEGIN
CHANGELOG_END
2020-01-20 18:47:47 +01:00
Gary Verhaegen
8811006617
docs cron: more reliable checksums (#4102)
The docs build is currently not reproducible as it includes to-the-minute
time-of-build information. It also includes some Sphinx binary caches
which I suppose will also not be reproducible (though I have not checked
the details there).

This commit attempts to remove all sources of non-reproducibility from
the docs build, though this is hard to test without having a stable,
older release to compare with.

CHANGELOG_BEGIN
CHANGELOG_END
2020-01-20 16:21:34 +01:00
Gary Verhaegen
b8a588e9c0
docs cron: sort versions.json (#4062)
CHANGELOG_BEGIN
CHANGELOG_END
2020-01-16 12:32:31 +01:00
Gary Verhaegen
11be496e15
docs cron: create temp dirs (#4061)
CHANGELOG_BEGIN
CHANGELOG_END
2020-01-16 00:40:26 +01:00
Gary Verhaegen
45c474b3d5
try to fix docs again (#4060)
CHANGELOG_BEGIN
CHANGELOG_END
2020-01-15 22:50:46 +01:00
Gary Verhaegen
ada0ad07ca
docs cron: special case 0.13.43 for scala http issue (#4058)
CHANGELOG_BEGIN
CHANGELOG_END
2020-01-15 21:48:37 +01:00
Gary Verhaegen
e96db012ed
fix logic bug in docs release cron (#3998)
The latest changes to the docs cron have introduced a bug whereby the
"latest" version is determined including prereleases.

CHANGELOG_BEGIN
CHANGELOG_END
2020-01-09 13:02:42 +01:00
Gary Verhaegen
40fd4b3626
docs cron: do not rebuild old versions (#3944)
This commit makes two conceptually independent changes:

1. It adds a checksum file to each version folder. This allows the
script to detect when a version has not been correctly uploaded.
2. It changes the script to first download all the docs website, and
then reuse existing version folders where appropriate (i.e. when their
folder matches its checksum).
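
One way to implement such a check, shown as a Python sketch: the way the checksum is computed and the file name `checksum` are assumptions made for this example, not a description of what the script does. The checksum is a SHA-256 over a sorted manifest of per-file digests, excluding the checksum file itself.

```python
import hashlib
from pathlib import Path


def folder_checksum(root: Path, checksum_name: str = "checksum") -> str:
    """Hash every file under `root` (except the checksum file itself)."""
    relative_paths = sorted(
        p.relative_to(root).as_posix() for p in root.rglob("*") if p.is_file()
    )
    manifest = hashlib.sha256()
    for rel in relative_paths:
        if rel == checksum_name:
            continue
        file_digest = hashlib.sha256((root / rel).read_bytes()).hexdigest()
        # One "<digest>  <path>" line per file, in sorted path order.
        manifest.update(f"{file_digest}  {rel}\n".encode("utf-8"))
    return manifest.hexdigest()


def is_intact(root: Path, checksum_name: str = "checksum") -> bool:
    """True if the stored checksum exists and matches the folder content."""
    checksum_file = root / checksum_name
    if not checksum_file.is_file():
        return False
    return checksum_file.read_text().strip() == folder_checksum(root, checksum_name)
```

A version folder for which `is_intact` returns `False`, either because the checksum file is missing or because the content differs, would be rebuilt rather than reused.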

The hope is that this will reduce the time it takes to deploy a new
version, as only the current version should be rebuilt (in addition to
previous, failed versions).

The first time this cron runs (upon next release as per the current
setup), however, it will still rebuild all existing versions as they do
not currently have a checksum.

CHANGELOG_BEGIN
CHANGELOG_END
2020-01-09 12:17:35 +01:00
Moritz Kiefer
42c586f8d4
Bump ghcide (#3943)
* Bump ghcide

* Fix ghcide build

* Include bugfix for Windows
2020-01-04 07:51:51 +01:00
Gary Verhaegen
f8c247cadf
partial fix for docs cron (#3941)
This commit aims at mitigating two issues we have noticed with the
0.13.41 release:

1. The initial cron run for that release got interrupted at the 50-minute
mark, which happened to be right in the middle of the s3 upload.
This means it had already changed the versions.json file, but had not
finished updating the actual html files. Right now, the docs.daml.com
website shows version 0.13.41 in the drop-down, but actually displays
the content for 0.13.40. Additionally, trying to explicitly visit the
website for 0.13.41 (https://docs.daml.com/0.13.41) yields a 404. Note
that this also means the cron job did not reach the "tell HubSpot"
point, so 0.13.41 did not get announced.
2. As the script also did not reach the "clear cache" step, subsequent
runs have been rebuilding the documentation for no reason as the
sequence of steps was: check versions.json through HTTP, get cached one,
see it's not up-to-date, build docs, check versions.json through s3 API,
bypassing the cache, see it's up-to-date, stop.

To address those issues, this PR changes the cron to:
1. Increase the timeout to 2h instead of 50 minutes.
2. Always check the versions.json file through s3, rather than go
through the HTTP cache first.

These are not complete solutions but I'm not sure how to do better given
that s3 does not have atomic operations.
2020-01-03 14:43:22 +01:00
Gary Verhaegen
878429e3bf
update copyright notices to 2020 (#3939)
copyright update 2020

* update template
* run script: `dade-copyright-headers update .`
* update script
* manual adjustments
* exclude frozen proto files from further header checks (by adding NO_AUTO_COPYRIGHT files)
2020-01-02 21:21:13 +01:00
Gary Verhaegen
adceb3a6b2
checkout current sha after daily docs (#3559)
Currently if the docs script fails, the Slack message we get mentions the commit title of the docs version that failed to build, which is not super useful. This ensures we get back to the current commit regardless of what happens with the Haskell script.
2019-11-21 15:00:19 +01:00
Andreas Herrmann
33e47828e3
Bazel 1.1 (#3249)
* bazel: 0.28.1 --> 1.1.0

* bazel-watcher sha256

* Fix missing line in patch

* proto_source_root --> strip_import_prefix

See https://github.com/bazelbuild/bazel/issues/7153 for details.

* Update rules_nixpkgs

Required to avoid errors of the form
```
ERROR: An error occurred during the fetch of repository 'node_nix':
   parameter 'sep' may not be specified by name, for call to method split(sep, maxsplit = None) of 'string'
```

and
```
ERROR: An error occurred during the fetch of repository 'node_nix':
   Traceback (most recent call last):
	File "/private/var/tmp/_bazel_runner/17d2b3954f1c6dcf5414d5453467df9a/external/io_tweag_rules_nixpkgs/nixpkgs/nixpkgs.bzl", line 149
		_execute_or_fail(repository_ctx, <3 more arguments>)
	File "/private/var/tmp/_bazel_runner/17d2b3954f1c6dcf5414d5453467df9a/external/io_tweag_rules_nixpkgs/nixpkgs/nixpkgs.bzl", line 318, in _execute_or_fail
		fail(<1 more arguments>)

Cannot build Nix attribute 'nodejs'.
Command: [/Users/runner/.nix-profile/bin/nix-build, /private/var/tmp/_bazel_runner/17d2b3954f1c6dcf5414d5453467df9a/external/node_nix/nix/bazel.nix, "-A", "nodejs", "--out-link", "bazel-support/nix-out-link", "-I", "nixpkgs=/private/var/tmp/_bazel_runner/17d2b3954f1c6dcf5414d5453467df9a/external/nixpkgs/nixpkgs"]
Return code: 1
Error output:
src/main/tools/process-tools.cc:173: "setitimer": Invalid argument
```

* Update rules_scala

* .proto has been removed, use [ProtoInfo] instead

See
https://docs.bazel.build/versions/1.1.0/be/protocol-buffer.html#proto_library

* python3_nix add nix_file attribute

To avoid the following error

```
ERROR: /home/aj/tweag.io/da/da-bazel-1.1/BUILD:66:1: //:nix_python3_runtime depends on @python3_nix//:bin/python in repository @python3_nix which failed to fetch. no such package '@python3_nix//': Traceback (most recent call last):
        File "/home/aj/.cache/bazel/_bazel_aj/5f825ad28f8e070f999ba37395e46ee5/external/io_tweag_rules_nixpkgs/nixpkgs/nixpkgs.bzl", line 149
                _execute_or_fail(repository_ctx, <3 more arguments>)
        File "/home/aj/.cache/bazel/_bazel_aj/5f825ad28f8e070f999ba37395e46ee5/external/io_tweag_rules_nixpkgs/nixpkgs/nixpkgs.bzl", line 318, in _execute_or_fail
                fail(<1 more arguments>)

Cannot build Nix attribute 'python3'.
Command: [/home/aj/.nix-profile/bin/nix-build, "-E", "import <nixpkgs> { config = {}; overlays = []; }", "-A", "python3", "--out-link", "bazel-support/nix-out-link", "-I", "nixpkgs=/home/aj/.cache/bazel/_bazel_aj/5f825ad28f8e070f999ba37395e46ee5/external/nixpkgs/nixpkgs"]
Return code: 1
Error output:
error: anonymous function at /home/aj/.cache/bazel/_bazel_aj/5f825ad28f8e070f999ba37395e46ee5/external/nixpkgs/nixpkgs.nix:3:1 called with unexpected argument 'config', at (string):1:1
```

* rules_haskell unnamed string.split(_, maxsplit = _)

The keyword argument may no longer be named.

* string.replace(_, _, maxsplit = _) may not be named

* Move proto sources from deps to data

Fixes

```
ERROR: /home/aj/tweag.io/da/da-bazel-1.1/daml-lf/archive/BUILD.bazel:150:1: in deps attribute of scala_test rule //daml-lf/archive:daml_lf_archive_reader_tests_test_suite_src_test_scala_com_digitalasset_daml_lf_archive_DecodeV1Spec.scala: '//daml-lf/archive:daml_lf_1.6_archive_proto_srcs' does not have mandatory providers: 'JavaInfo'. Since this rule was created by the macro 'da_scala_test_suite', the error might have been caused by the macro implementation
```

* Define sha256 for haskell_ghc__paths

Bazel 1.1.0 fails on missing hashes.

* Disable --incompatible_windows_native_test_wrapper

* //compiler/daml-extension don't modify sources

Modifying sources in-place can cause issues on Windows, where build
actions are not sandboxed and changes on sources can affect other build
steps.

* bazel-genfiles --> bazel-bin

The bazel-genfiles symlink has been removed since Bazel 1.0.
See https://github.com/bazelbuild/bazel/issues/8651

* Mark dev_env_tool repository rule as configure

See
https://docs.bazel.build/versions/1.1.0/skylark/lib/globals.html#repository_rule

* Move data deps into data attribute

* Mark dev_env_tool as local = True

* Manually fetch @makensis_dev_env
2019-11-11 10:06:03 +01:00
Gary Verhaegen
c1662527f5
docs cron: fix content-type header (#3289)
2019-10-30 14:46:57 +01:00
Gary Verhaegen
c4aa296a5e
add debug prints for docs cron (#3281)
2019-10-30 10:23:12 +00:00
Gary Verhaegen
536188abce
fix spurious HubSpot announce bug (#3266)
2019-10-29 13:26:08 +00:00
Gary Verhaegen
1e1e08d3c9
rewrite docs cron in Haskell (#3235)
2019-10-28 18:26:06 +00:00