Commit Graph

353 Commits

Author SHA1 Message Date
nickchapman-da
6a745ed1fa
Support choice observers in 1.dev (#7922)
* Adapt to new desugaring for choice observers.

update hash of ghc patch.

changelog_begin
changelog_end

update ghc patch to final version

update stack-snapshot hashes for ghc-lib(-parser)

update stackage_snapshot.json, following `bazel run @stackage-unpinned//:pin`

expose Optional constructors for desugared code to use

adapt LFConversion to expect a 4-tuple for a desugared choice def/sig

update LFConversion for choice-observers

first example using new choice observer syntax.

fix scala type checker to have correct scoping rules for choice-observers

remove comment from example which says it is broken

improve variable names

improve tests for choice-observer clause

only test choice-observers SINCE 1.dev

add jq queries for choice observers

make positive statement in jq test which checks choice observers are present

test behaviour of choice observers

squash me

typo

* test more choice-observer divulgence

* Update documentation for choice observers.

changelog_begin
Support choice observers in 1.dev
changelog_end

* fix docs build

* fix daml docs choice-observers example

* address comments: rewording text

* annotate choice observers as early-access in documentation

* split out documentation code-snippets which require --target=1.dev

* final tweaks to documentation text
2020-11-18 19:51:15 +00:00
Gary Verhaegen
d4b6b06923
update docker image description (#7915)
Note: this file is meant to represent the content on [Docker Hub], but
syncing is currently a manual process. I will propagate once this is
approved and merged.

[Docker Hub]: https://hub.docker.com/repository/docker/digitalasset/daml-sdk

CHANGELOG_BEGIN
CHANGELOG_END
2020-11-06 16:39:16 +01:00
Gary Verhaegen
ba2ce20bb2
Docker Hub description for digitalasset/daml-sdk (#7886)
This commit just copies the existing description of the image from
DockerHub so we have a starting point to change it.

CHANGELOG_BEGIN
CHANGELOG_END
2020-11-04 18:25:08 +01:00
Remy
cb5b439e39
Bump test_sha (#7512) (#7860)
This changed in #7858 but it’s a harmless change.

CHANGELOG_BEGIN
CHANGELOG_END
2020-11-02 20:18:32 +00:00
Gary Verhaegen
cdf6160c76
ci/cron/check: use whileJust_ for recursion (#7747)
As suggested in #7746.

CHANGELOG_BEGIN
CHANGELOG_END
2020-10-20 16:03:46 +02:00
Gary Verhaegen
dd01dbc5a4
ci/cron: change github contact email (#7741)
GitHub requests that when we use the API without a token, which is what
we do here, we use a user-agent header that allows them to contact us in
case there is an issue they need to discuss. This PR updates that
address and cleans up a bit of duplication around it.

CHANGELOG_BEGIN
CHANGELOG_END
2020-10-20 12:08:31 +02:00
Gary Verhaegen
7eb4b352dd
ci/cron/check: replace wget with Haskell code (#7731)
As promised in #7696.

CHANGELOG_BEGIN
CHANGELOG_END
2020-10-19 19:07:09 +02:00
Gary Verhaegen
e1b26d27ad
ci/cron/check: limit simultaneous downloads (#7703)
As requested in review of #7696.

CHANGELOG_BEGIN
CHANGELOG_END
2020-10-16 11:37:50 +02:00
Gary Verhaegen
305a097a93
ci/cron: move concurrency from Bash to Haskell (#7696)
One small step further in removing the Bash.

CHANGELOG_BEGIN
CHANGELOG_END
2020-10-15 14:00:55 +02:00
Gary Verhaegen
ad79acdb65
ci/cron: ability to run check locally (#7690)
This PR allows the script to run without GCP credentials. It will
obviously then skip the bits that require GCP credentials, but that
still leaves it with plenty of things to do.

Because checking all releases can still be quite long (around an hour on
CI, and my personal connection is a bit slower), this also introduces a
new parameter that restricts the number of releases to test.

CHANGELOG_BEGIN
CHANGELOG_END
2020-10-15 12:18:45 +02:00
Moritz Kiefer
80a25bf54a
Bump perf sha (#7679)
The change from #7666 is benign so we can simply bump this.

changelog_begin
changelog_end
2020-10-14 11:38:49 +00:00
Gary Verhaegen
8e905b34c0
ci/bash-lib: fix gcs return code (#7630)
Currently the return code of the function is the return code of the
`eval "$restore_trap"` line, whereas semantically we want the return
code of the `gsutil` call. This is not an issue in most cases as the
`set -e` should kick in, but if the function appears as the condition in
an `if` statement the `-e` flag is suspended.
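
A minimal sketch of the pattern described above, not the actual `ci/bash-lib` code: the name `run_then_restore` and the `restore_trap` placeholder are hypothetical, and `false` stands in for a failing `gsutil` call. The wrapped command's status is saved before the cleanup step runs and is returned afterwards.

```shell
# Hypothetical helper illustrating the pattern; not the real save_gcp_data.
run_then_restore() {
    # Placeholder for the saved trap command (assumption for this sketch).
    restore_trap='true'
    rc=0
    # Run the wrapped command and remember its exit status.
    "$@" || rc=$?
    # The cleanup succeeds, so without the explicit return below the
    # function would report success even when the command failed.
    eval "$restore_trap"
    return "$rc"
}

# Usage: as the condition of an `if`, where `set -e` is suspended, the
# branch taken depends entirely on the returned status.
if run_then_restore false; then
    echo "upload succeeded"
else
    echo "upload failed"
fi
```

With the final `return "$rc"` removed, the same call would take the "succeeded" branch, which matches the symptom reported for the daily license check.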

The main use-case right now is that the daily license check is _not_
uploading artifacts.

CHANGELOG_BEGIN
CHANGELOG_END
2020-10-11 19:07:22 +02:00
Gary Verhaegen
42b7fa5ab9
ci/cron: fix gcs path (#7626)
Change the path used to push to the backup gcs bucket to match what is
put by the release script. This needs to get merged before we run the
next daily.

CHANGELOG_BEGIN
CHANGELOG_END
2020-10-09 15:31:52 +02:00
Gary Verhaegen
cd427dc2d2
ci/cron: upload artifact to daml-data if missing (#7616)
If we don't already have a copy of an artifact in our "disaster
recovery" storage box, put one.

Note: as implemented, this upload mechanism happens only if the release
was successfully verified signature-wise, so this should not result in
us saving broken artifacts. Also, CI does not have deletion or overwrite
access to this bucket, so overall this should be pretty safe.

CHANGELOG_BEGIN
CHANGELOG_END
2020-10-09 14:55:37 +02:00
Gary Verhaegen
8c1fbf6225
ci/bash_lib: generalize save_gcp_data (#7599)
This PR extends the existing `save_gcp_data` function to handle any
`gsutil` command. This is done to support existence checking using
`gsutil ls` for private artifacts in release checking (`ci/cron`).

CHANGELOG_BEGIN
CHANGELOG_END
2020-10-08 18:37:14 +02:00
Gary Verhaegen
19d7086a21
fix ci/cron: remove references to LOG (#7607)
CHANGELOG_BEGIN
CHANGELOG_END
2020-10-08 18:15:25 +02:00
Gary Verhaegen
19c658ae15
ci/cron: move temp file handling to Haskell (#7596)
CHANGELOG_BEGIN
CHANGELOG_END
2020-10-07 17:45:00 +02:00
Gary Verhaegen
c238985bf9
ci/cron: actually fail on invalid signature (#7592)
At the moment, because the signature check appears in an `if` statement,
failed signatures do not actually fail the script and would thus still
result in "success" messages to Slack.
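
One way to make such a check fail the script is to record each failure and return it at the end. This is a sketch under assumptions: `verify_signature` is a hypothetical stand-in for the real gpg check, and `check_all` is not the name used in `ci/cron`.

```shell
# Hypothetical stand-in for the real gpg check: succeeds only for names
# ending in ".good".
verify_signature() {
    case "$1" in
        *.good) return 0 ;;
        *) return 1 ;;
    esac
}

check_all() {
    failed=0
    for artifact in "$@"; do
        # A failing command in an `if` condition never triggers `set -e`,
        # so the failure has to be recorded explicitly.
        if verify_signature "$artifact"; then
            echo "$artifact: signature OK"
        else
            echo "$artifact: signature INVALID"
            failed=1
        fi
    done
    # Propagate the recorded failure as the function's exit status.
    return "$failed"
}

# Usage: the second artifact is invalid, so the overall check fails.
check_all release-a.good release-b.bad || echo "at least one signature failed"
```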

CHANGELOG_BEGIN
CHANGELOG_END
2020-10-07 13:12:13 +02:00
Gary Verhaegen
5799557570
ci/cron: check all releases (#7586)
This walks through the paginated GH API to fetch all releases and check
their signatures.

CHANGELOG_BEGIN
CHANGELOG_END
2020-10-07 12:04:08 +02:00
Gary Verhaegen
4db8c3ada1
ci/cron: move Bash cron inside Haskell script (#7585)
Yes, this is how I write Haskell. I'm told it's an improvement over
Bash.

Jokes aside, plan is to chip away at the Bash script, starting with
replacing the outermost loop with a proper "get _all_ releases" call
from Haskell, but I like keeping things working in small steps, and even
long-term I have no desire to reimplement the gpg signature checking
code in Haskell.

I have tested that things still work (on my machine); the only
difference is that we now only get the full output all at once at the
end, rather than one signature at a time. I don't think anyone is
looking at the output in real-time, so this should not be a huge issue.

CHANGELOG_BEGIN
CHANGELOG_END
2020-10-06 18:45:29 +02:00
Gary Verhaegen
21374713d0
ci/cron: add opt-parse applicative (#7583)
As requested in [previous PR].

[previous PR]: https://github.com/digital-asset/daml/pull/7569#discussion_r499733636

CHANGELOG_BEGIN
CHANGELOG_END
2020-10-06 17:14:27 +02:00
Gary Verhaegen
6d94208226
stop notifying shayne on pr builds (#7581)
CHANGELOG_BEGIN
CHANGELOG_END
2020-10-06 14:20:43 +02:00
Gary Verhaegen
67746b7710
ci/cron: add arg to select docs (#7569)
This is a preparatory step for moving at least some of the logic of
checking signatures to this script. The reasoning for putting signatures
in the same script basically boils down to "it already has GitHub
pagination".

I also removed the `run.sh` wrapper because it did not add anything
anymore. It used to be useful, but across various changes it's sort of
lost its purpose.

CHANGELOG_BEGIN
CHANGELOG_END
2020-10-05 19:35:09 +02:00
Gary Verhaegen
eb6b2ce1c6
ci/cron: small cleanup (#7570)
Small improvements I noticed could be made while working on #7569, in a
separate PR because they're quite unrelated.

CHANGELOG_BEGIN
CHANGELOG_END
2020-10-05 19:33:48 +02:00
Gary Verhaegen
2973228f77
signature check: report to Slack (#7568)
If a tree falls in the forest and all that.

CHANGELOG_BEGIN
CHANGELOG_END
2020-10-05 17:31:11 +02:00
Gary Verhaegen
fda2eca084
periodically check signatures (#7543)
This is a first, very incomplete step in the spirit of small,
incremental PRs. Known missing features:

- Should check all versions, not just the 30 most recent ones.
- Should also download from GCP backup and compare.
- Should alert on Slack if anything is unexpected.
- Should handle versions prior to us starting to sign (and do what?).
- Should also check artifacts in Artifactory, not just GitHub Releases.
- Optionally should save to GCP if we don't have a backup already.

So at the moment it's just downloading the artifacts for the 30 most
recent releases and printing a message stating whether we have a
signature and whether it's valid.

CHANGELOG_BEGIN
CHANGELOG_END
2020-10-01 21:01:42 +02:00
Gary Verhaegen
60b300199b
improve "patch bazel windows" UX (#6764)
This does not get used very often, so it is likely nobody will remember
how it works when we do use it. And due to the way Azure orders jobs in
its UI, it's very easy to miss that there is a final, Linux-based step
and that the values are actually printed there.

So this adds a little note to remind us of that.

Note that as this changes the `ci/patch_bazel_windows` folder, this will
also generate a new Bazel, so this PR will also update the Scoop
reference.

CHANGELOG_BEGIN
CHANGELOG_END
2020-09-30 14:09:51 +02:00
Moritz Kiefer
1243afc4a1
Remove reference to release-notes.rst (#7524)
* Remove reference to release-notes.rst

https://github.com/digital-asset/daml/pull/7458 shuffled this
around. While we could update it, it doesn’t really make any sense. We
post our release notes to the blog now and not in the docs so this
whole checkout procedure is redundant. This is also true if we wanted
to make a bugfix release for a release < 1.5 where this file still
existed. The trigger_sha is always on master (following our current
release process) so the file would still not exist.

I did also remove it from the docs cronjob. We never reupload old docs
so this doesn’t make a difference.

changelog_begin
changelog_end

* Stupid whitespace change because windows is pissing me off

changelog_begin
changelog_end
2020-09-30 11:34:43 +00:00
Moritz Kiefer
89c0f6ca41
Bump test_sha (#7512)
This changed in #7501 but it’s a harmless change.

changelog_begin
changelog_end
2020-09-29 11:01:01 +00:00
Sofia Faro
c5d145358d
Patch ghc to add a daml version header marker. (#7489)
* Update ghc to add a daml version header marker.

(WIP)

changelog_begin
changelog_end

* update stack snapshot

* See what happens with daml-doc

* update patch

* update stack snapshot

* unpin stackage

* Update daml docs tests

* Remove version header ann when parsing module doc

* Update the test

* Set final patch commit SHA.

* update stack snapshot

* unpin stackage (mac/linux)

* lint

* unpin stackage (win)
2020-09-28 17:01:20 +00:00
Remy
65f1a247d0
Bump test_sha in perf tests (#7441)
CHANGELOG_BEGIN
CHANGELOG_END
2020-09-18 18:30:39 +00:00
Gary Verhaegen
29b876920e
ci/patch_bazel_windows: fail on curl failure (#7431)
At the moment, with the `curl` call comfortably nested inside an `if`
statement, `curl` failures are interpreted as simply going down the
`else` branch of the conditional. Hoisting it up to a separate variable
declaration makes the script fail on the curl call itself.
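
The difference between the two shapes can be sketched as follows. `fetch` is a hypothetical stand-in for the failing `curl` call, and the explicit `|| return` is an addition for this sketch so that it behaves the same whether or not `set -e` is active; under `set -e`, the plain assignment alone would stop the script.

```shell
# Hypothetical stand-in for a failing download: no output, non-zero status.
fetch() {
    return 22
}

# Before: the failure is swallowed by the test and the `else` branch runs.
nested() {
    if [ "$(fetch)" = "ok" ]; then
        echo "then branch"
    else
        echo "else branch"
    fi
}

# After: a plain assignment takes the exit status of the command
# substitution, so the failure is visible before the conditional.
hoisted() {
    response=$(fetch) || return $?
    if [ "$response" = "ok" ]; then
        echo "then branch"
    else
        echo "else branch"
    fi
}

nested                                  # prints "else branch"
hoisted || echo "hoisted failed with status $?"
```

Note that a declaration such as `local response=$(fetch)` would report the status of `local` rather than of `fetch`, so the assignment has to stand on its own for this to work.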

CHANGELOG_BEGIN
CHANGELOG_END
2020-09-18 15:57:21 +02:00
Gary Verhaegen
f98b92d7ba
reset Windows cache (#7423)
CHANGELOG_BEGIN
CHANGELOG_END
2020-09-16 22:38:40 +02:00
Leonid Shlyapnikov
d13e7aa184
JSON API daily perf test cron job (#7406)
* Add http-json-perf daily cron job

changelog_begin
changelog_end

* commenting out other jobs so we can manually test the new one

* commenting out other jobs so we can manually test the new one

* Fix the shell script

* Fixing the gs bucket, `gs://http-json bucket does not exist`

* uncomment the other jobs

* timestamp from git log

* get rid of DAR copying

* comment out the other jobs, so we can test it

* uncomment the other jobs
2020-09-16 13:02:02 -04:00
Gary Verhaegen
9632a2fa43
fix check-changelog (#7419)
In #7386 I inadvertently removed the default value for the argument
`check-changelog.sh` takes. This did not break CI because CI always
passes the argument explicitly, but it did break local workflows for
some developers.

Apologies.

CHANGELOG_BEGIN
CHANGELOG_END
2020-09-16 13:19:37 +00:00
Gary Verhaegen
b2d58a3304
save daily perf results (#7396)
It's a real shame I forgot to do this sooner, but better late than never
I suppose.

CHANGELOG_BEGIN
CHANGELOG_END
2020-09-14 18:38:31 +00:00
Gary Verhaegen
bcee8f9152
let dependabot bypass changelog check (#7386)
Currently, when Dependabot makes a PR, not only do we need to manually
trigger CI through `/azp run`, we also need to either change the commit
or add a new one to pass the changelog check. This addresses that second
part by making an exception for PRs signed by dependabot.

This is also a bit of an excuse to play with git signatures and gpg.

CHANGELOG_BEGIN
CHANGELOG_END
2020-09-14 14:04:37 +02:00
Moritz Kiefer
c3d758a2d3
Ignore failed ipcrm calls (#7218)
* Ignore failed ipcrm calls

changelog_begin
changelog_end

* Update ci/clear-shared-segments-macos.yml

Co-authored-by: Samir Talwar <samir.talwar@digitalasset.com>
2020-08-25 08:39:11 +00:00
Moritz Kiefer
5936644970
Bump windows cache (#7212)
changelog_begin
changelog_end
2020-08-24 18:19:28 +02:00
Gary Verhaegen
6d1adee92f
push script-runner and trigger-runner to Artifactory (#7196)
I have created the corresponding user and repositories on Artifactory,
and tested the `curl` command manually. I'll add the corresponding
credentials to Azure once this is approved.

CHANGELOG_BEGIN
CHANGELOG_END
2020-08-20 19:11:27 +02:00
Gary Verhaegen
51e7c88bf5
announce release rotation on Tuesday (#7151)
It's been pointed out to me that some people actually plan their work,
and for them, it would be useful to know at least a day in advance that
they will be doing a release. This PR attempts to accommodate that.

Note: this will run as a new, separate cron. I'll do the Azure setup
after this gets approved.

CHANGELOG_BEGIN
CHANGELOG_END
2020-08-17 13:42:35 +02:00
Gary Verhaegen
d43419d339
update perf sha (#7140)
Change in Speedy API.

CHANGELOG_BEGIN
CHANGELOG_END
2020-08-14 12:55:06 +02:00
Gary Verhaegen
1baea84ca0
fix auth header for compat pr (#7134)
On the last release, the job succeeded despite not being able to create
the compat PR. This fixes:

- The curl call to actually return non-0 on non-2xx HTTP response.
- The way in which we encode the credentials.

This also attempts to create a Bash library, hopefully this time in a
way that doesn't get destroyed by our release process. IIUC pipeline
instructions (YAML files) are all parsed and read before any execution,
so by embedding the Bash library in a template we should get the correct
version (i.e. the one that is running the pipeline) even when checking
out other commits.

CHANGELOG_BEGIN
CHANGELOG_END
2020-08-14 11:35:57 +02:00
Moritz Kiefer
67f350694c
Fix release cron job (#7095)
CI failed because `$4` was unset. Leaving it unset is intentional, since
we explicitly check later whether it is set; we just need to actually
reach that check rather than fail earlier due to `set -u`.
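
A common way to read an optional positional argument under `set -u` is a default expansion. The sketch below assumes that approach; `describe_fourth` is a hypothetical function, not the release script itself.

```shell
set -u

# Hypothetical function standing in for the script's argument handling.
describe_fourth() {
    # "${4:-}" expands to the empty string when the fourth argument is
    # missing, so `set -u` does not abort before the check below.
    fourth="${4:-}"
    if [ -z "$fourth" ]; then
        echo "fourth argument not set"
    else
        echo "fourth argument: $fourth"
    fi
}

describe_fourth a b c
describe_fourth a b c d
```

A bare `$4` in the first call would abort the script with an "unbound variable" error before the `-z` check was ever reached.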

changelog_begin
changelog_end
2020-08-12 10:50:28 +02:00
Stephen Compall
6306b9f8d8
perf test harmlessly changed in #6907; match sha (#7065)
CHANGELOG_BEGIN
CHANGELOG_END
2020-08-10 07:26:18 +00:00
Moritz Kiefer
31c2ce0220
Bump Windows cache (#7056)
The "output was not created" errors seem to have become very
frequent. While taking out nodes seems to work as a bandaid, I’d like
to see if resetting the cache buys us a few days of not having to deal
with this. Admittedly, I don’t really have an explanation for why
resetting the cache should help if taking out the machines seems to do
something (suggesting that it hasn’t propagated fully).

changelog_begin
changelog_end
2020-08-07 09:05:32 +00:00
Gary Verhaegen
00f3de63c9
rotate responsibility for release process (#7011)
This PR attempts to add some automation around assigning release
management. The PR adds a file `release/rotation`; each week, the
updated CI cron job will:

- Open a PR for the new release [as current].
- Assign the first user in the file to that PR.
- Add the Standard-Change label to the PR.
- Start the build for that PR [as current].
- Open a new PR that rotates the `release/rotation` file, i.e. pushes back
  the first line to the end of the file.
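
The rotation step in the last item can be sketched in a few lines of shell. The helper name `rotate_file` is hypothetical and the real cron job may implement this differently.

```shell
# Move the first line of a file to the end (hypothetical helper name).
rotate_file() {
    file=$1
    tmp=$(mktemp)
    # Everything from line 2 onwards, followed by the original first line.
    { tail -n +2 "$file"; head -n 1 "$file"; } > "$tmp"
    mv "$tmp" "$file"
}

# Usage with a throwaway file instead of the real rotation file.
demo=$(mktemp)
printf 'alice\nbob\ncarol\n' > "$demo"
rotate_file "$demo"
cat "$demo"
rm -f "$demo"
```

After one rotation the demo file lists bob, carol, alice, so the person who handled this week's release moves to the back of the queue.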

This PR also adds mentions of the "release handler" (the first line of
`release/rotation`) to the various messages we send to Slack along the
release process.

The initial state of the `release/rotation` file has been created by
listing all the volunteers (Language team, Application Runtime team, as
well as @SamirTalwar-DA and @stefanobaghino-da) and piping the file
through `shuf`. (Then I put myself at the top so I can hopefully iron
out the issues with the first attempt.)

CHANGELOG_BEGIN
CHANGELOG_END
2020-08-05 18:58:56 +02:00
Gary Verhaegen
1ff75f1256
remove monthly report (#6967)
This script is no longer relevant to our internal processes. The report
is now generated by the security team and validated by us, rather than
produced and validated by us.

CHANGELOG_BEGIN
CHANGELOG_END
2020-08-04 12:01:07 +02:00
Remy
cc0b497d23
Bump test_sha in perf tests (#6972)
CHANGELOG_BEGIN
CHANGELOG_END
2020-08-03 17:21:49 +00:00
Gary Verhaegen
614b4298a1
update base image for SDK Docker image (#6970)
The openjdk-alpine images we were relying on so far do not have security
updates anymore.

CHANGELOG_BEGIN
CHANGELOG_END
2020-08-03 18:20:31 +02:00
Gary Verhaegen
ef465de0a8
run build on automated release PRs (#6964)
Even though the Azure Pipelines bot account _clearly_ has write access
to our repo, as it can create the PR, it does not count as having write
access for the purposes of Azure deciding to run the build on the PRs it
opens. Not having the build run on the release PR would defeat the whole
point of having it, so this adds a little nudge to Azure so it does
start the build after opening the PR.

CHANGELOG_BEGIN
CHANGELOG_END
2020-08-03 16:42:00 +02:00
Samir Talwar
11bc582a9e
Publish DAML on SQL to GitHub Releases. (#6876)
2020-07-27 17:33:20 +02:00
Gary Verhaegen
8043756883
better release triggers (#6859)
Based on feedback from @nickchapman-da, this PR aims at making the
release process easier by:

- Automatically opening a release PR on Wednesday morning. The goal here
  is that by the time we start working, there is a release already
  built, so we save about an hour on waiting for that. This obviously
  doesn't help with ad-hoc releases.
- On a release PR build, posting to Slack when the release is ready to
  merge.
- On a release master build, posting to Slack when a release is ready to
  be tested.

My hope is that this makes the release process less tedious. This is not
trying to address the actual release testing, but hopefully should
reduce the annoyance of having to constantly go and check if the release
is ready.

CHANGELOG_BEGIN
CHANGELOG_END
2020-07-24 16:40:11 +00:00
Gary Verhaegen
97d1fa1e04
fix docs cron (#6864)
I lost the public ACL in #6817.

CHANGELOG_BEGIN
CHANGELOG_END
2020-07-24 16:13:00 +00:00
Gary Verhaegen
818a52b094
simplify docs cron (#6817)
This commit changes the "live state" to be that all versions are there
on S3, most of them hidden the way snapshots currently are, and only
displays in the drop-down the list of "supported" versions, i.e. stable
and >= 1.0.0.

The docs cron will now:

- Get list of versions from GitHub (as it does now)
- Get list of versions from S3 (as it does now: versions.json +
  snapshots.json, though it assumes we'll have a follow-up PR to change
  the latter to hidden.json)
- Compare; if the sets of versions are the same, stop there. (Note: this
  "set of versions" here includes the notion of which versions are shown,
  not just which ones exist. See the Versions data type in the code.)
- If there is a new hidden version, just build that, push it, change
  nothing else. No need to download any of the existing versions or mess
  around with anything else (except updating `hidden.json`, otherwise
  we're going to be doing this way too often.)
- If there is a new visible version:
  - check if we have it locally (i.e. from the previous step: it's a
    version we just added)
  - figure out the old and new default versions, and then apply the diff
    to the top-level directory. Basically download the two folders, list
    files that exist in the old one and not in the new one, delete those
    from S3, then push the new one to the top-level on S3.
- update versions.json & hidden.json (and for now snapshots.json)
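
The "list files that exist in the old one and not in the new one" step is a set difference. The sketch below assumes the two file listings are available as plain text files, one path per line; `stale_entries` is a hypothetical name, and the actual docs cron is written in Haskell rather than shell.

```shell
# Print entries that appear in the old listing but not in the new one.
stale_entries() {
    old_sorted=$(mktemp)
    new_sorted=$(mktemp)
    sort "$1" > "$old_sorted"
    sort "$2" > "$new_sorted"
    # comm -23 keeps only the lines unique to the first (old) file.
    comm -23 "$old_sorted" "$new_sorted"
    rm -f "$old_sorted" "$new_sorted"
}

# Usage with two throwaway listings.
old=$(mktemp)
new=$(mktemp)
printf 'index.html\nold-page.html\nstyle.css\n' > "$old"
printf 'style.css\nindex.html\nnew-page.html\n' > "$new"
stale_entries "$old" "$new"
rm -f "$old" "$new"
```

Only the paths printed by this step would need deleting from the top-level directory before the new default version is pushed.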

This means that:

- we never mess with the existing versions; we don't need to download
  them, we don't need to change them, we don't clean them up. Old links
  keep working forever.
- The running time for the docs cron is roughly constant, in that it
  should very rarely have to either build or upload (or download) more
  than 2 versions per run, and if those instances happen they'd be
  accidents (we made 3 actual releases in an hour), not build-up over
  time.
2020-07-24 14:40:32 +02:00
Andreas Herrmann
4b1438276c
Update Bazel 2.1.0 --> 3.3.1 (#6761)
* Upgrade nixpkgs revision

* Remove unused minio

It used to be used as a gateway to push the Nix cache to GCS, but has
since been replaced by nix-store-gcs-proxy.

* Update Bazel on Windows

changelog_begin
changelog_end

* Fix hlint warnings

The nixpkgs update implied an hlint update which enabled new warnings.

* Fix "Error applying patch"

Since Bazel 2.2.0 the order of generating `WORKSPACE` and `BUILD` files
and applying patches has been reversed. This allows users to define
patches to these files that will not be immediately overwritten.
However, it also means that patches on another repository's original
`WORKSPACE` file will likely become invalid.

* a948eb7255
* https://github.com/bazelbuild/bazel/issues/10681

Hint: If you're generating a patch with `git` then you can use the
following command to exclude the `WORKSPACE` file.

```
git diff ':(exclude)WORKSPACE'
```

* Update rules_nixpkgs

* nixpkgs location expansion escaping

* Drop --noincompatible_windows_native_test_wrapper

* client_server_test using sh_inline_test

client_server_test used to produce an executable shell script in form of
a text file output. However, since the removal of
`--noincompatible_windows_native_test_wrapper` this no longer works on
Windows since `.sh` files are not directly executable on Windows.

This change fixes the issue by producing the script file in a dedicated
rule and then wrapping it in a `sh_test` rule which also works on
Windows.

* daml_test using sh_inline_test

* daml_doc_test using sh_inline_test

* _daml_validate_test using sh_inline_test

* damlc_compile_test using sh_inline_test

* client_server_test find .exe on Windows

* Bump Windows cache for Bazel update

Remove `clean --expunge` after merge.

Co-authored-by: Andreas Herrmann <andreas.herrmann@tweag.io>
2020-07-23 09:46:04 +02:00
Remy
28ab504b21
Bump test_sha in perf tests (#6825)
CHANGELOG_BEGIN
CHANGELOG_END
2020-07-22 11:00:53 +00:00
Moritz Kiefer
edd84a09d5
Fix reference to return produced by ApplicativeDo (#6821)
* Fix reference to return produced by ApplicativeDo

see https://github.com/digital-asset/ghc/pull/53 for details.

fixes #6820

changelog_begin
changelog_end

* bump to merged commit

changelog_begin
changelog_end

* switch to new ghc-lib

changelog_begin
changelog_end
2020-07-22 10:09:23 +00:00
Remy
d538d9a53e
Bump test_sha in perf tests (#6816)
CHANGELOG_BEGIN
CHANGELOG_END
2020-07-21 16:56:01 +00:00
Robert Autenrieth
7ce9748066
Split sandbox code into separate packages (#6695)
* Move public code into daml-integration-api

CHANGELOG_BEGIN
[DAML Integration Kit]: Removed sandbox specific code from the API intended to be used by ledger integrations. Use the maven coordinates ``com.daml:participant-integration-api:VERSION`` instead of ``com.daml:ledger-api-server`` or ``com.daml:sandbox``.
CHANGELOG_END
2020-07-17 17:06:06 +02:00
Moritz Kiefer
147a2700c0
Bump Windows cache (#6770)
To “fix” the “output was not created” errors.

changelog_begin
changelog_end
2020-07-17 12:41:10 +02:00
Moritz Kiefer
52b9eabbcc
Revert "refactor ci jobs: add setvar to ci/lib.sh (#6708)" (#6732)
This reverts commit 61e9df3eaf.

This interacts very badly with the fact that we check out old commits
for releases. While we could fix it for this particular issue, I don’t
think this buys us enough to make this worth doing and it makes it
easy to introduce issues in the future if we modify lib.sh.

changelog_begin
changelog_end
2020-07-14 23:53:49 +02:00
Gary Verhaegen
61e9df3eaf
refactor ci jobs: add setvar to ci/lib.sh (#6708)
CHANGELOG_BEGIN
CHANGELOG_END
2020-07-13 17:34:54 +02:00
Moritz Kiefer
631ed3e891
Bump timeouts in compat tests (#6689)
This bumps the timeout of the compat tests on PRs to 360 minutes
matching other jobs on a PR (we mainly hit this if ghc-lib is rebuilt)
and the timeout on the daily jobs to 720 minutes (we hit this if
_everything_ is rebuilt).

I am slightly worried about the timeout on the daily job. After having
taken a look at it, there are a few reasons how we ended up here:

1. We started including more tests, e.g., sandbox-classic. Not much we
   can do here, those tests are useful.

2. We have a very large number of snapshots for 1.3.0. There are a few
   reasons for this:

   1. Timing: We branched off early for the 1.2.0 release so the first
      snapshot for 1.3 was on June 3rd. For 1.4 it looks like the first
      snapshot will be on July 15th so that’s roughly 2 extra
      snapshots just due to timing.

   2. Additional snapshots: We had one broken snapshot due to a broken
      VSCode extension that we didn’t delete (probably not worth doing
      at this point). We also had to backport to an old snapshot which
      resulted in another extra snapshot. We also had one extra
      snapshot which was supposed to be the RC but wasn’t since the
      ANF revert needed to go in.

   The only thing that is clearly useless is the one broken snapshot
   but that doesn’t change things that much. I see 2 orthogonal
   options for improving this assuming we agree that the current
   runtime is worryingly high.

   1. Prune snapshots more aggressively, e.g., only include the last 3
      snapshots. That’s a pretty arbitrary decision but it would
      enforce a hard limit.

   2. Reduce test combinations. E.g., only test snapshots vs stable
      releases but not snapshots vs snapshots.

3. We end up forcing a full build quite frequently. Here are just 2
   examples of how we’ve done that so far.

   1. Upgrade rules_haskell. Basically all tests are run by a Haskell
      binary so this forces a full rebuild.

   2. Change runfiles of `daml`.

I don’t think there is much we can do about 1 or 3 which leaves us
with 2. One not entirely unreasonable option is to just do nothing. We
did have periods where things went pretty smoothly for the most part
and each month we reset to a much smaller number of releases (we also
have to start throwing out old stable releases at some
point). Otherwise reducing the number of test combinations seems the
most promising option to me.

changelog_begin
changelog_end
2020-07-10 12:34:53 +00:00
Moritz Kiefer
6c0bbd3ba6
Bump test_sha in perf tests (#6649)
This changed by the revert of the ANF changes which is harmless by the
same reasoning that made bumping it harmless when we introduced it.

changelog_begin
changelog_end
2020-07-08 12:26:11 +00:00
Samir Talwar
89369b3bb9
CI: Increase the PostgreSQL connections from 100 to 200. (#6647)
We saw a flake recently where PostgreSQL stopped accepting connections
during a CI run, leading the build to fail. This increases the number of
connections to 200 from the default of 100, hopefully mitigating issues
such as this one.

CHANGELOG_BEGIN
CHANGELOG_END
2020-07-08 10:49:11 +00:00
Moritz Kiefer
ade99dd2c1
Reset windows cache (#6604)
We are seeing caching errors again.

changelog_begin
changelog_end
2020-07-03 16:36:35 +00:00
nickchapman-da
14ca4e5e79
bump-perf (#6553)
changelog_begin
changelog_end
2020-06-30 22:08:36 +00:00
Gary Verhaegen
8539873d84
document shared memory segment issue (#6546)
After discussion with @SamirTalwar-DA, we agree the CI script to clear
memory segments is a bit too dangerous to make it easy to run on
developer machines. Still, developers may run into similar issues if
they run lots of tests and/or do not reboot their laptop frequently.
On developer laptops, we usually spawn one PostgreSQL instance per
build/test that needs it (as opposed to CI where we create a single one
for the entire build; see `build.sh`), so they can actually build up
fairly quickly in some scenarios.

As an alternative, I have added a section to the README to cover what to
do if that issue happens.

CHANGELOG_BEGIN
CHANGELOG_END
2020-06-30 17:48:14 +02:00
Gary Verhaegen
beb33f2ab1
add explanation for clearing shared segments (#6545)
As requested on #6530.

CHANGELOG_BEGIN
CHANGELOG_END
2020-06-30 13:21:32 +00:00
Gary Verhaegen
55776f92ba
clear shared memory segment on macOS (#6530)
For a while now we've had errors along the lines of

```
FATAL:  could not create shared memory segment: No space left on device
DETAIL:  Failed system call was shmget(key=5432001, size=56, 03600).
HINT:  This error does *not* mean that you have run out of disk space.
It occurs either if all available shared memory IDs have been taken, in
which case you need to raise the SHMMNI parameter in your kernel, or
because the system's overall limit for shared memory has been reached.
        The PostgreSQL documentation contains more information about
shared memory configuration.
child process exited with exit code 1
```

on macOS CI nodes, which we were not able to reproduce locally. Today I
managed to, sort of by accident, and that allowed me to dig a bit
further.

The root cause seems to be that PostgreSQL, as run by Bazel, does not
always seem to properly unlink the shared memory segment it uses to
communicate with itself. On my machine, running:

```
bazel test -t- --runs_per_test=100 //ledger/sandbox:conformance-test-wall-clock-postgresql
```

and eyeballing the results of

```
watch ipcs -mcopt
```

I would say about one in three runs leaks its memory segment. After much
googling and some head scratching trying to figure out the C APIs for
managing shared memory segments on macOS, I kind of stumbled on a
reference to `ipcrm` in a comment to some low-ranking StackOverflow
answer. It looks like it's working very well on my machine, even if I
run it while a test (and therefore an instance of pg) is running. I
believe this is because the command does not actually remove the shared
memory segments, but simply marks them for removal once the last process
stops using it. (At least that's what the manpage describes.)

CHANGELOG_BEGIN
CHANGELOG_END
2020-06-30 01:40:16 +02:00
Remy
f5c65696f7
Update LF Perf test SHA (#6510)
CHANGELOG_BEGIN
CHANGELOG_END
2020-06-26 12:11:50 +00:00
Shayne Fletcher
4d896bc3bd
Update ghc-lib, da-ghc-master-8.8.1 (#6460)
changelog_begin
changelog_end
2020-06-23 08:29:16 -04:00
Gary Verhaegen
7d3dae4b1f
update perf-sha (#6457)
CHANGELOG_BEGIN
CHANGELOG_END
2020-06-22 18:46:19 +02:00
Gary Verhaegen
2923048935
remove purge_old_agents (#6439)
This script was supposed to remove old agents from the Azure Pipelines
UI. It may have been useful at some time (notably, when we used
ephemeral instances, they did not necessarily get to run their shutdown
script), but as it stands now, it's broken. The output from that step
ends in:

```
error: 2 derivations need to be built, but neither local builds ('--max-jobs') nor remote builds ('--builders') are enabled
```

after listing the nix packages it would build. Furthermore, it does not
seem to be useful as I have not seen any spurious entry in the agents
list on Azure since we switched to permanent nodes, on either the Linux
or Windows side (and this would only run on Linux, if it ran).

I'm also not convinced it ever ran, as I used to see a lot of spurious
machines on both Linux and Windows when we did use ephemeral instances.

CHANGELOG_BEGIN
CHANGELOG_END
2020-06-20 17:37:24 +02:00
Shayne Fletcher
cec2693dc7
enable -Wunused-matches (#6423)
changelog_begin
changelog_end
2020-06-19 19:35:10 +00:00
Remy
149bfc89ff
Update LF Perf test SHA (#6416)
CHANGELOG_BEGIN
CHANGELOG_END
2020-06-18 14:27:26 +00:00
Moritz Kiefer
2c1d4cb805
Fix nix installation (#6400)
Nix now requires -L, I’ve gone ahead and just normalized everything to
use -sfL which we were already using in one place.

changelog_begin
changelog_end
2020-06-18 10:34:08 +02:00
Moritz Kiefer
7e0a684857
Bump Windows cache (#6383)
changelog_begin
changelog_end
2020-06-17 19:33:26 +02:00
Moritz Kiefer
a178f62613
Fix packaging performance (#6350)
fixes #3150

This PR introduces a patch to GHC to fix the performance of the
pattern match checker in the presence of multiple packages which
is currently significantly (orders of magnitude) slower than having
everything in a single package. I also added a test case that hits
this. Here’s what you need to hit this issue:

1. A typeclass with a functional dependency. `HasField` is the obvious
   candidate for this.

2. A lot of instances of this typeclass in a separate package (this is
   the only part where the separate package matters).

3. A reasonably large ADT with a bunch of strict fields.

4. A pattern match in the context of some constraints of the
   typeclass. The constraints can be completely unused.

In that case, compile time degrades significantly as the number of
instances, constructors and constraints grows (I didn’t verify whether
the growth is linear, but it is significant, which is all that
matters).

Here’s why this happens:

1. The pattern match checker checks for strict fields if the type is
   inhabited.

2. This calls `pmTopNormaliseType_maybe` to normalize a type (the details don’t
   matter) which in turn calls into the typechecker. This function is
   called very often (presumably linear in the number of constructors
   but didn’t verify.)

3. The typechecker has some logic in `improveFromInstEnv` for
   generating additional equations by unifying functional
   dependencies `a -> b` with constraints in scope
   and thereby deducing information about `b`.

4. In the pattern match checker the list of instances of the home
   package is empty since the pattern match checker (apparently)
   doesn’t actually care about those extra equations. However, the
   list of instances in the EPS is not empty. This is the issue here:
   By moving it to an external package we suddenly end up with
   thousands of instances that we try to unify with the functional
   dependencies every time we normalize which happens very often.

Proposed fix:

The solution is rather simple: Since the pattern match checker
apparently does not care about the instances of the home package, it
almost certainly doesn’t care about instances in general so we just
empty the instances of external packages explicitly.

Is the fix correct?

1. I verified that the GHC test suite passes with this patch which
   gives me a reasonable level of confidence.

2. I verified that our own test suite passes.

3. The most dodgy part is actually emptying the instance since the
   whole EPS stuff is a mutable mess. What could in theory happen is that
   the PM ends up loading an interface file that mutates this
   again. However, afaiu it is impossible for the PM to need an
   interface that the typechecker didn’t already need. I did do a bunch
   of debugging and this is exactly what I observed in my experiments.

Alternative ideas and upstreaming:

The other option would be to not try and mess with the EPS but somehow
have a conditional flag somewhere in the typechecker env to disable
this logic in the pattern match checker. However, that sounds
significantly more complex so I don’t think it’s worth the effort.

GHC 8.10 has a new pattern match checker that has different
performance characteristics and seems to do much better here so there
is little reason to try and upstream this. I strongly want to avoid
upgrading DAML to 8.10 at this point (too much risk, let’s wait until
things calm down)

changelog_begin

- [DAML Compiler] Fix an issue where compilation slowed down
  significantly when code was split up into several packages. See
  https://github.com/digital-asset/daml/issues/3150

changelog_end
2020-06-16 15:12:34 +02:00
nickchapman-da
e19888d979
update for no stack-tracing in speedy perf (#6363)
2020-06-16 11:36:05 +00:00
Gary Verhaegen
1300644668
fix error message on daily compat failure (#6337)
When I changed the quoting for the success case as part of #6267, I
forgot to update the error case, so now we don't get well-formed JSON
for errors.

CHANGELOG_BEGIN
CHANGELOG_END
2020-06-14 22:52:57 +02:00
Andreas Herrmann
d1e422580a
Increment Windows cache URL (#6321)
We've seen a series of failures of the form
```
ERROR: D:/a/1/s/daml-assistant/integration-tests/BUILD.bazel:162:1: output 'daml-assistant/integration-tests/create-daml-app-tests.exe' was not created
ERROR: D:/a/1/s/daml-assistant/integration-tests/BUILD.bazel:162:1: not all outputs were created or valid
```
across multiple machines. We suspect cache poisoning as the cause. This
increments the cache URL to effectively clear the cache.

changelog_begin
changelog_end

Co-authored-by: Andreas Herrmann <andreas.herrmann@tweag.io>
2020-06-12 15:33:38 +02:00
Moritz Kiefer
7717574d00
Bump Windows cache (#6310)
We are seeing

ERROR: D:/a/2/s/compiler/scenario-service/protos/BUILD.bazel:67:1:
output
'compiler/scenario-service/protos/_obj/scenario_service_haskell_proto/ScenarioService.o'
was not created

again so following our experiments, let’s reset the cache to see if it
fixes anything.

changelog_begin
changelog_end
2020-06-11 16:26:31 +02:00
Gary Verhaegen
9c8c1fa909
slightly safer docs cron: fail instead of error (#6288)
See @cocreature's comment on #6285.

CHANGELOG_BEGIN
CHANGELOG_END
2020-06-10 19:18:14 +02:00
Gary Verhaegen
485069f017
fix docs cron for release notes (#6285)
Thinking about the upcoming release, I realized our current docs cron
has somehow lost the step of taking the release notes from the
triggering commit, probably in all the back-and-forth about which
release notes version to use to overwrite all the other ones.

This restores that, and adapts the algorithm for the new, multi-line
LATEST file format.

This _should_ work for all the current history, including releases made
on `release/*` branches and the unifying commit that turned the LATEST
file multiline (it adds more than one line so won't be matched as a
trigger commit).

CHANGELOG_BEGIN
CHANGELOG_END
2020-06-10 14:43:23 +02:00
Moritz Kiefer
20d26394e1
Modify the cache URL instead of relying on platform_suffix (#6273)
For some reason, platform_suffix doesn’t seem to provide enough
isolation to fix the “undeclared inclusion” errors even though it does
fix the issues for me locally.

This PR tries to address the problem by switching from
`platform_suffix` to modifying the actual URL of the cache.

To avoid leaking stuff from the local cache, I’ve added a clean
--expunge for now. We should be able to remove this once nodes have
been reset tomorrow. It will slow down nodes but that is clearly
better than having everything fail.

changelog_begin
changelog_end
2020-06-09 17:05:19 +02:00
Moritz Kiefer
aac1e16794
Fix caching on Linux and MacOS (#6270)
When bumping the cache url on Windows, I accidentally also changed the
URL we push to on Linux and MacOS. This is obviously a bad idea so
this PR fixes it.

changelog_begin
changelog_end
2020-06-09 08:08:06 +00:00
Gary Verhaegen
664df64e13
fix daily perf Slack notification (#6267)
This PR fixes the Slack notification on daily perf runs. It also updates
the perf sha.

CHANGELOG_BEGIN
CHANGELOG_END
2020-06-09 06:45:58 +00:00
Moritz Kiefer
1d3c8f3390
Bump cache suffix (#6265)
* Bump cache suffix

As discussed, we are going to bump this every time we feel like
resetting the cache might help. This is a temporary measure to get
some metrics on how often things break and if resetting the cache
helps.

changelog_begin
changelog_end

* Update configure-bazel as well

changelog_begin
changelog_end
2020-06-08 17:15:12 +02:00
Moritz Kiefer
f1822f6daa
Fix variable in daily slack notifications (#6221)
Currently the report fails with variables[Build.SourceBranchName]:
command not found which is obviously not what we want (it’s mixing up
the syntax in Azure’s yaml config and Bash). Looking at the
code in the tell-slack-failed.yml, this one does seem to work but I
haven’t tested this so :crossed-fingers:.

changelog_begin
changelog_end
2020-06-04 12:41:36 +02:00
Gary Verhaegen
2fe320fe48
automated ghc-lib build (#6188)
This PR aims at automating the build of ghc-lib. The current process
still has a few manual steps; it needs to be updated because Bintray is
going away, so this seemed like a good opportunity to fully automate it.

This works like the "patch bazel on Windows" jobs: the filename will
contain a hash of the `ci/da-ghc-lib` folder, and the job will run only
if the corresponding filename does not yet exist on the GCS bucket. PRs
aiming at changing the ghc-lib version will need to run twice: once to
create the artifacts, and once to change the `stack-snapshot.yaml` file
to match.
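
The "build only if the artifact is missing" rule can be sketched as follows. This is a minimal Python illustration, not the actual CI job: the file-name pattern, the function names and the in-memory set standing in for the bucket listing are all assumptions.

```python
def artifact_name(folder_hash):
    """Hypothetical naming scheme: embed the folder hash in the file name."""
    return f"ghc-lib-{folder_hash}.tar.gz"


def needs_build(folder_hash, existing_files):
    """The job runs only when no artifact for this hash has been uploaded yet."""
    return artifact_name(folder_hash) not in existing_files


bucket = {"ghc-lib-abc123.tar.gz"}      # stand-in for a bucket listing
print(needs_build("abc123", bucket))    # False: artifact already exists
print(needs_build("def456", bucket))    # True: folder changed, new hash
```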

CHANGELOG_BEGIN
CHANGELOG_END
2020-06-04 12:05:03 +02:00
Moritz Kiefer
b993339844
Include rules_haskell revision in platform suffix (#6209)
* Include rules_haskell revision in platform suffix

Hopefully this makes CI a bit less of a dumpsterfire. I’ve also
followed the comment and made the suffix actually 3 characters long
instead of 2 since that makes me worry less about collisions and
should hopefully still be short enough to not hit MAX_PATH.

changelog_begin
changelog_end

* Update ci/configure-bazel.sh

Co-authored-by: Gary Verhaegen <gary.verhaegen@digitalasset.com>

Co-authored-by: Gary Verhaegen <gary.verhaegen@digitalasset.com>
2020-06-03 21:33:37 +02:00
Gary Verhaegen
445f6467d9
daily run: warn on master only (#6177)
Currently the message to Slack is always triggered by running the daily
checks. This means that it gets very noisy to:

1. Run the check on PRs affecting the check (like this one),
2. Rerun the check multiple times to ascertain that a given failure is
   flaky.

With this PR, the message to Slack is replaced with a simple `echo` when
these checks are not run from the `master` branch, so whoever (manually)
triggered them can still get feedback on the result, but other people
don't get spurious `@here` mentions.

CHANGELOG_BEGIN
CHANGELOG_END
2020-06-03 16:36:05 +02:00
Moritz Kiefer
405f3ad6ee
Sort files when calculating CACHE_KEY (#6173)
* Sort files when calculating CACHE_KEY

The order returned by `find` is unspecified and seems to have changed
for whatever reason in some cases. This changed the cache key which is
obviously not intended. It looks like the one we currently have in our
scoop manifest is the one that we get by sorting. Reversing the sort
produces the one CI currently calculates.
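
To make the order dependence concrete, here is a small Python sketch of a content hash over a list of files. It is not the script used on CI (the real key is computed differently); the function name, the choice of SHA-256 and the sample paths are assumptions for illustration.

```python
import hashlib


def cache_key(entries, sort=True):
    """Hash (path, content) pairs; `entries` is in discovery order.

    With sort=True the key no longer depends on the order in which the
    files were discovered.
    """
    if sort:
        entries = sorted(entries)
    digest = hashlib.sha256()
    for path, content in entries:
        digest.update(path.encode("utf-8"))
        digest.update(b"\0")
        digest.update(content)
        digest.update(b"\0")
    return digest.hexdigest()


# Hypothetical files, listed in two different discovery orders.
one_order = [("a.patch", b"first"), ("b.patch", b"second")]
other_order = list(reversed(one_order))

print(cache_key(one_order, sort=False) == cache_key(other_order, sort=False))  # False
print(cache_key(one_order) == cache_key(other_order))                          # True
```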

changelog_begin
changelog_end

* update manifest to match CI output

Co-authored-by: Gary Verhaegen <gary.verhaegen@digitalasset.com>
2020-05-31 22:02:13 +02:00
Gary Verhaegen
90547e6ab4
build old docs with their release notes (#6128)
In light of #6127, I kept wondering why rebuilding 1.1.1 would fail. The
problem addressed by #6127 is that we tried to rebuild it, which we
shouldn't, but the reason I noticed it is because the build failed, and
there is no good reason for the 1.1.1 docs to not build anymore. Looking
at the logs confused me even more as it failed with (elided):

```
docs/source/support/new-assistant.rst:
WARNING: document isn't included in any toctree
```

and that change happened _after_ 1.1.1. So I went back to the code, and
discovered I somehow had gotten confused as I changed the approach
mid-way through editing the file. If we're overwriting the
`release-notes.html` file post-build, which we are now doing (and is the
reason for ignoring it when checking checksums), then we should not be
touching the `release-notes.rst` file pre-build.

CHANGELOG_BEGIN
CHANGELOG_END
2020-05-27 22:19:18 +00:00
Gary Verhaegen
e2d416e335
fix docs cron not ignoring release-notes (#6127)
The docs cron is supposed to ignore the release-notes.html page when
checking whether a docs folder is corrupted, because we manually
override it. However, that currently doesn't work, either because the
`sed` version we are using does not support changing the delimiters, or
because no version of `sed` does and I just imagined it.

CHANGELOG_BEGIN
CHANGELOG_END
2020-05-27 18:47:40 +02:00
Gary Verhaegen
ccb496ee0d
update perf test sha (#6125)
Changed by #6123, relevant part of the diff is:

```
           ledger.lookupGlobalContract(ParticipantView(committers.head),
effectiveAt, acoid) match {
-            case LookupOk(_, result) =>
+            case LookupOk(_, result, _) =>
               cachedContract = cachedContract + (step -> result)
```

which seems benign enough.

CHANGELOG_BEGIN
CHANGELOG_END
2020-05-27 15:10:11 +00:00
Gary Verhaegen
6e48abc793
update perf benchmark following #6080 (#6120)
This should be merged after #6080. This PR adds a patch (and
consequently updates the `ci/cron/perf/compare.sh` script) to apply the
same logical change as #6080 on top of the baseline commit, so our
performance comparison remains "apples to apples".

I am well aware that managing patches is not going to be a great way
forward. The rate of changes on the benchmark seems to be slow enough
that this is good enough for now, but should we change the benchmark
more often and/or want to add new benchmarks, a better approach would be
to handle the changes at the Scala level. That is:

- Create a "rest of the world" (world = Speedy, its compiler, and all of
  the associated types) interface that benchmarks would depend on,
  rather than depend directly on the rest of the codebase.
- Create two implementations of that interface, one that compiles
  against the current state of the world, and one that compiles against
  the baseline.
- Change the script to load the relevant implementation, and then run
  all the benchmarks as-is, with no patch necessary.

CHANGELOG_BEGIN
CHANGELOG_END
2020-05-27 13:34:08 +02:00
Gerolf Seitz
d55ebf08ec
Use Sandbox Classic as DAML on SQL (#6095)
CHANGELOG_BEGIN
CHANGELOG_END
2020-05-27 08:31:27 +02:00
Gary Verhaegen
9c7c8918a3
fix fatjar versions (#6091)
Version is taken from the env var (or defaulted to 0.0.0) at build-time.
Since those two packages are not built by default by Bazel, we need to
add the env var to the Bash step where they do get explicitly built.

Fixes #6090.

CHANGELOG_BEGIN
- sandbox and http-api fatjars will now display correct version number.
CHANGELOG_END
2020-05-25 15:59:23 +02:00
nickchapman-da
fb6cafa311
Bump the sha for CI perf (#6078)
changelog_begin
changelog_end
2020-05-22 16:24:18 +00:00
Moritz Kiefer
629ec732dd
Include puppeteer tests in compat tests (#6018)
* Include puppeteer tests in compat tests

This PR adds the puppeteer based tests to the compatibility
tests. This also means that they are now actually compatibility
tests. Before, we only tested the SDK side.

Apart from process management being a nightmare on Windows as usual,
there are two things that might stick out here:

1. I’ve replaced the `sh_binary` wrapper by a `cc_binary`. There is a
   lengthy comment explaining why. I think at the moment, we could
   actually get rid of the wrapper completely and add JAVA to path in
   the tests that need it but at least for now, I’d like to keep it
   until we are sure that we don’t need to add more to it (and then
   it’s also in the git history if we do need to resurrect it).
2. These tests are duplicated now similar to the `daml ledger *`
   tests. The reasoning here is different. They depend on the SDK
   tarball either way so performance wise there is no reason to keep
   them. However, we reference the other file in the docs which means
   we cannot change it freely. What we could do is to make this
   sufficiently flexible to handle both the `daml start` case and
   separate `daml sandbox`/`daml json-api` processes and then we can
   reference it in the docs. There is still added complexity for
   Windows but that’s necessary for users as well that want to run
   this on Windows so that seems unavoidable. (I should probably also
   remove my snarky comments 😇) I’d like to keep it duplicated
   for this PR and then we can clean it up afterwards.

changelog_begin
changelog_end

* Bump timeouts

changelog_begin
changelog_end
2020-05-22 14:02:59 +02:00
Gary Verhaegen
957a74c325
fix trailing newline in docs cron (#6053)
CI currently errors with:

```
Subprocess:
git checkout efe6545c2c
 -- docs/source/support/release-notes.rst
failed with exit code 127; output:
---

---
err:
---
Previous HEAD position was 2af134c... WIP: Draft version constraint
generation (#5472)
HEAD is now at efe6545... 1.2.0-snapshot.20200520.4224.0.2af134ca
(#6040)
/bin/sh: 2: --: not found

---

```

because the line

```
latest_release_notes_sha <- shell "git log -n1 --format=%H HEAD -- LATEST"
```

will assign a string that ends in a newline, and then when we try to
construct the shell command:

```
(shell_ $ "git checkout " <> latest_sha <> " -- docs/source/support/release-notes.rst")
```

we actually get two lines for Bash to execute.
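
The same effect is easy to reproduce outside the cron script. The Python sketch below uses a child process that prints a value followed by a newline as a stand-in for the `git log` call; the variable names are illustrative only.

```python
import subprocess
import sys

# Stand-in for `git log -n1 --format=%H HEAD -- LATEST`: prints a value plus newline.
raw = subprocess.run(
    [sys.executable, "-c", "print('efe6545c2c')"],
    capture_output=True,
    text=True,
    check=True,
).stdout

# Concatenating the raw output splits the command over two lines.
broken = "git checkout " + raw + " -- docs/source/support/release-notes.rst"
# Stripping the trailing newline keeps it on one line.
fixed = "git checkout " + raw.strip() + " -- docs/source/support/release-notes.rst"

print(len(broken.splitlines()))  # 2
print(len(fixed.splitlines()))   # 1
```

The second line of `broken` starts with ` --`, which is what the shell then tries to run as a command.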

CHANGELOG_BEGIN
CHANGELOG_END
2020-05-20 18:26:27 +02:00
Gary Verhaegen
94122ec561
fix docs cron (#6049)
Current version yields:

```
Subprocess:
git log -n1 --format=%H master -- LATEST
failed with exit code 128; output:
---
---
err:
---
fatal: bad revision 'master'
---
```

so apparently we can't trust a CI run on master to have a master branch
defined. `HEAD` should work, though.

CHANGELOG_BEGIN
CHANGELOG_END
2020-05-20 16:37:47 +02:00
Gary Verhaegen
fb6dc904a4
trigger all releases from master (#6016)
The 1.1.0 release went wrong and we had to trash it and release 1.1.1
instead. This is an attempt at identifying and correcting the root
cause behind that incident.

To understand the situation, we need to know how releases worked before
1.0. We had a one-line file called `LATEST` that specifies the git SHA and
version tag for the latest release. A change to that file triggered a
release with the specified release tag, built from the source tree of
the specified commit. The `LATEST` file looked something like:

```
f050da78c9 1.0.0-snapshot.20200411.3905.0.f050da78
```

To mark a release as stable, we would change it to look like this:

```
f050da78c9 1.0.0
```

i.e. simply drop the `-snapshot...` suffix. Even though the commit (and
thus the entire source tree we build from) is the same, we would need to
rebuild almost all of our release artifacts, as they embed the version
tag in various places and ways. That worked well as long as we could
assume we were doing trunk-based development, i.e. all releases would
always come from the same (`master`) branch.
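
For illustration, the one-line format and the "drop the suffix" step can be expressed in a few lines of Python. The function names are made up here; the actual release tooling is not shown in this message.

```python
def parse_latest_line(line):
    """Split a `<sha> <version>` line into its two fields."""
    sha, version = line.split()
    return sha, version


def stable_version(version):
    """Drop the `-snapshot...` suffix, if there is one."""
    return version.split("-snapshot", 1)[0]


sha, version = parse_latest_line("f050da78c9 1.0.0-snapshot.20200411.3905.0.f050da78")
print(sha, stable_version(version))  # f050da78c9 1.0.0
```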

When we released 1.0, and started work on 1.1, we had a few bug reports
for 1.0 that we decided should be resolved in a point release. We
decided that the best way to handle that would be to have a branch
starting on the release commit for 1.0, and then backport patches from
`master` to that branch. We adapted our build process to also watch the
`release/1.0.x` branch and, in particular, trigger a new release build if
the `LATEST` file in that branch changed. That worked well.

The plan going forward was to keep doing regular snapshot releases from
the `master` branch, and create support, point releases ("patch" releases
in semver) from dedicated branches.

On April 30, we made a snapshot release as an RC for 1.1.0, by changing
the `LATEST` file in the `master` branch. That release was built on commit
681c862d. On May 6, we decided to take a new snapshot as the RC for
1.1.0; we changed `LATEST` in `master` to designate 7e448d81 as the new
latest release.

On May 11, we noticed an issue that broke our builds. Without going into
details, an external artifact we depend on had changed in incompatible
ways. After fixing that on `master`, we reasoned that this would also
break the build of the final 1.1.0 release if we just tried to build
7e448d81 again. But as the target release date was May 13, we did not
want to take a new snapshot after that fix, as that would have included
one more week of work in the release, and given us no time to test it.

So we did what we did for the 1.0 branch, as it had worked well: we
created a branch that forked from `master` at commit 7e448d81 and called
it `release/1.1.x`, then cherry-picked the one fix to our build process to
work around the broken download. When the time came to make the final
1.1.0 build on May 13, we naturally picked the `LATEST` file from the
`release/1.1.x` branch and dropped the `-snapshot...` suffix. Importantly,
we did not need to update the target commit to include the "broken
download" fix as, in the meantime, the internet had fixed itself, and we
thus reasoned we should go for the exact code of the RC rather than
include an unnecessary, albeit seemingly harmless, change.

Everything went well with the release process. Tests went well too. Then
we got a report that an application that worked against the latest RC
broke with the final 1.1.0. The issue was that we had built the wrong
commit: by branching off at the point of the _target_ commit for the
latest snapshot, we did not have the change to the `LATEST` file that
designated that commit as the target. So the `LATEST` file in
`release/1.1.x` was still pointing to 681c862d.

I believe the root cause for this issue is the fact that we have
scattered our release process over multiple branches, meaning there is
no linear history of what was released and we are relying on people
being able to mentally manage multiple timelines. Therefore, I propose
to fix our release process so this should not happen again by
linearizing the release process, i.e. getting back to a situation where
all releases are made from a single branch, `master`.

Because we do want to be able to release _for_ multiple release branches
(to provide backports and bugfixes), we still need some way to
accommodate that. Having a single `LATEST` file in the same format as
before would not really work well: keeping track of interleaved release
streams on a single file would not really be easier than keeping track
of multiple branches.

My proposed solution is to instead have a multiline LATEST file, so that
all the release branch "tips" can be observed at the same time, and, as
long as we take care to only advance one release branch at a time, we
can easily keep track of each of them. This is what this PR does.

This required a few changes to our release process. Most notably:

- Obviously, as this is the main point of this PR, the build process has
  once again been restricted to only trigger new releases from the
  `master` branch.
- As our CI machinery cannot easily be made to produce multiple releases
  from a single build, the `check_for_release` step will only recognize
  a commit as a release trigger if it changes a single line in the
  `LATEST` file. This restriction comes in addition to the existing one
  that a release commit is only allowed to change either just the
  `LATEST` file or both the `LATEST` and
  `docs/source/support/release-notes.rst` files.
- The docs publication process has been changed to update _all_
  published versions to display the _latest_ release notes page. This
  means that the release notes page will always show you all published
  versions, regardless of which version of the documentation you're
  looking at. This also means that interleaving release notes correctly on
  that page is a manual exercise.
- As per the intention of the new process, the `LATEST` file has been
  updated to contain all existing post-1.0 stable releases. It should
  also include all existing snapshot releases should we have more than one
  at a time (say, should we discover an issue with 1.1.1 that required us
  to work on a 1.1.2).
- The `release.sh` script has been dramatically simplified as I felt it
  was trying to do too much and porting its existing functionality to a
  multi-line `LATEST` file would be too hard.
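
The trigger rule from the second bullet can be sketched in Python as below. This is one plausible reading of "changes a single line" (exactly one line in the new `LATEST` content that was not in the old one), not the actual `check_for_release` implementation; the function names and the second SHA are hypothetical.

```python
ALLOWED_FILES = {"LATEST", "docs/source/support/release-notes.rst"}


def changed_lines(old_latest, new_latest):
    """Lines present in the new LATEST content but not in the old one."""
    old = set(old_latest.splitlines())
    return [line for line in new_latest.splitlines() if line not in old]


def is_release_trigger(old_latest, new_latest, changed_files):
    """A commit triggers a release if it touches LATEST (and at most the
    release notes) and introduces exactly one new or modified LATEST line."""
    files = set(changed_files)
    touches_only_allowed = "LATEST" in files and files <= ALLOWED_FILES
    return touches_only_allowed and len(changed_lines(old_latest, new_latest)) == 1


old = "f050da78c9 1.0.0\n"
one_more = old + "aaaaaaaaaa 1.1.1\n"           # hypothetical SHA
print(is_release_trigger(old, one_more, ["LATEST"]))  # True
```

Under this reading, a commit that adds several lines at once, such as the one that turned the file multi-line, is not treated as a trigger.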

CHANGELOG_BEGIN
CHANGELOG_END
2020-05-19 19:18:10 +02:00
Gary Verhaegen
af939a7ee4
provisional beta deployment for daml-on-sql (#6024)
Note: this is beta-level software. See documentation for the precise
guarantees this does and does not come with. (Documentation does not
exist at the time of opening this PR, but should exist by the time the
first version of this gets published.)

CHANGELOG_BEGIN
- We now publish Sandbox Next as an **ALPHA** standalone jar.
- We now publish the HTTP JSON API as a standalone jar.
CHANGELOG_END
2020-05-19 18:11:26 +02:00
Moritz Kiefer
d7632c5b20
Include Bazel patch to cache exclusive tests on Windows (#6009)
* Include Bazel patch to cache exclusive tests on Windows

This includes the patch that we already use on Linux and MacOS to fix
caching of things marked exclusive. I’ve kept in the debugging output
that I added in the last patch for now. While our workaround seems to
be working, I’d like to wait a bit longer in case the issue reappears.

changelog_begin
changelog_end

* Actually bump manifest

changelog_begin
changelog_end
2020-05-18 16:01:29 +00:00
Moritz Kiefer
6142241719
Include sources directory in the Bazel cache key (#6001)
* Include sources directory in the Bazel cache key

This should hopefully fix the “undeclared inclusion” errors we have
been getting daily on CI

The details are in a comment but the short summary is that
the daily cron job is running in D:\a\1 whereas jobs on the same
machine afterwards run in D:\a\2. Because absolute paths leak in some
places, this fucks things up.

changelog_begin
changelog_end

* Add debugging output to the cache
2020-05-15 19:35:52 +00:00
Samir Talwar
57a8d0b37e
CI: Run PostgreSQL once for all Scala tests. (#5919)
2020-05-14 09:06:34 +02:00
Remy
b1c09ce1a0
Update perf tests SHA (#5971)
* update Perfs test sha

* changelog

CHANGELOG_BEGIN
CHANGELOG_END
2020-05-13 19:08:53 +00:00
Moritz Kiefer
577cd653d9
Add debugging output to inclusion errors (#5961)
* Add debugging output to inclusion errors

This adds some more debugging output to inclusion errors in Bazel
which should hopefully help us track it down. These are the only call sites
in the Bazel source that can produce them. My suspicion is that it’s
coming from HeaderDiscovery but I’m not entirely sure what is off.

We’ll almost certainly have to add more output once we know which of
those 3 cases we hit but let’s do it step by step.

changelog_begin
changelog_end

* Bump url and hash

changelog_begin
changelog_end
2020-05-13 18:18:17 +02:00
Moritz Kiefer
4916a28682
Include create-daml-app tests in compatibility tests (#5945)
This is the first part of #5700

It adds tests that build create-daml-app using `daml build` and then
run the codegen and build the UI. Contrary to our main tests these
also run on Windows. This is actually reasonably simple by first
building the typescript libraries on Linux and then downloading them
on Windows.

There are two parts that are still missing from the tests in the main
workspace:

1. Building the extra feature. This should be fairly easy to add.
2. Running the puppeteer tests. At least MacOS and Linux should be
   reasonably easy. I don’t know what horrors Windows will throw at
   us. This step is what actually makes this a compatibility
   test. Currently it doesn’t actually launch Sandbox and the JSON API.

Since this PR is already pretty large, I’d like to tackle those things
separately.

changelog_begin
changelog_end
2020-05-13 10:39:51 +02:00
Gary Verhaegen
bda565fa44
patching Bazel on Windows (infra bits, no patch yet) (#5918)
patch Bazel on Windows (ci setup)

We have a weird, intermittent bug on Windows where Bazel gets into a
broken state. To investigate, we need to patch Bazel to add more debug
output than present in the official distribution. This PR adds the basic
infrastructure we need to download the Bazel source code, apply a patch,
compile it, and make that binary available to the rest of the build.
This is for Windows only as we already have the ability to do similar
things on Linux and macOS through Nix.

This PR does not contain any interesting patch to Bazel, just the minimum
needed to check that we are actually using the patched version.

CHANGELOG_BEGIN
CHANGELOG_END
2020-05-12 23:16:04 +02:00
Gary Verhaegen
3899a59a11
switch back to hosted macOS nodes (#5935)
CHANGELOG_BEGIN
CHANGELOG_END
2020-05-11 22:59:33 +02:00
Gary Verhaegen
9b476416b8
switch back to Azure-provided macos nodes (#5920)
This is temporary. It looks like the macOS nodes are dead; @nycnewman is
looking into it, but in case he doesn't fix it in time, at least we
have a backup plan so we're not completely blocked on Monday.

CHANGELOG_BEGIN
CHANGELOG_END
2020-05-11 09:28:40 +02:00
Gary Verhaegen
4a6ab84b69
add default machine capability (#5912)
We semi-regularly need to do work that has the potential to disrupt a
machine's local cache, rendering it broken for other streams of work.
This can include upgrading nix, upgrading Bazel, debugging caching
issues, or anything related to Windows.

Right now we do not have any good solution for these situations. We can
either not do those streams of work, or we can proceed with them and
just accept that all other builds may get affected depending on which
machine they get assigned to. Debugging broken nodes is particularly
tricky as we do not have any way to force a build to run on a given
node.

This PR aims at providing a better alternative by (ab)using an Azure
Pipelines feature called
[capabilities](https://docs.microsoft.com/en-us/azure/devops/pipelines/agents/agents?view=azure-devops&tabs=browser#capabilities).
The idea behind capabilities is that you assign a set of tags to a
machine, and then a job can express its
[demands](https://docs.microsoft.com/en-us/azure/devops/pipelines/process/demands?view=azure-devops&tabs=yaml),
i.e. specify a set of tags machines need to have in order to run it.

Support for this is fairly badly documented. We can gather from the
documentation that a job can specify two things about a capability
(through its `demands`): that a given tag exists, and that a given tag
has an exact specified value. In particular, a job cannot specify that a
capability should _not_ be present, meaning we cannot rely on, say,
adding a "broken" tag to broken machines.

Documentation on how to set capabilities for an agent is basically
nonexistent, but [looking at the
code](https://github.com/microsoft/azure-pipelines-agent/blob/master/src/Microsoft.VisualStudio.Services.Agent/Capabilities/UserCapabilitiesProvider.cs)
indicates that they can be set by using a simple `key=value`-formatted
text file, provided we can find the right place to put this file.

This PR adds this file to our Linux, macOS and Windows node init scripts
to define an `assignment` capability and adds a demand for a `default`
value on each job. From then on, when we hit a case where we want a PR
to run on a specific node, and to prevent other PRs from running on that
node, we can manually override the capability from the Azure UI and
update the demand in the relevant YAML file in the PR.
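
The matching rules described above can be sketched as follows. This is a
minimal illustration in Python, not the Azure Pipelines implementation:
`parse_capabilities`, `satisfies` and the `debug-cache` value are
hypothetical names, and the file format is assumed to be plain
`key=value` lines, as inferred from the agent source.

```python
from typing import Dict, Optional


def parse_capabilities(text: str) -> Dict[str, str]:
    """Parse a key=value capabilities file (assumed format, one pair per line)."""
    caps: Dict[str, str] = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or "=" not in line:
            continue  # skip blank or malformed lines
        key, value = line.split("=", 1)
        caps[key.strip()] = value.strip()
    return caps


def satisfies(caps: Dict[str, str], demands: Dict[str, Optional[str]]) -> bool:
    """Check a job's demands against a node's capabilities.

    A demand of None means "the tag must exist"; a string means "the tag
    must have exactly this value". There is deliberately no way to say
    "the tag must be absent", which is why a "broken" tag would not help.
    """
    for key, wanted in demands.items():
        if key not in caps:
            return False
        if wanted is not None and caps[key] != wanted:
            return False
    return True


# Every node starts with the default assignment.
default_node = parse_capabilities("assignment=default\n")
# Hypothetical manual override for a node reserved for one PR.
reserved_node = parse_capabilities("assignment=debug-cache\n")

# Regular jobs demand the default value, so they skip the reserved node.
regular_job = {"assignment": "default"}
# The PR under investigation demands the overridden value instead.
debug_job = {"assignment": "debug-cache"}
```

With these definitions, `satisfies(reserved_node, regular_job)` is false
and `satisfies(reserved_node, debug_job)` is true, which is the isolation
this PR is after.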

CHANGELOG_BEGIN
CHANGELOG_END
2020-05-09 18:21:42 +02:00
Gary Verhaegen
11a2fc3c2d
more flexible perf test check (#5891)
This PR separates the "last known valid perf test" commit from the
"baseline speedy implementation" commit. For the perf measurement to
remain meaningful, the changes between those two commits must be
benign, say minor API adjustments.

This also adds a check on merging to master that tells Slack if the perf
test has changed and the `test_sha` file needs updating. The Slack
message is conditional on the current commit to avoid excessive noise.

CHANGELOG_BEGIN
CHANGELOG_END
2020-05-07 13:53:22 +02:00
Moritz Kiefer
67e33a1659
Move Bazel configuration before formatting (#5893)
Bazel configuration does two things: it modifies `.bazelrc.local` and
it writes a temp file.
I’ve run `git clean` on a PR. This caused the temp file to be
removed. However `.bazelrc.local` stayed since it is in
`.gitignore`. This meant that the next time the formatting check ran
the `.bazelrc.local` pointed to the temp file but the temp file was no
longer there.

changelog_begin
changelog_end
2020-05-07 11:21:02 +00:00
Martin Huschenbett
6642b1fc8b
Report speedup in daily perf report cron job (#5885)
Also track against both targets, 5x and 10x.

CHANGELOG_BEGIN
CHANGELOG_END
2020-05-07 10:52:33 +02:00
Moritz Kiefer
3bf2402d2c
Add /etc/nsswitch.conf to our Dockerfile (#5882)
As mentioned in the comment, this is required to get DNS requests to
work. This is more important than one might realize at first:

`daml start` tries to make an HTTP request to localhost:7500 to wait
for Navigator to start up. However, in our docker image, these
requests currently fail completely since they fail to resolve
`localhost`. This stops all following steps, in particular,
`init-script` and the JSON API from starting up.

changelog_begin
changelog_end
2020-05-07 09:44:44 +02:00
Gary Verhaegen
204c8b0657
add daily perf report (#5843)
This PR adds a simple daily job that runs the performance test on a
chosen "baseline" commit and then runs the same benchmark on latest
master. This should allow us to track overall performance improvements.

CHANGELOG_BEGIN
CHANGELOG_END
2020-05-06 13:50:35 +02:00
Andreas Herrmann
150e0366a4
Apply platform_suffix on all Windows pipelines (#5846)
* Apply platform_suffix on all Windows pipelines

To distinguish action keys between the compatibility and the main
workspace and avoid the "undeclared input(s)" error. We also modify the
main workspace's action cache keys to avoid poisoned cache items.

CHANGELOG_BEGIN
CHANGELOG_END

* Avoid exceeding MAX_PATH on Windows

Co-authored-by: Andreas Herrmann <andreas.herrmann@tweag.io>
2020-05-05 18:02:39 +00:00
Moritz Kiefer
b291e96ce1
Publish execution logs from Windows compatibility jobs (#5834)
Hopefully, this helps diagnose the Windows CI failures.

changelog_begin
changelog_end
2020-05-05 12:23:11 +02:00
Gary Verhaegen
ed13afc56a
fix overeager docs cron (#5797)
Currently the docs cron _always_ decides it has something new to
publish. This PR fixes that.

CHANGELOG_BEGIN
CHANGELOG_END
2020-04-30 15:31:38 +02:00
Moritz Kiefer
49e19ebed1
Make compat tests work on windows (#5732)
* Make compat tests work on windows

This required some changes to the daml_sdk rule since the read-only
installation by the assistant breaks Bazel completely. We could only
apply those changes on Windows but I think I prefer the consistency
across platforms here over trying to stay close to how the SDK is
installed on user machines given that the SDK installation is not
something we’ve had issues with.

I’ve excluded the postgresql tests for now. I don’t expect them to be
particularly hard to fix but I’ve already spent almost 2 days on this
and having some tests run on Windows seems like a clear improvement
over running no tests on Windows :)

changelog_begin
changelog_end

* Remove todo

changelog_begin
changelog_end
2020-04-28 16:06:36 +02:00
Gary Verhaegen
cfae3df7fa
report compat status every day (#5744)
I believe the compatibility check is important enough, and should fail
rarely enough, that it is worth reporting even on success. This will
mean (once we support Windows) 3 messages a day, sent while presumably
nobody is working, so the disruption should be minimal.

The issue with reporting only on failures is that, if we don't
proactively check (which we do for the state of master for different
reasons, but would likely not keep doing for a job that doesn't block
PRs), we may get into a state where it is so broken that it doesn't even
report.

CHANGELOG_BEGIN
CHANGELOG_END
2020-04-28 12:30:45 +02:00
Moritz Kiefer
d1db5c1c96
Set buffering of the docs cron job to LineBuffering (#5735)
I am hoping this will make the Azure output a bit more chatty. At the
moment, you don’t get any output until the job has finished which is a
bit annoying (although not really an issue).

changelog_begin
changelog_end
2020-04-27 15:29:08 +02:00
Gary Verhaegen
e344c6efa1
fix compatibility test (#5736)
CHANGELOG_BEGIN
CHANGELOG_END
2020-04-27 13:15:48 +00:00
Gary Verhaegen
7ceda5678a
run compatibility tests on macos (#5723)
This PR extends the existing Linux compatibility tests to run on macOS
too. Fixes #5692.

CHANGELOG_BEGIN
CHANGELOG_END

Co-authored-by: Moritz Kiefer <moritz.kiefer@purelyfunctional.org>
2020-04-27 14:55:16 +02:00
Moritz Kiefer
0d1f21e4a2
Extend compatibility tests to test against HEAD (#5714)
fixes #5691

changelog_begin
changelog_end
2020-04-24 14:43:35 +02:00
Moritz Kiefer
f61aadc422
Add dade-assist step to compat cron job (#5712)
changelog_begin
changelog_end
2020-04-24 11:25:14 +02:00
Gary Verhaegen
ec1a9326ea
fix daily-compat.yml (#5686)
For some reason the Azure Pipelines folks thought it was a good idea to
search for templates starting with the current YAML file's path. Other
steps (e.g. the bash script here) start from the root of the repo,
though.

CHANGELOG_BEGIN
CHANGELOG_END
2020-04-23 13:38:57 +02:00
Moritz Kiefer
7d36402412
Initial boilerplate for cross-version compatibility testing (#5665)
This is a first step towards testing cross-version
compatibility. It doesn’t actually do much yet but hopefully it should
be easier to parallelize once we have the initial boilerplate in place
so ideally I’d like to address most missing things and issues in
separate PRs.

changelog_begin
changelog_end
2020-04-23 12:58:11 +02:00
Gary Verhaegen
644d4c7512
add empty daily CI run (#5675)
This is meant to be filled once we have a better idea of what exactly we
want to test. See #5665 for current thinking about it.

CHANGELOG_BEGIN
CHANGELOG_END
2020-04-23 10:36:20 +02:00
Gary Verhaegen
5f9db11663
fix tell-slack-failed CI "function" (#5670)
And here I was, thinking our builds had gotten more stable as of late.

😢

CHANGELOG_BEGIN
CHANGELOG_END
2020-04-22 15:21:04 +02:00
Gary Verhaegen
88c389c17a
enable patch releases (fix) (#5634)
This is applying, on `master`, the same patch as #5605 applied on
`release/1.0.x`.

CHANGELOG_BEGIN
CHANGELOG_END
2020-04-20 17:01:08 +02:00
Moritz Kiefer
2ea4fbd850
Fix docs cron (#5612)
`--branches='*'` seems to include only local branches, not branches in
the remote. It looks like `--branches='*' --remotes='*'` would work but
`--all` seems simpler. I struggled to find any docs on this but this
matched what I got when testing locally.

changelog_begin
changelog_end
2020-04-17 20:59:49 +02:00
Gary Verhaegen
a1fab2d9af
enable patch releases (#5584)
This commit aims at enabling future patch releases; it is the
master-branch equivalent of #5569 (applied to the 1.0 release branch).

The only change between the two changelogs should be that this one also
changes the docs cron so it can find the trigger commits for patch
releases.

CHANGELOG_BEGIN
CHANGELOG_END
2020-04-16 17:50:55 +02:00
Gary Verhaegen
8261af86d2
correct commit title in Slack msg (#5471)
Currently, on a release commit on master, if the commit fails, we get
the message from the target PR, which is confusing. This should get us
the title of the release commit instead, which will hopefully be less
confusing. It's a bit hard to test, as it would require setting up a
release PR that succeeds but fails on master.

CHANGELOG_BEGIN
CHANGELOG_END
2020-04-08 13:01:42 +02:00
associahedron
696de17422
Notify Sofia on #team-daml-ci (#5487)
changelog_begin
changelog_end
2020-04-08 09:31:54 +00:00
Moritz Kiefer
b30882ef66
Disable format check on MacOS (#5477)
In https://github.com/digital-asset/daml/pull/5464 we removed the
hack for 0.13.55 but also removed the condition to only run this
on Linux, which I sadly missed during review. This PR adds it back.

changelog_begin
changelog_end
2020-04-07 21:53:06 +02:00
Remy
02648e17e2
notify Remy on #team-daml-ci (#5467)
* notify Remy on #team-daml-ci

* changelog

CHANGELOG_BEGIN
CHANGELOG_END

* Update ci/slack_user_ids

Co-Authored-By: Moritz Kiefer <moritz.kiefer@purelyfunctional.org>

Co-authored-by: Moritz Kiefer <moritz.kiefer@purelyfunctional.org>
2020-04-07 15:25:24 +00:00
Gary Verhaegen
9575f1bdb4
remove temp hack for 0.13.55 (#5464)
This PR reverts #5012.

CHANGELOG_BEGIN
CHANGELOG_END
2020-04-07 14:09:04 +02:00
Moritz Kiefer
ee04cbe2e2
Fix protobuf file name (#5440)
Calling a zip archive .tar.gz is rather confusing. The good news is
that apart from this, the release did work successfully this time so
not going to attempt another one until the proper 1.0 release.

changelog_begin
changelog_end
2020-04-04 22:18:58 +02:00
Moritz Kiefer
30f2c7421a
Only publish protobuf zip from Linux (#5438)
Last release attempt failed because when we tried to publish it from
macos it was already published from Linux.

Also includes a fix for the name of the zip.

changelog_begin
changelog_end
2020-04-04 17:58:21 +02:00
Moritz Kiefer
be3d8bc301
Publish protobuf zip to github releases (#5418)
I cannot test this without actually making a release but it is all
copy-pasted from other targets so hopefully it works.

changelog_begin
changelog_end
2020-04-03 15:36:41 +02:00
Gary Verhaegen
1872c668a5
replace DAML Authors with DA in copyright headers (#5228)
Change requested by Manoj.

CHANGELOG_BEGIN
CHANGELOG_END
2020-03-27 01:26:10 +01:00
Gary Verhaegen
eb857d6dd3
skip fmt check for release (#5012)
The formatting check used to be git-repo-dependent (see #4985), which is
preventing our candidate 0.13.55 from building.

This PR introduces a temporary hack to disable format checking on
release PRs & commits. It should be reverted once 0.13.55 is released.

CHANGELOG_BEGIN
CHANGELOG_END
2020-03-16 12:05:26 +01:00
Gary Verhaegen
fd185ed22e
publish prerelease documentation (#4976)
This PR changes the documentation release process to publish the
documentation for releases tagged "prerelease" on GitHub, while
discarding them when deciding on the latest version (the one that shows
on `/` on the docs site) and omitting them from the `versions.json` file
(meaning they do not appear on the dropdown).
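
The selection logic described above can be sketched roughly as follows.
This is an illustrative Python sketch rather than the actual docs cron
code: the `Release` type, the `plan_docs` function and the sample tags
are hypothetical, and the input is assumed to be ordered newest first.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Union


@dataclass
class Release:
    tag: str
    prerelease: bool  # the "prerelease" flag set on the GitHub release


def plan_docs(releases: List[Release]) -> Dict[str, Union[List[str], Optional[str]]]:
    """Decide what to publish; `releases` is assumed to be newest first."""
    # Documentation is published for every release, prerelease or not.
    publish = [r.tag for r in releases]
    # Prereleases are ignored when picking the version served at `/` ...
    stable = [r.tag for r in releases if not r.prerelease]
    latest = stable[0] if stable else None
    # ... and are left out of versions.json, so the dropdown omits them.
    return {"publish": publish, "latest": latest, "versions_json": stable}


# Hypothetical sample data.
plan = plan_docs([
    Release("1.1.0-rc", prerelease=True),
    Release("1.0.0", prerelease=False),
    Release("0.13.55", prerelease=False),
])
```

Here `plan["publish"]` contains all three tags, while `plan["latest"]`
and `plan["versions_json"]` only consider the two stable ones.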

This PR also does a bit of cleanup and bug fixing:
- The change in `nix` toolset name (#4724) needs to be protected by a
  version check, as we checkout older versions of the repo during docs
  build.
- The data types BlogSubmit and BlogId seem to have survived the "dead
  code detection" in #4956.
- The documentation build step had not been updated to pass down the
  correct version string (#4513).

CHANGELOG_BEGIN
CHANGELOG_END
2020-03-12 18:54:47 +00:00