Commit Graph

211 Commits

Author SHA1 Message Date
Gary Verhaegen
f0ca332983
fix release notification (#9075)
CHANGELOG_BEGIN
CHANGELOG_END
2021-03-10 17:12:14 +00:00
Gary Verhaegen
4bf05edf2b
tell people to speak up about snapshot blockers (#9068)
Suggested by @stefanobaghino-da; thanks!

CHANGELOG_BEGIN
CHANGELOG_END
2021-03-10 11:18:57 +00:00
Gary Verhaegen
caa023b72e
ci/cron/check: remove dade-assist calls (#9048)
* ci/cron/check: remove dade-assist calls

We can only run this in a context where the dev-env is already set up
anyway, as that's how we get Bazel to build the script in the first
place.

CHANGELOG_BEGIN
CHANGELOG_END

* remove skip_java logic
2021-03-08 13:55:19 +00:00
Gary Verhaegen
121534c54d
ci/cron/check: low-hanging perf improvement (#9042)
Two quick improvements I made while waiting on #9039:
- Avoid loading Java. When looking at the logs flow by this seemed to be
  taking a huge amount of time.
- Isolate the gcloud config files, which allows for running gcloud
  downloads in parallel.

Together these reduce the `check_releases` runtime from about 5 hours to
about 2. There's much more (and smarter) work needed on this, but this
was really easy to do.

CHANGELOG_BEGIN
CHANGELOG_END
2021-03-08 07:56:33 +00:00
Gary Verhaegen
4fd42a6772
reduce noise on daily tests (#9039)
Getting a separate message per test was fine when there was one, but
this kind of got our of hand at this point.

CHANGELOG_BEGIN
CHANGELOG_END
2021-03-05 22:50:23 +00:00
Gary Verhaegen
2688ad6f0d
autorelease: improved PR message (#9008)
Thanks to @stefanobaghino-da for the suggestion.

CHANGELOG_BEGIN
CHANGELOG_END
2021-03-03 21:16:23 +00:00
Moritz Kiefer
4001f5c9bd
Conduitify download in release file (#8988)
* Conduitify download in release file

Apologies, my ocd kicked in when I saw this in another PR.

changelog_begin
changelog_end

* fixup deps

changelog_begin
changelog_end

* i should stop working

changelog_begin
changelog_end
2021-03-03 08:49:41 +01:00
Gary Verhaegen
9da40ea426
ci/cron: fix retry policy (#8985)
I think the retry is clobbering the files. Here is my theory:
- The HTTP request is lazy, i.e. it starts producing a byte stream
  before it has finished downloading.
- The connection somehow crashes in the middle of that lazy handling,
  possibly because the Haskell code blocks for too long on something
  else and GCP thus closes the connection. (If this is true, making sure
  we download the entire thing before we start writing may make the
  download more reliable.) This explains why we get a "resource vanished"
  and not a plain 404 to start with.
- The retry policy doesn't know anything about HTTP requests; it just
  sees an IO action throwing an exception and restarts the whole thing.
- Because the IO action opens the file in Append mode, we thus end up
  with a file that is too big and has its "starting bytes" multiple
  times. That obviously fails to sign-check.

If this is what happens then the retry does not help at all, which does
seem to be what we've been observing (though I haven't tracked the exact
error rate too closely). The fix would likely be as simple as changing
`IO.AppendMode to IO.WriteMode (which truncates, per [documentation]).

[documentation]: https://hackage.haskell.org/package/base-4.14.1.0/docs/System-IO.html

CHANGELOG_BEGIN
CHANGELOG_END
2021-03-02 13:51:00 +00:00
Gary Verhaegen
634b38a92a
run PR builds on NOTICES updates (#8931)
CHANGELOG_BEGIN
CHANGELOG_END
2021-02-24 13:36:51 +00:00
Moritz Kiefer
8e8284c907
Retry on HttpException in release check (#8909)
ConnectionTimeout gets thrown as HttpException which seems very
sensible to retry.

changelog_begin
changelog_end
2021-02-22 17:37:08 +01:00
Moritz Kiefer
c309ef915c
Stream stdout of verify_signatures (#8911)
There is really no reason to first capture this to a String and then
putStrLn it. That only causes issues since if we crash with a non-zero
exitcode we won’t output anything.

changelog_begin
changelog_end
2021-02-22 16:03:25 +00:00
Gary Verhaegen
eb14dcf6e9
bump check_releases timeout (#8812)
It's become quite flaky at just 4h.

Yes, I do plan to actually fix it, but it may take a while.

CHANGELOG_BEGIN
CHANGELOG_END
2021-02-10 22:45:25 +00:00
Brian Healey
dd73af06fb
replace deprecated blackduck property (#8811)
CHANGELOG_BEGIN
CHANGELOG_END
2021-02-10 19:54:31 +00:00
Moritz Kiefer
773ddada04
Upgrade nixpkgs (#8190)
* Upgrade nixpkgs

changelog_begin
changelog_end

* .

changelog_begin
changelog_end

* .

changelog_begin
changelog_end

* No ibazel

changelog_begin
changelog_end

* .

changelog_begin
changelog_end

* Switch agent pool

changelog_begin
changelog_end

* .

* .

changelog_begin
changelog_end
2021-02-08 11:12:07 +00:00
Gary Verhaegen
29c2577fa2
fix perf_speedy report (escaping ") (#8655)
The report currently chokes on quotes in the commit message (see
550aa48fc5). Rather than trying to
correctly excape things in Bash, this PR delegates the quote handling to
jq, because having to deal with Bash embedded in YAML is hard enough.

CHANGELOG_BEGIN
CHANGELOG_END
2021-02-03 19:47:27 +01:00
Moritz Kiefer
d7d954335e
Retry asset downloads in check_releases (#8730)
We’ve seen a number of "resource vanished (connection reset by peer)"
errors. Slapping some retries on that should hopefully make the CI job
a bit more robust.

changelog_begin
changelog_end
2021-02-03 12:04:59 +01:00
Moritz Kiefer
c89e00342d
Clean broken entries from the Bazel cache (#8668)
* Clean broken entries from the Bazel cache

This is hopefully a somewhat reasonable workaround for the "output not
created" errors that keep annoying us.

For now, this is just part of the hourly cronjob but we could move it
somewhere else if desired.

changelog_begin
changelog_end

* Fix GCS credentials

changelog_begin
changelog_end
2021-01-28 17:57:09 +00:00
Gary Verhaegen
fef712bf60
Upgrade linux nodes to 20.04 (#8617)
CHANGELOG_BEGIN

- Our Linux binaries are now built on Ubuntu 20.04 instead of 16.04. We
  do not expect any user-level impact, but please reach out if you
  do notice any issue that might be caused by this.

CHANGELOG_END
2021-01-27 17:38:34 +01:00
Gary Verhaegen
fc83f7088d
limit blackduck scans to main (#8647)
CHANGELOG_BEGIN
CHANGELOG_END
2021-01-27 17:29:27 +01:00
Moritz Kiefer
e0c5abd647
Switch to GHC 8.10.3 (#8394)
* Switch to GHC 8.10.3

changelog_begin
changelog_end

* Update bazel-haskell-deps.bzl

Co-authored-by: Andreas Herrmann <42969706+aherrmann-da@users.noreply.github.com>

* Comment on rules_haskell patch

changelog_begin
changelog_end

* .

changelog_begin
changelog_end

Co-authored-by: Andreas Herrmann <42969706+aherrmann-da@users.noreply.github.com>
2021-01-25 11:53:53 +00:00
Gary Verhaegen
532f996f12
ci: clean-up hard drive more often (#8582)
CHANGELOG_BEGIN
CHANGELOG_END
2021-01-20 17:22:53 +01:00
Gary Verhaegen
36175c0a77
ci/cron: remove hidden.json (#8525)
I discovered yesterday that the `snapshots.json` (and actually also the
`versions.json`) file is no longer purely internal to the docs process,
as it was meant to be, but is now depended upon by the assistant. This
means the renaming from `snapshots.json` to `hidden.json` cannot happen,
and we reverted that yesterday in #8513 (& #8514), though that was done
in a bit of a hurry. This PR aims at cleaning up the resulting mess and
achieve a better long-term end state.

I will be manually removing the `hidden.json` file as soon as this is
merged, so that nothing ends up depending on _that_. There is no
occurrence of `hidden.json` outside this docs cron so hopefully this
works out.

CHANGELOG_BEGIN
CHANGELOG_END
2021-01-15 16:04:54 +00:00
Gary Verhaegen
7f076bcd56
docs cron: update meta files unconditionally (#8514)
CHANGELOG_BEGIN
CHANGELOG_END
2021-01-14 20:04:48 +01:00
Moritz Kiefer
dfc48a0010
Update both hidden.json and snapshots.json (#8513)
* Update both hidden.json and snapshots.json

The assistant relies on the latter, our docs cronjob on the former. I
have no idea why we have two but keeping them in sync should be fine.

changelog_begin
changelog_end

* maybe I should test if my code compiles before pushing

changelog_begin
changelog_end
2021-01-14 19:07:35 +01:00
Gary Verhaegen
742feb58be
trigger PRs job on generated PRs (#8489)
CHANGELOG_BEGIN
CHANGELOG_END
2021-01-13 11:30:17 +00:00
Gary Verhaegen
ce96802a6d
only update NOTICES from main daily compats (#8473)
See #8468 for failure mode of current config.

CHANGELOG_BEGIN
CHANGELOG_END
2021-01-12 12:10:27 +00:00
Moritz Kiefer
b83c4e945d
Bump perf test for scalafmt update (#8444)
changelog_begin
changelog_end
2021-01-09 13:12:54 +01:00
Gary Verhaegen
7f7e4f9fdc
fix docs cron (#8412)
Typo introduced in #8384.

CHANGELOG_BEGIN
CHANGELOG_END
2021-01-06 13:25:52 +00:00
Gary Verhaegen
f45a065c54
ci: pin down Ubuntu versions (#8388)
For a couple weeks now there has been a warning on the Azure Pipelines
web UI that says `ubuntu-latest` is in the process of switching from
18.04 to 20.04. I am not aware of any specific issue this would cause
for our particular workflows, but I don't like my dependencies changing
from under me.

CHANGELOG_BEGIN
CHANGELOG_END
2021-01-05 10:39:59 +01:00
Gary Verhaegen
eb9aba680e
ci/cron: remove temp hack (#8385)
CHANGELOG_BEGIN
CHANGELOG_END
2021-01-04 19:06:30 +00:00
Gary Verhaegen
b506393dc8
ci/cron: use proc instead of shell (#8384)
As suggested by @cocreature in #8382.

CHANGELOG_BEGIN
CHANGELOG_END
2021-01-04 19:47:13 +01:00
Gary Verhaegen
2c497a3e75
fix typo in docs cron (#8382)
Introduced in #8372.

CHANGELOG_BEGIN
CHANGELOG_END
2021-01-04 18:32:57 +01:00
Gary Verhaegen
2083f74cf2
add docs.daml.com/latest (#8372)
This commit changes the docs cron to create a new file
`docs.daml.com/latest`, a simple text file containing the version number
of the latest released version. This is done in response to #8354, to
avoid having to replicate the logic for which version is the latest
across this, the assistant and the get-daml.sh script.

This commit also cleans up a small transitional FIXME in the docs cron
regarding the transition from `snapshots.json` to `hidden.json`.

For the solution to #8354 to be complete, we'll also need to update
get-daml.sh and the assistant to use the new latest file. Note that this
file would only be published on the next stable release, so this commit
also includes a temporary hack to re-generate it (and `versions.json`
and `hidden.json`) unconditionally on every run; this can be removed as
soon as this has run once.

CHANGELOG_BEGIN
CHANGELOG_END
2021-01-04 15:14:42 +01:00
Gary Verhaegen
b45e37991f
update test sha for copyright update (#8361)
CHANGELOG_BEGIN
CHANGELOG_END
2021-01-01 22:16:37 +00:00
Gary Verhaegen
a925f0174c
update copyright notices for 2021 (#8257)
* update copyright notices for 2021

To be merged on 2021-01-01.

CHANGELOG_BEGIN
CHANGELOG_END

* patch-bazel-windows & da-ghc-lib
2021-01-01 19:49:51 +01:00
Gary Verhaegen
93f449d245
rename master to main (#8245)
As we strive for more inclusiveness, we are becoming less comfortable
with historically-charged terms being used in our everyday work.

This is targeted for merge on Dec 26, _after_ the necessary
corresponding changes at both the GitHub and Azure Pipelines levels.

CHANGELOG_BEGIN

- DAML Connect development is now conducted from the `main` branch,
  rather than the `master` one. If you had any dependency on the
  digital-asset/daml repository, you will need to update this parameter.

CHANGELOG_END
2020-12-27 14:19:07 +01:00
Moritz Kiefer
f3e481525c
Fix compare.sh again (#8346)
Turns out bash is hard and I’m stupid :sadpanda:

We need to write thoutput to stderr, otherwise this ends up in the
JSON output which obviously is not valid JSON.

changelog_begin
changelog_end
2020-12-18 20:59:38 +00:00
Moritz Kiefer
0391b867ee
Fix compare.sh in perf test (#8344)
We changed the patch to target more than one file which made the
checkout insufficient for restoring the state and then the following
git checkout of current fails with:

```
error: Your local changes to the following files would be overwritten by checkout:
        stack-snapshot.yaml
```

A git reset --hard should make sure everything gets reset.

changelog_begin
changelog_end
2020-12-18 20:59:22 +01:00
Gary Verhaegen
e51838efff
fail on failed perf check (#8334)
Today the [perf check failed], but we got no notification of it. I'm not
sure what's happening as I can't reproduce any of it locally: not only
does the `bazel run` command work for me (despite the ghc-lib URL
returning a 404 when I try it manually), I also can't reproduce the fact
that Bash, on CI, doesn't seem to fail on either the `bazel run` error
or the fact that on the next line `cat` tries to access a file that
doesn't exist (for which CI does print the error message).

This PR does two things:

- Add an explicit check that _should_ get Bash to actually fail should
  this happen again in the future. It is not a great fix but at least
  we'll know if it happens again (to the best of my knowledge today was
  the first time we hit this).
- Amend the existing patch we apply on the baseline commit to use the
  GCS-hosted ghc-lib packages.

CHANGELOG_BEGIN
CHANGELOG_END

[perf check failed]: https://dev.azure.com/digitalasset/daml/_build/results?buildId=64395&view=results
2020-12-18 13:42:54 +01:00
Remy
3783a158ff
Bump perf test (#8269)
Needs bumping due to the API changes for GenNode (#8217)

changelog_begin
changelog_end
2020-12-11 17:37:44 +00:00
Brian Healey
6a1e0a633b
Avoid haskell and jvm bazel blackduck scan stomping (#8247)
* Run bazel scan for all but haskell first, then haskell at end so they do not stomp on each other

CHANGELOG_BEGIN
CHANGELOG_END

Signed-off-by: Brian Healey <brian.healey@digitalasset.com>

* use common prefix between runs so results are aggregated, get branch name from running job rather than assuming master

* DO NOT MERGE: disable all jobs except for blackduck scan

* haskell before full scan

* parens instead of braces

* differentiate haskell and jvm bazel runs with unique code location

* unique code location prefix

* fix syntax

* unique code location to avoid clashing bazel runs

* Use master security-blackduck script helper

* reenable all jobs to make mergeable

* cleanup whitespace

* Use Build.SourceBranchName for branch

* Update ci/cron/daily-compat.yml

Co-authored-by: Gary Verhaegen <gary.verhaegen@digitalasset.com>

* DO NOT MERGE: skip all jobs but blackduck, skip update notices file step

* reenable all jobs to make mergeable

Co-authored-by: Gary Verhaegen <gary.verhaegen@digitalasset.com>
2020-12-11 07:56:05 -05:00
Moritz Kiefer
75d28d1242
Bump perf test (#8256)
Needs bumping due to the minor API changes for multi-party submissions

changelog_begin
changelog_end
2020-12-11 12:29:49 +00:00
Gary Verhaegen
029c655adc
blackduck: open PR on NOTICES file change (#8215)
CHANGELOG_BEGIN
CHANGELOG_END
2020-12-10 10:08:28 +01:00
Moritz Kiefer
ec0fcb39f1
Fix docs cron again (#8208)
Unfortunately, I missed the fact that we had our own logic for
handling process failures which resulted in uncatchable
exceptions. I’ve changed one place to use the upstream handling and
the other to call `fail` which throws an IOException like I would have
expected.

changelog_begin
changelog_end
2020-12-08 22:06:22 +01:00
Moritz Kiefer
0c7791bc1c
Fix docs cron (#8207)
The change in #8191 to publish daml on sql docs failed because the
versions.json and snapshots.json files don’t exist initially. This PR
fixes that by catching the exception and treating it as an empty file.

changelog_begin
changelog_end
2020-12-08 21:32:33 +01:00
Moritz Kiefer
880d290285
Publish DAML on SQL Docs (#8191)
changelog_begin
changelog_end
2020-12-08 13:49:37 +01:00
Brian Healey
ca294eb14d
add blackduck scan to run on master (#6130) (#8161)
* add blackduck scan to run on master (#6130)

* add blackduck scan

* disable go scanning
exclude entire language-support/ts directory for node scanning
break to multiple lines to make command line params easier to parse

* Increase timeout for blackduck binary scan

* update blackduck scan config

* remove some exclusions, force python3

* exclude GO until path to go executable can be resolved

* added readme explanation of why we want this file

* fail in case of policy violation

* ensure haskell bazel scan completes before running second round scan for bazel jvm and node and other langs

* trigger notices file gen to ensure BOM complete

* remove trailing end of lines

* run with latest detect version and unique code location name changes to wrapper script

* Add blackduck to daily compat job

* DO NOT MERGE: condition false to disable other jobs for testing

* remove parameters not available to cronjob

* Revert changes to regular CI pipeline

CHANGELOG_BEGIN
CHANGELOG_END

Signed-off-by: Brian Healey <brian.healey@digitalasset.com>

* Do not get branch name from variable

* Upgrade com.fasterxml.jackson.core:jackson-databind to 2.12.0 to address security vulnerability

* Remove disabling of other jobs, set to branch to be used on prod runs

* Apply suggestions from code review

Co-authored-by: Gary Verhaegen <gary.verhaegen@digitalasset.com>

* Address code review comments

* Updated NOTICES file

* Run bazel build, update NOTICES file

* Correct dade-assist

* do not have perms to pipe to dev/null

* Add md file explaining how to update NOTICES file

* Add instructions for running blackduck locally

* Add a link to full security-blackduck readme

Co-authored-by: Gary Verhaegen <gary.verhaegen@digitalasset.com>
2020-12-07 19:59:39 +00:00
Gary Verhaegen
485dd1b597
perf_json_http: compress results (#8123)
I wanted to suggest that on the PR but caught it after it was merged. So
I made a note of it, which promptly got lost. As the end of the year
approaches I've started trying to clean up a little.

CHANGELOG_BEGIN
CHANGELOG_END
2020-12-01 17:44:43 +01:00
Gary Verhaegen
3cef53135c
ci/cron: do not push artifacts to gcs bucket (#8067)
Having the cron push artifacts to GCP was really only meant to happen
once. I got distracted and worked on other things. This PR closes that
work loop such that the current state and expectations are:

- Every new release pushes to GCP as part of the release process.
- The cron only checks that the GCP backup exists and matches, but does
  not push if it doesn't.

The reason for this is we want the cron job to fail if there are
additional, unexpected files in a release, rather than automatically
commit those files for the long term.

CHANGELOG_BEGIN
CHANGELOG_END
2020-11-25 19:12:03 +01:00
Gary Verhaegen
b23304c691
add default capability to macos (#5915)
This is the macOS part of #5912, which I have separated because our
macOS nodes have a different deployment process so it seemed easier to
track the deployment of the change separately.

CHANGELOG_BEGIN
CHANGELOG_END
2020-11-25 15:34:33 +01:00
Remy
cb5b439e39
Bump test_sha (#7512) (#7860)
This changed in #7858 but it’s a harmless change.

CHANGELOG_BEGIN
CHANGELOG_END
2020-11-02 20:18:32 +00:00
Gary Verhaegen
cdf6160c76
ci/cron/check: use whileJust_ for recursion (#7747)
As suggested in #7746.

CHANGELOG_BEGIN
CHANGELOG_END
2020-10-20 16:03:46 +02:00
Gary Verhaegen
dd01dbc5a4
ci/cron: change github contact email (#7741)
GitHub requests that when we use the API without a token, which is what
we do here, we use a user-agent header that allows them to contact us in
case there is an issue they need to discuss. This PR updates that
address, and cleans-up a bit of duplication around it.

CHANGELOG_BEGIN
CHANGELOG_END
2020-10-20 12:08:31 +02:00
Gary Verhaegen
7eb4b352dd
ci/cron/check: replace wget with Haskell code (#7731)
As promised in #7696.

CHANGELOG_BEGIN
CHANGELOG_END
2020-10-19 19:07:09 +02:00
Gary Verhaegen
e1b26d27ad
ci/cron/check: limit simultaneous downloads (#7703)
As requested in review of #7696.

CHANGELOG_BEGIN
CHANGELOG_END
2020-10-16 11:37:50 +02:00
Gary Verhaegen
305a097a93
ci/cron: move concurrency from Bash to Haskell (#7696)
One small step further in removing the Bash.

CHANGELOG_BEGIN
CHANGELOG_END
2020-10-15 14:00:55 +02:00
Gary Verhaegen
ad79acdb65
ci/cron: ability to run check locally (#7690)
This PR allows the script to run without GCP credentials. It will
obviously then skip the bits that require GCP credentials, but that
still leaves it with plenty of things to do.

Because checking all releases can still be quite long (around an hour on
CI, and my personal connection is a bit slower), this also introduces a
new parameter that restricts the number of releases to test.

CHANGELOG_BEGIN
CHANGELOG_END
2020-10-15 12:18:45 +02:00
Moritz Kiefer
80a25bf54a
Bump perf sha (#7679)
The change from #7666 is benign so we can simply bump this.

changelog_begin
changelog_end
2020-10-14 11:38:49 +00:00
Gary Verhaegen
8e905b34c0
ci/bash-lib: fix gcs return code (#7630)
Currently the return code of the function is the return code of the
`eval "$restore_trap"` line, whereas semantically we want the return
code of the `gsutil` call. This is not an issue in most cases as the
`set -e` should kick in, but if the function appears as the condition in
an `if` statement the `-e` flag is suspended.

The main use-case right now is that the daily license check is _not_
uploading artifacts.

CHANGELOG_BEGIN
CHANGELOG_END
2020-10-11 19:07:22 +02:00
Gary Verhaegen
42b7fa5ab9
ci/cron: fix gcs path (#7626)
Change the path used to push to the backup gcs bucket to match what is
put by the release script. This needs to get merged before we run the
next daily.

CHANGELOG_BEGIN
CHANGELOG_END
2020-10-09 15:31:52 +02:00
Gary Verhaegen
cd427dc2d2
ci/cron: upload artifact to daml-data if missing (#7616)
If we don't already have a copy of an artifact in our "disaster
recovery" storage box, put one.

Note: as implemented, this upload mechanism happens only if the release
was successfully verified signature-wise, so this should not result in
us saving broken artifacts. Also, CI does not have deletion or overwrite
access to this bucket, so overall this should be pretty safe.

CHANGELOG_BEGIN
CHANGELOG_END
2020-10-09 14:55:37 +02:00
Gary Verhaegen
8c1fbf6225
ci/bash_lib: generalize save_gcp_data (#7599)
This PR extends the existing `save_gcp_data` function to handle any
`gsutil` command. This is done to support existence checking using
`gsutil ls` for private artifacts in release checking (`ci/cron`).

CHANGELOG_BEGIN
CHANGELOG_END
2020-10-08 18:37:14 +02:00
Gary Verhaegen
19d7086a21
fix ci/cron: remove references to LOG (#7607)
CHANGELOG_BEGIN
CHANGELOG_END
2020-10-08 18:15:25 +02:00
Gary Verhaegen
19c658ae15
ci/cron: move temp file handling to Haskell (#7596)
CHANGELOG_BEGIN
CHANGELOG_END
2020-10-07 17:45:00 +02:00
Gary Verhaegen
c238985bf9
ci/cron: actually fail on invalid signature (#7592)
At the moment, because the signature check appears in a `if` statement,
failed signatures do not actually fail the script and would thus still
result in "success" messages to Slack.

CHANGELOG_BEGIN
CHANGELOG_END
2020-10-07 13:12:13 +02:00
Gary Verhaegen
5799557570
ci/cron: check all releases (#7586)
This walks through the paginated GH API to fetch all releases and check
their signatures.

CHANGELOG_BEGIN
CHANGELOG_END
2020-10-07 12:04:08 +02:00
Gary Verhaegen
4db8c3ada1
ci/cron: move Bash cron inside Haskell script (#7585)
Yes, this is how I write Haskell. I'm told it's an improvement over
Bash.

Jokes aside, plan is to chip away at the Bash script, starting with
replacing the outermost loop with a proper "get _all_ releases" call
from Haskell, but I like keeping things working in small steps, and even
long-term I have no desire to reimplement the gpg signature checking
code in Haskell.

I have tested that things still work (on my machine); the only
difference is that we now only get the full output all at once at the
end, rather than one signature at a time. I don't think anyone is
looking at the output in real-time, so this should not be a huge issue.

CHANGELOG_BEGIN
CHANGELOG_END
2020-10-06 18:45:29 +02:00
Gary Verhaegen
21374713d0
ci/cron: add opt-parse applicative (#7583)
As requested in [previous PR].

[previous PR]: https://github.com/digital-asset/daml/pull/7569#discussion_r499733636

CHANGELOG_BEGIN
CHANGELOG_END
2020-10-06 17:14:27 +02:00
Gary Verhaegen
67746b7710
ci/cron: add arg to select docs (#7569)
This is a preparatory step for moving at least some of the logic of
checking signatures to this script. The reasoning for putting signatures
in the same script basically boils down to "it already has GitHub
pagination".

I also removed the `run.sh` wrapper because it did not add anything
anymore. It used to be useful, but across various changes it's sort of
lost its purpose.

CHANGELOG_BEGIN
CHANGELOG_END
2020-10-05 19:35:09 +02:00
Gary Verhaegen
eb6b2ce1c6
ci/cron: small cleanup (#7570)
Small improvements I noticed could be made while working on #7569, in a
separate PR because they're quite unrelated.

CHANGELOG_BEGIN
CHANGELOG_END
2020-10-05 19:33:48 +02:00
Gary Verhaegen
2973228f77
signature check: report to Slack (#7568)
If a tree falls in the forest and all that.

CHANGELOG_BEGIN
CHANGELOG_END
2020-10-05 17:31:11 +02:00
Gary Verhaegen
fda2eca084
periodically check signatures (#7543)
This is a first, very incomplete step in the spirit of small,
incremental PRs. Known missing features:

- Should check all versions, not just the 30 most recent ones.
- Should also download from GCP backup and compare.
- Should alert on Slack if anything is unexpected.
- Should handle versions prior to us starting to sign (and do what?).
- Should also check artifacts in Artifactory, not just GitHub Releases.
- Optionally should save to GCP if we don't have a backup already.

So at the moment it's just downloading the artifacts for the 30 most
recent releases and printing a message stating whether we have a
signature and whether it's valid.

CHANGELOG_BEGIN
CHANGELOG_END
2020-10-01 21:01:42 +02:00
Moritz Kiefer
1243afc4a1
Remove reference to release-notes.rst (#7524)
* Remove reference to release-notes.rst

https://github.com/digital-asset/daml/pull/7458 shuffled this
around. While we could update it, it doesn’t really make any sense. We
post our release notes to the blog now and not in the docs so this
whole checkout procedure is redundant. This is also true if we wanted
to make a bugfix release for a release < 1.5 where this file still
existed. The trigger_sha is always on master (following our current
release process) so the file would still not exist.

I did also remove it from the docs cronjob. We never reupload old docs
so this doesn’t make a difference.

changelog_begin
changelog_end

* Stupid whitespace change because windows is pissing me off

changelog_begin
changelog_end
2020-09-30 11:34:43 +00:00
Moritz Kiefer
89c0f6ca41
Bump test_sha (#7512)
This changed in #7501 but it’s a harmless change.

changelog_begin
changelog_end
2020-09-29 11:01:01 +00:00
Remy
65f1a247d0
Bump test_sha in perf tests (#7441)
CHANGELOG_BEGIN
CHANGELOG_END
2020-09-18 18:30:39 +00:00
Leonid Shlyapnikov
d13e7aa184
JSON API daily perf test cron job (#7406)
* Add http-json-perf daily cron job

changelog_begin
changelog_end

* commenting out other jobs so we can manually test the new one

* commenting out other jobs so we can manually test the new one

* Fix the shell script

* Fixing the gs bucket, `gs://http-json bucket does not exist`

* uncomment the other jobs

* timestamp from git log

* get rid of DAR copying

* comment out the other jobs, so we can test it

* uncomment the other jobs
2020-09-16 13:02:02 -04:00
Gary Verhaegen
b2d58a3304
save daily perf results (#7396)
It's a real shame I forgot to do this sooner, but better late than never
I suppose.

CHANGELOG_BEGIN
CHANGELOG_END
2020-09-14 18:38:31 +00:00
Gary Verhaegen
51e7c88bf5
announce release rotation on Tuesday (#7151)
It's been pointed out to me that some people actually plan their work,
and for them, it would be useful to know at least a day in advance that
they will be doing a release. This PR attempts to accommodate that.

Note: this will run as a new, separate cron. I'll do the Azure setup
after this gets approved.

CHANGELOG_BEGIN
CHANGELOG_END
2020-08-17 13:42:35 +02:00
Gary Verhaegen
d43419d339
update perf sha (#7140)
Change in Speedy API.

CHANGELOG_BEGIN
CHANGELOG_END
2020-08-14 12:55:06 +02:00
Gary Verhaegen
1baea84ca0
fix auth header for compat pr (#7134)
On the last release, the job succeeded despite no being able to create
the compat PR. This fixes:

- The curl call to actually return non-0 on non-2xx HTTP response.
- The way in which we encode the credentials.

This also attempts to create a Bash library, hopefully this time in a
way that doesn't get destroyed by our release process. IIUC pipeline
instructions (YAML files) are all parsed and read before any execution,
so by embedding the Bash library in a template we should get the correct
version (i.e. the one that is running the pipeline) even when checking
out other commits.

CHANGELOG_BEGIN
CHANGELOG_END
2020-08-14 11:35:57 +02:00
Moritz Kiefer
67f350694c
Fix release cron job (#7095)
CI failed because $4 was unset. We explicitly check later if it is set
so this is intentional, we just need to actually get to that check and
not fail due to set -u.

changelog_begin
changelog_end
2020-08-12 10:50:28 +02:00
Stephen Compall
6306b9f8d8
perf test harmlessly changed in #6907; match sha (#7065)
CHANGELOG_BEGIN
CHANGELOG_END
2020-08-10 07:26:18 +00:00
Gary Verhaegen
00f3de63c9
rotate responsibility for release process (#7011)
This PR attempts to add some automation around assigning release
management. The PR adds a file `release/rotation`; each week, the
updated CI cron job will:

- Open a PR for the new release [as current].
- Assign the first user in the file to that PR.
- Add the Standard-Change label to the PR.
- Start the build for that PR [as current].
- Open a new PR that rotates the `release/rotate` file, i.e. pushes back
  the first line to the end of the file.

This PR also adds mentions of the "release handler" (the first line of
`release/rotation`) to the various messages we send to Slack along the
release process.

The initial state of the `release/rotation` file has been created by
listing all the volunteers (Language team, Application Runtime team, as
well as @SamirTalwar-DA and @stefanobaghino-da) and piping the file
through `shuf`. (Then I put myself at the top so I can hopefully iron
out the issues with the first attempt.)

CHANGELOG_BEGIN
CHANGELOG_END
2020-08-05 18:58:56 +02:00
Gary Verhaegen
1ff75f1256
remove monthly report (#6967)
This script is no longer relevant to our internal processes. The report
is now generated by the security team and validated by us, rather than
produced and validated by us.

CHANGELOG_BEGIN
CHANGELOG_END
2020-08-04 12:01:07 +02:00
Remy
cc0b497d23
Bump test_sha in perf tests (#6972)
CHANGELOG_BEGIN
CHANGELOG_END
2020-08-03 17:21:49 +00:00
Gary Verhaegen
ef465de0a8
run build on automated release PRs (#6964)
Even though the Azure Pipelines bot account _clearly_ has write access
to our repo, as it can create the PR, it does not count as having write
access for the purposes of Azure deciding to run the build on the PRs it
opens. Not having the build run on the release PR would defeat the whole
point of having it, so this adds a little nudge to Azure so it does
start the build after opening the PR.

CHANGELOG_BEGIN
CHANGELOG_END
2020-08-03 16:42:00 +02:00
Gary Verhaegen
8043756883
better release triggers (#6859)
Based on feedback from @nickchapman-da, this PR aims at making the
release process easier by:

- Automatically opening a release PR on Wednesday morning. The goal here
  is that by the time we start working, there is a release already
  built, so we save about an hour on waiting for that. This obviously
  doesn't help with ad-hoc releases.
- On a release PR build, posting to Slack when the release is ready to
  merge.
- On a release master build, posting to Slack when a release is ready to
  be tested.

My hope is that this makes the release process less tedious. This is not
trying to address the actual release testing, but hopefully should
reduce the annoyance of having to constantly go and check if the release
is ready.

CHANGELOG_BEGIN
CHANGELOG_END
2020-07-24 16:40:11 +00:00
Gary Verhaegen
97d1fa1e04
fix docs cron (#6864)
I lost the public ACL in #6817.

CHANGELOG_BEGIN
CHANGELOG_END
2020-07-24 16:13:00 +00:00
Gary Verhaegen
818a52b094
simplify docs cron (#6817)
simplify docs cron

This commit changes the "live state" to be that all versions are there
on S3, most of them hidden the way snapshots currently are, and only
displays in the drop-down the list of "supported" versions, i.e. stable
and >= 1.0.0.

The docs cron will now:

- Get list of versions from GitHub (as it does now)
- Get list of versions from S3 (as it does now: versions.json +
  snapshots.json, though it assumes we'll have a follow-up PR to change
  the latter to hidden.json)
- Compare; if the sets of versions are the same, stop there. (Note: this
  "set of versions" here includes the notion of which versions are shown,
  not just which ones exist. See the Versions data type in the code.)
- If there is a new hidden version, just build that, push it, change
  nothing else. No need to download any of the existing versions or mess
  around with anything else (except updating `hidden.json`, otherwise
  we're going to be doing this way too often.)
- If there is a new visible version:
  - check if we have it locally (i.e. from the previous step: it's a
    version we just added)
  - figure out the old and new default versions, and then apply the diff
    to the top-level directory. Basically download the two folders, list
    files that exist in the old one and not in the new one, delete those
    from S3, then push the new one to the top-level on S3.
- update versions.json & hidden.json (and for now snapshots.json)

This means that:

- we never mess with the existing versions; we don't need to download
  them, we don't need to change them, we don't clean them up. Old links
  keep working forever.
- The running time for the docs cron is roughly constant, in that it
  should very rarely have to either build or upload (or download) more
  than 2 versions per run, and if those instances happen they'd be
  accidents (we made 3 actual releases in an hour), not build-up over
  time.
2020-07-24 14:40:32 +02:00
Remy
28ab504b21
Bump test_sha in perf tests (#6825)
CHANGELOG_BEGIN
CHANGELOG_END
2020-07-22 11:00:53 +00:00
Remy
d538d9a53e
Bump test_sha in perf tests (#6816)
CHANGELOG_BEGIN
CHANGELOG_END
2020-07-21 16:56:01 +00:00
Moritz Kiefer
631ed3e891
Bump timeouts in compat tests (#6689)
This bumps the timeout of the compat tests on PRs to 360 minutes
matching other jobs on a PR (we mainly hit this if ghc-lib is rebuilt)
and the timeout on the daily jobs to 720 minutes (we hit this if
_everything_ is rebuilt).

I am slightly worried about the timeout on the daily job. After having
taken a look at it, there are a few reasons how we ended up here:

1. We started including more tests, e.g., sandbox-classic. Not much we
   can do here, those tests are useful.

2. We have a very large number of snapshots for 1.3.0. There are a few
   reasons for this:

   1. Timing: We branched off early for the 1.2.0 release so the first
      snapshot for 1.3 was on June 3th. For 1.4 it looks like the first
      snapshot will be on July 15th so that’s roughly 2 extra
      snapshots just due to timing.

   2. Additional snapshots: We had one broken snapshot due to a broken
      VSCode extension that we didn’t delete (probably not worth doing
      at this point). We also had to backport to an old snapshot which
      resulted in another extra snapshot. We also had one extra
      snapshot which was supposed to be the RC but wasn’t since the
      ANF revert needed to go in.

   The only thing that is clearly useless is the one broken snapshot
   but that doesn’t change things that much. I see 2 orthogonal
   options for improving this assuming we agree that the current
   runtime is worryingly high.

   1. Prune snapshots more aggressively, e.g., only include the last 3
      snapshots. That’s a pretty arbitrary decision but it would
      enforce a hard limit.

   2. Reduce test combinations. E.g., only test snapshots vs stable
      releases but not snapshots vs snapshots.

3. We end up forcing a full build quite frequently. Here are just 2
   examples of how we’ve done that so far.

   1. Upgrade rules_haskell. Basically all tests are run by a Haskell
      binary so this forces a full rebuild.

   2. Change runfiles of `daml`.

I don’t think there is much we can do about 1 or 3 which leaves us
with 2. One not entirely unreasonable option is to just do nothing. We
did have periods where things went pretty smoothly for the most part
and each month we reset to a much smaller number of releases (we also
have to start throwing out old stable releases at some
point). Otherwise reducing the number of test combinations seems the
most promising option to me.

changelog_begin
changelog_end
2020-07-10 12:34:53 +00:00
Moritz Kiefer
6c0bbd3ba6
Bump test_sha in perf tests (#6649)
This changed by the revert of the ANF changes which is harmless by the
same reasoning that made bumping it harmless when we introduced it.

changelog_begin
changelog_end
2020-07-08 12:26:11 +00:00
nickchapman-da
14ca4e5e79
bump-perf (#6553)
changelog_begin
changelog_end
2020-06-30 22:08:36 +00:00
Gary Verhaegen
beb33f2ab1
add explanation for clearing shared segments (#6545)
As requested on #6530.

CHANGELOG_BEGIN
CHANGELOG_END
2020-06-30 13:21:32 +00:00
Gary Verhaegen
55776f92ba
clear shared memory segment on macOS (#6530)
For a while now we've had errors along the line of

```
FATAL:  could not create shared memory segment: No space left on device
DETAIL:  Failed system call was shmget(key=5432001, size=56, 03600).
HINT:  This error does *not* mean that you have run out of disk space.
It occurs either if all available shared memory IDs have been taken, in
which case you need to raise the SHMMNI parameter in your kernel, or
because the system's overall limit for shared memory has been reached.
        The PostgreSQL documentation contains more information about
shared memory configuration.
child process exited with exit code 1
```

on macOS CI nodes, which we were not able to reproduce locally. Today I
managed to, sort of by accident, and that allowed me to dig a bit
further.

The root cause seems to be that PostgreSQL, as run by Bazel, does not
always seem to properly unlink the shared memory segment it uses to
communicate with itself. On my machine, running:

```
bazel test -t- --runs_per_test=100 //ledger/sandbox:conformance-test-wall-clock-postgresql
```

and eyealling the results of

```
watch ipcs -mcopt
```

I would say about one in three runs leaks its memory segment. After much
googling and some head scratching trying to figure out the C APIs for
managing shared memory segments on macOS, I kind of stumbled on a
reference to `pcirm` in a comment to some low-ranking StackOverflow
answer. It looks like it's working very well on my machine, even if I
run it while a test (and therefore an instance of pg) is running. I
believe this is because the command does not actually remove the shared
memory segments, but simply marks them for removal once the last process
stops using it. (At least that's what the manpage describes.)

CHANGELOG_BEGIN
CHANGELOG_END
2020-06-30 01:40:16 +02:00
Remy
f5c65696f7
Update LF Perf test SHA (#6510)
CHANGELOG_BEGIN
CHANGELOG_END
2020-06-26 12:11:50 +00:00
Gary Verhaegen
7d3dae4b1f
update perf-sha (#6457)
CHANGELOG_BEGIN
CHANGELOG_END
2020-06-22 18:46:19 +02:00
Remy
149bfc89ff
Update LF Perf test SHA (#6416)
CHANGELOG_BEGIN
CHANGELOG_END
2020-06-18 14:27:26 +00:00
nickchapman-da
e19888d979
update for no stack-tracing in speedy perf (#6363) 2020-06-16 11:36:05 +00:00