daml/ci
Gary Verhaegen 4a6ab84b69
add default machine capability (#5912)

We semi-regularly need to do work that has the potential to disrupt a
machine's local cache, rendering it broken for other streams of work.
This can include upgrading nix, upgrading Bazel, debugging caching
issues, or anything related to Windows.

Right now we have no good solution for these situations. We can
either not do those streams of work, or we can proceed with them and
accept that any other build may be affected, depending on which
machine it gets assigned to. Debugging broken nodes is particularly
tricky, as we have no way to force a build to run on a given
node.

This PR aims to provide a better alternative by (ab)using an Azure
Pipelines feature called
[capabilities](https://docs.microsoft.com/en-us/azure/devops/pipelines/agents/agents?view=azure-devops&tabs=browser#capabilities).
The idea behind capabilities is that you assign a set of tags to a
machine, and then a job can express its
[demands](https://docs.microsoft.com/en-us/azure/devops/pipelines/process/demands?view=azure-devops&tabs=yaml),
i.e. specify a set of tags machines need to have in order to run it.

This feature is poorly documented. We can gather from the
documentation that a job can specify two things about a capability
(through its `demands`): that a given tag exists, and that a given tag
has an exact specified value. In particular, a job cannot specify that a
capability should _not_ be present, meaning we cannot rely on, say,
adding a "broken" tag to broken machines.
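
To make the limitation concrete, here is a minimal Python sketch of how
demand matching might behave under the two forms described above. It is
a model for illustration, not the Azure Pipelines implementation: the
function name `satisfies`, the agent dictionaries and the `debug-nix`
value are all hypothetical, and the exact string comparison is an
assumption.

```python
from typing import Dict, List


def satisfies(capabilities: Dict[str, str], demands: List[str]) -> bool:
    """Return True if an agent's capabilities meet every demand.

    Only the two demand forms described in the text are modelled:
      "name"                -> the capability must exist
      "name -equals value"  -> the capability must have exactly that value
    There is deliberately no way to express "name must be absent".
    """
    for demand in demands:
        name, sep, value = demand.partition(" -equals ")
        name = name.strip()
        if name not in capabilities:
            return False
        # sep is empty for a bare "exists" demand, so the value is not checked.
        if sep and capabilities[name] != value.strip():
            return False
    return True


# Hypothetical agents: one in the default pool, one reserved for debugging.
default_agent = {"assignment": "default"}
reserved_agent = {"assignment": "debug-nix"}

normal_job = ["assignment -equals default"]
debug_job = ["assignment -equals debug-nix"]

print(satisfies(default_agent, normal_job))   # normal jobs run on default agents
print(satisfies(reserved_agent, normal_job))  # and are kept off the reserved one
print(satisfies(reserved_agent, debug_job))   # which only the debug job can use
```

Because every demand is a positive requirement, excluding a machine
means changing a value that all normal jobs require, rather than adding
a marker that jobs would have to reject.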

Documentation on how to set capabilities for an agent is basically
nonexistent, but [looking at the
code](https://github.com/microsoft/azure-pipelines-agent/blob/master/src/Microsoft.VisualStudio.Services.Agent/Capabilities/UserCapabilitiesProvider.cs)
indicates that they can be set by using a simple `key=value`-formatted
text file, provided we can find the right place to put this file.
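
As a rough illustration of the `key=value` format, the following Python
sketch parses such a file into a dictionary. It does not reproduce the
agent's actual parsing rules: the function name `parse_capabilities`,
the sample contents, and the handling of blank lines, whitespace and
lines without `=` are all assumptions made for the example.

```python
from typing import Dict


def parse_capabilities(text: str) -> Dict[str, str]:
    """Parse key=value lines into a capabilities dictionary.

    Assumptions (not taken from the agent source): surrounding whitespace
    is trimmed, and blank lines or lines without '=' are skipped.
    """
    capabilities: Dict[str, str] = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or "=" not in line:
            continue
        # Split on the first '=' only, so values may themselves contain '='.
        key, _, value = line.partition("=")
        key = key.strip()
        if key:
            capabilities[key] = value.strip()
    return capabilities


# Hypothetical file contents defining a single capability.
sample = "assignment=default\n"
print(parse_capabilities(sample))
```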

This PR adds this file to our Linux, macOS and Windows node init scripts
to define an `assignment` capability and adds a demand for a `default`
value on each job. From then on, when we hit a case where we want a PR
to run on a specific node, and to prevent other PRs from running on that
node, we can manually override the capability from the Azure UI and
update the demand in the relevant YAML file in the PR.

CHANGELOG_BEGIN
CHANGELOG_END
2020-05-09 18:21:42 +02:00
..
azure-cleanup replace DAML Authors with DA in copyright headers (#5228) 2020-03-27 01:26:10 +01:00
cron add default machine capability (#5912) 2020-05-09 18:21:42 +02:00
docker/daml-sdk Add /etc/nsswitch.conf to our Dockerfile (#5882) 2020-05-07 09:44:44 +02:00
build-unix.yml Move Bazel configuration before formatting (#5893) 2020-05-07 11:21:02 +00:00
build-windows.yml enable patch releases (fix) (#5634) 2020-04-20 17:01:08 +02:00
check-changelog.sh enable patch releases (#5584) 2020-04-16 17:50:55 +02:00
compatibility-windows.yml Make compat tests work on windows (#5732) 2020-04-28 16:06:36 +02:00
compatibility.yml fix compatibility test (#5736) 2020-04-27 13:15:48 +00:00
configure-bazel.sh Apply platform_suffix on all Windows pipelines (#5846) 2020-05-05 18:02:39 +00:00
dev-env-install.sh replace DAML Authors with DA in copyright headers (#5228) 2020-03-27 01:26:10 +01:00
dev-env-push.py replace DAML Authors with DA in copyright headers (#5228) 2020-03-27 01:26:10 +01:00
report-end.yml replace DAML Authors with DA in copyright headers (#5228) 2020-03-27 01:26:10 +01:00
report-start.yml replace DAML Authors with DA in copyright headers (#5228) 2020-03-27 01:26:10 +01:00
slack_user_ids Notify Sofia on #team-daml-ci (#5487) 2020-04-08 09:31:54 +00:00
tell-slack-failed.yml fix tell-slack-failed CI "function" (#5670) 2020-04-22 15:21:04 +02:00
windows-diagnostics.ps1 windows: CI agent diagnostics (#1146) 2019-05-15 11:59:56 +02:00