Commit Graph

23 Commits

Author SHA1 Message Date
Gary Verhaegen
cb1f4ec773
ci/windows: disable spool (#10200)
* ci/windows: disable spool

We're not expecting to print anything, and @RPS5' security newsletter
says this is a vector of attack.

CHANGELOG_BEGIN
CHANGELOG_END

* increase no-spool to 6

* Windows name truncation causing collisions

* update main group

* remove temp group
2021-07-07 12:44:33 +00:00
Gary Verhaegen
31a76a4a2a
allow CI pools to use any zone (#10069)
This morning we started with very restricted CI pools (2/6 for Windows
and 7/20 for Linux), apparently because the region we run in (us-east1)
has three zones, two of them were unable to allocate new nodes, and the
default policy is to distribute nodes evenly between zones.

I've manually changed the distribution policy. Unfortunately this option
is not yet available in our version of the GCP Terraform plugin.

CHANGELOG_BEGIN
CHANGELOG_END
2021-06-22 10:43:08 +02:00
Gary Verhaegen
646c956457
new windows signing (#9786)
CHANGELOG_BEGIN
CHANGELOG_END
2021-05-25 16:23:17 +02:00
Gary Verhaegen
45bca6e68b
test_windows_signing: install for u (#9776)
Turns out "`--global`" means "for this user".

CHANGELOG_BEGIN
CHANGELOG_END
2021-05-21 15:49:31 +02:00
Gary Verhaegen
4af6608185
fix signing machine (#9772)
Turns out PowerShell is not Bash. Who knew? 🤷

CHANGELOG_BEGIN
CHANGELOG_END
2021-05-21 12:36:56 +02:00
Gary Verhaegen
f5c5b634eb
prepare for EV Windows signing (#9758)
Setting up a non-disruptive way to test out EV signing of our Windows
artifacts.

CHANGELOG_BEGIN
CHANGELOG_END
2021-05-21 10:46:45 +02:00
Gary Verhaegen
cfae2d88f5
update Terraform files to match reality (#8780)
* fixup terraform config

Two changes have happened recently that have invalidated the current
Terraform files:

1. The Terraform version has gone through a major, incompatible upgrade
   (#8190); the required updates for this are reflected in the first
   commit of this PR.
2. The certificate used to serve [Hoogle](https://hoogle.daml.com) was
   about to expire, so Edward created a new one and updated the config
   directly. The second commit in this PR updates the Terraform config
   to match that new, already-in-prod setting.

Note: This PR applies cleanly, as there are no resulting changes in
Terraform's perception of the target state from 1, and the change from 2
has already been applied through other channels.

CHANGELOG_BEGIN
CHANGELOG_END

* update hoogle cert
2021-02-08 17:25:04 +00:00
Gary Verhaegen
a925f0174c
update copyright notices for 2021 (#8257)
* update copyright notices for 2021

To be merged on 2021-01-01.

CHANGELOG_BEGIN
CHANGELOG_END

* patch-bazel-windows & da-ghc-lib
2021-01-01 19:49:51 +01:00
Gary Verhaegen
7c2ba6f996
infra: add prod label (#8140)
Requested by @nycnewman.

CHANGELOG_BEGIN
CHANGELOG_END
2020-12-03 01:55:43 +01:00
Gary Verhaegen
c8f31ca16a
switch CI nodes from n1-standard-8 to c2-* (#6514)
switch CI nodes from n1-standard-8 to c2-*

A while back (#4520), I did a bunch of performance tests when trying to
size up the requirements for the hosted macOS nodes we needed to buy. As
part of that testing, it looked like `c2-standard-8` nodes were faster
(full build down from ~95 to ~75 minutes) and marginally cheaper
($0.4176 vs $0.4280) than the `n1-standard-8` we are currently using.

Then I got distracted, and I forgot to upgrade our existing machines.

CHANGELOG_BEGIN
CHANGELOG_END
2020-06-27 12:20:29 +02:00
Gary Verhaegen
b9fbba7fc5
shorten Windows CI username (#6190)
Keeping CI working on Windows involves a constant fight against
MAX_PATH, which is a very short 260 characters. As the username appears
in some paths, sometimes multiple times, we can save a few precious
characters by having it shorter.

CHANGELOG_BEGIN
CHANGELOG_END
2020-06-06 15:03:15 +02:00
Gary Verhaegen
4a6ab84b69
add default machine capability (#5912)
add default machine capability

We semi-regularly need to do work that has the potential to disrupt a
machine's local cache, rendering it broken for other streams of work.
This can include upgrading nix, upgrading Bazel, debugging caching
issues, or anything related to Windows.

Right now we do not have any good solution for these situations. We can
either not do those streams of work, or we can proceed with them and
just accept that all other builds may get affected depending on which
machine they get assigned to. Debugging broken nodes is particularly
tricky as we do not have any way to force a build to run on a given
node.

This PR aims at providing a better alternative by (ab)using an Azure
Pipelines feature called
[capabilities](https://docs.microsoft.com/en-us/azure/devops/pipelines/agents/agents?view=azure-devops&tabs=browser#capabilities).
The idea behind capabilities is that you assign a set of tags to a
machine, and then a job can express its
[demands](https://docs.microsoft.com/en-us/azure/devops/pipelines/process/demands?view=azure-devops&tabs=yaml),
i.e. specify a set of tags machines need to have in order to run it.

Support for this is fairly badly documented. We can gather from the
documentation that a job can specify two things about a capability
(through its `demands`): that a given tag exists, and that a given tag
has an exact specified value. In particular, a job cannot specify that a
capability should _not_ be present, meaning we cannot rely on, say,
adding a "broken" tag to broken machines.

Documentation on how to set capabilities for an agent is basically
nonexistent, but [looking at the
code](https://github.com/microsoft/azure-pipelines-agent/blob/master/src/Microsoft.VisualStudio.Services.Agent/Capabilities/UserCapabilitiesProvider.cs)
indicates that they can be set by using a simple `key=value`-formatted
text file, provided we can find the right place to put this file.

This PR adds this file to our Linux, macOS and Windows node init scripts
to define an `assignment` capability and adds a demand for a `default`
value on each job. From then on, when we hit a case where we want a PR
to run on a specific node, and to prevent other PRs from running on that
node, we can manually override the capability from the Azure UI and
update the demand in the relevant YAML file in the PR.

CHANGELOG_BEGIN
CHANGELOG_END
2020-05-09 18:21:42 +02:00
Gary Verhaegen
08a5a64325
replace Windows agents (#5527)
It looks like the change in Windows agent names has caused an issue:
because Windows agents are not always properly cleaned up on shutdown,
i.e. they do not always have time to tell Azure they are going away, and
because GCP likes to reuse the same names for machines in a group, we've
been seeing errors like:

```
ERROR: The running command stopped because the preference variable
"ErrorActionPreference" or common parameter is set to Stop: Pool 11
already contains an agent with name VSTS-WIN-3QCX.
```

recently. Today, only 2 out of our 6 agents have managed to register
with Azure. This PR should fix that.

ChaNGELOG_BEGIN
CHANGELOG_END
2020-04-14 13:58:42 +02:00
Gary Verhaegen
66e7068b39
better Windows machine names (#5374)
This is a small QoL improvement, mostly targeted at myself: have Windows
agents register with Azure using the name they display on the GCP
console, so I don't need to find a build and look at the "Agent
Diagnostics" step to figure out the corresponding between Azure and GCP.

CHANGELOG_BEGIN
CHANGELOG_END
2020-04-07 01:33:36 +02:00
Gary Verhaegen
1872c668a5
replace DAML Authors with DA in copyright headers (#5228)
Change requested by Manoj.

CHANGELOG_BEGIN
CHANGELOG_END
2020-03-27 01:26:10 +01:00
Gary Verhaegen
0a251b3fa5
switch CI nodes to permanent (#4455)
CHANGELOG_BEGIN
CHANGELOG_END
2020-02-11 02:07:42 +01:00
Gary Verhaegen
5606ab350c
fix Windows CI node startup script (#4371)
This is an attempt to apply a potential fix discovered as part of the
investigation in #4370. The issue seems to be that Chocolatey is using a
protocol deemed not secure enough and disabled in recent Windows images
(our node creation script dynamically selects the lmatest "Windows 2016"
server image from GCP).

CHANGELOG_BEGIN
CHANGELOG_END
2020-02-04 14:37:53 +01:00
Gary Verhaegen
878429e3bf
update copyright notices to 2020 (#3939)
copyright update 2020

* update template
* run script: `dade-copyright-headers update .`
* update script
* manual adjustments
* exclude frozen proto files from further header checks (by adding NO_AUTO_COPYRIGHT files)
2020-01-02 21:21:13 +01:00
Gary Verhaegen
99ea93168d
update copyright notices (#2499) 2019-08-13 17:23:03 +01:00
Moritz Kiefer
1cfa27d616
Install the Windows SDK on CI nodes (#1272)
This provides signtool.exe which we need to sign our Windows installer.
2019-05-21 13:42:49 +02:00
Gary Verhaegen
e95575b033 install StackDriver on build machines (#905)
Requested by Security
2019-05-04 22:55:51 +00:00
Jonas Chevalier
769c04d3ba infra: reduce differences with hosted (#698) 2019-04-25 20:49:38 +00:00
Jonas Chevalier
3b8ae1ff86 infra: add a VSTS windows agents (#368) 2019-04-18 11:20:57 +00:00