This PR contains many small changes:
- A small refactoring whereby the "es-init" machine is now
(syntactically) integrated with the two instance groups, to cut down a
bit on repetition.
- The feeder machine is now preemptible, because I've seen it recover
enough times that I'm confident this will not cause any issue.
- Indices are now sharded.
- Return values from ES are filtered, cutting down a bit on network
usage and memory requirements to produce the responses.
- Bulk uploads for a single job are now done in parallel. This results
in about a 2x speedup for ingestion.
- crontab was changed to every minute instead of every 5 minutes.
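For reviewers unfamiliar with the parallel bulk upload change, here is a minimal sketch of the idea, not the actual ingestion code. The names `chunked`, `upload_job`, `upload_batch`, `batch_size` and `workers` are all hypothetical, and the batch size and worker count are arbitrary illustration values.

```python
from concurrent.futures import ThreadPoolExecutor


def chunked(items, size):
    """Split a list into consecutive batches of at most `size` items."""
    return [items[i:i + size] for i in range(0, len(items), size)]


def upload_job(documents, upload_batch, batch_size=500, workers=4):
    """Upload all batches of a single job concurrently.

    `upload_batch` is a hypothetical callable that sends one batch to ES
    and returns some result. Results come back in batch order, and an
    exception raised by any batch is re-raised here.
    """
    batches = chunked(documents, batch_size)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(upload_batch, batches))
```

Threads are a reasonable fit in a sketch like this because each upload spends most of its time waiting on the network.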
CHANGELOG_BEGIN
CHANGELOG_END
About 1% (28 out of 2756) of our build logs currently contain invalid
JSON files. In every case the cause is an incomplete `-profile` file,
and since each of those files is a single JSON object we can't do
anything smarter, such as filtering out invalid individual lines.
I haven't looked deeply into _why_ we create invalid files, but this
should let our ingestion process make some progress in the meantime.
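As an illustration only, assuming the approach is to skip any file that fails to parse and carry on with the rest, the behaviour could be sketched as follows. The function name `load_valid` and the `(name, text)` input shape are hypothetical.

```python
import json


def load_valid(files):
    """Parse each (name, text) pair, skipping files that are not valid JSON.

    Returns a dict of successfully parsed files and a list of the names
    that were skipped, so the skipped ones can still be logged.
    """
    parsed, skipped = {}, []
    for name, text in files:
        try:
            parsed[name] = json.loads(text)
        except json.JSONDecodeError:
            # A truncated single-object file cannot be partially recovered,
            # so the whole file is skipped.
            skipped.append(name)
    return parsed, skipped
```

Returning the skipped names keeps the failures visible, which matters while the root cause is still unknown.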
CHANGELOG_BEGIN
CHANGELOG_END
This PR adds a machine that will, every 5 minutes, look at the GCS
bucket that stores Bazel metrics and push whatever it finds to
Elasticsearch.
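For context on what "push" can look like, here is a rough sketch that assumes the data goes through the Elasticsearch bulk API, which takes a newline-delimited body of alternating action and document lines. The function name `build_bulk_body` and the index name in the usage line are hypothetical; this is not the code in this PR.

```python
import json


def build_bulk_body(index, documents):
    """Build a newline-delimited body for the ES bulk API.

    Each document is preceded by an "index" action line, and the body
    ends with a trailing newline as the bulk API requires.
    """
    lines = []
    for doc in documents:
        lines.append(json.dumps({"index": {"_index": index}}))
        lines.append(json.dumps(doc))
    if not lines:
        return ""
    return "\n".join(lines) + "\n"


# Hypothetical usage: two metric documents destined for an "events" index.
body = build_bulk_body("events", [{"duration": 12}, {"duration": 7}])
```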
A huge part of this commit is based on @aherrmann-da's work. You can
assume that all the good bits are his.
CHANGELOG_BEGIN
CHANGELOG_END
This PR adds a Kibana instance to each ES node, and duplicates the load
balancer mechanism to expose both raw ES and Kibana.
CHANGELOG_BEGIN
CHANGELOG_END
This PR adds a basic ES cluster to our infrastructure. It is completely
open and unprotected, but only accessible through the VPN and, as of
yet, only through its IP address. I'm not sure whether it's worth
adding a DNS entry for it.
CHANGELOG_BEGIN
CHANGELOG_END