Commit Graph

554 Commits

Author SHA1 Message Date
Chris Raible
85408d10b7
Added connection pool metrics to prometheus client (#21576)
ref
https://linear.app/ghost/issue/ENG-1592/start-monitoring-connection-pool-utilization-in-ghost

- This commit adds prometheus metrics to the connection pool so we can
start to track connection pool utilization, number of pending acquires,
and also adds some basic SQL query summary metrics like queries per
minute and query duration percentiles.
- The connection pool has been theorized for some time to be a main
constraint of Ghost, but it's been challenging to get actual visibility
into the state of the connection pool. With this change, we should be
able to directly observe, monitor and alert on the connection pool.
- Updated grafana version to fix a bug in the query editor that was
fixed in 8.3, even though this is a couple versions ahead of production
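
As a rough illustration of what "connection pool utilization" and "pending acquires" mean, here is a minimal sketch. The `PoolSnapshot` shape and its field names are assumptions made for this example; they are not taken from Ghost or from its database client.

```typescript
// Hypothetical snapshot of a connection pool. The field names are
// assumptions for illustration, not the names Ghost actually uses.
interface PoolSnapshot {
    used: number;            // connections currently checked out
    free: number;            // idle connections ready to be acquired
    pendingAcquires: number; // callers waiting for a connection
    max: number;             // configured upper bound on the pool size
}

// Utilization: the share of the configured maximum that is in use (0..1).
function poolUtilization(pool: PoolSnapshot): number {
    if (pool.max <= 0) {
        return 0;
    }
    return pool.used / pool.max;
}

// A pool is treated as saturated when every connection is in use
// and at least one caller is queued waiting for one.
function isSaturated(pool: PoolSnapshot): boolean {
    return pool.used >= pool.max && pool.pendingAcquires > 0;
}
```

Values like these are the kind of thing that could be exported as gauges and alerted on, for example when utilization stays near 1 while pending acquires keep growing.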
2024-11-07 23:01:34 -08:00
Chris Raible
8aba92e444
Fixed CPU Usage chart in grafana dashboard (#21568)
ref
https://linear.app/ghost/issue/ENG-1505/start-monitoring-ghosts-constraints-and-our-3-goals-using-prometheus

- Using `irate` for aggregating CPU usage was resulting in some strange
behavior — the CPU Usage chart would zero out after a few mins of
running. Switching to regular `rate` seems to have fixed the issue
completely.
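
To show the difference between the two functions, here is a simplified sketch. It is not how Prometheus implements them (it ignores counter resets and extrapolation), and the sample data is made up. It only illustrates that `irate` looks at the last two samples while `rate` averages over the whole window.

```typescript
// One scraped sample of a monotonically increasing counter.
interface Sample {
    t: number;     // timestamp in seconds
    value: number; // counter value, e.g. cumulative CPU seconds
}

// Simplified `rate`: average per-second increase across the whole window.
function windowRate(samples: Sample[]): number {
    if (samples.length < 2) {
        return 0;
    }
    const first = samples[0];
    const last = samples[samples.length - 1];
    return (last.value - first.value) / (last.t - first.t);
}

// Simplified `irate`: per-second increase between the last two samples only.
function instantRate(samples: Sample[]): number {
    if (samples.length < 2) {
        return 0;
    }
    const prev = samples[samples.length - 2];
    const last = samples[samples.length - 1];
    return (last.value - prev.value) / (last.t - prev.t);
}

// Made-up data: the counter did not move during the final interval.
const cpuSamples: Sample[] = [
    {t: 0, value: 0},
    {t: 10, value: 5},
    {t: 20, value: 10},
    {t: 30, value: 10}
];
```

With this data `instantRate(cpuSamples)` is 0 while `windowRate(cpuSamples)` is about 0.33, so a single flat interval is enough to pull an `irate`-style value down to zero.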
2024-11-07 10:18:41 -08:00
Chris Raible
a26f63dc11
Configured local prometheus and pushgateway in docker-compose (#21538)
ref
https://linear.app/ghost/issue/ENG-1746/enable-ghost-to-push-metrics-to-a-pushgateway

- Added prometheus job to scrape the pushgateway
- Updated grafana dashboard to use the metrics from the pushgateway
- Added some logging to prometheus client to log errors when pushing
metrics to pushgateway
2024-11-06 11:36:37 -08:00
Chris Raible
190ebcd684
Added ability to push prometheus metrics to a pushgateway (#21526)
ref
https://linear.app/ghost/issue/ENG-1746/enable-ghost-to-push-metrics-to-a-pushgateway

- We'd like to use prometheus to expose metrics from Ghost, but the
"standard" approach of having prometheus scrape the `/metrics` endpoint
adds some complexity and additional challenges on Pro.
- A suggested simpler alternative is to use a pushgateway, to have Ghost
_push_ metrics to prometheus, rather than have prometheus scrape the
running instances.
- This PR introduces this functionality behind a configuration option.
- It also includes a refactor to the current metrics-server
implementation so all the related code for prometheus is colocated, and
the configuration is a bit more organized. `@tryghost/metrics-server`
has been renamed to `@tryghost/prometheus-metrics`, and it now includes
the metrics server and prometheus-client code itself (including the
pushgateway code)
- To enable the prometheus client alone, `prometheus:enabled` must be
true. This will _not_ enable the metrics server or the pushgateway — it
will essentially collect the metrics, but not do anything with them.
- To enable the metrics server, set `prometheus:metrics_server:enabled`
to true. You can also configure the host and port that the metrics
server should export the `/metrics` endpoint on in the
`prometheus:metrics_server` block.
- To enable the pushgateway, set `prometheus:pushgateway:enabled` to
true. You can also configure the pushgateway's `url`, the `interval` it
should push metrics in (in milliseconds) and the `jobName` in the
`prometheus:pushgateway` block.
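
The colon-separated keys above map onto nested configuration blocks. The sketch below shows that mapping. The key names come from this commit message, but every value (host, port, URL, interval, job name) is a placeholder, and the `lookup` helper is hypothetical, written only for this example.

```typescript
// Shape of the `prometheus` block as described in this commit message.
type PrometheusConfig = {
    enabled: boolean;
    metrics_server: {enabled: boolean; host: string; port: number};
    pushgateway: {enabled: boolean; url: string; interval: number; jobName: string};
};

// Placeholder values only; none of these are confirmed defaults.
const prometheus: PrometheusConfig = {
    enabled: true,
    metrics_server: {enabled: false, host: '127.0.0.1', port: 9416},
    pushgateway: {
        enabled: true,
        url: 'http://localhost:9091',
        interval: 5000, // milliseconds between pushes
        jobName: 'ghost'
    }
};

const exampleConfig = {prometheus};

// Hypothetical helper: resolves a colon-separated key such as
// `prometheus:pushgateway:enabled` against a nested object.
function lookup(root: object, key: string): unknown {
    let node: unknown = root;
    for (const part of key.split(':')) {
        if (typeof node !== 'object' || node === null) {
            return undefined;
        }
        node = (node as Record<string, unknown>)[part];
    }
    return node;
}
```

For example, `lookup(exampleConfig, 'prometheus:pushgateway:enabled')` returns `true` for the object above.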
2024-11-05 11:50:39 -08:00
Chris Raible
fe9b01910d
Cleaned up browser test output in CI (#21462)
no issue

- The browser test output in CI is really noisy, because the `NX_DAEMON`
doesn't run in CI, but we're trying to use NX to watch and rebuild the
typescript modules. This is outputting a ton of "NX Daemon is not
running" type of errors, which make it difficult to sift through the
actual test results.
- We don't actually need to watch the typescript files, we just need to
build them once before starting. This is defined as an NX dependency for
the browser tests target, so we don't need to explicitly build the TS
packages at all. Removing the typescript watch & build command removes
the noisy errors, without impacting how the tests actually run.
2024-10-30 12:39:41 -07:00
Chris Raible
b44ad06015
Fixed browser tests yielding a false passing result in CI (#21401)
no issue

- Browser tests in CI were yielding a passing result even if one or more
tests failed (including retries).
- The `yarn dev` command that triggers the browser tests in CI was
catching any errors and exiting with code 0, resulting in a passing result in CI.
- This commit changes `yarn dev` to exit with code 1 if the browser
tests fail, so that CI will correctly fail if any of the browser tests
fail.
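
The general pattern can be sketched as follows. This is not the actual `yarn dev` code; `exitCodeFor` is a hypothetical helper that shows a failure being reported and still turned into a non-zero exit code instead of being swallowed.

```typescript
// Runs a task and converts its outcome into a process exit code.
// Hypothetical helper for illustration; not taken from the dev script.
function exitCodeFor(task: () => void): number {
    try {
        task();
        return 0;
    } catch (err) {
        // Report the failure, but keep it visible to the caller as a
        // non-zero code rather than swallowing it and returning 0.
        const message = err instanceof Error ? err.message : String(err);
        console.error('task failed: ' + message);
        return 1;
    }
}
```

A caller would then pass the result on to the runtime, for example by assigning it to `process.exitCode` in Node, so that CI sees the failure.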
2024-10-24 17:22:37 -07:00
Chris Raible
af0f26c75f
Added Dev Container setup (#21279)
no issue

- Dev Containers let you work on Ghost in a consistent, isolated
environment with all the necessary development dependencies
pre-installed. VSCode (or Cursor) can effectively run _inside_ the
container, providing a local quality development environment while
working in a well-defined, isolated environment.
- For now the default setup only works with "Clone repository in
Container Volume" or "Clone PR in Container Volume" — this allows for a
super quick and simple setup. We can also introduce another
configuration to allow opening an existing local checkout in a Dev
Container, but that's not quite ready yet.
- This PR also added the `yarn clean:hard` command which: deletes all
node_modules, cleans the yarn cache, and cleans the NX cache. This will
be necessary for opening a local checkout in a Dev Container.
- To learn more about Dev Containers, read this guide from VSCode:
https://code.visualstudio.com/docs/devcontainers/containers#_personalizing-with-dotfile-repositories

---------

Co-authored-by: Joe Grigg <joe@ghost.org>
Co-authored-by: Steve Larson <9larsons@gmail.com>
2024-10-24 11:15:08 -07:00
Daniel Lockyer
8fb7da4be0 Bumped Node versions in CI
- we should be running most of CI on Node 20
- this has fallen out of sync because we declare the version in so many
  places
- to help with this, I've extracted the Node version to an env var, and
  re-used that across the workflow
2024-10-23 13:23:35 +02:00
renovate[bot]
cb453febe9 Update benchmark-action/github-action-benchmark action to v1.20.4 2024-10-23 13:22:37 +02:00
Michael Barrett
5492e64988
Updated admin-x-activitypub URL to point at shorter cached version (#21378)
no refs
2024-10-23 11:36:16 +01:00
Laurent Goderre
e2519848c1
Set CI timezone to non-UTC to catch timezone-related issues (#19676)
- this helps catch test failures that are due to us writing timezone dependent code

Co-authored-by: Daniel Lockyer <hi@daniellockyer.com>
2024-10-16 15:35:47 +02:00
Chris Raible
401ec7d14d
Improved pre-commit hook to automatically remove submodules (#21222)
no issue

# Before
The pre-commit hook would abort the commit if any submodules were staged
for commit, and prompt the user to manually un-stage them and retry the
commit.

# Now
The pre-commit hook automatically un-stages any staged submodules, then
allows the commit to proceed.

# Why?
This was a daily annoyance that caused many common git commands to
abort, and required manual un-staging of the submodules before retrying
the commit:
- `git commit -a`
- `git add . && git commit`
- `git add -A && git commit`

If we ever _do_ need to commit submodules, we can always add them back
and run `git commit --no-verify` to accomplish that (which we would have
needed to do before regardless). This should accomplish the same goal of
not allowing submodules to be committed, but reduce the day to day
friction of making commits in Ghost.
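
One way to detect staged submodules, sketched under assumptions: the hook itself is not shown in this commit message, so the function below is a hypothetical illustration. It relies on git recording submodule ("gitlink") entries with mode `160000` and parses the output of `git diff --cached --raw`.

```typescript
// Mode git records for a submodule ("gitlink") entry.
const GITLINK_MODE = '160000';

// Parses the output of `git diff --cached --raw` and returns the paths
// of staged entries that are submodules. Renames and copies (which list
// two paths) are not handled in this sketch.
function stagedSubmodulePaths(rawDiff: string): string[] {
    const paths: string[] = [];
    for (const line of rawDiff.split('\n')) {
        // Raw lines look like:
        // ":<old mode> <new mode> <old sha> <new sha> <status>\t<path>"
        const tab = line.indexOf('\t');
        if (line.charAt(0) !== ':' || tab === -1) {
            continue;
        }
        const fields = line.slice(1, tab).split(' ');
        if (fields[0] === GITLINK_MODE || fields[1] === GITLINK_MODE) {
            paths.push(line.slice(tab + 1));
        }
    }
    return paths;
}
```

Each returned path could then be un-staged, for example with `git restore --staged -- <path>`, before the commit is allowed to proceed.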
2024-10-12 03:40:31 -07:00
Daniel Lockyer
e2d0c2f138 Remove docker-compose version
- this is apparently not needed anymore
2024-10-08 14:34:11 +01:00
Steve Larson
1bfe788689
Added translation gitmoji (#21226)
no ref
2024-10-04 16:28:31 +00:00
Chris Raible
8b26b52513
Added prometheus and grafana services to docker compose (#21213)
ref
https://linear.app/tryghost/issue/ENG-1591/add-prometheus-and-grafana-services-to-docker-compose

This commit adds 2 new services to the docker compose file to enable
monitoring metrics from Ghost locally in real-time:
1. Prometheus - a service that scrapes Ghost's new `/metrics` endpoint
introduced in this
[commit](768336efad).
2. Grafana - a service that consumes the metrics from prometheus and
exposes them in a dashboard that you can view locally at
`localhost:3000`.

# Usage
Both of these services are selectively enabled using docker compose
[profiles](https://docs.docker.com/compose/how-tos/profiles/). This way,
if you don't opt-in to using these monitoring tools, they won't start
and consume resources on your host machine. To enable these services,
enable the `monitoring` profile by either setting the `COMPOSE_PROFILES`
environment variable to `monitoring`, or specifying the `--profile
monitoring` CLI argument to any `docker compose ...` commands.

I've found the easiest way to configure this in an 'always on' fashion
is to create a `.env` file in the project's root directory and add
`COMPOSE_PROFILES=monitoring` to it. As an added convenience, you can
also set `COMPOSE_FILE=.github/scripts/docker-compose.yml`, which will
allow you to run `docker compose ...` commands from the root directory
without specifying the full path each time.

# Intended for development only
These services are meant for local development only, and are not
configured for a production use-case. For example, the Grafana instance
is configured to have _no authorization_ so you won't need a
username/password to login at `localhost:3000`. Prometheus is also
configured to scrape the metrics once every second, which is likely
excessive for production use-cases, but may be useful for getting more
granular metrics while e.g. load testing locally.

# Dashboards
The Grafana instance includes a default dashboard including most of the
main default metrics provided by our prometheus client integration. The
dashboard is defined in a JSON file at
`.github/scripts/docker/grafana/dashboards/main-dashboard.json` and can
be modified & committed to add new visualizations that will be available
to anyone working on Ghost locally. You can also add other dashboards to
the same directory for specific use-cases, which should be picked up and
made available in the Grafana UI. [Read
more](https://grafana.com/docs/grafana/latest/dashboards/build-dashboards/view-dashboard-json-model/)
about Grafana's JSON schema for dashboards.
2024-10-03 14:43:07 -07:00
Hannah Wolfe
269ed3891d
Configured i18n tests to run for sodo-search (#21199)
ref https://github.com/TryGhost/Ghost/pull/21055

- Now that sodo-search has i18n, we should run i18n tests when this
package changes as well
2024-10-03 07:56:02 -05:00
Daniel Lockyer
c58cbe4fb9 Bumped CI fetch-depth to 1000
refs https://ghost.slack.com/archives/C02G9E68C/p1727704490753759

- if you open a PR and it becomes outdated enough such that the base
  commit was 100 commits ago, the workflow starts to fail
- to help prevent this, we can increase it to 1000, which should more
  than cover enough use-cases but still keep checkout quick
2024-09-30 16:11:47 +02:00
Daniel Lockyer
607dee288b Cleaned up branch triggers
- we don't use `arch` anymore, and `2.x` and `3.x` are ollldddddd, so
  we're not going to run CI on them
2024-09-30 09:38:28 +02:00
Daniel Lockyer
d86f94db2d Fixed looking up users with special username characters
- users like `renovate[bot]` have brackets in the username
- this breaks the command and it exits with `exit code 3.`
- to fix this, we can encode the username before passing it in
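
A minimal sketch of the encoding step, assuming the username ends up inside a URL. The base URL and path here are made up for illustration; only the use of percent-encoding reflects the fix described above.

```typescript
// Builds a user-lookup URL with the username percent-encoded, so that
// characters such as `[` and `]` cannot be misread by the command.
// The URL layout is a hypothetical example.
function userLookupUrl(baseUrl: string, username: string): string {
    return baseUrl + '/users/' + encodeURIComponent(username);
}
```

For example, `userLookupUrl('https://api.example.com', 'renovate[bot]')` yields `https://api.example.com/users/renovate%5Bbot%5D`.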
2024-09-30 09:38:28 +02:00
Daniel Lockyer
75afbb4f2a Added missing if-statement to CI workflow
- without this, we're constantly purging the cache, which exceeds the
  rate limit afforded to us from jsdelivr
2024-09-26 19:51:07 +02:00
Fabien O'Carroll
5f637af3cf Added workflow for deploying @tryghost/admin-x-activitypub
ref https://linear.app/tryghost/issue/AP-438

This will build and release the Admin X activitypub app when we bump the
package.json version and push to `main`
2024-09-26 23:24:26 +07:00
Daniel Lockyer
f19c01a11f Added workflow changes to support PR deploys to staging
ref https://linear.app/tryghost/issue/DEV-31/staging-deploys-of-feature-branchesprs

- we want the ability to ship a PR to staging, so we can test and QA
  without merging to `main`
- most of the infrastructure is already in place for this, so it's
  mostly a case of wiring it all up
- this commit will send a slightly different payload to the build
  process, to indicate it's coming from a PR
- I've also added a check that the user is a member of the org, so we
  don't get random builds from non-members
- to trigger this, we should be able to add the `deploy-to-staging`
  label and it Just Works :TM:
2024-09-26 15:38:35 +02:00
Daniel Lockyer
eebd198027
Fixed requiring passing tests for canary builds
- this disappeared due to a regression in a previous commit
2024-09-25 10:29:19 +02:00
Daniel Lockyer
5a72c5ad91 Updated Nx to v19
refs https://github.com/nrwl/nx/releases/tag/19.8.0

- this commit updates Nx to v19
- we need to add some extra commands to the dev script to stop and
  restart the Nx daemon, so it's ready and running before we execute a
  bunch of Nx commands concurrently
- this also updates nx.json to the format needed for the latest version
2024-09-25 10:16:08 +02:00
Daniel Lockyer
0854f8a531 Exported Git commit hash from update script
ref https://linear.app/tryghost/issue/DEV-25/move-version-bumping-logic-into-ghost-repo

- this allows us to re-use the value from outside the script in CI
2024-09-24 14:08:38 +02:00
Daniel Lockyer
ca691e99e8 Fixed branch name in canary workflow
- `github.ref_name` is a more reliable way to find the branch name than
  we were previously doing
2024-09-24 10:46:44 +02:00
Daniel Lockyer
24447f438e Passed along branch name to canary job
ref https://linear.app/tryghost/issue/DEV-20/faster-builds

- we should pass along the branch name in the metadata field so it's
  more DRY for canary builds in the future
2024-09-24 10:17:22 +02:00
Daniel Lockyer
ae8f8f128b Added version bumping script to repo
ref https://linear.app/tryghost/issue/DEV-25/move-version-bumping-logic-into-ghost-repo

- we're slowly migrating our build code into the OSS repo, which means
  we need to move scripts over
- we have this as a bash script, but I've rewritten it to JS so it's a
  little more maintainable
- this script will just bump the version in the package.json files and
  set the GHA output
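
A rough sketch of the two steps, under assumptions: the real script is not shown here, so the bump rule (incrementing the patch number) and both function names are hypothetical. The `name=value` line format is the one GitHub Actions reads from the file named by `GITHUB_OUTPUT`.

```typescript
// Increments the patch component of a "major.minor.patch" version.
// Assumption: the real script may bump versions differently.
function bumpPatch(version: string): string {
    const match = /^(\d+)\.(\d+)\.(\d+)$/.exec(version);
    if (match === null) {
        throw new Error('expected major.minor.patch, got ' + version);
    }
    return match[1] + '.' + match[2] + '.' + (Number(match[3]) + 1);
}

// Formats one step output as a `name=value` line, ready to be
// appended to the file that the GITHUB_OUTPUT variable points at.
function outputLine(name: string, value: string): string {
    return name + '=' + value + '\n';
}
```

Under these assumptions, `bumpPatch('5.100.0')` gives `5.100.1`, and `outputLine('version', '5.100.1')` gives the line `version=5.100.1`.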
2024-09-24 09:11:54 +02:00
Daniel Lockyer
9093ffbf98 Fixed browser tests incorrectly executing Stripe CLI
- we shouldn't try and load the Stripe CLI via the dev script because
  it's done in the browser tests and involves more setup than the dev
  script contains
- this cuts 2mins from the browser tests because they're no longer
  waiting for the Stripe CLI to be auth'd
2024-09-23 17:43:53 +02:00
Daniel Lockyer
7c346c28eb Enabled Nx caching on main
- we should be able to trust Nx enough that we can sustain the build
  cache across commits, which will speed up the workflow because we
  don't need to rebuild our TS projects all the time
2024-09-23 15:19:31 +02:00
Daniel Lockyer
5791be4937 Merged setup steps in CI
- we don't need these to be separate steps and having them separate
  actually makes CI slower because it takes ~10-12s for GitHub to start
  new jobs
2024-09-23 14:48:17 +02:00
Chris Raible
b90aca2816
Removed jaeger container from docker compose (#20994)
no issue

- OpenTelemetry has been problematic in a number of ways (boot time,
breaking the frontend). May revisit it at some point in the future, but
for now it is only exporting metrics via prometheus and not traces, so
there's currently nothing sending data to this jaeger container
- Cleaning it up for now as it's just sitting there idly consuming
resources
2024-09-12 10:37:54 -07:00
Chris Raible
2a0d49c539
Added MySQL data volume to docker compose (#20982)
no issue

- This allows us to run `docker-compose down` or to restart docker
desktop without losing all our local databases
- Added a data volume to the MySQL service in the `docker-compose.yml`
file to persist the data between container restarts
- The `yarn docker:reset` command will still reset all the data in the
database since it uses `down -v` to remove the volumes as well
2024-09-12 09:38:24 -07:00
Sam Lord
625c89e37f
Added the ability to run browser tests using local Portal (#20990)
ref DOGM-32

Using the dev script as a template, this script runs the tests with
local copies of the applications needed instead of the released CDN
versions
2024-09-12 15:55:47 +01:00
Daniel Lockyer
77df6186f0 Removed auto-labeller for PRs
- this was an early attempt to group PRs together by labels, so we can
  triage PRs more easily, but it's not finished and is actually producing more
  noise than signal
- we might want to re-add this in the future, but for now, silence 🧘
2024-08-29 11:06:00 +02:00
Hannah Wolfe
2720791434
Updated CI workflow to run on PR label/unlabel
- We have browser tests which only run if the browser tests flag is added to the PR
- The label has to be present on PR creation, which is hard to remember/doesn't fit with various workflows
- The default activity types for the pull_request trigger are opened, synchronize and reopened
- This PR adds labeled and unlabeled to those, which I think will help us to run the tests as expected
- The expectation is that adding the browser test label will now trigger the tests to run
2024-08-29 09:24:37 +01:00
Princi Vershwal
f984fbd47e
🎨 Improved the performance of the /members/events/ aggregated_click_event endpoint (#20790)
Ref https://linear.app/tryghost/issue/ONC-216/improve-the-performance-of-the-membersevents-aggregated-click-event
2024-08-22 18:26:10 +05:30
Daniel Lockyer
0f3805e096 Changed color of adminX prefix for yarn dev
- red makes it look like an error, which is very misleading
- I've changed this to a random purple I found
- credits to @vershwal and @dvdwinden
2024-08-20 12:35:24 +02:00
Chris Raible
f147167a29
Added SQLite and MySQL check to migration review checklist (#20708)
no issue

- knex can behave differently with SQLite and MySQL, which can cause
migrations to behave differently in each database. This PR adds a check
to the migration review checklist to remind us to test the migration in
both databases before merging.
2024-08-01 13:38:59 -07:00
renovate[bot]
b754513658 Update jaegertracing/all-in-one Docker tag to v1.58 2024-06-21 09:38:38 +01:00
Chris Raible
417c9c49ea
Added OpenTelemetry instrumentation to Ghost backend (#20144)
This commit adds OpenTelemetry instrumentation to Ghost's backend, which
allows us to view traces similar to what we see in Sentry Performance
locally.

OpenTelemetry is enabled if `NODE_ENV === 'development'` or if it is
explicitly enabled via config with `opentelemetry:enabled`.

It also adds a [Jaeger](https://www.jaegertracing.io/) container to
Ghost's docker-compose file for viewing the traces. There's no setup
required (beyond running `yarn docker:reset` to pickup the changes in
the docker-compose file the first time — but this will also reset your
DB so be careful). This will launch the Jaeger container, and you can
view the UI to see the traces at `http://localhost:16686/search`.
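
The enablement rule described above can be sketched as a single predicate. The function name and parameters are hypothetical; only the rule itself (development environment, or an explicit `opentelemetry:enabled` setting) comes from this commit message.

```typescript
// Returns true when instrumentation should be switched on: either the
// process runs in development, or the config enables it explicitly.
// Hypothetical helper for illustration.
function isOpenTelemetryEnabled(
    nodeEnv: string | undefined,
    configEnabled: boolean | undefined
): boolean {
    return nodeEnv === 'development' || configEnabled === true;
}
```

So a production process stays uninstrumented unless `opentelemetry:enabled` is set to true.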
2024-06-19 13:56:51 -07:00
renovate[bot]
54bd9a1ab4 Update benchmark-action/github-action-benchmark action to v1.20.3 2024-05-20 10:39:02 +01:00
renovate[bot]
15569145f3 Update benchmark-action/github-action-benchmark action to v1.20.1 2024-05-09 10:23:48 +02:00
Michael Barrett
af92297ca9
Added Redis via Docker (#20085)
no refs

Redis can be utilised for various caching purposes within Ghost. This PR
adds a Redis service to the docker-compose file to allow for easier
local development when Redis is required
2024-05-07 11:02:36 +01:00
Joe Grigg
c744740761 Updated canary build CI to use the main Moya build pipeline
ref ENG-807
2024-05-01 13:22:01 +01:00
Djordje Vlaisavljevic
7a3bbfde10
Added ActivityPub playground (#20081)
ref MOM61

- Adds admin-x react app we’ll use as ActivityPub playground to the
sidebar nav behind the feature flag.
- Wired up routing to Ember
- Setup the project as `admin-x-activitypub`

---------

Co-authored-by: Ronald Langeveld <hi@ronaldlangeveld.com>
2024-04-25 16:44:29 +08:00
Daniel Lockyer
10e81aeed8
ℹ️ Added support for Node 20
ref https://linear.app/tryghost/issue/ENG-765/add-support-for-node-20

- this adds support for Node 20 to Ghost and CI, as Node 20 is an LTS
  version and we should pick it up
2024-04-18 13:17:21 +02:00
renovate[bot]
e8ea2e4db0 Update GitHub Artifact Actions to v4 2024-04-18 12:36:39 +02:00
renovate[bot]
4ab31122a4 Update actions/cache action to v4 2024-04-18 12:36:07 +02:00
renovate[bot]
2d6a361bb5 Update dorny/paths-filter action to v2.12.0 2024-04-16 09:44:01 +02:00