ref
https://linear.app/ghost/issue/ENG-1592/start-monitoring-connection-pool-utilization-in-ghost
- This commit adds prometheus metrics to the connection pool so we can
start to track connection pool utilization, number of pending acquires,
and also adds some basic SQL query summary metrics like queries per
minute and query duration percentiles.
- The connection pool has now been theorized to be a main constraint of
Ghost for some time, but it's been challenging to get actual visibility
into the state of the connection pool. With this change, we should be
able to directly observe, monitor and alert on the connection pool.
- Updated grafana version to fix a bug in the query editor that was
fixed in 8.3, even though this is a couple versions ahead of production
no issue
- We had reintroduced nodemon in
af0f26c75f (diff-bf18f8caf848e17b35e266db04bcaeaad05a3e5d069846615d2b1260482396e1)
for the docker setup, but it has since caused some issues with the `yarn
dev` script.
- In particular, it was causing a restart while migrations were running
in development, which left a migration lock on and prevented Ghost from
starting.
- This commit removes nodemon and replaces it with node --watch, which
we had been using in the past without issues.
refs https://linear.app/ghost/issue/AP-500
We've got a new @tryghost/activitypub package, which is gonna handle all
of the wiring between Ghost and ActivityPub. Currently that is just the
configuration of webhooks for the internal ActivityPub integration.
All this logic is run on the boot of Ghost, though notably in a
non-blocking manner, it's initialised as part of the background services
so it should not have an effect on the time to serving requests. having
said that - it needs to be defensive against errors, which is why the
entire network request is in a try/catch, as well as lookups for the
integration able to handle missing data.
Unit tests use an in-memory sqlite instance, which means we're testing a
full flow, ideally there would be a way to get the schema from Ghost for
this, but for now this is acceptable IMO.
refs https://linear.app/ghost/issue/AP-500
The logic for generating identity tokens, whilst small, is something
that we don't want to duplicate - as it concerns security & access - so
can easily break interactions between different services. We're gonna
need to use identity tokens as part of the initialisation of the
activitypub service, so this is pulling it out preemptively for that use
case
We shouldn't have logic inside of the endpoint controllers anyway, so
this is kinda general cleanup.
ref
https://linear.app/ghost/issue/ENG-1746/enable-ghost-to-push-metrics-to-a-pushgateway
- Trying to get Ghost working with the prometheus pushgateway in
staging, but it's logging an error each time it tries to push the
metrics. The error output is pretty useless for debugging, so this
commit improves the error messages to make it easier to debug.
ref
https://linear.app/ghost/issue/ENG-1746/enable-ghost-to-push-metrics-to-a-pushgateway
- We'd like to use prometheus to expose metrics from Ghost, but the
"standard" approach of having prometheus scrape the `/metrics` endpoint
adds some complexity and additional challenges on Pro.
- A suggested simpler alternative is to use a pushgateway, to have Ghost
_push_ metrics to prometheus, rather than have prometheus scrape the
running instances.
- This PR introduces this functionality behind a configuration.
- It also includes a refactor to the current metrics-server
implementation so all the related code for prometheus is colocated, and
the configuration is a bit more organized. `@tryghost/metrics-server`
has been renamed to `@tryghost/prometheus-metrics`, and it now includes
the metrics server and prometheus-client code itself (including the
pushgateway code)
- To enable the prometheus client alone, `prometheus:enabled` must be
true. This will _not_ enable the metrics server or the pushgateway — it
will essentially collect the metrics, but not do anything with them.
- To enable the metrics server, set `prometheus:metrics_server:enabled`
to true. You can also configure the host and port that the metrics
server should export the `/metrics` endpoint on in the
`prometheus:metrics_server` block.
- To enable the pushgateway, set `prometheus:pushgateway:enabled` to
true. You can also configure the pushgateway's `url`, the `interval` it
should push metrics in (in milliseconds) and the `jobName` in the
`prometheus:pushgateway` block.
ref https://linear.app/tryghost/issue/ENG-1556/
- added background job queue behind config flags
- when enabled, is only used for the member email analytics updates in
order to speed up the parent job, and take load off of the main process
that is serving requests
The intent here is to decouple certain code paths from the main process where it is unnecessary, or worse, where it's part of the request. Primary use cases are email analytics (particularly the member stats [open rate]) which are not particularly helpful in the period immediately following an email send, while the click traffic and delivered/opened events are.
Related, the email link clicks themselves send off a cascade of events that are quite a burden on the main process currently and are somewhat tied to the request response when they needn't be. We'll be looking to tackle that after some initial testing with the email analytics job.
no issue
- Dev Containers let you work on Ghost in a consistent, isolated
environment with all the necessary development dependencies
pre-installed. VSCode (or Cursor) can effectively run _inside_ the
container, providing a local quality development environment while
working in a well-defined, isolated environment.
- For now the default setup only works with "Clone repository in
Container Volume" or "Clone PR in Container Volume" — this allows for a
super quick and simple setup. We can also introduce another
configuration to allow opening an existing local checkout in a Dev
Container, but that's not quite ready yet.
- This PR also added the `yarn clean:hard` command which: deletes all
node_modules, cleans the yarn cache, and cleans the NX cache. This will
be necessary for opening a local checkout in a Dev Container.
- To learn more about Dev Containers, read this guide from VSCode:
https://code.visualstudio.com/docs/devcontainers/containers#_personalizing-with-dotfile-repositories
---------
Co-authored-by: Joe Grigg <joe@ghost.org>
Co-authored-by: Steve Larson <9larsons@gmail.com>
no issue
- bumped gscan version to provide `skipChecks` flag to `check` function
- added `optimization:themes:skipBootChecks` config flag defaulting to `false` to maintain current behaviour
- updated theme service initialization to use `skipChecks: true` when the config flag is set
- we only want to skip the checks during boot in specific cases to improve performance, they are still useful for general development and any production use-cases where themes get edited directly on the server
- updated our theme validate module to accept and pass through `skipChecks` option
- switched the `isZip` positional argument of `validate.check()` to an options object property to make usage cleaner
- we don't need to require the entire package and this costs 5% of our
boot time
- this commit bumps NQL to the latest version, which fixes the requires
to help with treeshaking and loading less code
This PR contains the following updates:
| Package | Type | Update | Change |
|---|---|---|---|
|
[parse-prometheus-text-format](https://redirect.github.com/yunyu/parse-prometheus-text-format)
| devDependencies | pin | [`^1.1.1` ->
`1.1.1`](https://renovatebot.com/diffs/npm/parse-prometheus-text-format/1.1.1/1.1.1)
|
Add the preset `:preserveSemverRanges` to your config if you don't want
to pin your dependencies.
---
### Configuration
📅 **Schedule**: Branch creation - "every weekday" (UTC), Automerge - At
any time (no schedule defined).
🚦 **Automerge**: Disabled by config. Please merge this manually once you
are satisfied.
♻ **Rebasing**: Whenever PR is behind base branch, or you tick the
rebase/retry checkbox.
🔕 **Ignore**: Close this PR and you won't be reminded about this update
again.
---
- [ ] <!-- rebase-check -->If you want to rebase/retry this PR, check
this box
---
This PR was generated by [Mend Renovate](https://mend.io/renovate/).
View the [repository job
log](https://developer.mend.io/github/TryGhost/Ghost).
<!--renovate-debug:eyJjcmVhdGVkSW5WZXIiOiIzOC45Ny4wIiwidXBkYXRlZEluVmVyIjoiMzguOTcuMCIsInRhcmdldEJyYW5jaCI6Im1haW4iLCJsYWJlbHMiOltdfQ==-->
Co-authored-by: renovate[bot] <29139614+renovate[bot]@users.noreply.github.com>
ref
https://linear.app/tryghost/issue/ENG-1505/add-prometheus-metrics-server-to-allow-monitoring-ghost-metrics
# Summary
This commit includes two main components: a prometheus client class to
collect metrics from Ghost, and a standalone metrics server that exposes
a /metrics endpoint at a separate port (9416 by default) from the main
Ghost app.
The prometheus client is a very thin wrapper around
[prom-client](https://github.com/siimon/prom-client). We could use
prom-client directly, but this approach should make it easier to switch
to a different prometheus client package (or make our own) if we ever
need to down the line.
The list of default metrics this enables is specified in an e2e test
[here](https://github.com/TryGhost/Ghost/pull/21192/files#diff-ebc52236be2cd14b40be89220ae961f48d3f837693f7d1da76db292348915941R66-R92).
This also gives us the ability to create and collect custom metrics,
although none are included in this commit yet.
# Configuration
The prometheus client and the metrics server are both disabled by
default, but can be enabled by setting the metrics_server:enabled flag
to true.
You can also define a custom host and port using `metrics_server:host`
and `metrics_server:port`.
## Why not expose the /metrics endpoint in one of the existing express
apps?
The standalone express app exists for two main reasons:
1. We don't want these metrics to be public, and the easiest way to
accomplish that is to expose the /metrics endpoint at a different port
that won't be exposed to the internet.
2. Creating a standalone express instance decouples the metrics endpoint
from the Ghost server, so if Ghost is not responding for whatever
reason, we should still be able to scrape metrics to understand what's
going on internally.
## Impact on Boot & Shut down time
The prometheus client is initialized early in the boot process so we can
collect metrics during the boot sequence. Testing locally has shown that
this increases boot time by ~20ms. The metrics server which exposes the
/metrics endpoint is not initialized until after the background
services, and it is not awaited, to avoid impacting boot time. None of
this code, including the requires, will run if the
metrics_server:enabled flag is set to false (or not set).
Shutting down the metrics server is added as a cleanup task for the main
Ghost server instance, and is setup to shut down with 0 grace period to
avoid impacting shut down time.