ref
https://linear.app/tryghost/issue/KTLO-45/deploy-members-caching-solution-to-a-single-site-to-validate-and-test
Currently we only cache publicly available content. Any content that is
accessed by a logged in member is only cached for that specific member
based on their cookie. As a result, almost all requests from logged in
members bypass our caching layer and reach Ghost, which adds unnecessary
load to Ghost and its database.
This change adds experimental headers that allow our CDN to understand
which tier to cache the content against, and securely tell the CDN which
tier a logged in member has access to. With these changes, we can cache
the member content against the tier, rather than the individual member,
which should result in a higher cache HIT ratio and reduce the load on
Ghost.
For requests to the frontend of the site, Ghost will set a custom
`X-Member-Cache-Tier` header to the ID of the tier of the member who is
accessing the content. This tells the CDN which tier to cache the
content against.
For requests to either `/members/?token=...` endpoint (the magic link
endpoint) or `/members/api/member`, Ghost will set a `ghost-access` and
`ghost-access-hmac` cookie with the ID of the tier of the logged in
member. With these two pieces of information, our CDN can serve cached
content to logged in members.
These headers are experimental, and can only be enabled via Ghost's
config. To enable these headers, set `cacheMembersContent:enabled` to
`true` and provide an HMAC key in `cacheMembersContent:hmacSecret`.
ref https://linear.app/tryghost/issue/CFR-29
- Removed the mobiledoc and lexical columns from the posts input
serializer, meaning they will no longer be queried for.
Get helpers are essentially a gateway to the Content API. We already
strip out the mobiledoc and lexical fields in the output
serializer/returned response, but this means we're passing the mobiledoc
and lexical fields back from the db. This is pointless and these fields
are substantial in size - by far the largest fields in the whole ghost
db - leading to slowed performance.
I've updated the posts input serializer to strip out the lexical and mobiledoc
columns so we stop doing a `select *` with every query.
ref ENG-824
- the bug is causing resize prefixes being added to images served from
outside of Ghost.
- this now would only append the prefex to images served by Ghost and
other images urls' would get served as is.
- we can determine that by checking whether imageName doesn't exist,
meaning the source is a third party.
- this mostly affect edge case users, eg where a feature image url was
passed in via the API and doesn't get served by Ghost.
fix https://linear.app/tryghost/issue/SLO-87/cannot-read-properties-of-undefined-reading-createimpl-an-unexpected
refs https://github.com/jsdom/jsdom/issues/3709
- in the event we are given some HTML to parse, and that fails, we
currently return a HTTP 500 because it's unhandled
- the instance we saw was due to `<constructor>` crashing jsdom, we've
opened an issue for that
- in terms of handling the error gracefully, we can surround the code
in a try-catch and return a more suitable error. I've gone for a
ValidationError for now - you could debate whether a different one is
more appropriate
- also added Sentry error capturing so we're not blind to these,
ultimately we should make sure the parser can handle all
user-submitted data
refs
[ENG-827](https://linear.app/tryghost/issue/ENG-827/🐛-crash-on-resizing-animated-gif)
Added a timeout to the image resizing middleware to prevent crashes when
an image is taking too long to resize. When the timeout is reached and
the image has not been resized, the middleware will return the original
image
ref
https://linear.app/tryghost/issue/ENG-851/implement-a-minimal-but-complete-version-of-redirect-caching-to
ref https://app.incident.io/ghost/incidents/55
Often immediately after sending an email, sites receive a large volume
of requests to LinkRedirect endpoints from members clicking on the links in
the email.
We currently don't cache any of these requests in our CDN, because we
also record click events, update the member's `last_seen_at` timestamp,
and send webhooks in response to these clicks, so Ghost needs to handle
each of these requests itself. This means that each of these LinkRedirect requests
hits Ghost, and currently all these requests hit the database to lookup
where to redirect the member to.
Each one of these requests can make up to 11 database queries, which can
quickly exhaust Ghost's database connection pool. Even though the
LinkRedirect lookup query is fairly cheap and quick, these queries aren't
prioritized over the "record" queries Ghost needs to handle, so they can
get stuck behind other queries in the queue and eventually timeout.
The result is that members are unable to actually reach the destination
of the link they clicked on, instead receiving a 500 error in Ghost, or
it can take a long time (60s+) for the redirect to happen.
This PR uses our existing `adapterManager` to cache the redirect lookups
either in-memory or in Redis (if configured — by default there is no caching). This only removes 1 out of
11 queries per redirect request, so it won't reduce the load on the DB
drastically, but it at least decouples the serving of the LinkRedirect from
the DB so the member can be redirected even if the DB is under heavy
load.
Local load testing results have shown a decrease in response times from
60 seconds to ~50ms for the redirect requests when handling 500 requests
per second, and reduced the 500 error rate to 0.
refs f39d1d3aa3
- similar to the commit above, the JSON parser changed between Node 18
and Node 20, so the error message changed too
- we actually just want to check the error is forwarded to the user, so
we can do that by getting the error message from JSON.parse and check
against that
ref ENG-774
ref https://linear.app/tryghost/issue/ENG-774
Staff Tokens will have both a `user` and an `apiKey` present on the
`loadedPermissions`.
The check here for `apiKey` was written when we could assume that an
`apiKey` was an Admin Integration - so it completely overwrote the
previous `allowed` list. When we added the concept of Staff Tokens -
this resulted in a privilege escalation.
This is a good lesson in not using proxies or indicators for data, as
changes elsewhere can invalidate them - if we had been specific and
checked the role of the current actor we wouldn't've had this bug!
ref https://linear.app/tryghost/issue/CFR-4/
- added request queueing middleware (express-queue) to handle high
request volume
- added new config option `optimization.requestQueue`
- added new config option `optimization.requestConcurrency`
- added logging of request queue depth - `req.queueDepth`
We've done a fair amount of investigation around improving Ghost's
resiliency to high request volume. While we believe this to be partly
due to database connection contention, it also seems Ghost gets
overwhelmed by the requests themselves. Implementing a simple queueing
system allows us a simple lever to change the volume of requests Ghost
is actually ingesting at any given time and gives us options besides
simply increasing database connection pool size.
---------
Co-authored-by: Michael Barrett <mike@ghost.org>
ref ENG-728
ref https://linear.app/tryghost/issue/ENG-728
This is not used anywhere, and makes the code more complicated, it's a good
step toward simplifying permissions and pulling them out of the database.
ref ENG-728
ref https://linear.app/tryghost/issue/ENG-728
This is NOT a functionality change. The Post#permissible method unit
tests have been updated to pass `true` as `hasUserPermission` and we can
see that the permission functionality remains the same.
The permissible method of the post model is responsible for removing
permission based on the data that is being modified, but the permissions
module is setup to allow the permissible method to grant permission -
this means that we call permissible, even if the current actor doesn't
have permission, this results in code that is hard to understand and
manage.
We are going to be instead returning early if an actor does not have
permission, this will allow permissible method signatures to be greatly
simplified (removing the need for hasUserPermission, hasApiKeyPermission
& hasMemberPermission arguments).
fixes https://linear.app/tryghost/issue/ENG-746/http-500-responses-when-handle-image-sizes-middleware-hits-missing
- in the event a request comes in for a resized image, but the source
image does not exist, we return a rendered 404 page
- we do this because we pass the NotFoundError to `next`, which skips
over the static asset code where we return a plaintext 404
- also included a breaking test that ensure we go to the next middleware
without an error
ref [ENG-747](https://linear.app/tryghost/issue/ENG-747/)
ref https://linear.app/tryghost/issue/ENG-747
H'okay - so what we're trying to do here is make get helper queries more
cacheable. The way we're doing that is by modifying the filter used when
we're trying to remove a single post from the query.
The idea is that we can remove that restriction on the filter, increase
the number of posts fetched by 1 and then filter the fetched posts back
down, this means that the same query, but filtering different posts,
will be updated to make _exactly_ the same query, and so share a cache!
We've been purposefully restrictive in the types of filters we
manipulate, so that we only deal with the simplest cases and the code is
easier to understand.
closes ENG-632
- This listens to a new property in the `milestones` config to set a minimum value of Milestones we wanna use the Slack notification service for
refs https://linear.app/tryghost/issue/ENG-670
We keep running into issues with a sites content not being correct,
and slow get helpers being the suspect - but it's difficult to prove.
The idea behind this it to give us concrete evidence, which will allow
us to diagnose the problem faster.
refs https://linear.app/tryghost/issue/ENG-600
- users need an option so they can perform actions like delete users
without blowing up Ghost as large dbs can OOM node
ref https://github.com/TryGhost/Ghost/issues/12802
fixes DMA-27
- You can choose any support and newsletter email address in the UI
without verification (as long as your SMTP-server / Mailgun can send
from it)
- All emails will use the mail.from config as the from address as a
default:
- Staff notification emails no longer use the made up ghost@domain email
address
- Newsletters no longer default to 'noreply@domain'
- Member related emails (signin/signup/comment notifications...) will
continue to be send from the chosen support address (Portal settings →
Account page), but will now default to the mail.from config instead of
noreply@domain if no support address is set.
no issue
- Renaming the configuration parameter created in this commit:
e0dae46dfc
- No functional difference, this change just makes the configuration a
bit more succinct
no issue
- To help debug potential causes of slow/aborted get helpers, it would
be cool to get more visibility into how Ghost handles database
connections, particularly if it has to spend a long time waiting to
acquire a new connection from the pool.
- Under the hood, knex uses a package called tarn
(https://github.com/Vincit/tarn.js/tree/3.0.2) to manage the connection
pool. Tarn provides some hooks for instrumentation, so we can use those
to get some basic visibility into the connection pool.
- This PR adds handling for creating, acquiring and releasing
connections from Tarn's connection pool which logs some basic metrics,
particularly the queue length and time it takes to acquire a connection.
no issue
- To help debug ABORTED_GET_HELPER errors, this PR adds Sentry
instrumentation to the get helpers
- It also adds the homepage, any pages/posts, the tag page, and the
author page to the list of transactions that will send to Sentry
fixes PROD-102
When a newsletter has a sender_email stored in the database that Ghost
is not allowed to send from, we no longer return it as sender_email in
the API. Instead we return it as the sender_reply_to. That way the
expected behaviour is shown correctly in the frontend and the API result
also makes more sense.
In addition to that, when a change is made to a newsletters reply_to
address we'll clear any invalid sender_email values in that newsletter.
That makes sure we can clear the sender_reply_to value instead of
keeping the current fallback to sender_email if that one is stored.
On top of that, this change correclty updates the browse endpoint to use
the newsletter service instead of directly using the model.
refs https://linear.app/tryghost/issue/BIZ-6/[wip]-update-segment-events
- With the removal of the `integration.added` event, we have no more model events remaining to listen to for our analytics
- Removal of the function entirely seems the easier and more straightforward way
refs https://linear.app/tryghost/issue/BIZ-6/[wip]-update-segment-events
- Removed model events to listen to: `post.published`, `page.published`, and `theme.uploaded` in segment service, as we're not actively using those.
- Updated tests to reflect the changes (from 4 events to 1 model event)
refs
[ARCH-33](https://linear.app/tryghost/issue/ARCH-33/fix-sentry-environment)
To ensure that we are correctly identifying the environment that data is
being sent to Sentry from, we can use the `PRO_ENV` environment variable
if it is available. This will be set to `production` in production and
`staging` in staging. If `PRO_ENV` is not available, we will fall back
to retrieving the environment from config (`env`)
fixes PROD-61
This adds a new default plan setting. It defaults to yearly, which is
the current default selected interval in Portal.
Behind the new portal improvements feature flag, the default plan can be
changed. It will also change automatically if the available intervals
are changed.
This PR also wires up passing the new setting to the Portal preview.
fixes GRO-71
- Current flow: unchanged
- New managed flow: verification required
- New managed flow with custom sending domain: only verification
required for different domains
- Self hosters (feature flag): no verification required
refs https://github.com/TryGhost/Product/issues/4181
We were seeing slow queries when joining on this table, and the index
speeds them up. The down migration is tricky because when we add the
index MySQL can optimise away some `KEY` indexes on the `newsletter_id`
column. When we then go to remove the newly created index, there is no
index for the FK!
We also remove the use of `force index` as 1. the index we're forcing is
optimised away and 2. we don't need it anymore!
Co-authored-by: Daniel Lockyer <hi@daniellockyer.com>
ref GRO-54
fixes GRO-63
fixes GRO-62
fixes GRO-69
When the config `hostSettings:managedEmail:enabled` is enabled, or the
new flag (`newEmailAddresses`) is enabled for self-hosters, we'll start
to check the from addresses of all outgoing emails more strictly.
- Current flow: nothing changes if the managedEmail config is not set or
the `newEmailAddresses` feature flag is not set
- When managedEmail is enabled: never allow to send an email from any
chosen email. We always use `mail.from` for all outgoing emails. Custom
addresses should be set as replyTo instead. Changing the newsletter
sender_email is not allowed anymore (and ignored if it is set).
- When managedEmail is enabled with a custom sending domain: if a from
address doesn't match the sending domain, we'll default to mail.from and
use the original as a replyTo if appropriate and only when no other
replyTo was set. A newsletter sender email addresss can only be set to
an email address on this domain.
- When `newEmailAddresses` is enabled: self hosters are free to set all
email addresses to whatever they want, without verification. In addition
to that, we stop making up our own email addresses and send from
`mail.from` by default instead of generating a `noreply`+ `@` +
`sitedomain.com` address
A more in depth example of all cases can be seen in
`ghost/core/test/integration/services/email-addresses.test.js`
Includes lots of new E2E tests for most new situations. Apart from that,
all email snapshots are changed because the from and replyTo addresses
are now included in snapshots (so we can see unexpected changes in the
future).
Dropped test coverage requirement, because tests were failing coverage
locally, but not in CI
Fixed settings test that set the site title to an array - bug tracked in
GRO-68
refs TryGhost/Product#4175
- Added error handling to Sentry's beforeSend function in both Admin and
Core, so if there is any error in beforeSend, we will still send the
unmodified event to Sentry
- This is in response to an incident yesterday wherein the beforeSend
function threw an error due to an unexpected missing value in the
exception. The event sent to Sentry was the error in the beforeSend
function, and the original error never reached Sentry.
- If the original event had reached Sentry, even if unmodified by the
logic in beforeSend, we could have been alerted to the issue sooner and
more easily identified all affected sites.
- Also added defensive logic to protect for certain values in the
exception passed to beforeSend not existing and added unit tests for the
beforeSend function in admin and core
refs https://github.com/TryGhost/Product/issues/4140
- added `social-image` image size to our `internalImagesSizes` list with a max-width of 1200
- extracted image utils from `{{img_url}}` helper to a utils file for re-use
- updated `getImageDimensions` method that reads image dimensions and modifies the finalised `metaData` object before use to adjust dimensions and associated URLs to match max width of 1200px
fixes https://github.com/TryGhost/Product/issues/3738https://www.notion.so/ghost/Member-Session-Invalidation-13254316f2244c34bcbc65c101eb5cc4
- Adds the transient_id column to the members table. This defaults to
email, to keep it backwards compatible (not logging out all existing
sessions)
- Instead of using the email in the cookies, we now use the transient_id
- Updating the transient_id means invalidating all sessions of a member
- Adds an endpoint to the admin api to log out a member from all devices
- Added the `all` body property to the DELETE session endpoint in the
members API. Setting it to true will sign a member out from all devices.
- Adds a UI button in Admin to sign a member out from all devices
- Portal 'sign out of all devices' will not be added for now
Related changes (added because these areas were affected by the code
changes):
- Adds a serializer to member events / activity feed endpoints - all
member fields were returned here, so the transient_id would also be
returned - which is not needed and bloats the API response size
(`transient_id` is not a secret because the cookies are signed)
- Removed `loadMemberSession` from public settings browse (not used
anymore + bad pattern)
Performance tests on site with 50.000 members (on Macbook M1 Pro):
- Migrate: 6s (adding column 4s, setting to email is 1s, dropping
nullable: 1s)
- Rollback: 2s
fixes https://github.com/TryGhost/Product/issues/4118
The newsletter uuids were not passed when fetching all the members current newsletters. Therefore, Portals logic broke to remove all newsletters that matched the uuid that was passed to the unsubscribe link. No newsletters were removed, still the notification toast said that the member was unsubscribed from the newsletter.
no refs
Fixed error caused by uploading empty redirects YAML file:
```
Cannot read properties of undefined (reading '302')
```
This error was occurring due to `yaml.load` returning `undefined` when
the provided yaml file was empty. I've made the check on the return
value of `yaml.load` stricter (i.e we only want an `object`) to prevent
this error from occurring.
fixes https://github.com/TryGhost/Product/issues/4085
Increases the performance for the post analytics export by adding new
indexes. These indexes are used when counting the amount of (paid)
subscribers that were attributed to a given post. With the indexes, the
time required to export 700 posts with 300k members decreases from 40s
to 0.6s.
Tests show that adding these indexes should be very fast (< 1 s) if the
tables contain up to 300k rows.
- resolves `DeprecationWarning: In future versions of Node.js, fs.rmdir(path, { recursive: true }) will be removed. Use fs.rm(path, { recursive: true }) instead` in tests