refs
[ENG-827](https://linear.app/tryghost/issue/ENG-827/🐛-crash-on-resizing-animated-gif)
Added a timeout to the image resizing middleware to prevent crashes when
an image is taking too long to resize. When the timeout is reached and
the image has not been resized, the middleware will return the original
image
ref
https://linear.app/tryghost/issue/ENG-851/implement-a-minimal-but-complete-version-of-redirect-caching-to
ref https://app.incident.io/ghost/incidents/55
Often immediately after sending an email, sites receive a large volume
of requests to LinkRedirect endpoints from members clicking on the links in
the email.
We currently don't cache any of these requests in our CDN, because we
also record click events, update the member's `last_seen_at` timestamp,
and send webhooks in response to these clicks, so Ghost needs to handle
each of these requests itself. This means that each of these LinkRedirect requests
hits Ghost, and currently all these requests hit the database to lookup
where to redirect the member to.
Each one of these requests can make up to 11 database queries, which can
quickly exhaust Ghost's database connection pool. Even though the
LinkRedirect lookup query is fairly cheap and quick, these queries aren't
prioritized over the "record" queries Ghost needs to handle, so they can
get stuck behind other queries in the queue and eventually timeout.
The result is that members are unable to actually reach the destination
of the link they clicked on, instead receiving a 500 error in Ghost, or
it can take a long time (60s+) for the redirect to happen.
This PR uses our existing `adapterManager` to cache the redirect lookups
either in-memory or in Redis (if configured — by default there is no caching). This only removes 1 out of
11 queries per redirect request, so it won't reduce the load on the DB
drastically, but it at least decouples the serving of the LinkRedirect from
the DB so the member can be redirected even if the DB is under heavy
load.
Local load testing results have shown a decrease in response times from
60 seconds to ~50ms for the redirect requests when handling 500 requests
per second, and reduced the 500 error rate to 0.
ref https://linear.app/tryghost/issue/ENG-826
- Changed staff deletion logic to do a bulk insert when adding a tag to
the users' associated posts
Staff deletion logic has really poor performance at scale because we do
individual updates for every post. If a user has dozens+ posts
(especially in a large db with thousands of posts), this can take >60s
and look like a timeout. Ultimately this should probably be a jobbed off
process, but for the time being we can improve this by doing a bulk
insert.
Note that this update uses the pattern for the bulk tagging of posts
from the right click (bulk) actions in the posts lists in Admin. With
bulk actions, **we do not trigger web hooks or the post.edited events**.
We will document this and follow up on this separately.
ref
https://linear.app/tryghost/issue/ENG-845/error-attempted-to-set-lexical-on-the-deleted-record
ref
[https://linear.app/tryghost/issue/ENG-854/🐛-deleting-imported-posts-makes-ghost-unresponsive](https://linear.app/tryghost/issue/ENG-854/%F0%9F%90%9B-deleting-imported-posts-makes-ghost-unresponsive)
- When deleting a post in the editor's Post Settings Menu, if the post
has unsaved changes (indicated by the hasDirtyAttributes property in the
editor), Admin will crash because it tries to save a post revision
before leaving the editor, but the post has already been deleted so
saving fails.
- This can occur when editing a post and quickly deleting it from the
Post Settings Menu before saving is completed.
- It can also occur when attempting to delete an imported post, as the
editor will parse the lexical from the server and may make some minor,
invisible-to-the-user changes to the lexical string locally (e.g. JSON
formatting, or updating the JSON to use extended version of base lexical
nodes), which triggers the same error.
- This fix bypasses the attempt to save a post revision when leaving the
editor if the post is already deleted, which allows the transition back
to the Posts route to succeed.
refs f39d1d3aa3
- similar to the commit above, the JSON parser changed between Node 18
and Node 20, so the error message changed too
- we actually just want to check the error is forwarded to the user, so
we can do that by getting the error message from JSON.parse and check
against that
ref 78311591d0
- updated tests to not click a button on the setup/done screen that is no longer shown
- fixed setup flow showing an alert bar due to not handling the `TransitionAborted` error that is thrown by the setup/done->dashboard redirect
ref https://linear.app/tryghost/issue/KTLO-1/members-spam-signups
- Some customers are seeing many spammy signups ("hundreds a day") — our
hypothesis is that bots and/or email link checkers are able to signup by
simply following the link in the email without even loading the page in
a browser.
- Currently new members signup by clicking a magic link in an email,
which is a simple GET request. When the user (or a bot) clicks that link, Ghost
creates the member and signs them in for the first time.
- This change, behind an alpha flag, requires a new member to click the
link in the email, which takes them to a new frontend route `/confirm_signup/`, then submit a form on the page which sends a POST request to the
server. If JavaScript is enabled, the form will be submitted
automatically so the only change to the user is an extra flash/redirect
before being signed in and redirected to the homepage.
- This change is behind the alpha flag `membersSpamPrevention` so we can
test it out on a few customer's sites and see if it helps reduce the
spam signups. With the flag off, the signup flow remains the same as
before.
ref https://linear.app/tryghost/issue/ENG-790/remove-use-of-sub-queries-in-email-analytics
- the `delivered_at` column is typically entirely/nearly entirely filled with values meaning the `IS NOT NULL` query matches a huge number of rows that MySQL has to fetch from the index to count
- using `IS NULL` switches that behaviour around as it will now match very few rows which has been shown in testing to be considerably quicker
- after switching to `IS NULL` the query returns an "undelivered" count rather than a "delivered" count, in order to keep the rest of the system behaviour the same we can calculate the delivered count by subtracting the query result from the total number of emails sent which we can fetch using a very fast primary key lookup query on the `emails` table
ref https://linear.app/tryghost/issue/CFR-13
- enabled saving traces on browser test failure; this makes troubleshooting a lot easier
- updated handling in offers tests to ensure the tier has fully loaded in the UI (not just `networkidle`)
- updated publishing test to examine the publish button reaction to the save action response instead of a 300ms pause
In general, our tests use a lot of watching for 'networkidle' - and sometimes just raw timeouts - which do not scale well into running tests on CI. In particular, 'networkidle' does not work if we're expecting to see React components' state updates propagate and re-render. We should always instead look to the content which encapsulates the response and the UI updates. This is something we should tackle on a larger scale.
ref ENG-774
ref https://linear.app/tryghost/issue/ENG-774
Staff Tokens will have both a `user` and an `apiKey` present on the
`loadedPermissions`.
The check here for `apiKey` was written when we could assume that an
`apiKey` was an Admin Integration - so it completely overwrote the
previous `allowed` list. When we added the concept of Staff Tokens -
this resulted in a privilege escalation.
This is a good lesson in not using proxies or indicators for data, as
changes elsewhere can invalidate them - if we had been specific and
checked the role of the current actor we wouldn't've had this bug!
ref https://linear.app/tryghost/issue/CFR-4/
- added request queueing middleware (express-queue) to handle high
request volume
- added new config option `optimization.requestQueue`
- added new config option `optimization.requestConcurrency`
- added logging of request queue depth - `req.queueDepth`
We've done a fair amount of investigation around improving Ghost's
resiliency to high request volume. While we believe this to be partly
due to database connection contention, it also seems Ghost gets
overwhelmed by the requests themselves. Implementing a simple queueing
system allows us a simple lever to change the volume of requests Ghost
is actually ingesting at any given time and gives us options besides
simply increasing database connection pool size.
---------
Co-authored-by: Michael Barrett <mike@ghost.org>
ref ENG-728
ref https://linear.app/tryghost/issue/ENG-728
This is not used anywhere, and makes the code more complicated, it's a good
step toward simplifying permissions and pulling them out of the database.
ref ENG-728
ref https://linear.app/tryghost/issue/ENG-728
This is NOT a functionality change. The Post#permissible method unit
tests have been updated to pass `true` as `hasUserPermission` and we can
see that the permission functionality remains the same.
The permissible method of the post model is responsible for removing
permission based on the data that is being modified, but the permissions
module is setup to allow the permissible method to grant permission -
this means that we call permissible, even if the current actor doesn't
have permission, this results in code that is hard to understand and
manage.
We are going to be instead returning early if an actor does not have
permission, this will allow permissible method signatures to be greatly
simplified (removing the need for hasUserPermission, hasApiKeyPermission
& hasMemberPermission arguments).
fixes https://linear.app/tryghost/issue/ENG-746/http-500-responses-when-handle-image-sizes-middleware-hits-missing
- in the event a request comes in for a resized image, but the source
image does not exist, we return a rendered 404 page
- we do this because we pass the NotFoundError to `next`, which skips
over the static asset code where we return a plaintext 404
- also included a breaking test that ensure we go to the next middleware
without an error
ref ENG-761
ref https://linear.app/tryghost/issue/ENG-761
Creating these pipelines is expensive, and we don't want to do it
repeatedly for the same controller. Adding caching should reduce the
amount of time spent setting up pipelines for each usage of the `get`
helper.
ref [ENG-747](https://linear.app/tryghost/issue/ENG-747/)
ref https://linear.app/tryghost/issue/ENG-747
H'okay - so what we're trying to do here is make get helper queries more
cacheable. The way we're doing that is by modifying the filter used when
we're trying to remove a single post from the query.
The idea is that we can remove that restriction on the filter, increase
the number of posts fetched by 1 and then filter the fetched posts back
down, this means that the same query, but filtering different posts,
will be updated to make _exactly_ the same query, and so share a cache!
We've been purposefully restrictive in the types of filters we
manipulate, so that we only deal with the simplest cases and the code is
easier to understand.
closes ENG-632
- This listens to a new property in the `milestones` config to set a minimum value of Milestones we wanna use the Slack notification service for
refs https://linear.app/tryghost/issue/ENG-750
- when adding a recommendation, we fetch the recommended site's metadata
- before this change, if the metadata fetch failed for some reason, we'd show an error and block the recommendation from being added
- after this change, we use fallback values if the metadata fails to fetch, instead of blocking the recommendation from being added. We use the site domain as the title and leave the rest empty (no favicon, no description)
- this change also means we are not checking whether a site exists or not for the publisher anymore. It’s then up to the publisher to make sure they don’t enter broken URLs
ref ENG-747
ref https://linear.app/tryghost/issue/ENG-747
H'okay - so what we're trying to do here is make get helper queries more
cacheable. The way we're doing that is by modifying the filter used when
we're trying to remove a single post from the query.
The idea is that we can remove that restriction on the filter, increase
the number of posts fetched by 1 and then filter the fetched posts back
down, this means that the same query, but filtering different posts,
will be updated to make _exactly_ the same query, and so share a cache!
We've been purposefully restrictive in the types of filters we
manipulate, so that we only deal with the simplest cases and the code is
easier to understand.
ref https://linear.app/tryghost/issue/ENG-721
- when changing the response to a `204` for requests with no cookie we'd lost the `Cache-Control: public, max-age: 0` header which meant some cache systems weren't caching as efficiently as possible
ref https://linear.app/tryghost/issue/TRI-65
In the context of referrals, we want to understand how useful our
“Powered by Ghost” badges are.
There are currently a few places where the “Powered by Ghost” badge can
be rendered:
- in newsletters (enabled/disabled by publisher, on a newsletter basis)
- in Portal popups, e.g. member signup/signin/account settings
- in the footer of some themes, including Source & Casper
We're adding the query param ?via to evaluate the usage of the badge in
newsletters.
ref https://linear.app/tryghost/issue/ENG-740/http-500-error-when-image-processing-fails
refs 4aad551c72
- upon further discussion, we've decided it's better to throw an error
in this case because the uploaded image is deemed invalid and storing
it on the filesystem might cause more issues with resizing/further
processing in the future
- this commit implements that and alters the tests
fixes ENG-740
fixes https://linear.app/tryghost/issue/ENG-740/http-500-error-when-image-processing-fails
- in the event the image transform library throws (which can happen for
many reasons; sharp/libvips can come across a number of errors), we
currently return this as a HTTP 500 error to the user
- in this case, we should just try-catch the call and jump to the
non-processing flow where it just saves the original image
- also added breaking test
closes https://linear.app/tryghost/issue/ENG-721
ref https://linear.app/tryghost/issue/ENG-708
Comments-UI loads `/ghost/admin-frame/` in an iframe to check if a Staff User is authenticated in order to show moderation options. That iframe request loads a HTML page which in turn contains a script that fires off an API request that attempts to fetch the logged-in user details, resulting in a 403 "error" showing up when not authenticated. In the vast majority of cases there will be no staff user authenticated so lots of extra requests and "errors" are seen unnecessarily.
- adjusted the `/ghost/auth-frame/` endpoint to check if the request contains an Admin session cookie
- if it does, continue as before with rendering the HTML page so the script is loaded
- if it doesn't, return an empty 204 response avoiding the script request and subsequent 403-generating API request
- eliminates the 403 error being generated for all typical visitor traffic, the error should only be seen when an Admin was previously logged in but their cookie is no longer valid (either from logging out, or going past the 6month validity period)
fixes https://github.com/TryGhost/Product/issues/4237
- this fixes the fact that we return a HTTP 500 response when the oembed
library receives an error, such as a 401 or 403
- includes special handling for cases where we want to return a slightly
different error message
- also adds unit tests for @tryghost/oembed-service package
fixes ENG-733
ref https://linear.app/tryghost/issue/ENG-733/handle-image-uploads-where-name-is-too-long
- filesystems usually have a filename length limit; ie. on macOS it is
255 characters
- if a file is uploaded with a longer filename, we'll return a HTTP 500
- we shouldn't do this as it is user error, so we can just catch the
error code and return BadRequest
- this implements that, and adds a breaking test
refs INC-36
fixes https://github.com/TryGhost/Ghost/issues/19796
- The tiers-only paywall was incorrectly rendering "Free". Example:
"This post is for subscribers of the Free, Silver and Gold tiers only"
- Steps to reproduce the issue:
1. Create a post with public visibility, publish it
2. Then swap the visibility to specific tiers. The default selects all
paid tiers. Leave it like that
3. Update the post. The paywall show Free, even though it should be
showing only the paid tiers
- This fix filters out the "free" tier when visibility is set to tiers,
before updating a Post or a Page. The fix includes bulk updates from the
list of Posts and Pages (right-click on a Post/Page > Change Access).
closes ENG-627
We were using `cheerio` to parse+modify+serialize our rendered HTML to modify links for member attribution. Cheerio's serializer has a [long-standing issue](https://github.com/cheeriojs/cheerio/issues/720) (that we've [had to deal with before](https://github.com/TryGhost/SDK/issues/124)) where it replaces single-quote attributes with double-quote attributes. That was resulting in broken rendering when content used single-quotes such as in HTML cards that have JSON data inside a `data-` attribute or otherwise used single-quotes to avoid escaping double-quotes in an attribute value.
- swapped the implementation that uses `cheerio` for one that uses `html5parser` to tokenize the html string, from there we can loop over the tokens and replace the href attribute values in the original string without touching any other part of the content. Avoids a full parse+serialize process which is both more costly and can result unexpected content changes due to serializer opinions.
- fixes the quote change bug
- uses tokenization directly to avoid cost of building a full AST
- updated Content API Posts snapshot
- one of our fixtures has a missing closing tag which we're no longer "fixing" with a full parse+serialize step in the link replacer (keeps modified src closer to original and better matches behaviour elsewhere in the app / without member-attribution applied)
- the link replacer no longer converts `attr=""` to `attr` (these are equivalent in the HTML spec so no change in behaviour other than preserving the original source html)
- added a benchmark test file comparing the two implementations because the link replacer runs on render so it's used in a hot path
- new implementation has a 3x performance improvement
- the separate files with the old/new implementations have been cleaned up but I've left the benchmark test file in place for future reference
Benchmark results comparing implementations:
```
❯ node test/benchmark.js
LinkReplacer
├─ cheerio: 5.03K /s ±2.20%
├─ html5parser: 16.5K /s ±0.43%
Completed benchmark in 0.9976526670455933s
┌─────────────┬─────────┬────────────┬─────────┬───────┐
│ (index) │ percent │ iterations │ current │ max │
├─────────────┼─────────┼────────────┼─────────┼───────┤
│ cheerio │ '' │ '5.03K/s' │ 5037 │ 5037 │
│ html5parser │ '' │ '16.5K/s' │ 16534 │ 16534 │
└─────────────┴─────────┴────────────┴─────────┴───────┘
```
refs. https://linear.app/tryghost/issue/DES-122/bookmark-card-issues
This PR addresses the following content card related problems:
1. The design of the following cards are more self-contained so it makes
more sense to use `px` for their font-sizes and spacings so it looks the
same regardless of the theme. Of course themes still can override these
values.
Updated cards to use `px` for font sizing:
- audio
- bookmark
- file
- product
2. So far header and signup cards had been using `rem` for font-sizes
and some sizing. This commit updates these to use `em` instead so that
it's consistent with all other cards.
3. The favicon sometimes is not available for bookmark cards. This PR also
fixes that by providing a default favicon for these cases.
closes ENG-681
There's no need to provide an `order` param with every request in Comments-UI if the API has default ordering that matches our requirements. The order param makes logs more noisy/harder to read than they need to be so we want to get rid of it.
- modified comments API input serializer to add a default order param to the browse and replies endpoints when none is provided
- removed order param from the requests that Comments-UI makes
refs https://linear.app/tryghost/issue/ENG-676/
We want to make sure that we're not serving stale liked counts for
comments, which means we need to cache bust when they're liked/unliked
Unfortuantely this means we need to fetch the comment from the db so
that we have access to the post id.
refs https://linear.app/tryghost/issue/ENG-676/
This is the meat of the change and actually causes the cache to be
invalidated on adds and edits to the comments endpoints.
It doesn't currently include the liked/unliked actions at the moment
as we don't have easy access to the post id from those endpoints.
refs https://linear.app/tryghost/issue/ENG-676/
This is pretty simple as we can reuse the existing browse method
on the CommentsController, but we need to add support for the post_id
option to the endpoint, for it to be added to the frame.
We also need to update the browse method to enforce the post_id on the
NQL filter. I initially tried this with string concatenation, but ran
into way too many bugs, so we're using a mongo transformer instead.