fixes https://github.com/TryGhost/Team/issues/2724
This change also includes new snapshots for email sending (similar for email previews, but this time for the real emails to make sure we catch changes).
- this cleans up all imports or variables that aren't currently being used
- this really helps keep the tests clean by only allowing what is needed
- I've left `should` as an exemption for now because we need to clean up
how it is used
fixes https://github.com/TryGhost/Team/issues/2611
The old email flow is no longer used since we introduced the email stability flow. This commit removes the related code and tests. The general test coverage decreased a bit as a result, because the old email flow probably had a high test coverage. The new flow is in separate packages, so it couldn't contribute to a higher test coverage (but it does have 100% unit test coverage).
fixes https://github.com/TryGhost/Team/issues/2683
When sending a newsletter with a replacement that has a fallback, the
replacement only happens in the HTML version of the newsletter. The
plaintext version isn't replaced.
This commit fixes the issue and adds some tests to make sure it doesn't
happen again.
The cause of the issue was that we used the original matched Regex text
to replace. But that was calculated on the HTML version, so double
quotes were encoded. This change updates the generated 'token' regex to
also match on both a double quote as the escaped double quote.
refs https://github.com/TryGhost/Team/issues/2667
Some tests still accessed the internet. Now network access is disabled
by default. This change also introduces two helper methods related to
networking (mocking Slack and Mailgun).
This fixes two unreliable tests:
- Staff service was accessing a Slack test API -> timeout possible
- MentionSendingService was trying to send webmentions for every post
publish/change -> possible timeouts and job issues
fixes https://github.com/TryGhost/Team/issues/2562
New event fetching loops:
- Reworked the analytics fetching algorithm. Instead of starting again
where we stopped during the last fetching minus 30 minutes, we now just
continue where we stopped. But with ms precision (because no longer
database dependent after first fetch), and we stop at NOW - 1 minute to
reduce chance of missing events.
- Apart from that, a missing fetching loop is introduced. This fetches
events that are older than 30 minutes, and just processes all events a
second time to make sure we didn't skip any because of storage delays in
the Mailgun API.
- A new scheduled fetching loop, that allows us to schedule between a
given start/end date (currently only persisted in memory, so stops after
a reboot)
UI and endpoint changes:
- New UI to show the state of the analytics 'loops'
- New endpoint to request the analytics loop status
- New endpoint to schedule analytics
- New endpoint to cancel scheduled analytics
- Some number formatting improvements, and introduction of 'opened'
count in debug screen
- Live reload of data in the debug screen
Other changes:
- This also improves the support for maxEvents. We can now stop a
fetching loop after x events without worrying about lost events. This is
used to reduce the fetched events in the missing and scheduled event
loop (e.g. when the main one is fetching lots of events, we skip the
other loops).
- Prevents fetching the same events over and over again if no new events
come in (because we always started at the same begin timestamp). The
code increases the begin timestamp with 1 second if it is safe to do so,
to prevent the API from returning the same events over and over again.
- Some optimisations in handing the processing results (less merges to
reduce CPU usage in cases we have lots of events).
Testing:
- You can test with lots of events using the new mailgun mocking server
(Toolbox repo `scripts/mailgun-mock-server`). This can also simulate
events that are only returned after x minutes because of storage delays.
no issue
- When we receive an email failure with an empty message, the saving of
the model would fail because of schema validation that requires strings
to be non-empty.
- This adds more logging to the email analytics service to help debug
future issues
- Performance improvement to storing delivered, opened and failed emails
by replacing COALESCE with WHERE X IS NULL (tested and should give a
decent performance boost locally).
no issue
There are a couple of issues with resetting the Ghost instance between
E2E test files:
These issues came to the surface because of new tests written in
https://github.com/TryGhost/Ghost/pull/16117
**1. configUtils.restore does not work correctly**
`config.reset()` is a callback based method. On top of that, it doesn't
really work reliably (https://github.com/indexzero/nconf/issues/93)
What kinda happens, is that you first call `config.reset` but
immediately after you correcty reset the config using the `config.set`
calls afterwards. But since `config.reset` is async, that reset will
happen after all those sets, and the end result is that it isn't reset
correctly.
This mainly caused issues in the new updated images tests, which were
updating the config `imageOptimization.contentImageSizes`, which is a
deeply nested config value. Maybe some references to objects are reused
in nconf that cause this issue?
Wrapping `config.reset()` in a promise does fix the issue.
**2. Adapters cache not reset between tests**
At the start of each test, we set `paths:contentPath` to a nice new
temporary directory. But if a previous test already requests a
localStorage adapter, that adapter would have been created and in the
constructor `paths:contentPath` would have been passed. That same
instance will be reused in the next test run. So it won't read the new
config again. To fix this, we need to reset the adapter instances
between E2E tests.
How was this visible? Test uploads were stored in the actual git
repository, and not in a temporary directory. When writing the new image
upload tests, this also resulted in unreliable test runs because some
image names were already taken (from previous test runs).
**3. Old 2E2 test Ghost server not stopped**
Sometimes we still need access to the frontend test server using
`getAgentsWithFrontend`. But that does start a new Ghost server which is
actually listening for HTTP traffic. This could result in a fatal error
in tests because the port is already in use. The issue is that old E2E
tests also start a HTTP server, but they don't stop the server. When you
used the old `startGhost` util, it would check if a server was already
running and stop it first. The new `getAgentsWithFrontend` now also has
the same functionality to fix that issue.
fixes https://github.com/TryGhost/Team/issues/2484
The flow only send the email to segments that were targeted in the email
content. But if a part of the email is only visible for `status:free`,
that doesn't mean we don't want to send the email to `status:-free`.
This has been corrected in the new email flow.
fixes https://github.com/TryGhost/Team/issues/2432
Adds outbound_link_tagging setting (enabled by default and behind
feature flag). If the feature flag is enabled, and the setting is
disabled, we won't add ?ref to links in emails.
This includes new E2E tests for email click tracking, which were also
extended to check outbound link tagging (for both MEGA and the new email
stability flow).
Also fixes a test fixture for the comments_enabled setting.
fixes https://github.com/TryGhost/Team/issues/2339
The email service is now fully covered by tests, and this commit also forces the test coverage to remain 100% after future changes.
We're seeing behaviour from Mailgun where permanent failures with a
5xx error code are not being added to their internal suppression list,
which is resulting in the Ghost list becoming out of sync with
Mailgun.
Rather than adding emails to the suppression list when Mailgun does,
we're instead going to add emails _after_ Mailgun does, by waiting for
an error code which tells us the email is already on the suppression
list.
Those codes are 605 for previous bounces and 607 for previous spam complaints.
fixes https://github.com/TryGhost/Team/issues/2398
There was an error when fetching the existing email recipient failure. It ended up matching all recipient failures. The result was that only one failure was stored in the database.
When Mailgun fails to deliver an email to an address because the
address has already bounced before, it gives us a permanent fail event
with a 605 error code rather than a 5xx one. Because we want to
"backfill" our suppressions data with previously bounced email
addresses, we want to handle this specific error code.
We may update this logic in the future based on new information from
Mailgun with respect to their 6xx error codes and the
meanings/underlying cause of theme.
This also moves the tests which check for whether or not emails are
suppressed into their own fail so that we do not pollute the event
storage tests, and adds more tests cases.
We also fix a leaky sinon stub which we were not resetting in the email
event storage tests
no issue
With the increased usage of DomainEvents, it gets harder to build
reliable tests without having to resort to timeouts. This utility method
allows us to wait for all events to be processed before continuing with
the test.
This change should speed up tests and make them more reliable.
It only adds extra code when running tests and shouldn't impact
production.
fixes https://github.com/TryGhost/Team/issues/2366
refs https://ghost.slack.com/archives/C02G9E68C/p1670232405014209
Probem described in issue.
In the old MEGA flow:
- The `email_verification_required` check is now repeated inside the job
In the new email service flow:
- The `email_verification_required` is now checked (didn't happen
before)
- When generating the email batch recipients, we only include members
that were created before the email was created. That way it is
impossible to avoid limit checks by inserting new members between
creating an email and sending an email.
- We don't need to repeat the check inside the job because of the above
changes
Improved handling of large imports:
- When checking `email_verification_required`, we now also check if the
import threshold is reached (a new method is introduced in
vertificationTrigger specifically for this usage). If it is, we start
the verification progress. This is required for long running imports
that only check the verification threshold at the very end.
- This change increases the concurrency of fastq to 3 (refs
https://ghost.slack.com/archives/C02G9E68C/p1670232405014209). So when
running a long import, it is now possible to send emails without having
to wait for the import. Above change makes sure it is not possible to
get around the verification limits.
Refactoring:
- Removed the need to use `updateVerificationTrigger` by making
thresholds getters instead of fixed variables.
- Improved awaiting of members import job in regression test
The MailgunEmailSuppression list was incorrectly adding emails
to the suppression list for permanent failure events which have
an error code outside of the 5xx range.
fixes https://github.com/TryGhost/Team/issues/1996
**Issue**
Our Magic links are valid for 24 hours. After first usage, the token
lives for a further 10 minutes, so that in the case of email servers or
clients that "visit" links, the token can still be used.
The implementation of the 10 minute window uses setTimeout, meaning if
the process is interrupted, the 10 minute window is ignored completely,
and the token will continue to live for the remainder of it's 24 hour
validity period. To prevent that, the tokens are cleared on boot at the
moment.
**Solution**
To remove the boot clearing logic, we need to make sure the tokens are
only valid for 10 minutes after first use even during restarts.
This commit adds 3 new fields to the SingleUseToken model:
- updated_at: for storing the last time the token was changed/used). Not
really used atm.
- first_used_at: for storing the first time the token was used
- used_count: for storing the number of times the token has been used
Using these fields:
- A token can only be used 3 times
- A token is only valid for 10 minutes after first use, even if the
server restarts in between
- A token is only valid for 24 hours after creation (not changed)
We now also delete expired tokens in a separate job instead of on boot /
in a timeout.
refs https://ghost.slack.com/archives/C02G9E68C/p1670960248186789
This reverts a change that was made here:
f4fdb4fa6c (r93071549),
but it still moved the original code to a new location in the
LastSeenAtUpdater
It includes a new E2E test to make sure timezones are supported
correctly.
- By not using Bookshelf, we no longer fire webhook calls
- By not using the member repository, we don't fetch and update the
member model and the labels relation in a forUpdate transaction, which
caused deadlock issues on the labels/members_labels tables which were
hard to resolve. Until now I was unable to find the other conflicting
transaction that caused this deadlock. Moving to raw knex (instead of
Bookshelf) and only updating the last_updated_at column should remove
the deadlock issue.
This removed the test for the email service wrapper, since it started
failing for an unknown reason and the test didn't make much sense (was
added earlier only to bump test threshold).
refs https://github.com/TryGhost/Team/issues/2367
This ensures that a Member is not considered subscribed to any emails, so that
counts for newsletter recipients are correct. Eventually we will filter members
on their email suppression status but this is not implemented yet.
no issue
- The sleep method has been used in 8 modules reimplementing the same thing over and over again. It's usually a sign of async event processing outside of the request/response loop. It's good to have a single point of implementation for a "hack" like this, so we could track it easier and address the even processing delay in a more optimal way centrally if it ever becomes a bottleneck
refs https://github.com/TryGhost/Team/issues/2339
- Includes a new pattern in the job manager that allows us to properly
await jobs.
- Added new convenience mocking methods to stub settings
- Tests the main flows for bulk sending:
- Sending in multiple batches
- Sending to multiple segments
- Handling a failed batch and retrying that batch
- Fixes bug in batch generation (ordering not working)
In a different PR I'll add more detailed tests.
fixes https://github.com/TryGhost/Team/issues/2332
Saves events in the database and collects error information.
Do note that we can emit the same events multiple times, and as a result
out of order. That means we should correctly handle that a delivered
event might be fired after a permanent failure. So a delivered event is
ignored if the email is already marked as failed. Also delivered_at is
reset to null when we receive a permanent failure.
fixes https://github.com/TryGhost/Team/issues/2310
This moves the processing of the events from the event-processor to a
new email-event-processor in the email-service package.
- The `EmailEventProcessor` only translates events from
providerId/emailId to their known emailId, memberId and recipientId, and
dispatches the corresponding events.
- Since `EmailEventProcessor` runs in a separate worker thread, we can't
listen for the dispatched events on the main thread. To accomplish this
communication, the events dispatched from the `EmailEventProcessor`
class are 'posted' via the postMessage method and redispatched on the
main thread.
- A new `EmailEventStorage` class reacts to the email events and stores
it in the database. This code mostly corresponds to the (now deleted)
subclass of the old `EmailEventProcessor`
- Updating a members last_seen_at timestamp has moved to the
lastSeenAtUpdater.
- Email events no longer store `ObjectID` because these are not
encodable across threads via postMessage
- Includes new E2E tests that test the storage of all supported Mailgun
events. Note that in these tests we run the processing on the main
thread instead of on a separate thread (couldn't do this because
stubbing is not possible across threads)
There are some missing pieces that will get added in later PRs (this PR
focuses on porting the existing functionality):
- Handling temporary failures/bounces
- Capturing the error messages of bounce events
fixes https://github.com/TryGhost/Team/issues/2282
Added a new email service package that is used when the email stability
flag is enabled. Currently not yet implemented so will throw an error
for all entry points (if flag enabled).
Removed usage of `labs.isSet.bind` across the code, because that breaks
the stubbing of labs by `mockManager.mockLabsEnabled` and
`mockManager.mockLabsDisabled`. `flag => labs.isSet(flag)` should be
used instead.
All email depending tests now disable the `emailStability` feature flag
to keep the tests passing + make sure we still run all the tests for the
old flow while the email stability package is being built.
refs https://github.com/TryGhost/Team/issues/2246
- This change helps avoid race conditions due to a lack of a transaction
in the email job. It also moves the status check before creating the
email batches (can take a while) to prevent other timing issues in case
the job got scheduled multiple times.
- Sets the patch option to true when changing the status of an email
batch. If we don't do this, the bookshelf-relations plugin might try to
save relations too. This could have caused a 'no rows updated' error.
- Added a test that tests if the email job can only run once
- Added logging to batching logic
fixes https://github.com/TryGhost/Team/issues/2096
When generating the recipient data for emails, the email clicks
implementation is resulting in a recipient variable being added called
replacement_xxx once for each link containing the same UUID.
This generates a lot of unnecessary data overhead for emails, and it
turns out that mailgun has a 25MB message limit. We wouldn't have come
close if we only included the uuid once.
refs https://github.com/TryGhost/Team/issues/1967
- Test is good to test if the whole flow works as expected, and works together
- We can test independent parts in separate tests that have better coverage of more edge cases
- Adds a basic helper to get an agent for the frontend (spent too much time on a better solution so I decided to keep the existing supertest agent)
refs https://github.com/TryGhost/Toolbox/issues/363
- this commit pulls all code involving the Mailgun client SDK into one
new package called `mailgun-client`
- this means we should be able to replace `mailgun-js` (deprecated) with
`mailgun.js` (the new, official one) without editing code all over the
place
- this also lays some groundwork for better testing of smaller
components
refs https://github.com/TryGhost/Toolbox/issues/354
- this commit turns the Ghost repo into a monorepo so we can bring our
internal packages back in, which makes life easier when working on
Ghost