Commit Graph

16 Commits

Author SHA1 Message Date
Simon Backx
89ababb71f
Updated email event fetching to stop when begin and end are the same (#16326)
no issue

Optimization that makes sure we stop fetching when it is no longer
needed.
2023-02-23 15:44:01 +01:00
Simon Backx
923c522778
Implemented email analytics retrying (#16273)
fixes https://github.com/TryGhost/Team/issues/2562

New event fetching loops:
- Reworked the analytics fetching algorithm. Instead of starting again
where we stopped during the last fetching minus 30 minutes, we now just
continue where we stopped. But with ms precision (because no longer
database dependent after first fetch), and we stop at NOW - 1 minute to
reduce chance of missing events.
- Apart from that, a missing fetching loop is introduced. This fetches
events that are older than 30 minutes, and just processes all events a
second time to make sure we didn't skip any because of storage delays in
the Mailgun API.
- A new scheduled fetching loop, that allows us to schedule between a
given start/end date (currently only persisted in memory, so stops after
a reboot)

UI and endpoint changes:
- New UI to show the state of the analytics 'loops'
- New endpoint to request the analytics loop status
- New endpoint to schedule analytics
- New endpoint to cancel scheduled analytics
- Some number formatting improvements, and introduction of 'opened'
count in debug screen
- Live reload of data in the debug screen

Other changes:
- This also improves the support for maxEvents. We can now stop a
fetching loop after x events without worrying about lost events. This is
used to reduce the fetched events in the missing and scheduled event
loop (e.g. when the main one is fetching lots of events, we skip the
other loops).
- Prevents fetching the same events over and over again if no new events
come in (because we always started at the same begin timestamp). The
code increases the begin timestamp with 1 second if it is safe to do so,
to prevent the API from returning the same events over and over again.
- Some optimisations in handing the processing results (less merges to
reduce CPU usage in cases we have lots of events).

Testing:
- You can test with lots of events using the new mailgun mocking server
(Toolbox repo `scripts/mailgun-mock-server`). This can also simulate
events that are only returned after x minutes because of storage delays.
2023-02-20 16:44:13 +01:00
Simon Backx
48f9485f46
🐛 Fixed storing email failures with an empty message (#16260)
no issue

- When we receive an email failure with an empty message, the saving of
the model would fail because of schema validation that requires strings
to be non-empty.
- This adds more logging to the email analytics service to help debug
future issues
- Performance improvement to storing delivered, opened and failed emails
by replacing COALESCE with WHERE X IS NULL (tested and should give a
decent performance boost locally).
2023-02-13 15:25:36 +01:00
Simon Backx
47cd7a7095
🐛 Handled unknown Mailgun events (#15995)
refs https://ghost.slack.com/archives/C02G9E68C/p1670916538764019

- We receive events that don't have an emailId or providerId.
- We filter those events now and log them as an error
2022-12-14 11:17:45 +01:00
Simon Backx
d8187123af
Added storage for email failures (#15901)
fixes https://github.com/TryGhost/Team/issues/2332

Saves events in the database and collects error information.

Do note that we can emit the same events multiple times, and as a result
out of order. That means we should correctly handle that a delivered
event might be fired after a permanent failure. So a delivered event is
ignored if the email is already marked as failed. Also delivered_at is
reset to null when we receive a permanent failure.
2022-12-01 10:00:53 +01:00
Simon Backx
f4fdb4fa6c
Added new email event processor (#15879)
fixes https://github.com/TryGhost/Team/issues/2310

This moves the processing of the events from the event-processor to a
new email-event-processor in the email-service package.

- The `EmailEventProcessor` only translates events from
providerId/emailId to their known emailId, memberId and recipientId, and
dispatches the corresponding events.
- Since `EmailEventProcessor` runs in a separate worker thread, we can't
listen for the dispatched events on the main thread. To accomplish this
communication, the events dispatched from the `EmailEventProcessor`
class are 'posted' via the postMessage method and redispatched on the
main thread.
- A new `EmailEventStorage` class reacts to the email events and stores
it in the database. This code mostly corresponds to the (now deleted)
subclass of the old `EmailEventProcessor`
- Updating a members last_seen_at timestamp has moved to the
lastSeenAtUpdater.
- Email events no longer store `ObjectID` because these are not
encodable across threads via postMessage
- Includes new E2E tests that test the storage of all supported Mailgun
events. Note that in these tests we run the processing on the main
thread instead of on a separate thread (couldn't do this because
stubbing is not possible across threads)

There are some missing pieces that will get added in later PRs (this PR
focuses on porting the existing functionality):
- Handling temporary failures/bounces
- Capturing the error messages of bounce events
2022-11-29 11:15:19 +01:00
Naz
fa8d94fce2 Fixed the typo
refs e9bfc4ef01

- Did a typo in the find and replace... and now correcting a typo of a typo  -_-
2022-08-04 15:38:32 +01:00
Naz
e9bfc4ef01 Changed the lingo to US of A variation
refs 16728a3ef1

- It's 'merica time!
2022-08-05 02:28:33 +12:00
Sam Lord
9c9d6c2340 Replaced GhostIgnition with @tryghost packages
refs: https://github.com/TryGhost/Toolbox/issues/146
2021-12-02 12:17:38 +00:00
Kevin Ansfield
af7aff5352 Added email analytics service tests
no issue

- fixed destructuring error when  creating instances of `EmailAnalyticsService` or `EventProcessor` with no options object
- fixed unprocessable complained event incrementing complained count rather than unprocessable count
- fixed `fetchAll()` and `fetchLatest()` returning `undefined` instead of a blank EventProcessingResult
2021-03-01 21:31:07 +00:00
Kevin Ansfield
88c648636c Initial update of email analytics packages to work as external modules
refs https://github.com/TryGhost/Ghost/pull/12541

- make `EventProcessor` a super-class designed to be inherited from in consumer applications for application-level implementation
  - helps to keep application-level concerns for event handling (eg, what to do with spam complaints) and things like application database knowledge in the consumer
- removed all database knowledge from `EmailAnalyticsService`
  - requires a `queries` option to be passed in that lets the consuming application provide knowledge and define how fetched stats should be aggregated
2021-02-24 21:23:56 +00:00
Kevin Ansfield
2b1dc351d2 Added members.email_open_rate aggregation to email analytics (#12458)
refs https://github.com/TryGhost/Ghost/issues/12421
requires https://github.com/TryGhost/Ghost/pull/12457

- updates stats aggregator to calculate and store an open rate for each member
  - uses two queries because I couldn't find a reasonable approach to perform the update in a single query as per the email aggregation
  - benchmarked locally at <1sec/1000members
  - will not store an open rate unless the number of tracked emails sent to a member is above a certain threshold (defaults to 5) to avoid new members being heavily weighted
- fixes typo in EmailAnalytics that was stopping member stats from being aggregated
2021-02-24 21:03:29 +00:00
Kevin Ansfield
bef65aa8a2 Fixed email count check in email-analytics service
no issue

- raw knex `.count()` does not return a straight number, we need to handle an array of rowDataPacket objects
2021-02-24 21:03:29 +00:00
Kevin Ansfield
2b747e6b53 Fixed misleading emailAnalytics.fetchLatest debug statement 2021-02-24 21:03:29 +00:00
Kevin Ansfield
a84ec49854 Added no-emails guard check when fetching latest email events
no issue

- replicates the same guard used in `fetchLatest()` to prevent contacting Mailgun when there are no emails in the database
2021-02-24 21:03:29 +00:00
Kevin Ansfield
7bbf644d0d Added email analytics service (#12393)
no issue

- added `EmailAnalyticsService`
  - `.fetchAll()` grabs and processes all available events
  - `.fetchLatest()` grabs and processes all events since the last seen event timestamp
  - `EventProcessor` passed event objects and updates `email_recipients` or `members` records depending on the event being analytics or list hygiene
    - always returns a `EventProcessingResult` instance so that progress can be tracked and merged across individual events, batches (pages of events), and total runs
    - adds email_id and member_id to the returned result where appropriate so that the stats aggregator can limit processing to data that has changed
    - sets `email_recipients.{delivered_at, opened_at, failed_at}` for analytics events
    - sets `members.subscribed = false` for permanent failure/unsubscribed/complained list hygiene events
  - `StatsAggregator` takes an `EventProcessingResult`-like object containing arrays of email ids and member ids on which to aggregate statistics.
  - jobs for `fetch-latest` and `fetch-all` ready for use with the JobsService
- added `initialiseRecurringJobs()` function to Ghost bootup procedure that schedules the email analytics "fetch latest" job to run every minute
2021-02-24 21:03:29 +00:00