
11 Commits

Author SHA1 Message Date
Steve Larson
a47298a75c
Reimplemented email analytics prioritizing email opens (#20914)
ref https://github.com/TryGhost/Ghost/pull/20835
- reimplemented email analytics changes that prioritized opened events
over other events in order to speed up open analytics
- added db persistence to fetch missing job to ensure we re-fetch every
window of events, especially important if we restart following a large
email batch

We learned a few things with the previous trial run of this. Namely,
that event throughput is not as high as we initially saw in the data for
particularly large databases. This set of changes is more conservative,
while a touch more complicated, in ensuring we capture edge cases for
really large newsletter sends (100k+ members).

In general, we want to make sure we're fetching new open events at least
every 5 mins, and often much faster than that, unless it's a quiet
period (suggesting we haven't had a newsletter send or much outstanding
event data).
2024-09-05 08:10:07 -05:00
Steve Larson
8f3985bc66
Reverted email analytics jobs commits (#20835)
ref https://linear.app/tryghost/issue/ENG-1518

After releasing the analytics job improvements, it appears for large
sites we're awfully close to missing some Mailgun events because of an
unexpected behavior of the aggregateStats call for just the opened
events job. This is taking 2-5x(+) the amount of time that the aggregate
queries take for the other jobs, despite not being dependent on the
events.

To err on the side of caution, we're going to roll this back and look to
optimize the aggregation queries before re-implementing. And we may be a
bit more cautious in giving _some_ but not _all_ priority to the
`opened` events.
2024-08-27 16:15:34 -05:00
Steve Larson
4267ff9be6
Updated email analytics job to prioritize open events (#20800)
ref https://linear.app/tryghost/issue/ENG-1477
- updated email analytics job to prioritize open events
- put limits on non-open event fetching
- updated job to now restart itself until processing is at a
sufficiently low volume

Previously the EmailAnalytics job would process all event data equally.
When there are enough recipients (>20k), we could see delays in the
open rate data in Admin because of all the delivered events being
processed. Open events are far more important to users, so we've now
prioritized processing those events before any others.

Processing of events shouldn't be any faster or slower with this as this
doesn't change throughput, just order.
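
A minimal sketch of the reordering described above, not the actual Ghost implementation. The function name `order_for_processing`, the `max_other` cap, and the event dictionaries are hypothetical and only illustrate "opens first, limit on everything else":

```python
def order_for_processing(events, priority_type="opened", max_other=None):
    """Return events with `priority_type` first, keeping relative order.

    `max_other` (hypothetical) caps how many non-priority events are
    handled in this pass; the remainder would be left for a later run.
    """
    prioritized = [e for e in events if e["type"] == priority_type]
    others = [e for e in events if e["type"] != priority_type]
    if max_other is not None:
        others = others[:max_other]
    return prioritized + others


events = [
    {"id": 1, "type": "delivered"},
    {"id": 2, "type": "opened"},
    {"id": 3, "type": "delivered"},
    {"id": 4, "type": "opened"},
]
ordered = order_for_processing(events, max_other=1)
print([e["id"] for e in ordered])  # [2, 4, 1]
```

The same events are processed either way; only the order changes, which matches the note that throughput is unaffected.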

NOTE: Use the mailgun-mock-server in TryGhost/Toolbox for testing.
2024-08-20 17:25:01 +00:00
Simon Backx
923c522778
Implemented email analytics retrying (#16273)
fixes https://github.com/TryGhost/Team/issues/2562

New event fetching loops:
- Reworked the analytics fetching algorithm. Instead of starting again
where we stopped during the last fetch minus 30 minutes, we now just
continue where we stopped, with ms precision (the start time is no
longer database dependent after the first fetch), and we stop at
NOW - 1 minute to reduce the chance of missing events.
- Apart from that, a missing fetching loop is introduced. This fetches
events that are older than 30 minutes, and just processes all events a
second time to make sure we didn't skip any because of storage delays in
the Mailgun API.
- A new scheduled fetching loop, that allows us to schedule between a
given start/end date (currently only persisted in memory, so stops after
a reboot)
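
The window arithmetic for the first two loops can be sketched as follows. This is an illustration under assumptions, not the code from this commit: the function names and the `None` return for an empty window are hypothetical, while the 1 minute and 30 minute offsets come from the description above.

```python
from datetime import datetime, timedelta, timezone

SAFETY_MARGIN = timedelta(minutes=1)   # main loop stops at NOW - 1 minute
MISSING_AGE = timedelta(minutes=30)    # missing loop only re-reads events older than this


def latest_window(last_event_at, now):
    """Main loop: continue exactly where the previous fetch stopped."""
    end = now - SAFETY_MARGIN
    if last_event_at >= end:
        return None  # nothing new that is old enough to fetch safely
    return (last_event_at, end)


def missing_window(last_missing_end, now):
    """Missing loop: process events a second time once they are 30+ minutes old."""
    end = now - MISSING_AGE
    if last_missing_end >= end:
        return None
    return (last_missing_end, end)


now = datetime(2023, 2, 20, 12, 0, 0, tzinfo=timezone.utc)
last = datetime(2023, 2, 20, 11, 50, 0, 500000, tzinfo=timezone.utc)  # ms precision
print(latest_window(last, now))
print(missing_window(now - timedelta(hours=2), now))
```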

UI and endpoint changes:
- New UI to show the state of the analytics 'loops'
- New endpoint to request the analytics loop status
- New endpoint to schedule analytics
- New endpoint to cancel scheduled analytics
- Some number formatting improvements, and introduction of 'opened'
count in debug screen
- Live reload of data in the debug screen

Other changes:
- This also improves the support for maxEvents. We can now stop a
fetching loop after x events without worrying about lost events. This is
used to reduce the fetched events in the missing and scheduled event
loop (e.g. when the main one is fetching lots of events, we skip the
other loops).
- Prevents fetching the same events over and over again if no new events
come in (because we always started at the same begin timestamp). The
code increases the begin timestamp by 1 second if it is safe to do so,
to prevent the API from returning the same events over and over again.
- Some optimisations in handling the processing results (fewer merges to
reduce CPU usage in cases where we have lots of events).
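
A rough sketch of how a `maxEvents` cutoff and the begin-timestamp bump could fit together. Everything here is an assumption made for illustration: the function name, the use of plain numeric timestamps in seconds, stopping only on page boundaries, and in particular the reading of "safe" as "the bumped timestamp still lies within the fetch window". The commit itself does not spell these details out.

```python
def run_fetch_loop(pages, begin, end, max_events):
    """Consume pages of event timestamps (ascending, in seconds).

    Returns (processed_count, next_begin). Stops after a full page once
    `max_events` is reached, so no event inside a page is skipped.
    """
    processed = 0
    last_ts = None
    for page in pages:
        for ts in page:
            processed += 1
            last_ts = ts
        if processed >= max_events:
            # Stopped early: resume from the last event we actually saw.
            return processed, last_ts
    if last_ts is None:
        # No events at all: move begin forward by 1 second so the next
        # request does not return the same empty range (assumed rule:
        # only when the new begin stays inside the window).
        next_begin = begin + 1 if begin + 1 <= end else begin
        return 0, next_begin
    return processed, last_ts


print(run_fetch_loop([[10.0, 11.0], [12.0]], begin=10.0, end=60.0, max_events=2))  # (2, 11.0)
print(run_fetch_loop([], begin=100.0, end=160.0, max_events=2))                    # (0, 101.0)
```

Resuming from the last seen timestamp can return that event again, so a scheme like this assumes event processing is idempotent.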

Testing:
- You can test with lots of events using the new mailgun mocking server
(Toolbox repo `scripts/mailgun-mock-server`). This can also simulate
events that are only returned after x minutes because of storage delays.
2023-02-20 16:44:13 +01:00
Daniel Lockyer
6fc4aa8c4b
Reworked testing and documentation for email-analytics-provider-mailgun
- the tests here were no longer relevant because they mostly tested
  things that have been moved to `mailgun-client`
- this commit cleans up the tests to ensure we're passing the correct
  parameters to the mailgun client package
- also adds jsdoc on all the functions and maintains 100% code coverage
2022-08-11 10:30:12 +02:00
Daniel Lockyer
9401d835ce
Moved Mailgun settings test to mailgun-client
- this test checks that the mailgun client respects the changes in
  settings, which is something that we used to ask
  `email-analytics-provider-mailgun` to do when the mailgun client was
  made in that package
- since then, we've pulled it out, so we should move the test to the
  `mailgun-client` library
2022-08-10 18:24:35 +02:00
Daniel Lockyer
bf254b9c6a
Extracted Mailgun client to separate package
refs https://github.com/TryGhost/Toolbox/issues/363

- this commit pulls all code involving the Mailgun client SDK into one
  new package called `mailgun-client`
- this means we should be able to replace `mailgun-js` (deprecated) with
  `mailgun.js` (the new, official one) without editing code all over the
  place
- this also lays some groundwork for better testing of smaller
  components
2022-08-10 17:12:37 +02:00
ceecko
a9cce0281d
Added support for eu Mailgun domain (#73)
closes: https://github.com/TryGhost/Ghost/issues/14640

- EU Mailgun domains have a different structure
- we weren't accounting for this when fetching the next page of results,
  meaning that email stats didn't work on EU domains
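
One way to avoid depending on the API host when following pagination is to parse the next-page URL instead of matching a fixed prefix. The sketch below is not the fix from this commit; the function name `next_page_token` and the example URLs (including their `/v3/<domain>/events/<token>` shape) are assumptions used for illustration.

```python
from urllib.parse import urlparse


def next_page_token(next_url):
    """Extract the page token after `/events/`, regardless of API host."""
    path = urlparse(next_url).path.rstrip("/")
    marker = "/events/"
    idx = path.find(marker)
    if idx == -1:
        return None  # no token present in this URL
    return path[idx + len(marker):] or None


# Illustrative URLs only; the token is the same for both hosts.
print(next_page_token("https://api.mailgun.net/v3/example.com/events/abc123"))
print(next_page_token("https://api.eu.mailgun.net/v3/example.com/events/abc123"))
```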
2022-05-02 19:08:30 +01:00
Sam Lord
a96cf1a39a
Use @tryghost/logging package instead of injected argument
refs: https://github.com/TryGhost/Toolbox/issues/146
2021-12-02 12:26:23 +00:00
Kevin Ansfield
0145c925a0
Added email analytics mailgun provider tests
2021-02-25 20:04:17 +00:00
Kevin Ansfield
788676845d
Added empty email analytics packages
2021-02-24 21:03:29 +00:00