refs https://github.com/TryGhost/Ghost/issues/12461
- added two default aggregations for overall email count and opened email count
- when number of tracked emails is sufficient add the open rate aggregation to the update query
refs https://github.com/TryGhost/Ghost/issues/12421
requires https://github.com/TryGhost/Ghost/pull/12457
- updates stats aggregator to calculate and store an open rate for each member
- uses two queries because I couldn't find a reasonable approach to perform the update in a single query as per the email aggregation
- benchmarked locally at <1sec/1000members
- will not store an open rate unless the number of tracked emails sent to a member is above a certain threshold (defaults to 5) to avoid new members being heavily weighted
- fixes typo in EmailAnalytics that was stopping member stats from being aggregated
no issue
- job registration was checking for submitted emails in it's email count but the job registration method is called as soon as an email is created meaning the email has a status of 'pending' which prevented the analytics job from being started until a second email was sent
no issue
- email analytics may be desirable to fully switch off in certain circumstances, when that happens we want to prevent related background jobs from running and expose the feature flag via the config endpoint in the Admin API so that clients can adjust accordingly
no issue
- if emails are older than 30 days we wouldn't be able to fetch any analytics for them and if a site used emails in the past but is no longer using them it doesn't make sense to keep potentially expensive background worker threads spinning up
no issue
- recurring jobs spin up worker threads which can be quite CPU intensive even when not performing much processing, this can be problematic in environments where there are many Ghost instances running
- updated the email job scheduling to be skipped on bootup when there are no emails in the database and to be started when the first email is created as long as we're not in testing env
- increase analytics job schedule from every 2 minutes to every 5 minutes to help spread the load further across instances
no issue
- the 4.2.0 version of `sqlite3` that we're using is not compatible with `worker_threads`
- 5.0.0 should add support this but there are other errors
- 5.0.1 is released but not published (https://github.com/mapbox/node-sqlite3/issues/1386)
no issue
- added `EmailAnalyticsService`
- `.fetchAll()` grabs and processes all available events
- `.fetchLatest()` grabs and processes all events since the last seen event timestamp
- `EventProcessor` passed event objects and updates `email_recipients` or `members` records depending on the event being analytics or list hygiene
- always returns a `EventProcessingResult` instance so that progress can be tracked and merged across individual events, batches (pages of events), and total runs
- adds email_id and member_id to the returned result where appropriate so that the stats aggregator can limit processing to data that has changed
- sets `email_recipients.{delivered_at, opened_at, failed_at}` for analytics events
- sets `members.subscribed = false` for permanent failure/unsubscribed/complained list hygiene events
- `StatsAggregator` takes an `EventProcessingResult`-like object containing arrays of email ids and member ids on which to aggregate statistics.
- jobs for `fetch-latest` and `fetch-all` ready for use with the JobsService
- added `initialiseRecurringJobs()` function to Ghost bootup procedure that schedules the email analytics "fetch latest" job to run every minute