The data generator uses CSV imports for a massive speed increase, but
these can't be used in some environments where SQL admin access isn't
available. This allows us to set a flag to use the original
insert-based importer.
no issue
- The `RedirectsImporter` used by the data generator was creating
redirects with the wrong length for the `from` field, which didn't match
the actual behavior of Ghost.
- This commit corrects the length from 32 to 8, which is the actual
length of the `from` field in production.
- This change has no impact on Ghost's behavior, but makes the data
generator more representative of real world data for more accurate
testing.
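As a rough illustration of the corrected length, the importer only needs an 8-character value for the `from` field; the helper below is hypothetical, not the actual RedirectsImporter code:
```js
const crypto = require('crypto');

// Hypothetical helper: 4 random bytes hex-encoded give the 8-character `from`
// value used in production, instead of the previous 32 characters
function fakeRedirectFrom() {
    return crypto.randomBytes(4).toString('hex');
}
```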
no issue
When testing Stripe migrations, it is useful to be able to clear the
database quickly without deleting admins and tokens. This is possible
with the data generator.
no issue
- DataGenerator disables the redo log to make data imports faster, but this is a persisted global config change and we were missing the query to re-enable it once the imports have finished
- when the redo log remains disabled, an unexpected shutdown puts the database into a non-starting state with the error `Server was killed when Innodb Redo logging was disabled. Data files could be corrupt. You can try to restart the database with innodb_force_recovery=6`
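A minimal sketch of the fix, assuming the knex connection the data generator already uses; the surrounding import code is simplified:
```js
async function importWithRedoLogDisabled(knex, runImports) {
    await knex.raw('ALTER INSTANCE DISABLE INNODB REDO_LOG');
    try {
        await runImports();
    } finally {
        // the missing query: without it the setting persists globally and an
        // unexpected shutdown can leave the data files corrupt
        await knex.raw('ALTER INSTANCE ENABLE INNODB REDO_LOG');
    }
}
```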
no issue
The data generator created an offer for the free product. This caused an
error in the admin UI because it couldn't find the tier for the offer.
This fixes the issue in both the data generator and the admin UI.
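On the data generator side, the fix amounts to never picking the free tier when creating offers. A sketch with assumed names (`tiers`, the offer shape), not the real importer API:
```js
function generateOffers(tiers) {
    return tiers
        .filter(tier => tier.type === 'paid') // never create an offer for the free tier
        .map(tier => ({
            name: `Discount on ${tier.name}`,
            tier: tier.id,
            type: 'percent',
            amount: 20
        }));
}
```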
no issue
The data generator ran out of memory when trying to generate fake data
for > 2M members. This adds some improvements to make sure it doesn't run
out of memory.
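The core of the improvement is generating and inserting rows per batch instead of building the full member list in memory first. A sketch, with `generateMember()` as a hypothetical stand-in for the real generator:
```js
async function importMembers(knex, total, batchSize = 1000) {
    for (let offset = 0; offset < total; offset += batchSize) {
        const rows = [];
        const count = Math.min(batchSize, total - offset);
        for (let i = 0; i < count; i++) {
            rows.push(generateMember(offset + i));
        }
        // only one batch is ever held in memory, instead of all 2M+ rows
        await knex.batchInsert('members', rows, batchSize);
    }
}
```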
---------
Co-authored-by: Fabien "egg" O'Carroll <fabien@allou.is>
- we want to pass in the schema tables instead of cross-requiring them
from a different package, because cross-requiring means the package isn't
standalone and moving the code structure around breaks the data generator
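Roughly, the change moves from requiring the schema out of a sibling package to injecting it; the `schemaTables` option name is illustrative, not the exact constructor signature:
```js
// Before: cross-requiring, which couples the package to the repo layout
// const {tables} = require('../../core/core/server/data/schema');

// After: the caller passes the schema tables in, so the package stays standalone
class DataGenerator {
    constructor({knex, schemaTables}) {
        this.knex = knex;
        this.schemaTables = schemaTables;
    }
}
```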
ref PROD-233
- Stored whether Docker is used in the config files
- When running `yarn setup`, any existing Docker container will be
reset. Run with `-y` to skip the confirmation step.
- `yarn setup` will always init the database and generate fake data
- Increased the default amount of generated data to 100,000 members + 500
posts.
- Made lots of performance improvements in the data generator so we can
generate the default data in ~170s
ref PROD-244
- Added support for canceled subscriptions and different subscription statuses
- Removed generation of the `subscriptions` table (not used)
- Added old canceled subscriptions for free members
- Added both positive and negative MRR events
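A simplified sketch of what the generated rows now look like; the status split and field names are illustrative, not the exact importer output:
```js
function generateSubscription(member, tier) {
    // a share of subscriptions is generated as canceled, the rest as active
    const status = Math.random() < 0.2 ? 'canceled' : 'active';
    return {
        member_id: member.id,
        tier_id: tier.id,
        status,
        // canceled subscriptions produce a negative MRR event, active ones a positive one
        mrr_delta: status === 'canceled' ? -tier.monthly_price : tier.monthly_price
    };
}
```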
ref PROD-244
Instead of generating one batch with all recipients, we now generate
one batch per 1,000 members and distribute the recipients across them.
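A sketch of the batching, with simplified batch/recipient shapes compared to the real email tables:
```js
function buildBatches(emailId, memberIds, batchSize = 1000) {
    const batches = [];
    for (let i = 0; i < memberIds.length; i += batchSize) {
        const batchId = `${emailId}_${batches.length}`;
        // each batch gets at most 1000 recipients instead of one batch holding everyone
        const recipients = memberIds.slice(i, i + batchSize).map(memberId => ({
            batch_id: batchId,
            member_id: memberId
        }));
        batches.push({id: batchId, email_id: emailId, recipients});
    }
    return batches;
}
```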
ref PROD-233
Errors were not handled properly because of a missing rollback and try/catch.
Using a function is generally easier.
Also added statements to ignore constraint checks to increase performance a tiny bit.
They ended up not mattering much, so we can consider removing them again.
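A sketch of the error handling, assuming a knex transaction as the data generator uses; previously a failure left the transaction open because there was no try/catch or rollback:
```js
async function importData(knex, runImporters) {
    const transaction = await knex.transaction();
    try {
        await runImporters(transaction);
        await transaction.commit();
    } catch (error) {
        // the missing rollback: without it an error left the transaction open
        await transaction.rollback();
        throw error;
    }
}
```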
refs https://github.com/TryGhost/DevOps/issues/119
- members_newsletters needs members_subscribe_events, but it was also
generating `subscriptions` records and the whole thing was really
slow
- for now, `subscriptions` is not a used table, so we can remove use of it
- also adds support for generating more than one subscription record
with an 80% chance of being subscribed
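A sketch of the newsletter subscription generation; the row shape is simplified:
```js
function generateNewsletterSubscriptions(member, newsletters) {
    return newsletters
        .filter(() => Math.random() < 0.8) // 80% chance of being subscribed to each newsletter
        .map(newsletter => ({
            member_id: member.id,
            newsletter_id: newsletter.id
        }));
}
```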
refs https://github.com/TryGhost/DevOps/issues/119
- this function can simply call the `import` function, which runs the
same code we had here
- this makes the code cleaner to read and understand
refs https://github.com/TryGhost/DevOps/issues/119
- this switches away from using a static list of names in favor of ones
generated by faker, so we don't run into duplicate names
- also minor code re-arranging
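Roughly what the name generation looks like, assuming a v8-style @faker-js/faker API; the uniqueness guard is illustrative rather than the exact importer code:
```js
const {faker} = require('@faker-js/faker');

const seenNames = new Set();

function uniqueFullName() {
    let name;
    do {
        name = faker.person.fullName();
    } while (seenNames.has(name));
    seenNames.add(name);
    return name;
}
```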
fixes https://github.com/TryGhost/Product/issues/3738
https://www.notion.so/ghost/Member-Session-Invalidation-13254316f2244c34bcbc65c101eb5cc4
- Adds the transient_id column to the members table. This defaults to
email, to keep it backwards compatible (not logging out all existing
sessions)
- Instead of using the email in the cookies, we now use the transient_id
- Updating the transient_id means invalidating all sessions of a member
- Adds an endpoint to the Admin API to log out a member from all devices
- Added the `all` body property to the DELETE session endpoint in the
members API. Setting it to true will sign a member out from all devices.
- Adds a UI button in Admin to sign a member out from all devices
- Portal 'sign out of all devices' will not be added for now
Related changes (added because these areas were affected by the code
changes):
- Adds a serializer to member events / activity feed endpoints - all
member fields were returned here, so the transient_id would also be
returned - which is not needed and bloats the API response size
(`transient_id` is not a secret because the cookies are signed)
- Removed `loadMemberSession` from public settings browse (not used
anymore + bad pattern)
Performance tests on a site with 50,000 members (on a MacBook M1 Pro):
- Migrate: 6s (adding column: 4s, setting to email: 1s, dropping
nullable: 1s)
- Rollback: 2s
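A minimal sketch of the migration steps described above, written as plain knex; Ghost's real migrations go through its own migration utilities, so treat the exact calls as illustrative:
```js
module.exports = {
    async up(knex) {
        // 1. add the column as nullable (fast schema change)
        await knex.schema.alterTable('members', (table) => {
            table.string('transient_id', 191).nullable();
        });
        // 2. default existing members to their email so current sessions stay valid
        await knex('members').update({transient_id: knex.raw('email')});
        // 3. drop the nullable constraint now that every row has a value
        await knex.schema.alterTable('members', (table) => {
            table.string('transient_id', 191).notNullable().alter();
        });
    },
    async down(knex) {
        await knex.schema.alterTable('members', (table) => {
            table.dropColumn('transient_id');
        });
    }
};
```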
refs: https://github.com/TryGhost/DevOps/issues/11
This is a pretty huge commit, but the relevant points are:
* Each importer no longer needs to be passed a set of data; it just gets the data it needs
* Each importer specifies its dependencies, so that the order of import can be determined at runtime using a topological sort (sketched below)
* The main data generator function can just tell each importer to import the data it has
This makes working on the data generator much easier.
Some other benefits are:
* Batched importing, massively speeding up the whole process
* `--tables` to set the exact tables you want to import, and specify the quantity of each
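A sketch of the dependency-ordered import; the importer shape (`table`, `dependencies`) is illustrative:
```js
function sortImporters(importers) {
    const byTable = new Map(importers.map(importer => [importer.table, importer]));
    const sorted = [];
    const done = new Set();

    function visit(importer, path = new Set()) {
        if (done.has(importer.table)) {
            return;
        }
        if (path.has(importer.table)) {
            throw new Error(`Circular dependency involving ${importer.table}`);
        }
        path.add(importer.table);
        for (const dependency of importer.dependencies || []) {
            const dependencyImporter = byTable.get(dependency);
            if (dependencyImporter) {
                visit(dependencyImporter, path);
            }
        }
        done.add(importer.table);
        sorted.push(importer);
    }

    importers.forEach(importer => visit(importer));
    return sorted;
}
```
With this, an importer that depends on e.g. `members` and `newsletters` always runs after both, regardless of the order it was registered in.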
no issue
Previously, when the base data included labels for members, randomly generated labels would also be created. This prevents that, and ensures that the base-data labels are applied correctly to members.
no issue
Previously, the number of opened emails was being generated incorrectly because the number of delivered emails was being reported too high.
Also, the faker date function occasionally fails for dates which are
too close together, so this switches to manually generating a date
between the two.
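The manual approach amounts to picking a random timestamp between the two dates, sketched here with an illustrative helper name:
```js
function dateBetween(start, end) {
    const timestamp = start.getTime() + Math.random() * (end.getTime() - start.getTime());
    return new Date(Math.floor(timestamp));
}

// e.g. an open event somewhere between delivery and now:
// const openedAt = dateBetween(deliveredAt, new Date());
```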