Plausible.Purge (Plausible v0.0.1)

Deletes data from a site.

Stats are stored in ClickHouse, where, unlike in most other databases, data deletion is performed asynchronously.

All import tables have MergeTree's deduplication mechanism disabled by setting replicated_deduplication_window to 0 instead of the default 100. When the mechanism is enabled, every insert into a given table is compared against the hashes of the 100 previous inserts (as complete parts, not individual rows) and is ignored when a match is found. The purpose of this mechanism is to make inserts of the exact same batch idempotent when they are retried shortly afterwards, for instance after a timeout, when the client can't easily tell whether the previous insert succeeded. Deduplication, however, only considers inserts, not mutations. Deletions do not affect the stored hashes, so further inserts of parts that were deleted will still be treated as duplicates. That's why this feature is disabled for import tables.
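The interaction between deduplication and deletion described above can be illustrated with a toy model. This is a minimal sketch in plain Elixir, not ClickHouse or Plausible code: the DedupSketch module, its fields, and the sample row are all hypothetical, and the block hash is approximated with :erlang.phash2/1.

```elixir
defmodule DedupSketch do
  # Toy model only (hypothetical, not ClickHouse code): a table keeps its rows
  # plus the hashes of the last `window` inserted batches.
  defstruct rows: [], hashes: [], window: 100

  def new(window), do: %__MODULE__{window: window}

  # Insert a whole batch. With a non-zero window, a batch whose hash matches
  # a remembered one is silently skipped.
  def insert(%__MODULE__{} = table, batch) when is_list(batch) do
    hash = :erlang.phash2(batch)

    if table.window > 0 and hash in table.hashes do
      table
    else
      hashes = Enum.take([hash | table.hashes], table.window)
      %{table | rows: table.rows ++ batch, hashes: hashes}
    end
  end

  # Deleting rows (standing in for a mutation) leaves the remembered hashes untouched.
  def delete(%__MODULE__{} = table, reject_fun) do
    %{table | rows: Enum.reject(table.rows, reject_fun)}
  end
end

batch = [%{site_id: 1, import_id: 7, visitors: 10}]

# Window enabled: re-importing a deleted batch is treated as a duplicate.
with_dedup =
  DedupSketch.new(100)
  |> DedupSketch.insert(batch)
  |> DedupSketch.delete(&(&1.site_id == 1))
  |> DedupSketch.insert(batch)

# Window set to 0: the same sequence stores the batch again.
without_dedup =
  DedupSketch.new(0)
  |> DedupSketch.insert(batch)
  |> DedupSketch.delete(&(&1.site_id == 1))
  |> DedupSketch.insert(batch)

IO.inspect(length(with_dedup.rows), label: "rows with window 100")
IO.inspect(length(without_dedup.rows), label: "rows with window 0")
```

With the window enabled, the second insert is dropped even though the table is empty after the deletion. With the window set to 0, the re-imported batch is stored as expected.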

Although deletions are asynchronous, the parts to delete are "remembered", so there's no risk of an overlapping deletion causing problems for an import that follows right after it.

IMPORTANT: Deletion requires revision if/when import tables get moved to a sharded ClickHouse cluster setup. Mutation queries, which have to be run with ON CLUSTER in such a setup, dispatch independent queries across shards, and those queries can start at different times. This in turn creates a risk of deletions corrupting the data of follow-up inserts in some edge cases. Ideally, imported entries should be unique for a given import.

Summary

Functions

delete_imported_stats!(site)

Deletes imported stats and clears the stats_start_date field.

delete_native_stats!(site)

Moves stats pointers so that no historical stats are available.

Functions

delete_imported_stats!(site)

@spec delete_imported_stats!(Plausible.Site.t() | Plausible.Imported.SiteImport.t()) ::
  :ok

Deletes imported stats and clears the stats_start_date field.

The stats_start_date field is expected to be repopulated the next time Plausible.Sites.stats_start_date/1 is called.

If the input argument is a site, all imported stats are deleted. If it's a site import, only imported stats for that import are deleted.
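The difference between the two argument types can be pictured as dispatch on the struct that is passed in. The sketch below is an illustration only, not the actual implementation: PurgeScopeSketch, its nested Site and SiteImport structs, the scope/1 function, and the assumption that a site import carries a site_id field are all hypothetical stand-ins.

```elixir
defmodule PurgeScopeSketch do
  # Hypothetical stand-ins for Plausible.Site and Plausible.Imported.SiteImport;
  # the real structs have many more fields.
  defmodule Site do
    defstruct [:id]
  end

  defmodule SiteImport do
    defstruct [:id, :site_id]
  end

  # A site selects every imported row that belongs to it.
  def scope(%Site{id: site_id}), do: [site_id: site_id]

  # A site import narrows the selection to rows from that one import.
  def scope(%SiteImport{id: import_id, site_id: site_id}) do
    [site_id: site_id, import_id: import_id]
  end
end

site = struct!(PurgeScopeSketch.Site, id: 1)
site_import = struct!(PurgeScopeSketch.SiteImport, id: 7, site_id: 1)

IO.inspect(PurgeScopeSketch.scope(site), label: "site scope")
IO.inspect(PurgeScopeSketch.scope(site_import), label: "site import scope")
```

In this sketch the returned keyword list stands in for the filter a deletion would apply: one key when a site is given, two keys when a single import is targeted.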

delete_imported_stats!(site, import_id)

delete_native_stats!(site)

@spec delete_native_stats!(Plausible.Site.t()) :: :ok

Moves stats pointers so that no historical stats are available.