The previous migration took forever on prod, likely because ClickHouse `Map` lookups are
linear in the number of keys. `transform/3` achieves the same functionality using a hash
table, and the updated WHERE clause allows skipping most rows that don't need updating.
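A minimal sketch of the idea, assuming an `IngestRepo.query!/1` raw-SQL helper; the table,
columns, and value mapping are illustrative, not the actual migration:

```elixir
alias Plausible.IngestRepo

# transform/3 builds its lookup as a hash table once per query (instead of
# a linear Map scan per row), and the WHERE clause restricts the mutation
# to rows that would actually change.
IngestRepo.query!("""
ALTER TABLE events_v2
UPDATE country_code = transform(country_code, ['UK', 'EL'], ['GB', 'GR'])
WHERE country_code IN ('UK', 'EL')
""")
```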
Co-authored-by: Uku Taht <Uku.taht@gmail.com>
* Set log_comment with request information (see the sketch after this list)
* CRMAuthPlug -> SuperAdminOnlyPlug
* Super basic debug view
* Handle clustered setups
* Changelog entry
* Cleanup
* Fragment trick to allow Ecto querying and filtering
* Move clustered_table? function to IngestRepo module
* Format
* More resilient user_id retrieval in helper
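A hedged sketch of how these pieces could fit together; the metadata shape and variable
names are assumptions, using ecto_ch's per-query `settings` option to set the comment:

```elixir
import Ecto.Query
alias Plausible.ClickhouseRepo

# Tag queries issued for a request with a JSON log_comment
# (user_id/request_url/events_query stand in for real request data).
log_comment = Jason.encode!(%{user_id: user_id, url: request_url})
ClickhouseRepo.all(events_query, settings: [log_comment: log_comment])

# Debug view: a fragment reaches into the JSON comment so regular Ecto
# filtering still applies when reading system.query_log back.
debug_query =
  from(l in "query_log",
    prefix: "system",
    where: fragment("JSONExtractUInt(?, 'user_id')", l.log_comment) == ^user_id,
    select: %{query: l.query, duration_ms: l.query_duration_ms}
  )

ClickhouseRepo.all(debug_query)
```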
* Add data migration for creating and syncing location_data table and dictionary
* Migration to populate location data
* Daily cron to refresh location dataset if changed
* Add support for visit:country_name, visit:region_name and visit:city_name dimensions
Under the hood this relies on a `location_data` table in ClickHouse being regularly synced with
the plausible/location repo, plus dictionary lookups used in ALIAS columns.
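Roughly, in a migration (a sketch; the dictionary layout, names, and columns are
illustrative rather than the exact production schema):

```elixir
defmodule Plausible.IngestRepo.Migrations.AddLocationAliases do
  use Ecto.Migration

  def up do
    # A dictionary over location_data that ClickHouse re-reads periodically.
    execute """
    CREATE DICTIONARY IF NOT EXISTS location_data_dict (
      code String,
      name String
    )
    PRIMARY KEY code
    SOURCE(CLICKHOUSE(TABLE 'location_data'))
    LIFETIME(MIN 0 MAX 3600)
    LAYOUT(COMPLEX_KEY_HASHED())
    """

    # ALIAS columns cost no storage and resolve the name at read time, so a
    # refreshed dictionary is picked up without rewriting any rows.
    execute """
    ALTER TABLE sessions_v2
    ADD COLUMN IF NOT EXISTS country_name String
    ALIAS dictGet('location_data_dict', 'name', tuple(country_code))
    """
  end
end
```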
* Update queue name
* Update documentation
* Explicit structs
* Improve docs further
* Migration comment
* Add queues (see the Oban sketch after this list)
* Add error when already loaded
* Test for filtering by new dimensions
* Update deps
* dimension -> select_dimension
* Update a test
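Presumably the daily refresh runs through Oban; a minimal sketch of such a setup
(queue and worker names are assumptions, not the actual ones):

```elixir
# config/config.exs
import Config

config :plausible, Oban,
  queues: [locations: 1],
  plugins: [
    # Refresh the location dataset once a day; the worker should no-op
    # when the upstream dataset is unchanged.
    {Oban.Plugins.Cron, crontab: [{"0 2 * * *", Plausible.Workers.LocationsUpdate}]}
  ]
```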
* CH Migration: exit/entry hostnames in sessions_v2
* Leave only exit_page_hostname; we already record hostnames
* Use ClickHouse DDL in favour of Ecto so that the cluster is included
* Compress with ZSTD(3)
This migration will no-op in staging/production, as it has already been run there. It also
leaves behind a backup table that initially takes no extra space but will need to be cleaned
up manually.
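A hedged sketch of such a migration step (the `clustered?` check, cluster macro, and
column details are assumptions):

```elixir
# Raw ClickHouse DDL so the ON CLUSTER clause can be included; Ecto's
# migration DSL has no way to express it. ZSTD(3) compresses the highly
# repetitive hostname values well.
on_cluster = if clustered?, do: "ON CLUSTER '{cluster}'", else: ""

execute """
ALTER TABLE sessions_v2 #{on_cluster}
ADD COLUMN IF NOT EXISTS exit_page_hostname String CODEC(ZSTD(3))
"""
```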
* Get rid of PASS_V2_SCHEMA_MIGRATION
* Use in-memory domain lookup + regular table settings
* Remove faulty date arithmetic + prev part calculation
* Set V2_MIGRATION_DONE when Mix.env() == :dev
* Mute credo
This commit moves a migration that was created in `clickhouse_repo/` to
its correct folder, `ingest_repo/`. The migration was created in
1cb07efe6d prior to the read/write
separation.
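For context, each repo reads migrations from its own priv directory, so a migration
sitting under `clickhouse_repo/` never runs for IngestRepo; roughly (exact config assumed):

```elixir
# config/config.exs
config :plausible, Plausible.ClickhouseRepo, priv: "priv/clickhouse_repo"
config :plausible, Plausible.IngestRepo, priv: "priv/ingest_repo"
```

Migrations are then run per repo, e.g. `mix ecto.migrate -r Plausible.IngestRepo`.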
* Configure ingest repo access/pool size
If I'm not mistaken, 3 is a sane default; the only
inserts we're doing are:
- session buffer dump
- events buffer dump
- GA import dump
And all are serializable within their scopes?
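A sketch of the corresponding config; the env var name is an assumption:

```elixir
# Small pool: the only concurrent writers are the three dumps listed
# above, each serialized within its own scope.
config :plausible, Plausible.IngestRepo,
  pool_size: String.to_integer(System.get_env("CLICKHOUSE_INGEST_POOL_SIZE", "3"))
```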
* Add IngestRepo
* Start IngestRepo
* Use IngestRepo for inserts
* Annotate ClickhouseRepo as read_only
So that no insert* functions are generated
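This is Ecto's own `:read_only` repo option; a sketch (the adapter module is assumed):

```elixir
defmodule Plausible.ClickhouseRepo do
  # With read_only: true, `use Ecto.Repo` does not define insert*/update*/
  # delete* functions, so writes can only go through IngestRepo.
  use Ecto.Repo,
    otp_app: :plausible,
    adapter: Ecto.Adapters.ClickHouse,
    read_only: true
end
```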
* Update moduledoc
* Rename alias
* Fix default env var value so it can be cast
* Use IngestRepo for migrations
* Bump default ingest pool size from 3 to 5
in case conns are restarting or else...
* Ensure all Repo prometheus metrics are collected
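A hedged sketch of what that could look like with PromEx (assuming a PromEx-based
setup; the actual wiring may differ):

```elixir
defmodule Plausible.PromEx do
  use PromEx, otp_app: :plausible

  @impl true
  def plugins do
    [
      # Listing every repo ensures IngestRepo's query telemetry is
      # exported alongside the existing repos'.
      {PromEx.Plugins.Ecto,
       otp_app: :plausible,
       repos: [Plausible.Repo, Plausible.ClickhouseRepo, Plausible.IngestRepo]}
    ]
  end
end
```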