27 Commits

Author SHA1 Message Date
Karl-Aksel Puulmann
3310006337
Update 20240801091615_capitalize_known_sources.exs migration (#4525)
The previous migration took forever on prod, likely because Map lookups take linear time.
`transform/3` achieves the same result using a hash table, and the updated
WHERE clause skips most rows that don't need updating.

Co-authored-by: Uku Taht <Uku.taht@gmail.com>
2024-09-04 13:57:28 +03:00
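
The reasoning in the #4525 message above (linear lookups versus a hash table, plus a narrower WHERE clause) can be sketched roughly as follows. This is an illustrative Python analogy only, not the migration's actual SQL: the source names, function names, and the assumption that unmatched values pass through unchanged are all hypothetical.

```python
# Hypothetical mapping; the real list of known sources lives in the migration itself.
KNOWN_SOURCES = [("google", "Google"), ("bing", "Bing"), ("duckduckgo", "DuckDuckGo")]


def capitalize_linear(source):
    """Scan the key/value pairs one by one: O(n) work per row."""
    for key, value in KNOWN_SOURCES:
        if key == source:
            return value
    return source  # assumption: unknown values are left unchanged


# Build the hash table once, then reuse it for every row.
LOOKUP = dict(KNOWN_SOURCES)


def capitalize_hashed(source):
    """Hash lookup with the input as fallback: O(1) on average per row."""
    return LOOKUP.get(source, source)


def rows_to_update(rows):
    """Analogue of the narrowed WHERE clause: keep only rows that would change."""
    return [row for row in rows if row in LOOKUP]


rows = ["google", "Google", "example.com", "bing"]
updated = [capitalize_hashed(row) for row in rows]
```

Both functions return the same values; the difference is the per-row cost, which matters when the update touches a very large table. Filtering first means already-capitalized and unknown rows are never rewritten.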
Uku Taht
77248c8800
Add data migration for capitalizing sources (#4418)
2024-09-02 13:59:58 +03:00
ruslandoga
b64038af48
Fix alias migration warning (#4465)
2024-08-26 11:16:13 +02:00
Karl-Aksel Puulmann
4967960278
Populate log_comment with debug information, /debug/clickhouse route (#4435)
* Set log_comment with request information

* CRMAuthPlug -> SuperAdminOnlyPlug

* Super basic debug view

* Handle clustered setups

* Changelog entry

* Cleanup

* fragment trick to use ecto querying, filtering

* Move clustered_table? function to IngestRepo module

* Format

* More resilient user_id getting in helper
2024-08-14 12:33:36 +03:00
Karl-Aksel Puulmann
ee3d1e770e
APIv2: visit:country_name, visit:region_name, visit:city_name dimensions (#4328)
* Add data migration for creating and syncing location_data table and dictionary

* Migration to populate location data

* Daily cron to refresh location dataset if changed

* Add support for visit:country_name, visit:region_name and visit:city_name dimensions

Under the hood this relies on a `location_data` table in ClickHouse that is regularly synced with
the plausible/location repo, and on dictionary lookups used in ALIAS columns

* Update queue name

* Update documentation

* Explicit structs

* Improve docs further

* Migration comment

* Add queues

* Add error when already loaded

* Test for filtering by new dimensions

* Update deps

* dimension -> select_dimension

* Update a test
2024-08-13 09:44:58 +03:00
ruslandoga
a139e8295c
Fix warning in migration (#4133)
2024-05-24 16:26:25 +02:00
RobertJoonas
c106595be0
Migration: add imported custom events (#4076)
* Add `imported_custom_events` to CH

* remove redundant table setting

* add path column

---------

Co-authored-by: Adrian Gruntkowski <adrian.gruntkowski@gmail.com>
2024-05-07 11:46:27 +01:00
Karl-Aksel Puulmann
850a843d82
Add migration to add ALIAS columns to common session/visit properties (#4058)
This allows for simplifications in the API code
2024-05-05 11:30:39 +03:00
Adrian Gruntkowski
069170eb1d
Add active_visitors column to imported_pages CH table (#4028)
2024-04-22 09:34:01 +02:00
RobertJoonas
bd73fc8266
CH Migration - add more imported metrics and properties (#3949)
* add migration

* add utm_source to imported_sources

* quickfix

* satisfy credo

* Revert "satisfy credo"

This reverts commit bb0b228164.

* Revert "quickfix"

This reverts commit ab6f70c79e.

---------

Co-authored-by: ruslandoga <67764432+ruslandoga@users.noreply.github.com>
2024-04-04 09:12:53 +01:00
hq1
1f778e0c11
CH Migration: exit page hostname on sessions_v2 (#3953)
* CH Migration: exit/entry hostnames in sessions_v2

* Leave only exit_page_hostname, we already record hostnames

* Use ClickHouse DDL in favour of ecto so that cluster is included

* Compress with ZSTD(3)
2024-04-03 09:42:47 +02:00
Karl-Aksel Puulmann
70d21f1b69
Improve compression of event table session columns (#3864)
2024-03-05 12:45:19 +02:00
Karl-Aksel Puulmann
ea38b45685
Ecto migration to move sessions_v2 to VersionedCollapsingMergeTree (#3809)
This migration will noop in staging/production as it already has been run. It also leaves
behind a backup table that initially takes no extra space but will need to be cleaned up
manually
2024-02-28 11:22:27 +02:00
Karl-Aksel Puulmann
acd4f33c8e
Migration to improve compression options of sessions_v2 and events_v2 tables (#3803)
Part of https://3.basecamp.com/5308029/buckets/35611491/messages/7061875211
Based on work in https://3.basecamp.com/5308029/buckets/35611491/messages/6949640880#__recording_7015576045
2024-02-23 09:19:57 +02:00
Adrian Gruntkowski
6915dd5e78
Add table migrations for supporting multiple imports per site (#3785)
* Add migration deleting jobs from legacy analytics import queue

* Add `site_imports` table

* Add `import_id` column to imported_* CH tables

* Extend one of `site_imports` indexes
2024-02-13 17:26:19 +01:00
Karl-Aksel Puulmann
f5e15941de
Add migration for minmax indexes on session timestamps (#3772)
2024-02-13 10:35:24 +02:00
hq1
52d5fac362
Fixup 4cdf843ad: (#3434)
- migration is irreversible for now,
  likely due to https://github.com/plausible/ecto_ch/issues/81#issuecomment-1624754909
  or maybe https://github.com/plausible/ecto_ch/pull/58/files
- IO.inspect call removed
2023-10-17 16:07:06 +02:00
Adrian Gruntkowski
4cdf843ade
Disable deduplication for imported_* tables in CH (#3429)
* Disable deduplication for imported_* tables in CH

* Make migration cluster-aware

* Bump cache to avoid problems

* Fix cache dump

* Remove cache bump
2023-10-17 12:51:13 +02:00
Vini Brasil
d98242895b
Add revenue fields to events_v2 (#3018)
2023-06-12 18:12:41 +01:00
ruslandoga
f489d96251
skip v1 table drop in self-host (#2945)
2023-05-25 10:32:33 +03:00
hq1
57af1f19ec
Drop events/sessions tables after V2 migration (#2908)
2023-05-10 09:19:50 +02:00
hq1
b9c2110472
V2 migration tweaks for self hosted release (#2825)
* Get rid of PASS_V2_SCHEMA_MIGRATION

* Use in-memory domain lookup + regular table settings

* Remove faulty date arithmetic + prev part calculation

* Set V2_MIGRATION_DONE in Mix.env == :dev

* Mute credo
2023-04-13 12:09:39 +02:00
Adam
6637751a5e
Implement Numeric IDs migration (#2762)
* Implement Numeric IDs migration

* Fix typo

* Mute credo for now

* Improve configurability and add stop_t

* Adjust to Ch/Chto only

* Fix opts key for dictionary password

* Add regular ecto migration with numeric ids v2 schemas (#2768)

* Add regular ecto migration

* Fix typo

* Update priv/ingest_repo/migrations/20230320094327_create_v2_schemas.exs

Co-authored-by: Vini Brasil <vini@hey.com>

* Implement v2 events/sessions schema modules (#2777)

* Implement v2 events/sessions schema modules

* Clean up session schemas

---------

Co-authored-by: Vini Brasil <vini@hey.com>

* Update moduledocs

---------

Co-authored-by: Vini Brasil <vini@hey.com>
2023-03-23 09:47:41 +01:00
Adam
6d79ca5093
Switch to new clickhouse adapter (ch/chto) (#2733)
* another clickhouse adapter

* don't restore stats_removal.ex

* fix events main-graph error (#2746)

* update ch, chto

* update chto again (#2759)

* Stop treating page filter as an entry_page filter (#2752)

* remove dead code

* stop treating page filter as entry page filter in breakdown queries

* stop treating page filter as entry page filter in aggregate queries

* stop treating page filter as entry page filter in timeseries queries

* mix format

* update changelog

* break code down to smaller functions to keep credo happy

* remove unused functions

* make CSV export return only conversions with goal filter (#2760)

* make CSV export return only conversions with goal filter

* update changelog

* update elixir version in mix.exs (#2742)

* revert admin.ex changes (#2776)

---------

Co-authored-by: ruslandoga <67764432+ruslandoga@users.noreply.github.com>
Co-authored-by: ruslandoga <rusl@n-do.ga>
Co-authored-by: RobertJoonas <56999674+RobertJoonas@users.noreply.github.com>
2023-03-21 09:55:59 +01:00
Adam Rutkowski
043e3ed572
Clickhouse migration: add ingest_counters table (#2692)
* Clickhouse migration: add ingest_counters table

* Add toStartOfMinute() to ordering key

* Explicitly include column to be summarized
2023-02-23 09:34:44 +01:00
Vini Brasil
ce03b5ebd7
Move CH migration from clickhouse_repo/ to ingest_repo/ (#2683)
This commit moves a migration that was created in `clickhouse_repo/` to
its correct folder, `ingest_repo/`. The migration was created in
1cb07efe6d prior to the read/write
separation.
2023-02-16 09:00:26 +01:00
Adam Rutkowski
8f85b110aa
Split Clickhouse pools into Read-Only and Read/Write (dedicated to writes) (#2661)
* Configure ingest repo access/pool size

If I'm not mistaken 3 is a sane default, the only
inserts we're doing are:

  - session buffer dump
  - events buffer dump
  - GA import dump

And all are serializable within their scopes?

* Add IngestRepo

* Start IngestRepo

* Use IngestRepo for inserts

* Annotate ClickhouseRepo as read_only

So no insert* functions are expanded

* Update moduledoc

* rename alias

* Fix default env var value so it can be cast

* Use IngestRepo for migrations

* Set default ingest pool size from 3 to 5

in case conns are restarting or else...

* Ensure all Repo prometheus metrics are collected
2023-02-12 17:50:57 +01:00