analytics/priv
Karl-Aksel Puulmann d620432227
Channels: Speed up clickhouse calculations (#4789)
* Fix interpolation in data_migration.ex

* Speed up calculating acquisition_channel in clickhouse

The previous `has` queries proved problematic, causing a lot of
CPU overhead.

Benchmarked via this query:

```sql
SELECT
  channel,
  count(),
  countIf(acquisition_channel(referrer_source, utm_medium, utm_campaign, utm_source, click_id_param) = channel) AS matches
FROM events_v2
WHERE timestamp > now() - toIntervalHour(48)
GROUP BY channel
ORDER BY count() DESC
```

Before this fix:
```
query_duration_ms:                                                57960
DiskReadElapsedMs:                                                374.712
RealTimeMs:                                                       2891200.667
UserTimeMs:                                                       2704024.783
SystemTimeMs:                                                     1693.265
OSCPUWaitMs:                                                      90.253
OSCPUVirtualTimeMs:                                               2705709.58
```

After this fix:
```
query_duration_ms:                                                4367
DiskReadElapsedMs:                                                454.356
RealTimeMs:                                                       213892.207
UserTimeMs:                                                       199363.485
SystemTimeMs:                                                     1479.364
OSCPUWaitMs:                                                      13.739
OSCPUVirtualTimeMs:                                               200837.37
```
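To put the two measurements side by side, the speedup per metric can be computed as a plain ratio. This is a minimal sketch rather than part of the commit: the `speedup` helper is a hypothetical name, and the values are copied from the two blocks above.

```python
# Timings in milliseconds, copied from the "Before this fix" and
# "After this fix" blocks of the commit message.
before = {
    "query_duration_ms": 57960,
    "RealTimeMs": 2891200.667,
    "UserTimeMs": 2704024.783,
    "OSCPUVirtualTimeMs": 2705709.58,
}
after = {
    "query_duration_ms": 4367,
    "RealTimeMs": 213892.207,
    "UserTimeMs": 199363.485,
    "OSCPUVirtualTimeMs": 200837.37,
}


def speedup(before, after):
    """Hypothetical helper: ratio of old to new time for each metric."""
    return {name: before[name] / after[name] for name in before}


speedups = speedup(before, after)
for name, factor in speedups.items():
    print(f"{name}: {factor:.1f}x faster")
```

Each of these metrics comes out at roughly 13x, so the drop in wall-clock duration closely tracks the drop in CPU time, which is consistent with the query having been CPU-bound.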

Note that the new tables are not tracked in our schema the usual way,
since they are essentially temporary tables used to create the dictionary
without needing to upload files to ClickHouse servers.

* CREATE OR REPLACE table with SELECT
2024-11-11 10:39:51 +00:00
data_migrations Channels: Speed up clickhouse calculations (#4789) 2024-11-11 10:39:51 +00:00
ingest_repo Channels: click_id_param column (#4703) 2024-11-05 07:03:00 +00:00
json-schemas APIv2: Revenue metrics (#4659) 2024-10-09 10:18:48 +00:00
ref_inspector Update ref_inspector database (#3697) 2024-01-22 09:30:38 +01:00
repo Add fields and tables for teams (#4696) 2024-10-17 11:28:56 +00:00
static Add curl browser icon (#4372) 2024-07-18 15:30:28 +02:00
tracker/js Remove static tracker files (#2116) 2022-10-11 12:19:28 +02:00
ua_inspector Update ua_inspector (#4284) 2024-07-01 09:30:09 +02:00
verification Verification tweaks (#4234) 2024-06-18 05:58:56 +02:00
custom_sources.json Channel and source data updates (#4599) 2024-10-30 13:41:51 +00:00
ga4-source-categories.csv Acquisition channel (#4489) 2024-09-05 12:02:15 +03:00
legacy_plans.json Legacy plans (#3455) 2023-10-25 13:46:55 +03:00
paddle_sandbox.pem Move limit enforcement to accepting site ownership transfer (#3612) 2023-12-20 14:56:49 +00:00
paddle.pem Initial commit 2019-09-02 12:29:19 +01:00
placeholder_favicon.ico Add fallback for favicon (#2279) 2022-09-28 08:55:46 -03:00
plans_v1.json List plan benefits on the new upgrade page (#3444) 2023-10-23 19:42:00 +03:00
plans_v2.json List plan benefits on the new upgrade page (#3444) 2023-10-23 19:42:00 +03:00
plans_v3.json List plan benefits on the new upgrade page (#3444) 2023-10-23 19:42:00 +03:00
plans_v4.json Add data retention to the choose plan page (#3605) 2023-12-12 08:19:54 -03:00
referer_favicon_domains.json Channel and source data updates (#4599) 2024-10-30 13:41:51 +00:00
sandbox_plans.json Add data retention to the choose plan page (#3605) 2023-12-12 08:19:54 -03:00