analytics

mirror of https://github.com/plausible/analytics.git synced 2024-11-22 02:27:57 +03:00

Author	SHA1	Message	Date
RobertJoonas	6822b29016	Average Scroll Depth Metric: put scroll depth on the dashboard under a feature flag (#4832 ) * migration: add scroll_depth to events_v2 * (cherry-pick) ingest scroll depth * replace convoluted test with more concise ones * QueryParser: parse internal scroll_depth metric + validation * turn QueryComparisonsTest into QueryInternalTest * rename file * (cherry pick) query scroll depth `15b14d3` ...and move the tests into `internal_query_test.exs` * review feedback * Get rid of unnecessary separation between aggregate and group scroll depth * Drop irrelevant other metrics in tests * add test ensuring scroll depth unavailable in Stats API v1 * Put scroll depth on the dashboard * Top Stats * Main Graph * Top Pages > Details * feature flag for dashboard scroll depth access * ignore credo warning * enable scroll_depth flag in tests * remove duplication * write timestamps explicitly in a test * revert moving tests around * Add query_comparisons_test back * Move scroll_depth tests into query_test * Delete query_internal_test * rename setup util (got updated on master) * use pageleave_factory where applicable * Use the correct generated query-api.d.ts * npm format	2024-11-20 13:13:04 +00:00
RobertJoonas	e93c97de1e	migration: add scroll_depth to events_v2 (#4827 )	2024-11-19 09:59:23 +00:00
Karl-Aksel Puulmann	9af498833e	Channels: backfill utm_medium based on click_param_id (#4833 ) * Backfill utm_medium Follow-up to https://github.com/plausible/analytics/pull/4817 * Update backfill	2024-11-19 08:12:39 +00:00
Uku Taht	0bbdbc9f42	Imported channel migration (#4815 )	2024-11-14 17:12:18 +00:00
Uku Taht	daa42cbc9d	Update acquisition channel UDF to prioritize display over paid search (#4818 ) * Update acquisition channel UDF to prioritize display over paid search * Remove migration Will run this manually together with a backfill, self-hosted will get this for free. * Add test --------- Co-authored-by: Karl-Aksel Puulmann <oxymaccy@gmail.com>	2024-11-14 16:01:34 +00:00
Uku Taht	cf4ba664ed	Tiny source data update (#4821 ) * Merge teams.microsoft.com -> Microsoft Teams * Display favicon for Linkedin	2024-11-14 13:28:45 +00:00
hq1	86b3bf4f24	Set `guest_invitations.invitation_id` not null (#4812 ) Once https://github.com/plausible/analytics/pull/4811/files is migrated.	2024-11-13 12:48:55 +00:00
hq1	7cf61c9590	Add `invitation_id` column to `guest_invitations` schema (#4811 ) Co-authored-by: Adrian Gruntkowski <adrian.gruntkowski@gmail.com>	2024-11-12 14:27:13 +00:00
Adrian Gruntkowski	9004a02f30	Set NOT NULL on `teams.allow_next_upgrade_override` (#4807 )	2024-11-12 10:04:30 +00:00
Adrian Gruntkowski	e31aeff721	Set default for `teams.allow_next_upgrade_override` schema column (#4799 )	2024-11-12 09:05:16 +00:00
Karl-Aksel Puulmann	fc83040ec1	Channels: Run TRUNCATE with alter_sync=2 (#4804 ) ON CLUSTER fails since it tries to create conflicting DDL entries on each node. Error: ```Cannot execute replicated DDL query, maximum retries exceeded. (UNFINISHED)```	2024-11-12 07:24:23 +00:00
Karl-Aksel Puulmann	4aa7dec301	Channels: Migration to add materialized column, backfill code (#4798 ) * Channels: Migration to add column, backfill code This change adds `acquisition_channel` columns to events_v2 and sessions_v2 tables. These columns are materialized - we don't ingest into them directly. Instead they're calculated based on other columns. The data migration changes now also allow backfilling the column. Tested the ability to change definitions by changing the function definitions and re-running the migration with backfill. Confirmed that the underlying data changed as expected. * quiet option * Exclude data migrations from validation * Migration consistency	2024-11-12 06:41:34 +00:00
Karl-Aksel Puulmann	3759db9b8c	Channels: Fix ON CLUSTER behavior (#4801 ) * Channels: Fix cluster behavior CREATE TABLE AS SELECT syntax did not work on cluster. Instead, let's do a normal insert. For safety and to avoid timing issues, ensure that INSERT waits for data to be inserted on all active replicas. * Proper replicated tables	2024-11-11 19:59:16 +00:00
Artur Pata	b22b35793c	Saved segments/create table (#4797 ) * Add migration for Saved Segments * Remove premature optimisation * Format * Refactor to explicit segment type	2024-11-11 16:31:43 +00:00
Karl-Aksel Puulmann	d620432227	Channels: Speed up clickhouse calculations (#4789 ) * Fix interpolation in data_migration.ex * Speed up calculating acquisition_channel in clickhouse The previous `has` queries proved to be problematic and causing a lot of CPU overhead. Benchmarked via this query: ```sql SELECT channel, count(), countIf(acquisition_channel(referrer_source, utm_medium, utm_campaign, utm_source, click_id_param) = channel) AS matches FROM events_v2 WHERE timestamp > now() - toIntervalHour(48) GROUP BY channel ORDER BY count() desc ``` Before this fix: ``` query_duration_ms: 57960 DiskReadElapsedMs: 374.712 RealTimeMs: 2891200.667 UserTimeMs: 2704024.783 SystemTimeMs: 1693.265 OSCPUWaitMs: 90.253 OSCPUVirtualTimeMs: 2705709.58 ``` After this fix: ``` query_duration_ms: 4367 DiskReadElapsedMs: 454.356 RealTimeMs: 213892.207 UserTimeMs: 199363.485 SystemTimeMs: 1479.364 OSCPUWaitMs: 13.739 OSCPUVirtualTimeMs: 200837.37 ``` Note that the new tables are not tracked in our schema as usual as they're pretty much temporary tables to create the dictionary without needing to upload files to clickhouse servers. * CREATE OR REPLACE table with SELECT	2024-11-11 10:39:51 +00:00
Karl-Aksel Puulmann	dbf7a099a3	Acquisition channels: Functions to calculate channels in clickhouse (#4701 ) * Expose a few data migration functions, add quiet option to do_run * Create functions and test acquisition channel logic in clickhouse Tests were lifted from test/plausible_web/controllers/api/external_controller_test.exs * Clean up test code a bit * Property test for acquisition channels * Handle empty strings properly in reference implementation * Fix spelling, minor issues * Revert "Property test for acquisition channels" This reverts commit `3fa0e0e4eb`. * Only test clickhouse functions * Solve minor code issue * update channels logic * Revert "Only test clickhouse functions" This reverts commit `e12784031a`. * Add more tests * Add small result assertion * Make query options explicit in data migrations * Move multi-query running logic to within datamigration lib * Unbreak numeric ids migration * Named params directly to Clickhouse * Update reference test implementation --------- Co-authored-by: Uku Taht <uku.taht@gmail.com>	2024-11-06 11:27:02 +00:00
Karl-Aksel Puulmann	4e10efe723	Channels: `click_id_param` column (#4703 ) * Add migration for click_id_source * click_id_param	2024-11-05 07:03:00 +00:00
Uku Taht	9d06d45e45	New remap sources migration that's case insensitive (#4771 )	2024-11-04 09:47:01 +00:00
Uku Taht	a1b1b84963	Remap sources migration (#4751 )	2024-10-31 08:07:50 +00:00
Uku Taht	c3a06caa97	Channel and source data updates (#4599 ) * Channel and source data updates * Update source mappings for migration * Fix codespell Co-authored-by: Karl-Aksel Puulmann <macobo@users.noreply.github.com> * Update lib/plausible/ingestion/acquisition.ex Co-authored-by: Karl-Aksel Puulmann <macobo@users.noreply.github.com> * Standardize access to utm params * Add wikipedia as "known" source * Move custom sources to json file * Add some advertising utm_sources * Move source mapping logic to refinspector file * Rename PlausibleWeb.RefInspector -> Plausible.Ingestion.Source * Move mapping overrides to custom_sources.json * More robust detection of paid sources * Add missing utm_sources to migration * Codespell * Add moduledoc for Plausible.Ingestion.Source * Fix dialyzer * Remove migration * Add more custom favicons * Re-generate referrer favicons file * Add doctest for sources --------- Co-authored-by: Karl-Aksel Puulmann <macobo@users.noreply.github.com>	2024-10-30 13:41:51 +00:00
Adrian Gruntkowski	1e38bd8771	Add fields and tables for teams (#4696 ) * Add migration adding team related tables and fields * Add `team_site_transfers` table to the teams migration * Remove team_id FK from api_keys table * Change new FK constraints on existing tables to `nilify_all` on delete * Ensure unique indexes on invitation_id and transfer_id fields	2024-10-17 11:28:56 +00:00
hq1	67e35fa1d2	Migration: cascade delete enterprise plans on user removal (#4684 )	2024-10-16 08:43:16 +00:00
Karl-Aksel Puulmann	141eea88ff	APIv2: Revenue metrics (#4659 ) * WIP: Start refactoring revenue metrics * Hacks to make things work * Remove old revenue code, remove revenue metrics if needed * Update query_optimizer docs * Minor fixes * Add tests around average/total revenue when non-revenue goal filtering going on * Optimize, calculate filters as expected (OR-ing clauses) * Revenue: Handle cases where revenue metrics should not be returned or nil * Expose revenue metrics in internal schema, add tests * Docstring * Remove TODO * Typegen * Solve warnings * Remove nesting * ce_test fix * Tag tests as ee_only * Fix: When filtering by revenue goal and no conversions, return 0.0 instead of nil * More straight-forward preloading logic	2024-10-09 10:18:48 +00:00
ruslandoga	5fec52ab36	Release v2.1.4 (#4660 )	2024-10-09 07:45:17 +00:00
Karl-Aksel Puulmann	5ad743c8d3	APIv2: Comparisons for breakdowns, timeseries, time_on_page (#4647 ) * Refactor comparisons to a new options format Prerequisite for APIv2 comparison work * Experiment with default include deduplication * WIP Oops, breaks `include.total_rows` * WIP * Refactor breakdown.ex * Pagination fix: dont paginate split subqueries * Timeseries tests pass * Aggregate tests use QueryExecutor * Simplify QueryExecutor * Handle legacy time-on-page metric in query_executor.ex No behavioral changes * Remove keep_requested_metrics * Clean up imports * Refactor aggregate.ex to be more straight-forward in output format building * top stats: compute comparison via apiv2 * Minor cleanups * WIP: Pipelines * WIP: refactor for code cleanliness * QueryExecutor to QueryRunner * Make compilable * Comparisons for timeseries works Except for comparisons where comparison window is bigger than source query window * Add special case for timeseries * JSON schema tests for comparisons * Test comparisons with the new API * comparison date range parsing improvement * Make comparisons api internal-only * typegen * credo * Different schemata * get_comparison_query * Add comment on timeseries result format * comparisons typegen * Percent change for revenue metrics fix * Use defstruct for query_runner over map * Remove preloading atoms	2024-10-08 10:13:04 +00:00
Adrian Gruntkowski	e11fd159df	Add `notes` column to `users` table (#4612 )	2024-09-25 14:21:26 +00:00
ruslandoga	dca2eb5b81	Update Ecto dumps (#4481 ) * update Ecto dumps * rm tmp tables from dump	2024-09-23 12:50:08 +00:00
Artur Pata	82a15884ad	Automatically generate Typescript types for v2 API query schema (#4574 ) * Generate types from query schema * Flip the query schema so private is static * Ensure private schema stays private * Refactor comment, json schema utils	2024-09-18 11:01:20 +00:00
Uku Taht	7a77ebf9bf	Add feature-flagged channels UI (#4585 ) * Add feature-flagged channels UI * Implement channels modal * Channel -> Channels tab	2024-09-18 08:34:12 +00:00
Karl-Aksel Puulmann	ef57502854	APIv2: Implement pagination and `include.total_rows` (#4575 ) Offset-based pagination is used to make sure Looker integration is able to work as efficiently as possible. To let users know how many requests they need to make, the `include.total_rows` option was added.	2024-09-12 15:51:18 +03:00
Karl-Aksel Puulmann	bd11b4cf67	APIv2: Standard iso8601 timestamps, operate on UTC (#4563 ) * query.date_range is now in UTC instead of user timezone This simplifies things down the line and fixes several bugs where query.date_range is cast to naivedatetime for ecto purposes Many places still remain broken: - comparison queries - `to_date_range` calls * Make default_for_date_range not care about time zones * Make timezone parameter mandatory for to_date_range * Simplify utc_date_range, update legacy query builder * Fix more cases where query date range is needed * query.date_range -> query.utc_time_range * Query.date_range/1 function * ensure_include_imported update * Clean up send_email_report	2024-09-11 09:21:59 +03:00
Artur Pata	52b94842c0	Assert filters are tuples, simplify schema (#4541 )	2024-09-10 18:01:42 +03:00
Karl-Aksel Puulmann	e8d544c841	Remove does_not_contain support (#4564 ) It only needed to be live until users have reloaded. This has been live for >24h.	2024-09-10 15:38:04 +03:00
Karl-Aksel Puulmann	604dde99fd	APIv2: Regex operations, consistent operators (#4488 ) * Rename matches/does_not_match filters internally These have never been exposed to the frontend/user directly, only via APIv1 filtering syntax. As such we are free to rename these without breaking things * Rename function arguments for consistency, simplify * Add support for `match`/`not_match` operators for query apiv2 These match the string against a regular expression, as defined in https://github.com/google/re2/wiki/Syntax * not_match -> match_not * does_not_contain -> contains_not Note that for backwards compatibility: - Browser handles does_not_contain in URL - Backend will handle does_not_contain in queries for a day where we will remove it for better autocompletion * not_matches_wildcard -> matches_wildcard_not * prettier * match -> matches * Fix and test fix for matches_wildcard against prop when prop is missing * Custom properties support for matches/matches_not * Restore contains_not * Test contains and contains_not behavior for custom properties	2024-09-09 10:05:24 +03:00
Uku Taht	d56d6998df	Acquisition channel (#4489 ) * WIP * Add acquisition channel * Add detection for gclid and msclkid * Add GA4 source categories file as external resource	2024-09-05 12:02:15 +03:00
Uku Taht	90b81b615f	Add migration for acquisition channel (#4531 )	2024-09-05 11:51:16 +03:00
Karl-Aksel Puulmann	8fa3a83129	APIv2: and/or/not support (#4480 ) * First approximation of AND/OR/NOT support Broken by this: - Goal filtering - Table deciding - Imports * TableDecider handle nesting * Query.remove_top_level_filters * Plausible.Stats.Imported.SQL.Expression * Handle AND/OR/NOT with imported data, create Plausible.Stats.Imported.SQL.WhereBuilder * Add parser validations for event:goal, event:hostname and event:props:x filters top level constraints * Move module around * Query.get_filter -> Filters.filtering_on_dimension? in some callsites * Filters.get_toplevel_filter * TableDecider.sessions_join_events?, remove old method * Transforming filters in query_optimizer * Query API tests for and/or/not * Reorder parser steps * Post-merge test fixups * Solve merge issue * Simplify filtering_on_dimension? * Update transformer code * dimensions_used_in_filters min_depth option, simplify parser validations * rename_dimensions_used_in_filter * fix rename_dimensions_used_in_filter * Rename a test	2024-09-04 15:44:03 +03:00
Karl-Aksel Puulmann	3310006337	Update 20240801091615_capitalize_known_sources.exs migration (#4525 ) Previous migration took forever on prod, likely because Map lookups are linear time in complexity. `transform/3` helps achieve the same functionality with the help of a hash table and updated WHERE clause allows skipping most rows which dont need updating Co-authored-by: Uku Taht <Uku.taht@gmail.com>	2024-09-04 13:57:28 +03:00
Adrian Gruntkowski	533bf90329	Create `user_sessions` table (#4511 )	2024-09-03 10:02:43 +02:00
Uku Taht	77248c8800	Add data migration for capitalizing sources (#4418 )	2024-09-02 13:59:58 +03:00
RobertJoonas	f04c47f881	Support realtime periods in API v2 (#4469 ) * add realtime date_ranges into the private API schema This commit starts parsing date ranges into a new NaiveDateTimeRange struct, rather than a simple Date.Range. * transform realtime labels into negative integers + test * move schema type argument to last position in helper functions * allow passing a date param + tests * Update test/plausible/stats/query_parser_test.exs Co-authored-by: Karl-Aksel Puulmann <macobo@users.noreply.github.com> * Update test/plausible/stats/query_parser_test.exs Co-authored-by: Karl-Aksel Puulmann <macobo@users.noreply.github.com> * Update test/plausible/stats/query_parser_test.exs Co-authored-by: Karl-Aksel Puulmann <macobo@users.noreply.github.com> * Update test/plausible/stats/query_parser_test.exs Co-authored-by: Karl-Aksel Puulmann <macobo@users.noreply.github.com> * keep test file structure consistent * Turn NaiveDateTimeRange into DateTimeRange * change 'now' field from NaiveDateTime to DateTime in v2 query * fix minute interval labels + add missing tests * return query_result.date_range as iso8601 timestamps with timezone * allow timestamps with tz as date_range arguments in API v2 * delete Plausible.Timezones.to_utc_datetime * simplify returning comparison periods * add comment about realtime not supported in comparisons * pass only now instead of test_opts * drop redundant else branch * separate tests * stick to a single check_date_range function in tests * fix credo error --------- Co-authored-by: Karl-Aksel Puulmann <macobo@users.noreply.github.com>	2024-09-02 12:56:58 +03:00
hq1	05136398cf	Migration: add installation meta (#4486 )	2024-08-29 11:05:36 +02:00
Karl-Aksel Puulmann	9c71161eab	APIv2: JSON schema validation, separate internal and public API validation (#4464 ) * Restore `date` internal parameter, validate via json schema * Improved error formatting from json schema, get most tests passing * Handle internal overrides to JSON schema * Parsing tests all pass * Remove some repeated code, enforce length/uniqueness in schema * Explicit separation between internal and public API validation * Mark file as external_resource * map_join * Update query tests * Update query tests * Serve schema under an /api/docs/query/schema.json endpoint * dotify errors	2024-08-26 14:01:27 +03:00
ruslandoga	b64038af48	Fix alias migration warning (#4465 )	2024-08-26 11:16:13 +02:00
Karl-Aksel Puulmann	11acadfde9	APIv2: docs-related changes (#4453 ) * Order QueryResult in API response This improves experience in docs when querying interactively * More utm in seeds * More improved seeds * Proper QueryResult.query structure * Allow docs to query /api/v2/query and sites The new endpoints use cookie authentication. The docs site uses these endpoints to provide an interactive docs editor. * query_result ordering test * Refresh router * Test module name	2024-08-22 10:44:41 +03:00
Karl-Aksel Puulmann	4967960278	Populate log_comment with debug information, /debug/clickhouse route (#4435 ) * Set log_comment with request information * CRMAuthPlug -> SuperAdminOnlyPlug * Super basic debug view * Handle clustered setups * Changelog entry * Cleanup * fragment trick to use ecto querying, filtering * Move clustered_table? function to IngestRepo module * Format * More resilient user_id getting in helper	2024-08-14 12:33:36 +03:00
Karl-Aksel Puulmann	b88074bf1b	Fix migration typo (#4437 )	2024-08-13 12:07:18 +03:00
Karl-Aksel Puulmann	ee3d1e770e	APIv2: visit:country_name, visit:region_name, visit:city_name dimensions (#4328 ) * Add data migration for creating and syncing location_data table and dictionary * Migration to populate location data * Daily cron to refresh location dataset if changed * Add support for visit:country_name, visit:region_name and visit:city_name dimensions Under the hood this relies on a `location_data` table in clickhouse being regularly synced with plausible/location repo and dictionary lookups used in ALIAS columns * Update queue name * Update documentation * Explicit structs * Improve docs further * Migration comment * Add queues * Add error when already loaded * Test for filtering by new dimensions * Update deps * dimension -> select_dimension * Update a test	2024-08-13 09:44:58 +03:00
hq1	7fb2bfbd29	Migration: turn google auth tokens into text column type (#4428 )	2024-08-09 12:17:26 +02:00
hq1	cc769dfb3d	Edit goals with display names (#4415 ) * Update Goal schema * Equip ComboBox with the ability of JS selection callbacks * Update factory so display_name is always present * Extend Goals context interface * Update seeds Also farming unsuspecting BEAM programmers for better sample page paths :) * Update ComboBox test * Unify error message color class with helpers seen elsewhere * Use goal.display_name where applicable * Implement LiveView extensions for editing goals * Sprinkle display name in external stats controller tests * Format * Fix goal list mobile view * Update lib/plausible_web/live/goal_settings/list.ex Co-authored-by: Artur Pata <artur.pata@gmail.com> * Update lib/plausible_web/live/goal_settings/form.ex Co-authored-by: Artur Pata <artur.pata@gmail.com> * Update the APIs: plugins and external * Update test so the intent is clearer * Format * Update CHANGELOG * Simplify form tabs tests * Revert "Format" This reverts commit `c1647b5307`. * Fixup format commit that went too far * ComboBox: select the input contents on first focus * Update lib/plausible/goal/schema.ex Co-authored-by: Adrian Gruntkowski <adrian.gruntkowski@gmail.com> * Update lib/plausible/goals/goals.ex Co-authored-by: Adrian Gruntkowski <adrian.gruntkowski@gmail.com> * Update lib/plausible_web/live/goal_settings/form.ex Co-authored-by: Adrian Gruntkowski <adrian.gruntkowski@gmail.com> * Pass form goal instead of just ID * Make tab component dumber * Extract separate render functions for edit and create forms * Update test to account for extracted forms * Inline goal get query * Extract revenue goal settings to a component and avoid computing assigns in flight * Make LV modal preload optional * Disable preload for goal settings form modal * Get rid of phash component ID hack * For another render after render_submit when testing goal updates * Fix LV preload option * Enable preload back for goals modal for now * Make formatter happy * Implement support for preopening of LV modal 
* Preopen goals modal to avoid feedback gap on loading edited goal * Remove `console.log` call from modal JS * Clean up display name input IDs * Make revenue settings functional on first edit again * Display names: 2nd stage migration * Update migration with data backfill --------- Co-authored-by: Artur Pata <artur.pata@gmail.com> Co-authored-by: Adrian Gruntkowski <adrian.gruntkowski@gmail.com>	2024-08-09 11:12:00 +02:00

331 Commits