Commit Graph

914 Commits

Author SHA1 Message Date
Adrian Gruntkowski
fede2f0a8a
Make final touches to Imports & Exports (#4025)
* Make final touches to Imports & Exports

* Change import content copy depending on CSV imports and exports flag state

* Remove unused aliases
2024-04-19 11:40:13 +02:00
Adrian Gruntkowski
c10580777e
Remove references to site.imported_data (#4006)
* Remove references to `site.imported_data`

* Count pre-existing ID 0 imports when showing pageview count summary for legacy imports

* Fix tests after rebase

* Dry `delete_imported_stats!`

* Clean up remaining imported data references and add notes
2024-04-19 11:15:51 +02:00
Adrian Gruntkowski
9bae3ccce3
Improve UI/UX of imports view and GA import flow (#4017)
* Add runtime config option for enabling/disabling csv imports and exports

* Use the new option to toggle rendering exports UI

* Disable import buttons when at maximum imports or when option disabled for CSV

* Improve forms for GA import flow

* Add test for maximum imports reached

* Remove "Changed your mind?" prefixing back button

* Hide UA imports in Integrations when `imports_exports` flag is enabled

* Implement `csv_imports_exports` feature flag

* Revert "Add runtime config option for enabling/disabling csv imports and exports"

This reverts commit e30f202dd3.

* Send import notification email only to the user who ran the import

* Improve rendering of disabled button state

* Put import status heroicon in front of import label
2024-04-18 12:12:48 +02:00
RobertJoonas
3a371fdf4d
Test new imported metrics (GA4) (#4014)
* test fixture imported data with stats requests

* take visits metric from the events table in event:page breakdown

* Remove assert_referrers after all

pageReferrer is an event-scoped property in GA4 and returns unexpected
data when queried along with session-level dimensions.

Adding the pageReferrer dimension to a GA4 Data API request causes the
selected metric totals to increase significantly, even though they
shouldn't.

* Adjust sources and utm_mediums assertions

* adjust assert_pages

* Make formatter happy

---------

Co-authored-by: Adrian Gruntkowski <adrian.gruntkowski@gmail.com>
2024-04-18 12:12:24 +02:00
Adrian Gruntkowski
9849743407
Always sort occupied date ranges in Imported.clamp_dates/3 (#4018)
2024-04-18 11:15:51 +02:00
Adam Rutkowski
2b1fbf0a0e
Fix typo (and kick docker build)
2024-04-16 20:57:29 +02:00
hq1
6fb56dc1cc
Stats api hostname filter (#4008)
* Update Stats API tests

* Revert "Remove hostname filter from the external API (#3991)"

This reverts commit 884daa7943.
2024-04-16 20:36:57 +02:00
hq1
f635f0a6d3
Hostnames shield (#3990)
* Add shield hostname rules migration

* Add hostname rule schema

* Initialize hostname rules cache

* Extend Shields context with hostname related functions

* Instrument ingestion pipeline with hostname rule lookups

* Limit hostname suggestions by shield patterns

* Add LiveView for hostname rules management

* Test hostname cache

* Rename feature flag - should be separate from hostname filter

* Remove :shield_pages feature flag

* Update CHANGELOG

* Format

* Update lib/plausible/shield/hostname_rule.ex

Co-authored-by: Adrian Gruntkowski <adrian.gruntkowski@gmail.com>

* Move tests from `lib/` 🤦

* Use plain `assign` where no short-circuit is necessary

* Fine tune the copy a little bit

* Prevent misplaced tests

* Treat a test with common sense

* Fixup another test that hasn't been really run before

* Make the form hint dynamic depending on rules count

---------

Co-authored-by: Adrian Gruntkowski <adrian.gruntkowski@gmail.com>
2024-04-16 20:30:20 +02:00
RobertJoonas
fce909d041
Improve merge imported (#4003)
* Pass the actual date args that are returned in fixtures (just for clarity)

* Change select key from operating_system to os

* Select (not set) from imported data as done from native stats

* Support breakdown by imported os_version

* Remove dead code

The `query.include_imported` is set to `false` from the start when
filters are included.

* Refactor Plausible.Stats.Imported

Extract a function that does group_by and dimension select, instead of
doing both separately inside the merge_imported function body

* Further refactor of Plausible.Stats.Imported

Get rid of code repetition

* Support breakdown by imported browser_version

* Support breakdown by imported referrer

* Support breakdown by imported referrer

* remove redundant if in select

* use greatest instead of coalesce

Co-authored-by: ruslandoga <doga.ruslan@gmail.com>

* add back the :member filter handling in Stats.Imported

---------

Co-authored-by: Uku Taht <uku.taht@gmail.com>
Co-authored-by: ruslandoga <doga.ruslan@gmail.com>
2024-04-16 16:58:22 +01:00
Adrian Gruntkowski
c07f00636d
Stop importing page referrer from GA4 (#4012)
* Stop importing page referrer from GA4

* Update GA4 import fixture

* Update fixture-based test
2024-04-16 15:35:36 +02:00
Adrian Gruntkowski
c1c03b729c
Reapply "Local CSV exports/imports and S3/UI updates (#3989)" (#3995) (#3996)
* Reapply "Local CSV exports/imports and S3/UI updates (#3989)" (#3995)

This reverts commit aee69e44c8.

* remove unused functions

* eh, that one was actually used

* ugh, they were both used

---------

Co-authored-by: ruslandoga <67764432+ruslandoga@users.noreply.github.com>
2024-04-11 09:15:01 +02:00
RobertJoonas
5163880968
fix test (#4001)
* fix test

* Update test/plausible_web/controllers/api/stats_controller/suggestions_test.exs

Co-authored-by: hq1 <hq@mtod.org>

* format

---------

Co-authored-by: hq1 <hq@mtod.org>
2024-04-10 12:40:42 +01:00
hq1
378d3bc6f5
Discard sessions switching hostnames for UTM/referrer breakdowns (#4000)
* Discard sessions switching hostnames for UTM/referrer breakdowns

Co-authored-by: Uku Taht <uku.taht@gmail.com>

* Format

---------

Co-authored-by: Uku Taht <uku.taht@gmail.com>
2024-04-10 11:38:15 +02:00
Adrian Gruntkowski
aee69e44c8
Revert "Local CSV exports/imports and S3/UI updates (#3989)" (#3995)
This reverts commit 1a0cb52f95.
2024-04-09 21:26:23 +02:00
ruslandoga
1a0cb52f95
Local CSV exports/imports and S3/UI updates (#3989)
* local CSV exports/imports and S3 updates

* credo

* dialyzer

* refactor input columns

* fix ci minio/clickhouse tests

* Update lib/plausible_web/live/csv_export.ex

Co-authored-by: Adrian Gruntkowski <adrian.gruntkowski@gmail.com>

* fix date range filter in export_pages_q and process only pageviews

* remove toTimeZone(zero_timestamp) note

* use SiteImport.pending(), SiteImport.importing()

* escape [SiteImport.pending(), SiteImport.importing()]

* use random s3 keys for imports to avoid collisions (sometimes makes the upload get stuck)

* clamp import date ranges

* site is already in assigns

* recompute cutoff date each time

* use toDate(timestamp[, timezone]) shortcut

* show alerts on export cancel/delete and extract hint into a component

* switch to Imported.clamp_dates/4

* reprocess tables when imports are added

* recompute cutoff_date on each call

* actually use clamped_date_range on submit

* add warning message

* add expiry rules to buckets in make minio

* add site_id to imports notifications and use it in csv_importer

* try/catch safer

* return :ok

* date range is not available when no uploads

* improve ui and warning messages

* use Generic.notice

* fix flaky exports test

* begin tests

* Improve `Importer` notification payload shape

---------

Co-authored-by: Adrian Gruntkowski <adrian.gruntkowski@gmail.com>
2024-04-09 20:59:48 +02:00
Adrian Gruntkowski
bb108450cb
Fix flaky google import API tests due to hardcoded import ID (#3994)
2024-04-09 19:55:26 +02:00
hq1
884daa7943
Remove hostname filter from the external API (#3991)
2024-04-09 18:03:06 +02:00
Karl-Aksel Puulmann
441412a164
Return 400 when using invalid filters for stats api (#3986)
Currently a 500 is returned instead and logged to sentry.
2024-04-09 15:36:17 +03:00
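A minimal sketch of the behaviour this commit describes, written in Python for illustration rather than in the project's Elixir: a filter that cannot be parsed is turned into a 400 response instead of surfacing as an unhandled 500. The `ALLOWED_PROPERTIES` set, the `property==value` syntax with `;` separators, and all function names are assumptions made for the example, not the project's actual API.

```python
# Hypothetical subset of filterable properties, for illustration only.
ALLOWED_PROPERTIES = {"event:page", "event:name", "visit:source", "visit:country"}


class InvalidFilter(ValueError):
    """Raised when a filter expression cannot be parsed."""


def parse_filters(raw: str) -> dict[str, str]:
    """Parse 'prop==value;prop==value' into a dict, rejecting unknown input."""
    filters = {}
    for part in filter(None, raw.split(";")):
        prop, sep, value = part.partition("==")
        if not sep or prop not in ALLOWED_PROPERTIES or not value:
            raise InvalidFilter(f"Invalid filter: {part!r}")
        filters[prop] = value
    return filters


def handle_request(raw_filters: str) -> tuple[int, dict]:
    """Return (status, body); client mistakes map to 400 rather than 500."""
    try:
        filters = parse_filters(raw_filters)
    except InvalidFilter as error:
        return 400, {"error": str(error)}
    return 200, {"filters": filters}
```

Catching only the dedicated `InvalidFilter` exception keeps genuine server bugs visible as 500s while client mistakes get a descriptive 400.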
Adrian Gruntkowski
14b00c6ac3
Fix dry run mode in DataMigration.SiteImports (#3988)
2024-04-09 13:04:17 +02:00
Adrian Gruntkowski
1c1ea95e16
Ensure only complete imports are considered in site imports data migration (#3987)
* Ensure only complete imports are considered in site imports data migration

* Refactor `SiteImports` data migration for clarity (h/t @RobertJoonas)

* Fix tests
2024-04-09 11:49:28 +02:00
Adrian Gruntkowski
d796788715
Keep sites.imported_data in sync with backfilled SiteImport when migrating (#3979)
* Keep `sites.imported_data` in sync with backfilled `SiteImport` when migrating

* Consider only completed site imports in data migration
2024-04-09 09:04:51 +02:00
ruslandoga
94deb89b9d
remove Plausible Team footer from self-hosted emails (#3980)
* remove Plausible Team footer from self-hosted

* don't test unsubscribe placeholder in small build
2024-04-09 09:04:23 +02:00
Adrian Gruntkowski
b951065724
Refactor Imported.check_dates (->clamp_dates) for better flexibility (#3983)
2024-04-09 09:04:11 +02:00
Karl-Aksel Puulmann
a6d4786959
Worker to clean site data from ClickHouse (#3959)
* Create a worker to clean clickhouse deleted sites data

The plan is to run this weekly, but going to trigger it manually the first few times on cloud

* Make asserting count more reliable

* credo

* PR feedback

* Fixes
2024-04-08 12:26:38 +03:00
Adrian Gruntkowski
a7603c9e49
Improve import procedure to ensure no time range overlaps (#3970)
* Always scope import ID by site as well

* Do not schedule new import job if there are any site imports in progress

* Disable import buttons when any import is in progress

* Simplify `schedule_job/4` (h/t @RobertJoonas)
2024-04-04 18:56:36 +02:00
Adrian Gruntkowski
cffff0340c
Handle Google API timeouts gracefully during imports (#3975)
2024-04-04 18:55:39 +02:00
Adrian Gruntkowski
33eed9d7db
Delete imports which have no stats (#3972)
2024-04-04 18:55:14 +02:00
hq1
f9f0407d68
Remove experimental_hostname_filter and keep it on by default (#3973)
* Remove `experimental_hostname_filter` and keep it on by default

* Catch up with changes done via e5b56dbe6
2024-04-04 17:20:16 +02:00
RobertJoonas
e5b56dbe62
Refactor VisitorGraph (#3936)
* Give a more semantic name to a function

* Make the LineGraph component thinner

* Move LineGraph into a separate file

* Move interval logic into interval-picker.js

This commit also fixes a bug where the interval name displayed inside
the picker component flickers the default interval when the graph is
loading.

The problem was that we were counting on graphData for returning us the
current interval: `let currentInterval = graphData?.interval`

We should always know the default interval before making the main-graph
request. Sending graphData to IntervalPicker component does not make
sense anyway.

* extract data fetching functions out of VisitorGraph component

* Return graph_metric key from Top Stats API

This commit introduces no behavioral changes - only starts returning an
additional field, allowing us to avoid the following logic in React:

1. Finding the metric names, given a stat display name. E.g.
   `Unique visitors (last 30 min) -> visitors`

2. Checking if a metric is graphable or not

* Move metric state into localStorage

This commit gets rid of the internal `metric` state in the VisitorGraph
component and starts using localStorage for that instead.

This commit also chains the main-graph request into the top-stats request
callback - meaning that we'll always fetch new graph data after top stats
are updated. And we do it all in a single function.

Doing so simplifies the loading state significantly, and also helps to
make it clear, that at all times, existing top stats are required before
we can fetch the graph. That's because the metric is determined by which
Top stats are returned (for example, we can't be sure whether revenue
metrics will be returned or not).

* Make sure graph tooltip says "Converted Visitors"

* Extract a StatsExport function component

Again, instead of relying on `graphData?.interval` we can read it from
localStorage, or default to the largest interval available. The export
should not be dependent on the graph.

* Extract SamplingNotice function component

* Extract WithImportedSwitch function component

* Stop "lazy-loading" the graph and top stats

Since the container is always on top on the page, it will be visible on
the first render in any case - no matter the screen size.

* Turn VisitorGraph into a function component

* Display empty container until everything has loaded

* Do not display loading spinner on realtime ticks

* Turn Top Stats into a fn component

* fetch top stats and graph async

* Make sure revenue metrics can remain on the graph

* Add an extra check to canMetricBeGraphed

* fix typo

* remove redundant double negation
2024-04-04 13:39:55 +01:00
Karl-Aksel Puulmann
3115c6e7a8
Reducing JOINs in queries (#3966)
* Move experimental_session_count? logic to within query object

* WIP new querying system for deciding what tables to query

* both -> either

* Include sample_percent in both tables

* Remove a hanging TODO

* Allow filtering by visit props on event queries if flag is on

* Make default sessions join more conditional

* Simplify events_join_sessions?

* Add some TODOs

* Fix assignment

* Handle entry/exit page visit props separately from props stored in events table

* Update test which created sessions/events differently from everyone else

* Make query_events private

* Don't filter by session properties on events table if querying sessions and joining in events

* Handle visits, pageviews, events and visitors metrics from other table

* both -> either

* events, pageviews are strictly event metrics

* Add support for (plain) breakdowns deciding which table to use

* Run tests with experimental_reduced_joins as a separate job

Also refactor which tests are run with postgres:15 to reduce number of jobs

* moduledocs for TableDecider

* Fix matrix

* Custom build name

* Move TEST_EXPERIMENTAL_REDUCED_JOINS check

* Handle percentage separately from other metrics

* Remove debug code

* TableDecider tests

* both => sample_percent

* Improve naming

* Simplify code

* Breakdowns retain old behavior if getting metric visitors

* Unify behavior of entry/exit page hostnames with rest

* Fix test naming
2024-04-04 13:54:23 +03:00
hq1
6af80dd246
Filter by hostnames (#3963)
* CH Migration: exit/entry hostnames in sessions_v2

* Leave only exit_page_hostname, we already record hostnames

* Use ClickHouse DDL in favour of ecto so that cluster is included

* Compress with ZSTD(3)

* Expose Hostname filter in the dashboard dropdown

* Add `exit_page_hostname` to ClickHouse `sessions_v2` schema

* Start tracking hostname changes in sessions

* Implement hostname filter suggestions

* Enable filtering by `event:hostname`

* Add tests for filtering by hostnames

* Ensure filter suggestions work for exit pages too

* Allow overriding hostnames with `send_pageview` mix task

* Remove `:window_time_on_page` flag

It seems that we can remove it after all?

* Initialize `experimental_hostname_filter` query parameter

* Rewrite cache store behaviour with regards to session hostnames

* Work around inconsistent session merging

So that `populate_stats` can get closer to actual ingestion

* Improve top stats test

* Make it possible to filter sessions by entry/exit hostnames

* Update pages tests

* Expose `experimental_hostname_filtering` temporarily in the UI

* Untested yet: also apply experimental filtering to sources

* Introduce `hostname_filter` feature flag

* Format

* Test top sources with hostname filter + experimental flag
2024-04-04 10:48:30 +02:00
Adrian Gruntkowski
e6d83e946f
Populate new columns in imports and exports (#3969)
* Extend `Imported*` schemas with newly added columns

* Populate newly added `Imported*` fields in GA4 imports

* Extend exports with newly added fields

* Extend CSV importer to ingest new fields

* Fix alias shadowing error

* Add more extensive GA4 import fixtures

* Apply rounding and casting to sampled visits
2024-04-04 10:33:19 +02:00
Adrian Gruntkowski
9f27fa303c
Fix dry run mode in DataMigration.SiteImports (#3965)
2024-04-02 14:05:34 +02:00
Adrian Gruntkowski
23a3699dd7
Improve import stats toggle and with_imported flag computation (#3960)
* Check import presence across all imports and not just the first one

Also, simplify imported data toggle rendering to not explicitly
refer to the earliest import source.

* Change imported stats toggle icon in dashboard

* Test `Imported.get_imports_date_range/1`

* Simplify failed UA/GA import email copy
2024-04-02 12:53:19 +02:00
Adrian Gruntkowski
71fe541359
Implement script for backfilling legacy site import entries and adjusting end dates of site imports (#3954)
* Always select and clear import ID 0 when referring to legacy imports

* Implement script for adding site import entries and adjusting end dates

* Log cases where end date computation is using fallback

* Don't log queries when running the migration to reduce noise
2024-04-02 12:53:02 +02:00
Adrian Gruntkowski
5bf59d1d8a
Implement adjusting imported date range to actual and existing stats (#3943)
* Implement adjusting imported date range to actual and existing stats

* Drop redundant prefix from import list entries

* Make pageview numbers in imports list formatted for readability

* Test and improve date range cropping

* DRY UA and GA4 stats start and end date API calls

* Extend UA/GA import controller tests and improve error handling

* refactor finding longest open range without existing data

* Fix typo in test description

Co-authored-by: RobertJoonas <56999674+RobertJoonas@users.noreply.github.com>

* Rename `open_ranges` to `free_ranges`

---------

Co-authored-by: Robert Joonas <robertjoonas16@gmail.com>
Co-authored-by: RobertJoonas <56999674+RobertJoonas@users.noreply.github.com>
2024-03-28 09:32:41 +01:00
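The date range cropping mentioned in this commit ("finding longest open range without existing data", later renamed to `free_ranges`) can be sketched roughly as follows. This is an illustrative Python version under stated assumptions, not the project's Elixir implementation: ranges are inclusive `(start, end)` date pairs, and the function names `free_ranges` and `clamp_to_longest_free` are chosen for the example.

```python
from datetime import date, timedelta

DateRange = tuple[date, date]  # inclusive (start, end)
ONE_DAY = timedelta(days=1)


def free_ranges(requested: DateRange, occupied: list[DateRange]) -> list[DateRange]:
    """Return the parts of `requested` not covered by any occupied range."""
    start, end = requested
    free = []
    cursor = start  # first day not yet known to be occupied
    # Sorting first means each occupied range can be handled in one pass.
    for occ_start, occ_end in sorted(occupied):
        if occ_end < cursor:
            continue  # entirely before the cursor, already accounted for
        if occ_start > end:
            break  # entirely after the requested range
        if occ_start > cursor:
            free.append((cursor, occ_start - ONE_DAY))
        cursor = max(cursor, occ_end + ONE_DAY)
    if cursor <= end:
        free.append((cursor, end))
    return free


def clamp_to_longest_free(requested: DateRange, occupied: list[DateRange]):
    """Pick the longest free sub-range, or None if everything is occupied."""
    candidates = free_ranges(requested, occupied)
    if not candidates:
        return None
    return max(candidates, key=lambda r: (r[1] - r[0]).days)
```

For example, a request for all of January with existing data on 3–5 and 10–15 January would be clamped to 16–31 January, the longest gap.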
hq1
b31433a7bf
Ensure all the react container attributes are strings (#3948)
2024-03-26 11:01:59 +01:00
hq1
edf70d14b6
Use sessionStorage for "dashboard first launch" banner tracking (#3892)
* Use sessionStorage for offer e-mail report banner tracking

Keeping it within the cookie is problematic, as the banners don't
expire and overflow the cookie with data when enough new sites
are added.

Ref https://github.com/plausible/analytics/issues/3762

* Update changelog

* Extract a component

* Make is_dbip evaluate to quoted boolean
2024-03-26 09:49:15 +01:00
Karl-Aksel Puulmann
4af7019011
Ignore sessions without entry/exit pages when breaking down entry/exit pages (#3933)
* Ignore sessions without entry/exit pages when breaking down entry/exit pages

* Update stats controller tests to have more realistic test data (pageview followed by event)
2024-03-26 09:01:07 +02:00
hq1
2fae0146a4
Reapply 3918 (#3940)
* Reapply "Pages shield (#3918)"

This reverts commit 33b5c10654.

* Make the FF check work against the site actor
2024-03-25 10:36:22 +01:00
hq1
9989ce6927
Migration for 3918 (#3939)
* Revert "Pages shield (#3918)"

This reverts commit 53f94a9f82.

* Migration: Shield page rules
2024-03-25 10:19:50 +01:00
hq1
53f94a9f82
Pages shield (#3918)
* Migration: Shield page rules

* Add Ecto schema for Page Rules

* Add Page Rule cache

* Fix typo

* BTW: Use already imported function

* Extend Shields context interface + split existing tests

* Ingestion: filter matching paths + refactor shield actions

* Add LV section for adding Page Rules

* Validate max page path length

* Put Pages Shield behind a feature flag

* Update CHANGELOG

* Update docs link anchor

As per https://github.com/plausible/docs/pull/477

* Update lib/plausible/shields.ex

Co-authored-by: Adrian Gruntkowski <adrian.gruntkowski@gmail.com>

* Update lib/plausible_web/live/shields/page_rules.ex

Co-authored-by: ruslandoga <doga.ruslan@gmail.com>

* Update lib/plausible_web/live/shields/page_rules.ex

Co-authored-by: ruslandoga <doga.ruslan@gmail.com>

---------

Co-authored-by: Adrian Gruntkowski <adrian.gruntkowski@gmail.com>
Co-authored-by: ruslandoga <doga.ruslan@gmail.com>
2024-03-25 09:48:56 +01:00
Adrian Gruntkowski
ba5b80a8c0
Add label to site imports and populate it (#3914)
2024-03-22 11:17:02 +01:00
RobertJoonas
d6e1e8bebd
Put total conversions on the graph + goal-filtered CSV export improvements (#3929)
* Add validation for the events metric in main_graph

* Test the already existing events metric support in main-graph

* Put total conversions on the graph

* extract main_graph_csv function (refactor only)

* add total_conversions and conversion_rate to goal-filtered visitors.csv

* update changelog
2024-03-22 09:35:23 +00:00
Uku Taht
fd879eeb16
Store referrers from android apps (#3715)
* Store referrers from android apps

* Add test for unknown referrer protocol

* Store android referrer protocol
2024-03-21 17:45:34 +02:00
RobertJoonas
c32779a3e5
Timeseries for conversion rate (#3919)
* add conversion rate to Stats API timeseries

* make sure CR can be queried as the only metric

* add a test asserting zeros are returned

* add tests for filtering by other properties at the same time

* Remove unnecessary validation of params

1. It doesn't make sense to validate `interval` (and its granularity) in all
   endpoints. It's only relevant for the main graph.

2. The plug (renamed to `date_validation_plug`) already makes sure that
   the dates are validated. No need to call the same function again in
   Top Stats and Funnel endpoints.

* add metric validation to main graph

* Add tests for main graph API

* put conversion rate on the graph

* update changelog

* Add revenue metrics into metrics.ex

* make fn private

* avoid setting graph metric to visitors in goal-filtered view
2024-03-21 13:58:00 +00:00
Adrian Gruntkowski
d6e81670e4
Unify UA and GA4 import flow into one (#3888)
* Unify GA4 and UA import flow into one

* Clean up property and view data retrieval via Google HTTP APIs

* Turn `Map.get` into `Map.fetch!` in API response processing code

* Bump list account summaries page size limit to max of 200

* Show only views in legacy flow and fix legacy redirect after import start

* Move google analytics import actions tests to a separate module

* Extend Google Analytics controller tests

* DRY up `property?` predicate (h/t @RobertJoonas)
2024-03-21 11:37:10 +01:00
ruslandoga
5f9465614b
Include domain and dates in zip archive filename (#3921)
* include domain and dates in zip archive filename

* adapt to comments
2024-03-21 11:35:42 +01:00
Karl-Aksel Puulmann
32ab138301
Fix issue with name clash (#3925)
Unexpectedly, table.name caused a name clash after CR refactor, so using a unique name
for the output column

Sentry issue: https://sentry.plausible.io/organizations/sentry/issues/5612
2024-03-21 10:11:29 +00:00
Karl-Aksel Puulmann
c219652dae
Re-apply Move conversion_rate logic from elixir to clickhouse (#3924)
* Revert "Revert "Move conversion_rate logic from elixir to clickhouse (#3887)"…"

This reverts commit 253fb5d67d.

* Fix issue with missing columns

The issue came from refactoring event:goal UNION ALL logic and trying to move
name select from first to last. If any other tables were joined, the incorrect
item would be used as an array index, causing this issue.

Added a relevant test.
2024-03-21 10:48:41 +02:00
Karl-Aksel Puulmann
253fb5d67d
Revert "Move conversion_rate logic from elixir to clickhouse (#3887)" (#3923)
This reverts commit 1909743b90.
2024-03-21 09:53:31 +02:00
Karl-Aksel Puulmann
1909743b90
Move conversion_rate logic from elixir to clickhouse (#3887)
* Separate out query building from pagination/execution logic.

* Refactor pageview_goals breakdown query, removing index column from results

* Remove zip_columns logic

* Use common pagination util

* Do everything in a single query for breakdowns for goals

* Order in DB

* Make sure column order is identical

* Calculate CR within the goal breakdown query

* Calculate CR for property breakdowns

* WIP: Calculate group CR

* CR with order_by

* Compatibility fix

* Import Ecto.Query and cleanup

* handle total_visitors the same way as add_percentage

* Handle conversion_rate in aggregate.ex

* Solve rebase fail

* Simplify maybe_add_group_conversion_rate

* Add conversion_rate defaults to 0 test

* Add test for conversion_rate should not be calculated with imported data (failing here and on master)

* Don't include imported data when breakdown by prop or goal

* Remove revenue_nils
2024-03-21 09:38:44 +02:00
ruslandoga
279e89c693
CSV imports (no UI) (#3895)
* encode/decode date range in filenames

* Update lib/plausible/imported/csv_importer.ex

Co-authored-by: Adrian Gruntkowski <adrian.gruntkowski@gmail.com>

* Update lib/plausible/imported/csv_importer.ex

Co-authored-by: Adrian Gruntkowski <adrian.gruntkowski@gmail.com>

* drop unused functions

* send failure email if there is no data to export

* use PlausibleWeb.Email.mailer_email_from()

* ensure we get dates from minmax date query

---------

Co-authored-by: Adrian Gruntkowski <adrian.gruntkowski@gmail.com>
2024-03-19 12:06:47 +01:00
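Encoding and decoding a date range in a filename, as the first bullet of this commit mentions, might look like the sketch below. It is written in Python for illustration; the `<table>_<YYYYMMDD>_<YYYYMMDD>.csv` layout and the function names are assumptions for the example and are not confirmed by the log.

```python
import re
from datetime import date, datetime

# Assumed layout: <table>_<start YYYYMMDD>_<end YYYYMMDD>.csv
FILENAME_RE = re.compile(r"^(?P<table>[a-z_]+)_(?P<start>\d{8})_(?P<end>\d{8})\.csv$")


def encode_filename(table: str, start: date, end: date) -> str:
    """Build a filename that carries the table name and the date range."""
    return f"{table}_{start:%Y%m%d}_{end:%Y%m%d}.csv"


def decode_filename(filename: str) -> tuple[str, date, date]:
    """Recover (table, start, end) from a filename, or raise ValueError."""
    match = FILENAME_RE.match(filename)
    if match is None:
        raise ValueError(f"Unrecognised filename: {filename!r}")
    start = datetime.strptime(match["start"], "%Y%m%d").date()
    end = datetime.strptime(match["end"], "%Y%m%d").date()
    return match["table"], start, end
```

Keeping the range in the filename lets an importer learn which dates a file covers without reading its contents.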
ruslandoga
4242b52be4
Allow importing extra config (#3906)
* allow importing extra config

* changelog

* fix typo

* add test
2024-03-19 12:02:52 +01:00
Karl-Aksel Puulmann
02d2256483
Set exit_page only on pageviews (#3870)
* Set exit_page only on pageviews

* Update tests

* Update entry_page on first pageview

* Update CHANGELOG.md
2024-03-18 11:11:15 +02:00
ruslandoga
07b714a143
Update Sentry (#3843)
* update Sentry

* Sentry.HTTPClient.child_spec is now optional

* Sentry.EventFilter is deprecated

* update sentry to 10.2.0

* fix dialyzer warnings
2024-03-18 10:10:20 +01:00
ruslandoga
5e74b1cf74
CSV exports (UI) (#3875)
* ui

* fix redirect link

* improve make minio

* use implicit button form for csv export

* add exports_bucket helper

* read S3_EXPORTS_BUCKET

* supply s3_bucket in export_csv job args

* make plausible_minio use unprotected port

* move s3_csv_export queue to base queues

* Update lib/plausible_web/controllers/site_controller.ex

Co-authored-by: Adrian Gruntkowski <adrian.gruntkowski@gmail.com>

---------

Co-authored-by: Adrian Gruntkowski <adrian.gruntkowski@gmail.com>
2024-03-18 08:52:57 +01:00
hq1
59afa20955
Reapply #3878 + bugfix hit rate tracking (#3891)
* Reapply "Replace caching engine (#3878)" (#3883)

This reverts commit c5881cdc6d.

* Ensure hit rate is tracked on `get_or_store`

* Remove :wx and :observer

* Remove unused deps

* Use `:set` table type
2024-03-14 08:06:12 +01:00
RobertJoonas
e8f3946dde
Fix division by zero in imported queries (#3890)
* prevent division by 0 in merge_imported queries

* Revert "fix bounce_rate change bug (#3886)"

This reverts commit 6eef32a8ff.

After 02aa0b2, we can keep on assuming that bounce rate is always numeric.
2024-03-13 10:37:14 +00:00
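The guard against division by zero can be illustrated with a small Python sketch. The formula (a visit-weighted bounce rate over native plus imported data) and the name `merged_bounce_rate` are assumptions for the example; the actual fix lives in the project's ClickHouse queries.

```python
def merged_bounce_rate(
    native_bounces: int,
    native_visits: int,
    imported_bounces: int,
    imported_visits: int,
) -> float:
    """Combine native and imported bounce counts into one percentage.

    Returns 0.0 when there are no visits at all, so callers can keep
    assuming the result is always numeric.
    """
    total_visits = native_visits + imported_visits
    if total_visits == 0:
        return 0.0  # avoid dividing by zero on empty periods
    return round((native_bounces + imported_bounces) / total_visits * 100, 1)
```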
Adrian Gruntkowski
4d7d88cfec
Implement basics of GA4 import (#3851)
* Implement LV date input using flatpickr

* Implement basics of GA4 import (very dirty WIP)

* Split Google HTTP API into UA and GA4 specific parts

* Add a quick way to record GA4 API responses

* Add first GA4 import fixtures with GA4 Data API responses

* Extract GA4 and UA specific logic from Google API

* Extract UA and GA4 specific actions to distinct controllers

* Add integration test for GA4 importer

* Update GA4 fixtures

* Test GA4 API

* Add debug logging and fix paginating through API results in GA4 import

* Revert "Implement LV date input using flatpickr"

This reverts commit c696f8ee39d5702f27015c09a4f079ca124cc7bb.

* Fix note
2024-03-12 18:08:25 +01:00
ruslandoga
f2350b5165
Add /tmp/ to .gitignore and simplify s3 cleanups in tests (#3889)
2024-03-12 17:58:05 +01:00
ruslandoga
5a3072ca21
CSV exports (no UI) (#3836)
* csv exports

* use ex_unit's tmp_dir
2024-03-12 17:27:27 +01:00
RobertJoonas
7641c66a2b
Stats api time on page (#3858)
* add metric validation + support in aggregate

* add a test ensuring comparison works

* disallow time_on_page with a goal filter

* Return time_on_page as `nil` from aggregate API

In case time_on_page cannot be calculated, we'll return it as `nil` from
the Stats API.

This is to make the behaviour consistent between breakdown and aggregate
endpoints. As for the UI, we'll still continue to report time_on_page as
0 - not changing any UI behaviour as discussed with Marko.

* add tests for time_on_page in event:page breakdown

* update changelog

* invalidate time_on_page with event:name filter

* add the ability to only query time_on_page in page breakdown

We'll need the visitors metric to get the list of pages to calculate the
time_on_page for.
2024-03-12 10:00:32 +00:00
hq1
c5881cdc6d
Revert "Replace caching engine (#3878)" (#3883)
This reverts commit 437a3350ff.
2024-03-12 08:30:16 +01:00
hq1
437a3350ff
Replace caching engine (#3878)
* Dependencies: swap Cachex for ConCache

* Implement Cache adapter wrapping ConCache

* Implement cache stats tracker, for metrics

* Use Cache.Adapter in Plausible.Cache

Marking the test as not slow anymore

* Use Cache Adapter when tracking sessions

* Use Cache Adapter for UA parsing

* Rename child identifiers - cachex is obsolete now

* Test stats tracking

* Update grafana metrics

* Put all caches under common child specification

* Try less

* Shorten the function delegation path
2024-03-12 07:58:12 +01:00
Karl-Aksel Puulmann
a9d3c03782
Validate the same metric isn't queried multiple times in external stats API (#3871)
* Validate the same metric isn't queried multiple times in external stats API

Issue: https://3.basecamp.com/5308029/buckets/35611491/card_tables/cards/7161347855

* Changelog entry

* Make credo happy
2024-03-08 10:46:18 +02:00
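A check of this kind can be sketched as below, in Python for illustration. The comma-separated `metrics` parameter format, the error wording and the function name `validate_metrics` are assumptions for the example.

```python
from collections import Counter


def validate_metrics(raw: str) -> list[str]:
    """Split a comma-separated metrics string and reject repeated names."""
    metrics = [name.strip() for name in raw.split(",") if name.strip()]
    duplicates = sorted(name for name, count in Counter(metrics).items() if count > 1)
    if duplicates:
        raise ValueError(
            f"Metrics cannot be queried multiple times: {', '.join(duplicates)}"
        )
    return metrics
```

Rejecting the request up front gives the caller a clear message instead of letting a duplicated column name fail later in the query layer.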
Karl-Aksel Puulmann
0cdba7d407
Fix broken tests (#3867)
2024-03-06 17:14:04 +02:00
Karl-Aksel Puulmann
c6d98397a8
Move add_percentage logic into clickhouse (#3854)
* Remove `add_percentage`, calculate percentages in clickhouse queries

This simplifies querying logic and avoids doing extra queries and avoids
race conditions.

* Remove special none handling from breakdowns, handling percentages correctly

* Add (failing) test showing expected add_percentage behavior for user making multiple sessions

* Update add_percentage behavior to use separate subqueries
2024-03-06 11:08:25 +02:00
Karl-Aksel Puulmann
c60a2faee4
Write event table session columns (#3865)
* Write event table session columns

* Update testing factory rig
2024-03-06 10:59:24 +02:00
Karl-Aksel Puulmann
8d977e0f76
Tests: session properties without the prefixes (#3863)
* Undo event session attributes renaming

* Rename session_ attributes in tests
2024-03-05 12:44:33 +02:00
Karl-Aksel Puulmann
d5048fd6b4
Stop writing session properties into events table (#3800)
* Refactor: Explicitly add field names to INSERT

This avoids issues when code schema is out of sync with real schema

* Don't write session parameters to events

These would only be stored on first event anyways. Work remains to be done
on tests which have their own helper

* Remove writes to country_code in a test

* Remove old columns from being accessible in elixir code

* Update most tests to use new way of adding session props to events

* Update testing harness

* Update stats controller test

* Update for shield rules

* update breakdown tests

* Fix typing of state for dialyzer

* Drop support for old session attributes code

* Update remaining tests

* cond -> if
2024-03-01 10:53:56 +02:00
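The first bullet's point, that naming fields explicitly in an INSERT protects against the code's schema drifting from the real one, can be demonstrated with a small Python sketch. SQLite is used here purely as a stand-in for ClickHouse, and the table and column names are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (name TEXT, pathname TEXT)")

# The real schema gains a column that the application code does not know about.
conn.execute("ALTER TABLE events ADD COLUMN hostname TEXT DEFAULT ''")

row = ("pageview", "/blog")

# A positional INSERT now fails: two values for a three-column table.
try:
    conn.execute("INSERT INTO events VALUES (?, ?)", row)
    positional_ok = True
except sqlite3.Error:
    positional_ok = False

# Naming the fields keeps working; the unknown column takes its default.
conn.execute("INSERT INTO events (name, pathname) VALUES (?, ?)", row)
stored = conn.execute("SELECT name, pathname, hostname FROM events").fetchall()
```

With explicit field names, only the columns the code knows about are written, so a schema change deployed ahead of the code does not break ingestion.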
hq1
5eb9d724e5
Plugins API: mark data_domain as nullable in capabilities schema (#3840)
* Plugins API: mark data_domain as nullable in capabilities schema

* fixup
2024-02-28 10:26:01 +01:00
Adrian Gruntkowski
39aa81a16f
Implement UI for multiple imports (#3727)
* Create a stub of site settings section for imports and exports

* Use legacy site import indication to determine UA import handling

* Add provisional logos for upcoming import sources

* Stub basics of import page

* Add very rudimentary support for multiple UA imports

* Implement imports list as live view

* Add support for opening LV modal from backend and closing from frontend

* Introduce notion of themes to `button` and `button_link` components

* Add confirmation modal on deleting import

* Swap GA4 logo

* Implement disabled state support for `button_link` component

* Disable export and non-implemented import sources

* Use native stats start date for upper boundary of import time range

* Ensure integrations view uses legacy UA import flow

* Remove unnecessary preload in SiteController

* Remove unnecessary exception for legacy imports

* Move API controller stats tests under PlausibleWeb

* Test listing imports

* Add test for explicit listener setup

* Add tests for legacy flag state in UA importer

* Add test for purging legacy import data

* Add tests for `Sites.native_stats_start_date`

* Test forgetting imports

* Add `Stats.Clickhouse.imported_pageview_counts/1` and fix test flakiness

* Show page view counts on imports list

* Add tests for static imports and exports view

* Adjust button look slightly

* Use `case` instead of `cond`

* Make feature flag customisable per site

* Fix buttons and empty state styling

* Add another import to seeds

* Use JS confirm dialog instead of modal for deletion confirmations

* Revert "Add support for opening LV modal from backend and closing from frontend"

This reverts commit 260e6c753032b451542e24be9edc2118790b5a00.

* Default `legacy` to false when inserting new import jobs

* Drop `method` attribute from `button_link` and `unstyled_link` components
2024-02-28 09:34:04 +01:00
ruslandoga
f3423aefec
Add csv importer (#3795)
* add csv importer

* make table validation explicit

* update some docs

* improve docs

* add minio container to ci

* more tests

* eh

* continue

* add passing test

* add failing test

* add config test

* add minio to Makefile

* testcontainers

* remove extra whitespace

* explain the implementation a bit

* account for async deletes in tests

* bounces is UInt32

Co-authored-by: Adrian Gruntkowski <adrian.gruntkowski@gmail.com>

---------

Co-authored-by: Adrian Gruntkowski <adrian.gruntkowski@gmail.com>
2024-02-27 14:19:09 +01:00
ruslandoga
31cf3e54f8
Add Bamboo.Mua (#3654) 2024-02-27 14:18:36 +01:00
hq1
f1b6a672d4
Fix added_by not saving on adding country rules (#3835)
* Fix `added_by` not saving on adding country rules

* Format

* Remove dupe attr
2024-02-27 12:36:16 +01:00
hq1
518cdb3307
Shield: Country Rules (#3828)
* Migration: add country rules

* Add CountryRule schema

* Implement CountryRule cache

* Add country rules context interface

* Start country rules cache

* Lookup country rules on ingestion

* Remove :shields feature flag from test helpers

* Add nested sidebar menu for Shields

* Fix typo

* IP Rules: hide description on mobile view

* Prepare SiteController to handle multiple shield types

* Seed some country shield

* Implement LV for country rules

* Remove "YOU" indicator from country rules

* Fix small build

* Format

* Update typespecs

* Make docs link point at /countries

* Fix flash on top of modal for Safari

* Build the rule struct with site_id provided up-front

* Clarify why we're messaging the ComboBox component

* Re-open combobox suggestions after pressing Escape

* Update changelog

* Fix font size in country table cells

* Pass `added_by` via rule add options

* Display site's timezone timestamps in rule tooltips

* Display formatted timestamps in site's timezone

And simplify+test Timezone module; an input timestamp converted
to UTC can never be ambiguous.

* Remove no-op atom

* Display the maximum number of rules when reached

* Improve readability of remove button tests

* Credo

---------

Co-authored-by: Adrian Gruntkowski <adrian.gruntkowski@gmail.com>
2024-02-27 12:08:13 +01:00
RobertJoonas
dd428430f5
Query imported data for views_per_visit (#3830)
* query imported data for views_per_visit

* changelog update
2024-02-26 15:54:49 +00:00
RobertJoonas
52f584efa9
Group os_version by os (#3806)
* improve test

* add os to os_version breakdown

...and add operating_system_versions.csv to the CSV export

* fix conversion rate for os_version breakdown

* update changelog

* fix existing CSV tests

* use case instead of cond
2024-02-22 15:58:50 +00:00
RobertJoonas
d74b1d5e60
Reapply conversion rate into Stats API + bugfixes (#3805)
* Revert "Revert api conversion rate (#3789)"

This reverts commit 8e8790dd30.

* fix browser_version CR breakdown bug

* changelog bugfix

* inspect data structures before sending to sentry
2024-02-21 15:53:05 +00:00
hq1
6035618213
Add GET /capabilities to Plugins API (#3808)
* Add `GET /capabilities` to Plugins API

It aims to:

 - help the client verify the data-domain the token is associated with
 - list all the features available for the site's owner
   (and therefore determine availability of the subset of those for the current
   Plugins API caller)

The endpoint does not require authentication, in the sense that it'll
always respond with 200 OK. However when the token is provided,
a verification lookup is made.

* Remove IO.inspect() call

* Credo

* Aesthetics

* s/send_resp/send_error/

* Call preload just once
2024-02-21 12:41:56 +01:00
Karl-Aksel Puulmann
5cdca6f408
Remove code that updates session browser/geo/OS attributes (#3796)
This makes it impossible to store session attributes on events.

Looking at production data also revealed no cases where these updates
are effective.
2024-02-20 09:35:12 +02:00
hq1
eceac8afd5
Allow inviting users who are members already (#3797)
* Allow e-mail exclusion in team members quota

* Exclude invitee from quota on invitation create

* Enable invitation submission but report errors on quota violation

* Use a single interface for team members quota

* Check the `Keyword.validate/2` result

* Update test/plausible_web/controllers/site/membership_controller_test.exs

Co-authored-by: Uku Taht <Uku.taht@gmail.com>

---------

Co-authored-by: Uku Taht <Uku.taht@gmail.com>
2024-02-19 12:12:31 +01:00
RobertJoonas
8e8790dd30
Revert api conversion rate (#3789)
* Revert "Unify percentage change for CR and bounce_rate (#3781)"

This reverts commit a6b1a6ebc7.

* Revert "Bring Stats API up to speed: Add `conversion_rate` to Aggregate and Breakdown (#3739)"

This reverts commit 672d682e95.
2024-02-15 17:43:35 +00:00
RobertJoonas
a6b1a6ebc7
Unify percentage change for CR and bounce_rate (#3781)
* Fix conversion rate change calculation

The change in conversion rate should be calculated the same way as for bounce
rate, i.e. as a difference in percentage points. For example, an increase of
25% -> 50% should not be a 100% change, but a 25% change instead.
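
A minimal sketch of the distinction described above, written in Python for illustration only (the project itself is written in Elixir). The function names `rate_change` and `relative_change` are hypothetical and do not come from the codebase; the sketch only assumes that rate metrics are compared by subtracting percentages, as the 25% -> 50% example suggests.

```python
from typing import Optional


def rate_change(old_rate: float, new_rate: float) -> float:
    """Change between two rate metrics (e.g. conversion rate, bounce rate).

    Both inputs are already percentages, so the change is reported as the
    plain difference in percentage points.
    """
    return round(new_rate - old_rate, 1)


def relative_change(old_value: float, new_value: float) -> Optional[float]:
    """Relative change in percent, as typically used for count metrics.

    Returns None when the old value is zero, because the relative change
    is undefined in that case (an assumption made for this sketch).
    """
    if old_value == 0:
        return None
    return round((new_value - old_value) / old_value * 100, 1)


if __name__ == "__main__":
    # Conversion rate going from 25% to 50%
    print(rate_change(25.0, 50.0))      # difference in percentage points
    print(relative_change(25.0, 50.0))  # relative change, not used for rates
```

With these definitions, going from 25% to 50% is reported as a 25 point change by `rate_change`, whereas `relative_change` would report 100.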

* Use the same comparison function in Stats API and dashboard API

This commit fixes a bug where the percentage change reported by the Stats
API is different from the one returned by the internal dashboard API.

* changelog update
2024-02-15 12:10:08 +00:00
hq1
926de4dd10
Experimental session count (#3786)
* WIP

* Allow `experimental_session_count` request serialization

* Extend `Plausible.Stats.Query` with `experimental_session_count` flag

* Add `FunWithFlags` actor implementation for `Site`

* Change the way sessions are retrieved

* Remove redundant test

* Format

* Update the test

---------

Co-authored-by: Uku Taht <uku.taht@gmail.com>
2024-02-15 12:21:07 +01:00
RobertJoonas
672d682e95
Bring Stats API up to speed: Add conversion_rate to Aggregate and Breakdown (#3739)
* disable event metric with include_imported in every case

* add missing test for metric validation

* refactor metric validation functions

* implement conversion_rate metric validation

* move calculate_cr function into Stats.Util

* Refactor: Move aggregate CR logic into Stats.aggregate

* define atoms to exist

* Ensure that CR does not depend on visitors being queried

If 'visitors' are already queried, we'll use that value. Otherwise we'll
need to make another query to fetch it.

* confirm Stats API aggregate supports CR (tests only)

* small refactor

This is the only 'event_property' left after pattern matching on all
others in the function clauses defined above.

* Make it possible to optionally query conversion_rate

...in breakdown queries (excluding goal and custom prop breakdown)

* A little refactor asking for revenue metrics

1. The `@revenue_metrics` module attribute is an empty list on full build
   anyway
2. We don't need to query for revenue metrics if there are no revenue goals
   returned in the given query (even if revenue goals exist in site.goals)
3. Revenue metrics are already dropped in prop breakdown without a goal
   filter via (get_revenue_tracking_currency/3)

* Make it possible to optionally query conversion_rate (continuation)

... also from a custom prop and goal breakdown

* Frontend adjustments to the Locations report

* Display conversion rate in Regions and Cities (ListReport view)
* Display total conversions, conversions (visitors), and CR in the
  "Details" modals of Countries, Regions, and Cities
* Move the percentage into a separate column in the Countries details table

* confirm Stats API breakdown supports conversion_rate (tests only)

* small refactor: extract maybe_add_time_on_page function

* Make it possible to query cr alone

... (without the visitors metric). Already supported in aggregate, this
commit only implements it for the breakdown API.

* Reuse Stats.Util helper functions from b02db88 for aggregate API

We can follow the same logic as with breakdown for manually adding
`visitors` into the metrics list and taking it out of the response
later on.

That way we don't have to make another query, e.g. in a case where
only pageviews and conversion rate is queried. Also keeps things
consistent.
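
The approach described above (temporarily adding `visitors` to the metrics list and removing it from the response afterwards) could be sketched roughly as follows. This is an illustrative Python sketch, not the project's Elixir code: `prepare_metrics`, `finalize_row`, and this version of `calculate_cr` are hypothetical, and the sketch assumes that conversion rate is the share of converted visitors among total visitors, rounded to one decimal place.

```python
from typing import Dict, List, Tuple


def calculate_cr(total_visitors: int, converted_visitors: int) -> float:
    """Conversion rate in percent; 0.0 when there are no visitors at all."""
    if total_visitors == 0:
        return 0.0
    return round(converted_visitors / total_visitors * 100, 1)


def prepare_metrics(requested: List[str]) -> Tuple[List[str], bool]:
    """Add `visitors` to the query when CR is requested without it.

    Returns the metrics to actually query and a flag telling whether
    `visitors` was added only as a helper.
    """
    needs_helper = "conversion_rate" in requested and "visitors" not in requested
    # conversion_rate is computed afterwards, so it is not queried directly
    to_query = [m for m in requested if m != "conversion_rate"]
    if needs_helper:
        to_query = ["visitors", *to_query]
    return to_query, needs_helper


def finalize_row(
    row: Dict[str, int], total_visitors: int, requested: List[str]
) -> Dict[str, float]:
    """Compute CR from the queried row and keep only the requested metrics."""
    result: Dict[str, float] = dict(row)
    if "conversion_rate" in requested:
        result["conversion_rate"] = calculate_cr(total_visitors, row["visitors"])
    # Dropping everything not requested also removes the helper `visitors`
    return {metric: result[metric] for metric in requested}


if __name__ == "__main__":
    requested = ["pageviews", "conversion_rate"]
    to_query, added = prepare_metrics(requested)
    print(to_query, added)
    print(finalize_row({"visitors": 25, "pageviews": 80}, 100, requested))
```

Because the helper metric rides along in the same query, no second query is needed for the converted visitor count, and the caller still receives only the metrics it asked for.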

* changelog update

* fix test after resolving merge conflict

* Use explicit string->atom mapping instead of casting

* alias Util module instead of importing it

* use `Enum.empty?` instead of `Enum.any?`

* improve readability

* rename special_metrics to computed_metrics and explain with a comment

* rename visitors_without_event_filters to total_visitors

* keep a single function for removing unwanted metrics

---------

Co-authored-by: Adrian Gruntkowski <adrian.gruntkowski@gmail.com>
2024-02-15 09:18:57 +00:00
ruslandoga
423f72a0ad
safe(r) Oban telemetry (#3743)
* safe telemetry

* use try/catch instead

* add some tests

* cleanup tests

---------

Co-authored-by: hq1 <hq@mtod.org>
2024-02-14 10:12:03 +01:00
Adrian Gruntkowski
f8b4d5066a
Add multiple imports per site (#3724)
* Clean up references to no longer active `google_analytics_imports` Oban queue

* Stub CSV importer

* Add SiteImport schema

* Rename `Plausible.Imported` module file to match module name

* Add `import_id` column to `Imported.*` CH schemas

* Implement Importer behavior and manage imports state using new entities

* Implement importer callbacks and maintain site.imported_data for UA

* Keep imports in sync when forgetting all imports

* Scope imported data queries to completed import IDs

* Mark newly imported data with respective import ID

* Clean up Importer implementation a bit

* Test querying legacy and new imported data

* Send Oban notifications on import worker failure too

* Fix checking for forgettable imports and remove redundant function

* Fix UA integration test

* Change site import source to atom enum and add source label

* Add typespecs and reduce repetition in `Plausible.Imported`

* Improve documentation and typespecs

* Add test for purging particular import

* Switch email notification templates depending on import source

* Document running import synchronously

* Fix UA importer args parsing and ensure it's covered by tests

* Clear `site.stats_start_date` on complete import to force recalculation

* Test Oban notifications (h/t @ruslandoga)

* Purge stats on import failure right away to reduce the chance of leaving debris behind

* Fix typos

Co-authored-by: hq1 <hq@mtod.org>

* Fix another typo

* Refactor fetching earliest import and earliest stats start date

* Use `Date.after?` instead of `Timex.after?`

* Cache import data in site virtual fields and limit queried imports to 5

* Ensure the current `stats_start_date` is always used

* Work around broken typespec in Timex

* Make `SiteController.forget_imported` action idempotent

* Discard irrecoverably failed import tasks

* Use macros for site import statuses

There's also a fix ensuring only complete imports are considered
where relevant - couldn't isolate it as it was in a common hunk

* Use `import_id` as worker job uniqueness criterion

* Do not load imported stats data in plugins API context

---------

Co-authored-by: hq1 <hq@mtod.org>
2024-02-14 09:32:36 +01:00
Karl-Aksel Puulmann
f3509f2a17
Refactor spike detection top sources query (#3770)
* ORDER BY referrer_source for spikes job

This is more consistent with the rest of the queries

* Refactor top_sources -> top_sources_for_spike

* Remove more dead code

* Remove unused arguments

* Remove unused select arguments

* Add a test to top_sources_for_spike
2024-02-13 08:28:32 +02:00
Karl-Aksel Puulmann
d1fe184cb7
Remove ORDER BY min(session.start) (#3771)
This adds a needless performance overhead. Note that all queries already
sort by the _name_ of the breakdown value besides the session start.
2024-02-13 08:28:23 +02:00
hq1
494ffbd622
Shields behind a FF (#3780) 2024-02-12 15:22:06 +01:00
hq1
99fe03701e
IP Block List (#3761)
* Add Ecto.Network dependency

* Migration: Add ip block list table

* If Cachex errors out, mark the cache as not ready

* Add IPRule schema

* Seed IPRules

* Add Shields context module

* Implement IPRuleCache

* Start IPRuleCache

* Drop blocklisted IPs on ingestion

* Cosmetic rename

* Add settings sidebar item

* Consider IPRuleCache readiness on health checks

* Fix typo

* Implement IP blocklist live view

* Update moduledocs

* Extend contextual module tests

* Convert IPRules LiveView into LiveComponent

* Keep live flashes on the tabs view

* Update changelog

* Format

* Credo

* Remove garbage

* Update drop reason typespecs

* Update typespecs for cache keys

* Keep track of who added a rule and when

* Test if adding via LV prefills the updated_by tooltip

* Update ecto_network dependency

* s/updated_by/added_by

* s/drop_blocklist_ip/drop_shield_rule_ip

* Add docs link

* s/Updated/Added
2024-02-12 14:55:20 +01:00
hq1
6a2d7fc0f5
Merge Plugins.API.Router into main one (#3767)
* Merge `Plugins.API.Router` into main one

In order to get grafana metrics reported
See: https://github.com/akoutmos/prom_ex/issues/224

* Format
2024-02-12 10:44:32 +01:00
hq1
f5129f1b0d
Turn Revenue Goals into Custom Events if the plan doesn't support them (#3768)
* Turn Revenue Goals into Custom Events if the plan runs out

* Tag test with full_build
2024-02-12 10:43:54 +01:00
Karl-Aksel Puulmann
1cb7982cd9
Filtering by multiple custom properties (#3719)
* WIP: PropFilterRow

* Get multi-behavior working

* Render multiple prop filters in one

* Modal reads from query string correctly

* Backend support for multiple custom property filters

* Add backend tests for multiple custom property filters

* Disable already selected options in property keys

We can't allow choosing the same property multiple times without changing the request
params, which we decided against

* Allow choosing any property under Behaviors > Custom props even if custom prop filter applied

This was a limitation (I believe) introduced by using ARRAY JOINs to query custom properties

* CHANGELOG.md

* Solve credo warning about too deep nesting

* Update assets/js/dashboard/stats/modals/prop-filter-modal.js

Co-authored-by: RobertJoonas <56999674+RobertJoonas@users.noreply.github.com>

* Refactor internal function for clarity

* Add another step -> Add another

* Solve 500 error

* Separate boxes per property filter

* Retain other filters in props table

* removeFilter behavior for props

* matches_member support for custom props

* filter_suggestions for prop keys should account for prop filter

* find over filter

* refactor appliedFilters

* FILTER_TYPES => FILTER_OPERATIONS

* Make add another link not wrap the whole page

* Unique keys

---------

Co-authored-by: RobertJoonas <56999674+RobertJoonas@users.noreply.github.com>
2024-02-12 09:03:57 +02:00
hq1
6ce4d19931
Fix flaky test: wait till ets is emptied (#3755)
* Fix flaky test: wait till ets is emptied

* fixup
2024-02-07 08:59:00 +01:00
RobertJoonas
51f0e406a0
improve stats api upgrade required error message (#3747) 2024-02-01 17:00:09 +00:00
Adrian Gruntkowski
3738cd9578
Parse referrer filter value when passed separately (#3742)
* Parse referrer filter value when passed separately

* Remove unnecessary test setup
2024-02-01 10:27:11 +01:00
hq1
e28793a45a
Make hello@plausible.io clickable (#3746)
* Make `hello@plausible.io` clickable

* Format

* ...
2024-02-01 09:43:21 +01:00