* Wrap Plausible.Stats.Filters with unit tests
* Parse `member` filter type
* Support for `member` filter in `aggregate_time_on_page`
* Support `not_member` filter type
* Support `matches_member` and `not_matches_member` filters
* Extract util module for React filters
* Implement Combobox from scratch with no libs
* Support multiple filter clauses in combobox
* Don't use browser / os in version label
* Show highlighted option in combobox
* WIP
* Fix location filters outside filter modal
* Align open/close behaviour with react-select
* Styling updates for combobox
* Add support for wildcards in Combobox
* Implement keybindings for combobox
* Allow free choice inputs in combobox
* Rename 'Save filter' -> 'Apply filter'
* Remove TODO comment
* Clean up some rebase mistakes
* Rename `allowWildcard` -> `freeChoice`
* Dark mode fixes
* Remove hint from filter modal
* Escape pipe character in filter modal
* Do not allow selecting duplicate options in combobox
* Escape brackets in `page_regex/1`
* Fix disabled style in dark mode
* Add regex fallback for safari
* Show 'no matches found' when visibleOptions is empty
* Disable enter key when no visible options
* Do not submit empty form fields
* Remove unnecessary setOpen(true)
* Remove ClickhouseSetup module
This has been an implicit point of contact for many
tests. From now on the goal is for each test to maintain
its own, isolated setup so that there are no accidental clashes
and no reliance on implicit assumptions.
* Implement v2 schema check
An environment variable V2_MIGRATION_DONE acts like
a feature flag, switching Plausible from using the old events/sessions
schemas to the v2 schemas introduced by the NumericIDs migration.
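A minimal sketch of what such a runtime check could look like, assuming the flag is read straight from the environment (the real code may read it from app config instead):

```elixir
defmodule Plausible do
  # Sketch only: treat V2_MIGRATION_DONE=1 as "the v2 schema is live".
  def v2?() do
    System.get_env("V2_MIGRATION_DONE") == "1"
  end
end
```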
* Run both test suites sequentially
While the code for the v1 and v2 schemas must still be kept,
we will from now on run tests against both code paths.
The secondary test run will set the V2_MIGRATION_DONE=1 variable,
thus making all `Plausible.v2?()` checks return `true`.
* Remove unused function
This is a remnant from the short period when
we would check for existing events before allowing
creating a new site.
* Update test setups/factories with v2 migration check
* Make GateKeeper return site id along with :allow
* Make Billing module check for v2 schema
* Make ingestion aware of v2 schema
* Disable site transfers for when v2 is live
In a separate changeset we will implement a simplified
site transfer for when the v2 migration is complete.
The new transfer will only rename the site domain in Postgres
and keep track of the original site prior to the transfer,
so that we keep an ingestion grace period until the customers
redeploy their tracking script.
* Make Stats base queries aware of v2 schema switch
* Update breakdown with v2 conditionals
* Update pageview local start with v2 check
* Update current visitors with v2 check
* Update stats controller with v2 checks
* Update external controller with v2 checks
* Update remaining tests with proper fixtures
* Rewrite redundant assignment
* Remove unused alias
* Mute credo, this is not the right time
* Add test_helper prompt
* Fetch priv dir so it works with a release
* Fetch distinct partitions only
* Don't limit inspect output for partitions
* Ensure SQL is printed to IO
* Remove redundant domain fixture
* Fix Timex.total_offset blowing up during clock changes
* Format large numbers with underscore in tests
---------
Co-authored-by: Adam <hq@mtod.org>
* Wrap Plausible.Stats.Filters with unit tests
* Parse `member` filter type
* Support escaped | in member filter
* Support for `member` filter in `aggregate_time_on_page`
* Add support for `member` filter type on goals
* Disable Credo warning
* Support `not_member` filter type
* Disable credo for `query_sessions`
* Support `matches_member` and `not_matches_member` filters
* Disable Credo for `Filters.filter_value/2`
* Support `matches_member` and `not_matches_member` for goal filter
* Support for contains_member and friends
* Updates for new chto driver
* make top_stats_test.exs:203 pass (#2779)
---------
Co-authored-by: ruslandoga <rusl@n-do.ga>
This pull request adds support for multiple comparison modes, changes the comparison checkbox to a combobox, and implements the year over year comparison mode. The feature is still behind a feature flag.
Co-authored-by: Uku Taht <uku.taht@gmail.com>
* globally rename 'pages_per_visit' to 'views_per_visit'
* change the order of top stats
* rename 'Visits' to 'Total visits' in the UI
* add views_per_visit to UI
* put the new metric under a feature flag
* add new metric to CSV export under feature flag
* mix format
* use only one feature flag
* refactor metric validation
* link to the correct docs section
* add the new metric to aggregate API
* explicitly remove __internal_visits
The overall metric selection is well defined by
`Plausible.Stats.Timeseries.empty_row/2`. The only metric that needs
to be removed from the timeseries response is `__internal_visits`.
This commit also moves the `remove_internal_visits_metric` function to a
new Util module to be used by both breakdown and timeseries.
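A rough sketch of what such a shared helper could look like (the row shape and atom key are assumptions):

```elixir
defmodule Plausible.Stats.Util do
  @moduledoc "Shared helpers for breakdown and timeseries responses (sketch)."

  # Drop the internal helper metric from each result row before it is returned.
  def remove_internal_visits_metric(results) when is_list(results) do
    Enum.map(results, &Map.delete(&1, :__internal_visits))
  end
end
```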
* add the new metric to timeseries API
* mix format
* add moduledoc to keep credo happy
* convert pages_per_visit to string straight away
* do rounding in db query
* query # of visits with sum(sign) instead
* stop converting pages_per_visit to string
* Use user-agent instead of screen_width to get device type
Co-authored-by: eriknakata <erik.nakata5@gmail.com>
* Fix credo
* Log on unhandled UAInspector device type
* Make 'browser' the default tab in devices report
* Remove device tooltip
* Remove screen_width from ingestion completely
* Remove browserstack harness, run playwright directly
* Select meta key based on OS platform
* Run CI tests in parallel
* Improve device match readability
* Add changelog
---------
Co-authored-by: eriknakata <erik.nakata5@gmail.com>
* Clickhouse migration: add ingest_counters table
* Configure ingest counters per MIX_ENV
* Emit telemetry for ingest events with rich metadata
* Allow building Request.t() with fake now() - for testing purposes
* Use clickhousex branch where session_id is assigned to each connection
* Add helper function for getting site id via cache
* Add Ecto schema for `ingest_counters` table
* Implement metrics buffer
* Implement buffering handler for `Plausible.Ingestion.Event` telemetry
* Implement periodic metrics aggregation
* Update counters docs
* Add toStartOfMinute() to ordering key
* Reset the sync connection state in `after` clause
* Flush counters on app termination
* Use separate Repo with async settings enabled at config level
* Switch to clickhouse_settings repo root config key
* Add AsyncInsertRepo module
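The bullets above describe counting ingest events via telemetry and flushing aggregates periodically. A hedged sketch of attaching such a handler (the event name, metadata keys and storage are assumptions, not the actual implementation):

```elixir
# Sketch only: buffer per-domain counts in ETS; a periodic aggregation pass
# later flushes them into the ingest_counters ClickHouse table.
:ets.new(:ingest_counters_buffer, [:named_table, :public])

:telemetry.attach(
  "ingest-counters-buffer",
  [:plausible, :ingest, :event, :buffered],
  fn _event, _measurements, %{domain: domain}, _config ->
    :ets.update_counter(:ingest_counters_buffer, {domain, :buffered}, 1, {{domain, :buffered}, 0})
  end,
  nil
)
```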
* Add visits metric and make it graphable
* include visits metric in csv export (visitors.csv)
* put visits under a feature flag (CSV export)
* feature flag for displaying visits on the dashboard
* fix formatting
* add visits metric to top stats (fix)
* fix imported_test to expect visits metric included
* fix formatting
* Use HeadlessUI for search select box
* Remove downshift from package.json
* More consistent API for Combobox component
* Combine toFilterType and valueWithoutPrefix into a single function
* Rename MyCombobox -> PlausibleCombobox
* Update webpack-cli
* Disable cache for build
* Revert "Disable cache for build"
This reverts commit aa130541f8.
* Disable cache for build
* Update webpack dependencies
* Remove glob from webpack config
* Webpack is required by package.json
* Require autoprefixer in postcss config
* Revert build changes
* Fix styling for dark mode
* Reject unknown imported cities from queries
This commit fixes a bug where the city report returned `N/A` entries.
The functions that build imported data queries were using SQL
`COALESCE`, assuming city data is `NULL` when unknown, when in fact its
unknown value is `0`.
This commit addresses the problem by combining SQL `NULLIF` with the
previous `COALESCE` call. With this change both `0` and `NULL` are
treated as unknown (see the sketch below).
Since 1cb07efe6d cities can be `NULL`, but
previously we saved `0` as unknown.
Closes #1960
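A minimal illustration of the `NULLIF` approach described above (table and column names are assumptions, not the actual query):

```elixir
import Ecto.Query

# Sketch only: NULLIF collapses the 0 "unknown" marker into NULL, so the
# surrounding COALESCE over native vs. imported data treats 0 and NULL the
# same way instead of surfacing them as "N/A" city rows.
imported_cities =
  from i in "imported_locations",
    select: %{city: fragment("nullif(?, 0)", i.city)}
```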
* Add entry to CHANGELOG
* Ignore cyclomatic complexity Credo check
This commit adds city data to imported records from Google Analytics. The
current implementation sets city to 0 because GA does not use the GeoNames
database.
Google Analytics Reporting API uses [Geographical IDs](https://developers.google.com/analytics/devguides/collection/protocol/v1/geoid)
to identify cities and countries. Plausible uses
[GeoNames](https://geonames.org/) and I couldn't find databases correlating the
two.
Fortunately, GA also returns the city name, and this commit uses the city name
and the country ISO code to find the GeoNames ID. To avoid making expensive ETS
searches, I created another ETS table in the Location library that uses
`{country, city}` as a key.
Related PR: https://github.com/plausible/location/pull/3
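A small sketch of the kind of lookup table described above (names and the GeoNames ID are illustrative, not the Location library's actual API):

```elixir
# Illustrative only: an ETS table keyed by {country_iso_code, city_name} that
# maps GA city names onto GeoNames IDs without scanning the whole cities table.
table = :ets.new(:city_by_name, [:set, :public, read_concurrency: true])
:ets.insert(table, {{"EE", "Tallinn"}, 588_409})

geoname_id =
  case :ets.lookup(table, {"EE", "Tallinn"}) do
    [{_key, id}] -> id
    [] -> 0
  end
```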
* Configure ingest repo access/pool size
If I'm not mistaken, 3 is a sane default; the only
inserts we're doing are:
- session buffer dump
- events buffer dump
- GA import dump
And all are serializable within their scopes?
* Add IngestRepo
* Start IngestRepo
* Use IngestRepo for inserts
* Annotate ClickhouseRepo as read_only
So no insert* functions are expanded
* Update moduledoc
* rename alias
* Fix default env var value so it can be cast
* Use IngestRepo for migrations
* Set default ingest pool size from 3 to 5
in case connections are restarting or otherwise briefly unavailable
* Ensure all Repo prometheus metrics are collected
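A rough sketch of the repo split described above (the adapter module name is an assumption; the pool size comes from the commit above):

```elixir
# Sketch only: a write-capable repo for buffer/import dumps, and a read-only
# repo for queries so no insert*/update*/delete* functions are generated.
defmodule Plausible.IngestRepo do
  use Ecto.Repo, otp_app: :plausible, adapter: Ecto.Adapters.ClickHouse
end

defmodule Plausible.ClickhouseRepo do
  use Ecto.Repo, otp_app: :plausible, adapter: Ecto.Adapters.ClickHouse, read_only: true
end

# config/runtime.exs (sketch):
# config :plausible, Plausible.IngestRepo, pool_size: 5
```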
* Make Device section components aware of (not set)
So that no extra sub-filters are possible when the unset
top item is selected.
* Support '(not set)' in breakdown/filters
* Update expectations for export tests
* Add extra tests for returning/filtering by '(not set)'
* Add changelog entry
* Remove ListReport conditional render
* Prevent redundant sub-filters
* Fix filter text rendering
---------
Co-authored-by: Uku Taht <uku.taht@gmail.com>
* Move Endpoint errors setup to common config
* Implement naive Sentry link resolver
* Implement error report e-mail
* Delete static sentry script
* Implement user feedback form on server errors
* Re-arrange pipe
* Use Sentry.Config.dsn() where applicable
* Fix typo
* Use Map.replace/3
Some changes to be more consistent with the emails we send. Also, "valid" subscription rather than "active" subscription fits better for the different cases where this screen is shown.
* fix subquery for sessions in base_event_query/2
As the 'sessions' table uses the CollapsingMergeTree engine, we have
to select session_ids distinctly. Otherwise we will get multiple rows
(with sign -1 and 1) as long as the background merge hasn't happened.
* update changelog
* use GROUP BY instead of SELECT DISTINCT
* remove comma
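A hedged sketch of the idea behind the GROUP BY change (names are assumptions, not the actual base query): grouping on session_id collapses the +1/-1 sign rows that CollapsingMergeTree keeps around until a background merge runs.

```elixir
import Ecto.Query

# Sketch only: keep one row per session by grouping instead of SELECT DISTINCT,
# and drop sessions whose sign rows cancel each other out.
session_ids =
  from s in "sessions",
    group_by: s.session_id,
    having: sum(s.sign) > 0,
    select: s.session_id
```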
* Fingerprint DBConnection.ConnectionError in Sentry
* Check events before creating a site
* Enable sites limit screen
* Remove debugging remnant
* Fix buggy assertions
This wasn't doing what was expected:
iex(1)> Repo.exists?(Plausible.Site, domain: "foo")
[debug] QUERY OK source="sites" db=0.6ms idle=1906.2ms
SELECT TRUE FROM "sites" AS s0 LIMIT 1 []
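The keyword list there is treated as repo options rather than a filter, so the call matches any existing site. A filtered existence check needs a query, e.g.:

```elixir
import Ecto.Query

# Check whether a site with this exact domain exists.
Plausible.Repo.exists?(from s in Plausible.Site, where: s.domain == "foo")
```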
* Encapsulate check to satisfy credo
* Use less technically involved error message
* Bring back e-mail to the limit error message
This PR replaces geolix with locus to simplify the self-hosted setup. locus can auto-update MaxMind DBs, which is recommended for self-hosters if they want city-level geolocation. locus is also a bit faster.
This PR also uses a test MMDB file from https://github.com/maxmind/MaxMind-DB for e2e geolocation tests without stubs.
* Add Checkly Terraform config
* add deployment workflow
* use pagerduty instead of email for notifications
* use terraform cloud backend
* update variable declaration
* rename checkly check group
* update syntax
* test trigger
* Revert "test trigger"
This reverts commit 333e82beac.
* run a single job at a time
Co-authored-by: Cenk Kücük <c@cenk.me>
* Fix breakdown API pagination when using event metrics
This commit fixes a bug where the subsequent breakdown API pages had
the same items as the first page. The fix sorts the underlying
ClickHouse query by timestamp, keeping the same order between requests,
as we use OFFSET/LIMIT pagination.
* Fix repeated results assertion
* Add different ORDER BY to each breakdown property
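A minimal sketch of why a deterministic ORDER BY matters for OFFSET/LIMIT paging (names are assumptions, not the real breakdown query):

```elixir
import Ecto.Query

# Sketch only: without a stable sort key ClickHouse may return rows in a
# different order per request, so page 2 can repeat items from page 1.
page = 2
per_page = 100

breakdown =
  from e in "events",
    order_by: [desc: e.timestamp, asc: e.pathname],
    limit: ^per_page,
    offset: ^((page - 1) * per_page)
```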
* Ignore XX and T1 countries
* Add fallback if country_code=nil
* Lookup city overrides directly in CityOverrides module
* Changelog
* Add empty moduledoc
* Remove redundant comment
* Cascade delete sent_renewal_notifications table when user is deleted
This commit fixes a bug when deleting a user would trigger a constraint
error.
* Update CHANGELOG.md
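A hypothetical migration sketch for the cascade (the constraint name is the Postgres default and an assumption):

```elixir
defmodule Plausible.Repo.Migrations.CascadeDeleteSentRenewalNotifications do
  use Ecto.Migration

  def change do
    # Replace the existing FK so deleting a user also deletes their rows here,
    # instead of raising a foreign key constraint error.
    drop constraint(:sent_renewal_notifications, "sent_renewal_notifications_user_id_fkey")

    alter table(:sent_renewal_notifications) do
      modify :user_id, references(:users, on_delete: :delete_all)
    end
  end
end
```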
* Set pg pool size for MIX_ENV=test
* Include slow tests in CI run
* Exclude slow tests by default
* Mark tests slow/async where applicable
* Restructure captcha mocks
* Revert async where env is relied upon
* Add --max-failures=1 to CI run
* Set warnings as errors
* Disable async where various mocks are used
* Revert "Disable async where various mocks are used"
This reverts commit 2446b72a29.
* Disable async for test using vcr
* Return empty list when breaking down by event:page without events
This commit fixes a bug with pagination where breaking down by event:page
would always return results despite pagination.
Closes #2255
* Update CHANGELOG.md
* Remove show_noref behaviour
Removes the show_noref query param, which was used by the React frontend
to control whether to show Direct / None traffic. The show_noref
behaviour was previously untested.
Closes #2523
* Add changelog entry
* Fix tests
* Removed files I did not mean to check in :)
### Changes
This PR:
- pushes PromEx to the bottom of the supervision stack to avoid Endpoint
instrumentation failures
- ensures the site cache is ready by exposing it through the health
check endpoint
- fixes event timestamps being calculated at compile time, with
regression unit and integration tests
### Tests
- [x] Automated tests have been added
- [ ] This PR does not require tests
### Changelog
- [ ] Entry has been added to changelog
- [x] This PR does not make a user-facing change
### Documentation
- [ ] [Docs](https://github.com/plausible/docs) have been updated
- [x] This change does not need a documentation update
### Dark mode
- [ ] The UI has been tested both in dark and light mode
- [x] This PR does not change the UI
* Update Sites.Cache
So it's now capable of refreshing the most recently updated sites.
Refreshing a single site is no longer wanted.
* Introduce Warmer.RecentlyUpdated
This is a Sites Cache warmer that runs only for
the most recently updated sites, every 30s.
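A hypothetical shape for such a warmer (the interval is from the description above; the refresh function it calls is an assumption):

```elixir
defmodule Plausible.Site.Cache.Warmer.RecentlyUpdated do
  @moduledoc "Sketch: refresh recently updated sites on a fixed interval."
  use GenServer

  @interval :timer.seconds(30)

  def start_link(opts), do: GenServer.start_link(__MODULE__, opts, name: __MODULE__)

  def init(opts) do
    schedule_refresh()
    {:ok, opts}
  end

  def handle_info(:refresh, state) do
    # Assumed refresh entry point; the actual cache API may differ.
    Plausible.Site.Cache.refresh_updated_recently()
    schedule_refresh()
    {:noreply, state}
  end

  defp schedule_refresh, do: Process.send_after(self(), :refresh, @interval)
end
```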
* Validate Request creation early
* Rename RateLimiter to GateKeeper and introduce detailed policies
* Update events API tests - a provisioned site is now required
* Update events ingestion tests
* Make limits visible in CRM Sites index
* Hard-deprecate DOMAIN_BLACKLIST
* Remove unnecessary clause
* Fix typo
* Explicitly delegate Warmer.All
* GateKeeper.allwoance => GateKeeper.check
* Instrument Sites.Cache measurements
* Update send_pageview task to output response headers
* Instrument ingestion pipeline
* Credo
* Make event telemetry test a sync case
* Simplify Request.uri/hostname handling
* Use embedded schema, apply action and rely on get_field
* Parse event URL in Plausible.Ingestion.Request
* Parse event domain in Plausible.Ingestion.Request
* Rework ingestion pipeline processing (#2462)
* Rework ingestion pipeline processing
So that a Request can have multiple domains and, based on that,
each event is processed uniformly.
The build_and_buffer/1 function now returns an
accumulator with all the dropped/buffered events
for further inspection.
* Reduce function complexity
* Don't chain struct fields to check for an empty host
* Separate referrer and utm tags
* Fix up `with` clause, credo was right cc @vinibrsl
Co-authored-by: Adam Rutkowski <hq@mtod.org>
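A hedged sketch of how the accumulator described above might be consumed, given an already-parsed request (the exact return shape and field names are assumptions):

```elixir
defmodule Ingest.Sketch do
  # Sketch only: build events from a request and inspect what got dropped.
  def build_and_inspect(request) do
    {:ok, %{buffered: buffered, dropped: dropped}} =
      Plausible.Ingestion.Event.build_and_buffer(request)

    Enum.each(dropped, fn event ->
      # each dropped event is assumed to carry a reason that can be logged
      IO.inspect(event.drop_reason, label: "dropped event")
    end)

    {length(buffered), length(dropped)}
  end
end
```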
* Seed database with pageviews
This commit adds basic support for database seeding useful for testing,
especially dashboard changes, like intervals.
It creates two years of pageviews with random timestamps. There is a lot
of room for improvement, such as adding sources, entry pages,
geolocation, devices, custom events, but this already helps us with
testing.
* Update CONTRIBUTING.md file
* Allow refreshing a single site cache + clear cache on prefill
* Reorganize Site.Cache tests with describe blocks
* Tidy up Cache tests
* Make sure the cache is cleared on (p)re-fill
* Allow process name customization in Cache.Warmer
* s/Cache.prefill/Cache.refresh
* Unify Cache refresh instrumentation
* Apply credo suggestion: change `with` to `case`
* Update typespecs to pass dialyzer
* Implement sites by domain caching interface + warmer
* Add test
* Implement hit rate interface
* Add moduledocs
* Fix up typespec
* s/warmer/warmer_fn
* Extract measure_duration/2
* Fix up typespec
* Log errors and return nil on cache internal errors
* Fix up non-existing cache test
* Retrieve specific db columns when pre-filling the cache
* Reduce the subset of fields retrieved from the DB
See 63f3c6233d (r89871536)
People are likely to enter (copy/paste) goals from external sources,
which can lead to whitespace characters being appended by accident.
That renders the goal unusable and hard to distinguish visually.
Normally we would fix up existing goals with a data migration,
but this should be good enough to check whether the problem
of goals never appearing resurfaces.
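A minimal changeset sketch of the trimming described above (the schema and field names are assumptions):

```elixir
defmodule Goal.Sketch do
  use Ecto.Schema
  import Ecto.Changeset

  schema "goals" do
    field :page_path, :string
    field :event_name, :string
  end

  # Sketch only: strip accidental leading/trailing whitespace from pasted
  # goal values before validation, so they match incoming events.
  def changeset(goal, attrs) do
    goal
    |> cast(attrs, [:page_path, :event_name])
    |> update_change(:page_path, &String.trim/1)
    |> update_change(:event_name, &String.trim/1)
  end
end
```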
This commit fixes a bug where Google Analytics import tokens were not
being refreshed properly because the function was not returning the
expected tuple. Thanks to @aerosol we can nicely test this now.
This commit fixes a bug where users clearing one import and trying to
re-import would get invalid import date ranges. This happened because
the stats_start_date field was not reset to nil after clearing
imported stats, as this field defines the end date of the import.
* Make TestUtils module available in all tests
* Add macros patching the application env in tests
Unfortunately a lot of existing functionality relies on
certain application env setup. This isn't ideal because
the app config is a shared state that prevents us from
running the tests in parallel.
Those macros encapsulate setting up new env for test purposes
and make sure the changes are reverted when the test finishes.
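A sketch of what such a macro could look like (the real TestUtils helper may differ):

```elixir
defmodule TestUtils.Sketch do
  @moduledoc "Sketch of an env-patching test macro."

  defmacro patch_env(key, value) do
    quote do
      original = Application.fetch_env(:plausible, unquote(key))
      Application.put_env(:plausible, unquote(key), unquote(value))

      ExUnit.Callbacks.on_exit(fn ->
        # restore (or clear) the original value once the test finishes
        case original do
          {:ok, original_value} -> Application.put_env(:plausible, unquote(key), original_value)
          :error -> Application.delete_env(:plausible, unquote(key))
        end
      end)
    end
  end
end
```

In a test this would be used as `patch_env(:some_key, "value")` inside a setup block or test body, so the override never leaks into other tests.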
* Allow passing request opts to HTTPClient.post/4
We need this to swap custom request building in
Google Analytics import.
* Unify errors when listing sites
* React: propagate backend error messages if available
* React: catch API errors in Search Terms component
* Propagate google API errors on referrer drilldown
* Handle verified properties errors in SC settings
* Add missing tests for SC settings controller
* Unify errors for fetching search analytics queries (list stats)
* Unify errors refreshing Google Auth Token
* Test fetch_stats/3 errors and replace Double with Mox
* Fixup markup
* s/class/className
* Simplify Search Terms display in case of errors
* Fix warnings
This commit removes a flaky test assertion from the GA suite. The
initial purpose of this test was to ensure we are not inserting into
Clickhouse more than once per second during an import.
Although it is nice to have an assertion like this, the test is flaky
and fails sometimes. To get the total inserts during a time range, it
queries Clickhouse internal tables for insert commands, and this number
is not deterministic.
In a real-world scenario with Clickhouse, the app runs in multiple nodes
not aware of their neighbors' CH calls. In that case, Grafana provides
us visibility over CH, and these tests do not suffice.
The Google Analytics report request may take some time, especially with big
imports with thousands of pages. To mitigate the issue this commit increases
the timeout to 60s and lowers the page size to 7,500 records per request.
* remove tracker files from git index
* generate tracker files on npm test
* generate tracker files for elixir tests/dev/CI
* update tracker/package-lock.json
* exclude npm run deploy from mix test + some docs
* Move clear stats functions to Plausible.Purge
* Delete both native and imported stats when deleting a site
This commit moves the delete site function to the Plausible.Purge
module, and fixes a bug where deleted sites could leave dangling
imported stats.
* Clear sites.stats_start_date after clearing stats
This commit fixes a bug where resetting stats left an invalid state of
the stats_start_date field, used for GA imports, for example.
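A tiny sketch of the reset described above, assuming `stats_start_date` lives on the sites schema and `site` is a loaded `%Plausible.Site{}`:

```elixir
# Sketch only: after purging stats, clear the cached start date so it is
# recomputed from actual data on the next import or query.
site
|> Ecto.Changeset.change(stats_start_date: nil)
|> Plausible.Repo.update!()
```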