* Dynamically set :domain for the live socket
Ref: https://github.com/phoenixframework/phoenix_live_view/pull/2715
* Make runtime config raise before the user ends up in a reconnect loop
* hall of shame: remove console.info remnant
* Check origin on live websocket
* Get rid of single pipe
* Add revenue fields to ClickHouse events
This commit adds 4 fields to the ClickHouse events_v2 table:
* `revenue_source_amount` and `revenue_source_currency` store revenue in
the original currency sent during ingestion
* `revenue_reporting_amount` and `revenue_reporting_currency` store
revenue in a common reporting currency used for calculations; this
currency is defined by the user when setting up the goal
The amount fields are typed `Nullable(Decimal64(3))`, which covers all
fiat currencies and allows us to store very large amounts. Although
ClickHouse advises against `Nullable` columns, this is a good use case
for them: otherwise extra work would be needed to distinguish missing
values from real zeroes.
I ran a benchmark with the data pattern we expect in production, where
missing values outnumber real decimals: 100 million records with 90% of
the decimal values missing. The difference in storage size between the
two tables is just 0.4 MB.
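For illustration, a minimal sketch of what the corresponding ClickHouse migration could look like. The migration module name, the ALTER statement details, and the `FixedString(3)` currency type are assumptions; only the column names and the `Nullable(Decimal64(3))` amount type come from this commit.
```elixir
defmodule Plausible.ClickhouseRepo.Migrations.AddRevenueToEventsV2 do
  use Ecto.Migration

  def up do
    execute """
    ALTER TABLE events_v2
      ADD COLUMN revenue_source_amount Nullable(Decimal64(3)),
      ADD COLUMN revenue_source_currency FixedString(3),
      ADD COLUMN revenue_reporting_amount Nullable(Decimal64(3)),
      ADD COLUMN revenue_reporting_currency FixedString(3)
    """
  end

  def down do
    execute """
    ALTER TABLE events_v2
      DROP COLUMN revenue_source_amount,
      DROP COLUMN revenue_source_currency,
      DROP COLUMN revenue_reporting_amount,
      DROP COLUMN revenue_reporting_currency
    """
  end
end
```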
* Add revenue parameter to Events API
This commit adds support for sending revenue data during ingestion using
the `revenue` parameter, aliased to `$`.
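For illustration, a hypothetical ingestion payload using the new parameter; apart from the `revenue` key itself (and its `$` alias), the field layout and values are assumptions:
```elixir
# Hypothetical /api/event request body; only the `revenue` parameter name
# comes from this commit, the rest is illustrative.
payload = %{
  "name" => "Purchase",
  "url" => "https://example.com/checkout/success",
  "domain" => "example.com",
  "revenue" => %{"currency" => "USD", "amount" => "13.29"}
}
```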
* Add revenue parameter to mix send_pageview
* Add average and total revenue to breakdown queries
* Add revenue goal option to goal creation
This commit adds a currency field to the goals form. Goals that have a
currency set are now revenue goals, and are cached alongside sites for
later use during ingestion.
Co-authored-by: Robert Joonas <robertjoonas16@gmail.com>
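As a rough sketch of the rule above (the struct and field names are assumptions): a goal is treated as a revenue goal exactly when its currency is set.
```elixir
defmodule RevenueGoalExample do
  # Hypothetical helper mirroring the rule described in this commit:
  # a goal with a currency set is a revenue goal.
  def revenue_goal?(%{currency: nil}), do: false
  def revenue_goal?(%{currency: currency}) when is_binary(currency), do: true
end
```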
* Enable feature flag in tests
---------
Co-authored-by: Robert Joonas <robertjoonas16@gmail.com>
* default to v2
* allow N defaults in data migration prompt and custom messages
* join domains lookup
* remove duplicate test runs from ci (both are v2)
* Migration (PR: https://github.com/plausible/analytics/pull/2802)
* Implement Site.Domain interface allowing change and expiry
* Fixup seeds so they work with V2_MIGRATION_DONE=1
* Update Sites.Cache so it's capable of multi-keyed lookups
* Implement worker handling domain change expiration
* Implement domain change UI
* Implement transition period for public APIs
* Exclude v2 tests in primary test run
* Update lib/plausible_web/controllers/site_controller.ex
Co-authored-by: Vini Brasil <vini@hey.com>
* Update lib/plausible_web/controllers/site_controller.ex
Co-authored-by: Vini Brasil <vini@hey.com>
* Update moduledoc
* Update changelog
* Remove remnant from previous implementation attempt
* !fixup
* !fixup
* Implement domain change via Sites API
cc @ukutaht
* Update CHANGELOG
* Credo
* !fixup commit missing tests
* Allow continuous domain change within the same site
---------
Co-authored-by: Vini Brasil <vini@hey.com>
* Remove ClickhouseSetup module
This has been an implicit point of contact for many
tests. From now on the goal is for each test to maintain
its own isolated setup, so that there are no accidental clashes
and no reliance on implicit assumptions.
* Implement v2 schema check
The V2_MIGRATION_DONE environment variable acts like
a feature flag, switching Plausible from the old events/sessions
schemas to the v2 schemas introduced by the NumericIDs migration.
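A minimal sketch of what such a check could look like; the real `Plausible.v2?()` implementation may differ (for example, it could read compiled application config instead of the env var directly):
```elixir
defmodule V2CheckExample do
  # Hedged sketch: V2_MIGRATION_DONE=1 means the NumericIDs migration is done,
  # so queries should target the v2 events/sessions schemas.
  def v2? do
    System.get_env("V2_MIGRATION_DONE") == "1"
  end
end
```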
* Run both test suites sequentially
While the code for the v1 and v2 schemas must still be kept,
we will from now on run tests against both code paths.
The secondary test run sets the V2_MIGRATION_DONE=1 variable,
making all `Plausible.v2?()` checks return `true`.
* Remove unused function
This is a remnant from the short period when
we would check for existing events before allowing
a new site to be created.
* Update test setups/factories with v2 migration check
* Make GateKeeper return site id along with :allow
* Make Billing module check for v2 schema
* Make ingestion aware of v2 schema
* Disable site transfers for when v2 is live
In a separate changeset we will implement a simplified
site transfer for when the v2 migration is complete.
The new transfer will only rename the site domain in Postgres
and keep track of the original site prior to the transfer,
so we keep an ingestion grace period until customers
redeploy their tracking script.
* Make Stats base queries aware of v2 schema switch
* Update breakdown with v2 conditionals
* Update pageview local start with v2 check
* Update current visitors with v2 check
* Update stats controller with v2 checks
* Update external controller with v2 checks
* Update remaining tests with proper fixtures
* Rewrite redundant assignment
* Remove unused alias
* Mute credo, this is not the right time
* Add test_helper prompt
* Fetch priv dir so it works with a release
* Fetch distinct partitions only
* Don't limit inspect output for partitions
* Ensure SQL is printed to IO
* Remove redundant domain fixture
* Clickhouse migration: add ingest_counters table
* Configure ingest counters per MIX_ENV
* Emit telemetry for ingest events with rich metadata
* Allow building Request.t() with fake now() - for testing purposes
* Use clickhousex branch where session_id is assigned to each connection
* Add helper function for getting site id via cache
* Add Ecto schema for `ingest_counters` table
* Implement metrics buffer
* Implement buffering handler for `Plausible.Ingestion.Event` telemetry
* Implement periodic metrics aggregation
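A rough sketch of the buffering flow these commits describe; the telemetry event name, metadata keys, and flush interval are assumptions, and the real implementation writes aggregated rows to the `ingest_counters` table rather than logging.
```elixir
defmodule CountersSketch do
  use GenServer

  # Hypothetical telemetry event emitted by the ingestion pipeline.
  @event [:plausible, :ingest, :event, :processed]
  @flush_interval :timer.seconds(10)

  def start_link(opts), do: GenServer.start_link(__MODULE__, opts, name: __MODULE__)

  def init(_opts) do
    :telemetry.attach("counters-sketch", @event, &__MODULE__.handle_event/4, nil)
    schedule_flush()
    {:ok, %{}}
  end

  # Buffering handler: bump an in-memory counter keyed by domain and outcome.
  def handle_event(@event, _measurements, %{domain: domain, outcome: outcome}, _config) do
    GenServer.cast(__MODULE__, {:bump, {domain, outcome}})
  end

  def handle_cast({:bump, key}, counters) do
    {:noreply, Map.update(counters, key, 1, &(&1 + 1))}
  end

  # Periodic aggregation: the real code would persist these counts; here we log.
  def handle_info(:flush, counters) do
    Enum.each(counters, fn {{domain, outcome}, count} ->
      IO.puts("#{domain} #{outcome}: #{count}")
    end)

    schedule_flush()
    {:noreply, %{}}
  end

  defp schedule_flush, do: Process.send_after(self(), :flush, @flush_interval)
end
```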
* Update counters docs
* Add toStartOfMinute() to ordering key
* Reset the sync connection state in `after` clause
* Flush counters on app termination
* Use separate Repo with async settings enabled at config level
* Switch to clickhouse_settings repo root config key
* Add AsyncInsertRepo module
* Configure ingest repo access/pool size
If I'm not mistaken, 3 is a sane default; the only
inserts we're doing are:
- session buffer dump
- events buffer dump
- GA import dump
And all of them are serializable within their scopes?
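A hedged sketch of the pool-size configuration; the env var name is hypothetical, and the repo module name anticipates the commits below:
```elixir
# config/runtime.exs (illustrative only)
import Config

config :plausible, Plausible.IngestRepo,
  pool_size: String.to_integer(System.get_env("INGEST_POOL_SIZE", "3"))
```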
* Add IngestRepo
* Start IngestRepo
* Use IngestRepo for inserts
* Annotate ClickhouseRepo as read_only
So no insert* functions are expanded
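A minimal sketch of that annotation; the adapter module is an assumption, but `:read_only` is a standard `Ecto.Repo` option that skips defining the write functions:
```elixir
defmodule ReadOnlyRepoExample do
  # With read_only: true, Ecto does not generate insert*/update*/delete*
  # functions on this repo, matching the intent of this commit.
  use Ecto.Repo,
    otp_app: :plausible,
    adapter: ClickhouseEcto,
    read_only: true
end
```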
* Update moduledoc
* rename alias
* Fix default env var value so it can be cast
* Use IngestRepo for migrations
* Bump default ingest pool size from 3 to 5
in case connections are restarting, or similar
* Ensure all Repo prometheus metrics are collected
This PR replaces geolix with locus to simplify the self-hosted setup. locus can auto-update MaxMind databases, which is recommended for self-hosters who want city-level geolocation. locus is also a bit faster.
This PR also uses a test MMDB file from https://github.com/maxmind/MaxMind-DB for e2e geolocation tests without stubs.
### Changes
This PR adds a fallback to empty build metadata when BUILD_METADATA
contains invalid JSON.
Example `warning` log for `BUILD_METADATA={...}`:
```
20:57:57.872 [warning] failed to parse $BUILD_METADATA, reason: ** (Jason.DecodeError) unexpected byte at position 1: 0x2E (".")
```
Fixes https://github.com/plausible/analytics/issues/2491
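A hedged sketch of the fallback; the function placement and names are hypothetical, only the behaviour (fall back to empty metadata and log a warning) comes from this PR:
```elixir
defmodule BuildMetadataExample do
  require Logger

  # Decode $BUILD_METADATA, falling back to empty metadata on invalid JSON.
  def fetch do
    raw = System.get_env("BUILD_METADATA", "{}")

    case Jason.decode(raw) do
      {:ok, metadata} ->
        metadata

      {:error, error} ->
        Logger.warning(
          "failed to parse $BUILD_METADATA, reason: #{Exception.format(:error, error)}"
        )

        %{}
    end
  end
end
```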
### Tests
- [x] This PR does not require tests
### Changelog
- [ ] Entry has been added to changelog
### Documentation
- [x] This change does not need a documentation update
### Dark mode
- [x] This PR does not change the UI
* Update Sites.Cache
So it's now capable of refreshing the most recent sites.
Refreshing a single site is no longer wanted.
* Introduce Warmer.RecentlyUpdated
This is a Sites.Cache warmer that runs every 30s,
only for the most recently updated sites.
* Validate Request creation early
* Rename RateLimiter to GateKeeper and introduce detailed policies
* Update events API tests - a provisioned site is now required
* Update events ingestion tests
* Make limits visible in CRM Sites index
* Hard-deprecate DOMAIN_BLACKLIST
* Remove unnecessary clause
* Fix typo
* Explicitly delegate Warmer.All
* GateKeeper.allowance => GateKeeper.check
* Instrument Sites.Cache measurements
* Update send_pageview task to output response headers
* Instrument ingestion pipeline
* Credo
* Make event telemetry test a sync case
* Simplify Request.uri/hostname handling
* Use embedded schema, apply action and rely on get_field
This pull request improves the current OpenTelemetry implementation. Currently only 1% of spans are sent, due to the high volume of ingestion requests to /api/event. I enabled the 1% sampling for /api/event only, recording 100% of the other traces.
Getting this error when running the release:
```
ERROR! the application :fun_with_flags has a different value set for key :persistence during runtime compared to compile time. Since this application environment entry was marked as compile time, this difference can lead to different behaviour than expected:

  * Compile time value was not set
  * Runtime value was set to: [adapter: FunWithFlags.Store.Persistent.Ecto, repo: Plausible.Repo]
```
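One possible fix, sketched under the assumption that the persistence settings should be declared at compile time so they match the runtime values reported above; the file placement is an assumption, while the adapter/repo values are taken from the error message:
```elixir
# config/config.exs (hedged sketch)
import Config

config :fun_with_flags, :persistence,
  adapter: FunWithFlags.Store.Persistent.Ecto,
  repo: Plausible.Repo
```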
* List all Google Analytics views during import
This commit fixes a bug where different Google Analytics views with the
same name and URI were not shown. This was caused by GA views being
stored as a map, which naturally doesn't support duplicate keys.
This change updates the GA views list to display view IDs, making it
clearer what is being imported. The dropdown is now grouped by
website URL.
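A rough sketch of the idea (the data shapes are assumptions): keeping views in a list grouped by website URL and keyed by view ID preserves entries that a name-keyed map would silently overwrite.
```elixir
# Two views with the same name and URL no longer collapse into one entry.
views = [
  %{"id" => "12345678", "name" => "All Web Site Data", "websiteUrl" => "https://example.com"},
  %{"id" => "87654321", "name" => "All Web Site Data", "websiteUrl" => "https://example.com"}
]

grouped =
  Enum.group_by(
    views,
    & &1["websiteUrl"],
    fn view -> {"#{view["name"]} (#{view["id"]})", view["id"]} end
  )

# => %{"https://example.com" => [
#      {"All Web Site Data (12345678)", "12345678"},
#      {"All Web Site Data (87654321)", "87654321"}
#    ]}
```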
* Put Google Analytics API URLs in app env
* Add controller test to GA view list
* Create separate module for GA HTTP requests
* Fetch GA data entirely instead of monthly
* Add buffering to GA imports
* Change positional args to maps when serializing from GA
* Create Google Analytics VCR tests
* Upgrade geolix
* Remove geolix pool config
* Save unnecessary Task.async_stream roundtrip
Normally the Geolix API accepts a `:where` keyword option that designates
the database to look up. If no option is supplied, it spawns
a parallel map over all available databases. In our case there is only
one DB anyway, so there is no need for the extra round trip.
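For illustration (assuming the single database is registered under the `:geolocation` id mentioned in the next commit):
```elixir
# Without :where, Geolix fans out over every configured database
# via Task.async_stream; pointing it at the one database avoids that.
Geolix.lookup({1, 1, 1, 1})                       # parallel map over all DBs
Geolix.lookup({1, 1, 1, 1}, where: :geolocation)  # direct lookup in one DB
```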
* Follow up on direct :geolocation lookups
* Introduce Finch for Sentry integration
* Make sure the DummyAgent can be started
* No need to sanitize the dsn, finch takes care of that
* Simplify the dummy child spec
* Annotate redirects clause
* Make use of new `get_int_from_path_or_env`
* Actually use finch in Sentry config
* Configure `excluded_domains` correctly for Sentry
The way Sentry is currently configured, when we get an HTTP error it
will be logged twice: once from Sentry.PlugCapture and once from
Sentry.LoggerBackend. The logger backend module does the right thing
by default, but for some reason we've been overriding the config
parameter that stops the double-counting of errors. This commit
returns to the default configuration, which is better.
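A hedged sketch of the relevant configuration; the exact options shown are assumptions, the point (per this commit) is to stop overriding `excluded_domains` and rely on the backend's default:
```elixir
# config/config.exs (hedged sketch): Sentry.LoggerBackend's default
# excluded_domains already skips errors captured by Sentry.PlugCapture,
# so we deliberately do not override it here.
import Config

config :logger, Sentry.LoggerBackend, capture_log_messages: true
```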
* Default to 15s timeout
* Attempt to send twice at most
* Warn in sentry client
* Use warn level in sentry client
Co-authored-by: Adam Rutkowski <hq@mtod.org>
* Include geolocation DB download in the development workflow
* Make sure `tls_certificate_check` is started ASAP
This prevents `:application_either_not_started_or_not_ready` errors
on application startup.
* Mark Makefile targets as PHONY
By default Make assumes targets are files;
in this case none of them are.
* Adds tri-state disable_registration config
* Formatting
* Changes variable back to atom
* Changelog
* Uses atoms correctly :/
* Swaps to a more fitting value
* Formatting
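A rough illustration of what a tri-state parse of that setting could look like; the accepted values and the env var handling are assumptions, not confirmed by these commits:
```elixir
# Hypothetical: DISABLE_REGISTRATION parsed into a tri-state boolean/atom.
disable_registration =
  case System.get_env("DISABLE_REGISTRATION", "false") do
    "true" -> true
    "invite_only" -> :invite_only
    _ -> false
  end
```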