* Clickhouse migration: add ingest_counters table
* Configure ingest counters per MIX_ENV
* Emit telemetry for ingest events with rich metadata
* Allow building Request.t() with fake now() - for testing purposes
* Use clickhousex branch where session_id is assigned to each connection
* Add helper function for getting site id via cache
* Add Ecto schema for `ingest_counters` table
* Implement metrics buffer
* Implement buffering handler for `Plausible.Ingestion.Event` telemetry
* Implement periodic metrics aggregation
* Update counters docs
* Add toStartOfMinute() to ordering key
* Reset the sync connection state in `after` clause
* Flush counters on app termination
* Use separate Repo with async settings enabled at config level
* Switch to clickhouse_settings repo root config key
* Add AsyncInsertRepo module
* Configure ingest repo access/pool size
If I'm not mistaken 3 is a sane default, the only
inserts we're doing are:
- session buffer dump
- events buffer dump
- GA import dump
And all are serializable within their scopes?
* Add IngestRepo
* Start IngestRepo
* Use IngestRepo for inserts
* Annotate ClickhouseRepo as read_only
So no insert* functions are expanded
* Update moduledoc
* rename alias
* Fix default env var value so it can be casted
* Use IngestRepo for migrations
* Set default ingest pool size from 3 to 5
in case conns are restarting or else...
* Ensure all Repo prometheus metrics are collected
* Move Endpoint errors setup to common config
* Implement naive Sentry link resolver
* Implement error report e-mail
* Delete static sentry script
* Implement user feedback form on server errors
* Re-arrange pipe
* Use Sentry.Config.dsn() where applicable
* Fix typo
* Use Map.replace/3
This PR replaces geolix with locus to simplify self-hosted setup. locus can auto-update maxmind dbs which are recommended for self-hosters if they want city-level geolocation. locus is also a bit faster.
This PR also uses a test mmdb file from https://github.com/maxmind/MaxMind-DB for e2e geolocation tests without stubs.
* Ignore XX and T1 countries
* Add fallback if country_code=nil
* Lookup city overrides directly in CityOverrides module
* Changelog
* Add empty moduledoc
* Remove redundant comment
* Set pg pool size for MIX_ENV=test
* Include slow tests in CI run
* Exclude slow tests by default
* Mark tests slow/async where applicable
* Restructure captcha mocks
* Revert async where env is relied upon
* Add --max-failures=1 to CI run
* Set warnings as errors
* Disable async where various mocks are used
* Revert "Disable async where various mocks are used"
This reverts commit 2446b72a29.
* Disable async for test using vcr
### Changes
This PR adds a fallback to empty build metadata when BUILD_METADATA
contains invalid JSON.
Example `warning` log for `BUILD_METADATA={...}`:
```
20:57:57.872 [warning] failed to parse $BUILD_METADATA, reason: ** (Jason.DecodeError) unexpected byte at position 1: 0x2E (".")
```
Fixes https://github.com/plausible/analytics/issues/2491
### Tests
- [x] This PR does not require tests
### Changelog
- [ ] Entry has been added to changelog
### Documentation
- [x] This change does not need a documentation update
### Dark mode
- [x] This PR does not change the UI
* Update Sites.Cache
So it's now capable of refreshing most recent sites.
Refreshing a single site is no longer wanted.
* Introduce Warmer.RecentlyUpdated
This is Sites Cache warmer that runs only for
most recently updated sites every 30s.
* Validate Request creation early
* Rename RateLimiter to GateKeeper and introduce detailed policies
* Update events API tests - a provisioned site is now required
* Update events ingestion tests
* Make limits visible in CRM Sites index
* Hard-deprecate DOMAIN_BLACKLIST
* Remove unnecessary clause
* Fix typo
* Explicitly delegate Warmer.All
* GateKeeper.allwoance => GateKeeper.check
* Instrument Sites.Cache measurments
* Update send_pageview task to output response headers
* Instrument ingestion pipeline
* Credo
* Make event telemetry test a sync case
* Simplify Request.uri/hostname handling
* Use embedded schema, apply action and rely on get_field
* Implement sites by domain caching interface + warmer
* Add test
* Implement hit rate interface
* Add moduledocs
* Fix up typespec
* s/warmer/warmer_fn
* Extract measure_duration/2
* Fix up typespec
* Log errors and return nil on cache internal errors
* Fix up non-existing cache test
* Retrieve specific db columns when pre-filling the cache
* Reduce the subset of fields retrieved from the DB
See 63f3c6233d (r89871536)
* Implement FF-driven DB lookup for sites during ingestion
We like to see the impact of doing a simple postgres lookup on each
ingestion event. The percentage-based feature flag `:ingestion_pg_lookup`
must be set in order for lookups to be executed.
* Fix resolving Cachex stats metrics
* Enable PromEx on dev env
This pull request improves the current OpenTelemetry implementation. Currently only 1% of the spans are sent, due to the high volume of ingestion requests to /api/event. I enabled the 1% sampling to /api/event only, recording 100% of the other traces.
* remove tracker files from git index
* generate tracker files on npm test
* generate tracker files for elixir tests/dev/CI
* update tracker/package-lock.json
* exclude npm run deploy from mix test + some docs
Getting this error when running the release:
ERROR! the application :fun_with_flags has a different value set for key :persistence during runtime compared to compile time. Since this application environment entry was marked as compile time, this difference can lead to different behaviour than expected:
* Compile time value was not set
* Runtime value was set to: [adapter: FunWithFlags.Store.Persistent.Ecto, repo: Plausible.Repo]
* List all Google Analytics views during import
This commit fixes a bug where different Google Analytics views with the
same name and URI were not shown. This was caused because GA views were
stored as a map, that naturally doesn't support duplicate keys.
This change updates the GA views list to display view IDs, making it
clearer to know what is being imported. The dropdown is now grouped by
website URL.
* Put Google Analytics API URLs in app env
* Add controller test to GA view list
* Remove invalid Jason.decode argument
Co-authored-by: Robert Joonas <robertjoonas16@gmail.com>
* Add custom message to Google invalid grant error
Co-authored-by: Robert Joonas <robertjoonas16@gmail.com>
* Test invalid_grant while refreshing Google token
Co-authored-by: Robert Joonas <robertjoonas16@gmail.com>
Co-authored-by: Robert Joonas <robertjoonas16@gmail.com>
* Create separate module for GA HTTP requests
* Fetch GA data entirely instead of monthly
* Add buffering to GA imports
* Change positional args to maps when serializing from GA
* Create Google Analytics VCR tests
* Fix geolocation subdivision pattern matching
This commit fixes a bug where regions were not being saved. This was
caused because Geolix response was returning an additional
`:geolocation` map key. It also adds a test case for this.
Closes#2033
* Add geolocation database to .gitignore
* Upgrade geolix
* Remove geolix pool config
* Save unnecessary Task.async_stream roundtrip
Normally the Geolix API accepts `:where` keyword option that designates
the database to look up. In case no parameter is supplied, it'll spawn
a parallel map over all databases available. In this case we have only
one DB anyway, so there is no need for the extra instrumentation.
* Follow up on direct :geolocation lookups
* Introduce Finch for Sentry integration
* Make sure the DummyAgent can be started
* No need to sanitize the dsn, finch takes care of that
* Simplify the dummy child spec
* Annotate redirects clause
* Make use of new `get_int_from_path_or_env`
* Actually use finch in Sentry config
* Configure `excluded_domains` correctly for Sentry
The way sentry is configured currently, when we get an HTTP error it
will be logged twice - once from Sentry.PlugCapture and once from
Sentry.LoggerBackend. The logger backend module does the right thing
by default but for some reason we've been overriding the config
parameter that by default stops double-counting errors. This commit
returns to the default configuration which is better.
* Default to 15s timeout
* Attempt to send twice at most
* Warn in sentry client
* Use warn level in sentry client
Co-authored-by: Adam Rutkowski <hq@mtod.org>
* Include gelocation DB download in the development workflow
* Make sure `tls_certificate_check` is started ASAP
This prevents `:application_either_not_started_or_not_ready` errors
on application startup.
* Mark Makefile targets as PHONY
By default Make assumes the targets are files,
in this case none of them are.
* Adds tri-state disable_registration config
* Formatting
* Changes variable back to atom
* Changelog
* Uses atoms correctly :/
* Swaps to a more fitting value
* Formatting