* Add revenue fields to ClickHouse events
This commit adds 4 fields to the ClickHouse events_v2 table:
* `revenue_source_amount` and `revenue_source_currency` store revenue in
the original currency sent during ingestion
* `revenue_reporting_amount` and `revenue_reporting_currency` store
revenue in a common currency to perform calculations, and this
currency is defined by the user when setting up the goal
The amount fields use the type `Nullable(Decimal64(3))`. That covers all
fiat currencies and allows us to store huge amounts. Even though
ClickHouse recommends against using `Nullable`, this is a good use case,
because otherwise additional work would be needed to differentiate
missing values from real zeroes.
I ran a benchmark with the data pattern we expect in production, where
we have more missing values than real decimals. I created 100 million
records where 90% of decimals are missing. The difference between the
tables in storage is just 0.4 MB.
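To illustrate why missing values need to stay distinct from real zeroes, here is a minimal Python sketch. It is not the ClickHouse schema or any code from this commit; `average_revenue` is a hypothetical helper, and `None` stands in for a `NULL` amount.

```python
from decimal import Decimal
from typing import Optional, Sequence


def average_revenue(amounts: Sequence[Optional[Decimal]]) -> Optional[Decimal]:
    """Average only the events that carry revenue; None marks a missing value."""
    present = [amount for amount in amounts if amount is not None]
    if not present:
        return None
    return sum(present, Decimal("0")) / len(present)


# Two events carry revenue (one is a genuine zero), two carry none.
amounts = [Decimal("10.000"), None, Decimal("0.000"), None]
nullable_average = average_revenue(amounts)

# Storing missing values as zero would pull the average down.
zero_filled = [amount if amount is not None else Decimal("0.000") for amount in amounts]
zero_filled_average = average_revenue(zero_filled)
```

With these inputs the nullable average is 5 and the zero-filled average is 2.5, which is the kind of skew the `Nullable` column avoids.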
* Add revenue parameter to Events API
This commit adds support for sending revenue data in ingestion using the
`revenue` parameter - aliased to `$`.
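As a rough illustration of the alias, the following Python sketch reads revenue from either key. The payload shape (`amount` and `currency` keys), the precedence of `revenue` over `$`, and the function name are assumptions made for this example, not the actual ingestion code.

```python
from typing import Any, Mapping, Optional


def extract_revenue(params: Mapping[str, Any]) -> Optional[Any]:
    """Return the revenue payload from `revenue`, falling back to the `$` alias."""
    if "revenue" in params:
        return params["revenue"]
    return params.get("$")


# Hypothetical event bodies; the amount/currency keys are assumed.
with_name = {"name": "Purchase", "revenue": {"amount": "12.50", "currency": "USD"}}
with_alias = {"name": "Purchase", "$": {"amount": "12.50", "currency": "USD"}}
without_revenue = {"name": "pageview"}
```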
* Add revenue parameter to mix send_pageview
* Add average and total revenue to breakdown queries
* Add revenue goal option to goal creation
This commit adds a currency field to the goals form. Goals that have a
currency set are now revenue goals, and are cached with sites to later
be used during ingestion.
Co-authored-by: Robert Joonas <robertjoonas16@gmail.com>
* Enable feature flag in tests
---------
Co-authored-by: Robert Joonas <robertjoonas16@gmail.com>
* upgrade phoenix
Co-authored-by: Vini Brasil <vini@hey.com>
* fix a test (flash message)
The flash message in focus.html.eex was not covered by any test. This
commit also fixes that.
* change function name
* remove unnecessary formatter and format
* update CI cache
* fix dialyzer error
---------
Co-authored-by: Vini Brasil <vini@hey.com>
* Clickhouse migration: add ingest_counters table
* Configure ingest counters per MIX_ENV
* Emit telemetry for ingest events with rich metadata
* Allow building Request.t() with fake now() - for testing purposes
* Use clickhousex branch where session_id is assigned to each connection
* Add helper function for getting site id via cache
* Add Ecto schema for `ingest_counters` table
* Implement metrics buffer
* Implement buffering handler for `Plausible.Ingestion.Event` telemetry
* Implement periodic metrics aggregation
* Update counters docs
* Add toStartOfMinute() to ordering key
* Reset the sync connection state in `after` clause
* Flush counters on app termination
* Use separate Repo with async settings enabled at config level
* Switch to clickhouse_settings repo root config key
* Add AsyncInsertRepo module
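The buffering and per-minute aggregation described above can be sketched roughly as follows. This is a simplified Python illustration, not the Elixir implementation: the class, the row keys, and the metric names are hypothetical, and truncating to the minute only mimics what `toStartOfMinute()` does in ClickHouse.

```python
from collections import Counter
from datetime import datetime, timezone


class MetricsBuffer:
    """Accumulates counters per (minute bucket, site, metric) until flushed."""

    def __init__(self):
        self._counts = Counter()

    def record(self, site_id, metric, at):
        # Truncate to the start of the minute, similar to toStartOfMinute().
        bucket = at.replace(second=0, microsecond=0)
        self._counts[(bucket, site_id, metric)] += 1

    def flush(self):
        # Return aggregated rows and reset the buffer, e.g. on a timer
        # or when the application terminates.
        rows = [
            {"bucket": bucket, "site_id": site_id, "metric": metric, "value": value}
            for (bucket, site_id, metric), value in sorted(self._counts.items())
        ]
        self._counts.clear()
        return rows


buffer = MetricsBuffer()
buffer.record(1, "buffered", datetime(2023, 1, 1, 12, 0, 5, tzinfo=timezone.utc))
buffer.record(1, "buffered", datetime(2023, 1, 1, 12, 0, 59, tzinfo=timezone.utc))
buffer.record(1, "buffered", datetime(2023, 1, 1, 12, 1, 0, tzinfo=timezone.utc))
rows = buffer.flush()
```

Here the first two events land in the same minute bucket, so the flush yields two rows instead of three.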
This PR replaces geolix with locus to simplify self-hosted setup. locus can auto-update MaxMind databases, which are recommended for self-hosters who want city-level geolocation. locus is also a bit faster.
This PR also uses a test mmdb file from https://github.com/maxmind/MaxMind-DB for e2e geolocation tests without stubs.
This commit updates mix.exs to resolve bamboo_postmark to our fork. The
fork encodes names with quotes when building e-mails, adding support for
special names with commas and quotes. Related to
plausible/bamboo_postmark#1.
Closes #1885
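To show why quoting matters for display names, here is a small Python sketch using the standard library's `email.utils`. It is only an analogy for the behaviour described above, not the code in the bamboo_postmark fork; the name and address are made up.

```python
from email.utils import formataddr, parseaddr

# A display name containing a comma must be quoted, otherwise the comma
# would be read as a separator between two recipients.
formatted = formataddr(("Doe, John", "john@example.com"))

# Parsing the quoted form recovers the original name and address.
name, address = parseaddr(formatted)
```

`formatted` is `"Doe, John" <john@example.com>`, and parsing it returns the name and address unchanged.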
* Seed database with pageviews
This commit adds basic support for database seeding useful for testing,
especially dashboard changes, like intervals.
It creates two years of pageviews with random timestamps. There is a lot
of room for improvement, such as adding sources, entry pages,
geolocation, devices, custom events, but this already helps us with
testing.
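The idea of spreading pageviews over two years can be sketched like this in Python. The function name, the 730-day window, and the uniform distribution are assumptions for illustration; the actual seed script may generate timestamps differently.

```python
import random
from datetime import datetime, timedelta, timezone


def random_pageview_timestamps(count, now, days=730, seed=None):
    """Return `count` sorted timestamps drawn uniformly from the last `days` days."""
    rng = random.Random(seed)
    window_seconds = timedelta(days=days).total_seconds()
    return sorted(
        now - timedelta(seconds=rng.uniform(0, window_seconds))
        for _ in range(count)
    )


now = datetime(2023, 1, 1, tzinfo=timezone.utc)
timestamps = random_pageview_timestamps(1000, now, seed=42)
```

Passing a fixed `seed` keeps the generated data reproducible between runs, which can help when comparing dashboard changes.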
* Update CONTRIBUTING.md file
* Implement sites by domain caching interface + warmer
* Add test
* Implement hit rate interface
* Add moduledocs
* Fix up typespec
* s/warmer/warmer_fn
* Extract measure_duration/2
* Fix up typespec
* Log errors and return nil on cache internal errors
* Fix up non-existing cache test
* Retrieve specific db columns when pre-filling the cache
* Reduce the subset of fields retrieved from the DB
See 63f3c6233d (r89871536)
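As a rough picture of a domain cache with a warmer function and a hit-rate measure, consider the Python sketch below. It is not the Elixir module from these commits: the class, its methods, and the percentage-based hit rate are hypothetical, and only the ideas (pre-filling via a warmer function, returning nothing on a miss, tracking hits) come from the commit titles.

```python
class SiteCache:
    """In-memory lookup of sites by domain, pre-filled by a warmer function."""

    def __init__(self, warmer_fn):
        self._warmer_fn = warmer_fn
        self._items = {}
        self._hits = 0
        self._misses = 0

    def warm(self):
        # The warmer returns (domain, site) pairs, e.g. from a DB query
        # that selects only the columns needed during ingestion.
        self._items = dict(self._warmer_fn())

    def get(self, domain):
        site = self._items.get(domain)
        if site is None:
            self._misses += 1
        else:
            self._hits += 1
        return site

    def hit_rate(self):
        total = self._hits + self._misses
        return 0.0 if total == 0 else 100.0 * self._hits / total


cache = SiteCache(lambda: [("example.com", {"id": 1}), ("example.org", {"id": 2})])
cache.warm()
found = cache.get("example.com")
missing = cache.get("unknown.test")
```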
This pull request improves the current OpenTelemetry implementation. Currently only 1% of all spans are sent, due to the high volume of ingestion requests to /api/event. This change applies the 1% sampling to /api/event only and records 100% of all other traces.
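The sampling rule can be summarised with a short Python sketch. This is not the OpenTelemetry sampler used in the pull request; `should_sample` and its parameters are hypothetical, and the random source is injectable only to make the behaviour easy to check.

```python
import random


def should_sample(path, rng=random.random, ingestion_ratio=0.01):
    """Sample roughly 1% of /api/event requests and every other request."""
    if path == "/api/event":
        return rng() < ingestion_ratio
    return True


# With a fixed random value the decision is deterministic.
kept = should_sample("/api/event", rng=lambda: 0.005)
dropped = should_sample("/api/event", rng=lambda: 0.5)
dashboard = should_sample("/api/stats/example.com", rng=lambda: 0.5)
```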
* Render 404 when shared link cannot be found
* Add documentation for StatsController and shared link rendering
* Refactor shared_link/2 for more clarity
* Add changelog entry
* Use mermaid graph for sequence diagram
* Use more accurate return value in sequence diagram
* Refactor Ecto query to be more idiomatic
* Remove order dependence in test
* Restore backwards compatibility for older shared links
* Add changelog entry
* Update Timex version from 3.7.7 to 3.7.8
* Generate timezone list from Tzdata
This commit fixes a bug where timezone changes weren't updating the
timezone list displayed when editing or creating a site.
Timezones were being pulled from a static list. This commit changes it
to generate the list from Tzdata, which uses a timezone database with
updated information on time changes. Additionally it adds more timezones
with aliases and links to the list.
Closes #1340
* Use timezone name from browser to recommend timezone
This commit matches the timezone name instead of offset to recommend a
timezone when creating a new site. The JavaScript Intl.DateTimeFormat
API is widely supported according to the link. In any case, if the
timezone fails to match by name, it falls back to the offset strategy.
https://caniuse.com/mdn-javascript_builtins_intl_datetimeformat_resolvedoptions_computed_timezone
Closes #904
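The name-first, offset-second matching can be sketched as below. This Python snippet is illustrative only: the function name, the `(name, offset_minutes)` tuples, and the sample offsets are assumptions, and the real form works with the browser's `Intl.DateTimeFormat` result and the Tzdata-backed list.

```python
def recommend_timezone(browser_name, browser_offset_minutes, timezones):
    """Pick a timezone by name, falling back to the first one with the same offset."""
    for name, _offset in timezones:
        if name == browser_name:
            return name
    for name, offset in timezones:
        if offset == browser_offset_minutes:
            return name
    return None


# Hypothetical list of (name, UTC offset in minutes).
timezones = [("Europe/Tallinn", 120), ("Europe/Helsinki", 120), ("America/Sao_Paulo", -180)]

by_name = recommend_timezone("Europe/Helsinki", 120, timezones)
by_offset = recommend_timezone("Unknown/Zone", -180, timezones)
no_match = recommend_timezone("Unknown/Zone", 345, timezones)
```

Matching by name avoids recommending `Europe/Tallinn` to a user in `Europe/Helsinki` just because both share an offset.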
* Create separate module for GA HTTP requests
* Fetch GA data entirely instead of monthly
* Add buffering to GA imports
* Change positional args to maps when serializing from GA
* Create Google Analytics VCR tests
* Upgrade geolix
* Remove geolix pool config
* Save unnecessary Task.async_stream roundtrip
Normally the Geolix API accepts a `:where` keyword option that designates
the database to look up. In case no parameter is supplied, it'll spawn
a parallel map over all databases available. In this case we have only
one DB anyway, so there is no need for the extra instrumentation.
* Follow up on direct :geolocation lookups
* Introduce Finch for Sentry integration
* Make sure the DummyAgent can be started
* No need to sanitize the dsn, finch takes care of that
* Simplify the dummy child spec
* Annotate redirects clause
* Make use of new `get_int_from_path_or_env`
* Actually use finch in Sentry config
* Configure `excluded_domains` correctly for Sentry
The way sentry is configured currently, when we get an HTTP error it
will be logged twice - once from Sentry.PlugCapture and once from
Sentry.LoggerBackend. The logger backend module does the right thing
by default but for some reason we've been overriding the config
parameter that by default stops double-counting errors. This commit
returns to the default configuration which is better.
* Default to 15s timeout
* Attempt to send twice at most
* Warn in sentry client
* Use warn level in sentry client
Co-authored-by: Adam Rutkowski <hq@mtod.org>
* Include geolocation DB download in the development workflow
* Make sure `tls_certificate_check` is started ASAP
This prevents `:application_either_not_started_or_not_ready` errors
on application startup.
* Mark Makefile targets as PHONY
By default Make assumes the targets are files,
in this case none of them are.