analytics

mirror of https://github.com/plausible/analytics.git synced 2024-11-22 18:52:38 +03:00

Author	SHA1	Message	Date
hq1	1d01328287	Allow domain change (#2803 ) * Migration (PR: https://github.com/plausible/analytics/pull/2802) * Implement Site.Domain interface allowing change and expiry * Fixup seeds so they work with V2_MIGRATION_DONE=1 * Update Sites.Cache so it's capable of multi-keyed lookups * Implement worker handling domain change expiration * Implement domain change UI * Implement transition period for public APIs * Exclude v2 tests in primary test run * Update lib/plausible_web/controllers/site_controller.ex Co-authored-by: Vini Brasil <vini@hey.com> * Update lib/plausible_web/controllers/site_controller.ex Co-authored-by: Vini Brasil <vini@hey.com> * Update moduledoc * Update changelog * Remove remnant from previous implementation attempt * !fixup * !fixup * Implement domain change via Sites API cc @ukutaht * Update CHANGELOG * Credo * !fixup commit missing tests * Allow continuous domain change within the same site --------- Co-authored-by: Vini Brasil <vini@hey.com>	2023-04-04 10:55:12 +02:00
hq1	94a86a17eb	Migration: enable domain changes (upcoming feature) (#2802 ) * Migration: enable domain changes (upcoming feature) * Pass domain check within the same site	2023-04-04 10:12:08 +02:00
hq1	d2f2c69387	Conditionally support switching between v1 and v2 clickhouse schemas (#2780 ) * Remove ClickhouseSetup module This has been an implicit point of contact to many tests. From now on the goal is for each test to maintain its own, isolated setup so that no accidental clashes and implicit assumptions are relied upon. * Implement v2 schema check An environment variable V2_MIGRATION_DONE acts like a feature flag, switching plausible from using old events/sessions schemas to v2 schemas introduced by NumericIDs migration. * Run both test suites sequentially While the code for v1 and v2 schemas must be kept still, we will from now on run tests against both code paths. Secondary test run will set V2_MIGRATION_DONE=1 variable, thus making all `Plausible.v2?()` checks return `true`. * Remove unused function This is a remnant from the short period when we would check for existing events before allowing creating a new site. * Update test setups/factories with v2 migration check * Make GateKeeper return site id along with :allow * Make Billing module check for v2 schema * Make ingestion aware of v2 schema * Disable site transfers for when v2 is live In a separate changeset we will implement simplified site transfer for when v2 migration is complete. The new transfer will only rename the site domain in postgres and keep track of the original site prior to the transfer so we keep an ingestion grace period until the customers redeploy their scripting. 
* Make Stats base queries aware of v2 schema switch * Update breakdown with v2 conditionals * Update pageview local start with v2 check * Update current visitors with v2 check * Update stats controller with v2 checks * Update external controller with v2 checks * Update remaining tests with proper fixtures * Rewrite redundant assignment * Remove unused alias * Mute credo, this is not the right time * Add test_helper prompt * Fetch priv dir so it works with a release * Fetch distinct partitions only * Don't limit inspect output for partitions * Ensure SQL is printed to IO * Remove redundant domain fixture	2023-03-27 13:52:42 +02:00
Adam	6637751a5e	Implement Numeric IDs migration (#2762 ) * Implement Numeric IDs migration * Fix typo * Mute credo for now * Improve configurability and add stop_t * Adjust to Ch/Chto only * Fix opts key for dictionary password * Add regular ecto migration with numeric ids v2 schemas (#2768) * Add regular ecto migration * Fix typo * Update priv/ingest_repo/migrations/20230320094327_create_v2_schemas.exs Co-authored-by: Vini Brasil <vini@hey.com> * Implement v2 events/sessions schema modules (#2777) * Implement v2 events/sessions schema modules * Clean up session schemas --------- Co-authored-by: Vini Brasil <vini@hey.com> * Update moduledocs --------- Co-authored-by: Vini Brasil <vini@hey.com>	2023-03-23 09:47:41 +01:00
Adam	baa04be191	Remove obsolete migration config (#2769 ) The compilation warning advises to use `Application.compile_env/3` however there is no `:url` environment setting under `:plausible` key anyway, so we might as well get rid of it completely.	2023-03-22 11:32:24 +02:00
Adam	6d79ca5093	Switch to new clickhouse adapter (ch/chto) (#2733 ) * another clickhouse adapter * don't restore stats_removal.ex * fix events main-graph error (#2746) * update ch, chto * update chto again (#2759) * Stop treating page filter as an entry_page filter (#2752) * remove dead code * stop treating page filter as entry page filter in breakdown queries * stop treating page filter as entry page filter in aggregate queries * stop treating page filter as entry page filter in timeseries queries * mix format * update changelog * break code down to smaller functions to keep credo happy * remove unused functions * make CSV export return only conversions with goal filter (#2760) * make CSV export return only conversions with goal filter * update changelog * update elixir version in mix.exs (#2742) * revert admin.ex changes (#2776) --------- Co-authored-by: ruslandoga <67764432+ruslandoga@users.noreply.github.com> Co-authored-by: ruslandoga <rusl@n-do.ga> Co-authored-by: RobertJoonas <56999674+RobertJoonas@users.noreply.github.com>	2023-03-21 09:55:59 +01:00
Adam	8f86036e57	Keep track of native stats start timestamp when retrieving data (#2715 ) * Stats boundary/PoC? * Delete stats removal * Drop events check on site creation * Update seeds script * Use native_stats_start_at * Don't rely on native stats pointer in imported stats queries * Reset site * Export reset/1 * Remove unnecessary inserted_at settings * Update seeds * Remove unnecessary inserted_at setting	2023-03-01 13:11:31 +01:00
Adam	05e7f93da2	Add a migration for native_stats_start_at (#2716 )	2023-03-01 12:01:27 +01:00
Adam Rutkowski	043e3ed572	Clickhouse migration: add ingest_counters table (#2692 ) * Clickhouse migration: add ingest_counters table * Add toStartOfMinute() to ordering key * Explicitly include column to be summarized	2023-02-23 09:34:44 +01:00
Vini Brasil	ce03b5ebd7	Move CH migration from clickhouse_repo/ to ingest_repo/ (#2683 ) This commit moves a migration that was created in `clickhouse_repo/` to its correct folder, `ingest_repo/`. The migration was created in `1cb07efe6d` prior to the read/write separation.	2023-02-16 09:00:26 +01:00
Vini Brasil	1cb07efe6d	Save city name when importing from GA (#2608 ) This commit adds city data to imported records from Google Analytics. The current implementation sets city to 0 because GA does not use the GeoNames database. Google Analytics Reporting API uses [Geographical IDs](https://developers.google.com/analytics/devguides/collection/protocol/v1/geoid) to identify cities and countries. Plausible uses [GeoNames](https://geonames.org/) and I couldn't find databases correlating the two. Fortunately, GA also returns the city name and this commit uses the city name and the country ISO code to find the Geoname ID. To avoid making expensive ETS searches, I created another ETS table in the Location library that uses {country, city} as a key. Related PR: https://github.com/plausible/location/pull/3	2023-02-14 09:32:18 -03:00
Adam Rutkowski	8f85b110aa	Split Clickhouse pools into Read-Only and Read/Write (dedicated to writes) (#2661 ) * Configure ingest repo access/pool size If I'm not mistaken 3 is a sane default, the only inserts we're doing are: - session buffer dump - events buffer dump - GA import dump And all are serializable within their scopes? * Add IngestRepo * Start IngestRepo * Use IngestRepo for inserts * Annotate ClickhouseRepo as read_only So no insert* functions are expanded * Update moduledoc * rename alias * Fix default env var value so it can be casted * Use IngestRepo for migrations * Set default ingest pool size from 3 to 5 in case conns are restarting or else... * Ensure all Repo prometheus metrics are collected	2023-02-12 17:50:57 +01:00
Vini Brasil	a730763838	Add city name to imported_locations table (#2633 ) See also #2608	2023-02-02 14:25:07 -03:00
Vini Brasil	1b9e6d9ae5	Add city geolocation data to seeds (#2626 )	2023-01-31 16:15:01 +01:00
Vini Brasil	b6d30019ef	Cascade delete sent_renewal_notifications table when user is deleted (#2549 ) * Cascade delete sent_renewal_notifications table when user is deleted This commit fixes a bug when deleting a user would trigger a constraint error. * Update CHANGELOG.md	2023-01-02 11:46:18 -03:00
Vini Brasil	b9367941f0	Add more data to pageview seeds (#2471 )	2022-11-25 12:53:33 +02:00
Adam Rutkowski	8b5ae9baaa	Migration: index Sites.updated_at (#2467 )	2022-11-23 14:26:50 +02:00
Vini Brasil	3bedf9281c	Seed database with pageviews (#2449 ) * Seed database with pageviews This commit adds basic support for database seeding useful for testing, especially dashboard changes, like intervals. It creates two years of pageviews with random timestamps. There is lot of room for improvement, such as adding sources, entry pages, geolocation, devices, custom events, but this already helps us with testing. * Update CONTRIBUTING.md file	2022-11-17 21:46:42 -03:00
Adam Rutkowski	aa248dde2e	Add sites metadata rate limit migration (#2428 ) * Generate ecto migration for rate limiting sites * Add ingest rate limit meta-data migration * Make scale limit seconds not nullable	2022-11-09 11:40:05 +01:00
RobertJoonas	c0da024b23	Remove static tracker files (#2116 ) * remove tracker files from git index * generate tracker files on npm test * generate tracker files for elixir tests/dev/CI * update tracker/package-lock.json * exclude npm run deploy from mix test + some docs	2022-10-11 12:19:28 +02:00
RobertJoonas	f75d5106f0	Rework outbound links and file downloads (#2208 ) * moved custom event code to the bottom + fix indentation * add handlebars helper fn + extract getLinkEl fn * extract isOutboundLink function * extract shouldFollowLink function * remove middle and click variables * use only one click handler for outbounds and downloads * extract sendLinkClickEvent function * add error handling when script compilation fails * use callback instead of fixed timeout * do not prevent default if externally prevented + test * add more tests * generate tracker files in priv/tracker/js * update changelog * requested changes after review * regenerate tracker files * use return instead of else if * move middleMouseButton outside the function	2022-09-29 14:12:35 +03:00
Uku Taht	d104abb53d	Add fallback for favicon (#2279 ) * Add fallback for favicon * Add Favicon tests * Changelog * Move placeholder icon to priv folder	2022-09-28 08:55:46 -03:00
RobertJoonas	39ce850f18	exclude pages by hash when using `script.exclusions.hash.js` (#2172 ) * account for hash part of the URL when excluding pages and using hash extension * changelog update	2022-09-07 16:41:08 +03:00
Uku Taht	99fd101135	Add basic test harness for browserstack/playwright (#1961 ) * Add basic test harness for browserstack/playwright * Refactor the tests * added the first test for outbound links * tests for outbound-links and file-downloads * added more browser versions to test on * Lint tracker test files * Update harness.js with BrowserStack example * Fix Playwright request mocks * Add test harness to CI * Remove Safari on Windows from browsers list Co-authored-by: Robert <robertjoonas16@gmail.com> Co-authored-by: Vinicius Brasil <vini@hey.com>	2022-08-04 11:50:09 +03:00
Uku Taht	7e658210ba	Remove db-ip city lite	2022-05-30 10:21:28 +03:00
RobertJoonas	11654ddc07	Script extension additions (#1915 ) * added data-include attribute to plausible.exclusions.js * reorder extensions in filename when serving the plausible script * fix formatting * tweaks after review * changelog update	2022-05-27 10:11:40 +03:00
Uku Taht	18e2711556	Package new db-ip library in the git repo	2022-05-04 11:07:52 +03:00
RobertJoonas	199206babc	Dimensions continued (#1847 ) * added the first version of dimensions extension * finished dimensions script extension + updated tracking to use it * script variants build	2022-04-25 10:56:11 +03:00
Uku Taht	5e415c2420	Add entry_props back in	2022-04-22 10:58:02 +03:00
Uku Taht	8fb4f3f886	Revert entry props	2022-04-21 19:22:38 +03:00
RobertJoonas	40275b64d4	Pageview custom dimensions (#1816 ) * added custom dimension filtering tests for pages * first filter UI in place * pages, entry pages and exit pages can be filtered by pageview props * added tests for expected filtering behaviour * fix dimension filter for sources + tests * added is_not filtering functionality * fixed formatting * fixed admin_test * added (none) as filter value + is_not filter type in UI * added prefilling applied filter values and some UI tweaks * added fetch options * Make prop suggestions work with `props` filter * Fix test * Track login state internally * Add CHANGELOG entry Co-authored-by: Uku Taht <uku.taht@gmail.com>	2022-04-21 11:47:15 +03:00
Uku Taht	7c1d64458e	Add fun with flags library	2022-04-21 10:54:08 +03:00
RobertJoonas	83ab092f6c	added add-file-types attribute + now ignores query params (#1831 ) * added add-file-types attribute + now ignores query params * variable extraction + adjustments to general code style	2022-04-18 12:34:45 +03:00
Uku Taht	83c407c016	Upgrade Oban & configure Stager plugin (#1822 )	2022-04-08 11:05:21 +03:00
Uku Taht	333de87ceb	Add stats_start_date field	2022-04-06 10:10:53 +03:00
RobertJoonas	8616dd46fb	added file-downloads script extension (#1775 ) * added file-downloads script extension * fixed the issues and made it compatible with IE * changelog update	2022-03-31 13:52:09 +03:00
Marc Neudert	1c3085050c	Upgrade ua_inspector to 3.0 (#1762 ) * Upgrade ua_inspector to 3.0 * Update ua_inspector database	2022-03-25 11:41:04 +02:00
RobertJoonas	492f47ba1e	Crm transfer data (#1749 ) * pull from master * added query generation by struct fields * ready, improved tests * fixed a naming mistake	2022-03-24 16:11:04 +02:00
Uku Taht	e27734ed79	[Continued] Google Analytics import (#1753 ) * Add has_imported_stats boolean to Site * Add Google Analytics import panel to general settings * Get GA profiles to display in import settings panel * Add import_from_google method as entrypoint to import data * Add imported_visitors table * Remove conflicting code from migration * Import visitors data into clickhouse database * Pass another dataset to main graph for rendering in red This adds another entry to the JSON data returned via the main graph API called `imported_plot`, which is similar to `plot` in form but will be completed with previously imported data. Currently it simply returns the values from `plot` / 2. The data is rendered in the main graph in red without fill, and without an indicator for the present. Rationale: imported data will not continue to grow so there is no projection forward, only backwards. * Hook imported GA data to dashboard timeseries plot * Add settings option to forget imported data * Import sources from google analytics * Merge imported sources when queried * Merge imported source data native data when querying sources * Start converting metrics to atoms so they can be subqueried This changes "visitors" and in some places "sources" to atoms. This does not change the behaviour of the functions - the tests all pass unchanged following this commit. This is necessary as joining subqueries requires that the keys in `select` statements be atoms and not strings. 
* Convery GA (direct) source to empty string * Import utm campaign and utm medium from GA * format * Import all data types from GA into new tables * Handle large amounts of more data more safely * Fix some mistakes in tables * Make GA requests in chunks of 5 queries * Only display imported timeseries when there is no filter * Correctly show last 30 minutes timeseries when 'realtime' * Add with_imported key to Query struct * Account for injected :is_not filter on sources from dashboard * Also add tentative imported_utm_sources table This needs a bit more work on the google import side, as GA do not report sources and utm sources as distinct things. * Return imported data to dashboard for rest of Sources panel This extends the merge_imported function definition for sources to utm_sources, utm_mediums and utm_campaigns too. This appears to be working on the DB side but something is incomplete on the client side. * Clear imported stats from all tables when requested * Merge entry pages and exit pages from imported data into unfiltered dashboard view This requires converting the `"visits"` and `"visit_duration"` metrics to atoms so that they can be used in ecto subqueries. * Display imported devices, browsers and OSs on dashboard * Display imported country data on dashboard * Add more metrics to entries/exits for modals * make sure data is returned via API with correct keys * Import regions and cities from GA * Capitalize device upon import to match native data * Leave query limits/offsets until after possibly joining with imported data * Also import timeOnPage and pageviews for pages from GA * imported_countries -> imported_locations * Get timeOnPage and pageviews for pages from GA These are needed for the pages modal, and for calculating exit rates for exit pages. 
* Add indicator to dashboard when imported data is being used * Don't show imported data as separately line on main graph * "bounce_rate" -> :bounce_rate, so it works in subqueries * Drop imported browser and OS versions These are not needed. * Toggle displaying imported data by clicking indicator * Parse referrers with RefInspector - Use 'ga:fullReferrer' instead of 'ga:source'. This provides the actual referrer host + path, whereas 'ga:source' includes utm_mediums and other values when relevant. - 'ga:fullReferror' does however include search engine names directly, so they are manually checked for as RefInspector won't pick up on these. * Keep imported data indicator on dashboard and strikethrough when hidden * Add unlink google button to import panel * Rename some GA browsers and OSes to plausible versions * Get main top pages and exit pages panels working correctly with imported data * mix format * Fetch time_on_pages for imported data when needed * entry pages need to fetch bounces from GA * "sample_percent" -> :sample_percent as only atoms can be used in subqueries * Calculate bounce_rate for joined native and imported data for top pages modal * Flip some query bindings around to be less misleading * Fixup entry page modal visit durations * mix format * Fetch bounces and visit_duration for sources from GA * add more source metrics used for data in modals * Make sources modals display correct values * imported_visitors: bounce_rate -> bounces, avg_visit_duration -> visit_duration * Merge imported data into aggregate stats * Reformat top graph side icons * Ensure sample_percent is yielded from aggregate data * filter event_props should be strings * Hide imported data from frontend when using filter * Fix existing tests * fix tests * Fix imported indicator appearing when filtering * comma needed, lost when rebasing * Import utm_terms and utm_content from GA * Merge imported utm_term and utm_content * Rename imported Countries data as Locations * Set imported 
city schema field to int * Remove utm_terms and utm_content when clearing imported * Clean locations import from Google Analytics - Country and region should be set to "" when GA provides "(not set)" - City should be set to 0 for "unknown", as we cannot reliably import city data from GA. * Display imported region and city in dashboard * os -> operating_system in some parts of code The inconsistency of using os in some places and operating_system in others causes trouble with subqueries and joins for the native and imported data, which would require additional logic to account for. The simplest solution is the just use a consistent word for all uses. This doesn't make any user-facing or database changes. * to_atom -> to_existing_atom * format * "events" metric -> :events * ignore imported data when "events" in metrics * update "bounce_rate" * atomise some more metrics from new city and region api * atomise some more metrics for email handlers * "conversion_rate" -> :conversion_rate during csv export * Move imported data stats code to own module * Move imported timeseries function to Stats.Imported * Use Timex.parse to import dates from GA * has_imported_stats -> imported_source * "time_on_page" -> :time_on_page * Convert imported GA data to UTC * Clean up GA request code a bit There was some weird logic here with two separate lists that really ought to be together, so this merges those. * Fail sooner if GA timezone can't be identified * Link imported tables to site by id * imported_utm_content -> imported_utm_contents * Imported GA from all of time * Reorganise GA data fetch logic - Fetch data from the start of time (2005) - Check whether no data was fetched, and if so, inform user and don't consider data to be imported. * Clarify removal of "visits" data when it isn't in metrics * Apply location filters from API This makes it consistent with the sources etc which filter out 'Direct / None' on the API side. 
These filters are used by both the native and imported data handling code, which would otherwise both duplicate the filters in their `where` clauses. * Do not use changeset for setting site.imported_source * Add all metrics to all dimensions * Run GA import in the background * Send email when GA import completes * Add handler to insert imported data into tests and imported_browsers_factory * Add remaining import data test factories * Add imported location data to test * Test main graph with imported data * Add imported data to operating systems tests * Add imported data to pages tests * Add imported data to entry pages tests * Add imported data to exit pages tests * Add imported data to devices tests * Add imported data to sources tests * Add imported data to UTM tests * Add new test module for the data import step * Test import of sources GA data * Test import of utm_mediums GA data * Test import of utm_campaigns GA data * Add tests for UTM terms * Add tests for UTM contents * Add test for importing pages and entry pages data from GA * Add test for importing exit page data * Fix module file name typo * Add test for importing location data from GA * Add test for importing devices data from GA * Add test for importing browsers data from GA * Add test for importing OS data from GA * Paginate GA requests to download all data * Bump clickhouse_ecto version * Move RefInspector wrapper function into module * Drop timezone transform on import * Order imported by side_id then date * More strings -> atoms Also changes a conditional to be a bit nicer * Remove parallelisation of data import * Split sources and UTM sources from fetched GA data GA has only a "source" dimension and no "UTM source" dimension. Instead it returns these combined. The logic herein to tease these apart is: 1. "(direct)" -> it's a direct source 2. if the source is a domain -> it's a source 3. "google" -> it's from adwords; let's make this a UTM source "adwords" 4. 
else -> just a UTM source * Keep prop names in queries as strings * fix typo * Fix import * Insert data to clickhouse in batches * Fix link when removing imported data * Merge source tables * Import hostname as well as pathname * Record start and end time of imported data * Track import progress * Fix month interval with imported data * Do not JOIN when imported date range has no overlap * Fix time on page using exits Co-authored-by: mcol <mcol@posteo.net>	2022-03-10 15:04:59 -06:00
Uku Taht	d4e7d27df6	Merge branch 'new-plans'	2021-12-30 11:09:13 +02:00
Uku Taht	e4b99dbad6	Fix Security error with localstorage access (#1568 ) * Fix Security error with localstorage access * Changelog	2021-12-29 11:08:11 +02:00
Uku Taht	0bf1da09ce	New plans	2021-12-28 12:26:53 +02:00
Uku Taht	48ad7485c8	PR 1393 continued (#1542 ) * Add `utm_content` and `utm_term`. Support `utm_content` and `utm_term` as requested in #515. * Add dropdown for UTM options * Remove utm_content and term from filter modal for now Co-authored-by: Blender Defender <defenderblender@gmail.com>	2021-12-16 11:02:09 +02:00
Uku Taht	1dba113e2f	[Draft] Improve location translations (#1526 ) * WIP * Use location library for search suggestions * Remove unused code * Remove Countries completely * Fix tests	2021-12-13 12:03:27 +02:00
Uku Taht	4d0bc61ffd	Remove Twitter stuff	2021-12-02 11:53:29 +02:00
Uku Taht	05bf43c1be	City level location data (#1449 ) * Merge branch 'plausible_master' * Add City level details * Add City level details * Use ISO codes instead of geoname_id for subdivisions * Add easier way to configure geolocation database * Add workflow for dev branch * Correct clickhouse migration * Translate subdivision names * Translate city names * WIP * Region and country filters * Fix region filter * Remove region_name when removing region filter * Add modals for regions and cities * Remove dead code * WIP * Revert "WIP" This reverts commit `3202bf2fe9`. * Feature flag to hide cities when deployed * Add changelog entry * Remove unused code * Remove unused variables * Fix test Co-authored-by: AymanTerra <aymanterra@yahoo.com>	2021-11-23 11:39:09 +02:00
Émile Perron	fbd6e37767	Add option to specify a URL in the manual extension (#1479 ) * Add option to specify a URL in the manual extension * Fix error when option parameter isn't provided The uglify-js dependency had to be updated to support the optional chaining operator * Remove the possibility to pass a function for the custom URL This added weight unnecessarily, as the users can simply call the function themselves before triggering the event. * Change optional chaining to ternary for improved browser support * Revert package-lock.json to version 3.9.4 of uglify-js It was previously updated to support optional chaining; reverting it as a reminder that this is the current browser support level that is accepted by Plausible.	2021-11-23 10:50:06 +02:00
Uku Taht	e85f0f13cc	Improve Clickhouse DDL	2021-11-22 16:06:21 +02:00
Uku Taht	e9cb8eb4e2	Remove grace period if user upgrades	2021-11-16 10:14:24 +02:00
Uku Taht	29cb7462e6	Add grace period to upgrade	2021-11-16 10:14:23 +02:00

157 Commits