mirror of
https://github.com/plausible/analytics.git
synced 2024-12-02 07:38:47 +03:00
e27734ed79
* Add has_imported_stats boolean to Site * Add Google Analytics import panel to general settings * Get GA profiles to display in import settings panel * Add import_from_google method as entrypoint to import data * Add imported_visitors table * Remove conflicting code from migration * Import visitors data into clickhouse database * Pass another dataset to main graph for rendering in red This adds another entry to the JSON data returned via the main graph API called `imported_plot`, which is similar to `plot` in form but will be completed with previously imported data. Currently it simply returns the values from `plot` / 2. The data is rendered in the main graph in red without fill, and without an indicator for the present. Rationale: imported data will not continue to grow so there is no projection forward, only backwards. * Hook imported GA data to dashboard timeseries plot * Add settings option to forget imported data * Import sources from google analytics * Merge imported sources when queried * Merge imported source data native data when querying sources * Start converting metrics to atoms so they can be subqueried This changes "visitors" and in some places "sources" to atoms. This does not change the behaviour of the functions - the tests all pass unchanged following this commit. This is necessary as joining subqueries requires that the keys in `select` statements be atoms and not strings. 
* Convert GA (direct) source to empty string * Import utm campaign and utm medium from GA * format * Import all data types from GA into new tables * Handle large amounts of data more safely * Fix some mistakes in tables * Make GA requests in chunks of 5 queries * Only display imported timeseries when there is no filter * Correctly show last 30 minutes timeseries when 'realtime' * Add with_imported key to Query struct * Account for injected :is_not filter on sources from dashboard * Also add tentative imported_utm_sources table This needs a bit more work on the google import side, as GA do not report sources and utm sources as distinct things. * Return imported data to dashboard for rest of Sources panel This extends the merge_imported function definition for sources to utm_sources, utm_mediums and utm_campaigns too. This appears to be working on the DB side but something is incomplete on the client side. * Clear imported stats from all tables when requested * Merge entry pages and exit pages from imported data into unfiltered dashboard view This requires converting the `"visits"` and `"visit_duration"` metrics to atoms so that they can be used in ecto subqueries. * Display imported devices, browsers and OSs on dashboard * Display imported country data on dashboard * Add more metrics to entries/exits for modals * make sure data is returned via API with correct keys * Import regions and cities from GA * Capitalize device upon import to match native data * Leave query limits/offsets until after possibly joining with imported data * Also import timeOnPage and pageviews for pages from GA * imported_countries -> imported_locations * Get timeOnPage and pageviews for pages from GA These are needed for the pages modal, and for calculating exit rates for exit pages. 
* Add indicator to dashboard when imported data is being used * Don't show imported data as a separate line on main graph * "bounce_rate" -> :bounce_rate, so it works in subqueries * Drop imported browser and OS versions These are not needed. * Toggle displaying imported data by clicking indicator * Parse referrers with RefInspector - Use 'ga:fullReferrer' instead of 'ga:source'. This provides the actual referrer host + path, whereas 'ga:source' includes utm_mediums and other values when relevant. - 'ga:fullReferrer' does however include search engine names directly, so they are manually checked for as RefInspector won't pick up on these. * Keep imported data indicator on dashboard and strikethrough when hidden * Add unlink google button to import panel * Rename some GA browsers and OSes to plausible versions * Get main top pages and exit pages panels working correctly with imported data * mix format * Fetch time_on_pages for imported data when needed * entry pages need to fetch bounces from GA * "sample_percent" -> :sample_percent as only atoms can be used in subqueries * Calculate bounce_rate for joined native and imported data for top pages modal * Flip some query bindings around to be less misleading * Fixup entry page modal visit durations * mix format * Fetch bounces and visit_duration for sources from GA * add more source metrics used for data in modals * Make sources modals display correct values * imported_visitors: bounce_rate -> bounces, avg_visit_duration -> visit_duration * Merge imported data into aggregate stats * Reformat top graph side icons * Ensure sample_percent is yielded from aggregate data * filter event_props should be strings * Hide imported data from frontend when using filter * Fix existing tests * fix tests * Fix imported indicator appearing when filtering * comma needed, lost when rebasing * Import utm_terms and utm_content from GA * Merge imported utm_term and utm_content * Rename imported Countries data as Locations * Set imported 
city schema field to int * Remove utm_terms and utm_content when clearing imported * Clean locations import from Google Analytics - Country and region should be set to "" when GA provides "(not set)" - City should be set to 0 for "unknown", as we cannot reliably import city data from GA. * Display imported region and city in dashboard * os -> operating_system in some parts of code The inconsistency of using os in some places and operating_system in others causes trouble with subqueries and joins for the native and imported data, which would require additional logic to account for. The simplest solution is to just use a consistent word for all uses. This doesn't make any user-facing or database changes. * to_atom -> to_existing_atom * format * "events" metric -> :events * ignore imported data when "events" in metrics * update "bounce_rate" * atomise some more metrics from new city and region api * atomise some more metrics for email handlers * "conversion_rate" -> :conversion_rate during csv export * Move imported data stats code to own module * Move imported timeseries function to Stats.Imported * Use Timex.parse to import dates from GA * has_imported_stats -> imported_source * "time_on_page" -> :time_on_page * Convert imported GA data to UTC * Clean up GA request code a bit There was some weird logic here with two separate lists that really ought to be together, so this merges those. * Fail sooner if GA timezone can't be identified * Link imported tables to site by id * imported_utm_content -> imported_utm_contents * Imported GA from all of time * Reorganise GA data fetch logic - Fetch data from the start of time (2005) - Check whether no data was fetched, and if so, inform user and don't consider data to be imported. * Clarify removal of "visits" data when it isn't in metrics * Apply location filters from API This makes it consistent with the sources etc which filter out 'Direct / None' on the API side. 
These filters are used by both the native and imported data handling code, which would otherwise both duplicate the filters in their `where` clauses. * Do not use changeset for setting site.imported_source * Add all metrics to all dimensions * Run GA import in the background * Send email when GA import completes * Add handler to insert imported data into tests and imported_browsers_factory * Add remaining import data test factories * Add imported location data to test * Test main graph with imported data * Add imported data to operating systems tests * Add imported data to pages tests * Add imported data to entry pages tests * Add imported data to exit pages tests * Add imported data to devices tests * Add imported data to sources tests * Add imported data to UTM tests * Add new test module for the data import step * Test import of sources GA data * Test import of utm_mediums GA data * Test import of utm_campaigns GA data * Add tests for UTM terms * Add tests for UTM contents * Add test for importing pages and entry pages data from GA * Add test for importing exit page data * Fix module file name typo * Add test for importing location data from GA * Add test for importing devices data from GA * Add test for importing browsers data from GA * Add test for importing OS data from GA * Paginate GA requests to download all data * Bump clickhouse_ecto version * Move RefInspector wrapper function into module * Drop timezone transform on import * Order imported by site_id then date * More strings -> atoms Also changes a conditional to be a bit nicer * Remove parallelisation of data import * Split sources and UTM sources from fetched GA data GA has only a "source" dimension and no "UTM source" dimension. Instead it returns these combined. The logic herein to tease these apart is: 1. "(direct)" -> it's a direct source 2. if the source is a domain -> it's a source 3. "google" -> it's from adwords; let's make this a UTM source "adwords" 4. 
else -> just a UTM source * Keep prop names in queries as strings * fix typo * Fix import * Insert data to clickhouse in batches * Fix link when removing imported data * Merge source tables * Import hostname as well as pathname * Record start and end time of imported data * Track import progress * Fix month interval with imported data * Do not JOIN when imported date range has no overlap * Fix time on page using exits Co-authored-by: mcol <mcol@posteo.net> |
||
---|---|---|
.formatter.exs | ||
20181201181549_add_pageviews.exs | ||
20181214201821_add_new_visitor_to_pageviews.exs | ||
20181215140923_add_session_id_to_pageviews.exs | ||
20190109173917_create_sites.exs | ||
20190117135714_add_uid_to_pageviews.exs | ||
20190118154210_add_derived_data_to_pageviews.exs | ||
20190126135857_add_name_to_users.exs | ||
20190127213938_add_tz_to_sites.exs | ||
20190205165931_add_last_seen_to_users.exs | ||
20190213224404_add_intro_emails.exs | ||
20190219130809_delete_intro_emails_when_user_is_deleted.exs | ||
20190301122344_add_country_code_to_pageviews.exs | ||
20190324155606_add_password_hash_to_users.exs | ||
20190402145007_remove_device_type_from_pageviews.exs | ||
20190402145357_remove_screen_height_from_pageviews.exs | ||
20190402172423_add_index_to_pageviews.exs | ||
20190410095248_add_feedback_emails.exs | ||
20190424162903_delete_feedback_emails_when_user_is_deleted.exs | ||
20190430140411_use_citext_for_email.exs | ||
20190430152923_create_subscriptions.exs | ||
20190516113517_remove_session_id_from_pageviews.exs | ||
20190520144229_change_user_id_to_uuid.exs | ||
20190523160838_add_raw_referrer.exs | ||
20190523171519_add_indices_to_referrers.exs | ||
20190618165016_add_public_sites.exs | ||
20190718160353_create_google_search_console_integration.exs | ||
20190723141824_associate_google_auth_with_site.exs | ||
20190730014913_add_monthly_stats.exs | ||
20190730142200_add_weekly_stats.exs | ||
20190730144413_add_daily_stats.exs | ||
20190809174105_calc_screen_size.exs | ||
20190810145419_remove_unused_indices.exs | ||
20190820140747_remove_rollup_tables.exs | ||
20190906111810_add_email_reporting.exs | ||
20190907134114_add_unique_index_to_email_settings.exs | ||
20190910120900_add_email_address_to_settings.exs | ||
20190911102027_add_monthly_reports.exs | ||
20191010031425_add_property_to_google_auth.exs | ||
20191015072730_remove_unused_fields.exs | ||
20191015073507_proper_timestamp_for_pageviews.exs | ||
20191024062200_rename_pageviews_to_events.exs | ||
20191025055334_add_name_to_events.exs | ||
20191031051340_add_goals.exs | ||
20191031063001_remove_goal_name.exs | ||
20191118075359_allow_free_subscriptions.exs | ||
20191216064647_add_unique_index_to_email_reports.exs | ||
20191218082207_add_sessions.exs | ||
20191220042658_add_session_start.exs | ||
20200106090739_cascade_google_auth_deletion.exs | ||
20200107095234_add_entry_page_to_sessions.exs | ||
20200113143927_add_exit_page_to_session.exs | ||
20200114131538_add_tweets.exs | ||
20200120091134_change_session_referrer_to_text.exs | ||
20200121091251_add_recipients.exs | ||
20200122150130_add_shared_links.exs | ||
20200130123049_add_site_id_to_events.exs | ||
20200204093801_rename_site_id_to_domain.exs | ||
20200204133522_drop_events_hostname_index.exs | ||
20200210134612_add_fingerprint_to_events.exs | ||
20200211080841_add_raw_fingerprint.exs | ||
20200211090126_remove_raw_fingerprint.exs | ||
20200211133829_add_initial_source_and_referrer_to_events.exs | ||
20200219124314_create_custom_domains.exs | ||
20200227092821_add_fingerprint_sesssions.exs | ||
20200302105632_flexible_fingerprint_referrer.exs | ||
20200317093028_add_trial_expiry_to_users.exs | ||
20200317142459_backfill_fingerprints.exs | ||
20200320100803_add_setup_emails.exs | ||
20200323083536_add_create_site_emails.exs | ||
20200323084954_add_check_stats_emails.exs | ||
20200324132431_make_cookie_fields_non_required.exs | ||
20200406115153_cascade_custom_domain_deletion.exs | ||
20200408122329_cascade_setup_emails_deletion.exs | ||
20200529071028_add_oban_jobs_table.exs | ||
20200605134616_remove_events_and_sessions.exs | ||
20200605142737_remove_fingerprint_sessions_table.exs | ||
20200619071221_create_salts_table.exs | ||
20201130083829_add_email_verification_codes.exs | ||
20201208173543_add_spike_notifications.exs | ||
20201210085345_add_email_verified_to_users.exs | ||
20201214072008_add_theme_pref_to_users.exs | ||
20201230085939_delete_email_records_when_user_is_deleted.exs | ||
20210115092331_cascade_site_deletion_to_spike_notification.exs | ||
20210119093337_add_unique_index_to_spike_notification.exs | ||
20210128083453_cascade_site_deletion.exs | ||
20210128084657_create_api_keys.exs | ||
20210209095257_add_last_payment_details.exs | ||
20210406073254_add_name_to_shared_links.exs | ||
20210409074413_add_unique_index_to_shared_link_name.exs | ||
20210409082603_add_api_key_scopes.exs | ||
20210420075623_add_sent_renewal_notifications.exs | ||
20210426075157_upgrade_oban_jobs_to_v9.exs | ||
20210513091653_add_currency_to_subscription.exs | ||
20210525085655_add_rate_limit_to_api_keys.exs | ||
20210531080158_add_role_to_site_memberships.exs | ||
20210601090924_add_invitations.exs | ||
20210604085943_add_locked_to_sites.exs | ||
20210629124428_cascade_site_deletion_to_invitations.exs | ||
20210726090211_make_invitation_email_case_insensitive.exs | ||
20210906102736_memoize_setup_complete.exs | ||
20210908081119_allow_trial_expiry_to_be_null.exs | ||
20211020093238_add_enterprise_plans.exs | ||
20211022084427_add_site_limit_to_enterprise_plans.exs | ||
20211028122202_grace_period_end.exs | ||
20211110174617_add_site_imported_source.exs | ||
20211202094732_remove_tweets.exs |