analytics/test/plausible_web/controllers/api/stats_controller/pages_test.exs


defmodule PlausibleWeb.Api.StatsController.PagesTest do
  use PlausibleWeb.ConnCase

  @user_id 123

  describe "GET /api/stats/:domain/pages" do
    setup [:create_user, :log_in, :create_new_site, :create_legacy_site_import]

    test "returns top pages by visitors", %{conn: conn, site: site} do
      populate_stats(site, [
        build(:pageview, pathname: "/"),
        build(:pageview, pathname: "/"),
        build(:pageview, pathname: "/"),
        build(:pageview, pathname: "/register"),
        build(:pageview, pathname: "/register"),
        build(:pageview, pathname: "/contact")
      ])

      conn = get(conn, "/api/stats/#{site.domain}/pages?period=day")
      # Blame: "Implement filtering for imported data (#4118)", 2024-06-03 13:29:08 +03:00 (co-authored by Adrian Gruntkowski)
      assert json_response(conn, 200)["results"] == [
               %{"visitors" => 3, "name" => "/"},
               %{"visitors" => 2, "name" => "/register"},
               %{"visitors" => 1, "name" => "/contact"}
             # Blame: "Formatting only changes - No code change (#75)", 2020-06-08 10:35:13 +03:00 (signed off by Chandra Tungathurthi)
             ]
    end

    test "returns top pages by visitors by hostname", %{conn: conn1, site: site} do
      populate_stats(site, [
        build(:pageview, pathname: "/", hostname: "a.example.com"),
        build(:pageview, pathname: "/", hostname: "b.example.com"),
        build(:pageview, pathname: "/", hostname: "d.example.com"),
        build(:pageview, pathname: "/landing", hostname: "x.example.com", user_id: 123),
        build(:pageview, pathname: "/register", hostname: "d.example.com", user_id: 123),
        build(:pageview, pathname: "/register", hostname: "d.example.com", user_id: 123),
        build(:pageview, pathname: "/register", hostname: "d.example.com"),
        build(:pageview, pathname: "/contact", hostname: "e.example.com")
      ])

      filters = Jason.encode!(%{"hostname" => "*.example.com"})
      conn = get(conn1, "/api/stats/#{site.domain}/pages?period=day&filters=#{filters}")
      assert json_response(conn, 200)["results"] == [
               %{"visitors" => 3, "name" => "/"},
               %{"visitors" => 2, "name" => "/register"},
               %{"visitors" => 1, "name" => "/contact"},
               %{"visitors" => 1, "name" => "/landing"}
             ]

      filters = Jason.encode!(%{"hostname" => "d.example.com"})
      conn = get(conn1, "/api/stats/#{site.domain}/pages?period=day&filters=#{filters}")
      assert json_response(conn, 200)["results"] == [
               %{"visitors" => 2, "name" => "/register"},
               %{"visitors" => 1, "name" => "/"}
             ]
    end

    test "returns top pages with :is filter on custom pageview props", %{conn: conn, site: site} do
      populate_stats(site, [
        build(:pageview,
          pathname: "/blog/john-1",
          "meta.key": ["author"],
          "meta.value": ["John Doe"]
        ),
        build(:pageview,
          pathname: "/blog/other-post",
          "meta.key": ["author"],
          "meta.value": ["other"]
        ),
        build(:pageview, user_id: 123, pathname: "/")
      ])

      filters = Jason.encode!(%{props: %{"author" => "John Doe"}})
      conn = get(conn, "/api/stats/#{site.domain}/pages?period=day&filters=#{filters}")
      assert json_response(conn, 200)["results"] == [
               %{"visitors" => 1, "name" => "/blog/john-1"}
             ]
    end

    test "returns top pages with :is_not filter on custom pageview props", %{
      conn: conn,
      site: site
    } do
      populate_stats(site, [
        build(:pageview,
          pathname: "/blog/john-1",
          "meta.key": ["author"],
          "meta.value": ["John Doe"]
        ),
        build(:pageview,
          pathname: "/blog/other-post",
          "meta.key": ["author"],
          "meta.value": ["other"]
        ),
        build(:pageview, pathname: "/")
      ])

      filters = Jason.encode!(%{props: %{"author" => "!John Doe"}})
      conn = get(conn, "/api/stats/#{site.domain}/pages?period=day&filters=#{filters}")
      assert json_response(conn, 200)["results"] == [
               %{"visitors" => 1, "name" => "/"},
               %{"visitors" => 1, "name" => "/blog/other-post"}
             ]
    end

    test "returns top pages with :matches filter on custom pageview props", %{
      conn: conn,
      site: site
    } do
      populate_stats(site, [
        build(:pageview,
          pathname: "/1",
          "meta.key": ["prop"],
          "meta.value": ["bar"]
        ),
        build(:pageview,
          pathname: "/2",
          "meta.key": ["prop"],
          "meta.value": ["foobar"]
        ),
        build(:pageview,
          pathname: "/3",
          "meta.key": ["prop"],
          "meta.value": ["baar"]
        ),
        build(:pageview,
          pathname: "/4",
          "meta.key": ["another"],
          "meta.value": ["bar"]
        ),
        build(:pageview, pathname: "/5")
      ])

      filters = Jason.encode!(%{props: %{"prop" => "~bar"}})
      conn = get(conn, "/api/stats/#{site.domain}/pages?period=day&filters=#{filters}")
      assert json_response(conn, 200)["results"] == [
               %{"visitors" => 1, "name" => "/1"},
               %{"visitors" => 1, "name" => "/2"}
             ]
    end

    test "returns top pages with :matches_member filter on custom pageview props", %{
      conn: conn,
      site: site
    } do
      populate_stats(site, [
        build(:pageview,
          pathname: "/1",
          "meta.key": ["prop"],
          "meta.value": ["bar"]
        ),
        build(:pageview,
          pathname: "/2",
          "meta.key": ["prop"],
          "meta.value": ["foobar"]
        ),
        build(:pageview,
          pathname: "/3",
          "meta.key": ["prop"],
          "meta.value": ["baar"]
        ),
        build(:pageview,
          pathname: "/4",
          "meta.key": ["another"],
          "meta.value": ["bar"]
        ),
        build(:pageview, pathname: "/5"),
        build(:pageview,
          pathname: "/6",
          "meta.key": ["prop"],
          "meta.value": ["near"]
        )
      ])

      filters = Jason.encode!(%{props: %{"prop" => "~bar|nea"}})
      conn = get(conn, "/api/stats/#{site.domain}/pages?period=day&filters=#{filters}")
      assert json_response(conn, 200)["results"] == [
               %{"visitors" => 1, "name" => "/1"},
               %{"visitors" => 1, "name" => "/2"},
               %{"visitors" => 1, "name" => "/6"}
             ]
    end

    test "returns top pages with multiple filters on custom pageview props", %{
      conn: conn,
      site: site
    } do
      populate_stats(site, [
        build(:pageview,
          pathname: "/1",
          "meta.key": ["prop", "number"],
          "meta.value": ["bar", "1"]
        ),
        build(:pageview,
          pathname: "/2",
          "meta.key": ["prop", "number"],
          "meta.value": ["bar", "2"]
        ),
        build(:pageview,
          pathname: "/3",
          "meta.key": ["prop"],
          "meta.value": ["bar"]
        ),
        build(:pageview,
          pathname: "/4",
          "meta.key": ["number"],
          "meta.value": ["bar"]
        ),
        build(:pageview, pathname: "/5")
      ])

      filters = Jason.encode!(%{props: %{"prop" => "bar", "number" => "1"}})
      conn = get(conn, "/api/stats/#{site.domain}/pages?period=day&filters=#{filters}")
      assert json_response(conn, 200)["results"] == [
               %{"visitors" => 1, "name" => "/1"}
             ]
    end

    test "calculates bounce_rate and time_on_page with :is filter on custom pageview props",
         %{conn: conn, site: site} do
      populate_stats(site, [
        build(:pageview,
          pathname: "/blog/john-1",
          "meta.key": ["author"],
          "meta.value": ["John Doe"],
          user_id: @user_id,
          timestamp: ~N[2021-01-01 00:00:00]
        ),
        build(:pageview,
          pathname: "/blog",
          user_id: @user_id,
          timestamp: ~N[2021-01-01 00:01:00]
        ),
        build(:pageview,
          pathname: "/blog/john-2",
          "meta.key": ["author"],
          "meta.value": ["John Doe"],
          user_id: @user_id,
          timestamp: ~N[2021-01-01 00:02:00]
        ),
        build(:pageview,
          pathname: "/blog/john-2",
          "meta.key": ["author"],
          "meta.value": ["John Doe"],
          user_id: 456,
          timestamp: ~N[2021-01-01 00:00:00]
        ),
        build(:pageview,
          pathname: "/blog",
          user_id: 456,
          timestamp: ~N[2021-01-01 00:10:00]
        )
      ])

      filters = Jason.encode!(%{props: %{"author" => "John Doe"}})

      conn =
        get(
          conn,
          "/api/stats/#{site.domain}/pages?period=day&date=2021-01-01&filters=#{filters}&detailed=true"
        )
assert json_response(conn, 200)["results"] == [
%{
"name" => "/blog/john-2",
"visitors" => 2,
"pageviews" => 2,
"bounce_rate" => 0,
"time_on_page" => 600
},
%{
"name" => "/blog/john-1",
"visitors" => 1,
"pageviews" => 1,
"bounce_rate" => 0,
"time_on_page" => 60
}
]
end
test "calculates bounce_rate and time_on_page with :is_not filter on custom pageview props",
%{conn: conn, site: site} do
populate_stats(site, [
build(:pageview,
pathname: "/blog",
user_id: @user_id,
timestamp: ~N[2021-01-01 00:00:00]
),
build(:pageview,
pathname: "/blog/john-1",
user_id: @user_id,
"meta.key": ["author"],
"meta.value": ["John Doe"],
timestamp: ~N[2021-01-01 00:01:00]
),
build(:pageview,
pathname: "/blog/other-post",
user_id: @user_id,
"meta.key": ["author"],
"meta.value": ["other"],
timestamp: ~N[2021-01-01 00:02:00]
),
build(:pageview,
pathname: "/blog",
user_id: 456,
timestamp: ~N[2021-01-01 00:00:00]
),
build(:pageview,
pathname: "/blog/john-1",
"meta.key": ["author"],
"meta.value": ["John Doe"],
user_id: 456,
timestamp: ~N[2021-01-01 00:03:00]
)
])
filters = Jason.encode!(%{props: %{"author" => "!John Doe"}})
conn =
get(
conn,
"/api/stats/#{site.domain}/pages?period=day&date=2021-01-01&filters=#{filters}&detailed=true"
)
assert json_response(conn, 200)["results"] == [
%{
"name" => "/blog",
"visitors" => 2,
"pageviews" => 2,
"bounce_rate" => 0,
"time_on_page" => 120.0
},
%{
"name" => "/blog/other-post",
"visitors" => 1,
"pageviews" => 1,
"bounce_rate" => 0,
"time_on_page" => nil
}
]
end
test "calculates bounce_rate and time_on_page with :is (none) filter on custom pageview props",
%{conn: conn, site: site} do
populate_stats(site, [
build(:pageview,
pathname: "/blog",
user_id: @user_id,
timestamp: ~N[2021-01-01 00:00:00]
),
build(:pageview,
pathname: "/blog/john-1",
user_id: @user_id,
"meta.key": ["author"],
"meta.value": ["John Doe"],
timestamp: ~N[2021-01-01 00:01:00]
),
build(:pageview,
pathname: "/blog/other-post",
user_id: @user_id,
timestamp: ~N[2021-01-01 00:02:00]
),
build(:pageview,
pathname: "/blog",
timestamp: ~N[2021-01-01 00:00:00]
)
])
filters = Jason.encode!(%{props: %{"author" => "(none)"}})
conn =
get(
conn,
"/api/stats/#{site.domain}/pages?period=day&date=2021-01-01&filters=#{filters}&detailed=true"
)
assert json_response(conn, 200)["results"] == [
%{
"name" => "/blog",
"visitors" => 2,
"pageviews" => 2,
"bounce_rate" => 50,
"time_on_page" => 60
},
%{
"name" => "/blog/other-post",
"visitors" => 1,
"pageviews" => 1,
"bounce_rate" => 0,
"time_on_page" => nil
}
]
end
test "calculates bounce_rate and time_on_page with :is_not (none) filter on custom pageview props",
%{conn: conn, site: site} do
populate_stats(site, [
build(:pageview,
pathname: "/blog/john-1",
user_id: @user_id,
"meta.key": ["author"],
"meta.value": ["John Doe"],
timestamp: ~N[2021-01-01 00:00:00]
),
build(:pageview,
pathname: "/blog",
user_id: @user_id,
timestamp: ~N[2021-01-01 00:01:00]
),
build(:pageview,
pathname: "/blog/other-post",
"meta.key": ["author"],
"meta.value": ["other"],
user_id: @user_id,
timestamp: ~N[2021-01-01 00:02:00]
),
build(:pageview,
pathname: "/blog/other-post",
"meta.key": ["author"],
"meta.value": [""],
timestamp: ~N[2021-01-01 00:00:00]
)
])
filters = Jason.encode!(%{props: %{"author" => "!(none)"}})
conn =
get(
conn,
"/api/stats/#{site.domain}/pages?period=day&date=2021-01-01&filters=#{filters}&detailed=true"
)
assert json_response(conn, 200)["results"] == [
%{
"name" => "/blog/other-post",
"visitors" => 2,
"pageviews" => 2,
"bounce_rate" => 100,
"time_on_page" => nil
},
%{
"name" => "/blog/john-1",
"visitors" => 1,
"pageviews" => 1,
"bounce_rate" => 0,
"time_on_page" => 60
}
]
end
test "returns top pages with :not_member filter on custom pageview props", %{
conn: conn,
site: site
} do
populate_stats(site, [
build(:pageview,
pathname: "/chrome",
"meta.key": ["browser"],
"meta.value": ["Chrome"],
timestamp: ~N[2021-01-01 00:00:00]
),
build(:pageview,
pathname: "/chrome",
"meta.key": ["browser"],
"meta.value": ["Chrome"],
timestamp: ~N[2021-01-01 00:00:00]
),
build(:pageview,
pathname: "/safari",
"meta.key": ["browser"],
"meta.value": ["Safari"],
timestamp: ~N[2021-01-01 00:00:00]
),
build(:pageview,
pathname: "/firefox",
"meta.key": ["browser"],
"meta.value": ["Firefox"],
timestamp: ~N[2021-01-01 00:00:00]
),
build(:pageview,
pathname: "/firefox",
timestamp: ~N[2021-01-01 00:00:00]
)
])
filters = Jason.encode!(%{props: %{"browser" => "!Chrome|Safari"}})
conn =
get(conn, "/api/stats/#{site.domain}/pages?period=day&date=2021-01-01&filters=#{filters}")
assert json_response(conn, 200)["results"] == [
%{
"name" => "/firefox",
"visitors" => 2
}
]
end
test "returns top pages with :not_member filter on custom pageview props including (none) value",
%{conn: conn, site: site} do
populate_stats(site, [
build(:pageview,
pathname: "/chrome",
"meta.key": ["browser"],
"meta.value": ["Chrome"],
timestamp: ~N[2021-01-01 00:00:00]
),
build(:pageview,
pathname: "/chrome",
"meta.key": ["browser"],
"meta.value": ["Chrome"],
timestamp: ~N[2021-01-01 00:00:00]
),
build(:pageview,
pathname: "/safari",
"meta.key": ["browser"],
"meta.value": ["Safari"],
timestamp: ~N[2021-01-01 00:00:00]
),
build(:pageview,
pathname: "/no-browser-prop",
timestamp: ~N[2021-01-01 00:00:00]
)
])
filters = Jason.encode!(%{props: %{"browser" => "!Chrome|(none)"}})
conn =
get(conn, "/api/stats/#{site.domain}/pages?period=day&date=2021-01-01&filters=#{filters}")
assert json_response(conn, 200)["results"] == [
%{
"name" => "/safari",
"visitors" => 1
}
]
end
test "calculates bounce_rate and time_on_page for pages filtered by page path",
%{conn: conn, site: site} do
populate_stats(site, [
build(:pageview,
pathname: "/",
user_id: @user_id,
timestamp: ~N[2021-01-01 00:00:00]
),
build(:pageview,
pathname: "/about",
user_id: @user_id,
timestamp: ~N[2021-01-01 00:01:00]
),
build(:pageview,
pathname: "/",
user_id: @user_id,
timestamp: ~N[2021-01-01 00:02:00]
),
build(:pageview,
pathname: "/",
timestamp: ~N[2021-01-01 00:00:00]
),
build(:pageview,
pathname: "/about",
timestamp: ~N[2021-01-01 00:10:00]
)
])
filters = Jason.encode!(%{page: "/"})
conn =
get(
conn,
"/api/stats/#{site.domain}/pages?period=day&date=2021-01-01&filters=#{filters}&detailed=true"
)
assert json_response(conn, 200)["results"] == [
%{
"name" => "/",
"visitors" => 2,
"pageviews" => 3,
"bounce_rate" => 50,
"time_on_page" => 60
}
]
end
test "can filter using the | (OR) filter",
%{conn: conn, site: site} do
populate_stats(site, [
build(:pageview,
pathname: "/",
user_id: @user_id,
timestamp: ~N[2021-01-01 00:00:00]
),
build(:pageview,
pathname: "/irrelevant",
user_id: @user_id,
timestamp: ~N[2021-01-01 00:01:00]
),
build(:pageview,
pathname: "/",
user_id: @user_id,
timestamp: ~N[2021-01-01 00:02:00]
),
build(:pageview,
pathname: "/",
timestamp: ~N[2021-01-01 00:00:00]
),
build(:pageview,
pathname: "/about",
timestamp: ~N[2021-01-01 00:10:00]
)
])
filters = Jason.encode!(%{page: "/about|/"})
conn =
get(
conn,
"/api/stats/#{site.domain}/pages?period=day&date=2021-01-01&filters=#{filters}&detailed=true"
)
assert json_response(conn, 200)["results"] == [
%{
"name" => "/",
"visitors" => 2,
"pageviews" => 3,
"bounce_rate" => 50,
"time_on_page" => 60
},
%{
"name" => "/about",
"visitors" => 1,
"pageviews" => 1,
"bounce_rate" => 100,
"time_on_page" => nil
}
]
end
test "can filter using the not_member filter type",
%{conn: conn, site: site} do
populate_stats(site, [
build(:pageview,
pathname: "/",
user_id: @user_id,
timestamp: ~N[2021-01-01 00:00:00]
),
build(:pageview,
pathname: "/irrelevant",
user_id: @user_id,
timestamp: ~N[2021-01-01 00:01:00]
),
build(:pageview,
pathname: "/",
user_id: @user_id,
timestamp: ~N[2021-01-01 00:02:00]
),
build(:pageview,
pathname: "/",
timestamp: ~N[2021-01-01 00:00:00]
),
build(:pageview,
pathname: "/about",
timestamp: ~N[2021-01-01 00:10:00]
)
])
filters = Jason.encode!(%{page: "!/irrelevant|/about"})
conn =
get(
conn,
"/api/stats/#{site.domain}/pages?period=day&date=2021-01-01&filters=#{filters}&detailed=true"
)
assert json_response(conn, 200)["results"] == [
%{
"name" => "/",
"visitors" => 2,
"pageviews" => 3,
"bounce_rate" => 50,
"time_on_page" => 60
}
]
end
test "can filter using the matches_member filter type",
%{conn: conn, site: site} do
populate_stats(site, [
build(:pageview,
pathname: "/blog/post-1",
user_id: @user_id,
timestamp: ~N[2021-01-01 00:00:00]
),
build(:pageview,
pathname: "/blog/post-2",
user_id: @user_id,
timestamp: ~N[2021-01-01 00:01:00]
),
build(:pageview,
pathname: "/",
timestamp: ~N[2021-01-01 00:00:00]
),
build(:pageview,
pathname: "/articles/post-1",
timestamp: ~N[2021-01-01 00:10:00]
),
build(:pageview,
pathname: "/articles/post-1",
timestamp: ~N[2021-01-01 00:10:00]
)
])
filters = Jason.encode!(%{page: "/blog/**|/articles/**"})
conn =
get(
conn,
"/api/stats/#{site.domain}/pages?period=day&date=2021-01-01&filters=#{filters}&detailed=true"
)
assert json_response(conn, 200)["results"] == [
%{
"name" => "/articles/post-1",
"visitors" => 2,
"pageviews" => 2,
"bounce_rate" => 100,
"time_on_page" => nil
},
%{
"name" => "/blog/post-1",
"visitors" => 1,
"pageviews" => 1,
"bounce_rate" => 0,
"time_on_page" => 60
},
%{
"name" => "/blog/post-2",
"visitors" => 1,
"pageviews" => 1,
"bounce_rate" => 0,
"time_on_page" => nil
}
]
end
test "page filter escapes brackets",
%{conn: conn, site: site} do
populate_stats(site, [
build(:pageview,
pathname: "/blog/(/post-1",
user_id: @user_id,
timestamp: ~N[2021-01-01 00:00:00]
),
build(:pageview,
pathname: "/blog/(/post-2",
user_id: @user_id,
timestamp: ~N[2021-01-01 00:01:00]
),
build(:pageview,
pathname: "/",
timestamp: ~N[2021-01-01 00:00:00]
)
])
filters = Jason.encode!(%{page: "/blog/(/**|/blog/)/**"})
conn =
get(
conn,
"/api/stats/#{site.domain}/pages?period=day&date=2021-01-01&filters=#{filters}&detailed=true"
)
assert json_response(conn, 200)["results"] == [
%{
"name" => "/blog/(/post-1",
"visitors" => 1,
"pageviews" => 1,
"bounce_rate" => 0,
"time_on_page" => 60
},
%{
"name" => "/blog/(/post-2",
"visitors" => 1,
"pageviews" => 1,
"bounce_rate" => 0,
"time_on_page" => nil
}
]
end
test "can filter using the not_matches_member filter type",
%{conn: conn, site: site} do
populate_stats(site, [
build(:pageview,
pathname: "/blog/post-1",
timestamp: ~N[2021-01-01 00:00:00]
),
build(:pageview,
pathname: "/",
user_id: @user_id,
timestamp: ~N[2021-01-01 00:00:00]
),
build(:pageview,
pathname: "/about",
user_id: @user_id,
timestamp: ~N[2021-01-01 00:10:00]
),
build(:pageview,
pathname: "/",
timestamp: ~N[2021-01-01 00:00:00]
),
build(:pageview,
pathname: "/articles/post-1",
timestamp: ~N[2021-01-01 00:10:00]
)
])
filters = Jason.encode!(%{page: "!/blog/**|/articles/**"})
conn =
get(
conn,
"/api/stats/#{site.domain}/pages?period=day&date=2021-01-01&filters=#{filters}&detailed=true"
)
assert json_response(conn, 200)["results"] == [
%{
"name" => "/",
"visitors" => 2,
"pageviews" => 2,
"bounce_rate" => 50,
"time_on_page" => 600
},
%{
"name" => "/about",
"visitors" => 1,
"pageviews" => 1,
"bounce_rate" => 0,
"time_on_page" => nil
}
]
end
test "returns top pages by visitors with imported data", %{conn: conn, site: site} do
populate_stats(site, [
build(:pageview, pathname: "/"),
build(:pageview, pathname: "/"),
build(:pageview, pathname: "/"),
build(:imported_pages, page: "/"),
build(:pageview, pathname: "/register"),
build(:pageview, pathname: "/register"),
build(:imported_pages, page: "/register"),
build(:pageview, pathname: "/contact")
])
conn = get(conn, "/api/stats/#{site.domain}/pages?period=day")
assert json_response(conn, 200)["results"] == [
%{"visitors" => 3, "name" => "/"},
%{"visitors" => 2, "name" => "/register"},
%{"visitors" => 1, "name" => "/contact"}
]
conn = get(conn, "/api/stats/#{site.domain}/pages?period=day&with_imported=true")
assert json_response(conn, 200)["results"] == [
else -> just a UTM source * Keep prop names in queries as strings * fix typo * Fix import * Insert data to clickhouse in batches * Fix link when removing imported data * Merge source tables * Import hostname as well as pathname * Record start and end time of imported data * Track import progress * Fix month interval with imported data * Do not JOIN when imported date range has no overlap * Fix time on page using exits Co-authored-by: mcol <mcol@posteo.net>
2022-03-11 00:04:59 +03:00
               %{"visitors" => 4, "name" => "/"},
               %{"visitors" => 3, "name" => "/register"},
               %{"visitors" => 1, "name" => "/contact"}
             ]
    end

    test "returns imported pages with a pageview goal filter", %{conn: conn, site: site} do
      insert(:goal, site: site, page_path: "/blog**")

      populate_stats(site, [
        build(:imported_pages, page: "/blog"),
        build(:imported_pages, page: "/not-this"),
        build(:imported_pages, page: "/blog/post-1", visitors: 2),
        build(:imported_visitors, visitors: 4)
      ])

      filters = Jason.encode!(%{goal: "Visit /blog**"})
      q = "?period=day&filters=#{filters}&with_imported=true"

      conn = get(conn, "/api/stats/#{site.domain}/pages#{q}")

      assert json_response(conn, 200)["results"] == [
               %{
                 "visitors" => 2,
                 "name" => "/blog/post-1",
                 "conversion_rate" => 100.0,
                 "total_visitors" => 2
               },
               %{
                 "visitors" => 1,
                 "name" => "/blog",
                 "conversion_rate" => 100.0,
                 "total_visitors" => 1
               }
             ]
    end

    test "calculates bounce rate and time on page for pages", %{conn: conn, site: site} do
      # @user_id spends 15 minutes (900 s) on "/" before moving to another page;
      # the second visitor views only "/" and bounces.
      populate_stats(site, [
        build(:pageview,
          pathname: "/",
          user_id: @user_id,
          timestamp: ~N[2021-01-01 00:00:00]
        ),
        build(:pageview,
          pathname: "/some-other-page",
          user_id: @user_id,
          timestamp: ~N[2021-01-01 00:15:00]
        ),
        build(:pageview,
          pathname: "/",
          timestamp: ~N[2021-01-01 00:15:00]
        )
      ])

      conn =
        get(
          conn,
          "/api/stats/#{site.domain}/pages?period=day&date=2021-01-01&detailed=true"
        )

      assert json_response(conn, 200)["results"] == [
               %{
                 "bounce_rate" => 50.0,
                 "time_on_page" => 900.0,
                 "visitors" => 2,
                 "pageviews" => 2,
                 "name" => "/"
               },
               %{
                 "bounce_rate" => 0,
                 "time_on_page" => nil,
                 "visitors" => 1,
                 "pageviews" => 1,
                 "name" => "/some-other-page"
               }
             ]
    end

    test "filtering by hostname, excludes a page on different hostname", %{conn: conn, site: site} do
      populate_stats(site, [
        build(:pageview,
          timestamp: ~N[2021-01-01 05:01:00],
          pathname: "/about",
          hostname: "blog.example.com",
          user_id: @user_id
        ),
        build(:pageview,
          timestamp: ~N[2021-01-01 05:01:02],
          pathname: "/hello",
          hostname: "example.com",
          user_id: @user_id
        ),
        build(:pageview,
          timestamp: ~N[2021-01-01 05:01:02],
          pathname: "/about",
          hostname: "blog.example.com"
        )
      ])

      filters = Jason.encode!(%{"hostname" => "blog.example.com"})

      conn =
        get(
          conn,
          "/api/stats/#{site.domain}/pages?period=day&date=2021-01-01&detailed=true&filters=#{filters}"
        )

      assert json_response(conn, 200)["results"] == [
               %{
                 "bounce_rate" => 50,
                 "name" => "/about",
                 "pageviews" => 2,
                 "time_on_page" => nil,
                 "visitors" => 2
               }
             ]
    end

    test "calculates bounce rate and time on page for pages when filtered by hostname", %{
      conn: conn,
      site: site
    } do
      populate_stats(site, [
        # session 1
        build(:pageview,
          pathname: "/about-blog",
          hostname: "blog.example.com",
          user_id: @user_id + 1,
          timestamp: ~N[2021-01-01 00:01:00]
        ),
        # session 2
        build(:pageview,
          pathname: "/about-blog",
          hostname: "blog.example.com",
          user_id: @user_id,
          timestamp: ~N[2021-01-01 00:01:00]
        ),
        build(:pageview,
          pathname: "/about",
          hostname: "example.com",
          user_id: @user_id,
          timestamp: ~N[2021-01-01 00:10:00]
        ),
        build(:pageview,
          pathname: "/about-blog",
          hostname: "blog.example.com",
          user_id: @user_id,
          timestamp: ~N[2021-01-01 00:15:00]
        ),
        build(:pageview,
          pathname: "/exit-blog",
          hostname: "blog.example.com",
          timestamp: ~N[2021-01-01 00:20:00],
          user_id: @user_id
        ),
        build(:pageview,
          pathname: "/about",
          hostname: "example.com",
          timestamp: ~N[2021-01-01 00:22:00],
          user_id: @user_id
        ),
        build(:pageview,
          pathname: "/exit",
          hostname: "example.com",
          timestamp: ~N[2021-01-01 00:25:00],
          user_id: @user_id
        ),
        # session 3
        build(:pageview,
          pathname: "/about",
          hostname: "example.com",
          user_id: @user_id + 2,
          timestamp: ~N[2021-01-01 00:01:00]
        )
      ])

      filters = Jason.encode!(%{"hostname" => "blog.example.com"})

      conn =
        get(
          conn,
          "/api/stats/#{site.domain}/pages?period=day&date=2021-01-01&detailed=true&filters=#{filters}"
        )

      assert json_response(conn, 200)["results"] == [
               %{
                 "bounce_rate" => 50,
                 "name" => "/about-blog",
                 "pageviews" => 3,
                 "time_on_page" => 1140.0,
                 "visitors" => 2
               },
               %{
                 "bounce_rate" => 0,
                 "name" => "/exit-blog",
                 "pageviews" => 1,
                 "time_on_page" => nil,
                 "visitors" => 1
               }
             ]
    end

    test "doesn't calculate time on page with only single page visits", %{conn: conn, site: site} do
      populate_stats(site, [
        build(:pageview, pathname: "/", user_id: @user_id, timestamp: ~N[2021-01-01 00:00:00]),
        build(:pageview, pathname: "/", user_id: @user_id, timestamp: ~N[2021-01-01 00:10:00])
      ])

      assert [%{"name" => "/", "time_on_page" => nil}] =
               conn
               |> get("/api/stats/#{site.domain}/pages?period=day&date=2021-01-01&detailed=true")
               |> json_response(200)
|> Map.get("results")
end
test "ignores page refresh when calculating time on page", %{conn: conn, site: site} do
populate_stats(site, [
build(:pageview, user_id: @user_id, timestamp: ~N[2021-01-01 00:00:00], pathname: "/"),
build(:pageview, user_id: @user_id, timestamp: ~N[2021-01-01 00:01:00], pathname: "/"),
build(:pageview, user_id: @user_id, timestamp: ~N[2021-01-01 00:02:00], pathname: "/"),
build(:pageview, user_id: @user_id, timestamp: ~N[2021-01-01 00:03:00], pathname: "/exit")
])
assert [
%{"name" => "/", "time_on_page" => _three_minutes = 180.0},
%{"name" => "/exit", "time_on_page" => nil}
] =
conn
|> get("/api/stats/#{site.domain}/pages?period=day&date=2021-01-01&detailed=true")
|> json_response(200)
|> Map.get("results")
end
test "calculates time on page per unique transition within session", %{conn: conn, site: site} do
# ┌─p──┬─p2─┬─minus(t2, t)─┬──s─┐
# │ /a │ /b │ 100 │ s1 │
# │ /a │ /d │ 100 │ s2 │ <- these two get treated
# │ /a │ /d │ 0 │ s2 │ <- as single page transition
# └────┴────┴──────────────┴────┘
# so that time_on_page(a)=(100+100)/uniq(transition)=200/2=100
s1 = @user_id
s2 = @user_id + 1
now = ~N[2021-01-01 00:00:00]
later = fn seconds -> NaiveDateTime.add(now, seconds) end
populate_stats(site, [
build(:pageview, user_id: s1, timestamp: now, pathname: "/a"),
build(:pageview, user_id: s1, timestamp: later.(100), pathname: "/b"),
build(:pageview, user_id: s2, timestamp: now, pathname: "/a"),
build(:pageview, user_id: s2, timestamp: later.(100), pathname: "/d"),
build(:pageview, user_id: s2, timestamp: later.(100), pathname: "/a"),
build(:pageview, user_id: s2, timestamp: later.(100), pathname: "/d")
])
assert [
%{"name" => "/a", "time_on_page" => 100.0},
%{"name" => "/b", "time_on_page" => nil},
%{"name" => "/d", "time_on_page" => +0.0}
] =
conn
|> get("/api/stats/#{site.domain}/pages?period=day&date=2021-01-01&detailed=true")
|> json_response(200)
|> Map.get("results")
end
test "calculates bounce rate and time on page for pages with imported data", %{
conn: conn,
site: site
} do
populate_stats(site, [
build(:pageview,
pathname: "/",
user_id: @user_id,
timestamp: ~N[2021-01-01 00:00:00]
),
build(:pageview,
pathname: "/some-other-page",
user_id: @user_id,
timestamp: ~N[2021-01-01 00:15:00]
),
build(:pageview,
pathname: "/",
timestamp: ~N[2021-01-01 00:15:00]
),
build(:imported_pages,
page: "/",
date: ~D[2021-01-01],
time_on_page: 700
),
build(:imported_entry_pages,
entry_page: "/",
date: ~D[2021-01-01],
entrances: 3,
bounces: 1
),
build(:imported_pages,
page: "/some-other-page",
date: ~D[2021-01-01],
time_on_page: 60
)
])
conn =
get(
conn,
"/api/stats/#{site.domain}/pages?period=day&date=2021-01-01&detailed=true&with_imported=true"
)
# Expected bounce_rate for "/", assuming native and imported entry-page counts
# are summed before dividing: native data has 2 visits entering on "/" (1 bounce),
# imported data adds 3 entrances (1 bounce), so (1 + 1) / (2 + 3) = 40%.
assert json_response(conn, 200)["results"] == [
%{
"bounce_rate" => 40.0,
"time_on_page" => 800.0,
"visitors" => 3,
"pageviews" => 3,
"name" => "/"
},
%{
"bounce_rate" => 0,
These filters are used by both the native and imported data handling code, which would otherwise both duplicate the filters in their `where` clauses. * Do not use changeset for setting site.imported_source * Add all metrics to all dimensions * Run GA import in the background * Send email when GA import completes * Add handler to insert imported data into tests and imported_browsers_factory * Add remaining import data test factories * Add imported location data to test * Test main graph with imported data * Add imported data to operating systems tests * Add imported data to pages tests * Add imported data to entry pages tests * Add imported data to exit pages tests * Add imported data to devices tests * Add imported data to sources tests * Add imported data to UTM tests * Add new test module for the data import step * Test import of sources GA data * Test import of utm_mediums GA data * Test import of utm_campaigns GA data * Add tests for UTM terms * Add tests for UTM contents * Add test for importing pages and entry pages data from GA * Add test for importing exit page data * Fix module file name typo * Add test for importing location data from GA * Add test for importing devices data from GA * Add test for importing browsers data from GA * Add test for importing OS data from GA * Paginate GA requests to download all data * Bump clickhouse_ecto version * Move RefInspector wrapper function into module * Drop timezone transform on import * Order imported by side_id then date * More strings -> atoms Also changes a conditional to be a bit nicer * Remove parallelisation of data import * Split sources and UTM sources from fetched GA data GA has only a "source" dimension and no "UTM source" dimension. Instead it returns these combined. The logic herein to tease these apart is: 1. "(direct)" -> it's a direct source 2. if the source is a domain -> it's a source 3. "google" -> it's from adwords; let's make this a UTM source "adwords" 4. 
else -> just a UTM source * Keep prop names in queries as strings * fix typo * Fix import * Insert data to clickhouse in batches * Fix link when removing imported data * Merge source tables * Import hostname as well as pathname * Record start and end time of imported data * Track import progress * Fix month interval with imported data * Do not JOIN when imported date range has no overlap * Fix time on page using exits Co-authored-by: mcol <mcol@posteo.net>
2022-03-11 00:04:59 +03:00
"time_on_page" => 60,
"visitors" => 2,
"pageviews" => 2,
"name" => "/some-other-page"
}
]
end
test "returns top pages in realtime report", %{conn: conn, site: site} do
populate_stats(site, [
build(:pageview, pathname: "/page1"),
build(:pageview, pathname: "/page2"),
build(:pageview, pathname: "/page1")
])
conn = get(conn, "/api/stats/#{site.domain}/pages?period=realtime")
assert json_response(conn, 200)["results"] == [
%{"visitors" => 2, "name" => "/page1"},
%{"visitors" => 1, "name" => "/page2"}
]
end
test "calculates conversion_rate when filtering for goal", %{conn: conn, site: site} do
populate_stats(site, [
build(:pageview, user_id: 1, pathname: "/"),
build(:pageview, user_id: 2, pathname: "/"),
build(:pageview, user_id: 3, pathname: "/"),
build(:event, user_id: 3, name: "Signup")
])
insert(:goal, site: site, event_name: "Signup")
filters = Jason.encode!(%{"goal" => "Signup"})
conn = get(conn, "/api/stats/#{site.domain}/pages?period=day&filters=#{filters}")
assert json_response(conn, 200)["results"] == [
%{"total_visitors" => 3, "visitors" => 1, "name" => "/", "conversion_rate" => 33.3}
]
end
test "filter by :is page with imported data", %{conn: conn, site: site} do
site_import = insert(:site_import, site: site)
populate_stats(site, site_import.id, [
build(:pageview, user_id: 1, pathname: "/", timestamp: ~N[2021-01-01 12:00:00]),
build(:pageview, user_id: 1, pathname: "/ignored", timestamp: ~N[2021-01-01 12:01:00]),
build(:imported_entry_pages,
entry_page: "/",
visitors: 1,
bounces: 1,
date: ~D[2021-01-01]
),
build(:imported_pages,
page: "/",
visitors: 3,
pageviews: 3,
time_on_page: 300,
date: ~D[2021-01-01]
),
build(:imported_pages, page: "/ignored", visitors: 10, date: ~D[2021-01-01])
])
filters = Jason.encode!(%{"page" => "/"})
q = "?period=day&date=2021-01-01&filters=#{filters}&detailed=true&with_imported=true"
conn = get(conn, "/api/stats/#{site.domain}/pages#{q}")
assert json_response(conn, 200)["results"] == [
%{
"bounce_rate" => 50,
"name" => "/",
"pageviews" => 4,
"time_on_page" => 90.0,
"visitors" => 4
}
]
end
test "filter by :member page with imported data", %{conn: conn, site: site} do
site_import = insert(:site_import, site: site)
populate_stats(site, site_import.id, [
build(:pageview, user_id: 1, pathname: "/", timestamp: ~N[2021-01-01 12:00:00]),
build(:pageview, user_id: 1, pathname: "/ignored", timestamp: ~N[2021-01-01 12:01:00]),
build(:imported_entry_pages,
entry_page: "/",
visitors: 1,
bounces: 1,
date: ~D[2021-01-01]
),
build(:imported_entry_pages,
entry_page: "/a",
visitors: 1,
bounces: 1,
date: ~D[2021-01-01]
),
build(:imported_pages,
page: "/",
visitors: 3,
pageviews: 3,
time_on_page: 300,
date: ~D[2021-01-01]
),
build(:imported_pages,
page: "/a",
visitors: 1,
date: ~D[2021-01-01]
),
build(:imported_pages, page: "/ignored", visitors: 10, date: ~D[2021-01-01])
])
filters = Jason.encode!(%{"page" => "/|/a"})
q = "?period=day&date=2021-01-01&filters=#{filters}&detailed=true&with_imported=true"
conn = get(conn, "/api/stats/#{site.domain}/pages#{q}")
assert json_response(conn, 200)["results"] == [
%{
"bounce_rate" => 50,
"name" => "/",
"pageviews" => 4,
"time_on_page" => 90.0,
"visitors" => 4
},
%{
"bounce_rate" => 100,
"name" => "/a",
"pageviews" => 1,
"time_on_page" => 10.0,
"visitors" => 1
}
]
end
test "filter by :matches page with imported data", %{conn: conn, site: site} do
site_import = insert(:site_import, site: site)
populate_stats(site, site_import.id, [
build(:pageview, user_id: 1, pathname: "/aaa", timestamp: ~N[2021-01-01 12:00:00]),
build(:pageview, user_id: 1, pathname: "/ignored", timestamp: ~N[2021-01-01 12:01:00]),
build(:imported_entry_pages,
entry_page: "/aaa",
visitors: 1,
bounces: 1,
date: ~D[2021-01-01]
),
build(:imported_entry_pages,
entry_page: "/a",
visitors: 1,
bounces: 1,
date: ~D[2021-01-01]
),
build(:imported_pages,
page: "/aaa",
visitors: 3,
pageviews: 3,
time_on_page: 300,
date: ~D[2021-01-01]
),
build(:imported_pages,
page: "/a",
visitors: 1,
date: ~D[2021-01-01]
),
build(:imported_pages, page: "/ignored", visitors: 10, date: ~D[2021-01-01])
])
filters = Jason.encode!(%{"page" => "/a**"})
q = "?period=day&date=2021-01-01&filters=#{filters}&detailed=true&with_imported=true"
conn = get(conn, "/api/stats/#{site.domain}/pages#{q}")
assert json_response(conn, 200)["results"] == [
%{
"bounce_rate" => 50,
"name" => "/aaa",
"pageviews" => 4,
"time_on_page" => 90.0,
"visitors" => 4
},
%{
"bounce_rate" => 100,
"name" => "/a",
"pageviews" => 1,
"time_on_page" => 10.0,
"visitors" => 1
}
]
end
end
describe "GET /api/stats/:domain/entry-pages" do
setup [:create_user, :log_in, :create_new_site, :create_legacy_site_import]
test "returns top entry pages by visitors", %{conn: conn, site: site} do
populate_stats(site, [
build(:pageview,
pathname: "/page1",
timestamp: ~N[2021-01-01 00:00:00]
),
build(:pageview,
pathname: "/page1",
timestamp: ~N[2021-01-01 00:00:00]
),
build(:pageview,
pathname: "/page2",
user_id: @user_id,
timestamp: ~N[2021-01-01 00:00:00]
),
build(:pageview,
pathname: "/page2",
user_id: @user_id,
timestamp: ~N[2021-01-01 00:15:00]
)
])
populate_stats(site, [
build(:pageview,
pathname: "/page2",
user_id: @user_id,
timestamp: ~N[2021-01-01 23:15:00]
)
])
conn = get(conn, "/api/stats/#{site.domain}/entry-pages?period=day&date=2021-01-01")
assert json_response(conn, 200)["results"] == [
%{
"visitors" => 2,
"visits" => 2,
"name" => "/page1",
"visit_duration" => 0
},
%{
"visitors" => 1,
"visits" => 2,
"name" => "/page2",
# Two visits by the same user: 900 s (00:00 to 00:15) and 0 s (23:15), averaging 450 s.
"visit_duration" => 450
}
]
end
test "returns top entry pages filtered by custom pageview props", %{conn: conn, site: site} do
populate_stats(site, [
build(:pageview,
pathname: "/blog",
user_id: 123,
timestamp: ~N[2021-01-01 00:00:00]
),
build(:pageview,
pathname: "/blog/john-1",
"meta.key": ["author"],
"meta.value": ["John Doe"],
user_id: 123,
timestamp: ~N[2021-01-01 00:01:00]
),
build(:pageview,
pathname: "/blog/john-2",
"meta.key": ["author"],
"meta.value": ["John Doe"],
timestamp: ~N[2021-01-01 00:00:00]
),
build(:pageview,
pathname: "/blog/other-post",
"meta.key": ["author"],
"meta.value": ["other"],
timestamp: ~N[2021-01-01 00:00:00]
),
build(:pageview,
pathname: "/blog",
timestamp: ~N[2021-01-01 00:00:00]
)
])
filters = Jason.encode!(%{props: %{"author" => "John Doe"}})
conn =
get(
conn,
"/api/stats/#{site.domain}/entry-pages?period=day&date=2021-01-01&filters=#{filters}"
)
assert json_response(conn, 200)["results"] == [
%{
"visitors" => 1,
"visits" => 1,
"name" => "/blog",
"visit_duration" => 60
},
%{
"visitors" => 1,
"visits" => 1,
"name" => "/blog/john-2",
"visit_duration" => 0
}
]
end
test "returns top entry pages by visitors with imported data", %{conn: conn, site: site} do
populate_stats(site, [
build(:pageview,
pathname: "/page1",
timestamp: ~N[2021-01-01 00:00:00]
),
build(:pageview,
pathname: "/page1",
timestamp: ~N[2021-01-01 00:00:00]
),
build(:pageview,
pathname: "/page2",
user_id: @user_id,
timestamp: ~N[2021-01-01 00:00:00]
),
build(:pageview,
pathname: "/page2",
user_id: @user_id,
timestamp: ~N[2021-01-01 00:15:00]
),
build(:pageview,
pathname: "/page2",
user_id: @user_id,
timestamp: ~N[2021-01-01 23:15:00]
)
])
populate_stats(site, [
build(:imported_entry_pages,
entry_page: "/page2",
date: ~D[2021-01-01],
entrances: 3,
visitors: 2,
visit_duration: 300
)
])
conn = get(conn, "/api/stats/#{site.domain}/entry-pages?period=day&date=2021-01-01")
assert json_response(conn, 200)["results"] == [
%{
"visitors" => 2,
"visits" => 2,
"name" => "/page1",
"visit_duration" => 0
},
%{
"visitors" => 1,
"visits" => 2,
"name" => "/page2",
"visit_duration" => 450
}
]
conn =
get(
conn,
"/api/stats/#{site.domain}/entry-pages?period=day&date=2021-01-01&with_imported=true"
)
assert json_response(conn, 200)["results"] == [
%{
"visitors" => 3,
"visits" => 5,
"name" => "/page2",
"visit_duration" => 240.0
},
%{
"visitors" => 2,
"visits" => 2,
"name" => "/page1",
"visit_duration" => 0
}
]
end
test "returns top entry pages by visitors filtered by hostname",
%{conn: conn, site: site} do
populate_stats(site, [
build(:pageview,
pathname: "/page1",
hostname: "en.example.com",
timestamp: ~N[2021-01-01 00:00:00]
),
build(:pageview,
pathname: "/page1",
hostname: "es.example.com",
timestamp: ~N[2021-01-01 00:00:00]
),
build(:pageview,
pathname: "/page2",
hostname: "en.example.com",
user_id: @user_id,
timestamp: ~N[2021-01-01 00:00:00]
),
build(:pageview,
pathname: "/page2",
hostname: "es.example.com",
user_id: @user_id,
timestamp: ~N[2021-01-01 00:15:00]
),
build(:pageview,
pathname: "/exit",
hostname: "es.example.com",
user_id: @user_id,
timestamp: ~N[2021-01-01 00:16:00]
),
build(:pageview,
pathname: "/page2",
hostname: "es.example.com",
timestamp: ~N[2021-01-01 23:15:00]
)
])
filters = Jason.encode!(%{"hostname" => "es.example.com"})
conn =
get(
conn,
"/api/stats/#{site.domain}/entry-pages?period=day&date=2021-01-01&filters=#{filters}"
)
# We're going to only join sessions where the exit hostname matches the filter
assert json_response(conn, 200)["results"] == [
%{"name" => "/page1", "visit_duration" => 0, "visitors" => 1, "visits" => 1},
%{"name" => "/page2", "visit_duration" => 0, "visitors" => 1, "visits" => 1}
]
end
test "bugfix: pagination on /pages filtered by goal", %{conn: conn, site: site} do
populate_stats(
site,
for i <- 1..30 do
build(:event,
user_id: i,
name: "Signup",
pathname: "/signup/#{String.pad_leading(to_string(i), 2, "0")}",
timestamp: ~N[2021-01-01 00:01:00]
)
end
)
insert(:goal, site: site, event_name: "Signup")
request = fn conn, opts ->
page = Keyword.fetch!(opts, :page)
limit = Keyword.fetch!(opts, :limit)
filters = Jason.encode!(%{"goal" => "Signup"})
conn
|> get(
"/api/stats/#{site.domain}/pages?date=2021-01-01&period=day&filters=#{filters}&limit=#{limit}&page=#{page}"
)
|> json_response(200)
|> Map.get("results")
|> Enum.map(fn %{"name" => "/signup/" <> seq} ->
seq
end)
end
assert List.first(request.(conn, page: 1, limit: 100)) == "01"
assert List.last(request.(conn, page: 1, limit: 100)) == "30"
assert List.last(request.(conn, page: 1, limit: 29)) == "29"
assert ["01", "02"] = request.(conn, page: 1, limit: 2)
assert ["03", "04"] = request.(conn, page: 2, limit: 2)
assert ["01", "02", "03", "04", "05"] = request.(conn, page: 1, limit: 5)
assert ["06", "07", "08", "09", "10"] = request.(conn, page: 2, limit: 5)
assert ["11", "12", "13", "14", "15"] = request.(conn, page: 3, limit: 5)
assert ["20"] = request.(conn, page: 20, limit: 1)
assert [] = request.(conn, page: 31, limit: 1)
end
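# Illustrative sketch, not part of the original suite: the pagination
# assertions above are consistent with a 1-based `page` mapped to an offset of
# (page - 1) * limit over results ordered by name. The `paginate` helper is
# hypothetical and only mirrors that arithmetic; it does not call the stats API.
test "sketch: page and limit map to an offset slice" do
  # Same zero-padded sequence the test above builds into its pathnames.
  seqs = for i <- 1..30, do: String.pad_leading(to_string(i), 2, "0")

  # Hypothetical helper: take `limit` items starting at the page's offset.
  paginate = fn list, page, limit -> Enum.slice(list, (page - 1) * limit, limit) end

  assert paginate.(seqs, 1, 2) == ["01", "02"]
  assert paginate.(seqs, 2, 5) == ["06", "07", "08", "09", "10"]
  assert paginate.(seqs, 20, 1) == ["20"]
  # A page past the end of the data yields an empty list.
  assert paginate.(seqs, 31, 1) == []
end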
test "calculates conversion_rate when filtering for goal", %{conn: conn, site: site} do
populate_stats(site, [
build(:pageview,
pathname: "/page1",
user_id: 1,
timestamp: ~N[2021-01-01 00:00:00]
),
build(:pageview,
pathname: "/page1",
user_id: 2,
timestamp: ~N[2021-01-01 00:00:00]
),
build(:event,
name: "Signup",
user_id: 1,
timestamp: ~N[2021-01-01 00:00:00]
),
build(:pageview,
pathname: "/page2",
user_id: 3,
timestamp: ~N[2021-01-01 00:00:00]
),
build(:pageview,
pathname: "/page2",
user_id: 3,
timestamp: ~N[2021-01-01 00:15:00]
),
build(:event,
name: "Signup",
user_id: 3,
timestamp: ~N[2021-01-01 00:15:00]
)
])
insert(:goal, site: site, event_name: "Signup")
filters = Jason.encode!(%{"goal" => "Signup"})
conn =
get(
conn,
"/api/stats/#{site.domain}/entry-pages?period=day&date=2021-01-01&filters=#{filters}"
)
assert json_response(conn, 200)["results"] == [
%{
"total_visitors" => 2,
"visitors" => 1,
"name" => "/page1",
"conversion_rate" => 50.0
},
%{
"total_visitors" => 1,
"visitors" => 1,
"name" => "/page2",
"conversion_rate" => 100.0
}
]
end
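# Illustrative sketch, not part of the original suite: the expected values
# above are consistent with conversion_rate = visitors / total_visitors * 100.
# The helper and the rounding to one decimal place are assumptions made for
# illustration; the real computation lives in the stats query code.
test "sketch: conversion_rate arithmetic behind the expectation above" do
  # Hypothetical helper: converting visitors as a percentage of all visitors.
  conversion_rate = fn visitors, total_visitors ->
    Float.round(visitors / total_visitors * 100, 1)
  end

  # /page1: two visitors entered there, one completed the Signup goal.
  assert conversion_rate.(1, 2) == 50.0
  # /page2: the only visitor who entered there completed the goal.
  assert conversion_rate.(1, 1) == 100.0
end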
test "ignores entry pages from sessions with only custom events", %{conn: conn, site: site} do
populate_stats(site, [
build(:event,
name: "Signup",
timestamp: ~N[2021-01-01 00:15:00],
pathname: "/"
)
])
conn =
get(
conn,
"/api/stats/#{site.domain}/entry-pages?period=day&date=2021-01-01"
)
assert json_response(conn, 200)["results"] == []
end
test "filter by :matches_member entry_page with imported data", %{conn: conn, site: site} do
site_import = insert(:site_import, site: site)
populate_stats(site, site_import.id, [
build(:pageview, pathname: "/aaa", timestamp: ~N[2021-01-01 12:00:00]),
build(:pageview, pathname: "/a", timestamp: ~N[2021-01-01 12:00:00]),
build(:pageview, pathname: "/ignored", timestamp: ~N[2021-01-01 12:01:00]),
build(:imported_entry_pages,
entry_page: "/a",
visitors: 5,
entrances: 9,
visit_duration: 1000,
date: ~D[2021-01-01]
),
build(:imported_entry_pages,
entry_page: "/bbb",
visitors: 2,
entrances: 2,
visit_duration: 100,
date: ~D[2021-01-01]
)
])
filters = Jason.encode!(%{"entry_page" => "/a**|/b**"})
q = "?period=day&date=2021-01-01&filters=#{filters}&detailed=true&with_imported=true"
conn = get(conn, "/api/stats/#{site.domain}/entry-pages#{q}")
assert json_response(conn, 200)["results"] == [
%{
"visit_duration" => 100.0,
"name" => "/a",
"visits" => 10,
"visitors" => 6
},
%{
"visit_duration" => 50.0,
"name" => "/bbb",
"visits" => 2,
"visitors" => 2
},
%{
"visit_duration" => 0,
"name" => "/aaa",
"visits" => 1,
"visitors" => 1
}
]
end
end
describe "GET /api/stats/:domain/exit-pages" do
setup [:create_user, :log_in, :create_new_site, :create_legacy_site_import]
test "returns top exit pages by visitors", %{conn: conn, site: site} do
populate_stats(site, [
build(:pageview,
pathname: "/page1",
timestamp: ~N[2021-01-01 00:00:00]
),
build(:pageview,
pathname: "/page1",
timestamp: ~N[2021-01-01 00:00:00]
),
build(:pageview,
pathname: "/page1",
user_id: @user_id,
timestamp: ~N[2021-01-01 00:00:00]
),
build(:pageview,
pathname: "/page2",
user_id: @user_id,
timestamp: ~N[2021-01-01 00:15:00]
)
])
conn = get(conn, "/api/stats/#{site.domain}/exit-pages?period=day&date=2021-01-01")
assert json_response(conn, 200)["results"] == [
%{"name" => "/page1", "visitors" => 2, "visits" => 2, "exit_rate" => 66},
%{"name" => "/page2", "visitors" => 1, "visits" => 1, "exit_rate" => 100}
]
end
test "returns top exit pages by visitors filtered by hostname",
%{conn: conn, site: site} do
populate_stats(site, [
build(:pageview,
pathname: "/page1",
hostname: "en.example.com",
timestamp: ~N[2021-01-01 00:00:00]
),
build(:pageview,
pathname: "/page1",
hostname: "es.example.com",
timestamp: ~N[2021-01-01 00:00:00]
),
build(:pageview,
pathname: "/page1",
hostname: "en.example.com",
user_id: @user_id,
timestamp: ~N[2021-01-01 00:00:00]
),
build(:pageview,
pathname: "/page2",
hostname: "es.example.com",
user_id: @user_id,
timestamp: ~N[2021-01-01 00:15:00]
),
build(:pageview,
pathname: "/exit",
hostname: "en.example.com",
user_id: @user_id,
timestamp: ~N[2021-01-01 00:16:00]
)
])
filters = Jason.encode!(%{hostname: "es.example.com"})
conn =
get(
conn,
"/api/stats/#{site.domain}/exit-pages?period=day&date=2021-01-01&filters=#{filters}"
)
# We're going to only join sessions where the entry hostname matches the filter
assert json_response(conn, 200)["results"] ==
[%{"name" => "/page1", "visitors" => 1, "visits" => 1}]
end
test "returns top exit pages filtered by custom pageview props", %{conn: conn, site: site} do
populate_stats(site, [
build(:pageview,
pathname: "/blog/john-1",
"meta.key": ["author"],
"meta.value": ["John Doe"],
user_id: 123,
timestamp: ~N[2021-01-01 00:00:00]
),
build(:pageview,
pathname: "/",
user_id: 123,
timestamp: ~N[2021-01-01 00:01:00]
),
build(:pageview,
pathname: "/blog/other-post",
"meta.key": ["author"],
"meta.value": ["other"],
timestamp: ~N[2021-01-01 00:00:00]
),
build(:pageview,
pathname: "/",
timestamp: ~N[2021-01-01 00:00:00]
)
])
filters = Jason.encode!(%{props: %{"author" => "John Doe"}})
conn =
get(
conn,
"/api/stats/#{site.domain}/exit-pages?period=day&date=2021-01-01&filters=#{filters}"
)
assert json_response(conn, 200)["results"] == [
%{"name" => "/", "visitors" => 1, "visits" => 1}
]
end
test "returns top exit pages by visitors with imported data", %{conn: conn, site: site} do
populate_stats(site, [
build(:pageview,
pathname: "/page1",
timestamp: ~N[2021-01-01 00:00:00]
),
build(:pageview,
pathname: "/page1",
timestamp: ~N[2021-01-01 00:00:00]
),
build(:pageview,
pathname: "/page1",
user_id: @user_id,
timestamp: ~N[2021-01-01 00:00:00]
),
build(:pageview,
pathname: "/page2",
user_id: @user_id,
timestamp: ~N[2021-01-01 00:15:00]
)
])
populate_stats(site, [
build(:imported_pages,
page: "/page2",
date: ~D[2021-01-01],
pageviews: 4,
visitors: 2
),
build(:imported_exit_pages,
exit_page: "/page2",
date: ~D[2021-01-01],
exits: 3,
visitors: 2
)
])
conn = get(conn, "/api/stats/#{site.domain}/exit-pages?period=day&date=2021-01-01")
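# Native data only: "/page1" has 3 pageviews and 2 exits (exit rate 66),
# "/page2" has 1 pageview and 1 exit (exit rate 100).
assert json_response(conn, 200)["results"] == [
%{"name" => "/page1", "visitors" => 2, "visits" => 2, "exit_rate" => 66},
%{"name" => "/page2", "visitors" => 1, "visits" => 1, "exit_rate" => 100}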
]
conn =
get(
conn,
"/api/stats/#{site.domain}/exit-pages?period=day&date=2021-01-01&with_imported=true"
)
assert json_response(conn, 200)["results"] == [
2022-03-11 00:04:59 +03:00
%{
"name" => "/page2",
"visitors" => 3,
"visits" => 4,
"exit_rate" => 80.0
},
%{"name" => "/page1", "visitors" => 2, "visits" => 2, "exit_rate" => 66}
]
end
test "calculates correct exit rate and conversion_rate when filtering for goal", %{
conn: conn,
site: site
} do
populate_stats(site, [
build(:event,
name: "Signup",
user_id: 1,
timestamp: ~N[2021-01-01 00:00:00]
),
build(:pageview,
user_id: 1,
pathname: "/exit1",
timestamp: ~N[2021-01-01 00:00:00]
),
build(:event,
name: "Signup",
user_id: 2,
timestamp: ~N[2021-01-01 00:00:00]
),
build(:pageview,
user_id: 2,
pathname: "/exit1",
timestamp: ~N[2021-01-01 00:00:00]
),
build(:pageview,
user_id: 2,
pathname: "/exit2",
timestamp: ~N[2021-01-01 00:00:00]
)
])
insert(:goal, site: site, event_name: "Signup")
filters = Jason.encode!(%{"goal" => "Signup"})
conn =
get(
conn,
"/api/stats/#{site.domain}/exit-pages?period=day&date=2021-01-01&filters=#{filters}"
)
assert json_response(conn, 200)["results"] == [
%{
"name" => "/exit1",
"visitors" => 1,
"total_visitors" => 1,
"conversion_rate" => 100.0
},
%{
"name" => "/exit2",
"visitors" => 1,
"total_visitors" => 1,
"conversion_rate" => 100.0
}
]
end
test "calculates correct exit rate when filtering for page", %{conn: conn, site: site} do
populate_stats(site, [
build(:pageview,
user_id: 1,
pathname: "/exit1",
timestamp: ~N[2021-01-01 00:00:00]
),
build(:pageview,
user_id: 2,
pathname: "/exit1",
timestamp: ~N[2021-01-01 00:00:00]
),
build(:pageview,
user_id: 2,
pathname: "/exit2",
timestamp: ~N[2021-01-01 00:00:00]
),
build(:pageview,
user_id: 3,
pathname: "/exit2",
timestamp: ~N[2021-01-01 00:00:00]
),
build(:pageview,
user_id: 3,
pathname: "/should-not-appear",
timestamp: ~N[2021-01-01 00:00:00]
)
])
filters = Jason.encode!(%{"page" => "/exit1"})
conn =
get(
conn,
"/api/stats/#{site.domain}/exit-pages?period=day&date=2021-01-01&filters=#{filters}"
)
assert json_response(conn, 200)["results"] == [
%{"name" => "/exit1", "visitors" => 1, "visits" => 1},
%{"name" => "/exit2", "visitors" => 1, "visits" => 1}
]
end
test "ignores exit pages from sessions with only custom events", %{conn: conn, site: site} do
populate_stats(site, [
build(:event,
name: "Signup",
timestamp: ~N[2021-01-01 00:15:00],
pathname: "/"
)
])
conn =
get(
conn,
"/api/stats/#{site.domain}/exit-pages?period=day&date=2021-01-01"
)
assert json_response(conn, 200)["results"] == []
end
test "filter by :is_not exit_page with imported data", %{conn: conn, site: site} do
site_import = insert(:site_import, site: site)
populate_stats(site, site_import.id, [
build(:pageview, pathname: "/aaa", timestamp: ~N[2021-01-01 12:00:00]),
build(:pageview, pathname: "/a", timestamp: ~N[2021-01-01 12:00:00]),
build(:pageview, pathname: "/ignored", timestamp: ~N[2021-01-01 12:01:00]),
build(:imported_exit_pages,
exit_page: "/a",
visitors: 5,
exits: 9,
visit_duration: 1000,
date: ~D[2021-01-01]
),
build(:imported_exit_pages,
exit_page: "/bbb",
visitors: 2,
exits: 2,
visit_duration: 100,
date: ~D[2021-01-01]
),
build(:imported_pages, page: "/a", pageviews: 19, date: ~D[2021-01-01]),
build(:imported_pages, page: "/bbb", pageviews: 2, date: ~D[2021-01-01])
])
filters = Jason.encode!(%{"exit_page" => "!/ignored"})
q = "?period=day&date=2021-01-01&filters=#{filters}&detailed=true&with_imported=true"
conn = get(conn, "/api/stats/#{site.domain}/exit-pages#{q}")
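# Expected rows combine native and imported data. For "/a":
# visits = 9 imported exits + 1 native exit = 10, visitors = 5 + 1 = 6,
# pageviews = 19 imported + 1 native = 20, so exit_rate = 10 / 20 = 50.0.
assert json_response(conn, 200)["results"] == [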
%{
"exit_rate" => 50.0,
"name" => "/a",
"visits" => 10,
"visitors" => 6
},
%{
"exit_rate" => 100.0,
"name" => "/bbb",
"visits" => 2,
"visitors" => 2
},
%{
"exit_rate" => 100.0,
"name" => "/aaa",
"visits" => 1,
"visitors" => 1
}
]
end
end
end