Commit Graph

22 Commits

Author SHA1 Message Date
Vini Brasil
4503895d0a
Fix breakdown API pagination when using event metrics (#2562)
* Fix breakdown API pagination when using event metrics

This commit fixes a bug where the subsequent breakdown API pages had
the same items as the first page. The fix sorts the underlying
ClickHouse query by timestamp, keeping the same order between requests,
as we use OFFSET/LIMIT pagination.

* Fix repeated results assertion

* Add different ORDER BY to each breakdown property
2023-01-04 22:14:40 -03:00
Adam Rutkowski
0fa6b688af
Google APIs integration improvements (#2358)
* Make TestUtils module available in all tests

* Add macros patching the application env in tests

Unfortunately a lot of existing functionality relies on
certain application env setup. This isn't ideal because
the app config is a shared state that prevents us from
running the tests in parallel.

Those macros encapsulate setting up new env for test purposes
and make sure the changes are reverted when the test finishes.

* Allow passing request opts to HTTPClient.post/4

We need this to swap custom request building in
Google Analytics import.

* Unify errors when listing sites

* React: propagate backend error messages if available

* React: catch API errors in Search Terms component

* Propagate google API errors on referrer drilldown

* Handle verified properties errors in SC settings

* Add missing tests for SC settings controller

* Unify errors for fetching search analytics queries (list stats)

* Unify errors refreshing Google Auth Token

* Test fetch_stats/3 errors and replace Double with Mox

* Fixup makrup

* s/class/className

* Simplify Search Terms display in case of errors

* Fix warnings
2022-10-24 09:34:02 +02:00
Uku Taht
669091f2ef
Ignore unknown country in imported data (#2247) 2022-09-20 15:02:14 +03:00
Uku Taht
b239f73a6d
Ignore unknown country code (#2223)
* Ignore unknown country code

* Add changelog entry
2022-09-16 11:02:39 +03:00
Vinicius Brasil
b5ea6ae3dc
Keep user filter when listing cities, countries, and regions stats (#2030)
This commit fixes a bug where location filters were filtering stats but
not the locations list. This was caused by a `Map.put/3` call that
overrides the user filter. This commit rollbacks 5b57143273
changes and removes the `Map.put/3` call.

Closes #1982
2022-07-25 12:19:38 +03:00
RobertJoonas
40275b64d4
Pageview custom dimensions (#1816)
* added custom dimension filtering tests for pages

* first filter UI in place

* pages, entry pages and exit pages can be filtered by pageview props

* added tests for expected filtering behaviour

* fix dimension filter for sources + tests

* added is_not filtering functionality

* fixed formatting

* fixed admin_test

* added (none) as filter value + is_not filter type in UI

* added prefilling applied filter values and some UI tweaks

* added fetch options

* Make prop suggestions work with `props` filter

* Fix test

* Track login state internally

* Add CHANGELOG entry

Co-authored-by: Uku Taht <uku.taht@gmail.com>
2022-04-21 11:47:15 +03:00
Uku Taht
e27734ed79
[Continued] Google Analytics import (#1753)
* Add has_imported_stats boolean to Site

* Add Google Analytics import panel to general settings

* Get GA profiles to display in import settings panel

* Add import_from_google method as entrypoint to import data

* Add imported_visitors table

* Remove conflicting code from migration

* Import visitors data into clickhouse database

* Pass another dataset to main graph for rendering in red

This adds another entry to the JSON data returned via the main graph API
called `imported_plot`, which is similar to `plot` in form but will be
completed with previously imported data.  Currently it simply returns
the values from `plot` / 2. The data is rendered in the main graph in
red without fill, and without an indicator for the present. Rationale:
imported data will not continue to grow so there is no projection
forward, only backwards.

* Hook imported GA data to dashboard timeseries plot

* Add settings option to forget imported data

* Import sources from google analytics

* Merge imported sources when queried

* Merge imported source data native data when querying sources

* Start converting metrics to atoms so they can be subqueried

This changes "visitors" and in some places "sources" to atoms. This does
not change the behaviour of the functions - the tests all pass unchanged
following this commit. This is necessary as joining subqueries requires
that the keys in `select` statements be atoms and not strings.

* Convery GA (direct) source to empty string

* Import utm campaign and utm medium from GA

* format

* Import all data types from GA into new tables

* Handle large amounts of more data more safely

* Fix some mistakes in tables

* Make GA requests in chunks of 5 queries

* Only display imported timeseries when there is no filter

* Correctly show last 30 minutes timeseries when 'realtime'

* Add with_imported key to Query struct

* Account for injected :is_not filter on sources from dashboard

* Also add tentative imported_utm_sources table

This needs a bit more work on the google import side, as GA do not
report sources and utm sources as distinct things.

* Return imported data to dashboard for rest of Sources panel

This extends the merge_imported function definition for sources to
utm_sources, utm_mediums and utm_campaigns too. This appears to be
working on the DB side but something is incomplete on the client side.

* Clear imported stats from all tables when requested

* Merge entry pages and exit pages from imported data into unfiltered dashboard view

This requires converting the `"visits"` and `"visit_duration"` metrics
to atoms so that they can be used in ecto subqueries.

* Display imported devices, browsers and OSs on dashboard

* Display imported country data on dashboard

* Add more metrics to entries/exits for modals

* make sure data is returned via API with correct keys

* Import regions and cities from GA

* Capitalize device upon import to match native data

* Leave query limits/offsets until after possibly joining with imported data

* Also import timeOnPage and pageviews for pages from GA

* imported_countries -> imported_locations

* Get timeOnPage and pageviews for pages from GA

These are needed for the pages modal, and for calculating exit rates for
exit pages.

* Add indicator to dashboard when imported data is being used

* Don't show imported data as separately line on main graph

* "bounce_rate" -> :bounce_rate, so it works in subqueries

* Drop imported browser and OS versions

These are not needed.

* Toggle displaying imported data by clicking indicator

* Parse referrers with RefInspector

- Use 'ga:fullReferrer' instead of 'ga:source'. This provides the actual
  referrer host + path, whereas 'ga:source' includes utm_mediums and
  other values when relevant.
- 'ga:fullReferror' does however include search engine names directly,
  so they are manually checked for as RefInspector won't pick up on
  these.

* Keep imported data indicator on dashboard and strikethrough when hidden

* Add unlink google button to import panel

* Rename some GA browsers and OSes to plausible versions

* Get main top pages and exit pages panels working correctly with imported data

* mix format

* Fetch time_on_pages for imported data when needed

* entry pages need to fetch bounces from GA

* "sample_percent" -> :sample_percent as only atoms can be used in subqueries

* Calculate bounce_rate for joined native and imported data for top pages modal

* Flip some query bindings around to be less misleading

* Fixup entry page modal visit durations

* mix format

* Fetch bounces and visit_duration for sources from GA

* add more source metrics used for data in modals

* Make sources modals display correct values

* imported_visitors: bounce_rate -> bounces, avg_visit_duration -> visit_duration

* Merge imported data into aggregate stats

* Reformat top graph side icons

* Ensure sample_percent is yielded from aggregate data

* filter event_props should be strings

* Hide imported data from frontend when using filter

* Fix existing tests

* fix tests

* Fix imported indicator appearing when filtering

* comma needed, lost when rebasing

* Import utm_terms and utm_content from GA

* Merge imported utm_term and utm_content

* Rename imported Countries data as Locations

* Set imported city schema field to int

* Remove utm_terms and utm_content when clearing imported

* Clean locations import from Google Analytics

- Country and region should be set to "" when GA provides "(not set)"
- City should be set to 0 for "unknown", as we cannot reliably import
  city data from GA.

* Display imported region and city in dashboard

* os -> operating_system in some parts of code

The inconsistency of using os in some places and operating_system in
others causes trouble with subqueries and joins for the native and
imported data, which would require additional logic to account for. The
simplest solution is the just use a consistent word for all uses. This
doesn't make any user-facing or database changes.

* to_atom -> to_existing_atom

* format

* "events" metric -> :events

* ignore imported data when "events" in metrics

* update "bounce_rate"

* atomise some more metrics from new city and region api

* atomise some more metrics for email handlers

* "conversion_rate" -> :conversion_rate during csv export

* Move imported data stats code to own module

* Move imported timeseries function to Stats.Imported

* Use Timex.parse to import dates from GA

* has_imported_stats -> imported_source

* "time_on_page" -> :time_on_page

* Convert imported GA data to UTC

* Clean up GA request code a bit

There was some weird logic here with two separate lists that really
ought to be together, so this merges those.

* Fail sooner if GA timezone can't be identified

* Link imported tables to site by id

* imported_utm_content -> imported_utm_contents

* Imported GA from all of time

* Reorganise GA data fetch logic

- Fetch data from the start of time (2005)
- Check whether no data was fetched, and if so, inform user and don't
  consider data to be imported.

* Clarify removal of "visits" data when it isn't in metrics

* Apply location filters from API

This makes it consistent with the sources etc which filter out 'Direct /
None' on the API side. These filters are used by both the native and
imported data handling code, which would otherwise both duplicate the
filters in their `where` clauses.

* Do not use changeset for setting site.imported_source

* Add all metrics to all dimensions

* Run GA import in the background

* Send email when GA import completes

* Add handler to insert imported data into tests and imported_browsers_factory

* Add remaining import data test factories

* Add imported location data to test

* Test main graph with imported data

* Add imported data to operating systems tests

* Add imported data to pages tests

* Add imported data to entry pages tests

* Add imported data to exit pages tests

* Add imported data to devices tests

* Add imported data to sources tests

* Add imported data to UTM tests

* Add new test module for the data import step

* Test import of sources GA data

* Test import of utm_mediums GA data

* Test import of utm_campaigns GA data

* Add tests for UTM terms

* Add tests for UTM contents

* Add test for importing pages and entry pages data from GA

* Add test for importing exit page data

* Fix module file name typo

* Add test for importing location data from GA

* Add test for importing devices data from GA

* Add test for importing browsers data from GA

* Add test for importing OS data from GA

* Paginate GA requests to download all data

* Bump clickhouse_ecto version

* Move RefInspector wrapper function into module

* Drop timezone transform on import

* Order imported by side_id then date

* More strings -> atoms

Also changes a conditional to be a bit nicer

* Remove parallelisation of data import

* Split sources and UTM sources from fetched GA data

GA has only a "source" dimension and no "UTM source" dimension. Instead
it returns these combined. The logic herein to tease these apart is:

1. "(direct)" -> it's a direct source
2. if the source is a domain -> it's a source
3. "google" -> it's from adwords; let's make this a UTM source "adwords"
4. else -> just a UTM source

* Keep prop names in queries as strings

* fix typo

* Fix import

* Insert data to clickhouse in batches

* Fix link when removing imported data

* Merge source tables

* Import hostname as well as pathname

* Record start and end time of imported data

* Track import progress

* Fix month interval with imported data

* Do not JOIN when imported date range has no overlap

* Fix time on page using exits

Co-authored-by: mcol <mcol@posteo.net>
2022-03-10 15:04:59 -06:00
Uku Taht
1dba113e2f
[Draft] Improve location translations (#1526)
* WIP

* Use location library for search suggestions

* Remove unused code

* Remove Countries completely

* Fix tests
2021-12-13 12:03:27 +02:00
Uku Taht
effe56b3e4 Use new iso_codes package 2021-12-01 15:31:50 +02:00
Uku Taht
2bdfec1cc0
JS refactor: use generic ListReport for country report (#1487)
* Use ListReport for countries

* Fix countries tests

* Replace Browsers with ListReport

* Use Listreport for OS and screen size
2021-11-25 12:00:17 +02:00
Matt Colligan
7faa2f6673
CSV export output conversions and conversion rate when filtering for goal (#1464)
* Only add percentages to dashboard data when not filtering goal

* Correctly name CSV headers when exporting conversion data

* Remove percentages from tests when filtering for goal
2021-11-12 15:18:35 +02:00
Matt Colligan
f576fa2a2c
Improvements to CSV export (#1427)
* Add details=True to export API parameters

This makes the ZIP export add `%{"details" => "True"}` to the query's
`params` when fetching data internally for packaging in the ZIP.

This adds bounce_rate and time_on_page to the data in pages.csv, and
bounce_rate and visit_duration to sources.csv.

* Make API return data with consistent names

Some of the data types returned via the JSON or CSV API use inconsistent
naming, and some have redundant name changes (i.e. count -> visitors ->
count). This makes these all consistent and removes the redundancy.

This addresses #1426, fixes some of the CSV headers, and unifies the
JSON and CSV return data labels.

* Update changelog

* Test should use Timex.shift, not relative time

* Return full country names in CSV export

This also replaces the " character with ' in two country names, as those
are the characters used in the names, yielding a more predictable and
'correct' output.

* Fetch CSV exported data concurrently

* Use spinner to indicate when export has started

* Use 300 as default number of brekadown entries for export

Higher numbers (e.g. 1000) seem to cause clickhouse errors when there
many pages to request. It is unclear what is causing the error, as
clickhouse returns an "unknown" error code and an empty error message.
2021-11-04 14:20:39 +02:00
Uku Taht
bd7de59a9c Show total visitors when filtered for goal 2021-09-29 13:28:29 +02:00
Ro Savage
b3bc796d50
Add conversion_rate to sources api and source table (#1299)
* Add conversion_rate to sources api and source table

* Remove percentageFormatter

* Update source tests to include conversionat rate

* Add CR to detals modal

* Correct formatting with linter

* Add change log

* Add CR to Pages, Device and Countries panels
2021-09-20 16:17:11 +02:00
Uku Taht
7d1b1214e2 Implement dashboard with new stats module 2021-08-02 15:46:30 +03:00
Vignesh Joglekar
ff32218bd0
Adds entry and exit pages to Top Pages module (#712)
* Initial Pass

* Adds support for page visits counting by referrer

* Includes goal selection in entry and exit computation

* Adds goal-based entry and exit page stats, formatting, code cleanup

* Changelog

* Format

* Exit rate, visit duration, updated tests

* I keep forgetting to format :/

* Tests, last time

* Fixes double counting, exit rate >100%, relevant tests

* Fixes exit pages on filter and goal states

* Adds entry and exit filters, fixes various bugs

* Fixes discussed issues

* Format

* Fixes impossible case in tests

Originally, there were only 2 pageviews for `test-site.com`,`/` on `2019-01-01`, but that doesn't make sense when there were 3 sessions that exited on the same site/date.

* Format

* Removes boolean function parameter in favor of separate function

* Adds support for queries that use `page` filter as `entry-page`

* Format

* Makes  loader/title interaction in sources report consistent
2021-02-26 11:02:37 +02:00
Chandra Tungathurthi
b4b7532f07
Formatting only changes - No code change (#75)
* first commit with test and compile job

Signed-off-by: Chandra Tungathurthi <tckb@tgrthi.me>

* adding 'prepare' stage

Signed-off-by: Chandra Tungathurthi <tckb@tgrthi.me>

* updated ci script to include "test" compile phase

Signed-off-by: Chandra Tungathurthi <tckb@tgrthi.me>

* adding environment variables for connecting to postgresql

Signed-off-by: Chandra Tungathurthi <tckb@tgrthi.me>

* updated ci config for postgres

Signed-off-by: Chandra Tungathurthi <tckb@tgrthi.me>

* using non-alpine version of elixir

Signed-off-by: Chandra Tungathurthi <tckb@tgrthi.me>

* re-using the 'compile' artifacts and added explict env variables for testing

Signed-off-by: Chandra Tungathurthi <tckb@tgrthi.me>

* removing redundant deps fetching from common code

Signed-off-by: Chandra Tungathurthi <tckb@tgrthi.me>

* formatting using mix.format -- beware no-code changes!

Signed-off-by: Chandra Tungathurthi <tckb@tgrthi.me>

* added release config

Signed-off-by: Chandra Tungathurthi <tckb@tgrthi.me>

* adding consistent env variable for Database

Signed-off-by: Chandra Tungathurthi <tckb@tgrthi.me>

* more cleaning up of environment variables

Signed-off-by: Chandra Tungathurthi <tckb@tgrthi.me>

* Adding releases config for enabling releases

Signed-off-by: Chandra Tungathurthi <tckb@tgrthi.me>

* cleaning up env configs

Signed-off-by: Chandra Tungathurthi <tckb@tgrthi.me>

* Cleaned up config and prepared config for releases

Signed-off-by: Chandra Tungathurthi <tckb@tgrthi.me>

* updated CI script with new config for test

Signed-off-by: Chandra Tungathurthi <tckb@tgrthi.me>

* Added Dockerfile for creating production docker image

Signed-off-by: Chandra Tungathurthi <tckb@tgrthi.me>

* Adding "docker" build job yay!

Signed-off-by: Chandra Tungathurthi <tckb@tgrthi.me>

* using non-slim version of debian and installing webpack

Signed-off-by: Chandra Tungathurthi <tckb@tgrthi.me>

* Adding overlays for migrations on releases

Signed-off-by: Chandra Tungathurthi <tckb@tgrthi.me>

* restricting the docker built to master branch only

Signed-off-by: Chandra Tungathurthi <tckb@tgrthi.me>

* typo fix

Signed-off-by: Chandra Tungathurthi <tckb@tgrthi.me>

* adding "Hosting.md" to explain hosting instructions

Signed-off-by: Chandra Tungathurthi <tckb@tgrthi.me>

* removed the default comments

Signed-off-by: Chandra Tungathurthi <tckb@tgrthi.me>

* Added documentation related to env variables

* updated documentation and fixed typo

Signed-off-by: Chandra Tungathurthi <tckb@tgrthi.me>

* updated documentation

* Bumping up elixir version as `overlays` are only supported in latest version

read release notes: https://github.com/elixir-lang/elixir/releases/tag/v1.10.0

Signed-off-by: Chandra Tungathurthi <tckb@tgrthi.me>

* Adding tarball assembly during release

Signed-off-by: Chandra Tungathurthi <tckb@tgrthi.me>

* updated HOSTING.md

Signed-off-by: Chandra Tungathurthi <tckb@tgrthi.me>

* Added support for db migration

Signed-off-by: Chandra Tungathurthi <tckb@tgrthi.me>

* minor corrections

Signed-off-by: Chandra Tungathurthi <tckb@tgrthi.me>

* initializing admin user

Admin user has been added in the "migration" phase. A default user is automatically created in the process. One can provide the related env variables, else a new one will be automatically created for you.

Signed-off-by: Chandra Tungathurthi <tckb@tgrthi.me>

* Initial base domain update - phase#1

These changes are only meant for correct operating it under self-hosting. There are many other cosmetic changes, that require updates to email, site and other places where the original website and author is used.

Signed-off-by: Chandra Tungathurthi <tckb@tgrthi.me>

* Using dedicated config variable `base_domain` instead

Signed-off-by: Chandra Tungathurthi <tckb@tgrthi.me>

* adding base_domain to releases config

Signed-off-by: Chandra Tungathurthi <tckb@tgrthi.me>

* removing the dedicated config "base_domain", relying on endpoint host

Signed-off-by: Chandra Tungathurthi <tckb@tgrthi.me>

* Removed the usage of "Mix" in code!

It is bad practice to use "mix" module inside the code as in actual release this module is unavailable. Replacing this with a config environment variable

Signed-off-by: Chandra Tungathurthi <tckb@tgrthi.me>

* Added support for SMTP via Bamboo Smtp Adapter

Signed-off-by: Chandra Tungathurthi <tckb@tgrthi.me>

* Capturing SMTP errors via Sentry

Signed-off-by: Chandra Tungathurthi <tckb@tgrthi.me>

* Minor updates

Signed-off-by: Chandra Tungathurthi <tckb@tgrthi.me>

* Adding junit formatter -- useful for generating test reports

Signed-off-by: Chandra Tungathurthi <tckb@tgrthi.me>

* adding documentation for default user

* Resolve "Gitlab Adoption: Add supported services in "Security & Compliance""

* bumping up the debian version to fix issues

fixing some vulnerabilities identified by the scanning tools

* More updates for self-hosting

Changes in most of the places to suit self-hosting. Although, there are some which have been left-off.

Signed-off-by: Chandra Tungathurthi <tckb@tgrthi.me>

* quick-dirty-fix!

* bumping up the db connect timeout

Signed-off-by: Chandra Tungathurthi <tckb@tgrthi.me>

* bumping up the db connect timeout

Signed-off-by: Chandra Tungathurthi <tckb@tgrthi.me>

* bumping up the db connect timeout

Signed-off-by: Chandra Tungathurthi <tckb@tgrthi.me>

* bumping up timeout - skipping MRs :-/

* removing restrictions on watching for changes

this stuff isn't working

* Update HOSTING.md

* renamed the module name

* reverting formatting-whitespace changes

Signed-off-by: Chandra Tungathurthi <tckb@tgrthi.me>

* reverting the name to release

Signed-off-by: Chandra Tungathurthi <tckb@tgrthi.me>

* adding docker-compose.yml and related instructions

Signed-off-by: Chandra Tungathurthi <tckb@tgrthi.me>

* using `plausible_url` instead of assuming `https`

this is because, it is much to test in local dev machines and in most cases there's already a layer above which is capable for `https` termination and http -> https upgrade

Signed-off-by: Chandra Tungathurthi <tckb@tgrthi.me>

* WIP: merging changes from upstream

Signed-off-by: Chandra Tungathurthi <tckb@tgrthi.me>

* wip: more changes

* Pushing in changes from upstream

Signed-off-by: Chandra Tungathurthi <tckb@tgrthi.me>

* changes to ci for testing

Signed-off-by: Chandra Tungathurthi <tckb@tgrthi.me>

* cleaning up and finishing clickhouse integration

Signed-off-by: Chandra Tungathurthi <tckb@tgrthi.me>

* updating readme with hosting details

* removing deleted files from upstream

* minor config adjustments

Signed-off-by: Chandra Tungathurthi <tckb@tgrthi.me>

* formatting changes

Signed-off-by: Chandra Tungathurthi <tckb@tgrthi.me>
2020-06-08 10:35:13 +03:00
Uku Taht
d1b703f007
Clickhouse (#63)
* Get stats from clickhosue

* Pull stats from clickhouse

* Use correct Query namespace

* Use Clickhouse in unit tests

* Use Clickhouse in stats controller tests

* Use fixtures for unit tests

* Add Clickhouse to travis

* Use Clickhouse session store for sessions

* Add garbage collection to session store

* Reload session state from Clickhouse on server restart

* Query from sessions table

* Trap exits in event write buffer

* Run hydration without starting the whole app

* Make session length 30 minutes

* Revert changes to fingerprint schema

* Remove clickhouse from fingerprint sessions

* Flush buffers before shutdown

* Use old stats when merging

* Remove old session schema

* Fix tests with CH sessions

* Add has_pageviews? to Stats
2020-05-18 12:44:52 +03:00
Uku Taht
e14001664c Fix tests 2020-03-31 15:01:27 +03:00
Uku Taht
8a4d0a91d1
WIP: New UI updates (#19)
* WIP: New UI idea

* Remove hover styling if things are not clickable

* Improve mobile view

* Remove dead code

* Fade in tabs

* Update landing page with new style

* Fix countries test

* Fix alignment in conversions report
2020-02-10 15:17:00 +02:00
Uku Taht
7dbbc8ba22
Configurable site id (#30)
* Use site id instead of hostname for events

* Use site id in domain status check

* Revert change to tracking module

* Catch more places where link generation needed updating

* Rename site_id to domain

* Drop hostname index from events
2020-02-04 15:44:13 +02:00
Uku Taht
e8f20e67cc
React (#17)
* Load dashboard with react

* Rename stast2 to dashboard

* Save timeframe on the frontend

* Implement current visitors

* Implement comparisons

* React to route changes

* Add modals

* Show number of visitors on hover

* Show 'Today' for today

* Add 30 days

* Show referrer drilldown

* Arrow keys to go back and forward

* Improve comparisons UI

* Fix dropdown when clicking on it

* Verify API access in a memoized fashion

* Test API access

* Test stats through controller

* Move map formatting from stats controller to stats

* Remove unused code

* Remove dead code from query

* Remove dead code from stats templates

* Add stats view test back in

* Render modal inside the modal component

* Implement google search terms

* Add explanation for screen sizes

* Separate dashboard JS from the app js
2019-11-19 12:30:42 +08:00