Plausible.Ingestion.Source (Plausible v0.0.1)

Resolves the source dimension from a combination of the Referer header and one of the utm_source, source, or ref query parameters.

Summary

Functions

Resolves the source of a session based on query params and the Referer header.

Functions

format_referrer(referrer)

Resolves the source of a session based on query params and the Referer header.

When a query parameter like utm_source is present, it takes priority over the Referer header. When the URL does not contain a source tag, we fall back to the Referer header to determine the source. This module also applies certain transformations to make the data more useful for the user:

  1. The RefInspector library is used to categorize referrers into "known" sources. For example, when the referrer is google.com or google.co.uk, it will always be stored as "Google", which is more useful for marketers.
  2. On top of the standard RefInspector behaviour, we also keep a custom_sources.json list, which extends it with referrers that we have seen in the wild. For example, Wikipedia has many domains that need to be combined into a single known source. These could all in theory be upstreamed.
  3. When a known source is supplied in the utm_source (or source, ref) query parameter, we merge it with our known sources in a case-insensitive manner.
  4. Our custom_sources.json list also contains some commonly used utm_source shorthands for certain sources. URL tagging is a mess, and we can never do it perfectly, but at least we're making an effort for the most commonly used ones. For example, ig -> Instagram and adwords -> Google.
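
The rules above can be sketched roughly as follows. This is a simplified illustration, not the module's actual implementation: the SourceSketch module, its two lookup tables, and the two-argument resolve/2 are hypothetical stand-ins for RefInspector and custom_sources.json. The order in which the three query parameters are checked, passing an unknown tag through unchanged, and returning nil when nothing is available are all assumptions.

```elixir
defmodule SourceSketch do
  # Hypothetical stand-in for RefInspector plus custom_sources.json:
  # referrer host => canonical source name.
  @known_referrers %{
    "google.com" => "Google",
    "google.co.uk" => "Google",
    "en.m.wikipedia.org" => "Wikipedia"
  }

  # Hypothetical shorthand table: downcased tag => canonical source name.
  @shorthands %{"ig" => "Instagram", "adwords" => "Google"}

  # Query parameters that may carry a source tag (check order is an assumption).
  @tag_params ["utm_source", "source", "ref"]

  # A source tag in the query params wins over the Referer header.
  def resolve(query_params, referrer) do
    tag = Enum.find_value(@tag_params, fn key -> non_empty(query_params[key]) end)

    case tag do
      nil -> from_referrer(referrer)
      tag -> from_tag(tag)
    end
  end

  defp non_empty(nil), do: nil
  defp non_empty(""), do: nil
  defp non_empty(value), do: value

  # Case-insensitive match against shorthands and known source names.
  # Assumption: an unknown tag is returned exactly as supplied.
  defp from_tag(tag) do
    key = String.downcase(tag)

    known_names =
      Map.new(Map.values(@known_referrers), fn name -> {String.downcase(name), name} end)

    Map.get(@shorthands, key) || Map.get(known_names, key) || tag
  end

  # Assumption: without a usable referrer there is no source (nil).
  defp from_referrer(nil), do: nil

  defp from_referrer(referrer) do
    case URI.parse(referrer).host do
      nil ->
        nil

      host ->
        # Unknown referrers fall back to the bare domain name.
        host = String.replace_prefix(host, "www.", "")
        Map.get(@known_referrers, host, host)
    end
  end
end
```

With these tables, SourceSketch.resolve(%{"utm_source" => "ig"}, "https://google.com") returns "Instagram" because the tag takes priority over the referrer, while SourceSketch.resolve(%{}, "https://www.markosaric.com") falls back to "markosaric.com".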

Examples:

    iex> alias Plausible.Ingestion.{Source, Request}
    iex> base_request = %Request{uri: URI.parse("https://plausible.io")}
    iex> Source.resolve(%{base_request | referrer: "https://google.com"}) # Known referrer from RefInspector
    "Google"
    iex> Source.resolve(%{base_request | query_params: %{"utm_source" => "google"}}) # Known source from RefInspector supplied as downcased utm_source by user
    "Google"
    iex> Source.resolve(%{base_request | query_params: %{"utm_source" => "GOOGLE"}}) # Known source from RefInspector supplied as uppercased utm_source by user
    "Google"
    iex> Source.resolve(%{base_request | referrer: "https://en.m.wikipedia.org"}) # Known referrer from custom_sources.json
    "Wikipedia"
    iex> Source.resolve(%{base_request | query_params: %{"utm_source" => "wikipedia"}}) # Known source from custom_sources.json supplied as downcased utm_source by user
    "Wikipedia"
    iex> Source.resolve(%{base_request | query_params: %{"utm_source" => "ig"}}) # Known utm_source from custom_sources.json
    "Instagram"
    iex> Source.resolve(%{base_request | referrer: "https://www.markosaric.com"}) # Unknown source, it is just stored as the domain name
    "markosaric.com"