Plausible.Ingestion.Source (Plausible v0.0.1)
Resolves the source dimension from a combination of the referer header and either the utm_source, source, or ref query parameter.
Functions
Resolves the source of a session based on query params and the Referer header.
When a query parameter like utm_source is present, it will be prioritized over the Referer header. When the URL does not contain a source tag, we fall back to using Referer to determine the source.
This module also takes care of certain transformations to make the data more useful for the user:
- The RefInspector library is used to categorize referrers into "known" sources. For example, when the referrer is google.com or google.co.uk, it will always be stored as "Google", which is more useful for marketers.
- On top of the standard RefInspector behaviour, we also keep a list, custom_sources.json, which extends it with referrers that we have seen in the wild. For example, Wikipedia has many domains that need to be combined into a single known source. These could all in theory be upstreamed.
- When a known source is supplied in the utm_source (or source, ref) query parameter, we merge it with our known sources in a case-insensitive manner.
- Our custom_sources.json list also contains some commonly used utm_source shorthands for certain sources. URL tagging is a mess, and we can never do it perfectly, but at least we're making an effort for the most commonly used ones. For example, ig -> Instagram and adwords -> Google.
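The precedence and normalization rules above can be sketched roughly as follows. This is a simplified illustration, not the module's actual implementation: the SourceSketch module, its lookup tables, the order in which the three tag parameters are checked, and the nil result for a missing source are all assumptions made for the example. The real module relies on RefInspector and custom_sources.json rather than hard-coded maps.

```elixir
defmodule SourceSketch do
  # Hypothetical stand-in for RefInspector's database plus custom_sources.json.
  @known_referrers %{
    "google.com" => "Google",
    "google.co.uk" => "Google",
    "en.m.wikipedia.org" => "Wikipedia"
  }

  # Hypothetical shorthand table holding only the two mappings named in the docs.
  @shorthands %{"ig" => "Instagram", "adwords" => "Google"}

  # Assumed lookup order among the tag parameters; the docs only say "either".
  @tag_params ["utm_source", "source", "ref"]

  # A source tag in the query string wins over the Referer header.
  def resolve(query_params, referrer) do
    tag =
      Enum.find_value(@tag_params, fn key ->
        non_empty(Map.get(query_params, key))
      end)

    case tag do
      nil -> from_referrer(referrer)
      tag -> from_tag(tag)
    end
  end

  defp non_empty(value) when value in [nil, ""], do: nil
  defp non_empty(value), do: value

  # Match the tag against shorthands and known source names, ignoring case.
  # Unknown tags are kept exactly as supplied.
  defp from_tag(tag) do
    key = String.downcase(tag)
    known = Enum.find(known_names(), &(String.downcase(&1) == key))
    Map.get(@shorthands, key) || known || tag
  end

  defp known_names, do: @known_referrers |> Map.values() |> Enum.uniq()

  # Assumption: no referrer means no source, represented here as nil.
  defp from_referrer(nil), do: nil

  defp from_referrer(referrer) do
    case URI.parse(referrer).host do
      nil ->
        nil

      host ->
        # Unknown referrers fall back to the domain without a leading "www.".
        host = String.replace_prefix(host, "www.", "")
        Map.get(@known_referrers, host, host)
    end
  end
end
```

In this sketch, SourceSketch.resolve(%{"utm_source" => "ig"}, "https://google.com") resolves to "Instagram", because the tag takes priority over the referrer.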
Examples:
    iex> alias Plausible.Ingestion.{Source, Request}
    iex> base_request = %Request{uri: URI.parse("https://plausible.io")}
    iex> Source.resolve(%{base_request | referrer: "https://google.com"}) # Known referrer from RefInspector
    "Google"
    iex> Source.resolve(%{base_request | query_params: %{"utm_source" => "google"}}) # Known source from RefInspector supplied as downcased utm_source by user
    "Google"
    iex> Source.resolve(%{base_request | query_params: %{"utm_source" => "GOOGLE"}}) # Known source from RefInspector supplied as uppercased utm_source by user
    "Google"
    iex> Source.resolve(%{base_request | referrer: "https://en.m.wikipedia.org"}) # Known referrer from custom_sources.json
    "Wikipedia"
    iex> Source.resolve(%{base_request | query_params: %{"utm_source" => "wikipedia"}}) # Known source from custom_sources.json supplied as downcased utm_source by user
    "Wikipedia"
    iex> Source.resolve(%{base_request | query_params: %{"utm_source" => "ig"}}) # Known utm_source from custom_sources.json
    "Instagram"
    iex> Source.resolve(%{base_request | referrer: "https://www.markosaric.com"}) # Unknown source, it is just stored as the domain name
    "markosaric.com"