graphql-engine/server/src-lib/Hasura/Tracing.hs
Antoine Leblanc cf531b05cb Rewrite Tracing to allow for only one TraceT in the entire stack.
This PR is on top of #7789.

### Description

This PR entirely rewrites the API of the Tracing library, to make `interpTraceT` a thing of the past. Before this change, we ran traces by sticking a `TraceT` on top of whatever we were doing. This had several major drawbacks:
- we were carrying a bunch of `TraceT` across the codebase, and the entire codebase had to know about it
- we needed to carry a second class constraint around (`HasReporterM`) to be able to run all of those traces
- we kept having to do stack rewriting with `interpTraceT`, which went from inconvenient to horrible
- we had to declare several behavioral instances on `TraceT m`

This PR rewrites all of `Tracing` using a more conventional model: there is ONE `TraceT` at the bottom of the stack, and there is an associated class constraint `MonadTrace`: any part of the code that happens to satisfy `MonadTrace` is able to create new traces. We NEVER have to do stack rewriting, `interpTraceT` is gone, and `TraceT` and `Reporter` become implementation details that 99% of the code is blissfully unaware of: code that needs to do tracing only needs to declare that the monad in which it operates implements `MonadTrace`.

In doing so, this PR revealed **several bugs in the codebase**: places where we were expecting to trace something, but due to the default instance of `HasReporterM IO` we would actually not do anything. This PR also splits the code of `Tracing` into more bite-sized modules, with the goal of potentially moving it to `server/lib` down the line.

### Remaining work

This PR is a draft; what's left to do is:
- [x] make Pro compile; I haven't updated `HasuraPro/Main` yet
- [x] document Tracing by writing a note that explains how to use the library, and the meaning of "reporter", "trace" and "span", as well as the pitfalls
- [x] discuss some of the trade-offs in the implementation, which is why I'm opening this PR already despite it not fully building yet
- [x] it depends on #7789 being merged first

PR-URL: https://github.com/hasura/graphql-engine-mono/pull/7791
GitOrigin-RevId: cadd32d039134c93ddbf364599a2f4dd988adea8
2023-03-13 17:38:39 +00:00


module Hasura.Tracing (module Tracing) where
import Hasura.Tracing.Class as Tracing
import Hasura.Tracing.Context as Tracing
import Hasura.Tracing.Monad as Tracing
import Hasura.Tracing.Reporter as Tracing
import Hasura.Tracing.Sampling as Tracing
import Hasura.Tracing.TraceId as Tracing
import Hasura.Tracing.Utils as Tracing
{- Note [Tracing]
## Usage
The Tracing library allows us to trace arbitrary pieces of our code, provided
that the current monad implements 'MonadTrace'.
    newTrace "request" do
      userInfo <- newSpan "authentication" retrieveUserInfo
      parsedQuery <- newSpan "parsing" $ parseQuery q
      result <- newSpan "execution" $ runQuery parsedQuery userInfo
      pure result
## Trace and span
Each _trace_ is distinct, and is composed of one or more _spans_. Spans are
organized as a tree: the root span covers the entire trace, and each sub span
keeps track of its parent.
We report each span individually, and associate each of them with a
'TraceContext' that contains:
- a trace id, common to all the spans of that trace
- a unique span id, generated randomly
- the span id of the parent span, if any
- whether that trace was sampled (see "Sampling").
All of this can be retrieved for the current span with 'currentContext'.
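As a rough illustration of the four items listed above, a trace context can be
modelled as a plain record. This is only a sketch: the field names, the
'childContext' helper, and the use of 'Word64' for identifiers are assumptions
made for the example, not the library's actual definitions.

```haskell
import Data.Word (Word64)

-- Hypothetical, simplified identifiers; the real library defines its own types.
newtype TraceId = TraceId Word64 deriving (Eq, Show)
newtype SpanId = SpanId Word64 deriving (Eq, Show)

-- One context per span, mirroring the four items listed above.
data TraceContext = TraceContext
  { tcTraceId :: TraceId        -- shared by every span of the trace
  , tcSpanId :: SpanId          -- unique to this span
  , tcParentId :: Maybe SpanId  -- Nothing for the root span
  , tcSampled :: Bool           -- decided once, when the trace starts
  }
  deriving (Eq, Show)

-- Derive the context of a sub span (hypothetical helper): same trace id and
-- sampling decision, a fresh span id, and the current span recorded as parent.
childContext :: SpanId -> TraceContext -> TraceContext
childContext freshId parent =
  parent {tcSpanId = freshId, tcParentId = Just (tcSpanId parent)}
```

In this model the tree structure is implicit: it can be rebuilt from the
reported spans by following each parent id back to the root.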
Starting a new trace masks the previous one; in the following example, "span2"
is associated with "trace2" and "span1" is associated with "trace1"; the two
trees are distinct:
    newTrace "trace1" $
      newSpan "span1" $
        newTrace "trace2" $
          newSpan "span2"
Lastly, a span that is started outside of a root trace is, for now, silently
ignored, as it has no trace id to attach to. This is a design decision we may
revisit.
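The masking rule and the handling of spans outside a trace can be modelled with
a small toy, sketched below. It does not use the library: 'Env', 'toyNewTrace',
'toyNewSpan' and 'demo' are hypothetical names, and the environment is passed
explicitly instead of living in a monad.

```haskell
import Data.IORef (IORef, modifyIORef', newIORef, readIORef)

-- Toy environment: where spans get logged, and the trace we are in (if any).
data Env = Env
  { envLog :: IORef [(String, String)]  -- (trace name, span name) pairs
  , envTrace :: Maybe String
  }

-- Starting a trace replaces whatever trace was current: this is the masking.
toyNewTrace :: String -> (Env -> IO a) -> Env -> IO a
toyNewTrace name body env = body env {envTrace = Just name}

-- A span is recorded against the current trace, or dropped if there is none.
toyNewSpan :: String -> (Env -> IO a) -> Env -> IO a
toyNewSpan name body env = do
  case envTrace env of
    Nothing -> pure ()
    Just trace -> modifyIORef' (envLog env) ((trace, name) :)
  body env

-- Runs the nesting from the example above, plus one span outside any trace.
demo :: IO [(String, String)]
demo = do
  ref <- newIORef []
  let env = Env ref Nothing
      done _ = pure ()
  toyNewSpan "orphan" done env
  toyNewTrace "trace1" (toyNewSpan "span1" (toyNewTrace "trace2" (toyNewSpan "span2" done))) env
  reverse <$> readIORef ref
```

In this toy, 'demo' logs "span1" under "trace1" and "span2" under "trace2",
while the "orphan" span leaves no record.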
## Metadata
Metadata can be attached to the current trace with 'attachMetadata', as a list
of pairs of text keys and text values.
## Reporters
'TraceT' is the de facto implementation of 'MonadTrace'; but, in practice, it
only does half the job: once a span finishes, 'TraceT' delegates the job of
actually reporting / exporting all relevant information to a 'Reporter'. Said
reporter must be provided to 'runTraceT', and is a wrapper around a function in
IO that processes the span.
In practice, 'TraceT' is only a reader that keeps track of the reporter, the
default sampling policy, and the current trace.
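The split between the tracing monad and the reporter can be sketched with a
reader over IO. Everything below is a simplified stand-in, not the library's
code: 'ToyReporter', 'ToyTraceT', 'toySpan' and 'runToyTraceT' are hypothetical
names, and the toy reader carries only the reporter (no sampling policy and no
current trace).

```haskell
import Control.Monad.IO.Class (liftIO)
import Control.Monad.Trans.Reader (ReaderT, ask, runReaderT)
import Data.IORef (modifyIORef', newIORef, readIORef)

-- Hypothetical stand-in for a reporter: a wrapper around an IO function that
-- receives each finished span (here, just its name).
newtype ToyReporter = ToyReporter {runToyReporter :: String -> IO ()}

-- Toy tracing monad: a reader carrying only the reporter.
type ToyTraceT = ReaderT ToyReporter IO

-- Run the action, then hand the finished span over to the reporter.
toySpan :: String -> ToyTraceT a -> ToyTraceT a
toySpan name action = do
  result <- action
  reporter <- ask
  liftIO (runToyReporter reporter name)
  pure result

-- The reporter is supplied once, when the stack is run.
runToyTraceT :: ToyReporter -> ToyTraceT a -> IO a
runToyTraceT reporter action = runReaderT action reporter

-- Collect reported span names in the order the reporter received them.
demo :: IO [String]
demo = do
  ref <- newIORef []
  let reporter = ToyReporter (\name -> modifyIORef' ref (name :))
  runToyTraceT reporter (toySpan "outer" (toySpan "inner" (pure ())))
  reverse <$> readIORef ref
```

Because a span is reported when it finishes, the toy reporter sees "inner"
before "outer". Swapping the reporter changes where spans go without touching
the code that creates them.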
## Sampling
To run 'TraceT', you must also provide a 'SamplingPolicy': an IO action that,
when evaluated, decides whether an arbitrary trace should be reported or not.
This decision is made only once per trace, and every span within the trace uses
the same result: either all of them are reported, or none of them are.
When starting a trace, the default sampling policy can be overridden. You can for
instance run 'TraceT' with an action that, by default, only reports one out of
every ten traces, but use 'newTraceWithPolicy sampleAlways' when sending
critical requests to your authentication service.
Note that sampling and reporting are distinct: using 'sampleAlways' simply
guarantees that the 'Reporter' you provided will be called.
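A minimal sketch of the idea, assuming a policy is nothing more than an IO
action returning a decision. The names 'Decision', 'ToyPolicy',
'toySampleAlways', 'toySampleOneInN' and 'toyTrace' are hypothetical and only
mirror the behaviour described above; they are not the library's definitions.

```haskell
import Data.IORef (IORef, atomicModifyIORef', newIORef)

-- Hypothetical stand-in for the outcome of a sampling policy.
data Decision = Report | Discard deriving (Eq, Show)

-- A policy is an IO action, evaluated once when a trace starts.
type ToyPolicy = IO Decision

-- Always report: a toy counterpart of 'sampleAlways'.
toySampleAlways :: ToyPolicy
toySampleAlways = pure Report

-- Report one trace out of every n (n must be positive), using a shared counter.
toySampleOneInN :: Int -> IORef Int -> ToyPolicy
toySampleOneInN n counter = do
  seen <- atomicModifyIORef' counter (\c -> (c + 1, c))
  pure (if seen `mod` n == 0 then Report else Discard)

-- Evaluate the policy once and share the decision with every span of the trace.
toyTrace :: ToyPolicy -> [String] -> IO [(String, Decision)]
toyTrace policy spans = do
  decision <- policy
  pure [(s, decision) | s <- spans]

-- Start twenty traces under the one-in-ten default, then one critical trace
-- that overrides the default with the always-report policy.
demo :: IO (Int, [(String, Decision)])
demo = do
  counter <- newIORef 0
  let oneInTen = toySampleOneInN 10 counter
  decisions <- mapM (\_ -> oneInTen) [1 .. 20 :: Int]
  critical <- toyTrace toySampleAlways ["auth", "lookup"]
  pure (length (filter (== Report) decisions), critical)
```

In this toy, a 'Report' decision only means the reporter would be called for
each span; what the reporter then does with the span is a separate concern.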
-}