server: add total time, db exec time subscription metrics

PR-URL: https://github.com/hasura/graphql-engine-mono/pull/8899
Co-authored-by: paritosh-08 <85472423+paritosh-08@users.noreply.github.com>
Co-authored-by: Rob Dominguez <24390149+robertjdominguez@users.noreply.github.com>
GitOrigin-RevId: c8c4a89576ae95265a8e4f4e6803a12ba7e840d4
Naveen Naidu 2023-05-18 18:25:53 +05:30 committed by hasura-bot
parent db5370bb62
commit 770407110c
13 changed files with 472 additions and 134 deletions

View File

@@ -342,7 +342,7 @@ Enable the Hasura Console (served by the server on `/` and `/console`).

### Enable High-cardinality Labels for Metrics

Enable high-cardinality labels for [Prometheus Metrics](/enterprise/metrics.mdx). Enabling this setting will add more labels to
-some of the metrics (e.g. `operationName` label for Graphql subscription metrics).
+some of the metrics (e.g. `operation_name` label for Graphql subscription metrics).

| | |
| ------------------- | ------------------------------------------------- |

View File

@@ -44,9 +44,10 @@ The metrics endpoint should be configured with a secret to prevent misuse and sh

Starting in `v2.26.0`, Hasura GraphQL Engine exposes metrics with high-cardinality labels by default.

-You can disable [the cardinality of labels for
-metrics](/deployment/graphql-engine-flags/reference.mdx#enable-high-cardinality-labels-for-metrics) if you are
-experiencing high memory usage, which can be due to a large number of labels in the metrics (typically more than 10000).
+You can disable
+[the cardinality of labels for metrics](/deployment/graphql-engine-flags/reference.mdx#enable-high-cardinality-labels-for-metrics)
+if you are experiencing high memory usage, which can be due to a large number of labels in the metrics (typically more
+than 10000).

:::
@@ -54,6 +55,16 @@ experiencing high memory usage, which can be due to a large number of labels in

The following metrics are exported by Hasura GraphQL Engine:

+### Hasura active subscriptions
+
+Current number of active subscriptions, representing the subscription load on the server.
+
+| | |
+| ------ | ------------------------------------------------------------------------------------------ |
+| Name   | `hasura_active_subscriptions` |
+| Type   | Gauge |
+| Labels | `subscription_kind`: streaming \| live-query, `operation_name`, `parameterized_query_hash` |
+
### Hasura active subscription pollers

Current number of active subscription pollers. A subscription poller
@@ -84,16 +95,6 @@ runtime errors.

| Type   | Gauge |
| Labels | `subscription_kind`: streaming \| live-query |

-### Hasura active subscriptions
-
-Current number of active subscriptions, representing the subscription load on the server.
-
-| | |
-| ------ | ----------------------------- |
-| Name | `hasura_active_subscriptions` |
-| Type | Gauge |
-| Labels | none |
-
### Hasura Cache requests count

Tracks cache hit and miss requests, which helps in monitoring and optimizing cache utilization.
@@ -283,6 +284,75 @@ Current number of active PostgreSQL connections. Compare this to

| Type   | Gauge |
| Labels | `source_name`: name of the database<br />`conn_info`: connection url string (password omitted) or name of the connection url environment variable<br />`role`: primary \| replica |

+### Hasura subscription database execution time
+
+The time taken to run the subscription's multiplexed query in the database for a single batch.
+
+A subscription poller
+[multiplexes](https://github.com/hasura/graphql-engine/blob/master/architecture/live-queries.md#idea-3-batch-multiple-live-queries-into-one-sql-query)
+similar subscriptions together. During every run (every 1 second by default), the poller splits the different variables
+for the multiplexed query into batches (by default 100) and executes the batches. This metric observes the time taken
+for each batch to execute on the database.
+
+If this metric is high, it may be an indication that the database is not performing as expected; consider investigating
+the subscription query and checking whether indexes can help improve performance.
+
+| | |
+| ------ | ------------------------------------------------------------------------------------------ |
+| Name   | `hasura_subscription_db_execution_time_seconds` |
+| Type   | Histogram<br /><br />Buckets: 0.000001, 0.0001, 0.01, 0.1, 0.3, 1, 3, 10, 30, 100 |
+| Labels | `subscription_kind`: streaming \| live-query, `operation_name`, `parameterized_query_hash` |
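As an aside on how these histogram metrics behave: Prometheus histogram buckets are cumulative, so a single observation increments every bucket whose upper bound covers it. A minimal sketch (Python, illustrative only, not Hasura code) using the bucket list above:

```python
# Cumulative histogram buckets, as listed for
# hasura_subscription_db_execution_time_seconds above.
BUCKETS = [0.000001, 0.0001, 0.01, 0.1, 0.3, 1, 3, 10, 30, 100]

def observe(counts: dict, value: float) -> None:
    """Increment every cumulative bucket whose upper bound covers `value`."""
    for le in BUCKETS:
        if value <= le:
            counts[le] = counts.get(le, 0) + 1
    counts["+Inf"] = counts.get("+Inf", 0) + 1  # every observation lands here

counts = {}
observe(counts, 0.002)  # a fast batch: first bucket it fits is 0.01
observe(counts, 0.25)   # a slower batch: first bucket it fits is 0.3
```

A 0.002 s batch is therefore counted in the 0.01 bucket and in every larger one, which is what makes quantile estimates possible from bucket counts.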
+### Hasura subscription total time
+
+The time taken to complete one run of a subscription poller.
+
+A subscription poller
+[multiplexes](https://github.com/hasura/graphql-engine/blob/master/architecture/live-queries.md#idea-3-batch-multiple-live-queries-into-one-sql-query)
+similar subscriptions together. This subscription poller runs every 1 second by default and queries the database with
+the multiplexed query to fetch the latest data. In a polling instance, the poller not only queries the database but also
+performs other operations, like splitting similar queries into batches (by default 100) before fetching their data from
+the database. **This metric is the total time taken to complete all the operations in a single poll.**
+
+If the value of this metric is high, it may be an indication that the multiplexed query is taking longer to execute in
+the database; verify this with the
+[`hasura_subscription_db_execution_time_seconds`](/enterprise/metrics.mdx/#hasura-subscription-database-execution-time)
+metric.
+
+In a single poll, the subscription poller splits the different variables for the multiplexed query into batches (by
+default 100) and executes the batches. We use the `hasura_subscription_db_execution_time_seconds` metric to observe the
+time taken for each batch to execute on the database. This means that for a single poll there can be multiple values for
+the `hasura_subscription_db_execution_time_seconds` metric.
+
+Let's look at an example to understand these metrics better:
+
+Say we have 650 subscriptions with the same selection set but different input arguments. These 650 subscriptions will be
+grouped to form one multiplexed query, and a single poller would be created to run this multiplexed query every 1
+second.
+
+The default batch size in Hasura is 100, so the 650 subscriptions will be split into 7 batches for execution during a
+single poll.
+
+During a single poll:
+
+- Batch 1: `hasura_subscription_db_execution_time_seconds` = 0.002 seconds
+- Batch 2: `hasura_subscription_db_execution_time_seconds` = 0.001 seconds
+- Batch 3: `hasura_subscription_db_execution_time_seconds` = 0.003 seconds
+- Batch 4: `hasura_subscription_db_execution_time_seconds` = 0.001 seconds
+- Batch 5: `hasura_subscription_db_execution_time_seconds` = 0.002 seconds
+- Batch 6: `hasura_subscription_db_execution_time_seconds` = 0.001 seconds
+- Batch 7: `hasura_subscription_db_execution_time_seconds` = 0.002 seconds
+
+The `hasura_subscription_total_time_seconds` would be the sum of all the database execution times shown in the batches,
+plus some extra processing time for the other tasks the poller performs during a single poll. In this case, it would be
+approximately 0.013 seconds.
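The arithmetic in this example can be checked with a tiny sketch (Python; the 0.001 s poller overhead is an assumed illustrative number, not something Hasura reports directly):

```python
import math

# Worked example from above: 650 subscriptions, default batch size 100.
subscriptions = 650
batch_size = 100
batches = math.ceil(subscriptions / batch_size)  # number of multiplexed batches

# Per-batch database execution times observed by
# hasura_subscription_db_execution_time_seconds in the example.
batch_exec_times = [0.002, 0.001, 0.003, 0.001, 0.002, 0.001, 0.002]

# Assumed extra time per poll for snapshotting cohorts and splitting batches.
poller_overhead = 0.001

# hasura_subscription_total_time_seconds covers the whole poll.
total_time = sum(batch_exec_times) + poller_overhead
```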
+| | |
+| ------ | ------------------------------------------------------------------------------------------ |
+| Name   | `hasura_subscription_total_time_seconds` |
+| Type   | Histogram<br /><br />Buckets: 0.000001, 0.0001, 0.01, 0.1, 0.3, 1, 3, 10, 30, 100 |
+| Labels | `subscription_kind`: streaming \| live-query, `operation_name`, `parameterized_query_hash` |
+
### Hasura WebSocket connections

Current number of active WebSocket connections, representing the WebSocket load on the server.

View File

@@ -247,7 +247,11 @@ data Poller streamCursor = Poller
    -- This var is "write once", moving monotonically from empty to full.
    -- TODO this could probably be tightened up to something like
    -- 'STM PollerIOState'
-    _pIOState :: STM.TMVar PollerIOState
+    _pIOState :: STM.TMVar PollerIOState,
+    _pParameterizedQueryHash :: ParameterizedQueryHash,
+    -- The operation names of the subscriptions that are part of this poller. This is
+    -- used while emitting subscription metrics.
+    _pOperationNamesMap :: TMap.TMap (Maybe OperationName) Int
  }

data PollerIOState = PollerIOState
data PollerIOState = PollerIOState data PollerIOState = PollerIOState
@@ -291,7 +295,7 @@ dumpPollerMap :: Bool -> PollerMap streamCursor -> IO J.Value
dumpPollerMap extended pollerMap =
  fmap J.toJSON $ do
    entries <- STM.atomically $ ListT.toList $ STMMap.listT pollerMap
-    forM entries $ \(pollerKey, Poller cohortsMap _responseState ioState) ->
+    forM entries $ \(pollerKey, Poller cohortsMap _responseState ioState _paramQueryHash _opNames) ->
      AB.dispatchAnyBackend @Backend (unBackendPollerKey pollerKey) $ \(PollerKey source role query _connectionKey) -> do
        PollerIOState threadId pollerId <- STM.atomically $ STM.readTMVar ioState
        cohortsJ <-
@@ -305,7 +309,8 @@ dumpPollerMap extended pollerMap =
            "thread_id" J..= show (Immortal.threadId threadId),
            "poller_id" J..= pollerId,
            "multiplexed_query" J..= query,
-            "cohorts" J..= cohortsJ
+            "cohorts" J..= cohortsJ,
+            "parameterized_query_hash" J..= _paramQueryHash
          ]

-- | An ID to track unique 'Poller's, so that we can gather metrics about each

View File

@@ -33,9 +33,11 @@ import Hasura.RQL.Types.BackendType (BackendType (..), PostgresKind (Vanilla))
import Hasura.RQL.Types.Common (SourceName)
import Hasura.RQL.Types.Roles (RoleName)
import Hasura.RQL.Types.Subscription (SubscriptionType (..))
-import Hasura.Server.Prometheus (PrometheusMetrics (..), SubscriptionMetrics (..))
+import Hasura.Server.Prometheus (PrometheusMetrics (..), SubscriptionMetrics (..), liveQuerySubscriptionLabel, recordSubcriptionMetric)
+import Hasura.Server.Types (GranularPrometheusMetricsState (..))
import Refined (unrefine)
import System.Metrics.Prometheus.Gauge qualified as Prometheus.Gauge
+import System.Metrics.Prometheus.HistogramVector qualified as HistogramVector

pushResultToCohort ::
  GQResult BS.ByteString ->
@@ -86,9 +88,12 @@ pollLiveQuery ::
  CohortMap 'LiveQuery ->
  SubscriptionPostPollHook ->
  PrometheusMetrics ->
+  IO GranularPrometheusMetricsState ->
+  TMap.TMap (Maybe OperationName) Int ->
  ResolvedConnectionTemplate b ->
  IO ()
-pollLiveQuery pollerId pollerResponseState lqOpts (sourceName, sourceConfig) roleName parameterizedQueryHash query cohortMap postPollHook prometheusMetrics resolvedConnectionTemplate = do
+pollLiveQuery pollerId pollerResponseState lqOpts (sourceName, sourceConfig) roleName parameterizedQueryHash query cohortMap postPollHook prometheusMetrics granularPrometheusMetricsState operationNamesMap' resolvedConnectionTemplate = do
+  operationNamesMap <- STM.atomically $ TMap.getMap operationNamesMap'
  (totalTime, (snapshotTime, batchesDetails)) <- withElapsedTime $ do
    -- snapshot the current cohorts and split them into batches
    (snapshotTime, cohortBatches) <- withElapsedTime $ do
@@ -105,6 +110,15 @@ pollLiveQuery pollerId pollerResponseState lqOpts (sourceName, sourceConfig) rol
    batchesDetails <- A.forConcurrently cohortBatches $ \(batchId, cohorts) -> do
      (queryExecutionTime, mxRes) <- runDBSubscription @b sourceConfig query (over (each . _2) C._csVariables cohorts) resolvedConnectionTemplate
+      let dbExecTimeMetric = submDBExecTotalTime $ pmSubscriptionMetrics $ prometheusMetrics
+      recordSubcriptionMetric
+        granularPrometheusMetricsState
+        True
+        operationNamesMap
+        parameterizedQueryHash
+        liveQuerySubscriptionLabel
+        (flip (HistogramVector.observe dbExecTimeMetric) (realToFrac queryExecutionTime))

      previousPollerResponseState <- STM.readTVarIO pollerResponseState
      case mxRes of
@@ -152,7 +166,6 @@ pollLiveQuery pollerId pollerResponseState lqOpts (sourceName, sourceConfig) rol
          batchId
          cohortsExecutionDetails
          batchResponseSize
-
    pure (snapshotTime, batchesDetails)

  let pollDetails =
@@ -169,6 +182,14 @@ pollLiveQuery pollerId pollerResponseState lqOpts (sourceName, sourceConfig) rol
          _pdParameterizedQueryHash = parameterizedQueryHash
        }
  postPollHook pollDetails
+  let totalTimeMetric = submTotalTime $ pmSubscriptionMetrics $ prometheusMetrics
+  recordSubcriptionMetric
+    granularPrometheusMetricsState
+    True
+    operationNamesMap
+    parameterizedQueryHash
+    liveQuerySubscriptionLabel
+    (flip (HistogramVector.observe totalTimeMetric) (realToFrac totalTime))
  where
    SubscriptionsOptions batchSize _ = lqOpts

View File

@@ -37,10 +37,12 @@ import Hasura.RQL.Types.Common (SourceName)
import Hasura.RQL.Types.Roles (RoleName)
import Hasura.RQL.Types.Subscription (SubscriptionType (..))
import Hasura.SQL.Value (TxtEncodedVal (..))
-import Hasura.Server.Prometheus (PrometheusMetrics (..), SubscriptionMetrics (..))
+import Hasura.Server.Prometheus (PrometheusMetrics (..), SubscriptionMetrics (..), recordSubcriptionMetric, streamingSubscriptionLabel)
+import Hasura.Server.Types (GranularPrometheusMetricsState (..))
import Language.GraphQL.Draft.Syntax qualified as G
import Refined (unrefine)
import System.Metrics.Prometheus.Gauge qualified as Prometheus.Gauge
+import System.Metrics.Prometheus.HistogramVector qualified as HistogramVector
import Text.Shakespeare.Text (st)

{- Note [Streaming subscriptions rebuilding cohort map]
@@ -249,9 +251,12 @@ pollStreamingQuery ::
  SubscriptionPostPollHook ->
  Maybe (IO ()) -> -- Optional IO action to make this function (pollStreamingQuery) testable
  PrometheusMetrics ->
+  IO GranularPrometheusMetricsState ->
+  TMap.TMap (Maybe OperationName) Int ->
  ResolvedConnectionTemplate b ->
  IO ()
-pollStreamingQuery pollerId pollerResponseState streamingQueryOpts (sourceName, sourceConfig) roleName parameterizedQueryHash query cohortMap rootFieldName postPollHook testActionMaybe prometheusMetrics resolvedConnectionTemplate = do
+pollStreamingQuery pollerId pollerResponseState streamingQueryOpts (sourceName, sourceConfig) roleName parameterizedQueryHash query cohortMap rootFieldName postPollHook testActionMaybe prometheusMetrics granularPrometheusMetricsState operationNames' resolvedConnectionTemplate = do
+  operationNames <- STM.atomically $ TMap.getMap operationNames'
  (totalTime, (snapshotTime, batchesDetailsAndProcessedCohorts)) <- withElapsedTime $ do
    -- snapshot the current cohorts and split them into batches
    -- This STM transaction is a read only transaction i.e. it doesn't mutate any state
@@ -275,6 +280,14 @@ pollStreamingQuery pollerId pollerResponseState streamingQueryOpts (sourceName,
          query
          (over (each . _2) C._csVariables $ fmap (fmap fst) cohorts)
          resolvedConnectionTemplate
+      let dbExecTimeMetric = submDBExecTotalTime $ pmSubscriptionMetrics $ prometheusMetrics
+      recordSubcriptionMetric
+        granularPrometheusMetricsState
+        True
+        operationNames
+        parameterizedQueryHash
+        streamingSubscriptionLabel
+        (flip (HistogramVector.observe dbExecTimeMetric) (realToFrac queryExecutionTime))

      previousPollerResponseState <- STM.readTVarIO pollerResponseState
@@ -426,6 +439,14 @@ pollStreamingQuery pollerId pollerResponseState streamingQueryOpts (sourceName,
          currentCohorts
    TMap.replace cohortMap updatedCohortsMap
  postPollHook pollDetails
+  let totalTimeMetric = submTotalTime $ pmSubscriptionMetrics $ prometheusMetrics
+  recordSubcriptionMetric
+    granularPrometheusMetricsState
+    True
+    operationNames
+    parameterizedQueryHash
+    streamingSubscriptionLabel
+    (flip (HistogramVector.observe totalTimeMetric) (realToFrac totalTime))
  where
    SubscriptionsOptions batchSize _ = streamingQueryOpts

View File

@@ -52,13 +52,22 @@ import Hasura.RQL.Types.Action
import Hasura.RQL.Types.Common (SourceName)
import Hasura.SQL.AnyBackend qualified as AB
import Hasura.Server.Metrics (ServerMetrics (..))
-import Hasura.Server.Prometheus (PrometheusMetrics (..), SubscriptionMetrics (..))
-import Hasura.Server.Types (RequestId)
+import Hasura.Server.Prometheus
+  ( DynamicSubscriptionLabel (..),
+    PrometheusMetrics (..),
+    SubscriptionLabel (..),
+    SubscriptionMetrics (..),
+    liveQuerySubscriptionLabel,
+    recordMetricWithLabel,
+    streamingSubscriptionLabel,
+  )
+import Hasura.Server.Types (GranularPrometheusMetricsState (..), RequestId)
import Language.GraphQL.Draft.Syntax qualified as G
import Refined (unrefine)
import StmContainers.Map qualified as STMMap
import System.Metrics.Gauge qualified as EKG.Gauge
import System.Metrics.Prometheus.Gauge qualified as Prometheus.Gauge
+import System.Metrics.Prometheus.GaugeVector qualified as GaugeVector

-- | The top-level datatype that holds the state for all active subscriptions.
--
@@ -117,8 +126,10 @@ findPollerForSubscriber ::
  CohortKey ->
  (Subscriber -> Cohort streamCursorVars -> STM.STM streamCursorVars) ->
  (Subscriber -> Poller streamCursorVars -> STM.STM streamCursorVars) ->
+  ParameterizedQueryHash ->
+  Maybe OperationName ->
  STM.STM ((Maybe (Poller streamCursorVars)), streamCursorVars)
-findPollerForSubscriber subscriber pollerMap pollerKey cohortKey addToCohort addToPoller =
+findPollerForSubscriber subscriber pollerMap pollerKey cohortKey addToCohort addToPoller parameterizedQueryHash maybeOperationName =
  -- a handler is returned only when it is newly created
  STMMap.lookup pollerKey pollerMap >>= \case
    Just poller -> do
@@ -130,11 +141,18 @@ findPollerForSubscriber subscriber pollerMap pollerKey cohortKey addToCohort add
          -- cohort not found. Create a cohort with the subscriber and add
          -- the cohort to the poller
          Nothing -> addToPoller subscriber poller
+      -- Add the operation name of the subscription to the poller if it doesn't
+      -- exist; otherwise increment the count for that operation name
+      TMap.lookup maybeOperationName (_pOperationNamesMap poller) >>= \case
+        Nothing -> TMap.insert 1 maybeOperationName (_pOperationNamesMap poller)
+        Just _ -> TMap.adjust (+ 1) maybeOperationName (_pOperationNamesMap poller)
      return (Nothing, cursorVars)
    Nothing -> do
      -- no poller found, so create one with the cohort
      -- and the subscriber within it.
-      !poller <- Poller <$> TMap.new <*> STM.newTVar PRSSuccess <*> STM.newEmptyTMVar
+      operationNamesMap <- TMap.new
+      TMap.insert 1 maybeOperationName operationNamesMap
+      !poller <- Poller <$> TMap.new <*> STM.newTVar PRSSuccess <*> STM.newEmptyTMVar <*> pure parameterizedQueryHash <*> pure operationNamesMap
      cursorVars <- addToPoller subscriber poller
      STMMap.insert poller pollerKey pollerMap
      return $ (Just poller, cursorVars)
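The operation-name bookkeeping above is a reference-count pattern: increment a per-name counter when a subscriber joins the poller, decrement when one leaves, and drop the key at zero so metrics stop being emitted for that name. A minimal sketch of the pattern (Python, illustrative only; the real implementation is the STM `TMap` code in this diff):

```python
class OperationNameCounts:
    """Reference-counted map of operation name -> live subscriber count."""

    def __init__(self):
        self.counts = {}

    def add(self, op_name):
        # op_name may be None for anonymous operations
        self.counts[op_name] = self.counts.get(op_name, 0) + 1

    def remove(self, op_name):
        count = self.counts.get(op_name)
        if count is None:
            return  # nothing tracked for this name
        if count <= 1:
            del self.counts[op_name]  # last subscriber with this name left
        else:
            self.counts[op_name] = count - 1

names = OperationNameCounts()
names.add("liveOrders")  # "liveOrders" is a hypothetical operation name
names.add("liveOrders")
names.add(None)
names.remove("liveOrders")
```

In the Haskell code this map lives inside the `Poller` so that metric emission can label observations with every operation name currently multiplexed by that poller.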
@@ -155,6 +173,7 @@ addLiveQuery ::
  Maybe OperationName ->
  RequestId ->
  SubscriptionQueryPlan b (MultiplexedQuery b) ->
+  IO GranularPrometheusMetricsState ->
  -- | the action to be executed when result changes
  OnChange ->
  IO LiveQuerySubscriberDetails
@@ -170,6 +189,7 @@ addLiveQuery
  operationName
  requestId
  plan
+  granularPrometheusMetricsState
  onResultAction = do
    -- CAREFUL!: It's absolutely crucial that we can't throw any exceptions here!
@@ -187,6 +207,8 @@ addLiveQuery
          cohortKey
          addToCohort
          addToPoller
+          parameterizedQueryHash
+          operationName

    -- we can then attach a polling thread if it is new the livequery can only be
    -- cancelled after putTMVar
@@ -207,6 +229,8 @@ addLiveQuery
                (_pCohorts poller)
                postPollHook
                prometheusMetrics
+                granularPrometheusMetricsState
+                (_pOperationNamesMap poller)
                resolvedConnectionTemplate
            sleep $ unrefine $ unRefetchInterval refetchInterval
        let !pState = PollerIOState threadRef pollerId
@@ -215,7 +239,14 @@ addLiveQuery
      liftIO $ Prometheus.Gauge.inc $ submActiveLiveQueryPollers $ pmSubscriptionMetrics $ prometheusMetrics

    liftIO $ EKG.Gauge.inc $ smActiveSubscriptions serverMetrics
-    liftIO $ Prometheus.Gauge.inc $ pmActiveSubscriptions prometheusMetrics
+    let promMetricGranularLabel = SubscriptionLabel liveQuerySubscriptionLabel (Just $ DynamicSubscriptionLabel parameterizedQueryHash operationName)
+        promMetricLabel = SubscriptionLabel liveQuerySubscriptionLabel Nothing
+    let numSubscriptionMetric = submActiveSubscriptions $ pmSubscriptionMetrics $ prometheusMetrics
+    recordMetricWithLabel
+      granularPrometheusMetricsState
+      True
+      (GaugeVector.inc numSubscriptionMetric promMetricGranularLabel)
+      (GaugeVector.inc numSubscriptionMetric promMetricLabel)
    liftIO $ EKG.Gauge.inc $ smActiveLiveQueries serverMetrics

    pure $ SubscriberDetails handlerId cohortKey subscriberId
@@ -256,6 +287,7 @@ addStreamSubscriptionQuery ::
  -- | root field name
  G.Name ->
  SubscriptionQueryPlan b (MultiplexedQuery b) ->
+  IO GranularPrometheusMetricsState ->
  -- | the action to be executed when result changes
  OnChange ->
  IO StreamingSubscriberDetails
@@ -272,6 +304,7 @@ addStreamSubscriptionQuery
  requestId
  rootFieldName
  plan
+  granularPrometheusMetricsState
  onResultAction = do
    -- CAREFUL!: It's absolutely crucial that we can't throw any exceptions here!
@@ -289,6 +322,8 @@ addStreamSubscriptionQuery
          cohortKey
          addToCohort
          addToPoller
+          parameterizedQueryHash
+          operationName

    -- we can then attach a polling thread if it is new the subscription can only be
    -- cancelled after putTMVar
@@ -298,7 +333,22 @@ addStreamSubscriptionQuery
            forever $ do
              (_, streamQOpts) <- getSubscriptionOptions
              let SubscriptionsOptions _ refetchInterval = streamQOpts
-              pollStreamingQuery @b pollerId (_pPollerState handler) streamQOpts (source, sourceConfig) role parameterizedQueryHash query (_pCohorts handler) rootFieldName postPollHook Nothing prometheusMetrics resolvedConnectionTemplate
+              pollStreamingQuery @b
+                pollerId
+                (_pPollerState handler)
+                streamQOpts
+                (source, sourceConfig)
+                role
+                parameterizedQueryHash
+                query
+                (_pCohorts handler)
+                rootFieldName
+                postPollHook
+                Nothing
+                prometheusMetrics
+                granularPrometheusMetricsState
+                (_pOperationNamesMap handler)
+                resolvedConnectionTemplate
              sleep $ unrefine $ unRefetchInterval refetchInterval
        let !pState = PollerIOState threadRef pollerId
        $assertNFHere pState -- so we don't write thunks to mutable vars
@@ -307,9 +357,17 @@ addStreamSubscriptionQuery
    liftIO $ do
      EKG.Gauge.inc $ smActiveSubscriptions serverMetrics
-      Prometheus.Gauge.inc $ pmActiveSubscriptions prometheusMetrics
      EKG.Gauge.inc $ smActiveStreamingSubscriptions serverMetrics
+      let promMetricGranularLabel = SubscriptionLabel streamingSubscriptionLabel (Just $ DynamicSubscriptionLabel parameterizedQueryHash operationName)
+          promMetricLabel = SubscriptionLabel streamingSubscriptionLabel Nothing
+          numSubscriptionMetric = submActiveSubscriptions $ pmSubscriptionMetrics $ prometheusMetrics
+      recordMetricWithLabel
+        granularPrometheusMetricsState
+        True
+        (GaugeVector.inc numSubscriptionMetric promMetricGranularLabel)
+        (GaugeVector.inc numSubscriptionMetric promMetricLabel)

    pure $ SubscriberDetails handlerId (cohortKey, cohortCursorTVar) subscriberId
  where
    SubscriptionsState _ streamQueryMap postPollHook _ = subscriptionState
@@ -336,16 +394,27 @@ removeLiveQuery ::
  SubscriptionsState ->
  -- the query and the associated operation
  LiveQuerySubscriberDetails ->
+  IO GranularPrometheusMetricsState ->
+  Maybe OperationName ->
  IO ()
-removeLiveQuery logger serverMetrics prometheusMetrics lqState lqId@(SubscriberDetails handlerId cohortId sinkId) = mask_ $ do
-  mbCleanupIO <- STM.atomically $ do
-    detM <- getQueryDet lqMap
-    forM detM $ \(Poller cohorts _ ioState, cohort) ->
-      cleanHandlerC cohorts ioState cohort
-  sequence_ mbCleanupIO
+removeLiveQuery logger serverMetrics prometheusMetrics lqState lqId@(SubscriberDetails handlerId cohortId sinkId) granularPrometheusMetricsState maybeOperationName = mask_ $ do
+  join $
+    STM.atomically $ do
+      detM <- getQueryDet lqMap
+      fmap join $ case detM of
+        Nothing -> return (pure ())
+        Just (Poller cohorts _ ioState parameterizedQueryHash operationNamesMap, cohort) -> do
+          TMap.lookup maybeOperationName operationNamesMap >>= \case
+            -- If only one operation name is present in the map, delete it
+            Just 1 -> TMap.delete maybeOperationName operationNamesMap
+            -- If the count of an operation name is more than 1, there are more
+            -- subscriptions with the same name, and we should keep emitting the
+            -- metrics until all the subscriptions with that operation name are
+            -- removed
+            Just _ -> TMap.adjust (\v -> v - 1) maybeOperationName operationNamesMap
+            Nothing -> return ()
+          cleanHandlerC cohorts ioState cohort parameterizedQueryHash
  liftIO $ EKG.Gauge.dec $ smActiveSubscriptions serverMetrics
-  liftIO $ Prometheus.Gauge.dec $ pmActiveSubscriptions prometheusMetrics
  liftIO $ EKG.Gauge.dec $ smActiveLiveQueries serverMetrics
  where
    lqMap = _ssLiveQueryMap lqState
@@ -357,7 +426,7 @@ removeLiveQuery logger serverMetrics prometheusMetrics lqState lqId@(SubscriberD
        cohortM <- TMap.lookup cohortId (_pCohorts poller)
        return $ (poller,) <$> cohortM

-    cleanHandlerC cohortMap ioState handlerC = do
+    cleanHandlerC cohortMap ioState handlerC parameterizedQueryHash = do
      let curOps = _cExistingSubscribers handlerC
          newOps = _cNewSubscribers handlerC
      TMap.delete sinkId curOps
@@ -368,6 +437,8 @@ removeLiveQuery logger serverMetrics prometheusMetrics lqState lqId@(SubscriberD
          <*> TMap.null newOps
      when cohortIsEmpty $ TMap.delete cohortId cohortMap
      handlerIsEmpty <- TMap.null cohortMap
+      let promMetricGranularLabel = SubscriptionLabel liveQuerySubscriptionLabel (Just $ DynamicSubscriptionLabel parameterizedQueryHash maybeOperationName)
+          promMetricLabel = SubscriptionLabel liveQuerySubscriptionLabel Nothing
      -- when there is no need for handler i.e, this happens to be the last
      -- operation, take the ref for the polling thread to cancel it
      if handlerIsEmpty
@ -375,12 +446,18 @@ removeLiveQuery logger serverMetrics prometheusMetrics lqState lqId@(SubscriberD
STMMap.delete handlerId lqMap STMMap.delete handlerId lqMap
threadRefM <- fmap _pThread <$> STM.tryReadTMVar ioState threadRefM <- fmap _pThread <$> STM.tryReadTMVar ioState
return $ return $
Just $ -- deferred IO: -- deferred IO:
case threadRefM of case threadRefM of
Just threadRef -> do Just threadRef -> do
Immortal.stop threadRef Immortal.stop threadRef
liftIO $ Prometheus.Gauge.dec $ submActiveLiveQueryPollers $ pmSubscriptionMetrics prometheusMetrics liftIO $ do
Prometheus.Gauge.dec $ submActiveLiveQueryPollers $ pmSubscriptionMetrics prometheusMetrics
let numSubscriptionMetric = submActiveSubscriptions $ pmSubscriptionMetrics $ prometheusMetrics
recordMetricWithLabel
granularPrometheusMetricsState
True
(GaugeVector.dec numSubscriptionMetric promMetricGranularLabel)
(GaugeVector.dec numSubscriptionMetric promMetricLabel)
-- This would seem to imply addLiveQuery broke or a bug -- This would seem to imply addLiveQuery broke or a bug
-- elsewhere. Be paranoid and log: -- elsewhere. Be paranoid and log:
Nothing -> Nothing ->
@ -389,7 +466,14 @@ removeLiveQuery logger serverMetrics prometheusMetrics lqState lqId@(SubscriberD
fromString $ fromString $
"In removeLiveQuery no worker thread installed. Please report this as a bug: " "In removeLiveQuery no worker thread installed. Please report this as a bug: "
<> show lqId <> show lqId
else return Nothing else do
let numSubscriptionMetric = submActiveSubscriptions $ pmSubscriptionMetrics $ prometheusMetrics
return $
recordMetricWithLabel
granularPrometheusMetricsState
True
(GaugeVector.dec numSubscriptionMetric promMetricGranularLabel)
(GaugeVector.dec numSubscriptionMetric promMetricLabel)
 removeStreamingQuery ::
   L.Logger L.Hasura ->
@@ -398,17 +482,28 @@ removeStreamingQuery ::
   SubscriptionsState ->
   -- the query and the associated operation
   StreamingSubscriberDetails ->
+  IO GranularPrometheusMetricsState ->
+  Maybe OperationName ->
   IO ()
-removeStreamingQuery logger serverMetrics prometheusMetrics subscriptionState (SubscriberDetails handlerId (cohortId, cursorVariableTV) sinkId) = mask_ $ do
+removeStreamingQuery logger serverMetrics prometheusMetrics subscriptionState (SubscriberDetails handlerId (cohortId, cursorVariableTV) sinkId) granularPrometheusMetricsState maybeOperationName = mask_ $ do
-  mbCleanupIO <- STM.atomically $ do
+  join $
+    STM.atomically $ do
       detM <- getQueryDet streamQMap
-    fmap join $
-      forM detM $ \(Poller cohorts _ ioState, currentCohortId, cohort) ->
-        cleanHandlerC cohorts ioState (cohort, currentCohortId)
-  sequence_ mbCleanupIO
+      case detM of
+        Nothing -> return (pure ())
+        Just (Poller cohorts _ ioState parameterizedQueryHash operationNamesMap, currentCohortId, cohort) -> do
+          TMap.lookup maybeOperationName operationNamesMap >>= \case
+            -- If only one operation name is present in the map, delete it
+            Just 1 -> TMap.delete maybeOperationName operationNamesMap
+            -- If the count of an operation name is more than 1, there are
+            -- more subscriptions with the same name and we should keep emitting
+            -- the metrics until all the subscriptions with that operation name are
+            -- removed
+            Just _ -> TMap.adjust (\v -> v - 1) maybeOperationName operationNamesMap
+            Nothing -> return ()
+          cleanHandlerC cohorts ioState (cohort, currentCohortId) parameterizedQueryHash
   liftIO $ do
     EKG.Gauge.dec $ smActiveSubscriptions serverMetrics
-    Prometheus.Gauge.dec $ pmActiveSubscriptions prometheusMetrics
     EKG.Gauge.dec $ smActiveStreamingSubscriptions serverMetrics
   where
     streamQMap = _ssStreamQueryMap subscriptionState
@@ -422,7 +517,7 @@ removeStreamingQuery logger serverMetrics prometheusMetrics subscriptionState (S
       cohortM <- TMap.lookup updatedCohortId (_pCohorts poller)
       return $ (poller,updatedCohortId,) <$> cohortM

-    cleanHandlerC cohortMap ioState (handlerC, currentCohortId) = do
+    cleanHandlerC cohortMap ioState (handlerC, currentCohortId) parameterizedQueryHash = do
       let curOps = _cExistingSubscribers handlerC
           newOps = _cNewSubscribers handlerC
       TMap.delete sinkId curOps
@@ -433,6 +528,8 @@ removeStreamingQuery logger serverMetrics prometheusMetrics subscriptionState (S
           <*> TMap.null newOps
       when cohortIsEmpty $ TMap.delete currentCohortId cohortMap
       handlerIsEmpty <- TMap.null cohortMap
+      let promMetricGranularLabel = SubscriptionLabel streamingSubscriptionLabel (Just $ DynamicSubscriptionLabel parameterizedQueryHash maybeOperationName)
+          promMetricLabel = SubscriptionLabel streamingSubscriptionLabel Nothing
       -- when there is no need for handler i.e,
       -- operation, take the ref for the polling thread to cancel it
       if handlerIsEmpty
@@ -440,14 +537,18 @@ removeStreamingQuery logger serverMetrics prometheusMetrics subscriptionState (S
           STMMap.delete handlerId streamQMap
           threadRefM <- fmap _pThread <$> STM.tryReadTMVar ioState
           return $
-            Just $ -- deferred IO:
+            -- deferred IO:
             case threadRefM of
               Just threadRef -> do
                 Immortal.stop threadRef
-                liftIO $
-                  Prometheus.Gauge.dec $
-                    submActiveStreamingPollers $
-                      pmSubscriptionMetrics prometheusMetrics
+                liftIO $ do
+                  Prometheus.Gauge.dec $ submActiveStreamingPollers $ pmSubscriptionMetrics prometheusMetrics
+                  let numSubscriptionMetric = submActiveSubscriptions $ pmSubscriptionMetrics $ prometheusMetrics
+                  recordMetricWithLabel
+                    granularPrometheusMetricsState
+                    True
+                    (GaugeVector.dec numSubscriptionMetric promMetricGranularLabel)
+                    (GaugeVector.dec numSubscriptionMetric promMetricLabel)
               -- This would seem to imply addStreamSubscriptionQuery broke or a bug
               -- elsewhere. Be paranoid and log:
               Nothing ->
@@ -461,7 +562,14 @@ removeStreamingQuery logger serverMetrics prometheusMetrics subscriptionState (S
                         <> show cohortId
                         <> ", subscriber_id:"
                         <> show sinkId
-        else return Nothing
+        else do
+          let numSubscriptionMetric = submActiveSubscriptions $ pmSubscriptionMetrics $ prometheusMetrics
+          return $
+            recordMetricWithLabel
+              granularPrometheusMetricsState
+              True
+              (GaugeVector.dec numSubscriptionMetric promMetricGranularLabel)
+              (GaugeVector.dec numSubscriptionMetric promMetricLabel)
 -- | An async action query whose relationships refer to tables in a source.
 -- We need to generate an SQL statement with the action response and execute it
View File
@@ -11,6 +11,7 @@ module Hasura.GraphQL.Execute.Subscription.TMap
     union,
     filterWithKey,
     getMap,
+    adjust,
   )
 where
@@ -58,3 +59,6 @@ union mapA mapB = do
 getMap :: TMap k v -> STM (HashMap.HashMap k v)
 getMap = readTVar . unTMap
+
+adjust :: (Hashable k) => (v -> v) -> k -> TMap k v -> STM ()
+adjust f k mapTV = modifyTVar' (unTMap mapTV) $ HashMap.adjust f k
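The new `adjust` helper inherits `HashMap.adjust`'s semantics: the update function runs only when the key is present, otherwise the map is left untouched, which is what lets the poller decrement an operation-name count without first checking membership. A minimal standalone sketch (the `TMap` newtype is re-declared here for illustration; the `stm`, `hashable`, and `unordered-containers` packages are assumed):

```haskell
import Control.Concurrent.STM (STM, TVar, atomically, modifyTVar', newTVarIO, readTVarIO)
import Data.Hashable (Hashable)
import qualified Data.HashMap.Strict as HashMap

-- Stand-in for the TMap newtype from Hasura.GraphQL.Execute.Subscription.TMap.
newtype TMap k v = TMap {unTMap :: TVar (HashMap.HashMap k v)}

adjust :: (Hashable k) => (v -> v) -> k -> TMap k v -> STM ()
adjust f k mapTV = modifyTVar' (unTMap mapTV) $ HashMap.adjust f k

main :: IO ()
main = do
  tmap <- TMap <$> newTVarIO (HashMap.fromList [("getOrders", 2 :: Int)])
  atomically $ adjust (subtract 1) "getOrders" tmap -- 2 -> 1
  atomically $ adjust (subtract 1) "missing" tmap   -- no-op: key absent
  readTVarIO (unTMap tmap) >>= print
```

Because the absent-key case is a no-op, callers that need to distinguish "not present" from "present with count 1" still pair it with an explicit `TMap.lookup`, as the `removeLiveQuery` change above does.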
View File
@@ -121,7 +121,7 @@ parameterizedQueryHashListToObject =
       [("parameterized_query_hash", J.toJSON queryHashes)]

 newtype ParameterizedQueryHash = ParameterizedQueryHash {unParamQueryHash :: B.ByteString}
-  deriving (Show, Eq)
+  deriving (Show, Eq, Ord)

 instance J.ToJSON ParameterizedQueryHash where
   toJSON = J.String . bsToTxt . unParamQueryHash
View File
@@ -40,6 +40,7 @@ import Hasura.Server.Prometheus
     decWebsocketConnections,
     incWebsocketConnections,
   )
+import Hasura.Server.Types (MonadGetPolicies (..))
 import Hasura.Services.Network
 import Hasura.Tracing qualified as Tracing
 import Network.WebSockets qualified as WS
@@ -59,7 +60,8 @@ createWSServerApp ::
     MonadQueryTags m,
     HasResourceLimits m,
     ProvidesNetwork m,
-    Tracing.MonadTrace m
+    Tracing.MonadTrace m,
+    MonadGetPolicies m
   ) =>
   HashSet (L.EngineLogType L.Hasura) ->
   WSServerEnv impl ->
@@ -96,9 +98,10 @@ createWSServerApp enabledLogTypes serverEnv connInitTimeout licenseKeyCache = \
       onMessage enabledLogTypes getAuthMode serverEnv conn bs (wsActions sp) licenseKeyCache

     onCloseHandler conn = mask_ do
+      granularPrometheusMetricsState <- runGetPrometheusMetricsGranularity
       liftIO $ EKG.Gauge.dec $ smWebsocketConnections serverMetrics
       liftIO $ decWebsocketConnections $ pmConnections prometheusMetrics
-      onClose logger serverMetrics prometheusMetrics (_wseSubscriptionState serverEnv) conn
+      onClose logger serverMetrics prometheusMetrics (_wseSubscriptionState serverEnv) conn granularPrometheusMetricsState

 stopWSServerApp :: WSServerEnv impl -> IO ()
 stopWSServerApp wsEnv = WS.shutdown (_wseServer wsEnv)
View File
@@ -93,7 +93,7 @@ import Hasura.Server.Prometheus
     PrometheusMetrics (..),
   )
 import Hasura.Server.Telemetry.Counters qualified as Telem
-import Hasura.Server.Types (RequestId, getRequestId)
+import Hasura.Server.Types (GranularPrometheusMetricsState (..), MonadGetPolicies (..), RequestId, getRequestId)
 import Hasura.Services.Network
 import Hasura.Session
 import Hasura.Tracing qualified as Tracing
@@ -420,7 +420,8 @@ onStart ::
     MonadMetadataStorage m,
     MonadQueryTags m,
     HasResourceLimits m,
-    ProvidesNetwork m
+    ProvidesNetwork m,
+    MonadGetPolicies m
   ) =>
   HashSet (L.EngineLogType L.Hasura) ->
   Maybe (CredentialCache AgentLicenseKey) ->
@@ -679,9 +680,9 @@ onStart enabledLogTypes agentLicenseKey serverEnv wsConn shouldCaptureVariables
     E.SEOnSourceDB (E.SSLivequery actionIds liveQueryBuilder) -> do
       actionLogMapE <- fmap fst <$> runExceptT (EA.fetchActionLogResponses actionIds)
       actionLogMap <- onLeft actionLogMapE (withComplete . preExecErr requestId (Just gqlOpType))
-      opMetadataE <- liftIO $ startLiveQuery liveQueryBuilder parameterizedQueryHash requestId actionLogMap
+      granularPrometheusMetricsState <- runGetPrometheusMetricsGranularity
+      opMetadataE <- liftIO $ startLiveQuery liveQueryBuilder parameterizedQueryHash requestId actionLogMap granularPrometheusMetricsState
       lqId <- onLeft opMetadataE (withComplete . preExecErr requestId (Just gqlOpType))
       -- Update async action query subscription state
       case NE.nonEmpty (toList actionIds) of
         Nothing -> do
@@ -694,11 +695,11 @@ onStart enabledLogTypes agentLicenseKey serverEnv wsConn shouldCaptureVariables
           let asyncActionQueryLive =
                 ES.LAAQOnSourceDB $
                   ES.LiveAsyncActionQueryOnSource lqId actionLogMap $
-                    restartLiveQuery parameterizedQueryHash requestId liveQueryBuilder
+                    restartLiveQuery parameterizedQueryHash requestId liveQueryBuilder granularPrometheusMetricsState (_grOperationName reqParsed)
               onUnexpectedException err = do
                 sendError requestId err
-                stopOperation serverEnv wsConn opId (pure ()) -- Don't log in case opId don't exist
+                stopOperation serverEnv wsConn opId granularPrometheusMetricsState (pure ()) -- Don't log in case opId doesn't exist
           ES.addAsyncActionLiveQuery
             (ES._ssAsyncActions subscriptionsState)
             opId
@@ -706,7 +707,8 @@ onStart enabledLogTypes agentLicenseKey serverEnv wsConn shouldCaptureVariables
             onUnexpectedException
             asyncActionQueryLive
     E.SEOnSourceDB (E.SSStreaming rootFieldName streamQueryBuilder) -> do
-      liftIO $ startStreamingQuery rootFieldName streamQueryBuilder parameterizedQueryHash requestId
+      granularPrometheusMetricsState <- runGetPrometheusMetricsGranularity
+      liftIO $ startStreamingQuery rootFieldName streamQueryBuilder parameterizedQueryHash requestId granularPrometheusMetricsState

   liftIO $ Prometheus.Counter.inc (gqlRequestsSubscriptionSuccess gqlMetrics)
   liftIO $ logOpEv ODStarted (Just requestId) (Just parameterizedQueryHash)
@@ -894,12 +896,13 @@ onStart enabledLogTypes agentLicenseKey serverEnv wsConn shouldCaptureVariables
       liftIO $ sendCompleted Nothing Nothing
       throwError ()

-    restartLiveQuery parameterizedQueryHash requestId liveQueryBuilder lqId actionLogMap = do
-      ES.removeLiveQuery logger (_wseServerMetrics serverEnv) (_wsePrometheusMetrics serverEnv) subscriptionsState lqId
-      either (const Nothing) Just <$> startLiveQuery liveQueryBuilder parameterizedQueryHash requestId actionLogMap
+    restartLiveQuery parameterizedQueryHash requestId liveQueryBuilder granularPrometheusMetricsState maybeOperationName lqId actionLogMap = do
+      ES.removeLiveQuery logger (_wseServerMetrics serverEnv) (_wsePrometheusMetrics serverEnv) subscriptionsState lqId granularPrometheusMetricsState maybeOperationName
+      either (const Nothing) Just <$> startLiveQuery liveQueryBuilder parameterizedQueryHash requestId actionLogMap granularPrometheusMetricsState

-    startLiveQuery liveQueryBuilder parameterizedQueryHash requestId actionLogMap = do
+    startLiveQuery liveQueryBuilder parameterizedQueryHash requestId actionLogMap granularPrometheusMetricsState = do
       liveQueryE <- runExceptT $ liveQueryBuilder actionLogMap
       for liveQueryE $ \(sourceName, E.SubscriptionQueryPlan exists) -> do
         let !opName = _grOperationName q
             subscriberMetadata = ES.mkSubscriberMetadata (WS.getWSId wsConn) opId opName requestId
@@ -920,14 +923,16 @@ onStart enabledLogTypes agentLicenseKey serverEnv wsConn shouldCaptureVariables
               opName
               requestId
               liveQueryPlan
+              granularPrometheusMetricsState
               (onChange opName parameterizedQueryHash $ ES._sqpNamespace liveQueryPlan)
         liftIO $ $assertNFHere (lqId, opName) -- so we don't write thunks to mutable vars
         STM.atomically $
           -- NOTE: see crucial `lookup` check above, ensuring this doesn't clobber:
           STMMap.insert (LiveQuerySubscriber lqId, opName) opId opMap
         pure lqId

-    startStreamingQuery rootFieldName (sourceName, E.SubscriptionQueryPlan exists) parameterizedQueryHash requestId = do
+    startStreamingQuery rootFieldName (sourceName, E.SubscriptionQueryPlan exists) parameterizedQueryHash requestId granularPrometheusMetricsState = do
       let !opName = _grOperationName q
           subscriberMetadata = ES.mkSubscriberMetadata (WS.getWSId wsConn) opId opName requestId
       -- NOTE!: we mask async exceptions higher in the call stack, but it's
@@ -948,6 +953,7 @@ onStart enabledLogTypes agentLicenseKey serverEnv wsConn shouldCaptureVariables
             requestId
             (_rfaAlias rootFieldName)
             streamQueryPlan
+            granularPrometheusMetricsState
             (onChange opName parameterizedQueryHash $ ES._sqpNamespace streamQueryPlan)
       liftIO $ $assertNFHere (streamSubscriberId, opName) -- so we don't write thunks to mutable vars
       STM.atomically $
@@ -1017,7 +1023,8 @@ onMessage ::
     MonadQueryTags m,
     HasResourceLimits m,
     ProvidesNetwork m,
-    Tracing.MonadTrace m
+    Tracing.MonadTrace m,
+    MonadGetPolicies m
   ) =>
   HashSet (L.EngineLogType L.Hasura) ->
   IO AuthMode ->
@@ -1052,7 +1059,9 @@ onMessage enabledLogTypes authMode serverEnv wsConn msgRaw onMessageActions agen
             then CaptureQueryVariables
             else DoNotCaptureQueryVariables
       onStart enabledLogTypes agentLicenseKey serverEnv wsConn shouldCaptureVariables startMsg onMessageActions
-    CMStop stopMsg -> onStop serverEnv wsConn stopMsg
+    CMStop stopMsg -> do
+      granularPrometheusMetricsState <- runGetPrometheusMetricsGranularity
+      onStop serverEnv wsConn stopMsg granularPrometheusMetricsState
     -- specific to graphql-ws
     CMPing mPayload -> onPing wsConn mPayload
     CMPong _mPayload -> pure ()
@@ -1067,15 +1076,15 @@ onPing :: (MonadIO m) => WSConn -> Maybe PingPongPayload -> m ()
 onPing wsConn mPayload =
   liftIO $ sendMsg wsConn (SMPong mPayload)

-onStop :: (MonadIO m) => WSServerEnv impl -> WSConn -> StopMsg -> m ()
-onStop serverEnv wsConn (StopMsg opId) = liftIO $ do
+onStop :: (MonadIO m) => WSServerEnv impl -> WSConn -> StopMsg -> IO GranularPrometheusMetricsState -> m ()
+onStop serverEnv wsConn (StopMsg opId) granularPrometheusMetricsState = liftIO $ do
   -- When a stop message is received for an operation, it may not be present in OpMap
   -- in these cases:
   -- 1. If the operation is a query/mutation - as we remove the operation from the
   --    OpMap as soon as it is executed
   -- 2. A misbehaving client
   -- 3. A bug on our end
-  stopOperation serverEnv wsConn opId $
+  stopOperation serverEnv wsConn opId granularPrometheusMetricsState $
     L.unLogger logger $
       L.UnstructuredLog L.LevelDebug $
         fromString $
@@ -1085,17 +1094,17 @@ onStop serverEnv wsConn (StopMsg opId) = liftIO $ do
   where
     logger = _wseLogger serverEnv

-stopOperation :: WSServerEnv impl -> WSConn -> OperationId -> IO () -> IO ()
-stopOperation serverEnv wsConn opId logWhenOpNotExist = do
+stopOperation :: WSServerEnv impl -> WSConn -> OperationId -> IO GranularPrometheusMetricsState -> IO () -> IO ()
+stopOperation serverEnv wsConn opId granularPrometheusMetricsState logWhenOpNotExist = do
   opM <- liftIO $ STM.atomically $ STMMap.lookup opId opMap
   case opM of
-    Just (subscriberDetails, opNameM) -> do
-      logWSEvent logger wsConn $ EOperation $ opDet opNameM
+    Just (subscriberDetails, operationName) -> do
+      logWSEvent logger wsConn $ EOperation $ opDet operationName
       case subscriberDetails of
         LiveQuerySubscriber lqId ->
-          ES.removeLiveQuery logger (_wseServerMetrics serverEnv) (_wsePrometheusMetrics serverEnv) subscriptionState lqId
+          ES.removeLiveQuery logger (_wseServerMetrics serverEnv) (_wsePrometheusMetrics serverEnv) subscriptionState lqId granularPrometheusMetricsState operationName
         StreamingQuerySubscriber streamSubscriberId ->
-          ES.removeStreamingQuery logger (_wseServerMetrics serverEnv) (_wsePrometheusMetrics serverEnv) subscriptionState streamSubscriberId
+          ES.removeStreamingQuery logger (_wseServerMetrics serverEnv) (_wsePrometheusMetrics serverEnv) subscriptionState streamSubscriberId granularPrometheusMetricsState operationName
     Nothing -> logWhenOpNotExist
   STM.atomically $ STMMap.delete opId opMap
   where
@@ -1182,14 +1191,15 @@ onClose ::
   PrometheusMetrics ->
   ES.SubscriptionsState ->
   WSConn ->
+  IO GranularPrometheusMetricsState ->
   m ()
-onClose logger serverMetrics prometheusMetrics subscriptionsState wsConn = do
+onClose logger serverMetrics prometheusMetrics subscriptionsState wsConn granularPrometheusMetricsState = do
   logWSEvent logger wsConn EClosed
   operations <- liftIO $ STM.atomically $ ListT.toList $ STMMap.listT opMap
   liftIO $
-    for_ operations $ \(_, (subscriber, _)) ->
+    for_ operations $ \(_, (subscriber, operationName)) ->
       case subscriber of
-        LiveQuerySubscriber lqId -> ES.removeLiveQuery logger serverMetrics prometheusMetrics subscriptionsState lqId
+        LiveQuerySubscriber lqId -> ES.removeLiveQuery logger serverMetrics prometheusMetrics subscriptionsState lqId granularPrometheusMetricsState operationName
-        StreamingQuerySubscriber streamSubscriberId -> ES.removeStreamingQuery logger serverMetrics prometheusMetrics subscriptionsState streamSubscriberId
+        StreamingQuerySubscriber streamSubscriberId -> ES.removeStreamingQuery logger serverMetrics prometheusMetrics subscriptionsState streamSubscriberId granularPrometheusMetricsState operationName
   where
     opMap = _wscOpMap $ WS.getData wsConn
View File
@@ -1,3 +1,5 @@
+{-# LANGUAGE DeriveAnyClass #-}
+
 -- | Mutable references for Prometheus metrics.
 --
 -- These metrics are independent from the metrics in "Hasura.Server.Metrics".
@@ -20,20 +22,32 @@ module Hasura.Server.Prometheus
     TriggerNameLabel (..),
     GranularPrometheusMetricsState (..),
     observeHistogramWithLabel,
+    SubscriptionKindLabel (..),
+    SubscriptionLabel (..),
+    DynamicSubscriptionLabel (..),
+    streamingSubscriptionLabel,
+    liveQuerySubscriptionLabel,
+    recordMetricWithLabel,
+    recordSubcriptionMetric,
   )
 where

 import Data.HashMap.Internal.Strict qualified as Map
+import Data.HashMap.Strict qualified as HashMap
 import Data.IORef (IORef, atomicModifyIORef', newIORef, readIORef)
 import Data.Int (Int64)
+import Hasura.GraphQL.ParameterizedQueryHash
+import Hasura.GraphQL.Transport.HTTP.Protocol (OperationName (..))
 import Hasura.Prelude
 import Hasura.RQL.Types.EventTrigger (TriggerName, triggerNameToTxt)
 import Hasura.Server.Types (GranularPrometheusMetricsState (..))
+import Language.GraphQL.Draft.Syntax qualified as G
 import System.Metrics.Prometheus (ToLabels (..))
 import System.Metrics.Prometheus.Counter (Counter)
 import System.Metrics.Prometheus.Counter qualified as Counter
 import System.Metrics.Prometheus.Gauge (Gauge)
 import System.Metrics.Prometheus.Gauge qualified as Gauge
+import System.Metrics.Prometheus.GaugeVector qualified as GaugeVector
 import System.Metrics.Prometheus.Histogram (Histogram)
 import System.Metrics.Prometheus.Histogram qualified as Histogram
 import System.Metrics.Prometheus.HistogramVector (HistogramVector)
@@ -44,7 +58,6 @@ import System.Metrics.Prometheus.HistogramVector qualified as HistogramVector
 -- | Mutable references for Prometheus metrics.
 data PrometheusMetrics = PrometheusMetrics
   { pmConnections :: ConnectionsGauge,
-    pmActiveSubscriptions :: Gauge,
     pmGraphQLRequestMetrics :: GraphQLRequestMetrics,
     pmEventTriggerMetrics :: EventTriggerMetrics,
     pmWebSocketBytesReceived :: Counter,
@@ -101,7 +114,10 @@ data SubscriptionMetrics = SubscriptionMetrics
   { submActiveLiveQueryPollers :: Gauge,
     submActiveStreamingPollers :: Gauge,
     submActiveLiveQueryPollersInError :: Gauge,
-    submActiveStreamingPollersInError :: Gauge
+    submActiveStreamingPollersInError :: Gauge,
+    submTotalTime :: HistogramVector.HistogramVector SubscriptionLabel,
+    submDBExecTotalTime :: HistogramVector.HistogramVector SubscriptionLabel,
+    submActiveSubscriptions :: GaugeVector.GaugeVector SubscriptionLabel
   }

 data CacheRequestMetrics = CacheRequestMetrics
@@ -114,7 +130,6 @@ data CacheRequestMetrics = CacheRequestMetrics
 makeDummyPrometheusMetrics :: IO PrometheusMetrics
 makeDummyPrometheusMetrics = do
   pmConnections <- newConnectionsGauge
-  pmActiveSubscriptions <- Gauge.new
   pmGraphQLRequestMetrics <- makeDummyGraphQLRequestMetrics
   pmEventTriggerMetrics <- makeDummyEventTriggerMetrics
   pmWebSocketBytesReceived <- Counter.new
@@ -176,6 +191,9 @@ makeDummySubscriptionMetrics = do
   submActiveStreamingPollers <- Gauge.new
   submActiveLiveQueryPollersInError <- Gauge.new
   submActiveStreamingPollersInError <- Gauge.new
+  submTotalTime <- HistogramVector.new []
+  submDBExecTotalTime <- HistogramVector.new []
+  submActiveSubscriptions <- GaugeVector.new
   pure SubscriptionMetrics {..}

 makeDummyCacheRequestMetrics :: IO CacheRequestMetrics
@@ -239,6 +257,40 @@ instance ToLabels (Maybe TriggerNameLabel) where
   toLabels Nothing = Map.empty
   toLabels (Just (TriggerNameLabel triggerName)) = Map.singleton "trigger_name" (triggerNameToTxt triggerName)
data SubscriptionKindLabel = SubscriptionKindLabel
{ subscription_kind :: Text
}
deriving stock (Generic, Ord, Eq)
deriving anyclass (ToLabels)
streamingSubscriptionLabel :: SubscriptionKindLabel
streamingSubscriptionLabel = SubscriptionKindLabel "streaming"
liveQuerySubscriptionLabel :: SubscriptionKindLabel
liveQuerySubscriptionLabel = SubscriptionKindLabel "live-query"
data DynamicSubscriptionLabel = DynamicSubscriptionLabel
{ _dslParamQueryHash :: ParameterizedQueryHash,
_dslOperationName :: Maybe OperationName
}
deriving stock (Generic, Ord, Eq)
instance ToLabels DynamicSubscriptionLabel where
toLabels (DynamicSubscriptionLabel hash opName) =
Map.fromList $
[("parameterized_query_hash", bsToTxt $ unParamQueryHash hash)]
<> maybe [] (\op -> [("operation_name", G.unName $ _unOperationName op)]) opName
data SubscriptionLabel = SubscriptionLabel
{ _slKind :: SubscriptionKindLabel,
_slDynamicLabels :: Maybe DynamicSubscriptionLabel
}
deriving stock (Generic, Ord, Eq)
instance ToLabels SubscriptionLabel where
toLabels (SubscriptionLabel kind Nothing) = Map.fromList $ [("subscription_kind", subscription_kind kind)]
toLabels (SubscriptionLabel kind (Just dl)) = (Map.fromList $ [("subscription_kind", subscription_kind kind)]) <> toLabels dl
-- | Record metrics with dynamic label -- | Record metrics with dynamic label
recordMetricWithLabel :: recordMetricWithLabel ::
(MonadIO m) => (MonadIO m) =>
@ -280,3 +332,39 @@ observeHistogramWithLabel getMetricState alwaysObserve histogramVector label val
alwaysObserve alwaysObserve
(liftIO $ HistogramVector.observe histogramVector (Just label) value) (liftIO $ HistogramVector.observe histogramVector (Just label) value)
(liftIO $ HistogramVector.observe histogramVector Nothing value) (liftIO $ HistogramVector.observe histogramVector Nothing value)
-- | Record a subscription metric for all the operation names present in the subscription.
-- Use this when you want to update the same value of the metric for all the operation names.
recordSubcriptionMetric ::
(MonadIO m) =>
(IO GranularPrometheusMetricsState) ->
-- should the metric be observed without a label when granularMetricsState is OFF
Bool ->
HashMap (Maybe OperationName) Int ->
ParameterizedQueryHash ->
SubscriptionKindLabel ->
  -- the metric action to perform
(SubscriptionLabel -> IO ()) ->
m ()
recordSubcriptionMetric getMetricState alwaysObserve operationNamesMap parameterizedQueryHash subscriptionKind metricAction = do
-- if no operation names are present, then emit metric with only param query hash as dynamic label
if (null operationNamesMap)
then do
let promMetricGranularLabel = SubscriptionLabel subscriptionKind (Just $ DynamicSubscriptionLabel parameterizedQueryHash Nothing)
promMetricLabel = SubscriptionLabel subscriptionKind Nothing
recordMetricWithLabel
getMetricState
alwaysObserve
(metricAction promMetricGranularLabel)
(metricAction promMetricLabel)
else -- if operationNames are present, then emit the same metric for all the operation names
do
let operationNames = HashMap.keys operationNamesMap
for_ operationNames $ \opName -> do
let promMetricGranularLabel = SubscriptionLabel subscriptionKind (Just $ DynamicSubscriptionLabel parameterizedQueryHash opName)
promMetricLabel = SubscriptionLabel subscriptionKind Nothing
recordMetricWithLabel
getMetricState
alwaysObserve
(metricAction promMetricGranularLabel)
(metricAction promMetricLabel)
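The emission logic of `recordSubcriptionMetric` can be modelled with a small self-contained sketch. The types here (`String` for the query hash and kind, `Data.Map` instead of `HashMap`, a local `Label` record) are simplifications assumed for illustration; only the branching shape mirrors the code above:

```haskell
import Data.Foldable (for_)
import Data.IORef
import qualified Data.Map.Strict as Map

data Granularity = GranularMetricsOff | GranularMetricsOn deriving (Eq, Show)

data Label = Label
  { lKind :: String
  , lDynamic :: Maybe (String, Maybe String) -- (query hash, operation name)
  } deriving (Eq, Ord, Show)

-- Model of recordMetricWithLabel: consult the granularity state at emission
-- time and pick either the granular or the plain label.
recordWithLabel :: IO Granularity -> IO () -> IO () -> IO ()
recordWithLabel getState granular plain = do
  st <- getState
  case st of
    GranularMetricsOn  -> granular
    GranularMetricsOff -> plain

-- Model of recordSubcriptionMetric (spelled correctly here): emit once per
-- operation name, or once with only the query hash when none are known.
recordSubscriptionMetric ::
  IO Granularity -> Map.Map (Maybe String) Int -> String -> String -> (Label -> IO ()) -> IO ()
recordSubscriptionMetric getState opNames queryHash kind action
  | Map.null opNames =
      recordWithLabel getState
        (action (Label kind (Just (queryHash, Nothing))))
        (action (Label kind Nothing))
  | otherwise =
      for_ (Map.keys opNames) $ \opName ->
        recordWithLabel getState
          (action (Label kind (Just (queryHash, opName))))
          (action (Label kind Nothing))

main :: IO ()
main = do
  seen <- newIORef []
  let action lbl = modifyIORef seen (lbl :)
      ops = Map.fromList [(Just "listAlbums", 1), (Nothing, 2)]
  recordSubscriptionMetric (pure GranularMetricsOn) ops "abc123" "live-query" action
  labels <- readIORef seen
  print (length labels) -- one emission per operation-name key
```

Note that a subscription with two operation-name keys produces two emissions of the same metric, which is exactly the "update the same value for all the operation names" behaviour the Haddock comment describes.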


@@ -159,11 +159,11 @@ instance ToJSON ApolloFederationStatus where
-- | Whether or not to enable granular metrics for Prometheus.
--
-- `GranularMetricsOn` will enable the dynamic labels for the metrics.
-- `GranularMetricsOff` will disable the dynamic labels for the metrics.
--
-- **Warning**: Enabling dynamic labels for Prometheus metrics can cause cardinality
-- issues and can cause memory usage to increase.
data GranularPrometheusMetricsState
  = GranularMetricsOff
  | GranularMetricsOn
@@ -182,6 +182,11 @@ instance ToJSON GranularPrometheusMetricsState where
class Monad m => MonadGetPolicies m where
  runGetApiTimeLimit ::
    m (Maybe MaxTime)

  -- 'GranularPrometheusMetricsState' is used to decide if dynamic labels need to be
  -- added when emitting the prometheus metric. Its state can be changed dynamically
  -- via policies, so we need to fetch the value from the policy every time before
  -- emitting the metric. Thus we create an IO action which fetches the value.
  runGetPrometheusMetricsGranularity ::
    m (IO GranularPrometheusMetricsState)
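The reason for returning an `IO GranularPrometheusMetricsState` rather than a plain value can be sketched with a mutable reference: the state is read at every emission, so a runtime policy change takes effect without re-plumbing anything. This is a simplified model with `String` states, not the actual policy machinery:

```haskell
import Data.IORef

main :: IO ()
main = do
  ref <- newIORef "off"
  let getState = readIORef ref   -- fetched at every emission, not snapshotted once
      emit = do
        st <- getState
        putStrLn ("emitting with granularity " ++ st)
  emit                -- uses the value at this moment: off
  writeIORef ref "on" -- policy changes at runtime
  emit                -- picks up the new value automatically
```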


@@ -42,7 +42,7 @@ import Hasura.RQL.Types.Roles (RoleName, mkRoleName)
import Hasura.Server.Init (considerEnv, databaseUrlOption, runWithEnv, _envVar)
import Hasura.Server.Metrics (createServerMetrics)
import Hasura.Server.Prometheus (makeDummyPrometheusMetrics)
import Hasura.Server.Types (GranularPrometheusMetricsState (..), RequestId (..))
import Language.GraphQL.Draft.Syntax.QQ qualified as G
import ListT qualified
import StmContainers.Map qualified as STMMap
@@ -95,6 +95,9 @@ getStaticCohortSnapshot (Cohort cohortId _respRef existingSubsTV newSubsTV _) =
streamingSubscriptionPollingSpec :: SourceConfig ('Postgres 'Vanilla) -> Spec
streamingSubscriptionPollingSpec srcConfig = do
dummyServerStore <- runIO newStore
dummyServerMetrics <- runIO $ createServerMetrics dummyServerStore
dummyPromMetrics <- runIO makeDummyPrometheusMetrics
  let setupDDLTx =
        PG.unitQE
          defaultTxErrorHandler
@@ -133,6 +136,7 @@ streamingSubscriptionPollingSpec srcConfig = do
  pollerId <- runIO $ PollerId <$> UUID.nextRandom
  pollerResponseState <- runIO $ STM.newTVarIO PRSSuccess
emptyOperationNamesMap <- runIO $ STM.atomically $ TMap.new
  let defaultSubscriptionOptions = mkSubscriptionsOptions Nothing Nothing -- use default values
      paramQueryHash = mkUnsafeParameterizedQueryHash "random"
      -- hardcoded multiplexed query which is generated for the following GraphQL query:
@@ -151,7 +155,6 @@ streamingSubscriptionPollingSpec srcConfig = do
          ORDER BY "root.pg.id" ASC ) AS "_2_root" ) AS "numbers_stream" )
          AS "_fld_resp" ON ('true')
    |]
dummyPrometheusMetrics <- runIO makeDummyPrometheusMetrics
  let pollingAction cohortMap testSyncAction =
        pollStreamingQuery
          @('Postgres 'Vanilla)
@@ -166,7 +169,10 @@ streamingSubscriptionPollingSpec srcConfig = do
          [G.name|randomRootField|]
          (const $ pure ())
          testSyncAction
          dummyPromMetrics
(pure GranularMetricsOff)
emptyOperationNamesMap
Nothing
      mkSubscriber sId =
        let wsId = maybe (error "Invalid UUID") WS.mkUnsafeWSId $ UUID.fromString "ec981f92-8d5a-47ab-a306-80af7cfb1113"
@@ -218,7 +224,7 @@ streamingSubscriptionPollingSpec srcConfig = do
        TMap.reset cohortMap
        TMap.insert cohort1 cohortKey1 cohortMap
      runIO $ pollingAction cohortMap Nothing
      currentCohortMap <- runIO $ STM.atomically $ TMap.getMap cohortMap
      it "the key of the cohort1 should have been moved from the cohortKey1 to cohortKey2, so it should not be found anymore at cohortKey1" $ do
@@ -241,7 +247,7 @@ streamingSubscriptionPollingSpec srcConfig = do
            MVar.readMVar syncMVar
            STM.atomically $ TMap.insert cohort2 cohortKey3 cohortMap
      Async.withAsync
        (pollingAction cohortMap (Just syncAction))
        ( \pollAsync -> do
            MVar.putMVar syncMVar ()
            Async.wait pollAsync
@@ -264,7 +270,7 @@ streamingSubscriptionPollingSpec srcConfig = do
            MVar.readMVar syncMVar
            STM.atomically $ TMap.delete cohortKey1 cohortMap
      Async.withAsync
        (pollingAction cohortMap (Just syncAction))
        ( \pollAsync -> do
            MVar.putMVar syncMVar ()
            Async.wait pollAsync
@@ -283,7 +289,7 @@ streamingSubscriptionPollingSpec srcConfig = do
            MVar.readMVar syncMVar
            STM.atomically $ addSubscriberToCohort newTemporarySubscriber cohort1
      Async.withAsync
        (pollingAction cohortMap (Just syncAction))
        ( \pollAsync -> do
            -- concurrently inserting a new cohort to a key (cohortKey2) to which
            -- cohort1 is expected to be associated after the current poll
@@ -315,7 +321,7 @@ streamingSubscriptionPollingSpec srcConfig = do
            MVar.readMVar syncMVar
            STM.atomically $ TMap.delete temporarySubscriberId (_cNewSubscribers cohort1)
      Async.withAsync
        (pollingAction cohortMap (Just syncAction))
        ( \pollAsync -> do
            MVar.putMVar syncMVar ()
            Async.wait pollAsync
@@ -339,10 +345,6 @@ streamingSubscriptionPollingSpec srcConfig = do
            TMap.delete temporarySubscriberId (_cNewSubscribers cohort1)
  describe "Adding two subscribers concurrently" $ do
dummyServerStore <- runIO newStore
dummyServerMetrics <- runIO $ createServerMetrics dummyServerStore
dummyPromMetrics <- runIO makeDummyPrometheusMetrics
    subscriptionState <- do
      runIO $ initSubscriptionsState (const (pure ()))
@@ -389,6 +391,7 @@ streamingSubscriptionPollingSpec srcConfig = do
            reqId
            [G.name|numbers_stream|]
            subscriptionQueryPlan
(pure GranularMetricsOff)
            (const (pure ()))
    it "concurrently adding two subscribers should retain both of them in the poller map" $ do
@@ -403,7 +406,7 @@ streamingSubscriptionPollingSpec srcConfig = do
      streamQueryMapEntries <- STM.atomically $ ListT.toList $ STMMap.listT streamQueryMap
      length streamQueryMapEntries `shouldBe` 1
      let (pollerKey, (Poller currentCohortMap _ ioState _ _)) = head streamQueryMapEntries
      cohorts <- STM.atomically $ TMap.toList currentCohortMap
      length cohorts `shouldBe` 1
      let (_cohortKey, Cohort _ _ curSubsTV newSubsTV _) = head cohorts