graphql-engine/server/src-lib/Hasura/RQL/DDL/Schema/Cache/Common.hs
Anon Ray 06b599b747 server: multitenant metadata storage
The metadata storage implementation for graphql-engine-multitenant.

- It uses a centralized PG database to store the metadata of all tenants (instead of a per-tenant database)
- Similarly, it uses a single schema-sync listener thread per MT worker (instead of a listener thread per tenant); note that the processor thread is still spawned per tenant
- Two new flags are introduced: `--metadataDatabaseUrl` and the optional `--metadataDatabaseRetries`

Internally, a "metadata mode" is introduced to distinguish an external/managed store from a store managed by each pro-server.
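
A minimal sketch of how such a mode could be modeled as a sum type (the actual type and constructor names in the code may differ):

```haskell
-- Sketch only: the names below are illustrative, not the real definitions.
data MetadataMode
  = MetadataStoreInternal -- metadata store managed by each pro-server itself
  | MetadataStoreExternal -- centralized, externally managed metadata database
```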

To run:
- obtain the schema file (located at `pro/server/res/cloud/metadata_db_schema.sql`)
- apply the schema on a PG database
- set the `--metadataDatabaseUrl` flag to point to the above database
- run the MT executable

The schema (and its migrations) for the metadata db is managed outside the MT worker.

### New metadata

The following is the new portion added to `Metadata`:

```yaml
version: 3
metrics_config:
  analyze_query_variables: true
  analyze_response_body: false
api_limits:
  disabled: false
  depth_limit:
    global: 5
    per_role:
      user: 7
      editor: 9
  rate_limit:
    per_role:
      user:
        unique_params:
        - x-hasura-user-id
        - x-hasura-team-id
        max_reqs_per_min: 20
    global:
      unique_params: IP
      max_reqs_per_min: 10
```

- In Pro, the code around fetching/updating/syncing the pro config is removed
- That also means `hdb_pro_catalog`, which kept the config cache, is no longer required, so it is removed as well
- The required config now comes from the metadata / schema cache

### New Metadata APIs

- `set_api_limits`
- `remove_api_limits`
- `set_metrics_config`
- `remove_metrics_config`

#### `set_api_limits`

```yaml
type: set_api_limits
args:
  disabled: false
  depth_limit:
    global: 5
    per_role:
      user: 7
      editor: 9
  rate_limit:
    per_role:
      anonymous:
         max_reqs_per_min: 10
         unique_params: "ip"
      editor:
        max_reqs_per_min: 30
        unique_params:
        - x-hasura-user-id
      user:
        unique_params:
        - x-hasura-user-id
        - x-hasura-team-id
        max_reqs_per_min: 20
    global:
      unique_params: IP
      max_reqs_per_min: 10
```

#### `remove_api_limits`

```yaml
type: remove_api_limits
args: {}
```

#### `set_metrics_config`

```yaml
type: set_metrics_config
args:
  analyze_query_variables: true
  analyze_response_body: false
```

#### `remove_metrics_config`

```yaml
type: remove_metrics_config
args: {}
```

#### TODO
- [x] on-prem pro implementation for `MonadMetadataStorage`
- [x] move the project config from Lux to pro metadata (PR: #379)
- [ ] console changes for pro config/api limits, subscription workers (cc @soorajshankar @beerose)
- [x] address other minor TODOs
  - [x] TxIso for `MonadSourceResolver`
  - [x] enable EKG connection pool metrics
  - [x] add logging of connection info when sources are added?
  - [x] confirm if the `buildReason` for schema cache is correct
- [ ] testing
- [x] 1.3 -> 1.4 cloud migration script (#465; PR: #508)
  - [x] one-time migration of existing metadata from users' db to centralized PG
  - [x] one-time migration of pro project config + api limits + regression tests from the metrics API to metadata
- [ ] integrate with infra team (WIP - cc @hgiasac)
  - [x] benchmark with 1000+ tenants + each tenant making read/update metadata query every second (PR: https://github.com/hasura/graphql-engine-mono/pull/411)
  - [ ] benchmark with a few tenants having large metadata (100+ tables, etc.)
  - [ ] when user moves regions (https://github.com/hasura/lux/issues/1717)
    - [ ] metadata has to be migrated from one regional PG to another
    - [ ] migrate metrics data as well?
      - [ ] operation logs
      - [ ] regression test runs

- [ ] find a way to share the schema files with the infra team

Co-authored-by: Naveen Naidu <30195193+Naveenaidu@users.noreply.github.com>
GitOrigin-RevId: 39e8361f2c0e96e0f9e8f8fb45e6cc14857f31f1
2021-02-11 17:55:21 +00:00

{-# LANGUAGE Arrows               #-}
{-# LANGUAGE RecordWildCards      #-}
{-# LANGUAGE UndecidableInstances #-}

-- | Types/functions shared between modules that implement "Hasura.RQL.DDL.Schema.Cache". Other
-- modules should not import this module directly.
module Hasura.RQL.DDL.Schema.Cache.Common where

import           Hasura.Prelude

import qualified Data.HashMap.Strict.Extended       as M
import qualified Data.HashMap.Strict.InsOrd         as OMap
import qualified Data.HashSet                       as HS
import qualified Data.Sequence                      as Seq
import qualified Network.HTTP.Client.Extended       as HTTP

import           Control.Arrow.Extended
import           Control.Lens
import           Control.Monad.Trans.Control        (MonadBaseControl)
import           Control.Monad.Unique
import           Data.Text.Extended

import qualified Hasura.Incremental                 as Inc

import           Hasura.Backends.Postgres.SQL.Types
import           Hasura.RQL.Types

-- | 'InvalidationKeys' used to apply requested 'CacheInvalidations'.
data InvalidationKeys = InvalidationKeys
  { _ikMetadata      :: !Inc.InvalidationKey
  , _ikRemoteSchemas :: !(HashMap RemoteSchemaName Inc.InvalidationKey)
  , _ikSources       :: !(HashMap SourceName Inc.InvalidationKey)
  } deriving (Show, Eq, Generic)
instance Inc.Cacheable InvalidationKeys
instance Inc.Select InvalidationKeys
$(makeLenses ''InvalidationKeys)

initialInvalidationKeys :: InvalidationKeys
initialInvalidationKeys = InvalidationKeys Inc.initialInvalidationKey mempty mempty

invalidateKeys :: CacheInvalidations -> InvalidationKeys -> InvalidationKeys
invalidateKeys CacheInvalidations{..} InvalidationKeys{..} = InvalidationKeys
  { _ikMetadata      = if ciMetadata then Inc.invalidate _ikMetadata else _ikMetadata
  , _ikRemoteSchemas = foldl' (flip invalidate) _ikRemoteSchemas ciRemoteSchemas
  , _ikSources       = foldl' (flip invalidate) _ikSources ciSources
  }
  where
    invalidate
      :: (Eq a, Hashable a)
      => a -> HashMap a Inc.InvalidationKey -> HashMap a Inc.InvalidationKey
    invalidate = M.alter $ Just . maybe Inc.initialInvalidationKey Inc.invalidate

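-- Example (illustrative): a full metadata invalidation bumps only
-- '_ikMetadata'; per-remote-schema and per-source keys are bumped just for the
-- names listed in the 'CacheInvalidations' sets:
--
--   invalidateKeys
--     CacheInvalidations { ciMetadata = True, ciRemoteSchemas = mempty, ciSources = mempty }
--     initialInvalidationKeys
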
data TableBuildInput
  = TableBuildInput
  { _tbiName          :: !QualifiedTable
  , _tbiIsEnum        :: !Bool
  , _tbiConfiguration :: !TableConfig
  } deriving (Show, Eq, Generic)
instance NFData TableBuildInput
instance Inc.Cacheable TableBuildInput

data NonColumnTableInputs
  = NonColumnTableInputs
  { _nctiTable               :: !QualifiedTable
  , _nctiObjectRelationships :: ![ObjRelDef]
  , _nctiArrayRelationships  :: ![ArrRelDef]
  , _nctiComputedFields      :: ![ComputedFieldMetadata]
  , _nctiRemoteRelationships :: ![RemoteRelationshipMetadata]
  } deriving (Show, Eq, Generic)
-- instance NFData NonColumnTableInputs
-- instance Inc.Cacheable NonColumnTableInputs

data TablePermissionInputs
  = TablePermissionInputs
  { _tpiTable  :: !QualifiedTable
  , _tpiInsert :: ![InsPermDef 'Postgres]
  , _tpiSelect :: ![SelPermDef 'Postgres]
  , _tpiUpdate :: ![UpdPermDef 'Postgres]
  , _tpiDelete :: ![DelPermDef 'Postgres]
  } deriving (Show, Eq, Generic)
instance Inc.Cacheable TablePermissionInputs

mkTableInputs :: TableMetadata -> (TableBuildInput, NonColumnTableInputs, TablePermissionInputs)
mkTableInputs TableMetadata{..} =
  (buildInput, nonColumns, permissions)
  where
    buildInput = TableBuildInput _tmTable _tmIsEnum _tmConfiguration
    nonColumns = NonColumnTableInputs _tmTable
      (OMap.elems _tmObjectRelationships)
      (OMap.elems _tmArrayRelationships)
      (OMap.elems _tmComputedFields)
      (OMap.elems _tmRemoteRelationships)
    permissions = TablePermissionInputs _tmTable
      (OMap.elems _tmInsertPermissions)
      (OMap.elems _tmSelectPermissions)
      (OMap.elems _tmUpdatePermissions)
      (OMap.elems _tmDeletePermissions)

-- | The direct output of 'buildSchemaCacheRule'. Contains most of the things necessary to build a
-- schema cache, but dependencies and inconsistent metadata objects are collected via a separate
-- 'MonadWriter' side channel.
data BuildOutputs (b :: BackendType)
  = BuildOutputs
  { _boSources       :: SourceCache
  , _boActions       :: !(ActionCache b)
  , _boRemoteSchemas :: !(HashMap RemoteSchemaName (RemoteSchemaCtx, MetadataObject))
  -- ^ We preserve the 'MetadataObject' from the original catalog metadata in the output so we can
  -- reuse it later if we need to mark the remote schema inconsistent during GraphQL schema
  -- generation (because of field conflicts).
  , _boAllowlist     :: !(HS.HashSet GQLQuery)
  , _boCustomTypes   :: !(AnnotatedCustomTypes b)
  , _boCronTriggers  :: !(M.HashMap TriggerName CronTriggerInfo)
  , _boEndpoints     :: !(M.HashMap EndpointName (EndpointMetadata GQLQueryWithText))
  , _boApiLimits     :: !ApiLimit
  , _boMetricsConfig :: !MetricsConfig
  }
$(makeLenses ''BuildOutputs)

-- | Parameters required for schema cache build
data CacheBuildParams
  = CacheBuildParams
  { _cbpManager         :: !HTTP.Manager
  , _cbpSourceResolver  :: !SourceResolver
  , _cbpServerConfigCtx :: !ServerConfigCtx
  }

-- | The monad in which @'RebuildableSchemaCache' is being run
newtype CacheBuild a
  = CacheBuild { unCacheBuild :: ReaderT CacheBuildParams (ExceptT QErr IO) a }
  deriving ( Functor, Applicative, Monad
           , MonadError QErr
           , MonadReader CacheBuildParams
           , MonadIO
           , MonadBase IO
           , MonadBaseControl IO
           , MonadUnique
           )

instance HTTP.HasHttpManagerM CacheBuild where
  askHttpManager = asks _cbpManager

instance HasServerConfigCtx CacheBuild where
  askServerConfigCtx = asks _cbpServerConfigCtx

instance MonadResolveSource CacheBuild where
  getSourceResolver = asks _cbpSourceResolver

runCacheBuild
  :: ( MonadIO m
     , MonadError QErr m
     )
  => CacheBuildParams -> CacheBuild a -> m a
runCacheBuild params (CacheBuild m) = do
  liftEitherM $ liftIO $ runExceptT (runReaderT m params)

runCacheBuildM
  :: ( MonadIO m
     , MonadError QErr m
     , HTTP.HasHttpManagerM m
     , HasServerConfigCtx m
     , MonadResolveSource m
     )
  => CacheBuild a -> m a
runCacheBuildM m = do
  params <- CacheBuildParams
            <$> HTTP.askHttpManager
            <*> getSourceResolver
            <*> askServerConfigCtx
  runCacheBuild params m

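-- Usage sketch: 'runCacheBuild' is for callers that already hold a
-- 'CacheBuildParams'; 'runCacheBuildM' assembles the parameters from the
-- ambient monad's capabilities instead. Illustratively, with 'action' being
-- any @CacheBuild a@ value:
--
--   result <- runCacheBuildM action
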
data RebuildableSchemaCache
  = RebuildableSchemaCache
  { lastBuiltSchemaCache :: !SchemaCache
  , _rscInvalidationMap  :: !InvalidationKeys
  , _rscRebuild          :: !(Inc.Rule (ReaderT BuildReason CacheBuild) (Metadata, InvalidationKeys) SchemaCache)
  }
$(makeLenses ''RebuildableSchemaCache)

bindErrorA
  :: (ArrowChoice arr, ArrowKleisli m arr, ArrowError e arr, MonadError e m)
  => arr (m a) a
bindErrorA = liftEitherA <<< arrM \m -> (Right <$> m) `catchError` (pure . Left)
{-# INLINE bindErrorA #-}

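-- For example (sketch), a monadic action can be embedded in arrow notation
-- with its 'MonadError' failures rerouted through the arrow's 'ArrowError'
-- channel; 'someAction' is a placeholder for any @m a@ computation:
--
--   proc x -> do
--     result <- bindErrorA -< someAction x
--     returnA -< result
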
withRecordDependencies
  :: (ArrowWriter (Seq CollectedInfo) arr)
  => WriterA (Seq SchemaDependency) arr (e, s) a
  -> arr (e, (MetadataObject, (SchemaObjId, s))) a
withRecordDependencies f = proc (e, (metadataObject, (schemaObjectId, s))) -> do
  (result, dependencies) <- runWriterA f -< (e, s)
  recordDependencies -< (metadataObject, schemaObjectId, toList dependencies)
  returnA -< result
{-# INLINABLE withRecordDependencies #-}

noDuplicates
  :: (ArrowChoice arr, ArrowWriter (Seq CollectedInfo) arr)
  => (a -> MetadataObject)
  -> [a] `arr` Maybe a
noDuplicates mkMetadataObject = proc values -> case values of
  []      -> returnA -< Nothing
  [value] -> returnA -< Just value
  value:_ -> do
    let objectId    = _moId $ mkMetadataObject value
        definitions = map (_moDefinition . mkMetadataObject) values
    tellA -< Seq.singleton $ CIInconsistency (DuplicateObjects objectId definitions)
    returnA -< Nothing
{-# INLINABLE noDuplicates #-}

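-- Behaviour sketch:
--
--   []           ~> Nothing
--   [x]          ~> Just x
--   two or more  ~> Nothing, and a 'DuplicateObjects' inconsistency (carrying
--                   all the duplicates' metadata definitions) is written to
--                   the 'Seq CollectedInfo' writer channel
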
-- | Processes a list of catalog metadata into a map of processed information, marking any duplicate
-- entries inconsistent.
buildInfoMap
  :: ( ArrowChoice arr, Inc.ArrowDistribute arr, ArrowWriter (Seq CollectedInfo) arr
     , Eq k, Hashable k )
  => (a -> k)
  -> (a -> MetadataObject)
  -> (e, a) `arr` Maybe b
  -> (e, [a]) `arr` HashMap k b
buildInfoMap extractKey mkMetadataObject buildInfo = proc (e, infos) ->
      (M.groupOn extractKey infos >- returnA)
  >-> (| Inc.keyed (\_ duplicateInfos ->
           (duplicateInfos >- noDuplicates mkMetadataObject)
       >-> (| traverseA (\info -> (e, info) >- buildInfo) |)
       >-> (\info -> join info >- returnA)) |)
  >-> (\infoMap -> M.catMaybes infoMap >- returnA)
{-# INLINABLE buildInfoMap #-}

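-- Usage sketch (illustrative; 'mkTableMetadataObject' and 'buildTableInfo' are
-- placeholders for caller-supplied functions, not names from this module):
--
--   tableMap <- buildInfoMap _tbiName mkTableMetadataObject buildTableInfo -< (e, tableInputs)
--
-- This groups 'tableInputs' by '_tbiName', marks duplicates inconsistent via
-- 'noDuplicates', and keeps only the inputs that build successfully.
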
-- | Like 'buildInfoMap', but includes each processed info's associated 'MetadataObject' in the
-- result. This is useful if the results will be further processed, and the 'MetadataObject' is
-- still needed to mark the object inconsistent.
buildInfoMapPreservingMetadata
  :: ( ArrowChoice arr, Inc.ArrowDistribute arr, ArrowWriter (Seq CollectedInfo) arr
     , Eq k, Hashable k )
  => (a -> k)
  -> (a -> MetadataObject)
  -> (e, a) `arr` Maybe b
  -> (e, [a]) `arr` HashMap k (b, MetadataObject)
buildInfoMapPreservingMetadata extractKey mkMetadataObject buildInfo =
  buildInfoMap extractKey mkMetadataObject proc (e, info) ->
    ((e, info) >- buildInfo) >-> \result -> result <&> (, mkMetadataObject info) >- returnA
{-# INLINABLE buildInfoMapPreservingMetadata #-}

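-- Illustrative: for a table rendering (via 'ToTxt') as "author" and the error
-- "no such column", the helper below yields "in table author: no such column".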
addTableContext :: QualifiedTable -> Text -> Text
addTableContext tableName e = "in table " <> tableName <<> ": " <> e