graphql-engine/server/src-lib/Hasura/RQL/DDL/Metadata.hs

296 lines
12 KiB
Haskell
Raw Normal View History

2018-06-27 16:11:32 +03:00
module Hasura.RQL.DDL.Metadata
( runReplaceMetadata
, runReplaceMetadataV2
, runExportMetadata
, runExportMetadataV2
, runClearMetadata
, runReloadMetadata
, runDumpInternalState
, runGetInconsistentMetadata
, runDropInconsistentMetadata
, runGetCatalogState
, runSetCatalogState
server: multitenant metadata storage The metadata storage implementation for graphql-engine-multitenant. - It uses a centralized PG database to store metadata of all tenants (instead of per tenant database) - Similarly, it uses a single schema-sync listener thread per MT worker (instead of listener thread per tenant) (PS: although, the processor thread is spawned per tenant) - 2 new flags are introduced - `--metadataDatabaseUrl` and (optional) `--metadataDatabaseRetries` Internally, a "metadata mode" is introduced to indicate an external/managed store vs a store managed by each pro-server. To run : - obtain the schema file (located at `pro/server/res/cloud/metadata_db_schema.sql`) - apply the schema on a PG database - set the `--metadataDatabaseUrl` flag to point to the above database - run the MT executable The schema (and its migrations) for the metadata db is managed outside the MT worker. ### New metadata The following is the new portion of `Metadata` added : ```yaml version: 3 metrics_config: analyze_query_variables: true analyze_response_body: false api_limits: disabled: false depth_limit: global: 5 per_role: user: 7 editor: 9 rate_limit: per_role: user: unique_params: - x-hasura-user-id - x-hasura-team-id max_reqs_per_min: 20 global: unique_params: IP max_reqs_per_min: 10 ``` - In Pro, the code around fetching/updating/syncing pro-config is removed - That also means, `hdb_pro_catalog` for keeping the config cache is not required. Hence the `hdb_pro_catalog` is also removed - The required config comes from metadata / schema cache ### New Metadata APIs - `set_api_limits` - `remove_api_limits` - `set_metrics_config` - `remove_metrics_config` #### `set_api_limits` ```yaml type: set_api_limits args: disabled: false depth_limit: global: 5 per_role: user: 7 editor: 9 rate_limit: per_role: anonymous: max_reqs_per_min: 10 unique_params: "ip" editor: max_reqs_per_min: 30 unique_params: - x-hasura-user-id user: unique_params: - x-hasura-user-id - x-hasura-team-id max_reqs_per_min: 20 global: unique_params: IP max_reqs_per_min: 10 ``` #### `remove_api_limits` ```yaml type: remove_api_limits args: {} ``` #### `set_metrics_config` ```yaml type: set_metrics_config args: analyze_query_variables: true analyze_response_body: false ``` #### `remove_metrics_config` ```yaml type: remove_metrics_config args: {} ``` #### TODO - [x] on-prem pro implementation for `MonadMetadataStorage` - [x] move the project config from Lux to pro metadata (PR: #379) - [ ] console changes for pro config/api limits, subscription workers (cc @soorajshankar @beerose) - [x] address other minor TODOs - [x] TxIso for `MonadSourceResolver` - [x] enable EKG connection pool metrics - [x] add logging of connection info when sources are added? - [x] confirm if the `buildReason` for schema cache is correct - [ ] testing - [x] 1.3 -> 1.4 cloud migration script (#465; PR: #508) - [x] one-time migration of existing metadata from users' db to centralized PG - [x] one-time migration of pro project config + api limits + regression tests from metrics API to metadata - [ ] integrate with infra team (WIP - cc @hgiasac) - [x] benchmark with 1000+ tenants + each tenant making read/update metadata query every second (PR: https://github.com/hasura/graphql-engine-mono/pull/411) - [ ] benchmark with few tenants having large metadata (100+ tables etc.) - [ ] when user moves regions (https://github.com/hasura/lux/issues/1717) - [ ] metadata has to be migrated from one regional PG to another - [ ] migrate metrics data as well ? - [ ] operation logs - [ ] regression test runs - [ ] find a way to share the schema files with the infra team Co-authored-by: Naveen Naidu <30195193+Naveenaidu@users.noreply.github.com> GitOrigin-RevId: 39e8361f2c0e96e0f9e8f8fb45e6cc14857f31f1
2021-02-11 20:54:25 +03:00
, runSetMetricsConfig
, runRemoveMetricsConfig
, module Hasura.RQL.DDL.Metadata.Types
2018-06-27 16:11:32 +03:00
) where
import Hasura.Prelude
2018-06-27 16:11:32 +03:00
import qualified Data.Aeson.Ordered as AO
import qualified Data.HashMap.Strict as HM
import qualified Data.HashMap.Strict.InsOrd as OMap
import qualified Data.HashSet as HS
import qualified Data.List as L
import Control.Lens ((.~), (^?))
import Data.Aeson
import Data.Typeable (cast)
2018-06-27 16:11:32 +03:00
import Hasura.Backends.Postgres.DDL.Table (delTriggerQ)
import Hasura.Metadata.Class
import Hasura.RQL.DDL.Action
import Hasura.RQL.DDL.ComputedField
import Hasura.RQL.DDL.CustomTypes
import Hasura.RQL.DDL.Endpoint
import Hasura.RQL.DDL.EventTrigger
import Hasura.RQL.DDL.Permission
import Hasura.RQL.DDL.Relationship
import Hasura.RQL.DDL.RemoteRelationship
import Hasura.RQL.DDL.RemoteSchema
import Hasura.RQL.DDL.ScheduledTrigger
import Hasura.RQL.DDL.Schema
import Hasura.EncJSON
import Hasura.RQL.DDL.Metadata.Types
2018-06-27 16:11:32 +03:00
import Hasura.RQL.Types
runClearMetadata
:: (MonadIO m, CacheRWM m, MetadataM m, MonadError QErr m)
=> ClearMetadata -> m EncJSON
runClearMetadata _ = do
metadata <- getMetadata
-- We can infer whether the server is started with `--database-url` option
-- (or corresponding env variable) by checking the existence of @'defaultSource'
-- in current metadata.
let maybeDefaultSourceMetadata = metadata ^? metaSources.ix defaultSource
emptyMetadata' = case maybeDefaultSourceMetadata of
Nothing -> emptyMetadata
Just (BackendSourceMetadata defaultSourceMetadata) ->
-- If default postgres source is defined, we need to set metadata
-- which contains only default source without any tables and functions.
let emptyDefaultSource = BackendSourceMetadata $
SourceMetadata defaultSource mempty mempty $ _smConfiguration defaultSourceMetadata
in emptyMetadata
& metaSources %~ OMap.insert defaultSource emptyDefaultSource
runReplaceMetadataV1 $ RMWithSources emptyMetadata'
{- Note [Clear postgres schema for dropped triggers]
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
There was an issue (https://github.com/hasura/graphql-engine/issues/5461)
fixed (via https://github.com/hasura/graphql-engine/pull/6137) related to
event triggers while replacing metadata in the catalog prior to metadata
separation. The metadata separation solves the issue naturally, since the
'hdb_catalog.event_triggers' table is no more in use and new/updated event
triggers are processed in building schema cache. But we need to drop the
pg trigger and archive events for dropped event triggers. This is handled
explicitly in @'runReplaceMetadata' function.
-}
2018-06-27 16:11:32 +03:00
runReplaceMetadata
:: ( QErrM m
, CacheRWM m
, MetadataM m
, MonadIO m
)
=> ReplaceMetadata -> m EncJSON
runReplaceMetadata = \case
RMReplaceMetadataV1 v1args -> runReplaceMetadataV1 v1args
RMReplaceMetadataV2 v2args -> runReplaceMetadataV2 v2args
runReplaceMetadataV1
:: ( QErrM m
, CacheRWM m
, MetadataM m
, MonadIO m
)
=> ReplaceMetadataV1 -> m EncJSON
runReplaceMetadataV1 =
(successMsg <$) . runReplaceMetadataV2 . ReplaceMetadataV2 NoAllowInconsistentMetadata
runReplaceMetadataV2
:: forall m
. ( QErrM m
, CacheRWM m
, MetadataM m
, MonadIO m
)
=> ReplaceMetadataV2 -> m EncJSON
runReplaceMetadataV2 ReplaceMetadataV2{..} = do
oldMetadata <- getMetadata
metadata <- case _rmv2Metadata of
RMWithSources m -> pure m
RMWithoutSources MetadataNoSources{..} -> do
let maybeDefaultSourceMetadata = oldMetadata ^? metaSources.ix defaultSource.toSourceMetadata
defaultSourceMetadata <- onNothing maybeDefaultSourceMetadata $
throw400 NotSupported $ "cannot import metadata without sources since no default source is defined"
let newDefaultSourceMetadata = BackendSourceMetadata defaultSourceMetadata
{ _smTables = _mnsTables
, _smFunctions = _mnsFunctions
}
pure $ Metadata (OMap.singleton defaultSource newDefaultSourceMetadata)
_mnsRemoteSchemas _mnsQueryCollections _mnsAllowlist
_mnsCustomTypes _mnsActions _mnsCronTriggers (_metaRestEndpoints oldMetadata)
server: multitenant metadata storage The metadata storage implementation for graphql-engine-multitenant. - It uses a centralized PG database to store metadata of all tenants (instead of per tenant database) - Similarly, it uses a single schema-sync listener thread per MT worker (instead of listener thread per tenant) (PS: although, the processor thread is spawned per tenant) - 2 new flags are introduced - `--metadataDatabaseUrl` and (optional) `--metadataDatabaseRetries` Internally, a "metadata mode" is introduced to indicate an external/managed store vs a store managed by each pro-server. To run : - obtain the schema file (located at `pro/server/res/cloud/metadata_db_schema.sql`) - apply the schema on a PG database - set the `--metadataDatabaseUrl` flag to point to the above database - run the MT executable The schema (and its migrations) for the metadata db is managed outside the MT worker. ### New metadata The following is the new portion of `Metadata` added : ```yaml version: 3 metrics_config: analyze_query_variables: true analyze_response_body: false api_limits: disabled: false depth_limit: global: 5 per_role: user: 7 editor: 9 rate_limit: per_role: user: unique_params: - x-hasura-user-id - x-hasura-team-id max_reqs_per_min: 20 global: unique_params: IP max_reqs_per_min: 10 ``` - In Pro, the code around fetching/updating/syncing pro-config is removed - That also means, `hdb_pro_catalog` for keeping the config cache is not required. Hence the `hdb_pro_catalog` is also removed - The required config comes from metadata / schema cache ### New Metadata APIs - `set_api_limits` - `remove_api_limits` - `set_metrics_config` - `remove_metrics_config` #### `set_api_limits` ```yaml type: set_api_limits args: disabled: false depth_limit: global: 5 per_role: user: 7 editor: 9 rate_limit: per_role: anonymous: max_reqs_per_min: 10 unique_params: "ip" editor: max_reqs_per_min: 30 unique_params: - x-hasura-user-id user: unique_params: - x-hasura-user-id - x-hasura-team-id max_reqs_per_min: 20 global: unique_params: IP max_reqs_per_min: 10 ``` #### `remove_api_limits` ```yaml type: remove_api_limits args: {} ``` #### `set_metrics_config` ```yaml type: set_metrics_config args: analyze_query_variables: true analyze_response_body: false ``` #### `remove_metrics_config` ```yaml type: remove_metrics_config args: {} ``` #### TODO - [x] on-prem pro implementation for `MonadMetadataStorage` - [x] move the project config from Lux to pro metadata (PR: #379) - [ ] console changes for pro config/api limits, subscription workers (cc @soorajshankar @beerose) - [x] address other minor TODOs - [x] TxIso for `MonadSourceResolver` - [x] enable EKG connection pool metrics - [x] add logging of connection info when sources are added? - [x] confirm if the `buildReason` for schema cache is correct - [ ] testing - [x] 1.3 -> 1.4 cloud migration script (#465; PR: #508) - [x] one-time migration of existing metadata from users' db to centralized PG - [x] one-time migration of pro project config + api limits + regression tests from metrics API to metadata - [ ] integrate with infra team (WIP - cc @hgiasac) - [x] benchmark with 1000+ tenants + each tenant making read/update metadata query every second (PR: https://github.com/hasura/graphql-engine-mono/pull/411) - [ ] benchmark with few tenants having large metadata (100+ tables etc.) - [ ] when user moves regions (https://github.com/hasura/lux/issues/1717) - [ ] metadata has to be migrated from one regional PG to another - [ ] migrate metrics data as well ? - [ ] operation logs - [ ] regression test runs - [ ] find a way to share the schema files with the infra team Co-authored-by: Naveen Naidu <30195193+Naveenaidu@users.noreply.github.com> GitOrigin-RevId: 39e8361f2c0e96e0f9e8f8fb45e6cc14857f31f1
2021-02-11 20:54:25 +03:00
emptyApiLimit emptyMetricsConfig
putMetadata metadata
case _rmv2AllowInconsistentMetadata of
AllowInconsistentMetadata ->
buildSchemaCache noMetadataModify
NoAllowInconsistentMetadata ->
buildSchemaCacheStrict
-- See Note [Clear postgres schema for dropped triggers]
dropPostgresTriggers (getOnlyPGSources oldMetadata) (getOnlyPGSources metadata)
sc <- askSchemaCache
pure $ encJFromJValue $ formatInconsistentObjs $ scInconsistentObjs sc
where
getOnlyPGSources :: Metadata -> InsOrdHashMap SourceName (SourceMetadata 'Postgres)
getOnlyPGSources = OMap.mapMaybe (\(BackendSourceMetadata sm) -> cast sm) . _metaSources
dropPostgresTriggers
:: InsOrdHashMap SourceName (SourceMetadata 'Postgres) -- ^ old pg sources
-> InsOrdHashMap SourceName (SourceMetadata 'Postgres) -- ^ new pg sources
-> m ()
dropPostgresTriggers oldSources newSources =
for_ (OMap.toList newSources) $ \(source, newSourceCache) ->
onJust (OMap.lookup source oldSources) $ \oldSourceCache -> do
let oldTriggersMap = getPGTriggersMap oldSourceCache
newTriggersMap = getPGTriggersMap newSourceCache
droppedTriggers = OMap.keys $ oldTriggersMap `OMap.difference` newTriggersMap
sourceConfig <- askSourceConfig source
for_ droppedTriggers $
\name -> liftIO $ runPgSourceWriteTx sourceConfig $ delTriggerQ name >> archiveEvents name
where
getPGTriggersMap = OMap.unions . map _tmEventTriggers . OMap.elems . _smTables
runExportMetadata
:: forall m . ( QErrM m, MetadataM m)
=> ExportMetadata -> m EncJSON
runExportMetadata ExportMetadata{} =
AO.toEncJSON . metadataToOrdJSON <$> getMetadata
runExportMetadataV2
:: forall m . ( QErrM m, MetadataM m)
=> MetadataResourceVersion -> ExportMetadata -> m EncJSON
runExportMetadataV2 currentResourceVersion ExportMetadata{} = do
exportMetadata <- getMetadata
pure $ AO.toEncJSON $ AO.object
[ ("resource_version", AO.toOrdered currentResourceVersion)
, ("metadata", metadataToOrdJSON exportMetadata)
]
runReloadMetadata :: (QErrM m, CacheRWM m, MetadataM m) => ReloadMetadata -> m EncJSON
runReloadMetadata (ReloadMetadata reloadRemoteSchemas reloadSources) = do
sc <- askSchemaCache
let remoteSchemaInvalidations = case reloadRemoteSchemas of
RSReloadAll -> HS.fromList $ getAllRemoteSchemas sc
RSReloadList l -> l
pgSourcesInvalidations = case reloadSources of
RSReloadAll -> HS.fromList $ HM.keys $ scSources sc
RSReloadList l -> l
cacheInvalidations = CacheInvalidations
{ ciMetadata = True
, ciRemoteSchemas = remoteSchemaInvalidations
, ciSources = pgSourcesInvalidations
}
metadata <- getMetadata
buildSchemaCacheWithOptions CatalogUpdate cacheInvalidations metadata
pure successMsg
runDumpInternalState
:: (QErrM m, CacheRM m)
=> DumpInternalState -> m EncJSON
runDumpInternalState _ =
encJFromJValue <$> askSchemaCache
runGetInconsistentMetadata
:: (QErrM m, CacheRM m)
=> GetInconsistentMetadata -> m EncJSON
runGetInconsistentMetadata _ = do
inconsObjs <- scInconsistentObjs <$> askSchemaCache
return $ encJFromJValue $ formatInconsistentObjs inconsObjs
formatInconsistentObjs :: [InconsistentMetadata] -> Value
formatInconsistentObjs inconsObjs = object
[ "is_consistent" .= null inconsObjs
, "inconsistent_objects" .= inconsObjs
]
runDropInconsistentMetadata
:: (QErrM m, CacheRWM m, MetadataM m)
=> DropInconsistentMetadata -> m EncJSON
runDropInconsistentMetadata _ = do
sc <- askSchemaCache
let inconsSchObjs = L.nub . concatMap imObjectIds $ scInconsistentObjs sc
-- Note: when building the schema cache, we try to put dependents after their dependencies in the
-- list of inconsistent objects, so reverse the list to start with dependents first. This is not
-- perfect — a completely accurate solution would require performing a topological sort — but it
-- seems to work well enough for now.
metadataModifier <- execWriterT $ mapM_ (tell . purgeMetadataObj) (reverse inconsSchObjs)
metadata <- getMetadata
putMetadata $ unMetadataModifier metadataModifier $ metadata
buildSchemaCacheStrict
return successMsg
purgeMetadataObj :: MetadataObjId -> MetadataModifier
purgeMetadataObj = \case
MOSource source -> MetadataModifier $ metaSources %~ OMap.delete source
MOSourceObjId source (sourceObjId :: SourceMetadataObjId b) ->
case backendTag @b of
PostgresTag -> case sourceObjId of
SMOTable qt -> dropTableInMetadata source qt
SMOTableObj qt tableObj -> MetadataModifier $
tableMetadataSetter source qt %~ case tableObj of
MTORel rn _ -> dropRelationshipInMetadata rn
MTOPerm rn pt -> dropPermissionInMetadata rn pt
MTOTrigger trn -> dropEventTriggerInMetadata trn
MTOComputedField ccn -> dropComputedFieldInMetadata ccn
MTORemoteRelationship rn -> dropRemoteRelationshipInMetadata rn
SMOFunction qf -> dropFunctionInMetadata source qf
SMOFunctionPermission qf rn -> dropFunctionPermissionInMetadata source qf rn
MSSQLTag -> case sourceObjId of
SMOTable qt -> dropTableInMetadata source qt
SMOTableObj qt tableObj -> MetadataModifier $
tableMetadataSetter source qt %~ case tableObj of
MTORel rn _ -> dropRelationshipInMetadata rn
MTOPerm rn pt -> dropPermissionInMetadata rn pt
MTOTrigger trn -> dropEventTriggerInMetadata trn
MTOComputedField ccn -> dropComputedFieldInMetadata ccn
MTORemoteRelationship rn -> dropRemoteRelationshipInMetadata rn
SMOFunction qf -> dropFunctionInMetadata source qf
SMOFunctionPermission qf rn -> dropFunctionPermissionInMetadata source qf rn
MORemoteSchema rsn -> dropRemoteSchemaInMetadata rsn
MORemoteSchemaPermissions rsName role -> dropRemoteSchemaPermissionInMetadata rsName role
MOCustomTypes -> clearCustomTypesInMetadata
MOAction action -> dropActionInMetadata action -- Nothing
MOActionPermission action role -> dropActionPermissionInMetadata action role
MOCronTrigger ctName -> dropCronTriggerInMetadata ctName
MOEndpoint epName -> dropEndpointInMetadata epName
runGetCatalogState
:: (MonadMetadataStorageQueryAPI m) => GetCatalogState -> m EncJSON
runGetCatalogState _ =
encJFromJValue <$> fetchCatalogState
runSetCatalogState
:: (MonadMetadataStorageQueryAPI m) => SetCatalogState -> m EncJSON
runSetCatalogState SetCatalogState{..} = do
updateCatalogState _scsType _scsState
pure successMsg
server: multitenant metadata storage The metadata storage implementation for graphql-engine-multitenant. - It uses a centralized PG database to store metadata of all tenants (instead of per tenant database) - Similarly, it uses a single schema-sync listener thread per MT worker (instead of listener thread per tenant) (PS: although, the processor thread is spawned per tenant) - 2 new flags are introduced - `--metadataDatabaseUrl` and (optional) `--metadataDatabaseRetries` Internally, a "metadata mode" is introduced to indicate an external/managed store vs a store managed by each pro-server. To run : - obtain the schema file (located at `pro/server/res/cloud/metadata_db_schema.sql`) - apply the schema on a PG database - set the `--metadataDatabaseUrl` flag to point to the above database - run the MT executable The schema (and its migrations) for the metadata db is managed outside the MT worker. ### New metadata The following is the new portion of `Metadata` added : ```yaml version: 3 metrics_config: analyze_query_variables: true analyze_response_body: false api_limits: disabled: false depth_limit: global: 5 per_role: user: 7 editor: 9 rate_limit: per_role: user: unique_params: - x-hasura-user-id - x-hasura-team-id max_reqs_per_min: 20 global: unique_params: IP max_reqs_per_min: 10 ``` - In Pro, the code around fetching/updating/syncing pro-config is removed - That also means, `hdb_pro_catalog` for keeping the config cache is not required. Hence the `hdb_pro_catalog` is also removed - The required config comes from metadata / schema cache ### New Metadata APIs - `set_api_limits` - `remove_api_limits` - `set_metrics_config` - `remove_metrics_config` #### `set_api_limits` ```yaml type: set_api_limits args: disabled: false depth_limit: global: 5 per_role: user: 7 editor: 9 rate_limit: per_role: anonymous: max_reqs_per_min: 10 unique_params: "ip" editor: max_reqs_per_min: 30 unique_params: - x-hasura-user-id user: unique_params: - x-hasura-user-id - x-hasura-team-id max_reqs_per_min: 20 global: unique_params: IP max_reqs_per_min: 10 ``` #### `remove_api_limits` ```yaml type: remove_api_limits args: {} ``` #### `set_metrics_config` ```yaml type: set_metrics_config args: analyze_query_variables: true analyze_response_body: false ``` #### `remove_metrics_config` ```yaml type: remove_metrics_config args: {} ``` #### TODO - [x] on-prem pro implementation for `MonadMetadataStorage` - [x] move the project config from Lux to pro metadata (PR: #379) - [ ] console changes for pro config/api limits, subscription workers (cc @soorajshankar @beerose) - [x] address other minor TODOs - [x] TxIso for `MonadSourceResolver` - [x] enable EKG connection pool metrics - [x] add logging of connection info when sources are added? - [x] confirm if the `buildReason` for schema cache is correct - [ ] testing - [x] 1.3 -> 1.4 cloud migration script (#465; PR: #508) - [x] one-time migration of existing metadata from users' db to centralized PG - [x] one-time migration of pro project config + api limits + regression tests from metrics API to metadata - [ ] integrate with infra team (WIP - cc @hgiasac) - [x] benchmark with 1000+ tenants + each tenant making read/update metadata query every second (PR: https://github.com/hasura/graphql-engine-mono/pull/411) - [ ] benchmark with few tenants having large metadata (100+ tables etc.) - [ ] when user moves regions (https://github.com/hasura/lux/issues/1717) - [ ] metadata has to be migrated from one regional PG to another - [ ] migrate metrics data as well ? - [ ] operation logs - [ ] regression test runs - [ ] find a way to share the schema files with the infra team Co-authored-by: Naveen Naidu <30195193+Naveenaidu@users.noreply.github.com> GitOrigin-RevId: 39e8361f2c0e96e0f9e8f8fb45e6cc14857f31f1
2021-02-11 20:54:25 +03:00
runSetMetricsConfig
:: (MonadIO m, CacheRWM m, MetadataM m, MonadError QErr m)
=> MetricsConfig -> m EncJSON
runSetMetricsConfig mc = do
withNewInconsistentObjsCheck
$ buildSchemaCache
$ MetadataModifier
$ metaMetricsConfig .~ mc
pure successMsg
runRemoveMetricsConfig
:: (MonadIO m, CacheRWM m, MetadataM m, MonadError QErr m)
=> m EncJSON
runRemoveMetricsConfig = do
withNewInconsistentObjsCheck
$ buildSchemaCache
$ MetadataModifier
$ metaMetricsConfig .~ emptyMetricsConfig
pure successMsg