* Add ekg-core to build-executable .cabal
* Move creation of EKG Store to Main.hs
This helps to share metrics between pro and OSS and
helps surface the metrics from OSS in Datadog via
Pro.
Co-authored-by: Phil Freeman <phil@hasura.io>
* pass request body to logging context in all cases
* add message size logging on the websocket API
this is required by graphql-engine-pro/#416
* message size logging on websocket API
As we need to log all messages recieved/sent by the websocket server,
it makes sense to log them as part of the websocket server event logs.
Previously message recieved were logged inside the onMessage handler,
and messages sent were logged only for "data" messages (as a server event log)
* fix review comments
Co-authored-by: Phil Freeman <phil@hasura.io>
* server: add logging for action handlers
* add changelog entry
* change action-handler log type from internal to non-internal
* fix action-handler-log name
* Pass environment variables around as a data structure, via @sordina
* Resolving build error
* Adding Environment passing note to changelog
* Removing references to ILTPollerLog as this seems to have been reintroduced from a bad merge
* removing commented-out imports
* Language pragmas already set by project
* Linking async thread
* Apply suggestions from code review
Use `runQueryTx` instead of `runLazyTx` for queries.
* remove the non-user facing entry in the changelog
Co-authored-by: Phil Freeman <paf31@cantab.net>
Co-authored-by: Phil Freeman <phil@hasura.io>
Co-authored-by: Vamshi Surabhi <0x777@users.noreply.github.com>
The current idle GC settings seem never to cause idle GC to trigger.
The changes here at least help memory usage to look more reasonable when
running certain benchmarks, and speculatively could partially fix some
memory leaks users have reported.
See ourIdleGC for details.
Referencing canonical memory issue #3388
* new typeclass to abstract the logic of QueryLog-ing
* abstract the logic of logging websocket-server logs
introduce a MonadWSLog typeclass
* move catalog initialization to init step
expose a helper function to migrate catalog
create schema cache in initialiseCtx
* expose various modules and functions for pro
* generalize PGExecCtx to support specialized functions for various operations
* fix tests compilation
* allow customising PGExecCtx when starting the web server
Store the admin secret only as a hash to prevent leaking the secret
inadvertently, and to prevent timing attacks on the secret.
NOTE: best practice for stored user passwords is a function with a
tunable cost like bcrypt, but our threat model is quite different (even
if we thought we could reasonably protect the secret from an attacker
who could read arbitrary regions of memory), and bcrypt is far too slow
(by design) to perform on each request. We'd have to rely on our
(technically savvy) users to choose high entropy passwords in any case.
Referencing #4736
* move user info related code to Hasura.User module
* the RFC #4120 implementation; insert permissions with admin secret
* revert back to old RoleName based schema maps
An attempt made to avoid duplication of schema contexts in types
if any role doesn't possess any admin secret specific schema
* fix compile errors in haskell test
* keep 'user_vars' for session variables in http-logs
* no-op refacto
* tests for admin only inserts
* update docs for admin only inserts
* updated CHANGELOG.md
* default behaviour when admin secret is not set
* fix x-hasura-role to X-Hasura-Role in pytests
* introduce effective timeout in actions async tests
* update docs for admin-secret not configured case
* Update docs/graphql/manual/api-reference/schema-metadata-api/permission.rst
Co-Authored-By: Marion Schleifer <marion@hasura.io>
* Apply suggestions from code review
Co-Authored-By: Marion Schleifer <marion@hasura.io>
* a complete iteration
backend insert permissions accessable via 'x-hasura-backend-privilege'
session variable
* console changes for backend-only permissions
* provide tooltip id; update labels and tooltips;
* requested changes
* requested changes
- remove className from Toggle component
- use appropriate function name (capitalizeFirstChar -> capitalize)
* use toggle props from definitelyTyped
* fix accidental commit
* Revert "introduce effective timeout in actions async tests"
This reverts commit b7a59c19d6.
* generate complete schema for both 'default' and 'backend' sessions
* Apply suggestions from code review
Co-Authored-By: Marion Schleifer <marion@hasura.io>
* remove unnecessary import, export Toggle as is
* update session variable in tooltip
* 'x-hasura-use-backend-only-permissions' variable to switch
* update help texts
* update docs
* update docs
* update console help text
* regenerate package-lock
* serve no backend schema when backend_only: false and header set to true
- Few type name refactor as suggested by @0x777
* update CHANGELOG.md
* Update CHANGELOG.md
* Update CHANGELOG.md
* fix a merge bug where a certain entity didn't get removed
Co-authored-by: Marion Schleifer <marion@hasura.io>
Co-authored-by: Rishichandra Wawhal <rishi@hasura.io>
Co-authored-by: rikinsk <rikin.kachhia@gmail.com>
Co-authored-by: Tirumarai Selvan <tiru@hasura.io>
* config options for internal errors for non-admin role, close#4031
More detailed action debug info is added in response 'internal' field
* add docs
* update CHANGELOG.md
* set admin graphql errors option in ci tests, minor changes to docs
* fix tests
Don't use any auth for sync actions error tests. The request body
changes based on auth type in session_variables (x-hasura-auth-mode)
* Apply suggestions from code review
Co-Authored-By: Marion Schleifer <marion@hasura.io>
* use a new sum type to represent the inclusion of internal errors
As suggested in review by @0x777
-> Move around few modules in to specific API folder
-> Saperate types from Init.hs
* fix tests
Don't use any auth for sync actions error tests. The request body
changes based on auth type in session_variables (x-hasura-auth-mode)
* move 'HttpResponse' to 'Hasura.HTTP' module
* update change log with breaking change warning
* Update CHANGELOG.md
Co-authored-by: Marion Schleifer <marion@hasura.io>
Co-authored-by: Tirumarai Selvan <tiru@hasura.io>
* unlock the locked-events during graceful shutdown
* Some events can still be delivered multiple times due to
ungraceful shutdown
* modify the preparevents documentation
* modify the prepareEvents doc
* update changelog
Co-authored-by: Tirumarai Selvan <tiru@hasura.io>
* add new column "pg_version" while sending telemetry data
* make a new type for PGVersion and use serverVersion func
* define runTxIO action to run transaction(which exits on error)
Co-authored-by: Vamshi Surabhi <0x777@users.noreply.github.com>
Writing to a mutable var is a particularly potent source of leaks since
it mostly defeats GHC's analysis. Here we add assertions to all mutable
writes, and fix a couple spots where we wrote some thunks to a mutable
var (compiled with -O2).
Some of these thunks were probably benign, but others looked liked they
might be retaining big args. Didn't do much analysis, just fixed.
Actually pretty happy with how easy this was to use and as a diagnostic,
once I sorted out some issues. We should consider using it elsewhere,
and maybe extending so that we can use it with tests, enable when
`-fenable-assertsions` etc.
Relates #3388
Also simplified codepaths that use `AcceptWith`, which has unnecessary
`Maybe` fields.
* Test working through a backlog of change events
* Use a slightly more performant threaded http server in eventing pytests
This helped locally but not on CI it seems...
* Rework event processing for backpressure. Closes#3839
With loo low `HASURA_GRAPHQL_EVENTS_FETCH_INTERVAL` and/or slow webhooks
and/or too small `HASURA_GRAPHQL_EVENTS_HTTP_POOL_SIZE` we might
previously check out events from the DB faster than we can service them,
leading to space leaks, weirdness, etc.
Other changes:
- avoid fetch interval sleep latency when we previously did a non-empty
fetch
- prefetch event batch while http pool is working
- warn when it appears we can't keep up with events being generated
- make some effort to process events in creation order so we don't
starve older ones.
ALSO NOTE: HASURA_GRAPHQL_EVENTS_FETCH_INTERVAL changes semantics
slightly, since it only comes into play after an empty fetch. The old
semantics weren't documented in detail, so I think this is fine.
This is the result of a general audit of how we fork threads, with a
detour into how we're using mutable state especially in websocket
codepaths, making more robust to async exceptions and exceptions
resulting from bugs.
Some highlights:
- use a wrapper around 'immortal' so threads that die due to bugs are
restarted, and log the error
- use 'withAsync' some places
- use bracket a few places where we might break invariants
- log some codepaths that represent bugs
- export UnstructuredLog for ad hoc logging (the alternative is we
continue not logging useful stuff)
I had to timebox this. There are a few TODOs I didn't want to address.
And we'll wait until this is merged to attempt #3705 for
Control.Concurrent.Extended
* basic doc for actions
* custom_types, sync and async actions
* switch to graphql-parser-hs on github
* update docs
* metadata import/export
* webhook calls are now supported
* relationships in sync actions
* initialise.sql is now in sync with the migration file
* fix metadata tests
* allow specifying arguments of actions
* fix blacklist check on check_build_worthiness job
* track custom_types and actions related tables
* handlers are now triggered on async actions
* default to pgjson unless a field is involved in relationships, for generating definition list
* use 'true' for action filter for non admin role
* fix create_action_permission sql query
* drop permissions when dropping an action
* add a hdb_role view (and relationships) to fetch all roles in the system
* rename 'webhook' key in action definition to 'handler'
* allow templating actions wehook URLs with env vars
* add 'update_action' /v1/query type
* allow forwarding client headers by setting `forward_client_headers` in action definition
* add 'headers' configuration in action definition
* handle webhook error response based on status codes
* support array relationships for custom types
* implement single row mutation, see https://github.com/hasura/graphql-engine/issues/3731
* single row mutation: rename 'pk_columns' -> 'columns' and no-op refactor
* use top level primary key inputs for delete_by_pk & account select permissions for single row mutations
* use only REST semantics to resolve the webhook response
* use 'pk_columns' instead of 'columns' for update_by_pk input
* add python basic tests for single row mutations
* add action context (name) in webhook payload
* Async action response is accessible for non admin roles only if
the request session vars equals to action's
* clean nulls, empty arrays for actions, custom types in export metadata
* async action mutation returns only the UUID of the action
* unit tests for URL template parser
* Basic sync actions python tests
* fix output in async query & add async tests
* add admin secret header in async actions python test
* document async action architecture in Resolve/Action.hs file
* support actions returning array of objects
* tests for list type response actions
* update docs with actions and custom types metadata API reference
* update actions python tests as per #f8e1330
Co-authored-by: Tirumarai Selvan <tirumarai.selvan@gmail.com>
Co-authored-by: Aravind Shankar <face11301@gmail.com>
Co-authored-by: Rakesh Emmadi <12475069+rakeshkky@users.noreply.github.com>
* Add downgrade command
* Add docs per @lexi-lambda's suggestions
* make tests pass
* Update hdb_version once, from Haskell
* more work based on feedback
* Improve the usage message
* Small docs changes
* Test downgrades exist for each tag
* Update downgrading.rst
* Use git-log to find tags which are ancestors of the current commit
Co-authored-by: Vamshi Surabhi <0x777@users.noreply.github.com>
We upload a set of accumulating timers and counters to track service
time for different types of operations, across several dimensions (e.g.
did we hit the plan cache, was a remote involved, etc.)
Also...
Standardize on DiffTime as a standard duration type, and try to use it
consistently.
See discussion here:
https://github.com/hasura/graphql-engine/pull/3584#pullrequestreview-340679369
It should be possible to overwrite that module so the new threadDelay
sticks per the pattern in #3705 blocked on #3558
Rename the Control.Concurrent.Extended.threadDelay to `sleep` since a
naive use with a literal argument would be very bad!
We catch a bug in 'computeTimeDiff'.
Add convenient 'Read' instances to the time unit utility types. Make
'Second' a newtype to support this.
The connection handler in websocket transport was not using the
'UserAuthentication' interface to resolve user info. Fix resolving
user info in websocket transport to use the common
'UserAuthentication' interface