* add new column "pg_version" while sending telemetry data
* make a new type for PGVersion and use serverVersion func
* define runTxIO action to run transaction(which exits on error)
Co-authored-by: Vamshi Surabhi <0x777@users.noreply.github.com>
Writing to a mutable var is a particularly potent source of leaks since
it mostly defeats GHC's analysis. Here we add assertions to all mutable
writes, and fix a couple spots where we wrote some thunks to a mutable
var (compiled with -O2).
Some of these thunks were probably benign, but others looked liked they
might be retaining big args. Didn't do much analysis, just fixed.
Actually pretty happy with how easy this was to use and as a diagnostic,
once I sorted out some issues. We should consider using it elsewhere,
and maybe extending so that we can use it with tests, enable when
`-fenable-assertsions` etc.
Relates #3388
Also simplified codepaths that use `AcceptWith`, which has unnecessary
`Maybe` fields.
* Test working through a backlog of change events
* Use a slightly more performant threaded http server in eventing pytests
This helped locally but not on CI it seems...
* Rework event processing for backpressure. Closes#3839
With loo low `HASURA_GRAPHQL_EVENTS_FETCH_INTERVAL` and/or slow webhooks
and/or too small `HASURA_GRAPHQL_EVENTS_HTTP_POOL_SIZE` we might
previously check out events from the DB faster than we can service them,
leading to space leaks, weirdness, etc.
Other changes:
- avoid fetch interval sleep latency when we previously did a non-empty
fetch
- prefetch event batch while http pool is working
- warn when it appears we can't keep up with events being generated
- make some effort to process events in creation order so we don't
starve older ones.
ALSO NOTE: HASURA_GRAPHQL_EVENTS_FETCH_INTERVAL changes semantics
slightly, since it only comes into play after an empty fetch. The old
semantics weren't documented in detail, so I think this is fine.
This is the result of a general audit of how we fork threads, with a
detour into how we're using mutable state especially in websocket
codepaths, making more robust to async exceptions and exceptions
resulting from bugs.
Some highlights:
- use a wrapper around 'immortal' so threads that die due to bugs are
restarted, and log the error
- use 'withAsync' some places
- use bracket a few places where we might break invariants
- log some codepaths that represent bugs
- export UnstructuredLog for ad hoc logging (the alternative is we
continue not logging useful stuff)
I had to timebox this. There are a few TODOs I didn't want to address.
And we'll wait until this is merged to attempt #3705 for
Control.Concurrent.Extended
* basic doc for actions
* custom_types, sync and async actions
* switch to graphql-parser-hs on github
* update docs
* metadata import/export
* webhook calls are now supported
* relationships in sync actions
* initialise.sql is now in sync with the migration file
* fix metadata tests
* allow specifying arguments of actions
* fix blacklist check on check_build_worthiness job
* track custom_types and actions related tables
* handlers are now triggered on async actions
* default to pgjson unless a field is involved in relationships, for generating definition list
* use 'true' for action filter for non admin role
* fix create_action_permission sql query
* drop permissions when dropping an action
* add a hdb_role view (and relationships) to fetch all roles in the system
* rename 'webhook' key in action definition to 'handler'
* allow templating actions wehook URLs with env vars
* add 'update_action' /v1/query type
* allow forwarding client headers by setting `forward_client_headers` in action definition
* add 'headers' configuration in action definition
* handle webhook error response based on status codes
* support array relationships for custom types
* implement single row mutation, see https://github.com/hasura/graphql-engine/issues/3731
* single row mutation: rename 'pk_columns' -> 'columns' and no-op refactor
* use top level primary key inputs for delete_by_pk & account select permissions for single row mutations
* use only REST semantics to resolve the webhook response
* use 'pk_columns' instead of 'columns' for update_by_pk input
* add python basic tests for single row mutations
* add action context (name) in webhook payload
* Async action response is accessible for non admin roles only if
the request session vars equals to action's
* clean nulls, empty arrays for actions, custom types in export metadata
* async action mutation returns only the UUID of the action
* unit tests for URL template parser
* Basic sync actions python tests
* fix output in async query & add async tests
* add admin secret header in async actions python test
* document async action architecture in Resolve/Action.hs file
* support actions returning array of objects
* tests for list type response actions
* update docs with actions and custom types metadata API reference
* update actions python tests as per #f8e1330
Co-authored-by: Tirumarai Selvan <tirumarai.selvan@gmail.com>
Co-authored-by: Aravind Shankar <face11301@gmail.com>
Co-authored-by: Rakesh Emmadi <12475069+rakeshkky@users.noreply.github.com>
* Add downgrade command
* Add docs per @lexi-lambda's suggestions
* make tests pass
* Update hdb_version once, from Haskell
* more work based on feedback
* Improve the usage message
* Small docs changes
* Test downgrades exist for each tag
* Update downgrading.rst
* Use git-log to find tags which are ancestors of the current commit
Co-authored-by: Vamshi Surabhi <0x777@users.noreply.github.com>
We upload a set of accumulating timers and counters to track service
time for different types of operations, across several dimensions (e.g.
did we hit the plan cache, was a remote involved, etc.)
Also...
Standardize on DiffTime as a standard duration type, and try to use it
consistently.
See discussion here:
https://github.com/hasura/graphql-engine/pull/3584#pullrequestreview-340679369
It should be possible to overwrite that module so the new threadDelay
sticks per the pattern in #3705 blocked on #3558
Rename the Control.Concurrent.Extended.threadDelay to `sleep` since a
naive use with a literal argument would be very bad!
We catch a bug in 'computeTimeDiff'.
Add convenient 'Read' instances to the time unit utility types. Make
'Second' a newtype to support this.
The connection handler in websocket transport was not using the
'UserAuthentication' interface to resolve user info. Fix resolving
user info in websocket transport to use the common
'UserAuthentication' interface