enso/engine/language-server
Hubert Plociniczak e529f7bb3b
Improvements that significantly reduce the chances of request timeouts (#7042)
Request Timeouts started plaguing IDE due to numerous `executionContext/***Visualization` requests. While caused by a bug they revealed a bigger problem in the Language Server when serving large amounts of requests:
1) Long and short lived jobs are fighting for various locks. Lock contention leads to some jobs waiting for a longer than desired leading to unexpected request timeouts. Increasing timeout value is just delaying the problem.
2) Requests coming from IDE are served almost instantly and handled by various commands. Commands can issue further jobs that serve request. We apparently have and always had a single-thread thread pool for serving such jobs, leading to immediate thread starvation.

Both reasons increase the chances of Request Timeouts when dealing with a large number of requests. For 2) I noticed that while we used to set the `enso-runtime-server.jobParallelism` option descriptor key to some machine-dependent value (most likely > 1), the value set would **only** be available for instrumentation. `JobExecutionEngine` where it is actually used would always get the default, i.e. a single-threaded ThreadPool. This means that this option descriptor was simply misused since its introduction. Moved that option to runtime options so that it can be set and retrieved during normal operation.

Adding parallelism intensified problem 1), because now we could execute multiple jobs and they would compete for resources. It also revealed a scenario for a yet another deadlock scenario, due to invalid order of lock acquisition. See `ExecuteJob` vs `UpsertVisualisationJob` order for details.

Still, a number of requests would continue to randomly timeout due to lock contention. It became apparent that
`Attach/Modify/Detach-VisualisationCmd` should not wait until a triggered `UpsertVisualisationJob` sends a response to the client; long and short lived jobs will always compete for resources and we cannot guarantee that they will not timeout that way. That is why the response is sent immediately from the command handler and not from the job executed after it.

This brings another problematic scenario:
1. `AttachVisualisationCmd` is executed, response sent to the client, `UpsertVisualisationJob` scheduled.
2. In the meantime `ModifyVisualisationCmd` comes and fails; command cannot find the visualization that will only be added by `UpsertVisualisationJob`, which might have not yet been scheduled to run.

Remedied that by checking visualisation-related jobs that are still in progress. It also allowed for cancelling jobs which results wouldn't be used anyway (`ModifyVisualisationCmd` sends its own `UpsertVisualisationJob`). This is not a theoretical scenario, it happened frequently on IDE startup.

This change does not fully solve the rather problematic setup of numerous locks, which are requested by short and long lived jobs. A better design should still be investigated. But it significantly reduces the chances of Request Timeouts which IDE had to deal with.

With this change I haven't been able to experience Request Timeouts for relatively modest projects anymore.

I added the possibility of logging wait times for locks to better investigate further problems.

Closes #7005
2023-06-16 17:57:16 +00:00
..
src Improvements that significantly reduce the chances of request timeouts (#7042) 2023-06-16 17:57:16 +00:00
README.md Add a markdown style guide (#1022) 2020-07-21 13:59:40 +01:00

Enso Language Server

The Enso Language Server is responsibile for providing a remote-communication protocol for the runtime, exposing many of its features to the users. In addition it provides the backing service for much of the IDE functionality associated with the language. It encompasses the following functionality:

  • Introspection Services: Giving clients the ability to observe information about their running code including values, types, profiling information, and debugging.
  • Code Execution: The ability for clients to execute arbitrary Enso code in arbitrary scopes. This can be used in conjunction with the above to provide a REPL with an integrated debugger.
  • Code Completion: Sophisticated completion functionality that refines suggestions intelligently based on context.
  • Node Management: Tracking and providing the language server's internal node representation of the Enso program.
  • Code Analysis: Analysis functionality for Enso code (e.g. find usages, jump-to-definition, and so on).
  • Refactoring: Refactoring functionality for Enso code (e.g. rename, move, extract, and so on).
  • Type Interactions: Features for type-driven-development that allow users to interact with the types of their programs.