Encountered a random NPE when playing with bookclubs. Test case demonstrating the problem is attached.
Threw in a bunch of minor tweaks to logs to make life of the person debugging code more pleasant.
close#8329
Changelog:
- add: `cmd`+`shift`+`,` and `cmd`+`shift`+`.` shortcuts to start and stop the backend profiling. Profiling data is stored on disk.
Changelog:
- update: always create an event log next to the profiling file when the engine is started with the `--profiling-path` flag
- remove: `--profiling-events-log-path` flag
close#8249
Changelog:
- add: `profiling/snapshot` request that takes a heap dump of the language server and puts it in the `ENSO_DATA_DIRECTORY/profiling` direcotry
Attaching or modifying a visualizations returns early on, to avoid a situation when a background job is stalled (by other jobs) and eventually the request timeouts.
This has an unfortunate consequence that any error reported in the `UpsertVisualizationJob` cannot be reported as a directly reply to a request because the sender has already been removed from the list.
Added more logs to discover why we get errors in the first place.
Modified the API a bit so that we carry `VisualizationContext` instead of three parameters all over the place.
Bonus:
Modified `JsonRpcServerTestKit` to implicitly require a position so that we get better error reporting on failures.
Upgrade to GraalVM JDK 21.
```
> java -version
openjdk version "21" 2023-09-19
OpenJDK Runtime Environment GraalVM CE 21+35.1 (build 21+35-jvmci-23.1-b15)
OpenJDK 64-Bit Server VM GraalVM CE 21+35.1 (build 21+35-jvmci-23.1-b15, mixed mode, sharing)
```
With SDKMan, download with `sdk install java 21-graalce`.
# Important Notes
- After this PR, one can theoretically run enso with any JRE with version at least 21.
- Removed `sbt bootstrap` hack and all the other build time related hacks related to the handling of GraalVM distribution.
- `project-manager` remains backward compatible - it can open older engines with runtimes. New engines now do no longer require a separate runtime to be downloaded.
- sbt does not support compilation of `module-info.java` files in mixed projects - https://github.com/sbt/sbt/issues/3368
- Which means that we can have `module-info.java` files only for Java-only projects.
- Anyway, we need just a single `module-info.class` in the resulting `runtime.jar` fat jar.
- `runtime.jar` is assembled in `runtime-with-instruments` with a custom merge strategy (`sbt-assembly` plugin). Caching is disabled for custom merge strategies, which means that re-assembly of `runtime.jar` will be more frequent.
- Engine distribution contains multiple JAR archives (modules) in `component` directory, along with `runner/runner.jar` that is hidden inside a nested directory.
- The new entry point to the engine runner is [EngineRunnerBootLoader](https://github.com/enso-org/enso/pull/7991/files#diff-9ab172d0566c18456472aeb95c4345f47e2db3965e77e29c11694d3a9333a2aa) that contains a custom ClassLoader - to make sure that everything that does not have to be loaded from a module is loaded from `runner.jar`, which is not a module.
- The new command line for launching the engine runner is in [distribution/bin/enso](https://github.com/enso-org/enso/pull/7991/files#diff-0b66983403b2c329febc7381cd23d45871d4d555ce98dd040d4d1e879c8f3725)
- [Newest version of Frgaal](https://repo1.maven.org/maven2/org/frgaal/compiler/20.0.1/) (20.0.1) does not recognize `--source 21` option, only `--source 20`.
close#8248
Changelog:
- add: `profiling/start` request starts the sampler and starts collecting runtime events to the log file
- add: `profiling/stop` request stop the sampler and write the profiling data to the `$ENSO_DATA_DIR/profiling` directory
- refactor: rewrite the profiling logic into Java
This change fixes a regression introduced in #7918, which prevented the execution from setting the right log level either via env var or parameter.
Now passing either of the options returns logs of the expected level in the log file:
- `ENSO_LOG_TO_FILE_LOG_LEVEL = trace`
- ... `-vv` ...
Fixes#8274
A long running initialization of the component blocks the execution significantly. Removed the `BlockingInitialization` and replaced it with a more fine grained locking.
# Important Notes
Added a simple workaround for potential slow initialization of backend - more retries. We should have a better UX in that case anyway, but due to absence of work on that in old GUI, this will have to do.
The main should be problem should be addressed already by other backend changes. Changes in `app` should only be treated as _just in case something bad happens_.
* OpenFileCmd sends a reply when finished
Lack of reply and therefore a non-determinism on when OpenFile handler
can finish, led to some sporadic instability. Once caches format got
changed, things were taking such a long time that I wasn't able to start
even a basic project (requests would start to timeout).
The change also removes PushContextCmd from synchronous cmds (as
introduced in #798 to remove initialization problems in tests as well as
in real scenarions); the change did the job but was also a bit
controversial.
The change can also help with randomly failing applies (#8174) as IDE kept
closing and opening the project that might have exploited the
race-condition.
* Adapt tests
* Make tests more resilient to out of order messages
* Drop retries that lead to confusing errors
* less random failures
* s/OpenFileNotification/OpenFileRequest
Debugging the issue reported by @PabloBuchu when the language server initialization hangs in the cloud. I'm still not sure what is happening in the cloud because I was not able to reproduce it when trying to connect two clients simultaneously.
Another potential source of the issue may be the Scala Future -> Java CompletableFuture conversion, but I didn't find anything suspicious there.
The change upgrades `directory-watcher` library, hoping that it will fix the problem reported in #7695 (there has been a number of bug fixes in MacOS listener since then).
Once upgraded, tests in `WatcherAdapterSpec` because the logic that attempted to ensure the proper initialization order in the test using semaphore was wrong. Now starting the watcher using `watchAsync` which only returns the future when the watcher successfully registers for paths. Ideally authors of the library would make the registration bit public
(3218d68a84/core/src/main/java/io/methvin/watcher/DirectoryWatcher.java (L229C7-L229C20)) but it is the best we can do so far.
Had to adapt to the new API in PathWatcher as well, ensuring the right order of initialization.
Should fix#7695.
The change eliminates a race-condition that can appear between the `PathWatcher` deregistering the last client (and shutting down) and `ReceivesTreeUpdatesHandler` receiving the termination message of that event. In between there could come a message towards the mentioned `PathWatcher` resulting in a timeout.
The scenario has become rather common in CI tests resulting in spurious failures. The problem could also be simulated by adding artificial `Thread.sleep` between sending the reply in `PathWatcher` and stopping the actor.
Fixes#8151.
# Important Notes
Example failure https://github.com/enso-org/enso/actions/runs/6642894609/job/18048711561?pr=8145
To simulate the scenario
```diff
diff --git a/engine/language-server/src/main/scala/org/enso/languageserver/filemanager/PathWatcher.scala b/engine/language-server/src/main/scala/org/enso/languageserver/filemanager/PathWatcher.scala
index 713cfe1182..88afac95cb 100644
--- a/engine/language-server/src/main/scala/org/enso/languageserver/filemanager/PathWatcher.scala
+++ b/engine/language-server/src/main/scala/org/enso/languageserver/filemanager/PathWatcher.scala
@@ -110,6 +110,7 @@ final class PathWatcher(
case UnwatchPath(client) =>
sender() ! CapabilityReleased
+ Thread.sleep(2000)
unregisterClient(root, base, clients - client)
```
Towards reduced reliance on Scala semantics.
Translated IR.scala to IR.java and extracted implicits that now need to be imported explicitly.
# Important Notes
1:1 translation. For now `@Identifier` and `@ExternalID` represent the old type aliases but are not verified at compile time.
This is because in a mixed Scala/Java world this seems impossible to employ such frameworks as Checker.
`capability/acquire` with an invalid method `executionContext/canModify` would previously timeout because the capability router simply didn't support that kind of capability request.
Now rather than timing out, indicating a failure on LS, we report an error.
Closes#8038.
close#8033
Changelog:
- update: run language server initialization once
- fix: issues with async `getSuggestionDatabase` message handling in new IDE
- update: implement unique background jobs
- refactor: initialization logic to Java
- refactor: `UniqueJob` to a marker interface
Having a modest-size files in a project would lead to a timeout when the project was first initialized. This became apparent when testing delivered `.enso-project` files with some data files. After some digging there was a bug in JGit
(https://bugs.eclipse.org/bugs/show_bug.cgi?id=494323) which meant that adding such files was really slow. The implemented fix is not on by default but even with `--renormalization` turned off I did not see improvement.
In the end it didn't make sense to add `data` directory to our version control, or any other files than those in `src` or some meta files in `.enso`. Not including such files eliminates first-use initialization problems.
# Important Notes
To test, pick an existing Enso project with some data files in it (> 100MB) and remove `.enso/.vcs` directory. Previously it would timeout on first try (and work in successive runs). Now it works even on the first try.
The crash:
```
[org.enso.languageserver.requesthandler.vcs.InitVcsHandler] Initialize project request [Number(2)] for [f9a7cd0d-529c-4e1d-a4fa-9dfe2ed79008] failed with: null.
java.util.concurrent.TimeoutException: null
at org.enso.languageserver.effect.ZioExec$.<clinit>(Exec.scala:134)
at org.enso.languageserver.effect.ZioExec.$anonfun$exec$3(Exec.scala:60)
at org.enso.languageserver.effect.ZioExec.$anonfun$exec$3$adapted(Exec.scala:60)
at zio.ZIO.$anonfun$foldCause$4(ZIO.scala:683)
at zio.internal.FiberRuntime.runLoop(FiberRuntime.scala:904)
at zio.internal.FiberRuntime.evaluateEffect(FiberRuntime.scala:381)
at zio.internal.FiberRuntime.evaluateMessageWhileSuspended(FiberRuntime.scala:504)
at zio.internal.FiberRuntime.drainQueueOnCurrentThread(FiberRuntime.scala:220)
at zio.internal.FiberRuntime.run(FiberRuntime.scala:139)
at akka.dispatch.TaskInvocation.run(AbstractDispatcher.scala:49)
at java.base/java.util.concurrent.ForkJoinTask$RunnableExecuteAction.exec(ForkJoinTask.java:1395)
at java.base/java.util.concurrent.ForkJoinTask.doExec(ForkJoinTask.java:373)
at java.base/java.util.concurrent.ForkJoinPool$WorkQueue.topLevelExec(ForkJoinPool.java:1182)
```
* Reduce extra output in compilation and tests
I couldn't stand the amount of extra output that we got when compiling
a clean project and when executing regular tests. We should strive to
keep output clean and not print anything additional to stdout/stderr.
* Getting rid of explicit setup by service loading
In order for SL4J to use service loading correctly had to upgrade to
latest slf4j. Unfortunately `TestLogProvider` which essentially
delegates to `logback` provider will lead to spurious ambiguous warnings
on multiple providers. In order to dictate which one to use and
therefore eliminate the warnings we can use the `slf4j.provider` env
var, which is only available in slf4j 2.x.
Now, there is no need to explicitly call `LoggerSetup.get().setup()` as
that is being called during service setup.
* legal review
* linter
* Ensure ConsoleHandler uses the default level
ConsoleHandler's constructor uses `Level.INFO` which is unnecessary for
tests.
* report warnings
It seems that Runtime Connector wasn't respecting the protocol it defined itself. The connector should be waiting on the `Api.InitializedNotification` message and only then start forwarding messages. So far it seems this hasn't been a problem, or at least wasn't reported as such, because initialization was fast enough.
Modified `Handler` so that we are certain that its fields hold initialized values when being accessed by different threads.
Should fix problems mentioned in #7898.
* Add support for https and wss
Preliminary support for https and wss. During language server startup we
will read the application config and search for the `https` config with
necessary env vars set.
The configuration supports two modes of creating ssl-context - via
PKCS12 format and certificat+private key.
Fixes#7839.
* Added tests, improved documentation
Generic improvements along with actual tests.
* lint
* more docs + wss support
* changelog
* Apply suggestions from code review
Co-authored-by: Dmitry Bushev <bushevdv@gmail.com>
* PR comment
* typo
* lint
* make windows line endings happy
---------
Co-authored-by: Dmitry Bushev <bushevdv@gmail.com>
It looks like visualization commands had required context lock unnecessairly. Context manager methods are synchronized and therefore should not need the lock before submitting a correspodning background job.
Additionally, the presence of a context lock leads a deadlock:
1. Consider a long running execution of a program that does not finish within the 5 seconds
2. In the meantime there comes either an `AttachVisualization` or `DetachVisualization` request from the client
The latter will get stuck and eventually timeout out because it cannot acquire the lock withing the required time limits. It is still possible that a single long-running `ExecuteJob` might block other ones (including visualization ones) but that's a separate work.
Fixes some issues present in #7941.
# Important Notes
We need to still investigate `ExecuteJob`s need for context lock which might delay the execution of other background jobs that require it as well (mostly concerned about visualization).
* Q: Is it normal for `--inspect` mode to print two debug urls?
* A: No, it should print just one.
* Q: Putting there a Java breakpoint to find out why it the chromeinspector gets initialized twice might reveal the culprit.
* A: The additional listener is happening [here](https://github.com/enso-org/enso/blob/develop/engine/runner/src/main/scala/org/enso/runner/ContextFactory.scala#L117).
# Important Notes
There is no easy check for a language being present without creating an `Engine`. It was thought creating an `Engine` is cheap operation, but it seems to have some downsides. Let's use `ENSO_JAVA` environment variable to decide whether _experimental Espresso_ support shall be enabled.
* Enable log-to-file configuration
PR #7825 enabled parallel logging to a file with a much more
fine-grained log level by default.
However, logging at `TRACE` level on Windows appears to be still
problematic.
This PR reduced the default log level to file from `DEBUG` to `TRACE`
and allows to control it via an environment variable if one wishes to
change the verbosity without making code changes.
* PR comments
close#7750close#7834
Changelog:
- update: project manager uses the packaged language server to open projects
- fix: remove stack traces from connection errors on initial ping handler request (when the language server is booting)
- update: add engine and edition versions to the `initProtocolConnection` response for easier debug
- update: do not resolve project ensoVersion in the `project/list` to eliminate unnecessary network calls
* Always log verbose to a file
The change adds an option by default to always log to a file with
verbose log level.
The implementation is a bit tricky because in the most common use-case
we have to always log in verbose mode to a socket and only later apply
the desired log levels. Previously socket appender would respect the
desired log level already before forwarding the log.
If by default we log to a file, verbose mode is simply ignored and does
not override user settings.
To test run `project-manager` with `ENSO_LOGSERVER_APPENDER=console` env
variable. That will output to the console with the default `INFO` level
and `TRACE` log level for the file.
* add docs
* changelog
* Address some PR requests
1. Log INFO level to CONSOLE by default
2. Change runner's default log level from ERROR to WARN
Took a while to figure out why the correct log level wasn't being passed
to the language server, therefore ignoring the (desired) verbose logs
from the log file.
* linter
* 3rd party uses log4j for logging
Getting rid of the warning by adding a log4j over slf4j bridge:
```
ERROR StatusLogger Log4j2 could not find a logging implementation. Please add log4j-core to the classpath. Using SimpleLogger to log to the console...
```
* legal review update
* Make sure tests use test resources
Having `application.conf` in `src/main/resources` and `test/resources`
does not guarantee that in Tests we will pick up the latter. Instead, by
default it seems to do some kind of merge of different configurations,
which is far from desired.
* Ensure native launcher test log to console only
Logging to console and (temporary) files is problematic for Windows.
The CI also revealed a problem with the native configuration because it
was not possible to modify the launcher via env variables as everything
was initialized during build time.
* Adapt to method changes
* Potentially deal with Windows failures
* Improve shutdown logic of language server
This PR addresses problems mentioned in #7470 and #7729:
- shutting a language server explicitly will not lead to a soft shutdown
- `project/status` endpoint returns the state of the language server
`LanguageServerController` now also signed up for `ClientConnect`
messages. For it to be unambiguous, we need to carry around the port
number of the language server as a way of identifying the right one.
One can now use `project/status` to additionally determine the state of
the language server.
Also relies on a proper fix for #7765.
* changelog
* PR comments
close#7320
Changelog:
- update: enable conversion suggestions
- fix: conversion suggestion building
- fix: conversion suggestion types
- fix: conversion JSON-RPC representation
# Important Notes
For example, the [`Day_Of_Week_From`](5150c14afd/distribution/lib/Standard/Base/0.0.0-dev/src/Data/Time/Day_Of_Week_From.enso) conversion is sent as
```json
{
"type":"Add",
"id":32,
"suggestion":{
"type":"method",
"module":"Standard.Base.Data.Time.Day_Of_Week_From",
"name":"from",
"arguments":[
{
"name":"that",
"reprType":"Standard.Base.Data.Numbers.Integer",
"isSuspended":false,
"hasDefault":false,
"defaultValue":null,
"tagValues":null
},
{
"name":"first_day",
"reprType":"Standard.Base.Data.Time.Day_Of_Week.Day_Of_Week",
"isSuspended":false,
"hasDefault":true,
"defaultValue":"Day_Of_Week.Sunday",
"tagValues":[
"Standard.Base.Data.Time.Day_Of_Week.Day_Of_Week.Sunday",
"Standard.Base.Data.Time.Day_Of_Week.Day_Of_Week.Monday",
"Standard.Base.Data.Time.Day_Of_Week.Day_Of_Week.Tuesday",
"Standard.Base.Data.Time.Day_Of_Week.Day_Of_Week.Wednesday",
"Standard.Base.Data.Time.Day_Of_Week.Day_Of_Week.Thursday",
"Standard.Base.Data.Time.Day_Of_Week.Day_Of_Week.Friday",
"Standard.Base.Data.Time.Day_Of_Week.Day_Of_Week.Saturday"
]
},
{
"name":"start_at_zero",
"reprType":"Standard.Base.Data.Boolean.Boolean",
"isSuspended":false,
"hasDefault":true,
"defaultValue":"False",
"tagValues":[
"Standard.Base.Data.Boolean.Boolean.True",
"Standard.Base.Data.Boolean.Boolean.False"
]
}
],
"selfType":"Standard.Base.Data.Time.Day_Of_Week.Day_Of_Week",
"returnType":"Standard.Base.Data.Time.Day_Of_Week.Day_Of_Week",
"isStatic":false,
"documentation":" Convert from an integer to a Day_Of_Week\n\nArguments:\n- `that`: The first day of the week.\n- `first_day`: The first day of the week.\n- `start_at_zero`: If True, first day of the week is 0 otherwise is 1.",
"annotations":[
]
}
}
```
close#7765
Changelog:
- update: instead of relying on the connection closed events, the `sendSuggestionsDatabase` request initiates the suggestions database re-indexing
Fixes the _RefactoringTests - rename project_ part of the #7775
Changelog:
- update: send the ok response before the notification to fix the order of events in tests
This change replaces Enso's custom logger with an existing, mostly off the shelf logging implementation. The change attempts to provide a 1:1 replacement for the existing solution while requiring only a minimal logic for the initialization.
Loggers are configured completely via `logging-server` section in `application.conf` HOCON file, all initial logback configuration has been removed. This opens up a lot of interesting opportunities because we can benefit from all the well maintained slf4j implementations without being to them in terms of functionality.
Most important differences have been outlined in `docs/infrastructure/logging.md`.
# Important Notes
Addresses:
- #7253
- #6739
- Closes#7633
- Moves `Round_Spec.enso` from published `Standard.Test` into our `test/Tests` project; the `Table_Tests` that depend on it, simply `import enso_dev.Tests`.
- Changes the layout of the local libraries directory:
- It used to be `root/<namespace>/<name>`.
- Now it is `root/<dir>` - the namespace and name are now read from `package.yaml` instead.
- Adds the parent directory of the current project to the default `ENSO_LIBRARY_PATH`.
- It is treated as a secondary path, so the default `ENSO_HOME/lib` still takes precedence.
- This allows projects to reference and load 'sibling' projects easily - the only requirement is for the project to enable `prefer-local-libraries: true` or add the other local project to its edition. The edition resolution logic is **not changed**.
close#7604
After moving the rename action to the dashboard, IDE is unaware of the new project name. PR implements a new `refactoring/projectRenamed` notification that is sent from the server to clients and informs them about the changed project name.
# Important Notes
https://github.com/enso-org/enso/assets/357683/7c62726d-217e-4e69-8e48-568e0b7b8c34
We can't really control the timing of file watcher events, which sometimes leads to failures of VCS tests. The PR disables the file watcher in the `VcsManagerTest` suite to make tests more stable.
Changelog:
- add: a `Watcher` and `WatcherFactory` interfaces
- add: a `NoopWatcher` test watcher
- update: disable the file watcher in the `VcsManagerTest` suite
part of #7178
Changelog:
- add: `text/fileModifiedOnDisk` notification
- update: during the auto-save, check if the file is modified on disk and send the notification. I.e. auto-save does not overwrite the file if it was changed on disk (but the save command does)
- update: IDE handles the file-modified-on-disk notification and reloads the module from disk
# Important Notes
Currently, the auto-save (and the check that the file is modified on disk) is triggered only after the file was edited. The proper check (using the file-watcher service) will be added in the next PR
https://github.com/enso-org/enso/assets/357683/ff91f3e6-2f7a-4c01-a745-98cb140e1964
As discovered in #7224, Json RPC protocol was added to the asynchronous resource initialization stage, as part of #6306, but was not in fact initialized at that point.
Instead it was initialized when the server was started to be able to serve correctly the initialization messages. A classic Catch-22. It was really hard to discover this just by looking at the code, but the profiling clearly showed where the time was spent.
This change splits Language Server's protocol into two:
- the first one accepts `heartbeat/init` and `session/initProtocolConnection`
- the second one enriches it with the full set of supported messages
This shifts the initialization from blocking for 0.5 sec to only ~30ms, and performing the second stage asynchronously.
Closes#7224.
# Important Notes
Before the change (blocking server startup):
![Screenshot from 2023-07-05 18-53-24](https://github.com/enso-org/enso/assets/292128/bcfa9043-d00a-4b36-a44c-782a388a16b9)
![Screenshot from 2023-07-05 18-53-10](https://github.com/enso-org/enso/assets/292128/54927787-4c95-46db-bd68-f3a3b82367d5)
After the change (1st stage):
![Screenshot from 2023-07-06 14-02-34](https://github.com/enso-org/enso/assets/292128/d7a7bc34-39dc-46f1-9e64-6d350697c30b)
After the change (2nd, asynchronous initialization, stage):
![Screenshot from 2023-07-06 14-21-17](https://github.com/enso-org/enso/assets/292128/def8c0a1-f211-4fc0-9df0-7c1634312166)
Package's config information, once loaded, never changed. While there is typically no need for it, this was problematic when the config became out-of-sync with the filesystem, like in the case of project rename action.
In rename, the config's properties would be updated in the FS, but that would never be reflected in module's package. Therefore further compilations would continue to ask for the old namespace.
Most of the changes are cosmetic (s/`.config`/`.getConfig()`) except for the new `reloadConfig` method on `Package` that is being called in `RenameProjectCmd` handler.
Closes#7062.
# Important Notes
The reported `ExecutionFailed` error should have been mostly fixed already via #7143. This change makes sure that all the related warnings are gone as well and the compiler uses the updated namespace.