* Eliminating circe-yaml
This change adds our very-own YAML parser on top of SnakeYAML. Compared
to Circe parser on top of SnakeYAML. The advantage? In some not-so-distant
future we might actually get rid of circe and the related performance
issues.
The logic is similar to what circe does i.e. analyzing SnakeYAML to
build our own structure.
This change is not complete, as there are still some tests failing, but
most common Configs are already parseable.
We _could_ auto-generate some of the code but still some of the logic
would have to be tweaked by hand; the current logic has a number of
special cases, as I found out the hard way.
* wip: more tests passing
* Fix remaining tests in ConfigSpec
* Fixing YAML decoder for editions
Dropping circe as a decoder for Editions revealed some problems. Turns
out the current implementation had even more special cases to deal with.
* nit
* Allow for empty exports
* Mostly complete encodin part
Replaced almost all `toYAML` locations with SnakeYAML equivalent.
The encoding has to use Java collections for which there exists a
built-in support. If we were to use Scala collections we would have to
deal with tagging, at the very least.
* Remove the last remaining Circe's YAML parser
* Bug fix + further loop optimization
* removal of some dependencies
* Remove circe-yaml
Added a custom SnakeYAML Node updater to mimick the JSON -> YAML -> JSON
conversion needed for updating fields. The algorithm recursively follows
the key-path and inserts the desired Node. This is not a performance
oriented code on purpose.
* Fix compilation issues
`circe-core` was marked as `provided` but no one eventually included it
in the final jar, hence `NoClassFoundException`.
* fix licensing
* Removing obsolete circe definitions
* fmt
* nits
* s/SnakeYamlDecoder/YamlDecoder
* fmt
* Partial revert, PM needs JSON decoders/encoders
* style
* incremental compilation gone wrong
The current implementation contains logic that should enable us to make some backward compatibility config changes.
At the same time, the logic is tightly integrated with circe's JSON library, which we want to eventually to get rid off.
Rather than trying to keep it somehow around and maintain via some hacks this PR proposes to ditch that logic completely as we currently have no use-case for such scenarios.
As a result, classes modelling YAML configs now don't have the extra fields and there is 1:1 correspondence.
Performance has also improved although that wasn't the main objective, yet. Follow up PR will attempt to replace `circe-yaml` with `snakeyaml` directly.
In preparation for #9113. Note that the dependency upgrade is necessary because it brings latest available `snakeyaml` (as part of `circe-yaml`).
Reducing the number of dependencies. Explicit `cats` are almost gone (present in `cli`). `enumeration` is completely gone. `cats` is also still included implicitly via `io.circe` but that's a different kind of beast.
Also, really removed `jackson` from dependencies by fixing the dependency on `http-test-helper`.
# Important Notes
In a number of places importing all cats implicits could be simply replaced with a single or two method calls. Not to mention that this will reduce compilation times due to reduced implicit search space.
One example of how the changes affect performance (not only startup):
Before:
![Screenshot from 2024-06-11 12-05-24](https://github.com/enso-org/enso/assets/292128/a1a772a9-635d-4a16-a543-e2fd2124a22c)
Now:
![Screenshot from 2024-06-11 14-27-47](https://github.com/enso-org/enso/assets/292128/b17c7fcc-9a6d-48b9-8200-60708354ee03)
(frequently executed)
![Screenshot from 2024-06-12 12-46-34](https://github.com/enso-org/enso/assets/292128/31bc4dfd-4edc-45c9-9c5d-13e3472089b9)
Also appears to be gone.
This PR is by no means finished. The purge will continue in follow up PRs.
- related #7954
Changelog:
- update: Ydoc starts with the language server on the `localhost:1234` by default. The hostname and ports can be configured by setting environment variables `LANGUAGE_SERVER_YDOC_HOSTNAME` and `LANGUAGE_SERVER_YDOC_PORT`
- update: by default `npm dev run` uses the node Ydoc server. You can control it with `POLYGLOT_YDOC_SERVER` env variable. For example,
```
env POLYGLOT_YDOC_SERVER='true' npm --workspace=enso-gui2 run dev
```
To connect to the Ydoc server running on the 1234 port (the one started with the language server)
⠀
```
env POLYGLOT_YDOC_SERVER='ws://127.0.0.1:1235' npm --workspace=enso-gui2 run dev
```
To connect to the provided URL. Can be useful for debugging when you start a separate Ydoc process.
- update: run `npm install` before the engine build. It is required to create the Ydoc JS bundle.
This PR bumps the FlatBuffers version used by the backend to `24.3.25` (the latest version as of now).
Since the newer FlatBuffers releases come with prebuilt binaries for all platforms we target, we can simplify the build process by simply downloading the required `flatc` binary from the official FlatBuffers GitHub release page. This allows us to remove the dependency on `conda`, which was the only reliable way to get the outdated `flatc`.
The `conda` setup has been removed from the CI steps and the relevant code has been removed from the build script.
The FlatBuffers version is no longer hard-coded in the Rust build script, it is inferred from the `build.sbt` definition (similar to GraalVM).
# Important Notes
This does not affect the GUI binary protocol implementation.
While I initially wanted to update it, it turned out farly non-trivial.
As there are multiple issues with the generated TS code, it was significantly refactored by hand and it is impossible to automatically update it. Work to address this problem is left as [a future task](https://github.com/enso-org/enso/issues/9658).
As the Flatbuffers binary protocol is guaranteed to be compatible between versions (unlike the generated sources), there should be no adverse effects from bumping `flatc` only on the backend side.
`Bump` library uses parser combinators behind the scenes which are known to be good at expressing grammars but are not performance-oriented.
This change ditches the dependency in favour of an existing Java implementation. `jsemver` implements the full specification, which is probably an overkill in our case, but proved to be an almost drop-in replacement for the previous library.
Closes#8692
# Important Notes
Peformance improvements:
- roughly 50ms compared to the previous approach (from 80ms to 20-40ms)
I don't see any time spent in the new implementation during startup so it could be potentially aggressively inlined.
Further more, we could use a facade and offer our own strip down version of semver.
There are two projects transitively required by `runtime`, that have akka dependencies:
- `downloader`
- `connected-lock-manager`
This PR replaces the `akka-http` dependency in `downloader` by HttpClient from JDK, and splits `connected-lock-manager` into two projects such that there are no akka classes in `runtime.jar`.
# Important Notes
- Simplify the `downloader` project - remove akka.
- Add HTTP tests to the `downloader` project that uses our `http-test-helper` that is normally used for stdlib tests.
- It required few tweaks so that we can embed that server in a unit test.
- Split `connected-lock-manager` project into two projects - remove akka from `runtime`.
- **Native image build fixes and quality of life improvements:**
- Output of `native-image` is captured 743e167aa4
- The output will no longer be intertwined with the output from other commands on the CI.
- Arguments to the `native-image` are passed via an argument file, not via command line - ba0a69de6e
- This resolves an issue on Windows with "Command line too long", for example in https://github.com/enso-org/enso/actions/runs/7934447148/job/21665456738?pr=8953#step:8:2269
- ✅Linting fixes and groups.
- ✅Add `File.from that:Text` and use `File` conversions instead of taking both `File` and `Text` and calling `File.new`.
- ✅Align Unix Epoc with the UTC timezone and add converting from long value to `Date_Time` using it.
- ❌Add simple first logging API allowing writing to log messages from Enso.
- ✅Fix minor style issue where a test type had a empty constructor.
- ❌Added a `long` based array builder.
- Added `File_By_Line` to read a file line by line.
- Added "fast" JSON parser based off Jackson.
- ✅Altered range `to_vector` to be a proxy Vector.
- ✅Added `at` and `get` to `Database.Column`.
- ✅Added `get` to `Table.Column`.
- ✅Added ability to expand `Vector`, `Array` `Range`, `Date_Range` to columns.
- ✅Altered so `expand_to_column` default column name will be the same as the input column (i.e. no `Value` suffix).
- ✅Added ability to expand `Map`, `JS_Object` and `Jackson_Object` to rows with two columns coming out (and extra key column).
- ✅ Fixed bug where couldn't use integer index to expand to rows.
Upgrade to GraalVM JDK 21.
```
> java -version
openjdk version "21" 2023-09-19
OpenJDK Runtime Environment GraalVM CE 21+35.1 (build 21+35-jvmci-23.1-b15)
OpenJDK 64-Bit Server VM GraalVM CE 21+35.1 (build 21+35-jvmci-23.1-b15, mixed mode, sharing)
```
With SDKMan, download with `sdk install java 21-graalce`.
# Important Notes
- After this PR, one can theoretically run enso with any JRE with version at least 21.
- Removed `sbt bootstrap` hack and all the other build time related hacks related to the handling of GraalVM distribution.
- `project-manager` remains backward compatible - it can open older engines with runtimes. New engines now do no longer require a separate runtime to be downloaded.
- sbt does not support compilation of `module-info.java` files in mixed projects - https://github.com/sbt/sbt/issues/3368
- Which means that we can have `module-info.java` files only for Java-only projects.
- Anyway, we need just a single `module-info.class` in the resulting `runtime.jar` fat jar.
- `runtime.jar` is assembled in `runtime-with-instruments` with a custom merge strategy (`sbt-assembly` plugin). Caching is disabled for custom merge strategies, which means that re-assembly of `runtime.jar` will be more frequent.
- Engine distribution contains multiple JAR archives (modules) in `component` directory, along with `runner/runner.jar` that is hidden inside a nested directory.
- The new entry point to the engine runner is [EngineRunnerBootLoader](https://github.com/enso-org/enso/pull/7991/files#diff-9ab172d0566c18456472aeb95c4345f47e2db3965e77e29c11694d3a9333a2aa) that contains a custom ClassLoader - to make sure that everything that does not have to be loaded from a module is loaded from `runner.jar`, which is not a module.
- The new command line for launching the engine runner is in [distribution/bin/enso](https://github.com/enso-org/enso/pull/7991/files#diff-0b66983403b2c329febc7381cd23d45871d4d555ce98dd040d4d1e879c8f3725)
- [Newest version of Frgaal](https://repo1.maven.org/maven2/org/frgaal/compiler/20.0.1/) (20.0.1) does not recognize `--source 21` option, only `--source 20`.
* Reduce extra output in compilation and tests
I couldn't stand the amount of extra output that we got when compiling
a clean project and when executing regular tests. We should strive to
keep output clean and not print anything additional to stdout/stderr.
* Getting rid of explicit setup by service loading
In order for SL4J to use service loading correctly had to upgrade to
latest slf4j. Unfortunately `TestLogProvider` which essentially
delegates to `logback` provider will lead to spurious ambiguous warnings
on multiple providers. In order to dictate which one to use and
therefore eliminate the warnings we can use the `slf4j.provider` env
var, which is only available in slf4j 2.x.
Now, there is no need to explicitly call `LoggerSetup.get().setup()` as
that is being called during service setup.
* legal review
* linter
* Ensure ConsoleHandler uses the default level
ConsoleHandler's constructor uses `Level.INFO` which is unnecessary for
tests.
* report warnings
Implement new Enso documentation parser; remove old Scala Enso parser.
Performance: Total time parsing documentation is now ~2ms.
# Important Notes
- Doc parsing is now done only in the frontend.
- Some engine tests had never been switched to the new parser. We should investigate tests that don't pass after the switch: #5894.
- The option to run the old searcher has been removed, as it is obsolete and was already broken before this (see #5909).
- Some interfaces used only by the old searcher have been removed.
New plan to [fix the `sbt` build](https://www.pivotaltracker.com/n/projects/2539304/stories/182209126) and its annoying:
```
log.error(
"Truffle Instrumentation is not up to date, " +
"which will lead to runtime errors\n" +
"Fixes have been applied to ensure consistent Instrumentation state, " +
"but compilation has to be triggered again.\n" +
"Please re-run the previous command.\n" +
"(If this for some reason fails, " +
s"please do a clean build of the $projectName project)"
)
```
When it is hard to fix `sbt` incremental compilation, let's restructure our project sources so that each `@TruffleInstrument` and `@TruffleLanguage` registration is in individual compilation unit. Each such unit is either going to be compiled or not going to be compiled as a batch - that will eliminate the `sbt` incremental compilation issues without addressing them in `sbt` itself.
fa2cf6a33ec4a5b2e3370e1b22c2b5f712286a75 is the first step - it introduces `IdExecutionService` and moves all the `IdExecutionInstrument` API up to that interface. The rest of the `runtime` project then depends only on `IdExecutionService`. Such refactoring allows us to move the `IdExecutionInstrument` out of `runtime` project into independent compilation unit.
* Initial integration with Frgaal in sbt
Half-working since it chokes on generated classes from annotation
processor.
* Replace AutoService with ServiceProvider
For reasons unknown AutoService would fail to initialize and fail to
generate required builtin method classes.
Hidden error message is not particularly revealing on the reason for
that:
```
[error] error: Bad service configuration file, or exception thrown while constructing Processor object: javax.annotation.processing.Processor: Provider com.google.auto.service.processor.AutoServiceProcessor could not be instantiated
```
The sample records is only to demonstrate that we can now use newer Java
features.
* Cleanup + fix benchmark compilation
Bench requires jmh classes which are not available because we obviously
had to limit `java.base` modules to get Frgaal to work nicely.
For now, we default to good ol' javac for Benchmarks.
Limiting Frgaal to runtime for now, if it plays nicely, we can expand it
to other projects.
* Update CHANGELOG
* Remove dummy record class
* Update licenses
* New line
* PR review
* Update legal review
Co-authored-by: Radosław Waśko <radoslaw.wasko@enso.org>