* Eliminating circe-yaml
This change adds our very-own YAML parser on top of SnakeYAML. Compared
to Circe parser on top of SnakeYAML. The advantage? In some not-so-distant
future we might actually get rid of circe and the related performance
issues.
The logic is similar to what circe does i.e. analyzing SnakeYAML to
build our own structure.
This change is not complete, as there are still some tests failing, but
most common Configs are already parseable.
We _could_ auto-generate some of the code but still some of the logic
would have to be tweaked by hand; the current logic has a number of
special cases, as I found out the hard way.
* wip: more tests passing
* Fix remaining tests in ConfigSpec
* Fixing YAML decoder for editions
Dropping circe as a decoder for Editions revealed some problems. Turns
out the current implementation had even more special cases to deal with.
* nit
* Allow for empty exports
* Mostly complete encodin part
Replaced almost all `toYAML` locations with SnakeYAML equivalent.
The encoding has to use Java collections for which there exists a
built-in support. If we were to use Scala collections we would have to
deal with tagging, at the very least.
* Remove the last remaining Circe's YAML parser
* Bug fix + further loop optimization
* removal of some dependencies
* Remove circe-yaml
Added a custom SnakeYAML Node updater to mimick the JSON -> YAML -> JSON
conversion needed for updating fields. The algorithm recursively follows
the key-path and inserts the desired Node. This is not a performance
oriented code on purpose.
* Fix compilation issues
`circe-core` was marked as `provided` but no one eventually included it
in the final jar, hence `NoClassFoundException`.
* fix licensing
* Removing obsolete circe definitions
* fmt
* nits
* s/SnakeYamlDecoder/YamlDecoder
* fmt
* Partial revert, PM needs JSON decoders/encoders
* style
* incremental compilation gone wrong
The current implementation contains logic that should enable us to make some backward compatibility config changes.
At the same time, the logic is tightly integrated with circe's JSON library, which we want to eventually to get rid off.
Rather than trying to keep it somehow around and maintain via some hacks this PR proposes to ditch that logic completely as we currently have no use-case for such scenarios.
As a result, classes modelling YAML configs now don't have the extra fields and there is 1:1 correspondence.
Performance has also improved although that wasn't the main objective, yet. Follow up PR will attempt to replace `circe-yaml` with `snakeyaml` directly.
In preparation for #9113. Note that the dependency upgrade is necessary because it brings latest available `snakeyaml` (as part of `circe-yaml`).
Reducing the number of dependencies. Explicit `cats` are almost gone (present in `cli`). `enumeration` is completely gone. `cats` is also still included implicitly via `io.circe` but that's a different kind of beast.
Also, really removed `jackson` from dependencies by fixing the dependency on `http-test-helper`.
# Important Notes
In a number of places importing all cats implicits could be simply replaced with a single or two method calls. Not to mention that this will reduce compilation times due to reduced implicit search space.
One example of how the changes affect performance (not only startup):
Before:
![Screenshot from 2024-06-11 12-05-24](https://github.com/enso-org/enso/assets/292128/a1a772a9-635d-4a16-a543-e2fd2124a22c)
Now:
![Screenshot from 2024-06-11 14-27-47](https://github.com/enso-org/enso/assets/292128/b17c7fcc-9a6d-48b9-8200-60708354ee03)
(frequently executed)
![Screenshot from 2024-06-12 12-46-34](https://github.com/enso-org/enso/assets/292128/31bc4dfd-4edc-45c9-9c5d-13e3472089b9)
Also appears to be gone.
This PR is by no means finished. The purge will continue in follow up PRs.
- related #7954
Changelog:
- update: Ydoc starts with the language server on the `localhost:1234` by default. The hostname and ports can be configured by setting environment variables `LANGUAGE_SERVER_YDOC_HOSTNAME` and `LANGUAGE_SERVER_YDOC_PORT`
- update: by default `npm dev run` uses the node Ydoc server. You can control it with `POLYGLOT_YDOC_SERVER` env variable. For example,
```
env POLYGLOT_YDOC_SERVER='true' npm --workspace=enso-gui2 run dev
```
To connect to the Ydoc server running on the 1234 port (the one started with the language server)
⠀
```
env POLYGLOT_YDOC_SERVER='ws://127.0.0.1:1235' npm --workspace=enso-gui2 run dev
```
To connect to the provided URL. Can be useful for debugging when you start a separate Ydoc process.
- update: run `npm install` before the engine build. It is required to create the Ydoc JS bundle.
As reported by our users, when using the AWS SSO, our code was failing with:
```
Execution finished with an error: To use Sso related properties in the 'xyz' profile, the 'sso' service module must be on the class path.
```
This PR adds the missing JARs to fix that.
Additionally it improves the license review tool UX a bit (parts of #9122):
- sorting the report by amount of problems, so that dependencies with unresolved problems appear at the top,
- semi-automatic helper button to rename package configurations after a version bump,
- button to remove stale entries from config (files or copyrights that disappeared after update),
- button to add custom copyright notice text straight from the report UI,
- button to set a file as the license for the project (creating the `custom-license` file automatically)
- ability to filter processed projects - e.g. `openLegalReviewReport AWS` will only run on the AWS subproject - saving time processing unchanged dependencies,
- updated the license search heuristic, fixing a problem with duplicates:
- if we had dependencies `netty-http` and `netty-http2`, because of a prefix-check logic, the notices for `netty-http` would also appear again for `netty-http2`, which is not valid. I have improved the heuristic to avoid these false positives and removed them from the current report.
- WIP: button to mark a license type as reviewed (not finished in this PR).
This change replaces an sqllite-backed suggestions' repo with a simple, in-memory, one.
As `completion` functionality has been implemented completely in GUI, there is no need to support it in backend, which simplifies a lot of functionality.
Closes#9650 and #9471.
# Important Notes
Loading suggestions and sending them to GUI on startup is almost instantaneous. Previously it would take ~10s just for `Standard.Base`.
This PR bumps the FlatBuffers version used by the backend to `24.3.25` (the latest version as of now).
Since the newer FlatBuffers releases come with prebuilt binaries for all platforms we target, we can simplify the build process by simply downloading the required `flatc` binary from the official FlatBuffers GitHub release page. This allows us to remove the dependency on `conda`, which was the only reliable way to get the outdated `flatc`.
The `conda` setup has been removed from the CI steps and the relevant code has been removed from the build script.
The FlatBuffers version is no longer hard-coded in the Rust build script, it is inferred from the `build.sbt` definition (similar to GraalVM).
# Important Notes
This does not affect the GUI binary protocol implementation.
While I initially wanted to update it, it turned out farly non-trivial.
As there are multiple issues with the generated TS code, it was significantly refactored by hand and it is impossible to automatically update it. Work to address this problem is left as [a future task](https://github.com/enso-org/enso/issues/9658).
As the Flatbuffers binary protocol is guaranteed to be compatible between versions (unlike the generated sources), there should be no adverse effects from bumping `flatc` only on the backend side.
* Initial connection to Snowflake via an account, username and password.
* Fix databases and schemas in Snowflake.
Add warehouses.
* Add warehouse.
Update schema dropdowns.
* Add ability to set warehouse and pass at connect.
* Fix for NPE in license review
* scalafmt
* Separate Snowflake from Database.
* Scala fmt.
* Legal Review
* Avoid using ARROW for snowflake.
* Tidy up Entity_Naming_Properties.
* Fix for separating Entity_Namimg_Properties.
* Allow some tweaking of Postgres dialect to allow snowflake to use as well.
* Working on reading Date, Time and Date Times.
* Changelog.
* Java format.
* Make Snowflake Time and TimeStamp stuff work.
Move some responsibilities to Type_Mapping.
* Make Snowflake Time and TimeStamp stuff work.
Move some responsibilities to Type_Mapping.
* fix
* Update distribution/lib/Standard/Database/0.0.0-dev/src/Connection/Connection.enso
Co-authored-by: Radosław Waśko <radoslaw.wasko@enso.org>
* PR comments.
* Last refactor for PR.
* Fix.
---------
Co-authored-by: Radosław Waśko <radoslaw.wasko@enso.org>
Co-authored-by: mergify[bot] <37929162+mergify[bot]@users.noreply.github.com>
- Closes#9120
- Reorders CI steps to do the license check last (to avoid it preventing tests from running which are more important than the license check)
- Tries to reword the warnings to be clearer
- Adds some CSS to the report to more clearly indicate which elements can be clicked.
`Bump` library uses parser combinators behind the scenes which are known to be good at expressing grammars but are not performance-oriented.
This change ditches the dependency in favour of an existing Java implementation. `jsemver` implements the full specification, which is probably an overkill in our case, but proved to be an almost drop-in replacement for the previous library.
Closes#8692
# Important Notes
Peformance improvements:
- roughly 50ms compared to the previous approach (from 80ms to 20-40ms)
I don't see any time spent in the new implementation during startup so it could be potentially aggressively inlined.
Further more, we could use a facade and offer our own strip down version of semver.
There are two projects transitively required by `runtime`, that have akka dependencies:
- `downloader`
- `connected-lock-manager`
This PR replaces the `akka-http` dependency in `downloader` by HttpClient from JDK, and splits `connected-lock-manager` into two projects such that there are no akka classes in `runtime.jar`.
# Important Notes
- Simplify the `downloader` project - remove akka.
- Add HTTP tests to the `downloader` project that uses our `http-test-helper` that is normally used for stdlib tests.
- It required few tweaks so that we can embed that server in a unit test.
- Split `connected-lock-manager` project into two projects - remove akka from `runtime`.
- **Native image build fixes and quality of life improvements:**
- Output of `native-image` is captured 743e167aa4
- The output will no longer be intertwined with the output from other commands on the CI.
- Arguments to the `native-image` are passed via an argument file, not via command line - ba0a69de6e
- This resolves an issue on Windows with "Command line too long", for example in https://github.com/enso-org/enso/actions/runs/7934447148/job/21665456738?pr=8953#step:8:2269
Updates Google_Api version for authentication and adds Google Analytics reporting api and run_google_report method.
This is an initial method for proof of concept, with further design changes to follow.
# Important Notes
Updates google-api-client to v 2.2.0 from 1.35.2
Adds google-analytics-data v 0.44.0
- ✅Linting fixes and groups.
- ✅Add `File.from that:Text` and use `File` conversions instead of taking both `File` and `Text` and calling `File.new`.
- ✅Align Unix Epoc with the UTC timezone and add converting from long value to `Date_Time` using it.
- ❌Add simple first logging API allowing writing to log messages from Enso.
- ✅Fix minor style issue where a test type had a empty constructor.
- ❌Added a `long` based array builder.
- Added `File_By_Line` to read a file line by line.
- Added "fast" JSON parser based off Jackson.
- ✅Altered range `to_vector` to be a proxy Vector.
- ✅Added `at` and `get` to `Database.Column`.
- ✅Added `get` to `Table.Column`.
- ✅Added ability to expand `Vector`, `Array` `Range`, `Date_Range` to columns.
- ✅Altered so `expand_to_column` default column name will be the same as the input column (i.e. no `Value` suffix).
- ✅Added ability to expand `Map`, `JS_Object` and `Jackson_Object` to rows with two columns coming out (and extra key column).
- ✅ Fixed bug where couldn't use integer index to expand to rows.
- After [suggestion](https://github.com/enso-org/enso/pull/8497#discussion_r1429543815) from @JaroslavTulach I have tried reimplementing the URL encoding using just `URLEncode` builtin util. I will see if this does not complicate other followup improvements, but most likely all should work so we should be able to get rid of the unnecessary bloat.
- Closes#8352
- ~~Proposed fix for #8493~~
- The temporary fix is deemed not viable. I will try to figure out a workaround and leave fixing #8493 to the engine team.
Adds these JAR modules to the `component` directory inside Engine distribution:
- `graal-language-23.1.0`
- `org.bouncycastle.*` - these need to be added for graalpy language
# Important Notes
- Remove `org.bouncycastle.*` packages from `runtime.jar` fat jar.
- Make sure that the `./run` script preinstalls GraalPy standalone distribution before starting engine tests
- Note that using `python -m venv` is only possible from standalone distribution, we cannot distribute `graalpython-launcher`.
- Make sure that installation of `numpy` and its polyglot execution example works.
- Convert `Text` to `TruffleString` before passing to GraalPy - 8ee9a2816f
Upgrade to GraalVM JDK 21.
```
> java -version
openjdk version "21" 2023-09-19
OpenJDK Runtime Environment GraalVM CE 21+35.1 (build 21+35-jvmci-23.1-b15)
OpenJDK 64-Bit Server VM GraalVM CE 21+35.1 (build 21+35-jvmci-23.1-b15, mixed mode, sharing)
```
With SDKMan, download with `sdk install java 21-graalce`.
# Important Notes
- After this PR, one can theoretically run enso with any JRE with version at least 21.
- Removed `sbt bootstrap` hack and all the other build time related hacks related to the handling of GraalVM distribution.
- `project-manager` remains backward compatible - it can open older engines with runtimes. New engines now do no longer require a separate runtime to be downloaded.
- sbt does not support compilation of `module-info.java` files in mixed projects - https://github.com/sbt/sbt/issues/3368
- Which means that we can have `module-info.java` files only for Java-only projects.
- Anyway, we need just a single `module-info.class` in the resulting `runtime.jar` fat jar.
- `runtime.jar` is assembled in `runtime-with-instruments` with a custom merge strategy (`sbt-assembly` plugin). Caching is disabled for custom merge strategies, which means that re-assembly of `runtime.jar` will be more frequent.
- Engine distribution contains multiple JAR archives (modules) in `component` directory, along with `runner/runner.jar` that is hidden inside a nested directory.
- The new entry point to the engine runner is [EngineRunnerBootLoader](https://github.com/enso-org/enso/pull/7991/files#diff-9ab172d0566c18456472aeb95c4345f47e2db3965e77e29c11694d3a9333a2aa) that contains a custom ClassLoader - to make sure that everything that does not have to be loaded from a module is loaded from `runner.jar`, which is not a module.
- The new command line for launching the engine runner is in [distribution/bin/enso](https://github.com/enso-org/enso/pull/7991/files#diff-0b66983403b2c329febc7381cd23d45871d4d555ce98dd040d4d1e879c8f3725)
- [Newest version of Frgaal](https://repo1.maven.org/maven2/org/frgaal/compiler/20.0.1/) (20.0.1) does not recognize `--source 21` option, only `--source 20`.
The change upgrades `directory-watcher` library, hoping that it will fix the problem reported in #7695 (there has been a number of bug fixes in MacOS listener since then).
Once upgraded, tests in `WatcherAdapterSpec` because the logic that attempted to ensure the proper initialization order in the test using semaphore was wrong. Now starting the watcher using `watchAsync` which only returns the future when the watcher successfully registers for paths. Ideally authors of the library would make the registration bit public
(3218d68a84/core/src/main/java/io/methvin/watcher/DirectoryWatcher.java (L229C7-L229C20)) but it is the best we can do so far.
Had to adapt to the new API in PathWatcher as well, ensuring the right order of initialization.
Should fix#7695.
Having a modest-size files in a project would lead to a timeout when the project was first initialized. This became apparent when testing delivered `.enso-project` files with some data files. After some digging there was a bug in JGit
(https://bugs.eclipse.org/bugs/show_bug.cgi?id=494323) which meant that adding such files was really slow. The implemented fix is not on by default but even with `--renormalization` turned off I did not see improvement.
In the end it didn't make sense to add `data` directory to our version control, or any other files than those in `src` or some meta files in `.enso`. Not including such files eliminates first-use initialization problems.
# Important Notes
To test, pick an existing Enso project with some data files in it (> 100MB) and remove `.enso/.vcs` directory. Previously it would timeout on first try (and work in successive runs). Now it works even on the first try.
The crash:
```
[org.enso.languageserver.requesthandler.vcs.InitVcsHandler] Initialize project request [Number(2)] for [f9a7cd0d-529c-4e1d-a4fa-9dfe2ed79008] failed with: null.
java.util.concurrent.TimeoutException: null
at org.enso.languageserver.effect.ZioExec$.<clinit>(Exec.scala:134)
at org.enso.languageserver.effect.ZioExec.$anonfun$exec$3(Exec.scala:60)
at org.enso.languageserver.effect.ZioExec.$anonfun$exec$3$adapted(Exec.scala:60)
at zio.ZIO.$anonfun$foldCause$4(ZIO.scala:683)
at zio.internal.FiberRuntime.runLoop(FiberRuntime.scala:904)
at zio.internal.FiberRuntime.evaluateEffect(FiberRuntime.scala:381)
at zio.internal.FiberRuntime.evaluateMessageWhileSuspended(FiberRuntime.scala:504)
at zio.internal.FiberRuntime.drainQueueOnCurrentThread(FiberRuntime.scala:220)
at zio.internal.FiberRuntime.run(FiberRuntime.scala:139)
at akka.dispatch.TaskInvocation.run(AbstractDispatcher.scala:49)
at java.base/java.util.concurrent.ForkJoinTask$RunnableExecuteAction.exec(ForkJoinTask.java:1395)
at java.base/java.util.concurrent.ForkJoinTask.doExec(ForkJoinTask.java:373)
at java.base/java.util.concurrent.ForkJoinPool$WorkQueue.topLevelExec(ForkJoinPool.java:1182)
```
* Reduce extra output in compilation and tests
I couldn't stand the amount of extra output that we got when compiling
a clean project and when executing regular tests. We should strive to
keep output clean and not print anything additional to stdout/stderr.
* Getting rid of explicit setup by service loading
In order for SL4J to use service loading correctly had to upgrade to
latest slf4j. Unfortunately `TestLogProvider` which essentially
delegates to `logback` provider will lead to spurious ambiguous warnings
on multiple providers. In order to dictate which one to use and
therefore eliminate the warnings we can use the `slf4j.provider` env
var, which is only available in slf4j 2.x.
Now, there is no need to explicitly call `LoggerSetup.get().setup()` as
that is being called during service setup.
* legal review
* linter
* Ensure ConsoleHandler uses the default level
ConsoleHandler's constructor uses `Level.INFO` which is unnecessary for
tests.
* report warnings
* Always log verbose to a file
The change adds an option by default to always log to a file with
verbose log level.
The implementation is a bit tricky because in the most common use-case
we have to always log in verbose mode to a socket and only later apply
the desired log levels. Previously socket appender would respect the
desired log level already before forwarding the log.
If by default we log to a file, verbose mode is simply ignored and does
not override user settings.
To test run `project-manager` with `ENSO_LOGSERVER_APPENDER=console` env
variable. That will output to the console with the default `INFO` level
and `TRACE` log level for the file.
* add docs
* changelog
* Address some PR requests
1. Log INFO level to CONSOLE by default
2. Change runner's default log level from ERROR to WARN
Took a while to figure out why the correct log level wasn't being passed
to the language server, therefore ignoring the (desired) verbose logs
from the log file.
* linter
* 3rd party uses log4j for logging
Getting rid of the warning by adding a log4j over slf4j bridge:
```
ERROR StatusLogger Log4j2 could not find a logging implementation. Please add log4j-core to the classpath. Using SimpleLogger to log to the console...
```
* legal review update
* Make sure tests use test resources
Having `application.conf` in `src/main/resources` and `test/resources`
does not guarantee that in Tests we will pick up the latter. Instead, by
default it seems to do some kind of merge of different configurations,
which is far from desired.
* Ensure native launcher test log to console only
Logging to console and (temporary) files is problematic for Windows.
The CI also revealed a problem with the native configuration because it
was not possible to modify the launcher via env variables as everything
was initialized during build time.
* Adapt to method changes
* Potentially deal with Windows failures
- Add type detection for `Mixed` columns when calling column functions.
- Excel uses column name for missing headers.
- Add aliases for parse functions on text.
- Adjust `Date`, `Time_Of_Day` and `Date_Time` parse functions to not take `Nothing` anymore and provide dropdowns.
- Removed built-in parses.
- All support Locale.
- Add support for missing day or year for parsing a Date.
- All will trim values automatically.
- Added ability to list AWS profiles.
- Added ability to list S3 buckets.
- Workaround for Table.aggregate so default item added works.
close#6080
Changelog
- add: implement `SuggestionsRepo.insertAll` as a batch SQL insert
- update: `search/getSuggestionsDatabase` returns empty suggestions. Currently, the method is only used at startup and returns the empty response anyway because the libs are not loaded at that point.
- update: serialize only global (defined in the module scope) suggestions during the distribution building. There's no sense in storing the local library suggestions.
- update: sqlite dependency
- remove: unused methods from `SuggestionsRepo`
- remove: Arguments table
# Important Notes
Speeds up libraries loading by ~1 second.
![2023-04-03-173423_2086x324_scrot](https://user-images.githubusercontent.com/357683/229597470-19dcc010-2a34-43e1-87be-60af99afd275.png)
![2023-04-03-173514_2083x321_scrot](https://user-images.githubusercontent.com/357683/229597476-bf5b3c33-6321-4ac9-a0ca-2fb57d257857.png)
Implement new Enso documentation parser; remove old Scala Enso parser.
Performance: Total time parsing documentation is now ~2ms.
# Important Notes
- Doc parsing is now done only in the frontend.
- Some engine tests had never been switched to the new parser. We should investigate tests that don't pass after the switch: #5894.
- The option to run the old searcher has been removed, as it is obsolete and was already broken before this (see #5909).
- Some interfaces used only by the old searcher have been removed.
This change adds support for Version Controlled projects in language server.
Version Control supports operations:
- `init` - initialize VCS for a project
- `save` - commit all changes to the project in VCS
- `restore` - ability to restore project to some past `save`
- `status` - show the status of the project from VCS' perspective
- `list` - show a list of requested saves
# Important Notes
Behind the scenes, Enso's VCS uses git (or rather [jGit](https://www.eclipse.org/jgit/)) but nothing stops us from using a different implementation as long as it conforms to the establish API.
- Added expression ANTLR4 grammar and sbt based build.
- Added expression support to `set` and `filter` on the Database and InMemory `Table`.
- Added expression support to `aggregate` on the Database and InMemory `Table`.
- Removed old aggregate functions (`sum`, `max`, `min` and `mean`) from `Column` types.
- Adjusted database `Column` `+` operator to do concatenation (`||`) when text types.
- Added power operator `^` to both `Column` types.
- Adjust `iif` to allow for columns to be passed for `when_true` and `when_false` parameters.
- Added `is_present` to database `Column` type.
- Added `coalesce`, `min` and `max` functions to both `Column` types performing row based operation.
- Added support for `Date`, `Time_Of_Day` and `Date_Time` constants in database.
- Added `read` method to InMemory `Column` returning `self` (or a slice).
# Important Notes
- Moved approximate type computation to `SQL_Type`.
- Fixed issue in `LongNumericOp` where it was always casting to a double.
- Removed `head` from InMemory Table (still has `first` method).
Implements https://www.pivotaltracker.com/story/show/182307143
# Important Notes
- Modified standard library Java helpers dependencies so that `std-table` module depends on `std-base`, as a provided dependency. This is allowed, because `std-table` is used by the `Standard.Table` Enso module which depends on `Standard.Base` which ensures that the `std-base` is loaded onto the classpath, thus whenever `std-table` is loaded by `Standard.Table`, so is `std-base`. Thus we can rely on classes from `std-base` and its dependencies being _provided_ on the classpath. Thanks to that we can use utilities like `Text_Utils` also in `std-table`, avoiding code duplication. Additional advantage of that is that we don't need to specify ICU4J as a separate dependency for `std-table`, since it is 'taken' from `std-base` already - so we avoid including it in our build packages twice.
This change adds support for matching on constants by:
1) extending parser to allow literals in patterns
2) generate branch node for literals
Related to https://www.pivotaltracker.com/story/show/182743559