close#6080
Changelog
- add: implement `SuggestionsRepo.insertAll` as a batch SQL insert
- update: `search/getSuggestionsDatabase` returns empty suggestions. Currently, the method is only used at startup and returns the empty response anyway because the libs are not loaded at that point.
- update: serialize only global (defined in the module scope) suggestions during the distribution building. There's no sense in storing the local library suggestions.
- update: sqlite dependency
- remove: unused methods from `SuggestionsRepo`
- remove: Arguments table
# Important Notes
Speeds up libraries loading by ~1 second.
![2023-04-03-173423_2086x324_scrot](https://user-images.githubusercontent.com/357683/229597470-19dcc010-2a34-43e1-87be-60af99afd275.png)
![2023-04-03-173514_2083x321_scrot](https://user-images.githubusercontent.com/357683/229597476-bf5b3c33-6321-4ac9-a0ca-2fb57d257857.png)
This is the first part of the #5158 umbrella task. It closes#5158, follow-up tasks are listed as a comment in the issue.
- Updates all prototype methods dealing with `Value_Type` with a proper implementation.
- Adds a more precise mapping from in-memory storage to `Value_Type`.
- Adds a dialect-dependent mapping between `SQL_Type` and `Value_Type`.
- Removes obsolete methods and constants on `SQL_Type` that were not portable.
- Ensures that in the Database backend, operation results are computed based on what the Database is meaning to return (by asking the Database about expected types of each operation).
- But also ensures that the result types are sane.
- While SQLite does not officially support a BOOLEAN affinity, we add a set of type overrides to our operations to ensure that Boolean operations will return Boolean values and will not be changed to integers as SQLite would suggest.
- Some methods in SQLite fallback to a NUMERIC affinity unnecessarily, so stuff like `max(text, text)` will keep the `text` type instead of falling back to numeric as SQLite would suggest.
- Adds ability to use custom fetch / builder logic for various types, so that we can support vendor specific types (for example, Postgres dates).
# Important Notes
- There are some TODOs left in the code. I'm still aligning follow-up tasks - once done I will try to add references to relevant tasks in them.
`--compile` command would run the compilation pipeline but silently omit any encountered errors, thus skipping the serialization. This maybe was a good idea in the past but it was problematic now that we generate indexes on build time.
This resulted in rather obscure errors (#6092) for modules that were missing their caches.
The change should significantly improve developers' experience when working on stdlib.
# Important Notes
Making compilation more resilient to sudden cache misses is a separate item to be worked on.
The change adds support for generating suggestions and bindings when using the convenient task for building individual stdlib components. By default commands do not generate index since it adds build time. But `buildStdLibAllWithIndex` will.
Closes#5999.
Implement new Enso documentation parser; remove old Scala Enso parser.
Performance: Total time parsing documentation is now ~2ms.
# Important Notes
- Doc parsing is now done only in the frontend.
- Some engine tests had never been switched to the new parser. We should investigate tests that don't pass after the switch: #5894.
- The option to run the old searcher has been removed, as it is obsolete and was already broken before this (see #5909).
- Some interfaces used only by the old searcher have been removed.
Adds a common project that allows sharing code between the `runtime` and `std-bits`.
Due to classpath separation and the way it is compiled, the classes will be duplicated - we will have one copy for the `runtime` classpath and another copy as a small JAR for `Standard.Base` library.
This is still much better than having the code duplicated - now at least we have a single source of truth for the shared implementations.
Due to the copying we should not expand this project too much, but I encourage to put here any methods that would otherwise require us to copy the code itself.
This may be a good place to put parts of the hashing logic to then allow sharing the logic between the `runtime` and the `MultiValueKey` in the `Table` library (cc: @Akirathan).
This change adds serialization and deserialization of library bindings.
In order to be functional, one needs to first generate IR and
serialize bindings using `--compiled <path-to-library>` command. The bindings
will be stored under the library with `.bindings` suffix.
Bindings are being generated during `buildEngineDistribution` task, thus not
requiring any extra steps.
When resolving import/exports the compiler will first try to load
module's bindings from cache. If successful, it will not schedule its
imports/exports for immediate compilation, as we always did, but use the
bindings info to infer the dependent modules.
The current change does not make any optimizations when it comes to
compiling the modules, yet. It only delays the actual
compilation/loading IR from cache so that it can be done in bulk.
Further optimizations will come from this opportunity such as parallel
loading of caches or lazily inferring only the necessary modules.
Part of https://github.com/enso-org/enso/issues/5568 work.
Creating two `findExceptionMessage` methods in `HostEnsoUtils` and in `VisualizationResult`. Why two? Because one of them is using `org.graalvm.polyglot` SDK as it runs in _"normal Java"_ mode. The other one is using Truffle API as it is running inside of partially evaluated instrument.
There is a `FindExceptionMessageTest` to guarantee consistency between the two methods. It simulates some exceptions in Enso code and checks that both methods extract the same _"message"_ from the exception. The tests verifies hosted and well as Enso exceptions - however testing other polyglot languages is only possible in other modules - as such I created `PolyglotFindExceptionMessageTest` - but that one doesn't have access to Truffle API - e.g. it doesn't really check the consistency - just that a reasonable message is extracted from a JavaScript exception.
# Important Notes
This is not full fix of #5260 - something needs to be done on the IDE side, as the IDE seems to ignore the delivered JSON message - even if it contains properly extracted exception message.
Automating the assembly of the engine and its execution into a single task. If you are modifying standard libraries, engine sources or Enso tests, you can launch `sbt` and then just:
```
sbt:enso> runEngineDistribution --run test/Tests/src/Data/Maybe_Spec.enso
[info] Engine package created at built-distribution/enso-engine-0.0.0-dev-linux-amd64/enso-0.0.0-dev
[info] Executing built-distribution/enso-engine-...-dev/bin/enso --run test/Tests/src/Data/Maybe_Spec.enso
Maybe: [5/5, 30ms]
- should have a None variant [14ms]
- should have a Some variant [5ms]
- should provide the `maybe` function [4ms]
- should provide `is_some` [2ms]
- should provide `is_none` [3ms]
5 tests succeeded.
0 tests failed.unEngineDistribution 4s
0 tests skipped.
```
the [runEngineDistribution](3a581f29ee/docs/CONTRIBUTING.md (running-enso)) `sbt` input task makes sure all your sources are properly compiled and only then executes your enso source. Everything ready at a single press of Enter.
# Important Notes
To debug in chrome dev tools, just add `--inspect`:
```
sbt:enso> runEngineDistribution --inspect --run test/Tests/src/Data/Maybe_Spec.enso
E.g. in Chrome open: devtools://devtools/bundled/js_app.html?ws=127.0.0.1:9229/7JsgjXlntK8
```
everything gets build and one can just attach the Enso debugger.
- Updated `Widget.Vector_Editor` ready for use by IDE team.
- Added `get` to `Row` to make API more aligned.
- Added `first_column`, `second_column` and `last_column` to `Table` APIs.
- Adjusted `Column_Selector` and associated methods to have simpler API.
- Removed `Column` from `Aggregate_Column` constructors.
- Added new `Excel_Workbook` type and added to `Excel_Section`.
- Added new `SQLiteFormatSPI` and `SQLite_Format`.
- Added new `IamgeFormatSPI` and `Image_Format`.
Before, any failures of the Rust-side of the parser build would be swallowed by sbt, for example if I add gibberish to the Rust code I will get:
<img width="474" alt="image" src="https://user-images.githubusercontent.com/1436948/217374050-fd9ddaca-136c-459e-932e-c4b9e630d610.png">
This is problematic, because when users are compiling in SBT they may get confusing errors about Java files not being found whereas the true cause is hard to track down because it is somewhere deep in the logs. We've run into this silent failure when setting up SBT builds together with @GregoryTravis today.
I suggest to change it so that once cargo fails, the build is failed with a helpful message - this way it will be easier to track down the issues.
With these changes we get:
<img width="802" alt="image" src="https://user-images.githubusercontent.com/1436948/217374531-707ae348-4c55-4d62-9a86-93850ad8086b.png">
In order to investigate `engine/language-server` project, I need to be able to open its sources in IGV and NetBeans.
# Important Notes
By adding same Java source (this time `package-info.java`) and compiling with our Frgaal compiler the necessary `.enso-sources*` files are generated for `engine/language-server` and then the `enso4igv` plugin can open them and properly understand their compile settings.
![Logical View of language-server project](https://user-images.githubusercontent.com/26887752/215472696-ec9801f3-4692-4bdb-be92-c4d2ab552e60.png)
In addition to that this PR enhances the _"logical view"_ presentation of the project by including all source roots found under `src/*/*`.
Enso unit tests were running without `-ea` check enabled and as such various invariant checks in Truffle code were not executed. Let's turn the `-ea` flag on and fix all the code misbehaves.
- add: `GeneralAnnotation` IR node for `@name expression` annotations
- update: compilation pipeline to process the annotation expressions
- update: rewrite `OverloadsResolution` compiler pass so that it keeps the order of module definitions
- add: `Meta.get_annotation` builtin function that returns the result of annotation expression
- misc: improvements (private methods, lazy arguments, build.sbt cleanup)
Introduces unboxed (and arity-specialized) storage schemes for Atoms. It results in improvements both in memory consumption and runtime.
Memory wise: instead of using an array, we now use object fields. We also enable unboxing. This cuts a good few pointers in an unboxed object. E.g. a quadruple of integers is now 64 bytes (4x8 bytes for long fields + 16 bytes for layout and constructor pointers + 16 bytes for a class header). It used to be 168 bytes (4x24 bytes for boxed Longs + 16 bytes for array header + 32 bytes for array contents + 8 bytes for constructor ptr + 16 bytes for class header), so we're saving 104 bytes a piece. In the least impressive scenarios (all-boxed fields) we're saving 8 bytes per object (saving 16 bytes for array header, using 8 bytes for the new layout field). In the most-benchmarked case (list of longs), we save 32 bytes per cons-cell.
Time wise:
All list-summing benchmarks observe a ~2x speedup. List generation benchmarks get ~25x speedups, probably both due to less GC activity and better allocation characteristics (only allocating one object per Cons, rather than Cons + Object[] for fields). The "map-reverse" family gets a neat 10x speedup (part of the work is reading, which is 2x faster, the other is allocating, which is now 25x faster, we end up with 10x when combined).
Use `InteropLibrary.isString` and `asString` to convert any string value to `byte[]`
# Important Notes
Also contains a support for `Metadata.assertInCode` to help locating the right place in the code snippets.
`runtime-with-instruments` project sets `-Dgraalvm.locatorDisabled=true` that disables the discovery of available polyglot languages (installed with `gu`). On the other hand, enabling locator makes polyglot languages available, but also makes the program classes and the test classes loaded with different classloaders. This way we're unable to use `EnsoContext` in tests to observe internal context state (there is an exception when you try to cast to `EnsoContext`).
The solution is to move tests with enabled polyglot support, but disabled `EnsoContext` introspection to a separate project.
Disabling `musl` as it isn't capable to load dynamic library.
# Important Notes
With this change it is possible to:
```
$ sbt bootstrap
$ sbt engine-runner/buildNativeImage
$ ./runner --run ./engine/runner/src/test/resources/Factorial.enso 3
6
$ ./runner --run ./engine/runner/src/test/resources/Factorial.enso 4
24
$ ./runner --run ./engine/runner/src/test/resources/Factorial.enso 100
93326215443944152681699238856266700490715968264381621468592963895217599993229915608941463976156518286253697920827223758251185210916864000000000000000000000000
```
Is it OK, @radeusgd to disable `musl`? If not, we would have to find a way to link the parser in statically, not dynamically.
Upgrading to GraalVM 22.3.0.
# Important Notes
- Removed all deprecated `FrameSlot`, and replaced them with frame indexes - integers.
- Add more information to `AliasAnalysis` so that it also gathers these indexes.
- Add quick build mode option to `native-image` as default for non-release builds
- `graaljs` and `native-image` should now be downloaded via `gu` automatically, as dependencies.
- Remove `engine-runner-native` project - native image is now build straight from `engine-runner`.
- We used to have `engine-runner-native` without `sqldf` in classpath as a workaround for an internal native image bug.
- Fixed chrome inspector integration, such that it shows values of local variables both for current stack frame and caller stack frames.
- There are still many issues with the debugging in general, for example, when there is a polyglot value among local variables, a `NullPointerException` is thrown and no values are displayed.
- Removed some deprecated `native-image` options
- Remove some deprecated Truffle API method calls.
Previously, when exporting the same module multiple times only the first statement would count and the rest would be discarded by the compiler.
This change allows for multiple exports of the same module e.g.,
```
export project.F1
from project.F1 export foo
```
Multiple exports may however lead to conflicts when combined with hiding names. Added logic in `ImportResolver` to detect such scenarios.
This fixes https://www.pivotaltracker.com/n/projects/2539304/stories/183092447
# Important Notes
Added a bunch of scenarios to simulate pos and neg results.
1-to-1 translation of the HTTPBin expected by our testsuite using Java's HttpServer.
Can be started from SBT via
```
sbt:enso> simple-httpbin/run <hostname> <port>
```
# Important Notes
@mwu-tow this will mean we can ditch Go dependency completely and replace it with the above call.
This change adds support for Version Controlled projects in language server.
Version Control supports operations:
- `init` - initialize VCS for a project
- `save` - commit all changes to the project in VCS
- `restore` - ability to restore project to some past `save`
- `status` - show the status of the project from VCS' perspective
- `list` - show a list of requested saves
# Important Notes
Behind the scenes, Enso's VCS uses git (or rather [jGit](https://www.eclipse.org/jgit/)) but nothing stops us from using a different implementation as long as it conforms to the establish API.
- Added expression ANTLR4 grammar and sbt based build.
- Added expression support to `set` and `filter` on the Database and InMemory `Table`.
- Added expression support to `aggregate` on the Database and InMemory `Table`.
- Removed old aggregate functions (`sum`, `max`, `min` and `mean`) from `Column` types.
- Adjusted database `Column` `+` operator to do concatenation (`||`) when text types.
- Added power operator `^` to both `Column` types.
- Adjust `iif` to allow for columns to be passed for `when_true` and `when_false` parameters.
- Added `is_present` to database `Column` type.
- Added `coalesce`, `min` and `max` functions to both `Column` types performing row based operation.
- Added support for `Date`, `Time_Of_Day` and `Date_Time` constants in database.
- Added `read` method to InMemory `Column` returning `self` (or a slice).
# Important Notes
- Moved approximate type computation to `SQL_Type`.
- Fixed issue in `LongNumericOp` where it was always casting to a double.
- Removed `head` from InMemory Table (still has `first` method).
Make sure `libenso_parser.so`, `.dll` or `.dylib` are packaged and included when `sbt buildEngineDistribution`.
# Important Notes
There was [a discussion](https://discord.com/channels/401396655599124480/1036562819644141598) about proper location of the library. It was concluded that _"there's no functional difference between a dylib and a jar."_ and as such the library is placed in `component` folder.
Currently the old parser is still used for parsing. This PR just integrates the build system changes and makes us ready for smooth flipping of the parser in the future as part of #3611.
We've had an old attempt at integrating a Rust parser with our Scala/Java projects. It seems to have been abandoned and is not used anywhere - it is also superseded by the new integration of the Rust parser. I think it was used as an experiment to see how to approach such an integration.
Since it is not used anymore - it make sense to remove it, because it only adds some (slight, but non-zero) maintenance effort. We can always bring it back from git history if necessary.
- Reimplement the `Duration` type to a built-in type.
- `Duration` is an interop type.
- Allow Enso method dispatch on `Duration` interop coming from different languages.
# Important Notes
- The older `Duration` type should now be split into new `Duration` builtin type and a `Period` type.
- This PR does not implement `Period` type, so all the `Period`-related functionality is currently not working, e.g., `Date - Period`.
- This PR removes `Integer.milliseconds`, `Integer.seconds`, ..., `Integer.years` extension methods.
This PR adds a possibility to generate native-image for engine-runner.
Note that due to on-demand loading of stdlib, programs that make use of it are currently not yet supported
(that will be resolved at a later point).
The purpose of this PR is only to make sure that we can generate a bare minimum runner because due to lack TruffleBoundaries or misconfiguration in reflection config, this can get broken very easily.
To generate a native image simply execute:
```
sbt> engine-runner-native/buildNativeImage
... (wait a few minutes)
```
The executable is called `runner` and can be tested via a simple test that is in the resources. To illustrate the benefits
see the timings difference between the non-native and native one:
```
>time built-distribution/enso-engine-0.0.0-dev-linux-amd64/enso-0.0.0-dev/bin/enso --no-ir-caches --in-project test/Tests/ --run engine/runner-native/src/test/resources/Factorial.enso 6
720
real 0m4.503s
user 0m9.248s
sys 0m1.494s
> time ./runner --run engine/runner-native/src/test/resources/Factorial.enso 6
720
real 0m0.176s
user 0m0.042s
sys 0m0.038s
```
# Important Notes
Notice that due to a [bug in GraalVM](https://github.com/oracle/graal/issues/4200), which is already fixed in 22.x, and us still being on 21.x for the time being, I had to add a workaround to our sbt build to build a different fat jar for native image. To workaround it I had to exclude sqlite jar. Hence native image task is on `engine-runner-native` and not on `engine-runner`.
Will need to add the above command to CI.
Implements https://www.pivotaltracker.com/story/show/182307143
# Important Notes
- Modified standard library Java helpers dependencies so that `std-table` module depends on `std-base`, as a provided dependency. This is allowed, because `std-table` is used by the `Standard.Table` Enso module which depends on `Standard.Base` which ensures that the `std-base` is loaded onto the classpath, thus whenever `std-table` is loaded by `Standard.Table`, so is `std-base`. Thus we can rely on classes from `std-base` and its dependencies being _provided_ on the classpath. Thanks to that we can use utilities like `Text_Utils` also in `std-table`, avoiding code duplication. Additional advantage of that is that we don't need to specify ICU4J as a separate dependency for `std-table`, since it is 'taken' from `std-base` already - so we avoid including it in our build packages twice.
This change adds support for matching on constants by:
1) extending parser to allow literals in patterns
2) generate branch node for literals
Related to https://www.pivotaltracker.com/story/show/182743559
Execution of `sbt runtime/bench` doesn't seem to be part of the gate. As such it can happen a change into the Enso language syntax, standard libraries, runtime & co. can break the benchmarks suite without being noticed. Integrating such PR causes unnecessary disruptions to others using the benchmarks.
Let's make sure verification of the benchmarks (e.g. that they compile and can execute without error) is part of the CI.
# Important Notes
Currently the gate shall fail. The fix is being prepared in parallel PR - #3639. When the two PRs are combined, the gate shall succeed again.
The option asks to print a final test report for each projects at the
end `sbt> run`.
That way, when running the task in aggregate mode, we have a summary at
the end, rather than somewhere in the large output of the individual
subproject.