Follow-up of #8890
Refactor the rest of the tests to the builder API (`Test_New`):
- `Image_Tests`
- `Geo_Tests`
- `Google_Api_Test`
- `Examples_Test`
- `AWS_Tests`
- `Meta_Test_Suite_Tests`
- `Visualization_Tests`
# Important Notes
- Unrelated: Fix NPE in `File.new "/" . name`
- ✅Linting fixes and groups.
- ✅Add `File.from that:Text` and use `File` conversions instead of taking both `File` and `Text` and calling `File.new`.
- ✅Align Unix Epoc with the UTC timezone and add converting from long value to `Date_Time` using it.
- ❌Add simple first logging API allowing writing to log messages from Enso.
- ✅Fix minor style issue where a test type had a empty constructor.
- ❌Added a `long` based array builder.
- Added `File_By_Line` to read a file line by line.
- Added "fast" JSON parser based off Jackson.
- ✅Altered range `to_vector` to be a proxy Vector.
- ✅Added `at` and `get` to `Database.Column`.
- ✅Added `get` to `Table.Column`.
- ✅Added ability to expand `Vector`, `Array` `Range`, `Date_Range` to columns.
- ✅Altered so `expand_to_column` default column name will be the same as the input column (i.e. no `Value` suffix).
- ✅Added ability to expand `Map`, `JS_Object` and `Jackson_Object` to rows with two columns coming out (and extra key column).
- ✅ Fixed bug where couldn't use integer index to expand to rows.
Refactor `Base_Tests` to `Test_New` testing framework. Mostly automatic text replacements.
# Important Notes
List of changes that were not done automatically (not via automatic text replacement):
- Fix indexes in Instrumentor_Spec - f590c4a398
- If group or spec is pending, its block is not evaluated - 8d797f1a4a
- Spec_Result is not private - 5767535af2
Tests marked as *pending*:
- #8913
- #8910
- Closes#8808 - adds tests for various scenarios.
- Implements `size` using HEAD.
- Updates existing functions to changes in Cloud API.
- Adds stubs for `*_time` methods, `parent`, `path`.
- [x] TODO: resolve the `Enso_File.current_working_directory` from an environment variable.
- ~~TODO: recursive directory deletion?~~ left for later
# Important Notes
- Currently, the Cloud API does not offer an easy way to extract metadata for a file, in particular to get the parent folder from the file `id`.
- We should be able to get the parent, and stuff like creation/modified time.
- We need a way to resolve paths to asset ids, for `path` to work as well as `current_working_directory`.
- What is the environment variable that will be used to feed the `current_working_directory` property?
Refactor `test/Table_Test` to the builder API. The builder API is in a new library called `Test_New` that is alongside the old `Test` library. There will be follow-up PRs that will migrate the rest of the tests. Meanwhile, let's keep these two libraries, and merge them after the last PR.
# Important Notes
- For a brief introduction into the new API, see **Prototype 1** section in https://github.com/enso-org/enso/pull/8622#issuecomment-1889706168
- When executing all the tests, the behavior should be the same as with the old library. With the only exception that if `ENSO_TEST_ANSI_COLORS` env var is set, the output is more colorful than it used to be.
Goal of this PR is to refactor the design of OrderMask and avoid copying arrays or lists wherever possible.
We have removed a few legacy functions which were not being used.
On a poor mans benchmark seems to be quicker (13s vs 16s) and memory usage should be lower.
- Closes#8723
- Adds some missing features that were needed to make this work:
- `Enso_File.create_directory` and `Enso_File.delete`, and basic tests for it
- Changes how `Enso_Secret.list` is obtained - using a different Cloud endpoint allows us to implement the desired logic, the default endpoint was giving us _all_ secrets which was not what we wanted here.
- Implements `Enso_Secret.update` and tests for it
# Important Notes
Notes describing any problems with the current Cloud API:
https://docs.google.com/document/d/1x8RUt3KkwyhlxGux7XUGfOdtFSAZV3fI9lSSqQ3XsXk/edit
Apparently, everything that was needed to make this feature work has already been implemented, although a few features needed workarounds on Enso side to work properly.
- `up_to` should have the `step` as an optional argument.
- `Date_Range` conversion can do a clever auto rename, so if a period use the name of it.
- Add `to Table` for `Range`, `Date_Range`.
Random.Seed doesn't work in the GUI and the TEXT_ONLY tag doesn't do anything so it was incorrectly showing up in the component browser.
This MR makes Random.Seed private to hide it from the GUI and completely removes the TEXT_ONLY tag which is unused and unimplemented.
Implements `Warnings.get_all wrap_errors=True` which wraps warnings attached to values inside vectors with `Map_Error`, which includes the position of the value within the vector. See [the documentation](https://github.com/enso-org/enso/blob/develop/docs/semantics/wrapped-errors.md) for more details.
`get_all wrap_errors=True` does not change the warnings that are attached to values -- it wraps them before returning them to the caller, but does not change the original warnings attached to the values.
Wrapped warnings only appear attached to the vector itself. The values inside the vector do not have their warnings wrapped.
Warning propagation is not changed at all; `Warnings.get_all` (with default `wrap_errors=False`) behaves as before. `get_all wrap_errors=True` is meant to be used primarily by the IDE, although it can be used anywhere this wrapping is desired.
This is the follow up PR addressing the last couple of points from https://github.com/enso-org/enso/pull/8643 around what the return type from fill_nothing.
# Important Notes
The biggest change is changing what we size we need for an empty string. This change says a variable length string of length 1 and does it at a low enough level that it will effect the whole language. But I think that is correct.
Give file read its own helper widget for delimiters. Remove newline add none. The file read delimiter is similar but different to the split one and so should have its own set of options.
- Closes#8555
- Refactors the file format detection logic, compacting lots of repetitive logic for HTTP handling into helper functions.
- Some updates to CODEOWNERS.
- After [suggestion](https://github.com/enso-org/enso/pull/8497#discussion_r1429543815) from @JaroslavTulach I have tried reimplementing the URL encoding using just `URLEncode` builtin util. I will see if this does not complicate other followup improvements, but most likely all should work so we should be able to get rid of the unnecessary bloat.
- Closes#8354
- Extends `simple-httpbin` with a simple mock of the Cloud API (currently it checks the token and serves the `/users` endpoint).
- Renames `simple-httpbin` to `http-test-helper`.
- Closes#8352
- ~~Proposed fix for #8493~~
- The temporary fix is deemed not viable. I will try to figure out a workaround and leave fixing #8493 to the engine team.
- Closes#8111 by making sure that all Excel workbooks are read using a backing file (which should be more memory efficient).
- If the workbook is being opened from an input stream, that stream is materialized to a `Temporary_File`.
- Adds tests fetching Table formats from HTTP.
- Extends `simple-httpbin` with ability to serve files for our tests.
- Ensures that the `Infer` option on `Excel` format also works with streams, if content-type metadata is available (e.g. from HTTP headers).
- Implements a `Temporary_File` facility that can be used to create a temporary file that is deleted once all references to the `Temporary_File` instance are GCed.
* tests
* wip
* wip
* additional warnings
* wip
* wip
* cleanup
* nested wrapping
* multiple nestings
* wraps_error uses looks_for, test for should_fail_with
* wip
* stack trace line fix
* use catch_primitive internally
* fix warning mapping, dtf spec
* just one wrapper checker, vector spec
* missing ctor, back to non-primitive catch
* back to c_p
* put old map back
* wip
* unnest tests
* Array.map on_problems
* wip
* Revert "wip"
This reverts commit c30d171457.
* better test names
* warning logging
* wip
* wip
* move logic into ALH
* doc
* constant
* My_Error.Error
* nested
* doc
* map_primtiive in warning mapper
* composition
* ref spec
* Remove warnings prior to matching on the value
If an expression has warnings and is matched we:
1) extract the warnings
2) execute the branch of a pattern that matches the value
3) attach extracted warnings to the result
This caused warnings to reappear when doing the custom warnings
manipulation.
This is also consistent with how `CaseNode`'s `doWarning` specialization
is defined.
* fix 1
* do not auto unwrap in test error checkers
* nested error matcher
* in problems too
* dtf
* v
* statistics
* wip
* Table_Spec, map_with_index_primitive
* Column_Operations_Spec
* disable warning wrapping and Report_Warning
* unimpl test
* Warnings_Spec
* DCS
* ACG JP
* zip_primitive
* join_helpers
* Lookup_Helpers
* Table
* Data_Formatter
* Value_Type_Helpers
* revert check types changes
* table_helpers
* table tests
* remove st
* do not remove warnings from value
* vec docs, tests for zip, mwi, flat_map
* docs, fixes
* remove nested_error_matcher
* cleanup
* benchmark
* one error
* alter
* add bench to main
* review
* review
* review
* tail call
* changelog
* tail call was not a tail call
* ws
* bad import
* Added missing import
* Update distribution/lib/Standard/Base/0.0.0-dev/src/Data/Array.enso
Co-authored-by: Radosław Waśko <radoslaw.wasko@enso.org>
* review, ref example
* lazy benchmark data
* extra paren
* check outside of catch
* review
* vector too
* actually lazy
* disambiguate Map_Error
* finish rename
* move to extensions
* combine Additional_Warnings error
* rename to map_no_wrap
* do not catch and rethrow
* review
* wip
* remove _primitives entirely
* remove unused should_fail_with function options
* remove expected_warning as function in Problems
---------
Co-authored-by: Hubert Plociniczak <hubert.plociniczak@gmail.com>
Co-authored-by: Radosław Waśko <radoslaw.wasko@enso.org>
* Test illustrating problems with FQNs
Inline execution fails with `Compile error: The name `Standard` could
not be found.`.
* Ensure InlineContext carries Package Repos info
Previously, there was no requirement that inline execution should allow
for FQNs. This meant that the omission of Package Repository info went
unnoticed.
In order to be able to refer to `Standard.Visualization.Preprocessor` it
has to be exported as well.
- Add a `File_For_Read` type. Used for `File_Format` to read files.
- Added `Enso_User` representing the current user in `Enso_Cloud`.
- *Will be later able to list known users.*
- Added `Enso_Secret` representing a value defined in `Enso_Cloud`.
- Value not used within Enso only accessed within polyglot Java.
- Integrated into `Username_And_Password` and can be used within JDBC connections.
- Integrated into HTTP Headers so a secret can be used as a value.
- New `URI_With_Query` with the same API as `URI`. Supporting secrets in the value.
- *Will be integrated with AWS credentials.*
- Added `Enso_File` representing a file or a folder in the cloud.
- Support the same API as `File` (like the `S3_File`).
- *Will support `enso://` URI style access.*
Upgrade to GraalVM JDK 21.
```
> java -version
openjdk version "21" 2023-09-19
OpenJDK Runtime Environment GraalVM CE 21+35.1 (build 21+35-jvmci-23.1-b15)
OpenJDK 64-Bit Server VM GraalVM CE 21+35.1 (build 21+35-jvmci-23.1-b15, mixed mode, sharing)
```
With SDKMan, download with `sdk install java 21-graalce`.
# Important Notes
- After this PR, one can theoretically run enso with any JRE with version at least 21.
- Removed `sbt bootstrap` hack and all the other build time related hacks related to the handling of GraalVM distribution.
- `project-manager` remains backward compatible - it can open older engines with runtimes. New engines now do no longer require a separate runtime to be downloaded.
- sbt does not support compilation of `module-info.java` files in mixed projects - https://github.com/sbt/sbt/issues/3368
- Which means that we can have `module-info.java` files only for Java-only projects.
- Anyway, we need just a single `module-info.class` in the resulting `runtime.jar` fat jar.
- `runtime.jar` is assembled in `runtime-with-instruments` with a custom merge strategy (`sbt-assembly` plugin). Caching is disabled for custom merge strategies, which means that re-assembly of `runtime.jar` will be more frequent.
- Engine distribution contains multiple JAR archives (modules) in `component` directory, along with `runner/runner.jar` that is hidden inside a nested directory.
- The new entry point to the engine runner is [EngineRunnerBootLoader](https://github.com/enso-org/enso/pull/7991/files#diff-9ab172d0566c18456472aeb95c4345f47e2db3965e77e29c11694d3a9333a2aa) that contains a custom ClassLoader - to make sure that everything that does not have to be loaded from a module is loaded from `runner.jar`, which is not a module.
- The new command line for launching the engine runner is in [distribution/bin/enso](https://github.com/enso-org/enso/pull/7991/files#diff-0b66983403b2c329febc7381cd23d45871d4d555ce98dd040d4d1e879c8f3725)
- [Newest version of Frgaal](https://repo1.maven.org/maven2/org/frgaal/compiler/20.0.1/) (20.0.1) does not recognize `--source 21` option, only `--source 20`.
- Closes#5303
- Refactors `JoinStrategy` allowing us to 'stack' join strategies on top of each other (to some extent) - currently a `HashJoin` can be followed by another join strategy (currently `SortJoin`)
- Adds benchmarks for join
- Due to limitations of the sorting approach this will still not be as fast as possible for cases where there is more than 1 `Between` condition in a single query - trying to demonstrate that in benchmarks.
- We can replace sorting by d-dimensional [RangeTrees](https://en.wikipedia.org/wiki/Range_tree) to get `O((n + m) log^d n + k)` performance (where `n` and `m` are sizes of joined tables, `d` is the amount of `Between` conditions used in the query and `k` is the result set size).
- Follow up ticket for consideration later:
#8216
- Closes#8215
- After all, it turned out that `TreeSet` was problematic (because of not enough flexibility with duplicate key handling), so the simplest solution was to immediately implement this sub-task.
- Closes#8204
- Unrelated, but I ran into this here: adds type checks to other arguments of `set`.
- Before, putting in a Column as `new_name` (i.e. mistakenly messing up the order of arguments), lead to a hard to understand `Method `if_then_else` of type Column could not be found.`, instead now it would file with type error 'expected Text got Column`.
- Sets the default limit for `Table.read` in Database to be max 1000 rows.
- The limit for in-memory compatible API still defaults to `Nothing`.
- Adds a warning if there are more rows than limit.
- Enables a few unrelated asserts.
* doc
* one test
* date tests
* empty and nothing
* ints floats
* bools
* all columns
* regex and index
* locales
* bad formats
* all with one format
* docs
* examples, not impl db
* docs, more errors
* cleanup
* changelog
* check list
* reorder
* clue
* review
* review
* review
* review
* review
* review
* specify time zone
Added Blank_Selector constructor and applied to remove_blank_columns, select_blank_columns, filter_blank_rows for #7931 . Changed when_any to when for readability.
Previously, constant columns were given generated names with UUIDs in them, which are long and provide no information. Instead, we now use the constant value itself to form the name.
Since these new generated names are less unique, we must explicitly make them unique, in cases where the caller did not explicilty set a name.
- Closes#7981
- Adds a `RUNTIME_ERROR` operation into the DB dialect, that may be used to 'crash' a query if a condition is met - used to validate if `lookup_and_replace` invariants are still satisfied when the query is materialized.
- Removes old `Table_Helpers.is_table` and `same_backend` checks, in favour of the new way of checking this that relies on `Table.from` conversions, and is much simpler to use and also more robust.
- Fixes#8049
- Adds tests for handling of Date_Time upload/download in Postgres.
- Adds tests for edge cases of handling of Decimal and Binary types in Postgres.
- Validate spans during existing lexer and parser unit tests, and in `enso_parser_debug`.
- Fix lost span info causing failures of updated tests.
# Important Notes
- [x] Output of `parse_all_enso_files.sh` is unchanged since before #7881 (modulo libs changes since then).
- When the parser encounters an input with the first line indented, it now creates a sub-block for lines at than indent level, and emits a syntax error (every indented block must have a parent).
- When the parser encounters a number with a base but no digits (e.g. `0x`), it now emits a `Number` with `None` in the digits field rather than a 0-length digits token.
- Fixes issue with `to_display_text` on `Date_Time`. Now a reversible format.
- Adjusted the auto width code on table so it works.
- Manually sized columns should remain fixed size.
- Enable Excel range style selection in table.
- Removed page support as not implemented yet.
# Important Notes
Will need to apply to Vue based viz at some point as well.
- Follow-up of #8055
- Adds a benchmark comparing performance of Enso Map and Java HashMap in two scenarios - _only incremental_ updates (like `Vector.distinct`) and _replacing_ updates (like keeping a counter for each key). These benchmarks can be used as a metric for #8090
- Adds the ability to use numbers, date/time and Boolean values as constants in `set`.
- `Table.set` can take a `Column_Operation`, allowing for deriving of a new column based on other columns.
- Added `Column_Ref` type to refer to a column in `filter`.
- Added some ALIASes.
- Added `sheet` to `Excel_Workbook` to give familiar API to read sheet.
- Added conversion from range to vector allowing easy use with Zip.
- Add `Map.from_keys_and_values.
- Fixes a bug where creating a temporary table could accidentally issue a `DROP` statement of the table name that the user provided, risking destruction of user data.
- Fortunately, the bad scenario was almost impossible, because the `DROP` statement was only issued _if_ we previously checked that the mentioned table _does not exist_ - dropping a nonexistent table does not do any harm.
- It could have been dangerous in a very unlikely scenario that a table was created just between the _existence check_ and the _drop_.
- After the fix the existence check and any modifications are done within a transaction to avoid interference from concurrent modifications, and the DROP is correctly applied to a temporary Enso table instead of the original one.
- Replaced a temporary log with proper simple logging of SQL statements into a file, if an Environment variable is set.
- Used that feature to test that no unexpected statements occur.
- Closes#7749 implementing the in-memory logic.
- Additional complications have surfaced regarding the Database logic, so it has been split off into a separate ticket: #7981
We currently scan the entire child list for each call to children, num_children, and get_child_element.
Instead, we should cache the filtered child list on first access.
This PR includes
* Reading XML from a file, stream, or string
* Reading XML via Data.fetch
* Accessing the root element, element children, and attributes
* Accessing tag text contents
* Get tags by name
* Inner / Outer XML string
- Fixes#7352 by remembering original value types in type inference mode to be able to reconstruct them for Mixed.
- Added more benchmarks for comparing performance of constructing columns.
- Fixes missing implementations that caused `Table.union` crashing on some type pairs.
- Ensures that `Loss_Of_Integer_Precision` warning is not swallowed when numeric columns are unioned to create a `Float` column.
- Adds test for all of the above cases.
- Allow to output benchmark results to a CSV by setting an environment variable - useful for quickly comparing benchmarks, e.g. in Enso.
- Shuffle a few `from`s into correct places:
- `Day_Of_Week.from` removing `Day_Of_Week_From` module.
- Adding short cut for `http` and `https` in `Data.read` so it calls onto `Data.fetch` giving a single entry point.
- Moved `URI` extensions from `Standard.Base.Data` module into `Standard.Base.Network.Extensions`.
- Added `post` extension for `URI`.
- Added `contains_key` to `JS_Object`.
- Restored `into` in `JS_Object`:
- Follows old logic populating a constructor.
- Will use conversion from `JS_Object` if present.
- Added automatic deserialization of `Date`, `Time_Of_Day` and `Date_Time` from JSON.
- Uses conversion from `JS_Object`.
- Added conversion from `Text` to a `HTTP_Method` and type checking where `HTTP_Method` used in public APIs.
- Added support for `Date`, `Time_Of_Day` and `Date_Time` in `Table.from_objects`.
- Added `expand_column` to `Table` to expand `JS_Object` to values.
- Add type checking for `Table` in `right` arguments (allowing `Column`s to be used).
- Use type checking in `Table.set` to allow for conversion to a `Column`.
- Remove some unused imports.
- Fix for bug in S3 edge case.
* Add Text.substring function and get_position helper function
For #7876 adds a Text.substring function which supports negative indexes and returns a part of a string from 0-based index 'start' and continuing for 'length'
* added substring and simplified get function
For #7876 adds a Text.substring function which supports negative indexes and returns a part of a string from 0-based index 'start' and continuing for 'length'.
Also simplified get function as it looped unnecessarily.
* Update distribution/lib/Standard/Base/0.0.0-dev/src/Data/Text/Extensions.enso
punctuation corrections
Co-authored-by: GregoryTravis <greg.m.travis@gmail.com>
* Update Text_Spec.enso
Added test for start index larger than string
* Update distribution/lib/Standard/Base/0.0.0-dev/src/Data/Text/Extensions.enso
updated Arguments: section to use consistent style
Co-authored-by: Radosław Waśko <radoslaw.wasko@enso.org>
* Update distribution/lib/Standard/Base/0.0.0-dev/src/Data/Text/Extensions.enso
updated Index_Out_Of_Bounds error to reference cached length
Co-authored-by: Radosław Waśko <radoslaw.wasko@enso.org>
* Removed Slice label and added changelog entry
* Re-added slice tag to substring
Per conversation with James, added slice back to substring
* Update CHANGELOG.md add link
---------
Co-authored-by: GregoryTravis <greg.m.travis@gmail.com>
Co-authored-by: Radosław Waśko <radoslaw.wasko@enso.org>
* Always log verbose to a file
The change adds an option by default to always log to a file with
verbose log level.
The implementation is a bit tricky because in the most common use-case
we have to always log in verbose mode to a socket and only later apply
the desired log levels. Previously socket appender would respect the
desired log level already before forwarding the log.
If by default we log to a file, verbose mode is simply ignored and does
not override user settings.
To test run `project-manager` with `ENSO_LOGSERVER_APPENDER=console` env
variable. That will output to the console with the default `INFO` level
and `TRACE` log level for the file.
* add docs
* changelog
* Address some PR requests
1. Log INFO level to CONSOLE by default
2. Change runner's default log level from ERROR to WARN
Took a while to figure out why the correct log level wasn't being passed
to the language server, therefore ignoring the (desired) verbose logs
from the log file.
* linter
* 3rd party uses log4j for logging
Getting rid of the warning by adding a log4j over slf4j bridge:
```
ERROR StatusLogger Log4j2 could not find a logging implementation. Please add log4j-core to the classpath. Using SimpleLogger to log to the console...
```
* legal review update
* Make sure tests use test resources
Having `application.conf` in `src/main/resources` and `test/resources`
does not guarantee that in Tests we will pick up the latter. Instead, by
default it seems to do some kind of merge of different configurations,
which is far from desired.
* Ensure native launcher test log to console only
Logging to console and (temporary) files is problematic for Windows.
The CI also revealed a problem with the native configuration because it
was not possible to modify the launcher via env variables as everything
was initialized during build time.
* Adapt to method changes
* Potentially deal with Windows failures
- Closes#7461 by introducing a `Date_Time_Formatter` type and making parsing date time formats more robust and safer.
- The default ('simple') set of patterns is slightly simplified and made case insensitive (except for `M/m` and `H/h`) to avoid the `YYYY` vs `yyyy` issues and make it less error prone.
- The `YYYY` now has the same meaning as `yyyy` in simple mode. The old meaning (week-based year) is moved to a _separate mode_, triggered by `Date_Time_Formatter.from_iso_week_date_pattern`.
- Full Java syntax, as well as custom-built Java `DateTimeFormatter` can also be used by `Date_Time_Formatter.from_java`.
- Text-based constants (e.g. `ISO_ZONED_DATE_TIME`) have now become methods on `Date_Time_Formatter`, e.g. `Date_Time_Formatter.iso_zoned_date_time`).
- Added a `FileSystemSPI` allowing protocol resolution to a target type.
- Separated `Input_Stream` and `Output_Stream` from `File` to allow use in other spaces.
- `File_Format` types `read_web` changed to be `read_stream` working with `InputStream`.
- Added directory listing to `Auto_Detect` allowing for `Data.read` to list a folder.
- Adjusted HTTP to return an `InputStream` not a `byte[]`:
- `Response_Body` adjusted to wrap an `InputStream`.
- Added ability to materialize to either and in-memory vector (<4KB) or a temporary file.
- `Data.fetch` will materialize if not a recognized mime-type.
- Added `HTTP_Error` to handle IO exceptions from the stream.
- `Excel_Format` now supports mime-type and reading a stream.
- `Excel_Workbook` can now get a `Excel_Section` using `read_section`.
- Added S3 APIs:
- `parse_uri`: splits an S3 URI into bucket and key.
- `list_objects`: list the items in a S3 bucket with specified prefix.
- `read_bucket`: list prefixes and keys with a delimiter in a S3 bucket with specified prefix.
- `head`: either head_bucket (tests existance) or head_object API (reads object meta data).
- `get_object`: gets an object from S3 returning as a `Response_Body`.
- Added `S3_File` type acting like a `File`:
- No support for writing in this PR.
- **ToDo:** recursive listing, glob filtering, exists, size.
- Fixed a few invalid type signature line.
- Moved `create` methods for `Postgres_Connection` and `SQLite_Connection` into type instead of module.
- Renamed `Column_Fetcher.Builder` to `Column_Fetcher_Builder`.
- Fixed bug with `select_into` in Dry Run mode creating permanent tables.
**ToDo:** Unit tests.
- Fixes#7354
- And also closes#7712
- Refactors how we handle numeric ops - ensuring that the 'kernels' are placed all in one place and selected based on storage types.
- Closes#7238
- Aligns `update_database_table` to a more consistent and clearer API - `update_rows`.
- Adds a `truncate_table` helper function, to pair up with `drop_table`. Both are `PRIVATE` for now.
- Adds tests for NULLs in keys in `update_rows` and `delete_rows`.
- The behaviour is sometimes unexpected, so instead these fail with `Null_Values_In_Key_Columns`.
- Adds a workaround for https://github.com/oracle/graal/issues/7359
- Adds a workaround for a related bug where a stack frame has no name (its `rootNode.getName() == null`).
- I could not track down this bug to provide a neat repro.
- Closes#7633
- Moves `Round_Spec.enso` from published `Standard.Test` into our `test/Tests` project; the `Table_Tests` that depend on it, simply `import enso_dev.Tests`.
- Changes the layout of the local libraries directory:
- It used to be `root/<namespace>/<name>`.
- Now it is `root/<dir>` - the namespace and name are now read from `package.yaml` instead.
- Adds the parent directory of the current project to the default `ENSO_LIBRARY_PATH`.
- It is treated as a secondary path, so the default `ENSO_HOME/lib` still takes precedence.
- This allows projects to reference and load 'sibling' projects easily - the only requirement is for the project to enable `prefer-local-libraries: true` or add the other local project to its edition. The edition resolution logic is **not changed**.
- Closes#6111
- Aligns semantics of handling Mixed columns.
- Now, if an operation like `iif` or `fill_nothing` is given a `Mixed` column, the result will also be `Mixed` regardless of the `inferred_precise_value_type`.
- Enables a few old tests that were pending but could be enabled since the types work is advanced enough.
- Update list of groups to agreed list.
- Lower case `ALIAS` names to be consistent with function names.
- Add `GROUP` to methods.
- All constructors and functions have doc comments.
- Correct a few typos (e.g. `PRVIATE`).
- Mark some more things as `PRIVATE`.
- Use `ToDo:` and `Note:` consistently.
- Order tags in doc comment.
# Important Notes
We don't have all the doc comments on types and will want to add them in future,
- Closes#5159
- Now data downloaded from the database can keep the type much closer to the original type (like string length limits or smaller integer types).
- Cast also exposes these types.
- The integers are still all stored as 64-bit Java `long`s, we just check their bounds. Changing underlying storage for memory efficiency may come in the future: #6109
- Fixes#7565
- Fixes#7529 by checking for arithmetic overflow in in-memory integer arithmetic operations that could overflow. Adds a documentation note saying that the behaviour for Database backends is unspecified and depends on particular database.
Closes#7353
I introduce a new type `WithAggregatedProblems`, because `WithProblems` was too simple - it only allowed to hold a `List<Problem>` but `AggregatedProblems` is more than that. Ideally we shouldn't multiply entities like this too much. We should probably unify all to use `WithAggregatedProblems` - but after starting this, I realised it will likely just take too much effort to do for this little PR. So instead, I created a follow-up task for this: #7514
- Fixes#7412
- Also adds tests and fixes some more edge cases:
- Ensures correct handling of existing Database tables whose column names may be invalid from Enso perspective, or clashing from Enso perspective (e.g. for most DBs `ś` and `s\u0301` are different names, but for Enso they are basically the same so this would cause issues - thus Enso now renames such columns when accessed (still using the correct column reference in the generated SQL under the hood).
- Closes#5951
- Ensures any SQL warnings reported by the database through the JDBC driver are processed and forwarded to the user.
- These warnings show issues like the implicit name truncation that this PR is also solving. It's good to make sure they are visible as they can help avoid and understand unexpected problems. They should not show up in most standard workflows.
- Adds simple history to our REPL.
- Fixes#7231
- Cleans up vectorized operations to distinguish unary and binary operations.
- Introduces MixedStorage which may pretend to be a more specialized storage on demand.
- Ensures that operations request a more specialized storage on right-hand side to ensure compatibility with reported inferred storage type.
- Ensures that a dataflow error returned by an Enso callback in Java is propagated as a polyglot exception and can be caught back in Enso
- Tests for comparison of Mixed storages with each other and other types
- Started using `Set` for `Filter_Condition.Is_In` for better performance.
- ~~Migrated `Column.map` and `Column.zip` to use the Java-to-Enso callbacks.~~
- This does not forward warnings. IMO we should not be losing them. We can switch and add a ticket to fix the warnings, but that would be a regression (current implementation handles them correctly). Instead, we should first gain some ability to work with warnings in polyglot. I created a ticket to get this figured out #7371
- ~~Trying to avoid conversions when calling Enso functions from Java.~~
- Needs extra care as dataflow errors may not be handled right then. So only works for simple functions that should not error.
- Not sure how much it really helps. [Benchmarks](https://github.com/enso-org/enso/pull/7270#issuecomment-1635618393) suggested it could improve the performance quite significantly, but the practical solution is not exactly the same as the one measured, so we may have to measure and tune it to get the best results.
- Created #7378 to track this.
Fixes#7336 in a quick way.
Next to the old way of defining groups, the library can just add `GROUP` tag to some entities, and it will be added to the group specified in tag's description.
The group name may be qualified (with project name, like `Standard.Base.Input/Output`) or just name - in the latter case, IDE will assume a group defined in the same library as the entity.
Also moved some entities from "export" list in package.yaml to GROUP tag to give an example. I didn't move all of those, as I assume the library team will reorganize those groups anyway.
### Important Notes
@jdunkerley @radeusgd @GregoryTravis When you will start specifying groups in tags, remember that:
* The groups still belongs to a concrete project; if some entity outside a project wants to be added to its group, the "qualified" name should be specified. See `Table.new` example in this PR.
* If the group name does not reflect any group in package.yaml **the tag is ignored**.
* A single entity may be only in a single group. If it's specified in both package.yaml and in tag, the tag takes precedence.
---------
Co-authored-by: Ilya Bogdanov <fumlead@gmail.com>
The added benchmark is a basis for a performance investigation.
We compare the performance of the same operation run in Java vs Enso to see what is the overhead and try to get the Enso operations closer to the pure-Java performance.
Designing new `Bench` API to _collect benchmarks_ first and only execute them then. This is a minimal change to allow implementation of #7323 - e.g. ability to invoke a _single benchmark_ via JMH harness.
# Important Notes
This is just the basic API skeleton. It can be enhanced, if the basic properties (allowing integration with JMH) are kept. It is not intent of this PR to make the API 100% perfect and usable. Neither it is goal of this PR to update existing benchmarks to use it (74ac8d7 changes only one of them to demonstrate _it all works_ somehow). It is however expected that once this PR is integrated, the newly written benchmarks (like the ones from #7270) are going to use (or even enhance) the new API.
This PR does three related things:
- Fails more gracefully when a non-string is passed to compile_regex
- Don't pass a non-string to compile_regex
- Allow a Regex param to parse_to_table
The Regex change introduced some issues.
Added a test for missed case in `rename_columns` where using vector of pairs.
Reverted parameter order change for `select_columns`.
This PR modifies the builtin method processor such that it forbids arrays of non-primitive and non-guest objects in builtin methods. And provides a proper implementation for the builtin methods in `EnsoFile`.
- Remove last `to_array` calls from `File.enso`
- Add dropdowns for `replace` functions.
- Retire `Column_Selector` type.
- Add `select_blank_columns` and `remove_blank_columns` functions to table types.
- Allow Regex to be used to pick columns.
In #7148 I improved the error message when a `Filter_Condition` constructor without arguments is provided to `Vector.filter` and its friends. This PR applies the same check to the `Table.filter`.
This is useful, because when we select a Filter_Condition from a widget, initially it does not have all its arguments applied. This used to lead to confusing errors being reported to the user, now, a much clearer error is shown:
![image](https://github.com/enso-org/enso/assets/1436948/19140a7b-d6fc-4292-81d3-dc6d61135cb9)
- Adds `Column.date_diff` for computing date/time difference as integer multiply of some unit.
- Adds `Column.date_add` for shifting date/time by a unit.
- Adds `Column.date_part` for extracting various parts of the date/time value as integer.
- Adds widgets for the 3 methods above whose content depends on the column value type.
- Adds shorthands: `Column.hour`, `Column.minute` and `Column.second` to extract these date parts.
- Extends `Time_Period` with support for milli-, micro- and nano- seconds; and adapts functions taking `Time_Period` to support these wherever possible.
- Removed Array methods: `new`, `copy` and `new_[1234]`.
- New builtins for `Vector.insert`, `Vector.remove` and `Vector.flatten`.
- Replaced `Vector_Builder` use of `Array.copy` to a `Vector.Builder` approach.
Mostly stuff to tidy up the static methods in the CB.
- Remove default pattern from `parse_to_table` (caused IDE to freeze).
- Rename any `_` arguments to what they are.
- Merge `Date.now` into `Date.today`
- Merge the Interval constructors into a single constructor.
- Hide various methods.
Fixes#6955 by:
- using `visualisationModule` to specify the module where the visualization is to be used
- referring to method in `Meta.get_annotation` with `.method_name` - e.g. unresolved symbol notation
- evaluating arguments to `Meta.get_annotation` in the context of the user module (which can access the extension functions)
- Removed `module` argument from `enso_project` (new `Project_Description.new` API).
- Removed the custom option from date and time parse/format dropdowns.
- The `format` dropdown uses the value to create the dropdown. (Screenshot below)
- Removed `StorageType` coalescing rules and replaced them with simpler logic in `ObjectStorage`.
- Update signature for `add_row_number` and add aliases.
- Add type detection for `Mixed` columns when calling column functions.
- Excel uses column name for missing headers.
- Add aliases for parse functions on text.
- Adjust `Date`, `Time_Of_Day` and `Date_Time` parse functions to not take `Nothing` anymore and provide dropdowns.
- Removed built-in parses.
- All support Locale.
- Add support for missing day or year for parsing a Date.
- All will trim values automatically.
- Added ability to list AWS profiles.
- Added ability to list S3 buckets.
- Workaround for Table.aggregate so default item added works.
This adds the spec for all update actions, but implements the common input validation framework and `Insert`. Tests for remaining actions are marked as pending - these will be implemented in a subsequent PR.
Related to #6912
It essentially solves it by removing any builtins that would take an EnsoDate/EnsoTimeOfDay/EnsoTimeZone and replacing them with Java utils that do the same operation.
This is not a proper solution - the builtin conversion is still invalid for the date/time types - but at this moment we may just no longer use the invalid conversion so it is much less of an issue. We still need to be aware of this if we want to introduce builtins taking date/time in the future.
First part for #6498 - refactoring of the upload infrastructure, in preparation for `update_database_table`.
Implemented a `Set` data structure which was long needed.
The APIs are added and an initial implementation is created, but it is not complete - but it has grown significantly already so the remaining implementation will be done as a separate PR.
Adds some basic ability for a function to ensure that it is only executed from within a transaction.
- Add the missing dropdowns for `Locale` and `Encoding`.
- Correct a few mismatched type signatures.
- Adjust `order_by` calls with a single `Sort_Column` to call in a Vector.
- Adjust parameter names for `transpose`.
- Fix for the table viz: escape HTML and `suppressFieldDotNotation`.
- Use `Filter_Condition.Equal True` for the default filter.
- Adjust `Data.fetch` to return the response on success when parse fails. Rename `parse` to `try_auto_parse`.
- Add various aliases for methods.
- Add tests for `Table.set` when using a `Vector`, `Range` or `Date_Range`.
- Add check for mismatched length on `Table.set`.
![image](https://github.com/enso-org/enso/assets/4699705/23ea0ba3-2b05-4af8-afd9-f35b55446c24)
![image](https://github.com/enso-org/enso/assets/4699705/8b0253e6-e9e8-490a-9607-0da51ab5a215)
Previously, a `RuntimeException` would be thrown when an attempt would be made to curry a conversion function. That is problematic for IDE where `executionFailed` means we can't enter functions due to lack of method pointers info.
Closes#6897.
![Screenshot from 2023-06-02 20-31-03](https://github.com/enso-org/enso/assets/292128/a6c77544-2c47-425c-8ce0-982d837dda5b)
# Important Notes
A more generic solution that allows to recover from execution failures will need a follow up.
Closes#5227
# Important Notes
- This lays first steps towards #6292 - we get pure Enso variants of MultiValueKey.
- Another part refactors `LongStorage` into `AbstractLongStorage` allowing it to provide alternative implementations of the underlying storage, in our case `LongRangeStorage` generating the values ad-hoc and `LongConstantStorage` - currently unused but in the future it can be adapted to support constant columns (once we implement similar facilities for other types).
- Adds execution control to `Table.write`.
- Refactored the `Text.write` to make part reusable.
- Tidied up some legacy mess in tests.
- Add easier flow to go from `Text` to an `URI` to fetching data.
- Add decode functions to `Response` and `Response_Body`.
- Fix issue with 0 length regex matches (using same as Python and .Net approach).
- Add various ALIAS entries to make function discovery easier.
- Sort a lot of drop down and vector editors out (including switch to fully qualified names).
Throwing `TailCallException` meant that exceptions that were extracted from the expression before the call was made could not be appended. This change catches the `TailCallException`, adds warnings to it and propagates it further, thus ensuring that we don't loose the information.
Closes#6765.
# Important Notes
Removed workarounds introduced in stdlib.
- Fix couple of bugs in Table viz: rounding of bottom div, missing character, not including row count as an option.
- Add better JSON format for `Row`, add support for visualization in the Table viz both for `Vector Row` or `Row`.
- Fix some type signature errors.
- Move `Column_Format` to `Standard.Table.Internal`.
- Move `format_widget` to `File_Format.default_widget` and sort the signature of `Widget` methods.
- Added utility to make `Single_Choice` widgets.
- Added dropdown for delimiter on split methods.
- Removed `default_widget` from `Problem_Behavior` and `Filter_Condition`.
- Altered signature and widgets for table functions.
- Added `to_column` extension to allow easy conversion of Range and Vector to Column.
- Added `compute`, `compute_bulk`, `running` to Column to allow statistic computation.
- Added drop down for `Table.write` format parameter.
- Added drop down for `Table.rename_columns`.
- Added support for Vector of pairs for renaming columns.
- Added check when making a map from Vector if not 2 items.
![image](https://github.com/enso-org/enso/assets/4699705/beed257c-efe3-44a3-9e3a-041354701735)
Implements the simplest `parse` scenarios for the Database backend.
Before #6711 these could have been done by `cast`, but in #6711 the APIs were unified to only allow casting to the same set of types in both in-memory and Database. Converting Text to other types is supposed to be done by `parse` and not `cast`, so the ability to use `cast` for rudimentary parsing is removed in the Database backend to make it consistent with in-memory. But now it is lacking any, even simplest, Text->Int/Text->Date support. To alleviate that, the simple scenarios for `parse` are implemented (no support for format customization yet, will boil down to a cast under the hood).
close#6611
Changelog:
- update: run compiler passes on the `ascribedType` field of the constructor arguments
- update: suggestion builder uses the type information attached to `ascribedType`
- feat: resolve qualified names in type signatures
- Moved the row count out of the grid.
- Shown for all cases.
- Added some of the JS wiring to do pages but currently hidden.
- Dropdown allowing the user to control the number of rows rendered.
- `Nothing` rendered as a italic light Nothing not just empty now.
- Function rendered as `[Function]`.
- Rounded corners on top to make it align more with look of panel.
- Dropdown for `Text.split` delimiter feed.
Related to #6410
# Important Notes
- Updated some `Meta` methods (needed for error handling):
- `Meta.Type` now has `name` and `qualified_name`.
- `Meta.Constructor` has `declaring_type` allowing to get the type that this constructor is associated with.
Fixes#6609 by
- e380e647af - running whole `Vector_Spec` on `java.util.ArrayList`
- 9b1229fe20 - introducing a node to handle interop values
# Important Notes
Contains additional DSL processor fix:
- 415623dcb9 - to not crash the compiler, but to properly report compiler error
Artifically limiting the number of reported warnings to 100. Also added benchmarks with random Ints to investigate perf issues when dealing with warnings (future task).
Ideally we would have a custom set-like collection that allows us internally to specify a maximal number of elements. But `EnsoHashMap` (and potentially `EnsoSet`) are still WIP when it comes to being PE-friendly.
The change also allows for checking if the limit for the number of reported warnings has been reached. It will visualize by adding an additional "Warnings limit reached." to the visualization.
The limit is configurable via `--warnings-limit` parameter to `run`.
Closes#6283.
Closes#6437
Related to #6410
- Add example duplicate row to `Non_Unique_Primary_Key`.
- Ensure `File.read` fails if the file does not exist, always.
- Ensure SQLite fails if file is empty or nonexistent or malformed.
- Split file format detection into read and write modes, so that the read mode can depend on actual file _contents_.
When generating new column names in tokenize/split_to_columns, if there's only one new column, use the deleted input column name and don't add disambiguating integers after it.
Add format to the in-memory Column
# Important Notes
Also updates .format in date types.
Some rearrangement of date formatting builtins / Java libraries.
- Add dropdown to tokenize and split `column`.
- Remove the custom `Join_Kind` dropdown.
- Adjust split and tokenize names to start numbering from 1, not 0.
- Add JS_Object serialization for Period.
- Add `days_until` and `until` to `Date`.
- Add `Date_Period.Day` and create `next` and `previous` on `Date`.
- Use simple names with `File_Format` dropdown.
- Avoid using `Main.enso` based imports in `Standard.Base.Data.Map` and `Standard.Base.Data.Text.Helpers`.
- Remove an incorrect import from `Standard.Database.Data.Table`.
From #6587:
A few small changes, lots of lines because this affected lots of tests:
- `Table.join` now defaults to `Join_Kind.Left_Outer`, to avoid losing rows in the left table unexpectedly. If the user really wants to have an Inner join, they can switch to it.
- `Table.join` now defaults to joining columns by name not by index - it looks in the right table for a column with the same name as the first column in left table.
- Missing Input Column errors now specify which table they refer to in the join.
- The unique name suffix in column renaming / default column names when loading from file is now a space instead of underscore.
- Add `with_disabled` shortcut for `Context`.
- Protect `Image.write` behind `Context.Output`.
- Correct text on the `Forbidden_Operation` error message.
- Remove context overrides from tests.
- Add `File` operations tests with `Context.Output` disabled.
- Add tests for `Text.write` operations with `Context.Output` disabled.
- Use a better method to make `File_Format` dropdown widget.
- Fix bug in `Invalid_Format.to_display_text`.
Remove the magical code generation of `enso_project` method from codegen phase and reimplement it as a proper builtin method.
The old behavior of `enso_project` was special, and violated the language semantics (regarding the `self` argument):
- It was implicitly declared in every module, so it could be called without a self argument.
- It can be called with explicit module as self argument, e.g. `Base.enso_project`, or `Visualizations.enso_project`.
Let's avoid implicit methods on modules and let's be explicit. Let's reimplement the `enso_project` as a builtin method. To comply with the language semantics, we will have to change the signature a bit:
- `enso_project` is a static method in the `Standard.Base.Meta.Enso_Project` module.
- It takes an optional `project` argument (instead of taking it as an explicit self argument).
Having the `enso_project` defined as a (shadowed) builtin method, we will automatically have suggestions created for it.
# Important Notes
- Truffle nodes are no longer generated in codegen phase for the `enso_project` method. It is a standard builtin now.
- The minimal import to use `enso_project` is now `from Standard.Base.Meta.Enso_Project import enso_project`.
- Tested implicitly by `org.enso.compiler.ExecCompilerTest#testInvalidEnsoProjectRef`.
- Adjusted `Context.is_enabled` to support default argument (moved built in so can have defaults).
- Made `environment` case-insensitive.
- Bug fix for play button.
- Short hand to execute within an enabled context.
- Forbid file writing if the Output context is disabled with a `Forbidden_Operation` error.
- Add temporary file support via `File.create_temporary_file` which is deleted on exit of JVM.
- Execution Context first pass in `Text.write`.
- Added dry run warning.
- Writes to a temporary file if disabled.
- Created a `DryRunFileManager` which will create and manage the temporary files.
- Added `format` dropdown to `File.read` and `Data.read`.
- Renamed `JSON_File` to `JSON_Format` to be consistent.
(still to unit test).
This change modifies method dispatch for methods that override Any's definitions. When an overrided method is invoked statically we call Any's method to stay consistent.
This change primarily addresses the plethora of problems related to `to_text` invocations. It does not attempt to completely modify method dispatch logic.
Closes#6300.
* Rename makeUnique overloads to avoid issue when Nothing is passed.
Suspend warnings when building the output table to avoid mass warning duplication.
* Add test for mixed invalid names.
Adjust so a single warning attached.
* PR comments.
related #6323
Fixes the following scenario:
- preprocessor function catches `ThreadInterruptedException` and returns it as a JSON
```json
{"error":"org.enso.interpreter.runtime.control.ThreadInterruptedException"}
```
- engine thinks that the visualization was computed successfully and just returns it to the user (without doing any fallback logic like retrying the computation)
- IDE displays blank visualization
- Missing tests from number parsing.
- Fix type signature on some warning methods.
- Fix warnings on `Standard.Database.Data.Table.parse_values`.
- Added test for `Nothing` and empty string on `use_first_row_as_names`.
- New API for `Number.format` taking a simple format string and `Locale`.
- Add ellipsis to truncated `Text.to_display_text`.
- Adjusted built-in `to_display_text` for numbers to not include type (but also to display BigInteger as value).
- Remove `Noise.Generator` interface type.
- Json: Added `to_display_text` to `JS_Object`.
- Time: Added `to_display_text` for `Date`, `Time_Of_Day`, `Date_Time`, `Duration` and `Period`.
- Text: Added `to_display_text` to `Locale`, `Case_Sensitivity`, `Encoding`, `Text_Sub_Range`, `Span`, `Utf_16_Span`.
- System: Added `to_display_text` to `File`, `File_Permissions`, `Process_Result` and `Exit_Code`.
- Network: Added `to_display_text` to `URI`, `HTTP_Status_Code` and `Header`.
- Added `to_display_text` to `Maybe`, `Regression`, `Pair`, `Range`, `Filter_Condition`.
- Added support for `to_js_object` and `to_display_text` to `Random_Number_Generator`.
- Verified all error types have `to_display_text`.
- Removed `BigInt`, `Date`, `Date_Time` and `Time_Of_Day` JS based rendering as using `to_display_text` now.
- Added support for rendering nested structures in the table viz.
Follow up of #6298 as it grew too much. Adds the needed typechecks to aggregate operations. Ensures that the DB operations report `Floating_Point_Equality` warning consistently with in-memory.
- Add `replace` with same syntax as on `Text` to an in-memory `Column`.
- Add `trim` with same syntax as on `Text` to an in-memory `Column`.
- Add `trim` to in-database `Column`.
- Added `is_supported` to dialects and exposed the dialect consistently on the `Connection`.
- Add `write_table` support to `JSON_File` allowing `Table.write` to write JSON.
- Updated the parsing for integers and decimals:
- Support for currency symbols.
- Support for brackets for negative numbers.
- Automatic detection of decimal points and thousand separators.
- Tighter rules for scientific and thousand separated numbers.
- Remove `replace_text` from `Table`.
- Remove `write_json` from `Table`.
`Number.nan` can be used as a key in `Map`. This PR basically implements the support for [JavaScript's Same Value Zero Equality](https://developer.mozilla.org/en-US/docs/Web/JavaScript/Equality_comparisons_and_sameness#same-value-zero_equality) so that `Number.nan` can be used as a key in `Map`.
# Important Notes
- For NaN, it holds that `Meta.is_same_object Number.nan Number.nan`, and `Number.nan != Number.nan` - inspired by JS spec.
- `Meta.is_same_object x y` implies `Any.== x y`, except for `Number.nan`.
https://github.com/orgs/enso-org/discussions/6344 requested to change the order of arguments when controlling context permissions.
# Important Notes
The change brings it closer to the design doc but IMHO also a bit cumbersome to use (see changed tests) - applications involving default arguments don't play well when the last argument is not the default 🤷 .`
* Update type ascriptions in some operators in Any
* Add @GenerateUncached to AnyToTextNode.
Will be used in another node with @GenerateUncached.
* Add tests for "sort handles incomparable types"
* Vector.sort handles incomparable types
* Implement sort handling for different comparators
* Comparison operators in Any do not throw Type_Error
* Fix some issues in Ordering_Spec
* Remove the remaining comparison operator overrides for numbers.
* Consolidate all sorting functionality into a single builtin node.
* Fix warnings attachment in sort
* PrimitiveValuesComparator handles other types than primitives
* Fix byFunc calling
* on function can be called from the builtin
* Fix build of native image
* Update changelog
* Add VectorSortTest
* Builtin method should not throw DataflowError.
If yes, the message is discarded (a bug?)
* TypeOfNode may not return only Type
* UnresolvedSymbol is not supported as `on` argument to Vector.sort_builtin
* Fix docs
* Fix bigint spec in LessThanNode
* Small fixes
* Small fixes
* Nothings and Nans are sorted at the end of default comparator group.
But not at the whole end of the resulting vector.
* Fix checking of `by` parameter - now accepts functions with default arguments.
* Fix changelog formatting
* Fix imports in DebuggingEnsoTest
* Remove Array.sort_builtin
* Add comparison operators to micro-distribution
* Remove Array.sort_builtin
* Replace Incomparable_Values by Type_Error in some tests
* Add on_incomparable argument to Vector.sort_builtin
* Fix after merge - Array.sort delegates to Vector.sort
* Add more tests for problem_behavior on Vector.sort
* SortVectorNode throws only Incomparable_Values.
* Delete Collections helper class
* Add test for expected failure for custom incomparable values
* Cosmetics.
* Fix test expecting different comparators warning
* isNothing is checked via interop
* Remove TruffleLogger from SortVectorNode
* Small review refactorings
* Revert "Remove the remaining comparison operator overrides for numbers."
This reverts commit 0df66b1080.
* Improve bench_download.py tool's `--compare` functionality.
- Output table is sorted by benchmark labels.
- Do not fail when there are different benchmark labels in both runs.
* Wrap potential interop values with `HostValueToEnsoNode`
* Use alter function in Vector_Spec
* Update docs
* Invalid comparison throws Incomparable_Values rather than Type_Error
* Number comparison builtin methods return Nothing in case of incomparables
The primary motivation for this change was
https://github.com/enso-org/enso/issues/6248, which requested the possibility of defining `to_display_text` methods of common errors via regular method definitions. Until now one could only define them via builtins.
To be able to support that, polyglot invocation had to report `to_display_text` in the list of (invokable) members, which it didn't. Until now, it only considered fields of constructors and builtin methods. That is now fixed as indicated by the change in `Atom`.
Closes#6248.
# Important Notes
Once most of builtins have been translated to regular Enso code, it became apparent how the usage of `.` at the end of the message is not consistent and inflexible. The pure message should never follow with a dot or it makes it impossible to pretty print consistently for the purpose of error reporting. Otherwise we regularly end up with errors ending with `..` or worse. So I went medieval on the reasons for failures and removed all the dots.
The overall result is mostly the same except now we are much more consistent.
Finally, there was a bit of a good reason for using builtins as it simplified our testing.
Take for example `No_Such_Method.Error`. If we do not import `Errors.Common` module we only rely on builtin error types. The type obviously has the constructor but it **does not have** `to_display_text` in scope; the latter is no longer a builtin method but a regular method. This is not really a problem for users who will always import stdlib but our tests often don't. Hence the number of changes and sometimes lack of human-readable errors there.
Closes#5254
In #6189 the SQLite version was bumped to a newer release which has builtin support for Full and Right joins, so no workaround is no longer needed.
As per design, IOContexts controlled via type signatures are going away. They are replaced by explicit `Context.if_enabled` runtime checks that will be added to particular method implementations.
`production`/`development` `IOPermissions` are replaced with `live` and `design` execution enviornment. Currently, the `live` env has a hardcoded list of allowed contexts i.e. `Input` and `Output`.
# Important Notes
As per design PR-55. Closes#6129. Closes#6131.
close#6080
Changelog
- add: implement `SuggestionsRepo.insertAll` as a batch SQL insert
- update: `search/getSuggestionsDatabase` returns empty suggestions. Currently, the method is only used at startup and returns the empty response anyway because the libs are not loaded at that point.
- update: serialize only global (defined in the module scope) suggestions during the distribution building. There's no sense in storing the local library suggestions.
- update: sqlite dependency
- remove: unused methods from `SuggestionsRepo`
- remove: Arguments table
# Important Notes
Speeds up libraries loading by ~1 second.
![2023-04-03-173423_2086x324_scrot](https://user-images.githubusercontent.com/357683/229597470-19dcc010-2a34-43e1-87be-60af99afd275.png)
![2023-04-03-173514_2083x321_scrot](https://user-images.githubusercontent.com/357683/229597476-bf5b3c33-6321-4ac9-a0ca-2fb57d257857.png)
Implements #6134.
# Important Notes
One can define lazy atom fields as:
```haskell
type Lazy
Value ~x ~y
```
the evaluation of the `x` and `y` fields is then delayed until they are needed. The evaluation happens once. Then the computed value is kept in the atom for further use.
Fixes#5898 by removing `Catch.panic` and speeding the `sieve.enso` benchmark from 1058 ms to 514 ms. Should there be no dedicated conversion, let's use one defined on `Any` type - e.g. defining a conversion `from(Any)` makes such a conversion is always available.
- `Process.run` now returns a `Process_Result` allowing the easy capture of stdout and stderr.
- Joining a column with a column name does not warn if adding just the prefix.
- Stop the table viz from changing case and adding spaces to the headers.
This is the first part of the #5158 umbrella task. It closes#5158, follow-up tasks are listed as a comment in the issue.
- Updates all prototype methods dealing with `Value_Type` with a proper implementation.
- Adds a more precise mapping from in-memory storage to `Value_Type`.
- Adds a dialect-dependent mapping between `SQL_Type` and `Value_Type`.
- Removes obsolete methods and constants on `SQL_Type` that were not portable.
- Ensures that in the Database backend, operation results are computed based on what the Database is meaning to return (by asking the Database about expected types of each operation).
- But also ensures that the result types are sane.
- While SQLite does not officially support a BOOLEAN affinity, we add a set of type overrides to our operations to ensure that Boolean operations will return Boolean values and will not be changed to integers as SQLite would suggest.
- Some methods in SQLite fallback to a NUMERIC affinity unnecessarily, so stuff like `max(text, text)` will keep the `text` type instead of falling back to numeric as SQLite would suggest.
- Adds ability to use custom fetch / builder logic for various types, so that we can support vendor specific types (for example, Postgres dates).
# Important Notes
- There are some TODOs left in the code. I'm still aligning follow-up tasks - once done I will try to add references to relevant tasks in them.
Enables `distinct`, `aggregate` and `cross_tab` to use the Enso hashing and equality operations.
Also, I rewired the way the ObjectComparators are obtained in polyglot code to be more consistent.
Add Comparator for `Day_Of_Week`, `Header`, `SQL_Type`, `Image` and `Matrix`.
Also, removed the custom `==` from these types as needed. (Closes#5626)
- Fixes InvokeCallableNode to support warnings.
- Strips warnings from annotations in `get_widget_json`.
- Remove `get_full_annotations_json`.
- Fix warnings on Dialect.
Exporting types named the same as the module where they are defined in `Main` modules of library components may lead to accidental name conflicts. This became apparent when trying to access `Problem_Behavior` module via a fully qualified name and the compiler rejected it. This is due to the fact that `Main` module exported `Error` type defined in `Standard.Base.Error` module, thus making it impossible to access any other submodules of `Standard.Base.Error` via a fully qualified name.
This change adds a warning to FullyQualifiedNames pass that detects any such future problems.
While only `Error` module was affected, it was widely used in the stdlib, hence the number of changes.
Closes#5902.
# Important Notes
I left out the potential conflict in micro-distribution, thus ensuring we actually detect and report the warning.
Fixing #5768 and #5765 and co. Introducing `Meta.Type` and giving it the desired methods.
# Important Notes
`Type` is no longer a `Meta.Atom`, but it has a dedicated `Meta.Type` representation.
Implement new Enso documentation parser; remove old Scala Enso parser.
Performance: Total time parsing documentation is now ~2ms.
# Important Notes
- Doc parsing is now done only in the frontend.
- Some engine tests had never been switched to the new parser. We should investigate tests that don't pass after the switch: #5894.
- The option to run the old searcher has been removed, as it is obsolete and was already broken before this (see #5909).
- Some interfaces used only by the old searcher have been removed.
Closes#5151 and adds some additional tests for `cross_tab` that verify duplicated and invalid names.
I decided that for empty or `Nothing` names, instead of replacing them with `Column` and implicitly losing connection with the value that was in the column, we should just error on such values.
To make handling of these easier, `fill_empty` was added allowing to easily replace the empty values with something else.
Also, `{is,fill}_missing` was renamed to `{is,fill}_nothing` to align with `Filter_Condition.Is_Nothing`.
Merge _ordered_ and _unordered_ comparators into a single one.
# Important Notes
Comparator is now required to have only `compare` method:
```
type Comparator
comapre : T -> T -> (Ordering|Nothing)
hash : T -> Integer
```
Follow-up to #5113 - I add some more edge case tests as we discussed with @jdunkerley
When debugging some quoting issues, I also realised the current `Mismatched_Quote` error provided not enough information. So I amended it to at least include some context indicating which was the 'offending' cell.
Removing special handling of `AtomConstructor` in `Meta.is_a` check.
# Important Notes
A lot of tests are about to fail. Many of them indirectly call `Meta.is_a` with a constructor rather than type.
- Fix issue with Geo Map viz.
- Handle invalid format strings better in `Data_Formatter`.
- New constants for the ISO format strings (and a special ENSO_ZONED_DATE_TIME)
- Consistent Date Time format for parsing in all places.
- Avoid throwing exception in datetime parsing.
- Support for milliseconds (well nanoseconds) in Date_Time and Time_Of_Day.
- `Column.map` stays within Enso.
- Allow `Aggregate_Column.Group_By` in `cross_tab` group_by parameter.
Closes#5113
Fixes a bug where read-only files would be overwritten if File.write was used in backup mode, and added tests to avoid such regression. To implement it, introduced a `is_writable` property on `File`.
- Adjust Excel Workbook write behaviour.
- Support Nothing / Null constants.
- Deduce the type of arithmetic operations and `iif`.
- Allow Date_Time constants, treating as local timezone.
- Removed the `to_column_name` and `ensure_sane_name` code.
- Handle `WithWarnings` in `IndirectInvokeCallableNode`.
- Handle no RootNode in `ErrorResolver`.
- Allow table vizualisation to cope if no `data` passed.
- Add `Warning.has_warnings` to check if warnings present.
- Adjust `set_value` for `JS_Object` so creates a new object each time.
- Updates the `rename_columns` API.
- Add `first_row`, `second_row` and `last_row` to the Table types.
- New option for reading only last row of ResultSet.
Rename is_match + match to match + find (respectively), and remove all non-regexp functionality.
Regexp flags and Match_Mode are also no longer supported by these methods.
Critical performance improvements after #4067
# Important Notes
- Replace if-then-else expressions in `Any.==` with case expressions.
- Fix caching in `EqualsNode`.
- This includes fixing specializations, along with fallback guard.
Remove regex support from .locate and .locate_all; regex functionality is moved to .match and .match_all where appropriate. This is in preparation for simplifying regex support across the board.
Also change Matching_Mode types to a single type with two variants.
Note: the matcher parameter to .locate and .locate_all has been replaced by a case_sensitivity parameter, of type Case_Sensitivity, which differs in that it also has a Default option. Default is treated as Sensitive.
- Updated `Widget.Vector_Editor` ready for use by IDE team.
- Added `get` to `Row` to make API more aligned.
- Added `first_column`, `second_column` and `last_column` to `Table` APIs.
- Adjusted `Column_Selector` and associated methods to have simpler API.
- Removed `Column` from `Aggregate_Column` constructors.
- Added new `Excel_Workbook` type and added to `Excel_Section`.
- Added new `SQLiteFormatSPI` and `SQLite_Format`.
- Added new `IamgeFormatSPI` and `Image_Format`.
Closes#5109
# Important Notes
- Currently the tests pass for the in-memory parts of Common_Table_Operations, but still some stuff not working on DB backends - in progress.
Add `Comparator` type class emulation for all types. Migrate all the types in stdlib to this new `Comparator` API. The main documentation is in `Ordering.enso`.
Fixes these pivotals:
- https://www.pivotaltracker.com/story/show/183945328
- https://www.pivotaltracker.com/story/show/183958734
- https://www.pivotaltracker.com/story/show/184380208
# Important Notes
- The new Comparator API forces users to specify both `equals` and `hash` methods on their custom comparators.
- All the `compare_to` overrides were replaced by definition of a custom _ordered_ comparator.
- All the call sites of `x.compare_to y` method were replaced with `Ordering.compare x y`.
- `Ordering.compare` is essentially a shortcut for `Comparable.from x . compare x y`.
- The default comparator for `Any` is `Default_Unordered_Comparator`, which just forwards to the builtin `EqualsNode` and `HashCodeNode` nodes.
- For `x`, one can get its hash with `Comparable.from x . hash x`.
- This makes `hash` as _hidden_ as possible. There are no other public methods to get a hash code of an object.
- Comparing `x` and `y` can be done either by `Ordering.compare x y` or `Comparable.from x . compare x y` instead of `x.compare_to y`.
- Fixes the display of Date, Time_Of_Day and Date_Time so doesn't wrap.
- Adjust serialization of large integer values for JS and display within table.
- Workaround for issue with using `.lines` in the Table (new bug filed).
- Disabled warning on no specified `separator` on `Concatenate`.
Does not include fix for aggregation on integer values outside of `long` range.
Closes#5038
- Use the proper widget structure.
- Provide new method `get_widget_json` with whole structure, but keep `get_full_annotations_json` in old form.
- Start to get to some reusable functions.
- Added widget to JS_Object field selections.
- New `set` function design - takes a `Column` and works with that more easily and supports control of `Set_Mode`.
- New simple `parse` API on `Column`.
- Separated expression support for `filter` to new `filter_by_expression` on `Table`.
- New `compute` function allowing creation of a column from an expression.
- Added case sensitivity argument to `Column` based on `starts_with`, `ends_with` and `contains`.
- Added case sensitivity argument to `Filter_Condition` for `Starts_With`, `Ends_With`, `Contains` and `Not_Contains`.
- Fixed the issue in JS Table visualisation where JavaScript date was incorrectly set.
- Some dynamic dropdown expressions - experimenting with ways to use them.
- Fixed issue with `.pretty` that wasn't escaping `\`.
- Changed default Postgres DB to `postgres`.
- Fixed SQLite support for starts_with, ends_with and contains to be consistent (using GLOB not LIKE).
- Updated `Text.starts_with`, `Text.ends_with` and `Text.contains` to new simpler API.
- Added a `Case_Sensitivity.Default` and adjusted `Table.distinct` to use it by default.
- Fixed a bug with `Data.fetch` on an HTTP error.
- Improved SQLite Case Sensitivity control in distinct to use collations.
When the function is visualized with the `default_preprocessor` method, it returns a dataflow error that shows up in the logs as
```
Cannot encode class org.enso.interpreter.runtime.error.DataflowError to byte array.
```
PR handles dataflow errors in the preprocessor and fallbacks the `to_display_text` value. As a bonus, we get a function visualization.
Implements [#183453466](https://www.pivotaltracker.com/story/show/183453466).
https://user-images.githubusercontent.com/1428930/203870063-dd9c3941-ce79-4ce9-a772-a2014e900b20.mp4
# Important Notes
* the best laziness is used for `Text` type, which makes use of its internal representation to send data
* any type will first compute its default string representation and then send the content of that lazy to the IDE
* special handling of files and their content will be implemented in the future
* size of the displayed text can be updated dynamically based on best effort information: if the backend does not yet know the full width/height of the text, it can update the IDE at any time and this will be handled gracefully by updating the scrollbar position and sizes.
- add: `GeneralAnnotation` IR node for `@name expression` annotations
- update: compilation pipeline to process the annotation expressions
- update: rewrite `OverloadsResolution` compiler pass so that it keeps the order of module definitions
- add: `Meta.get_annotation` builtin function that returns the result of annotation expression
- misc: improvements (private methods, lazy arguments, build.sbt cleanup)
* Hash codes prototype
* Remove Any.hash_code
* Improve caching of hashcode in atoms
* [WIP] Add Hash_Map type
* Implement Any.hash_code builtin for primitives and vectors
* Add some values to ValuesGenerator
* Fix example docs on Time_Zone.new
* [WIP] QuickFix for HashCodeTest before PR #3956 is merged
* Fix hash code contract in HashCodeTest
* Add times and dates values to HashCodeTest
* Fix docs
* Remove hashCodeForMetaInterop specialization
* Introduce snapshoting of HashMapBuilder
* Add unit tests for EnsoHashMap
* Remove duplicate test in Map_Spec.enso
* Hash_Map.to_vector caches result
* Hash_Map_Spec is a copy of Map_Spec
* Implement some methods in Hash_Map
* Add equalsHashMaps specialization to EqualsAnyNode
* get and insert operations are able to work with polyglot values
* Implement rest of Hash_Map API
* Add test that inserts elements with keys with same hash code
* EnsoHashMap.toDisplayString use builder storage directly
* Add separate specialization for host objects in EqualsAnyNode
* Fix specialization for host objects in EqualsAnyNode
* Add polyglot hash map tests
* EconomicMap keeps reference to EqualsNode and HashCodeNode.
Rather than passing these nodes to `get` and `insert` methods.
* HashMapTest run in polyglot context
* Fix containsKey index handling in snapshots
* Remove snapshots field from EnsoHashMapBuilder
* Prepare polyglot hash map handling.
- Hash_Map builtin methods are separate nodes
* Some bug fixes
* Remove ForeignMapWrapper.
We would have to wrap foreign maps in assignments for this to be efficient.
* Improve performance of Hash_Map.get_builtin
Also, if_nothing parameter is suspended
* Remove to_flat_vector.
Interop API requires nested vector (our previous to_vector implementation). Seems that I have misunderstood the docs the first time I read it.
- to_vector does not sort the vector by keys by default
* Fix polyglot hash maps method dispatch
* Add tests that effectively test hash code implementation.
Via hash map that behaves like a hash set.
* Remove Hashcode_Spec
* Add some polyglot tests
* Add Text.== tests for NFD normalization
* Fix NFD normalization bug in Text.java
* Improve performance of EqualsAnyNode.equalsTexts specialization
* Properly compute hash code for Atom and cache it
* Fix Text specialization in HashCodeAnyNode
* Add Hash_Map_Spec as part of all tests
* Remove HashMapTest.java
Providing all the infrastructure for all the needed Truffle nodes is no longer manageable.
* Remove rest of identityHashCode message implementations
* Replace old Map with Hash_Map
* Add some docs
* Add TruffleBoundaries
* Formatting
* Fix some tests to accept unsorted vector from Map.to_vector
* Delete Map.first and Map.last methods
* Add specialization for big integer hash
* Introduce proper HashCodeTest and EqualsTest.
- Use jUnit theories.
- Call nodes directly
* Fix some specializations for primitives in HashCodeAnyNode
* Fix host object specialization
* Remove Any.hash_code
* Fix import in Map.enso
* Update changelog
* Reformat
* Add truffle boundary to BigInteger.hashCode
* Fix performance of HashCodeTest - initialize DataPoints just once
* Fix MetaIsATest
* Fix ValuesGenerator.textual - Java's char is not Text
* Fix indent in Map_Spec.enso
* Add maps to datapoints in HashCodeTest
* Add specialization for maps in HashCodeAnyNode
* Add multiLevelAtoms to ValuesGenerator
* Provide a workaround for non-linear key inserts
* Fix specializations for double and BigInteger
* Cosmetics
* Add truffle boundaries
* Add allowInlining=true to some truffle boundaries.
Increases performance a lot.
* Increase the size of vectors, and warmup time for Vector.Distinct benchmark
* Various small performance fixes.
* Fix Geo_Spec tests to accept unsorted Map.to_vector
* Implement Map.remove
* FIx Visualization tests to accept unsorted Map.to_vector
* Treat java.util.Properties as Map
* Add truffle boundaries
* Invoke polyglot methods on java.util.Properties
* Ignore python tests if python lang is missing
- Add `get` to Table.
- Correct `Count Nothing` examples.
- Add `join` to File.
- Add `File_Format.all` listing all installed formats.
- Add some more ALIAS entries.
Fixes https://www.pivotaltracker.com/story/show/184073099
# Important Notes
- Since now the only operator on columns for division, `/`, returns floats, it may be worth creating an additional `div` operator exposing integer division. But that will be done as a separate task aligning column operator APIs.
Use `InteropLibrary.isString` and `asString` to convert any string value to `byte[]`
# Important Notes
Also contains a support for `Metadata.assertInCode` to help locating the right place in the code snippets.
**Vector**
- Adjusted `Vector.sort` to be `Vector.sort order on by`.
- Adjusted other sort to use `order` for direction argument.
- Added `insert`, `remove`, `index_of` and `last_index_of` to `Vector`.
- Added `start` and `if_missing` arguments to `find` on `Vector`, and adjusted default is `Not_Found` error.
- Added type checking to `+` on `Vector`.
- Altered `first`, `second` and `last` to error with `Index_Out_Of_Bounds` on `Vector`.
- Removed `sum`, `exists`, `head`, `init`, `tail`, `rest`, `append`, `prepend` from `Vector`.
**Pair**
- Added `last`, `any`, `all`, `contains`, `find`, `index_of`, `last_index_of`, `reverse`, `each`, `fold` and `reduce` to `Pair`.
- Added `get` to `Pair`.
**Range**
- Added `first`, `second`, `index_of`, `last_index_of`, `reverse` and `reduce` to `Range`.
- Added `at` and `get` to `Range`.
- Added `start` and `if_missing` arguments to `find` on `Range`.
- Simplified `last` and `length` of `Range`.
- Removed `exists` from `Range`.
**List**
- Added `second`, `find`, `index_of`, `last_index_of`, `reverse` and `reduce` to `Range`.
- Added `at` and `get` to `List`.
- Removed `exists` from `List`.
- Made `all` short-circuit if any fail on `List`.
- Altered `is_empty` to not compute the length of `List`.
- Altered `first`, `tail`, `head`, `init` and `last` to error with `Index_Out_Of_Bounds` on `List`.
**Others**
- Added `first`, `second`, `last`, `get` to `Text`.
- Added wrapper methods to the Random_Number_Generator so you can get random values more easily.
- Adjusted `Aggregate_Column` to operate on the first column by default.
- Added `contains_key` to `Map`.
- Added ALIAS to `row_count` and `order_by`.
Don't propagate errors from `toDisplayString` - construct an error message with `Atom.toString`.
# Important Notes
> currently a failure in to_text is swallowed by `toString` and we cannot detect that something went wrong during the serialization
Not sure how satisfying the solution is, but the error swallowing happens in Truffle and there is little to do with it. We can just catch the error ourselves and produce some meaningful string.
Most of the problems with accessing `ArrayOverBuffer` have been resolved by using `CoerceArrayNode` (https://github.com/enso-org/enso/pull/3817). In `Array.sort` we still however specialized on Array which wasn't compatible with `ArrayOverBuffer`. Similarly sorting JS or Python arrays wouldn't work.
Added a specialization to `Array.sort` to deal with that case. A generic specialization (with `hasArrayElements`) not only handles `ArrayOverBuffer` but also polyglot arrays coming from JS or Python. We could have an additional specialization for `ArrayOverBuffer` only (removed in the last commit) that returns `ArrayOverBuffer` rather than `Array` although that adds additional complexity which so far is unnecessary.
Also fixed an example in `Array.enso` by providing a default argument.
Compiler performed name resolution of literals in type signatures but would silently fail to report any problems.
This meant that wrong names or forgotten imports would sneak in to stdlib.
This change introduces 2 main changes:
1) failed name resolutions are appended in `TypeNames` pass
2) `GatherDiagnostics` pass also collects and reports failures from type
signatures IR
Updated stdlib so that it passes given the correct gatekeepers in place.
Implements https://www.pivotaltracker.com/story/show/184032869
# Important Notes
- Currently we get failures in Full joins on Postgres which show a more serious problem - amending equality to ensure that `[NULL = NULL] == True` breaks hash/merge based indexing - so such joins will be extremely inefficient. All our joins currently rely on this notion of equality which will mean all of our DB joins will be extremely inefficient.
- We need to find a solution that will support nulls and still work OK with indices (but after exploring a few approaches: `COALESCE(a = b, a IS NULL AND b is NULL)`, `a IS NOT DISTINCT FROM b`, `(a = b) OR (a IS NULL AND b is NULL)`; all of which did not work (they all result in `ERROR: FULL JOIN is only supported with merge-joinable or hash-joinable join conditions`) I'm less certain that it is possible. Alternatively, we may need to change the NULL semantics to align it with SQL - this seems like likely the simpler solution, allowing us to generate simple, reliable SQL - the NULL=NULL solution will be cornering us into nasty workarounds very dependent on the particular backend.
PR adds a flag to `Text` implementation tracking whether it is in a FCD normal form. Then this information can be used in the `Normalizer.compare` method.
| Benchmark name | Old (ms) | With flag (ms)
| --- | --- | ---
| Unicode very short | 40.29 | 40.04
| Unicode medium | 9.07 | 1.99
| Unicode big - random | 115.39 | 0.35
| Unicode big - early difference | 107.02 | 0.54
| Unicode big - late difference | 749.81 | 94.73
| ASCII very short | 28.13 | 31.13
| ASCII medium | 4.58 | 2.26
| ASCII big - random | 42.68 | 0.26
| ASCII big - early difference | 30.91 | 0.32
| ASCII big - late difference | 66.29 | 42.72
Full benchmark output.
[bench_old.txt](https://github.com/enso-org/enso/files/10325202/bench_old.txt)
[bench_new.txt](https://github.com/enso-org/enso/files/10325201/bench_new.txt)
`Any.==` is a builtin method. The semantics is the same as it used to be, except that we no longer assume `x == y` iff `Meta.is_same_object x y`, which used to be the case and caused failures in table tests.
# Important Notes
Measurements from `EqualsBenchmarks` shows that the performance of `Any.==` for recursive atoms increased by roughly 20%, and the performance for primitive types stays roughly the same.
First part of fixing `Text.to_text`.
- add: `pretty` method for pretty printing.
- update: make `Text.to_text` conversion identity for Text
In the next iterations `to_text` will be gradually replaced with `to Text` conversion once the related issues with conversions are fixed.
Implements `getMetaObject` and related messages from Truffle interop for Enso values and types. Turns `Meta.is_a` into builtin and re-uses the same functionality.
# Important Notes
Adds `ValueGenerator` testing infrastructure to provide unified access to special Enso values and builtin types that can be reused by other tests, not just `MetaIsATest` and `MetaObjectTest`.
Use JavaScript to parse and serialise to JSON. Parses to native Enso object.
- `.to_json` now returns a `Text` of the JSON.
- Json methods now `parse`, `stringify` and `from_pairs`.
- New `JSON_Object` representing a JavaScript Object.
- `.to_js_object` allows for types to custom serialize. Returning a `JS_Object`.
- Default JSON format for Atom now has a `type` and `constructor` property (or method to call for as needed to deserialise).
- Removed `.into` support for now.
- Added JSON File Format and SPI to allow `Data.read` to work.
- Added `Data.fetch` API for easy Web download.
- Default visualization for JS Object trunctes, and made Vector default truncate children too.
Fixes defect where types with no constructor crashed on `to_json` (e.g. `Matching_Mode.Last.to_json`.
Adjusted default visualisation for Vector, so it doesn't serialise an array of arrays forever.
Likewise, JS_Object default visualisation is truncated to a small subset.
New convention:
- `.get` returns `Nothing` if a key or index is not present. Takes an `other` argument allowing control of default.
- `.at` error if key or index is not present.
- `Nothing` gains a `get` method allowing for easy propagation.
This removes the special handling of polyglot exceptions and allows matching on Java exceptions in the same way as for any other types.
`Polyglot_Error`, `Panic.catch_java` and `Panic.catch_primitive` are gone
The change mostly deals with the backslash of removing `Polyglot_Error` and two `Panic` methods.
`Panic.catch` was implemented as a builtin instead of delegating to `Panic.catch_primitive` builtin that is now gone.
This fixes https://www.pivotaltracker.com/story/show/182844611
When integrated with CI, will guard against any compilation failures. Can be enabled via env var `ENSO_BENCHMARK_TEST_DRY_RUN="True"`:
```
ENSO_BENCHMARK_TEST_DRY_RUN="True" built-distribution/enso-engine-0.0.0-dev-linux-amd64/enso-0.0.0-dev/bin/enso --run test/Benchmarks
```
# Important Notes
/cc @mwu-tow this could be run only on linux PRs in CI
Benchmark to compare _curried and lambda_ based function invocations and a fix to make _curried_ invocation (at least) as fast as the _lambda_ one. Allows us to use _curried invocations_ in standard library again without loosing any speed.
# Important Notes
Execute as:
```
sbt:runtime> benchOnly CurriedFunctionBenchmarks
```
Prior to subsequent bugfixes in this PR the benchmark results were:
- `averageCurried` runs in 0.290 ms
- `averageLambda` runs in 0.122 ms
e.g. _curried invocations_ is more than twice slow. That confirms our findings from the `Array_Proxy` vector benchmarks. The problem is that _function object is not compilation final_. After fixing it we have following results:
- `averageCurried` runs in 0.102 ms
- `averageLambda` runs in 0.111 ms
e.g. both operations are of similar complexity.
- Implemented https://www.pivotaltracker.com/story/show/183913276
- Refactored MultiValueIndex and MultiValueKeys to be more type-safe and more direct about using ordered or unordered maps.
- Added performance tests ensuring we use an efficient algorithm for the joins (the tests will fail for a full O(N*M) scan).
- Removed some duplicate code in the Table library.
- Added optional coloring of test results in terminal to make failures easier to spot.
- Aligned `compare_to` so returns `Type_Error` if `that` is wrong type for `Text`, `Ordering` and `Duration`.
- Add `empty_object`, `empty_array`. `get_or_else`, `at`, `field_names` and `length` to `Json`.
- Fix `Json` serialisation of NaN and Infinity (to "null").
- Added `length`, `at` and `to_vector` to Pair (allowing it to be treated as a Vector).
- Added `running_fold` to the `Vector` and `Range`.
- Added `first` and `last` to the `Vector.Builder`.
- Allow `order_by` to take a single `Sort_Column` or have a mix of `Text` and `Sort_Column.Name` in a `Vector`.
- Allow `select_columns_helper` to take a `Text` value. Allows for a single field in group_by in cross_tab.
- Added `Patch` and `Custom` to HTTP_Method.
- Added running `Statistic` calculation and moved more of the logic from Java to Enso. Performance seems similar to pure Java version now.
Add `Test.with_clue` function similar to ScalaTest's [with_clue](https://www.scalatest.org/user_guide/using_assertions).
This is useful for tests where the assertion depends on context which is not clear from it's operands.
### Important Notes
Using with_clue changes `State Clue` which is extracted by the `fail` function.
The State is introduced by `Test.specify` when the test is not pending.
There should be no changes to the public API apart form the addition of `Test.with_clue`.
Using lambda instead of higher order function.
# Important Notes
Running:
```
sbt:runtime> benchOnly VectorBenchmarks.averageOverArrayProxy
```
speeds up from `0.038 ms/op` to `0.016 ms/op` on my computer. Which seems good enough.
* Sequence literal (Vector) should preserve warnings
When Vector was created via a sequence literal, we simply dropped any
associated any warnings associated with it.
This change propagates Warnings during the creation of the Vector.
Ideally, it would be sufficient to propagate warnings from the
individual elements to the underlying storage but doesn't go well with
`Vector.fromArray`.
* update changelog
* Array-like structures preserver warnings
Added a WarningsLibrary that exposes `hasWarnings` and `getWarnings`
messages. That way we can have a single storage that defines how to
extract warnings from an Array and the others just delegate to it.
This simplifies logic added to sequence literals to handle warnings.
* Ensure polyglot method calls are warning-free
Since warnings are no longer automatically extracted from Array-like
structures, we delay the operation until an actual polyglot method call
is performed.
Discovered a bug in `Warning.detach_selected_warnings` which was missing
any usage or tests.
* nits
* Support multi-dimensional Vectors with warnings
* Propagate warnings from case branches
* nit
* Propagate all vector warnings when reading element
Previously, accessing an element of an Array-like structure would only
return warnings of that element or of the structure itself.
Now, accessing an element also returns warnings from all its elements as
well.
- Moved `to_default_visualization_data` to `Standard.Visualization`.
- Remove the use of `is_a` in favour of case statements.
- Stop exporting Standard.Base.Error.Common.
- Separate errors to own files.
- Change constructors to be called `Error`.
- Rename `Caught_Panic.Caught_Panic_Data` -> `Caught_Panic.Panic`.
- Rename `Project_Description.Project_Description_Data` ->`Project_Description.Value`
- Rename `Regex_Matcher.Regex_Matcher_Data` -> `Regex_Matcher.Value` (can't come up with anything better!).
- Rename `Range.Value` -> `Range.Between`.
- Rename `Interval.Value` -> `Interval.Between`.
- Rename `Column.Column_Data` -> `Column.Value`.
- Rename `Table.Table_Data` -> `Table.Value`.
- Align all the Error types in Table.
- Removed GEO Json bits from Table.
- `Json.to_table` doesn't have the GEO bits anymore.
- Added `Json.geo_json_to_table` to add the functions back in.
# Important Notes
No more exports from anywhere but Main!
No more `_Data` constructors!
- Moved `Any`, `Error` and `Panic` to `Standard.Base`.
- Separated `Json` and `Range` extensions into own modules.
- Tidied `Case`, `Case_Sensitivity`, `Encoding`, `Matching`, `Regex_Matcher`, `Span`, `Text_Matcher`, `Text_Ordering` and `Text_Sub_Range` in `Standard.Base.Data.Text`.
- Tidied `Standard.Base.Data.Text.Extensions` and stopped it re-exporting anything.
- Tidied `Regex_Mode`. Renamed `Option` to `Regex_Option` and added type to export.
- Tidied up `Regex` space.
- Tidied up `Meta` space.
- Remove `Matching` from export.
- Moved `Standard.Base.Data.Boolean` to `Standard.Base.Boolean`.
# Important Notes
- Moved `to_json` and `to_default_visualization_data` from base types to extension methods.
Converting `Integer.parse` into a builtin and making sure it can parse big values like `100!`. Adding `locale` parameter to `Locale.parse` and making sure it parses `32,5` as `32.5` double in Czech locale.
# Important Notes
Note that one cannot
```
import Standard.Table as Table_Module
```
because of the 2-component name restriction that gets desugared to `Standard.Table.Main` and we have to write
```
import Standard.Table.Main as Table_Module
```
in a few places. Once we move `Json.to_table` extension this can be improved.
It appears that when were doing
`from XYZ import all`
the module `XYZ` was also being taken into account during name resolution.
This was unfortunate and became problematic when one had a type with the same name defined in it.
During pattern matching one could not simply do
```
from XYZ import all
...
case ... of
_ : XYZ -> ...
```
since the compiler would complain that we try to pattern match on a type but give it a module.
The module is now excluded from the name resolution, when importing everything from the module.
It appears that this "feature" was used in a number of our tests, so they had to be adapted.
This fixes task 4 in https://www.pivotaltracker.com/story/show/183833055
- Adds transpose and cross_tab to the In-Memory table.
- Cross Tab is built on top of aggregate and hence allows for expressions and has same error trapping as in aggregate.
# Important Notes
Only basic tests have been implemented. Error and warning tests will be added as a follow up task.
Implements https://www.pivotaltracker.com/story/show/183854123
It features a naive full scan join and only allows equality conditions. More advanced conditions and better optimized algorithms will be implemented in a subsequent PR.
- Export all for `Problem_Behavior` (allowing for Report_Warning, Report_Error and Ignore to be trivially used).
- Renamed `Range.Range_Data` to `Range.Value` moved to using `up_to` wherever possible.
- Reviewed `Function`, `IO`, `Polyglot`, `Random`, `Runtime`, `System`.
- `File` now published as type. Some static methods moved to `Data` others into type. Removed `read_bytes` static.
- New `Data` module for reading input data in one place (e.g. `Data.read_file`) will add `Data.connect` later.
- Added `Random` module to the exports.
- Move static methods into `Warning` type and exporting the type not the module.
# Important Notes
- Sorted a few imports into order (ordering by direct import in project, then by from import in project then polyglot and finally self imports).
RuntimeStdlibTest now checks that names in type signatures are qualified (aka poor man's typechecker)
Will be merged after the stdlib tidying by James is complete.
Here we go again...
- Tidied up `Pair` and stopped exporting `Pair_Data`. Adjusted so type exported.
- Tidy imports for `Json`, `Json.Internal`, `Locale`.
- Tidy imports Ordering.*. Export `Sort_Direction` and `Case_Sensitivity` as types.
- Move methods of `Statistics` into `Statistic`. Publishing the types not the module.
- Added a `compute` to a `Rank_Method`.
- Tidied the `Regression` module.
- Move methods of `Date`, `Date_Time`, `Duration`, `Time_Of_Day` and `Time_Zone` into type. Publishing types not modules.
- Added exporting `Period`, `Date_Period` and `Time_Period` as types. Static methods moved into types.
# Important Notes
- Move `compare_to_ignore_case`, `equals_ignore_case` and `to_case_insensitive_key` from Extensions into `Text`.
- Hiding polyglot java imports from export all in `Main.enso`.
- Moved static methods into `Locale` type. Publishing type not module.
- Stop publishing `Nil` and `Cons` from `List`.
- Tidied up `Json` and merged static in to type. Sorted out various type signatures which used a `Constructor`. Now exporting type and extensions.
- Tidied up `Noise` and merge `Generator` into file. Export type not module.
- Moved static method of `Map` into type. Publishing type not module.
# Important Notes
- Move `Text.compare_to` into `Text`.
- Move `Text.to_json` into `Json`.
* Tidy Bound and Interval.
* Fix Interval tests.
* Fix Interval tests.
* Restructure Index_Sub_Range to new Type/Statics.
* Adjust for Vector exported as a type and static methods on it.
* Tidy Maybe.
* Fix issue with Line_Ending_Style.
* Revert Filter_Condition change.
Fix benchmark test issue.
Tidy imports on Index_Sub_Range.
* Revert Filter_Condition change.
Fix benchmark test issue.
Tidy imports on Index_Sub_Range.
* Can't export constructors unless exported from type in module.
* Fix failing tests.
- Allow `Map` to store a `Nothing` key (fixes `Vector.distinct` with a `Nothing`).
- Add `column_names` method to `Table` as a shorthand.
- Return data flow error when comparing with Nothing (not a Panic or a Polyglot exception).
- Allow milli and micro second for DateTime and Time Of Day
# Important Notes
- Added a load of tests for the various comparison operators to Numbers_Spec.
It appears that we were always adding builtin methods to the scope of the module and the builtin type that shared the same name.
This resulted in some methods being accidentally available even though they shouldn't.
This change treats differently builtins of types and modules and introduces auto-registration feature for builtins.
By default all builtin methods are registered with a type, unless explicitly defined in the annotation property.
Builtin methods that are auto-registered do not have to be explicitly defined and are registered with the underlying type.
Registration correctly infers the right type, depending whether we deal with static or instance methods.
Builtin methods that are not auto-registered have to be explicitly defined **always**. Modules' builtin methods are the prime example.
# Important Notes
Builtins now carry information whether they are static or not (inferred from the lack of `self` parameter).
They also carry a `autoRegister` property to determine if a builtin method should be automatically registered with the type.
Libraries: Revert changes that were necessitated by a new rule we have decided not to introduce.
Parser:
- Support mixed constructors/bindings in types.
- Disallow zero-length hex sequences in character escapes: `\x`, `\u`, `\u{}`, `\U`, `\U{}` are no longer legal synonyms for `\0` (matches old parser behavior).
Computing length of a text takes time. Let's cache it after first computation.
# Important Notes
Wrote `StringBenchmarks` that sums lengths of (the same) `Text` present many time in a `Vector`. Initially it took `383.673 ms` per operation. Then it took `0.031 ms/op`. Looks like the `length` calls are returning instantly as they get cached.
- Added expression ANTLR4 grammar and sbt based build.
- Added expression support to `set` and `filter` on the Database and InMemory `Table`.
- Added expression support to `aggregate` on the Database and InMemory `Table`.
- Removed old aggregate functions (`sum`, `max`, `min` and `mean`) from `Column` types.
- Adjusted database `Column` `+` operator to do concatenation (`||`) when text types.
- Added power operator `^` to both `Column` types.
- Adjust `iif` to allow for columns to be passed for `when_true` and `when_false` parameters.
- Added `is_present` to database `Column` type.
- Added `coalesce`, `min` and `max` functions to both `Column` types performing row based operation.
- Added support for `Date`, `Time_Of_Day` and `Date_Time` constants in database.
- Added `read` method to InMemory `Column` returning `self` (or a slice).
# Important Notes
- Moved approximate type computation to `SQL_Type`.
- Fixed issue in `LongNumericOp` where it was always casting to a double.
- Removed `head` from InMemory Table (still has `first` method).
Fix bugs in `TreeToIr` (rewrite) and parser. Implement more undocumented features in parser. Emulate some old parser bugs and quirks for compatibility.
Changes in libs:
- Fix some bugs.
- Clean up some odd syntaxes that the old parser translates idiosyncratically.
- Constructors are now required to precede methods.
# Important Notes
Out of 221 files:
- 215 match the old parser
- 6 contain complex types the old parser is known not to handle correctly
So, compared to the old parser, the new parser parses 103% of files correctly.
This PR adds `Period` type, which is a date-only complement to `Duration` builtin type.
# Important Notes
- `Period` replaces `Date_Period`, and `Time_Period`.
- Added shorthand constructors for `Duration` and `Period`. For example: `Period.days 10` instead of `Period.new days=10`.
- `Period` can be compared to other `Period` in some cases, other cases throw an error.
Define start of Enso epoch as 15th of October 1582 - start of the Gregorian calendar.
# Important Notes
- Some (Gregorian) calendar related functionalities within `Date` and `Date_Time` now produces a warning if the receiving Date/Date_Time is before the epoch start, e.g., `week_of_year`, `is_leap_year`, etc.
1. Changes how we do monadic state – rather than a haskelly solution, we now have an implicit env with mutable data inside. It's better for the JVM. It also opens the possibility to have state ratained on exceptions (previously not possible) – both can now be implemented.
2. Introduces permission check system for IO actions.
Another part of #3611 with few more `TreeToIr` improvements.
# Important Notes
Unofficial `LoadParser.sh` check from #3611 of all library files now reports just 54 failures out of 222 files - e.g. 75% success rate.
The main culprit of a Vector slowdown (when compared to Array) was the normalization of the index when accessing the elements. Turns out that the Graal was very persistent on **not** inlining that particular fragment and that was degrading the results in benchmarks.
Being unable to force it to do it (looks like a combination of thunk execution and another layer of indirection) we resorted to just moving the normalization to the builtin method. That makes Array and Vector perform roughly the same.
Moved all handling of invalid index into the builtin as well, simplifying the Enso implementation. This also meant that `Vector.unsafe_at` is now obsolete.
Additionally, added support for negative indices in Array, to behave in the same way as for Vector.
# Important Notes
Note that this workaround only addresses this particular perf issue. I'm pretty sure we will have more of such scenarios.
Before the change `averageOverVector` benchmark averaged around `0.033 ms/op` now it does consistently `0.016 ms/op`, similarly to `averageOverArray`.
Improve `Unsupported_Argument_Types` error so that it includes the message from the original exception. `arguments` field is retained, but not included in `to_display_text` method.
- Removed `Dubious constructor export` from Examples, Geo, Google_Api, Image and Test.
- Updated Google_Api project to meet newer code standards.
- Restructured `Standard.Test`:
- `Main.enso` now exports `Bench`, `Faker`, `Problems`, `Test`, `Test_Suite`
- `Test.Suite` methods moved into a `Test_Suite` type.
- Moved `Bench.measure` into `Bench` type.
- Separated the reporting to a `Test_Reporter` module.
- Moved `Faker` methods into `Faker` type.
- Removed `Verbs` and `.should` method.
- Added `should_start_with` and `should_contain` extensions to `Any`.
- Restructured `Standard.Image`:
- Merged Codecs methods into `Image`.
- Export `Image`, `Read_Flag`, `Write_Flag` and `Matrix` as types from `Main.enso`.
- Merged the internal methods into `Matrix` and `Image`.
- Fixed `Day_Of_Week` to be exported as a type and sort the `from` method.
- Reimplement the `Duration` type to a built-in type.
- `Duration` is an interop type.
- Allow Enso method dispatch on `Duration` interop coming from different languages.
# Important Notes
- The older `Duration` type should now be split into new `Duration` builtin type and a `Period` type.
- This PR does not implement `Period` type, so all the `Period`-related functionality is currently not working, e.g., `Date - Period`.
- This PR removes `Integer.milliseconds`, `Integer.seconds`, ..., `Integer.years` extension methods.