- After [suggestion](https://github.com/enso-org/enso/pull/8497#discussion_r1429543815) from @JaroslavTulach I have tried reimplementing the URL encoding using just `URLEncode` builtin util. I will see if this does not complicate other followup improvements, but most likely all should work so we should be able to get rid of the unnecessary bloat.
- Closes#8354
- Extends `simple-httpbin` with a simple mock of the Cloud API (currently it checks the token and serves the `/users` endpoint).
- Renames `simple-httpbin` to `http-test-helper`.
- Closes#8352
- ~~Proposed fix for #8493~~
- The temporary fix is deemed not viable. I will try to figure out a workaround and leave fixing #8493 to the engine team.
- Closes#8111 by making sure that all Excel workbooks are read using a backing file (which should be more memory efficient).
- If the workbook is being opened from an input stream, that stream is materialized to a `Temporary_File`.
- Adds tests fetching Table formats from HTTP.
- Extends `simple-httpbin` with ability to serve files for our tests.
- Ensures that the `Infer` option on `Excel` format also works with streams, if content-type metadata is available (e.g. from HTTP headers).
- Implements a `Temporary_File` facility that can be used to create a temporary file that is deleted once all references to the `Temporary_File` instance are GCed.
* tests
* wip
* wip
* additional warnings
* wip
* wip
* cleanup
* nested wrapping
* multiple nestings
* wraps_error uses looks_for, test for should_fail_with
* wip
* stack trace line fix
* use catch_primitive internally
* fix warning mapping, dtf spec
* just one wrapper checker, vector spec
* missing ctor, back to non-primitive catch
* back to c_p
* put old map back
* wip
* unnest tests
* Array.map on_problems
* wip
* Revert "wip"
This reverts commit c30d171457.
* better test names
* warning logging
* wip
* wip
* move logic into ALH
* doc
* constant
* My_Error.Error
* nested
* doc
* map_primtiive in warning mapper
* composition
* ref spec
* Remove warnings prior to matching on the value
If an expression has warnings and is matched we:
1) extract the warnings
2) execute the branch of a pattern that matches the value
3) attach extracted warnings to the result
This caused warnings to reappear when doing the custom warnings
manipulation.
This is also consistent with how `CaseNode`'s `doWarning` specialization
is defined.
* fix 1
* do not auto unwrap in test error checkers
* nested error matcher
* in problems too
* dtf
* v
* statistics
* wip
* Table_Spec, map_with_index_primitive
* Column_Operations_Spec
* disable warning wrapping and Report_Warning
* unimpl test
* Warnings_Spec
* DCS
* ACG JP
* zip_primitive
* join_helpers
* Lookup_Helpers
* Table
* Data_Formatter
* Value_Type_Helpers
* revert check types changes
* table_helpers
* table tests
* remove st
* do not remove warnings from value
* vec docs, tests for zip, mwi, flat_map
* docs, fixes
* remove nested_error_matcher
* cleanup
* benchmark
* one error
* alter
* add bench to main
* review
* review
* review
* tail call
* changelog
* tail call was not a tail call
* ws
* bad import
* Added missing import
* Update distribution/lib/Standard/Base/0.0.0-dev/src/Data/Array.enso
Co-authored-by: Radosław Waśko <radoslaw.wasko@enso.org>
* review, ref example
* lazy benchmark data
* extra paren
* check outside of catch
* review
* vector too
* actually lazy
* disambiguate Map_Error
* finish rename
* move to extensions
* combine Additional_Warnings error
* rename to map_no_wrap
* do not catch and rethrow
* review
* wip
* remove _primitives entirely
* remove unused should_fail_with function options
* remove expected_warning as function in Problems
---------
Co-authored-by: Hubert Plociniczak <hubert.plociniczak@gmail.com>
Co-authored-by: Radosław Waśko <radoslaw.wasko@enso.org>
* Test illustrating problems with FQNs
Inline execution fails with `Compile error: The name `Standard` could
not be found.`.
* Ensure InlineContext carries Package Repos info
Previously, there was no requirement that inline execution should allow
for FQNs. This meant that the omission of Package Repository info went
unnoticed.
In order to be able to refer to `Standard.Visualization.Preprocessor` it
has to be exported as well.
- Add a `File_For_Read` type. Used for `File_Format` to read files.
- Added `Enso_User` representing the current user in `Enso_Cloud`.
- *Will be later able to list known users.*
- Added `Enso_Secret` representing a value defined in `Enso_Cloud`.
- Value not used within Enso only accessed within polyglot Java.
- Integrated into `Username_And_Password` and can be used within JDBC connections.
- Integrated into HTTP Headers so a secret can be used as a value.
- New `URI_With_Query` with the same API as `URI`. Supporting secrets in the value.
- *Will be integrated with AWS credentials.*
- Added `Enso_File` representing a file or a folder in the cloud.
- Support the same API as `File` (like the `S3_File`).
- *Will support `enso://` URI style access.*
Upgrade to GraalVM JDK 21.
```
> java -version
openjdk version "21" 2023-09-19
OpenJDK Runtime Environment GraalVM CE 21+35.1 (build 21+35-jvmci-23.1-b15)
OpenJDK 64-Bit Server VM GraalVM CE 21+35.1 (build 21+35-jvmci-23.1-b15, mixed mode, sharing)
```
With SDKMan, download with `sdk install java 21-graalce`.
# Important Notes
- After this PR, one can theoretically run enso with any JRE with version at least 21.
- Removed `sbt bootstrap` hack and all the other build time related hacks related to the handling of GraalVM distribution.
- `project-manager` remains backward compatible - it can open older engines with runtimes. New engines now do no longer require a separate runtime to be downloaded.
- sbt does not support compilation of `module-info.java` files in mixed projects - https://github.com/sbt/sbt/issues/3368
- Which means that we can have `module-info.java` files only for Java-only projects.
- Anyway, we need just a single `module-info.class` in the resulting `runtime.jar` fat jar.
- `runtime.jar` is assembled in `runtime-with-instruments` with a custom merge strategy (`sbt-assembly` plugin). Caching is disabled for custom merge strategies, which means that re-assembly of `runtime.jar` will be more frequent.
- Engine distribution contains multiple JAR archives (modules) in `component` directory, along with `runner/runner.jar` that is hidden inside a nested directory.
- The new entry point to the engine runner is [EngineRunnerBootLoader](https://github.com/enso-org/enso/pull/7991/files#diff-9ab172d0566c18456472aeb95c4345f47e2db3965e77e29c11694d3a9333a2aa) that contains a custom ClassLoader - to make sure that everything that does not have to be loaded from a module is loaded from `runner.jar`, which is not a module.
- The new command line for launching the engine runner is in [distribution/bin/enso](https://github.com/enso-org/enso/pull/7991/files#diff-0b66983403b2c329febc7381cd23d45871d4d555ce98dd040d4d1e879c8f3725)
- [Newest version of Frgaal](https://repo1.maven.org/maven2/org/frgaal/compiler/20.0.1/) (20.0.1) does not recognize `--source 21` option, only `--source 20`.
- Closes#5303
- Refactors `JoinStrategy` allowing us to 'stack' join strategies on top of each other (to some extent) - currently a `HashJoin` can be followed by another join strategy (currently `SortJoin`)
- Adds benchmarks for join
- Due to limitations of the sorting approach this will still not be as fast as possible for cases where there is more than 1 `Between` condition in a single query - trying to demonstrate that in benchmarks.
- We can replace sorting by d-dimensional [RangeTrees](https://en.wikipedia.org/wiki/Range_tree) to get `O((n + m) log^d n + k)` performance (where `n` and `m` are sizes of joined tables, `d` is the amount of `Between` conditions used in the query and `k` is the result set size).
- Follow up ticket for consideration later:
#8216
- Closes#8215
- After all, it turned out that `TreeSet` was problematic (because of not enough flexibility with duplicate key handling), so the simplest solution was to immediately implement this sub-task.
- Closes#8204
- Unrelated, but I ran into this here: adds type checks to other arguments of `set`.
- Before, putting in a Column as `new_name` (i.e. mistakenly messing up the order of arguments), lead to a hard to understand `Method `if_then_else` of type Column could not be found.`, instead now it would file with type error 'expected Text got Column`.
- Sets the default limit for `Table.read` in Database to be max 1000 rows.
- The limit for in-memory compatible API still defaults to `Nothing`.
- Adds a warning if there are more rows than limit.
- Enables a few unrelated asserts.
* doc
* one test
* date tests
* empty and nothing
* ints floats
* bools
* all columns
* regex and index
* locales
* bad formats
* all with one format
* docs
* examples, not impl db
* docs, more errors
* cleanup
* changelog
* check list
* reorder
* clue
* review
* review
* review
* review
* review
* review
* specify time zone
Added Blank_Selector constructor and applied to remove_blank_columns, select_blank_columns, filter_blank_rows for #7931 . Changed when_any to when for readability.
Previously, constant columns were given generated names with UUIDs in them, which are long and provide no information. Instead, we now use the constant value itself to form the name.
Since these new generated names are less unique, we must explicitly make them unique, in cases where the caller did not explicilty set a name.
- Closes#7981
- Adds a `RUNTIME_ERROR` operation into the DB dialect, that may be used to 'crash' a query if a condition is met - used to validate if `lookup_and_replace` invariants are still satisfied when the query is materialized.
- Removes old `Table_Helpers.is_table` and `same_backend` checks, in favour of the new way of checking this that relies on `Table.from` conversions, and is much simpler to use and also more robust.
- Fixes#8049
- Adds tests for handling of Date_Time upload/download in Postgres.
- Adds tests for edge cases of handling of Decimal and Binary types in Postgres.
- Validate spans during existing lexer and parser unit tests, and in `enso_parser_debug`.
- Fix lost span info causing failures of updated tests.
# Important Notes
- [x] Output of `parse_all_enso_files.sh` is unchanged since before #7881 (modulo libs changes since then).
- When the parser encounters an input with the first line indented, it now creates a sub-block for lines at than indent level, and emits a syntax error (every indented block must have a parent).
- When the parser encounters a number with a base but no digits (e.g. `0x`), it now emits a `Number` with `None` in the digits field rather than a 0-length digits token.
- Fixes issue with `to_display_text` on `Date_Time`. Now a reversible format.
- Adjusted the auto width code on table so it works.
- Manually sized columns should remain fixed size.
- Enable Excel range style selection in table.
- Removed page support as not implemented yet.
# Important Notes
Will need to apply to Vue based viz at some point as well.
- Follow-up of #8055
- Adds a benchmark comparing performance of Enso Map and Java HashMap in two scenarios - _only incremental_ updates (like `Vector.distinct`) and _replacing_ updates (like keeping a counter for each key). These benchmarks can be used as a metric for #8090
- Adds the ability to use numbers, date/time and Boolean values as constants in `set`.
- `Table.set` can take a `Column_Operation`, allowing for deriving of a new column based on other columns.
- Added `Column_Ref` type to refer to a column in `filter`.
- Added some ALIASes.
- Added `sheet` to `Excel_Workbook` to give familiar API to read sheet.
- Added conversion from range to vector allowing easy use with Zip.
- Add `Map.from_keys_and_values.
- Fixes a bug where creating a temporary table could accidentally issue a `DROP` statement of the table name that the user provided, risking destruction of user data.
- Fortunately, the bad scenario was almost impossible, because the `DROP` statement was only issued _if_ we previously checked that the mentioned table _does not exist_ - dropping a nonexistent table does not do any harm.
- It could have been dangerous in a very unlikely scenario that a table was created just between the _existence check_ and the _drop_.
- After the fix the existence check and any modifications are done within a transaction to avoid interference from concurrent modifications, and the DROP is correctly applied to a temporary Enso table instead of the original one.
- Replaced a temporary log with proper simple logging of SQL statements into a file, if an Environment variable is set.
- Used that feature to test that no unexpected statements occur.
- Closes#7749 implementing the in-memory logic.
- Additional complications have surfaced regarding the Database logic, so it has been split off into a separate ticket: #7981
We currently scan the entire child list for each call to children, num_children, and get_child_element.
Instead, we should cache the filtered child list on first access.
This PR includes
* Reading XML from a file, stream, or string
* Reading XML via Data.fetch
* Accessing the root element, element children, and attributes
* Accessing tag text contents
* Get tags by name
* Inner / Outer XML string
- Fixes#7352 by remembering original value types in type inference mode to be able to reconstruct them for Mixed.
- Added more benchmarks for comparing performance of constructing columns.
- Fixes missing implementations that caused `Table.union` crashing on some type pairs.
- Ensures that `Loss_Of_Integer_Precision` warning is not swallowed when numeric columns are unioned to create a `Float` column.
- Adds test for all of the above cases.
- Allow to output benchmark results to a CSV by setting an environment variable - useful for quickly comparing benchmarks, e.g. in Enso.
- Shuffle a few `from`s into correct places:
- `Day_Of_Week.from` removing `Day_Of_Week_From` module.
- Adding short cut for `http` and `https` in `Data.read` so it calls onto `Data.fetch` giving a single entry point.
- Moved `URI` extensions from `Standard.Base.Data` module into `Standard.Base.Network.Extensions`.
- Added `post` extension for `URI`.
- Added `contains_key` to `JS_Object`.
- Restored `into` in `JS_Object`:
- Follows old logic populating a constructor.
- Will use conversion from `JS_Object` if present.
- Added automatic deserialization of `Date`, `Time_Of_Day` and `Date_Time` from JSON.
- Uses conversion from `JS_Object`.
- Added conversion from `Text` to a `HTTP_Method` and type checking where `HTTP_Method` used in public APIs.
- Added support for `Date`, `Time_Of_Day` and `Date_Time` in `Table.from_objects`.
- Added `expand_column` to `Table` to expand `JS_Object` to values.
- Add type checking for `Table` in `right` arguments (allowing `Column`s to be used).
- Use type checking in `Table.set` to allow for conversion to a `Column`.
- Remove some unused imports.
- Fix for bug in S3 edge case.
* Add Text.substring function and get_position helper function
For #7876 adds a Text.substring function which supports negative indexes and returns a part of a string from 0-based index 'start' and continuing for 'length'
* added substring and simplified get function
For #7876 adds a Text.substring function which supports negative indexes and returns a part of a string from 0-based index 'start' and continuing for 'length'.
Also simplified get function as it looped unnecessarily.
* Update distribution/lib/Standard/Base/0.0.0-dev/src/Data/Text/Extensions.enso
punctuation corrections
Co-authored-by: GregoryTravis <greg.m.travis@gmail.com>
* Update Text_Spec.enso
Added test for start index larger than string
* Update distribution/lib/Standard/Base/0.0.0-dev/src/Data/Text/Extensions.enso
updated Arguments: section to use consistent style
Co-authored-by: Radosław Waśko <radoslaw.wasko@enso.org>
* Update distribution/lib/Standard/Base/0.0.0-dev/src/Data/Text/Extensions.enso
updated Index_Out_Of_Bounds error to reference cached length
Co-authored-by: Radosław Waśko <radoslaw.wasko@enso.org>
* Removed Slice label and added changelog entry
* Re-added slice tag to substring
Per conversation with James, added slice back to substring
* Update CHANGELOG.md add link
---------
Co-authored-by: GregoryTravis <greg.m.travis@gmail.com>
Co-authored-by: Radosław Waśko <radoslaw.wasko@enso.org>
* Always log verbose to a file
The change adds an option by default to always log to a file with
verbose log level.
The implementation is a bit tricky because in the most common use-case
we have to always log in verbose mode to a socket and only later apply
the desired log levels. Previously socket appender would respect the
desired log level already before forwarding the log.
If by default we log to a file, verbose mode is simply ignored and does
not override user settings.
To test run `project-manager` with `ENSO_LOGSERVER_APPENDER=console` env
variable. That will output to the console with the default `INFO` level
and `TRACE` log level for the file.
* add docs
* changelog
* Address some PR requests
1. Log INFO level to CONSOLE by default
2. Change runner's default log level from ERROR to WARN
Took a while to figure out why the correct log level wasn't being passed
to the language server, therefore ignoring the (desired) verbose logs
from the log file.
* linter
* 3rd party uses log4j for logging
Getting rid of the warning by adding a log4j over slf4j bridge:
```
ERROR StatusLogger Log4j2 could not find a logging implementation. Please add log4j-core to the classpath. Using SimpleLogger to log to the console...
```
* legal review update
* Make sure tests use test resources
Having `application.conf` in `src/main/resources` and `test/resources`
does not guarantee that in Tests we will pick up the latter. Instead, by
default it seems to do some kind of merge of different configurations,
which is far from desired.
* Ensure native launcher test log to console only
Logging to console and (temporary) files is problematic for Windows.
The CI also revealed a problem with the native configuration because it
was not possible to modify the launcher via env variables as everything
was initialized during build time.
* Adapt to method changes
* Potentially deal with Windows failures
- Closes#7461 by introducing a `Date_Time_Formatter` type and making parsing date time formats more robust and safer.
- The default ('simple') set of patterns is slightly simplified and made case insensitive (except for `M/m` and `H/h`) to avoid the `YYYY` vs `yyyy` issues and make it less error prone.
- The `YYYY` now has the same meaning as `yyyy` in simple mode. The old meaning (week-based year) is moved to a _separate mode_, triggered by `Date_Time_Formatter.from_iso_week_date_pattern`.
- Full Java syntax, as well as custom-built Java `DateTimeFormatter` can also be used by `Date_Time_Formatter.from_java`.
- Text-based constants (e.g. `ISO_ZONED_DATE_TIME`) have now become methods on `Date_Time_Formatter`, e.g. `Date_Time_Formatter.iso_zoned_date_time`).
- Added a `FileSystemSPI` allowing protocol resolution to a target type.
- Separated `Input_Stream` and `Output_Stream` from `File` to allow use in other spaces.
- `File_Format` types `read_web` changed to be `read_stream` working with `InputStream`.
- Added directory listing to `Auto_Detect` allowing for `Data.read` to list a folder.
- Adjusted HTTP to return an `InputStream` not a `byte[]`:
- `Response_Body` adjusted to wrap an `InputStream`.
- Added ability to materialize to either and in-memory vector (<4KB) or a temporary file.
- `Data.fetch` will materialize if not a recognized mime-type.
- Added `HTTP_Error` to handle IO exceptions from the stream.
- `Excel_Format` now supports mime-type and reading a stream.
- `Excel_Workbook` can now get a `Excel_Section` using `read_section`.
- Added S3 APIs:
- `parse_uri`: splits an S3 URI into bucket and key.
- `list_objects`: list the items in a S3 bucket with specified prefix.
- `read_bucket`: list prefixes and keys with a delimiter in a S3 bucket with specified prefix.
- `head`: either head_bucket (tests existance) or head_object API (reads object meta data).
- `get_object`: gets an object from S3 returning as a `Response_Body`.
- Added `S3_File` type acting like a `File`:
- No support for writing in this PR.
- **ToDo:** recursive listing, glob filtering, exists, size.
- Fixed a few invalid type signature line.
- Moved `create` methods for `Postgres_Connection` and `SQLite_Connection` into type instead of module.
- Renamed `Column_Fetcher.Builder` to `Column_Fetcher_Builder`.
- Fixed bug with `select_into` in Dry Run mode creating permanent tables.
**ToDo:** Unit tests.
- Fixes#7354
- And also closes#7712
- Refactors how we handle numeric ops - ensuring that the 'kernels' are placed all in one place and selected based on storage types.
- Closes#7238
- Aligns `update_database_table` to a more consistent and clearer API - `update_rows`.
- Adds a `truncate_table` helper function, to pair up with `drop_table`. Both are `PRIVATE` for now.
- Adds tests for NULLs in keys in `update_rows` and `delete_rows`.
- The behaviour is sometimes unexpected, so instead these fail with `Null_Values_In_Key_Columns`.
- Adds a workaround for https://github.com/oracle/graal/issues/7359
- Adds a workaround for a related bug where a stack frame has no name (its `rootNode.getName() == null`).
- I could not track down this bug to provide a neat repro.
- Closes#7633
- Moves `Round_Spec.enso` from published `Standard.Test` into our `test/Tests` project; the `Table_Tests` that depend on it, simply `import enso_dev.Tests`.
- Changes the layout of the local libraries directory:
- It used to be `root/<namespace>/<name>`.
- Now it is `root/<dir>` - the namespace and name are now read from `package.yaml` instead.
- Adds the parent directory of the current project to the default `ENSO_LIBRARY_PATH`.
- It is treated as a secondary path, so the default `ENSO_HOME/lib` still takes precedence.
- This allows projects to reference and load 'sibling' projects easily - the only requirement is for the project to enable `prefer-local-libraries: true` or add the other local project to its edition. The edition resolution logic is **not changed**.
- Closes#6111
- Aligns semantics of handling Mixed columns.
- Now, if an operation like `iif` or `fill_nothing` is given a `Mixed` column, the result will also be `Mixed` regardless of the `inferred_precise_value_type`.
- Enables a few old tests that were pending but could be enabled since the types work is advanced enough.
- Update list of groups to agreed list.
- Lower case `ALIAS` names to be consistent with function names.
- Add `GROUP` to methods.
- All constructors and functions have doc comments.
- Correct a few typos (e.g. `PRVIATE`).
- Mark some more things as `PRIVATE`.
- Use `ToDo:` and `Note:` consistently.
- Order tags in doc comment.
# Important Notes
We don't have all the doc comments on types and will want to add them in future,
- Closes#5159
- Now data downloaded from the database can keep the type much closer to the original type (like string length limits or smaller integer types).
- Cast also exposes these types.
- The integers are still all stored as 64-bit Java `long`s, we just check their bounds. Changing underlying storage for memory efficiency may come in the future: #6109
- Fixes#7565
- Fixes#7529 by checking for arithmetic overflow in in-memory integer arithmetic operations that could overflow. Adds a documentation note saying that the behaviour for Database backends is unspecified and depends on particular database.
Closes#7353
I introduce a new type `WithAggregatedProblems`, because `WithProblems` was too simple - it only allowed to hold a `List<Problem>` but `AggregatedProblems` is more than that. Ideally we shouldn't multiply entities like this too much. We should probably unify all to use `WithAggregatedProblems` - but after starting this, I realised it will likely just take too much effort to do for this little PR. So instead, I created a follow-up task for this: #7514
- Fixes#7412
- Also adds tests and fixes some more edge cases:
- Ensures correct handling of existing Database tables whose column names may be invalid from Enso perspective, or clashing from Enso perspective (e.g. for most DBs `ś` and `s\u0301` are different names, but for Enso they are basically the same so this would cause issues - thus Enso now renames such columns when accessed (still using the correct column reference in the generated SQL under the hood).
- Closes#5951
- Ensures any SQL warnings reported by the database through the JDBC driver are processed and forwarded to the user.
- These warnings show issues like the implicit name truncation that this PR is also solving. It's good to make sure they are visible as they can help avoid and understand unexpected problems. They should not show up in most standard workflows.
- Adds simple history to our REPL.