- Closes #8111 by making sure that all Excel workbooks are read using a backing file (which should be more memory efficient).
- If the workbook is being opened from an input stream, that stream is materialized to a `Temporary_File`.
- Adds tests fetching Table formats from HTTP.
- Extends `simple-httpbin` with ability to serve files for our tests.
- Ensures that the `Infer` option on `Excel` format also works with streams, if content-type metadata is available (e.g. from HTTP headers).
- Implements a `Temporary_File` facility that can be used to create a temporary file that is deleted once all references to the `Temporary_File` instance are GCed.
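As a rough illustration of the GC-driven cleanup (a minimal sketch only; the class and method names below are made up and are not the actual `Temporary_File` implementation), `java.lang.ref.Cleaner` can be used to delete the file once its owner becomes unreachable:

```java
import java.io.IOException;
import java.lang.ref.Cleaner;
import java.nio.file.Files;
import java.nio.file.Path;

// Illustrative sketch: a temp file that is deleted once the owning object is GCed.
final class GcScopedTempFile implements AutoCloseable {
  private static final Cleaner CLEANER = Cleaner.create();
  private final Path path;
  private final Cleaner.Cleanable cleanable;

  GcScopedTempFile(String prefix) throws IOException {
    this.path = Files.createTempFile(prefix, ".tmp");
    Path toDelete = this.path;
    // Runs when this instance becomes unreachable (or when close() is called explicitly).
    this.cleanable = CLEANER.register(this, () -> {
      try { Files.deleteIfExists(toDelete); } catch (IOException e) { /* best effort */ }
    });
  }

  Path path() { return path; }

  @Override public void close() { cleanable.clean(); }
}
```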
* tests
* wip
* wip
* additional warnings
* wip
* wip
* cleanup
* nested wrapping
* multiple nestings
* wraps_error uses looks_for, test for should_fail_with
* wip
* stack trace line fix
* use catch_primitive internally
* fix warning mapping, dtf spec
* just one wrapper checker, vector spec
* missing ctor, back to non-primitive catch
* back to c_p
* put old map back
* wip
* unnest tests
* Array.map on_problems
* wip
* Revert "wip"
This reverts commit c30d171457.
* better test names
* warning logging
* wip
* wip
* move logic into ALH
* doc
* constant
* My_Error.Error
* nested
* doc
* map_primitive in warning mapper
* composition
* ref spec
* Remove warnings prior to matching on the value
If an expression has warnings and is matched, we:
1) extract the warnings,
2) execute the branch of the pattern that matches the value,
3) attach the extracted warnings to the result.
This caused warnings to reappear when doing custom warnings manipulation.
The new behaviour is also consistent with how `CaseNode`'s `doWarning` specialization
is defined.
* fix 1
* do not auto unwrap in test error checkers
* nested error matcher
* in problems too
* dtf
* v
* statistics
* wip
* Table_Spec, map_with_index_primitive
* Column_Operations_Spec
* disable warning wrapping and Report_Warning
* unimpl test
* Warnings_Spec
* DCS
* ACG JP
* zip_primitive
* join_helpers
* Lookup_Helpers
* Table
* Data_Formatter
* Value_Type_Helpers
* revert check types changes
* table_helpers
* table tests
* remove st
* do not remove warnings from value
* vec docs, tests for zip, mwi, flat_map
* docs, fixes
* remove nested_error_matcher
* cleanup
* benchmark
* one error
* alter
* add bench to main
* review
* review
* review
* tail call
* changelog
* tail call was not a tail call
* ws
* bad import
* Added missing import
* Update distribution/lib/Standard/Base/0.0.0-dev/src/Data/Array.enso
Co-authored-by: Radosław Waśko <radoslaw.wasko@enso.org>
* review, ref example
* lazy benchmark data
* extra paren
* check outside of catch
* review
* vector too
* actually lazy
* disambiguate Map_Error
* finish rename
* move to extensions
* combine Additional_Warnings error
* rename to map_no_wrap
* do not catch and rethrow
* review
* wip
* remove _primitives entirely
* remove unused should_fail_with function options
* remove expected_warning as function in Problems
---------
Co-authored-by: Hubert Plociniczak <hubert.plociniczak@gmail.com>
Co-authored-by: Radosław Waśko <radoslaw.wasko@enso.org>
Adds these JAR modules to the `component` directory inside the Engine distribution:
- `graal-language-23.1.0`
- `org.bouncycastle.*` - these need to be added for the GraalPy language
# Important Notes
- Remove `org.bouncycastle.*` packages from the `runtime.jar` fat jar.
- Make sure that the `./run` script preinstalls the GraalPy standalone distribution before starting engine tests.
- Note that using `python -m venv` is only possible from the standalone distribution; we cannot distribute `graalpython-launcher`.
- Make sure that the installation of `numpy` and its polyglot execution example works.
- Convert `Text` to `TruffleString` before passing to GraalPy - 8ee9a2816f
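For illustration, the kind of polyglot round-trip being validated can be sketched with the GraalVM Polyglot API (assuming a context with the GraalPy language available and `numpy` installed in its environment; this is not the exact test code):

```java
import org.graalvm.polyglot.Context;
import org.graalvm.polyglot.Value;

public class NumpyPolyglotCheck {
  public static void main(String[] args) {
    try (Context ctx = Context.newBuilder("python").allowAllAccess(true).build()) {
      // Assumes numpy is installed in the GraalPy environment used by this context.
      Value sum = ctx.eval("python", "import numpy as np\nfloat(np.arange(10).sum())");
      System.out.println(sum.asDouble()); // 45.0
    }
  }
}
```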
Fixes #5233 by removing `EconomicMap` & co. and using good old plain _linear hashing_. Fixes #8090 by introducing `StorageEntry.removed()` rather than copying the builder on each removal.
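A minimal sketch of the removal trick (illustrative names only, not the actual `MetadataStorage`/`StorageEntry` code): with linear probing, a removal just marks the entry instead of rebuilding or copying the store.

```java
// Illustrative sketch: linear probing with "removed" tombstones instead of
// copying the whole store on each removal. No resizing, to keep it short.
final class LinearTable {
  private static final class Entry {
    final String key; Object value; boolean removed;
    Entry(String key, Object value) { this.key = key; this.value = value; }
  }

  private final Entry[] slots = new Entry[64];

  private int indexOf(String key) {
    int i = Math.floorMod(key.hashCode(), slots.length);
    while (slots[i] != null && !slots[i].key.equals(key)) {
      i = (i + 1) % slots.length; // linear probe
    }
    return i;
  }

  void put(String key, Object value) {
    int i = indexOf(key);
    if (slots[i] == null) slots[i] = new Entry(key, value);
    else { slots[i].value = value; slots[i].removed = false; }
  }

  Object get(String key) {
    Entry e = slots[indexOf(key)];
    return (e == null || e.removed) ? null : e.value;
  }

  void remove(String key) {
    Entry e = slots[indexOf(key)];
    if (e != null) e.removed = true; // mark instead of rebuilding the store
  }
}
```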
With the change proposed herein, one can pass an optional filter to `enso --run test/Benchmarks` to execute only the groups and specs whose names contain the given string.
While trying to speed up `MetadataStorage` - #8324 - I felt the need for an independent measurement of startup time (on my computer). Here is a benchmark that measures how long the execution of two simple hello-world programs takes.
# Important Notes
There are two benchmarks:
- `empty_startup` measures the time needed to boot without using any `Standard` library - basically _the overhead of the JVM and the engine_
- `hello_world_startup` measures the use of `IO.println` - it should take longer than `empty_startup` and shows the overhead we incur while processing the standard library
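The measurement itself boils down to timing a full `enso --run` of a tiny program; a rough standalone sketch of the idea (the real benchmarks live in the project's benchmark suite, and the program contents here are simplified assumptions):

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.concurrent.TimeUnit;

public class StartupTiming {
  public static void main(String[] args) throws Exception {
    Path program = Files.createTempFile("hello", ".enso");
    // A near-empty program (assumption); swap in an IO.println variant for the second case.
    Files.writeString(program, "main = 42\n");
    long start = System.nanoTime();
    Process p = new ProcessBuilder("enso", "--run", program.toString())
        .inheritIO()
        .start();
    p.waitFor();
    long elapsedMs = TimeUnit.NANOSECONDS.toMillis(System.nanoTime() - start);
    System.out.println("startup + run took " + elapsedMs + " ms");
  }
}
```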
- Added a `File_For_Read` type, used by `File_Format` to read files.
- Added `Enso_User` representing the current user in `Enso_Cloud`.
- *Will be later able to list known users.*
- Added `Enso_Secret` representing a value defined in `Enso_Cloud`.
- The value is not used within Enso; it is only accessed within polyglot Java (see the sketch after this list).
- Integrated into `Username_And_Password` and can be used within JDBC connections.
- Integrated into HTTP Headers so a secret can be used as a value.
- New `URI_With_Query` with the same API as `URI`, supporting secrets in the values.
- *Will be integrated with AWS credentials.*
- Added `Enso_File` representing a file or a folder in the cloud.
- Supports the same API as `File` (like `S3_File`).
- *Will support `enso://` URI style access.*
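A hedged sketch of the secret-handling idea mentioned above (illustrative names only, not the actual `Enso_Secret` plumbing): the plaintext is resolved lazily on the Java side and fed straight into JDBC, so it never has to surface as an Enso value.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.util.Properties;
import java.util.function.Supplier;

// Illustrative only: the secret is resolved inside Java, right before opening the connection.
final class SecretRef {
  private final Supplier<char[]> resolver;
  SecretRef(Supplier<char[]> resolver) { this.resolver = resolver; }

  Connection openJdbc(String url, String user) throws Exception {
    Properties props = new Properties();
    props.setProperty("user", user);
    props.setProperty("password", new String(resolver.get())); // plaintext stays inside Java
    return DriverManager.getConnection(url, props);
  }
}
```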
Upgrade to GraalVM JDK 21.
```
> java -version
openjdk version "21" 2023-09-19
OpenJDK Runtime Environment GraalVM CE 21+35.1 (build 21+35-jvmci-23.1-b15)
OpenJDK 64-Bit Server VM GraalVM CE 21+35.1 (build 21+35-jvmci-23.1-b15, mixed mode, sharing)
```
With SDKMan, download with `sdk install java 21-graalce`.
# Important Notes
- After this PR, one can theoretically run Enso with any JRE of version at least 21.
- Removed the `sbt bootstrap` hack and all the other build-time hacks related to handling the GraalVM distribution.
- `project-manager` remains backward compatible - it can open older engines with their runtimes. New engines no longer require a separate runtime to be downloaded.
- sbt does not support compilation of `module-info.java` files in mixed projects - https://github.com/sbt/sbt/issues/3368
- Which means that we can have `module-info.java` files only for Java-only projects.
- Anyway, we need just a single `module-info.class` in the resulting `runtime.jar` fat jar.
- `runtime.jar` is assembled in `runtime-with-instruments` with a custom merge strategy (`sbt-assembly` plugin). Caching is disabled for custom merge strategies, which means that re-assembly of `runtime.jar` will be more frequent.
- Engine distribution contains multiple JAR archives (modules) in `component` directory, along with `runner/runner.jar` that is hidden inside a nested directory.
- The new entry point to the engine runner is [EngineRunnerBootLoader](https://github.com/enso-org/enso/pull/7991/files#diff-9ab172d0566c18456472aeb95c4345f47e2db3965e77e29c11694d3a9333a2aa), which contains a custom ClassLoader ensuring that everything that does not have to be loaded from a module is loaded from `runner.jar`, which is not a module (a minimal sketch of the idea follows after these notes).
- The new command line for launching the engine runner is in [distribution/bin/enso](https://github.com/enso-org/enso/pull/7991/files#diff-0b66983403b2c329febc7381cd23d45871d4d555ce98dd040d4d1e879c8f3725)
- The [newest version of Frgaal](https://repo1.maven.org/maven2/org/frgaal/compiler/20.0.1/) (20.0.1) does not recognize the `--source 21` option, only `--source 20`.
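A minimal sketch of the boot-loading idea described above (the jar location and class name are assumptions, not the actual `EngineRunnerBootLoader` code):

```java
import java.net.URL;
import java.net.URLClassLoader;
import java.nio.file.Path;

// Illustrative sketch: non-module classes are loaded from runner.jar via a dedicated
// loader, then the real main method is invoked reflectively.
public class BootLoaderSketch {
  public static void main(String[] args) throws Exception {
    URL runnerJar = Path.of("component", "runner", "runner.jar").toUri().toURL(); // path is an assumption
    try (URLClassLoader loader =
        new URLClassLoader(new URL[] {runnerJar}, ClassLoader.getSystemClassLoader())) {
      Class<?> mainClass = loader.loadClass("org.enso.runner.Main"); // class name is an assumption
      mainClass.getMethod("main", String[].class).invoke(null, (Object) args);
    }
  }
}
```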
- Closes #5303
- Refactors `JoinStrategy`, allowing us to 'stack' join strategies on top of each other (to some extent) - a `HashJoin` can now be followed by another join strategy (currently `SortJoin`); see the sketch after this list.
- Adds benchmarks for join
- Due to limitations of the sorting approach, this will still not be as fast as possible for cases where there is more than one `Between` condition in a single query - the benchmarks try to demonstrate that.
- We can replace sorting by d-dimensional [RangeTrees](https://en.wikipedia.org/wiki/Range_tree) to get `O((n + m) log^d n + k)` performance (where `n` and `m` are sizes of joined tables, `d` is the amount of `Between` conditions used in the query and `k` is the result set size).
- Follow-up ticket for consideration later: #8216
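A hedged sketch of what 'stacking' join strategies means here (illustrative interface only, not the engine's actual `JoinStrategy` API):

```java
import java.util.List;

// Illustrative only: each stage refines the candidate row pairs produced by the previous one.
interface JoinStrategy {
  List<int[]> join(List<int[]> candidates);
}

final class StackedJoin implements JoinStrategy {
  private final JoinStrategy first;   // e.g. a hash join on equality conditions
  private final JoinStrategy second;  // e.g. a sort-based join on a Between condition

  StackedJoin(JoinStrategy first, JoinStrategy second) {
    this.first = first;
    this.second = second;
  }

  @Override
  public List<int[]> join(List<int[]> candidates) {
    // The second strategy only sees pairs already accepted by the first one.
    return second.join(first.join(candidates));
  }
}
```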
- Closes #8215
- It turned out that `TreeSet` was problematic (not flexible enough in handling duplicate keys), so the simplest solution was to implement this sub-task right away.
- Closes #8204
- Unrelated, but I ran into this here: adds type checks to other arguments of `set`.
- Before, passing a Column as `new_name` (i.e. mistakenly messing up the order of arguments) led to a hard-to-understand error: ``Method `if_then_else` of type Column could not be found.`` Now it fails with a type error: `expected Text, got Column`.
- Sets the default limit for `Table.read` in Database to a maximum of 1000 rows.
- The limit for the in-memory-compatible API still defaults to `Nothing`.
- Adds a warning if there are more rows than the limit.
- Enables a few unrelated asserts.
* doc
* one test
* date tests
* empty and nothing
* ints floats
* bools
* all columns
* regex and index
* locales
* bad formats
* all with one format
* docs
* examples, not impl db
* docs, more errors
* cleanup
* changelog
* check list
* reorder
* clue
* review
* review
* review
* review
* review
* review
* specify time zone
Added the `Blank_Selector` constructor and applied it to `remove_blank_columns`, `select_blank_columns`, and `filter_blank_rows` for #7931. Changed `when_any` to `when` for readability.
Previously, constant columns were given generated names with UUIDs in them, which are long and provide no information. Instead, we now use the constant value itself to form the name.
Since these new generated names are less unique, we must explicitly make them unique in cases where the caller did not explicitly set a name.
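The uniquification step can be pictured like this (an illustrative sketch, not the actual Table library code):

```java
import java.util.HashSet;
import java.util.Set;

// Illustrative: make generated column names unique by appending a numeric suffix.
final class NameDeduplicator {
  private final Set<String> used = new HashSet<>();

  String makeUnique(String name) {
    if (used.add(name)) return name;     // first occurrence keeps the generated name as-is
    int i = 1;
    String candidate;
    do {
      candidate = name + " " + i++;      // e.g. "[42]", "[42] 1", "[42] 2", ...
    } while (!used.add(candidate));
    return candidate;
  }
}
```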
- Closes #7981
- Adds a `RUNTIME_ERROR` operation to the DB dialect that may be used to 'crash' a query if a condition is met - used to validate whether `lookup_and_replace` invariants are still satisfied when the query is materialized (a sketch of the idea follows this list).
- Removes old `Table_Helpers.is_table` and `same_backend` checks, in favour of the new way of checking this that relies on `Table.from` conversions, and is much simpler to use and also more robust.
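One common way to make a query fail only at materialization time is a conditional cast that cannot succeed; this is an illustrative SQL snippet embedded in Java, not the dialect's actual generated code:

```java
// Illustrative only: the CAST fails at materialization time when the invariant is violated,
// so the check costs nothing unless the offending rows are actually produced.
// Column names here are hypothetical.
String guardedColumn =
    "CASE WHEN duplicate_count > 1 "
        + "THEN CAST('Lookup invariant violated' AS INTEGER) "
        + "ELSE key_column END";
```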
After a discussion, I was really curious about the claim that our panics are supposed to be almost free - while trusting that statement, I found it hard to believe, so I wanted to see for myself, knowing that an experiment is the most robust source of this kind of information.
So I wrote a benchmark comparing various ways of reporting errors, testing them at both 'shallow' and 'deep' stack traces (adding 200 additional frames) - to see how stack depth affects them, if at all.
The panics are indeed blazing fast! Kudos to the engine team. However, it seems that our dataflow errors are relatively slow (and we tend to use them _more_ than panics and want to be using them more and more). This uncovers a possible optimization opportunity: can we make them as fast as panics?
Analysis of the benchmark results in comment below.
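As a rough standalone analogue of the comparison (not the actual Enso benchmark, which runs through the project's benchmark suite), the two styles can be contrasted in plain Java - throwing an exception versus threading an error value back through every frame:

```java
// Illustrative micro-comparison only; no warmup or statistical rigor here.
public class ErrorStyleComparison {
  static final class Failure extends RuntimeException {
    Failure() { super(null, null, false, false); } // no message, no stack-trace capture
  }

  // "Panic"-style: signal the failure by throwing.
  static int throwing(int depth) {
    if (depth == 0) throw new Failure();
    return throwing(depth - 1);
  }

  // "Dataflow"-style: signal the failure by returning a sentinel every frame inspects.
  static Integer returning(int depth) {
    if (depth == 0) return null;
    Integer r = returning(depth - 1);
    return r == null ? null : r + 1;
  }

  public static void main(String[] args) {
    int depth = 200; // matches the "deep stack" variant described above
    long t0 = System.nanoTime();
    for (int i = 0; i < 1_000_000; i++) {
      try { throwing(depth); } catch (Failure e) { /* handled */ }
    }
    long t1 = System.nanoTime();
    for (int i = 0; i < 1_000_000; i++) {
      returning(depth);
    }
    long t2 = System.nanoTime();
    System.out.printf("throw: %d ms, return: %d ms%n",
        (t1 - t0) / 1_000_000, (t2 - t1) / 1_000_000);
  }
}
```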
- Fixes #8049
- Adds tests for handling of Date_Time upload/download in Postgres.
- Adds tests for edge cases of handling of Decimal and Binary types in Postgres.
- Fixes an issue with `to_display_text` on `Date_Time` - it now uses a reversible format.
- Adjusted the auto-width code in the table so it works.
- Manually sized columns should remain fixed size.
- Enabled Excel-style range selection in the table.
- Removed page support, as it is not implemented yet.
# Important Notes
This will need to be applied to the Vue-based visualization at some point as well.
- Follow-up of #8055
- Adds a benchmark comparing the performance of the Enso Map and the Java HashMap in two scenarios - _only-incremental_ updates (like `Vector.distinct`) and _replacing_ updates (like keeping a counter for each key). These benchmarks can be used as a metric for #8090.
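The two update patterns being compared, shown on a plain Java `HashMap` (illustrative only; the benchmark itself lives in the Enso benchmark suite):

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class MapUpdatePatterns {
  public static void main(String[] args) {
    List<String> keys = List.of("a", "b", "a", "c", "b", "a");

    // Only-incremental updates: each key is inserted at most once (like Vector.distinct).
    Map<String, Boolean> seen = new HashMap<>();
    for (String k : keys) seen.putIfAbsent(k, Boolean.TRUE);

    // Replacing updates: existing keys are overwritten repeatedly (like a per-key counter).
    Map<String, Integer> counts = new HashMap<>();
    for (String k : keys) counts.merge(k, 1, Integer::sum);

    System.out.println(seen.keySet()); // the distinct keys, e.g. [a, b, c] (order not guaranteed)
    System.out.println(counts);        // e.g. {a=3, b=2, c=1}
  }
}
```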
- Adds the ability to use numbers, date/time and Boolean values as constants in `set`.
- `Table.set` can take a `Column_Operation`, allowing for deriving of a new column based on other columns.
- Added `Column_Ref` type to refer to a column in `filter`.
- Added some ALIASes.
- Added `sheet` to `Excel_Workbook` to give a familiar API for reading a sheet.
- Added a conversion from range to vector, allowing easy use with Zip.
- Added `Map.from_keys_and_values`.
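The semantics of `Map.from_keys_and_values` can be pictured with a small Java sketch (illustrative only, not the library implementation):

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Illustrative semantics: build a map from parallel key and value lists.
final class FromKeysAndValues {
  static <K, V> Map<K, V> fromKeysAndValues(List<K> keys, List<V> values) {
    if (keys.size() != values.size()) {
      throw new IllegalArgumentException("keys and values must have the same length");
    }
    Map<K, V> result = new LinkedHashMap<>();
    for (int i = 0; i < keys.size(); i++) {
      result.put(keys.get(i), values.get(i));
    }
    return result;
  }
}
```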