Commit Graph

675 Commits

Author SHA1 Message Date
Pavel Marek
bb8ff8f89e
Refactor Base_Tests/src/Data to Test_New (#8890)
Refactor `Base_Tests` to `Test_New` testing framework. Mostly automatic text replacements.

# Important Notes
List of changes that were not done automatically (not via automatic text replacement):
- Fix indexes in Instrumentor_Spec - f590c4a398
- If group or spec is pending, its block is not evaluated - 8d797f1a4a
- Spec_Result is not private - 5767535af2

Tests marked as *pending*:
- #8913
- #8910
2024-01-31 14:25:16 +00:00
GregoryTravis
644c9af579
Attach a warning when Nothing is used in a Filter_Condition (#8865)
Attach a warning when Nothing is used as a value in a comparison or `is_in` Filter_Condition.
2024-01-27 08:45:45 +00:00
Jaroslav Tulach
9a37357247
Binary operator resolution based on that value (#8779) 2024-01-27 08:38:47 +01:00
Radosław Waśko
ca4f98c78e
Adding tests and missing methods for Enso_File. (#8815)
- Closes #8808 - adds tests for various scenarios.
- Implements `size` using HEAD.
- Updates existing functions to changes in Cloud API.
- Adds stubs for `*_time` methods, `parent`, `path`.
- [x] TODO: resolve the `Enso_File.current_working_directory` from an environment variable.
- ~~TODO: recursive directory deletion?~~ left for later

# Important Notes
- Currently, the Cloud API does not offer an easy way to extract metadata for a file, in particular to get the parent folder from the file `id`.
- We should be able to get the parent, and stuff like creation/modified time.
- We need a way to resolve paths to asset ids, for `path` to work as well as `current_working_directory`.
- What is the environment variable that will be used to feed the `current_working_directory` property?
2024-01-26 19:04:42 +00:00
Pavel Marek
d0fdeca6df
Refactor Table_Tests to the builder API (#8622)
Refactor `test/Table_Test` to the builder API. The builder API is in a new library called `Test_New` that is alongside the old `Test` library. There will be follow-up PRs that will migrate the rest of the tests. Meanwhile, let's keep these two libraries, and merge them after the last PR.

# Important Notes
- For a brief introduction into the new API, see **Prototype 1** section in https://github.com/enso-org/enso/pull/8622#issuecomment-1889706168
- When executing all the tests, the behavior should be the same as with the old library. With the only exception that if `ENSO_TEST_ANSI_COLORS` env var is set, the output is more colorful than it used to be.
2024-01-26 12:08:24 +00:00
James Dunkerley
0b6db5797c
Refactor OrderMask to avoid memory copying (#8863)
Goal of this PR is to refactor the design of OrderMask and avoid copying arrays or lists wherever possible.
We have removed a few legacy functions which were not being used.

On a poor mans benchmark seems to be quicker (13s vs 16s) and memory usage should be lower.
2024-01-26 11:16:16 +00:00
GregoryTravis
5eb3f3bd1d
Implement relational NULL semantics for Nothing for in-memory Column operations (#8816)
Updates in-memory table column operations to treat Nothing as a relational NULL.
This PR does not include changes to Table.join.
2024-01-24 17:02:45 +00:00
AdRiley
23d6fcdd9c
Make max of Random methods inclusive rather than exclusive (#8768)
* Make max inclusive rather than exclusive

* Update names to min max. Remove end_inclusive

* Indicies should remain exclusive

* Make the tests pass
2024-01-24 16:58:33 +00:00
Radosław Waśko
edfcfde11c
Tests and improvements for secrets in cloud subdirectories (#8791)
- Closes #8723
- Adds some missing features that were needed to make this work:
- `Enso_File.create_directory` and `Enso_File.delete`, and basic tests for it
- Changes how `Enso_Secret.list` is obtained - using a different Cloud endpoint allows us to implement the desired logic, the default endpoint was giving us _all_ secrets which was not what we wanted here.
- Implements `Enso_Secret.update` and tests for it

# Important Notes
Notes describing any problems with the current Cloud API:
https://docs.google.com/document/d/1x8RUt3KkwyhlxGux7XUGfOdtFSAZV3fI9lSSqQ3XsXk/edit

Apparently, everything that was needed to make this feature work has already been implemented, although a few features needed workarounds on Enso side to work properly.
2024-01-24 10:17:22 +00:00
Radosław Waśko
368e4867b4
Allow secrets in AWS_Credential (#8774)
- Closes #8722
2024-01-19 19:00:56 +00:00
Radosław Waśko
14be36c401
Allow secrets in Header.authorization_* (#8761)
- Closes #8739
2024-01-18 12:49:47 +00:00
James Dunkerley
d55c9c99b4
First few changes from the Churn workflow (#8782)
- `up_to` should have the `step` as an optional argument.
- `Date_Range` conversion can do a clever auto rename, so if a period use the name of it.
- Add `to Table` for `Range`, `Date_Range`.
2024-01-17 21:57:24 +00:00
AdRiley
ac0d4c9f5f
Make Random.Seed private. Remove unused TEXT_ONLY (#8783)
Random.Seed doesn't work in the GUI and the TEXT_ONLY tag doesn't do anything so it was incorrectly showing up in the component browser.

This MR makes Random.Seed private to hide it from the GUI and completely removes the TEXT_ONLY tag which is unused and unimplemented.
2024-01-17 17:15:51 +00:00
Radosław Waśko
583345d8f2
Allow Test asserts to be run outside of the test suite, still raising sensible Panics (#8778)
- Fixes #5962 by defaulting to no Clue if run outside of tests, ensuring that the assertions still throw a sensible panic.
2024-01-16 17:17:20 +00:00
AdRiley
ef7b11fb67
Add new aliases for cross_join, info and drop (#8773)
Adds new aliases
append -> cross_join
metadata -> info
skip -> drop
2024-01-16 16:08:54 +00:00
GregoryTravis
f2cb1f097e
Support on_problems=Problem_Behavior.Report_Warning and Map_Error wrapping in Vector.map (#8595)
Implements `Warnings.get_all wrap_errors=True` which wraps warnings attached to values inside vectors with `Map_Error`, which includes the position of the value within the vector. See [the documentation](https://github.com/enso-org/enso/blob/develop/docs/semantics/wrapped-errors.md) for more details.

`get_all wrap_errors=True` does not change the warnings that are attached to values -- it wraps them before returning them to the caller, but does not change the original warnings attached to the values.

Wrapped warnings only appear attached to the vector itself. The values inside the vector do not have their warnings wrapped.

Warning propagation is not changed at all; `Warnings.get_all` (with default `wrap_errors=False`) behaves as before. `get_all wrap_errors=True` is meant to be used primarily by the IDE, although it can be used anywhere this wrapping is desired.
2024-01-16 09:36:22 +00:00
AdRiley
b8e93b3cba
Add new text_left and text_right functions (#8691)
Added text_left and text_right functions for in-memory and databases
2024-01-15 23:43:23 +00:00
Radosław Waśko
5b70ff25f7
Remove set_user_info from URI (#8738)
I have added this in #8591, but I have realised it may not be a good idea to have it, so I am removing that particular change.
2024-01-15 17:35:17 +00:00
Radosław Waśko
f34abeda0c
Add tests for Enso_Secrets, update to new cloud API (#8736)
- Closes #8556
2024-01-15 16:12:08 +00:00
Pavel Marek
6ae35abc46
Fix Runtime.assert (#8742) 2024-01-12 18:47:40 +01:00
AdRiley
1b3c9638ea
Make fill nothing return types tighter (#8734)
This is the follow up PR addressing the last couple of points from https://github.com/enso-org/enso/pull/8643 around what the return type from fill_nothing.

# Important Notes
The biggest change is changing what we size we need for an empty string. This change says a variable length string of length 1 and does it at a low enough level that it will effect the whole language. But I think that is correct.
2024-01-12 11:20:36 +00:00
Pavel Marek
428e83de36
Remove org.bouncycastle dependency (#8664)
Remove `org.bouncycastle` dependency from `org.enso.runtime`.
2024-01-04 17:16:41 +01:00
GregoryTravis
e6ae366917
Ignore User-Agent in HTTP tests (#8630)
`User-Agent` depends on the precise JDK version, so we ignore it when checking HTTP responses.
Clsoes #8629.
2024-01-04 15:53:17 +00:00
AdRiley
bf8dd1888c
Give file read its own helper widget for delimiters. (#8627)
Give file read its own helper widget for delimiters. Remove newline add none. The file read delimiter is similar but different to the split one and so should have its own set of options.
2024-01-04 11:59:42 +00:00
Radosław Waśko
a1207e029d
Unify File_Format_Metadata with File_For_Read (#8628)
- Closes #8555
- Refactors the file format detection logic, compacting lots of repetitive logic for HTTP handling into helper functions.
- Some updates to CODEOWNERS.
2024-01-04 03:57:05 +00:00
AdRiley
689c8f7c3c
Make split to rows of Nothing value equal Nothing. (#8640)
Split to rows of Nothing value should equal Nothing.

Add some additional test cases. And updated existing to help readability
2024-01-03 12:09:35 +00:00
AdRiley
ec51127635
Change null to Nothing (#8637)
Change the generated column name for is_nothing to "[a] is Nothing" from "[a] is null" as Nothing is our customer facing term.
2023-12-28 18:02:23 +00:00
Jaroslav Tulach
07d58f2c02
DataflowError.withoutTrace shall not store a trace (#8608) 2023-12-24 11:07:32 +01:00
Radosław Waśko
b3de42eb23
Handle Nothing values in Filter_Condition.to_predicate (#8600)
- Fixes #8549
- Ensures that a `Type_Error` is thrown instead of a `No_Such_Method` error on type mismatches.
- I think this is more readable.
2023-12-21 19:17:55 +00:00
Radosław Waśko
d41d48e8a0
Merge URI_With_Query into URI, extend API of URI (#8591)
- Closes #8544
- Adds `reset_query_arguments` and `/` operators allowing to transform a URI.
- Adding tests for handling of various edge cases.
2023-12-21 18:39:26 +00:00
AdRiley
cfe0cbe0c1
Add text_length to column for in-memory and database (#8606)
Closes #8521
Adds text_length to Column
2023-12-21 11:31:13 +00:00
Radosław Waśko
dfdb547616
Better context info in Type_Error raised from return type checks (#8566)
- Followup to #8502 that adds better error messages
2023-12-20 18:22:47 +00:00
Radosław Waśko
d56b800c11
Remove the Apache dependency from std-base (#8571)
- After [suggestion](https://github.com/enso-org/enso/pull/8497#discussion_r1429543815) from @JaroslavTulach I have tried reimplementing the URL encoding using just `URLEncode` builtin util. I will see if this does not complicate other followup improvements, but most likely all should work so we should be able to get rid of the unnecessary bloat.
2023-12-20 18:01:08 +00:00
James Dunkerley
2e9bd86854
Small linting fixes. (#8592) 2023-12-20 17:25:43 +00:00
Cassandra-Clark
232077f25e
Renamed lookup_and_replace to merge and renamed Table.replace to text… (#8564) 2023-12-20 16:28:45 +00:00
Pavel Marek
4cb2439890
Submodules can be private (#8581) 2023-12-19 19:13:44 +01:00
Radosław Waśko
724f8d2a56
Add tests for Enso Cloud auth + simple API mock for Enso_User (#8511)
- Closes #8354
- Extends `simple-httpbin` with a simple mock of the Cloud API (currently it checks the token and serves the `/users` endpoint).
- Renames `simple-httpbin` to `http-test-helper`.
2023-12-19 17:41:09 +00:00
James Dunkerley
4f3accb27b
Fix Table viz error (#8563)
![image](https://github.com/enso-org/enso/assets/4699705/d5ba80d1-bc1b-4509-8066-406fc72928da)

Fixes issue with Nothing vector.
2023-12-18 10:46:28 +00:00
Radosław Waśko
d4714af826
Add a few new Filter_Conditions (#8539)
- Closes #8045
2023-12-16 15:12:23 +00:00
Radosław Waśko
9428d12a1e
Fixes Date_Diff widget (#8561)
The widget for `Date_Diff` was using wrong old name and thus did not work properly on one of the arguments.

Before it worked for second argument (`end`) but did not work for `input`:
![image](https://github.com/enso-org/enso/assets/1436948/ef7556db-9518-4854-b7b9-d423f1e6421b)
![image](https://github.com/enso-org/enso/assets/1436948/670a757b-fbed-4fe6-bdf6-13aa46d81aac)

Afterwards it works there too:
![image](https://github.com/enso-org/enso/assets/1436948/5afb88fb-55a0-48bf-9300-8604c672bde3)
2023-12-16 15:04:35 +00:00
Radosław Waśko
940b8f7d51
Improving tests and edge cases for URI and HTTP (#8497)
- Closes #8352
- ~~Proposed fix for #8493~~
- The temporary fix is deemed not viable. I will try to figure out a workaround and leave fixing #8493 to the engine team.
2023-12-15 17:58:45 +00:00
James Dunkerley
9e27b6487b
Minor fixes and tweak for Cloud APIs. (#8557)
- Fix secret to at least be working again
- Tweak to allow a MIMIC flow to work with value types (revisit in 2024).
2023-12-15 17:10:07 +00:00
Radosław Waśko
b5c995a7bf
Reworking Excel support to allow for reading of big files (#8403)
- Closes #8111 by making sure that all Excel workbooks are read using a backing file (which should be more memory efficient).
- If the workbook is being opened from an input stream, that stream is materialized to a `Temporary_File`.
- Adds tests fetching Table formats from HTTP.
- Extends `simple-httpbin` with ability to serve files for our tests.
- Ensures that the `Infer` option on `Excel` format also works with streams, if content-type metadata is available (e.g. from HTTP headers).
- Implements a `Temporary_File` facility that can be used to create a temporary file that is deleted once all references to the `Temporary_File` instance are GCed.
2023-12-15 00:02:15 +00:00
Radosław Waśko
7a05e679c3
Improve details attached to No_Output_Columns reported from various operations (#8528)
- Closes #7635
2023-12-14 10:49:07 +00:00
GregoryTravis
1c815a3d45
Better Error Trapping in map (#8307)
* tests

* wip

* wip

* additional warnings

* wip

* wip

* cleanup

* nested wrapping

* multiple nestings

* wraps_error uses looks_for, test for should_fail_with

* wip

* stack trace line fix

* use catch_primitive internally

* fix warning mapping, dtf spec

* just one wrapper checker, vector spec

* missing ctor, back to non-primitive catch

* back to c_p

* put old map back

* wip

* unnest tests

* Array.map on_problems

* wip

* Revert "wip"

This reverts commit c30d171457.

* better test names

* warning logging

* wip

* wip

* move logic into ALH

* doc

* constant

* My_Error.Error

* nested

* doc

* map_primtiive in warning mapper

* composition

* ref spec

* Remove warnings prior to matching on the value

If an expression has warnings and is matched we:
1) extract the warnings
2) execute the branch of a pattern that matches the value
3) attach extracted warnings to the result

This caused warnings to reappear when doing the custom warnings
manipulation.
This is also consistent with how `CaseNode`'s `doWarning` specialization
is defined.

* fix 1

* do not auto unwrap in test error checkers

* nested error matcher

* in problems too

* dtf

* v

* statistics

* wip

* Table_Spec, map_with_index_primitive

* Column_Operations_Spec

* disable warning wrapping and Report_Warning

* unimpl test

* Warnings_Spec

* DCS

* ACG JP

* zip_primitive

* join_helpers

* Lookup_Helpers

* Table

* Data_Formatter

* Value_Type_Helpers

* revert check types changes

* table_helpers

* table tests

* remove st

* do not remove warnings from value

* vec docs, tests for zip, mwi, flat_map

* docs, fixes

* remove nested_error_matcher

* cleanup

* benchmark

* one error

* alter

* add bench to main

* review

* review

* review

* tail call

* changelog

* tail call was not a tail call

* ws

* bad import

* Added missing import

* Update distribution/lib/Standard/Base/0.0.0-dev/src/Data/Array.enso

Co-authored-by: Radosław Waśko <radoslaw.wasko@enso.org>

* review, ref example

* lazy benchmark data

* extra paren

* check outside of catch

* review

* vector too

* actually lazy

* disambiguate Map_Error

* finish rename

* move to extensions

* combine Additional_Warnings error

* rename to map_no_wrap

* do not catch and rethrow

* review

* wip

* remove _primitives entirely

* remove unused should_fail_with function options

* remove expected_warning as function in Problems

---------

Co-authored-by: Hubert Plociniczak <hubert.plociniczak@gmail.com>
Co-authored-by: Radosław Waśko <radoslaw.wasko@enso.org>
2023-12-13 09:38:09 -05:00
somebody1234
3419c77bc6
Fix typo in Match.enso (#8515)
Fix issue causing visualizations to break:
![image](https://github.com/enso-org/enso/assets/4046547/d63239e2-3fde-424a-ba6c-3e163c6ce983)

# Important Notes
None
2023-12-12 17:59:44 +00:00
Hubert Plociniczak
a11bddcb74
Inline execution should support FQNs (#8454)
* Test illustrating problems with FQNs

Inline execution fails with `Compile error: The name `Standard` could
not be found.`.

* Ensure InlineContext carries Package Repos info

Previously, there was no requirement that inline execution should allow
for FQNs. This meant that the omission of Package Repository info went
unnoticed.

In order to be able to refer to `Standard.Visualization.Preprocessor` it
has to be exported as well.
2023-12-06 10:03:06 +01:00
Radosław Waśko
c685180132
Fix nesting of Test.group (#8439)
- Closes #8430 by ensuring that specs within a nested group are re-attached to the parent group.
2023-12-06 08:00:38 +00:00
Jaroslav Tulach
7f0cb88fa1
Consistent simple and qualified type name (#8448)
Fixes #8255 by unifying `get_qualified_type_name` and `get_simple_type_name` implementations.
2023-12-06 04:30:24 +00:00
Radosław Waśko
c6b6384fe6
Improve performance of anti-join (#8338)
- Closes #8217
2023-11-24 02:44:57 +00:00
James Dunkerley
f60836d9e1
Apply ICONs (#8360)
- Amend a couple of missed groups.
- Add the first pass of some ICONs.

The linter tool has been updated to support rewriting the ICON as well.
2023-11-22 15:24:16 +00:00
James Dunkerley
347b5a7cf5
Linting and Groups update (#8357)
- Fix issues from the linter.
- Rename the constructors for `Blank_Selector`.
- Update various GROUP tags.
2023-11-21 18:12:27 +00:00
James Dunkerley
ecaca12df1
Integrating Enso Cloud with the libraries (part 1...) (#8006)
- Add a `File_For_Read` type. Used for `File_Format` to read files.
- Added `Enso_User` representing the current user in `Enso_Cloud`.
- *Will be later able to list known users.*
- Added `Enso_Secret` representing a value defined in `Enso_Cloud`.
- Value not used within Enso only accessed within polyglot Java.
- Integrated into `Username_And_Password` and can be used within JDBC connections.
- Integrated into HTTP Headers so a secret can be used as a value.
- New `URI_With_Query` with the same API as `URI`. Supporting secrets in the value.
- *Will be integrated with AWS credentials.*
- Added `Enso_File` representing a file or a folder in the cloud.
- Support the same API as `File` (like the `S3_File`).
- *Will support `enso://` URI style access.*
2023-11-20 23:21:14 +00:00
Jaroslav Tulach
1138dfe147
Specify expression to get more advanced results on_return callback (#8331) 2023-11-20 18:47:11 +01:00
Jaroslav Tulach
ba19813511
Speeding up "hello world" example by 16% 2023-11-19 16:38:31 +01:00
Pavel Marek
5a7ad6bfe4
Upgrade enso to GraalVM for jdk 21 (#7991)
Upgrade to GraalVM JDK 21.
```
> java -version
openjdk version "21" 2023-09-19
OpenJDK Runtime Environment GraalVM CE 21+35.1 (build 21+35-jvmci-23.1-b15)
OpenJDK 64-Bit Server VM GraalVM CE 21+35.1 (build 21+35-jvmci-23.1-b15, mixed mode, sharing)
```

With SDKMan, download with `sdk install java 21-graalce`.

# Important Notes
- After this PR, one can theoretically run enso with any JRE with version at least 21.
- Removed `sbt bootstrap` hack and all the other build time related hacks related to the handling of GraalVM distribution.
- `project-manager` remains backward compatible - it can open older engines with runtimes. New engines now do no longer require a separate runtime to be downloaded.
- sbt does not support compilation of `module-info.java` files in mixed projects - https://github.com/sbt/sbt/issues/3368
- Which means that we can have `module-info.java` files only for Java-only projects.
- Anyway, we need just a single `module-info.class` in the resulting `runtime.jar` fat jar.
- `runtime.jar` is assembled in `runtime-with-instruments` with a custom merge strategy (`sbt-assembly` plugin). Caching is disabled for custom merge strategies, which means that re-assembly of `runtime.jar` will be more frequent.
- Engine distribution contains multiple JAR archives (modules) in `component` directory, along with `runner/runner.jar` that is hidden inside a nested directory.
- The new entry point to the engine runner is [EngineRunnerBootLoader](https://github.com/enso-org/enso/pull/7991/files#diff-9ab172d0566c18456472aeb95c4345f47e2db3965e77e29c11694d3a9333a2aa) that contains a custom ClassLoader - to make sure that everything that does not have to be loaded from a module is loaded from `runner.jar`, which is not a module.
- The new command line for launching the engine runner is in [distribution/bin/enso](https://github.com/enso-org/enso/pull/7991/files#diff-0b66983403b2c329febc7381cd23d45871d4d555ce98dd040d4d1e879c8f3725)
- [Newest version of Frgaal](https://repo1.maven.org/maven2/org/frgaal/compiler/20.0.1/) (20.0.1) does not recognize `--source 21` option, only `--source 20`.
2023-11-17 18:02:36 +00:00
GregoryTravis
ea3d778456
Allow the creation of a constant column on an in-memory table with no rows. (#8218) 2023-11-09 14:40:51 +00:00
GregoryTravis
6be94a854b
Implement truncate Date_Time for database backend (#8235)
Also adds some checks for column names generated for floor, ceil, truncate, round.
2023-11-08 23:23:59 +00:00
Radosław Waśko
1b8b30a68d
Improve performance of Join_Condition.Between by sorting on one dimension (#8212)
- Closes #5303
- Refactors `JoinStrategy` allowing us to 'stack' join strategies on top of each other (to some extent) - currently a `HashJoin` can be followed by another join strategy (currently `SortJoin`)
- Adds benchmarks for join
- Due to limitations of the sorting approach this will still not be as fast as possible for cases where there is more than 1 `Between` condition in a single query - trying to demonstrate that in benchmarks.
- We can replace sorting by d-dimensional [RangeTrees](https://en.wikipedia.org/wiki/Range_tree) to get `O((n + m) log^d n + k)` performance (where `n` and `m` are sizes of joined tables, `d` is the amount of `Between` conditions used in the query and `k` is the result set size).
- Follow up ticket for consideration later:
#8216
- Closes #8215
- After all, it turned out that `TreeSet` was problematic (because of not enough flexibility with duplicate key handling), so the simplest solution was to immediately implement this sub-task.
- Closes #8204
- Unrelated, but I ran into this here: adds type checks to other arguments of `set`.
- Before, putting in a Column as `new_name` (i.e. mistakenly messing up the order of arguments), lead to a hard to understand `Method `if_then_else` of type Column could not be found.`, instead now it would file with type error 'expected Text got Column`.
2023-11-08 12:59:55 +00:00
Radosław Waśko
2ce1567384
Limit max_rows that are downloaded in Table.read by default, and warn if more rows are available (#8159)
- Sets the default limit for `Table.read` in Database to be max 1000 rows.
- The limit for in-memory compatible API still defaults to `Nothing`.
- Adds a warning if there are more rows than limit.
- Enables a few unrelated asserts.
2023-11-06 16:41:47 +00:00
James Dunkerley
a850ecb787
Change widget for Order By. (#8226)
Simplifies the Order By drop and always adds as a `Sort_Column`. If user wants to do descending is a two step process but feels more natural.
2023-11-06 16:27:36 +00:00
Radosław Waśko
237aae33c7
Simplify internal logic of Table.order_by, avoid unnecessary warning (#8221)
- Fixes #8213
2023-11-06 11:00:01 +00:00
GregoryTravis
3c371adbef
Implement Table.format similar to Table.parse allowing to format columns in bulk (#8150)
* doc

* one test

* date tests

* empty and nothing

* ints floats

* bools

* all columns

* regex and index

* locales

* bad formats

* all with one format

* docs

* examples, not impl db

* docs, more errors

* cleanup

* changelog

* check list

* reorder

* clue

* review

* review

* review

* review

* review

* review

* specify time zone
2023-11-02 09:36:36 -04:00
Cassandra-Clark
b5d6628c57
Change filter_blank_rows when_any parameter to have a more user-friendly type (#7935)
Added Blank_Selector constructor and applied to remove_blank_columns, select_blank_columns, filter_blank_rows for #7931 . Changed when_any to when for readability.
2023-11-01 16:51:15 +00:00
GregoryTravis
d467683ed1
Constant columns (in expressions and Column_Operations) should have clearer names (#8188)
Previously, constant columns were given generated names with UUIDs in them, which are long and provide no information. Instead, we now use the constant value itself to form the name.

Since these new generated names are less unique, we must explicitly make them unique, in cases where the caller did not explicilty set a name.
2023-11-01 14:41:03 +00:00
GregoryTravis
1480f50207
Overhaul the random number and item generation code (#8127)
Rewrite most of Random.enso.
2023-10-31 15:25:37 +00:00
Radosław Waśko
79011bd550
Implement Table.lookup_and_replace in Database (#8146)
- Closes #7981
- Adds a `RUNTIME_ERROR` operation into the DB dialect, that may be used to 'crash' a query if a condition is met - used to validate if `lookup_and_replace` invariants are still satisfied when the query is materialized.
- Removes old `Table_Helpers.is_table` and `same_backend` checks, in favour of the new way of checking this that relies on `Table.from` conversions, and is much simpler to use and also more robust.
2023-10-31 15:19:55 +00:00
Radosław Waśko
0c278391fe
Test and improve handling of Date_Time with_timezone=False in Postgres (#8114)
- Fixes #8049
- Adds tests for handling of Date_Time upload/download in Postgres.
- Adds tests for edge cases of handling of Decimal and Binary types in Postgres.
2023-10-21 21:35:13 +00:00
Radosław Waśko
8172896065
Support Previous_Value in fill_nothing and fill_missing (#8105)
- Adds `Previous_Value` to `fill_nothing` and `fill_empty`, as requested by #7192.
2023-10-20 13:18:53 +00:00
Kaz Wesley
2edd2bd7ff
Ensure all spans have document offsets (#8039)
- Validate spans during existing lexer and parser unit tests, and in `enso_parser_debug`.
- Fix lost span info causing failures of updated tests.

# Important Notes
- [x] Output of `parse_all_enso_files.sh` is unchanged since before #7881 (modulo libs changes since then).
- When the parser encounters an input with the first line indented, it now creates a sub-block for lines at than indent level, and emits a syntax error (every indented block must have a parent).
- When the parser encounters a number with a base but no digits (e.g. `0x`), it now emits a `Number` with `None` in the digits field rather than a 0-length digits token.
2023-10-19 12:36:42 +00:00
James Dunkerley
ad95dc9815
Fixes for Table viz (#8102)
- Fixes issue with `to_display_text` on `Date_Time`. Now a reversible format.
- Adjusted the auto width code on table so it works.
- Manually sized columns should remain fixed size.
- Enable Excel range style selection in table.
- Removed page support as not implemented yet.

# Important Notes
Will need to apply to Vue based viz at some point as well.
2023-10-19 08:22:23 +00:00
GregoryTravis
7383db0e04
Restructuring XML into Table form (#8083)
# Important Notes
Adds `.to Table` support, as well as XML support for `expand_column`.
2023-10-19 07:02:48 +00:00
Radosław Waśko
28fc183f92
Review places where we can use Column_Ref (#8101)
Closes #8046
2023-10-18 19:03:50 +00:00
Radosław Waśko
93a31fcc8b
Add benchmarks related to add_row_number performance investigation (#8091)
- Follow-up of #8055
- Adds a benchmark comparing performance of Enso Map and Java HashMap in two scenarios - _only incremental_ updates (like `Vector.distinct`) and _replacing_ updates (like keeping a counter for each key). These benchmarks can be used as a metric for #8090
2023-10-18 17:21:59 +00:00
Radosław Waśko
e9fa12763e
Improve performance of add_row_number (#8076)
Fixes #8055
2023-10-17 00:42:35 +00:00
Radosław Waśko
08b717eb54
Refactor Table problem handling to a more robust and hopefully cleaner approach (#7879)
Closes #7514
2023-10-16 15:09:08 +00:00
GregoryTravis
f18d1323e1
Add Table.expand_to_rows to allow flattening vector and array values in table (#8042)
# Important Notes
Also includes a fix for a reallocation bug in `InferredBuilder`.
2023-10-13 20:54:06 +00:00
James Dunkerley
b7d7910a88
Use US locale for Date/Time parsing by default. (#8053)
Fixes issue with parsing long format month names.
2023-10-13 17:47:14 +00:00
James Dunkerley
fac9e7a420
Expand capabilities of Table.set and better dropdown support, (#8005)
- Adds the ability to use numbers, date/time and Boolean values as constants in `set`.
- `Table.set` can take a `Column_Operation`, allowing for deriving of a new column based on other columns.
- Added `Column_Ref` type to refer to a column in `filter`.
2023-10-13 16:03:28 +00:00
Jaroslav Tulach
31624b5314
Helper method to turn iterator to a Vector (#8035) 2023-10-12 17:50:51 +02:00
James Dunkerley
0dcfc3e9bf
Minor improvements from last couple of Book Clubs (#8034)
- Added some ALIASes.
- Added `sheet` to `Excel_Workbook` to give familiar API to read sheet.
- Added conversion from range to vector allowing easy use with Zip.
- Add `Map.from_keys_and_values.
2023-10-12 14:29:59 +00:00
Radosław Waśko
cd84ac16ce
Restructure Table.from_objects to use conversions (#8020)
Closes #7957
2023-10-11 22:25:18 +00:00
Jaroslav Tulach
7ae0971d11
Runtime checks of signatures with polyglot java classes (#8016) 2023-10-11 15:05:36 +02:00
Radosław Waśko
6f78570115
Fix a DROP table bug, add SQL debug logging (#8007)
- Fixes a bug where creating a temporary table could accidentally issue a `DROP` statement of the table name that the user provided, risking destruction of user data.
- Fortunately, the bad scenario was almost impossible, because the `DROP` statement was only issued _if_ we previously checked that the mentioned table _does not exist_ - dropping a nonexistent table does not do any harm.
- It could have been dangerous in a very unlikely scenario that a table was created just between the _existence check_ and the _drop_.
- After the fix the existence check and any modifications are done within a transaction to avoid interference from concurrent modifications, and the DROP is correctly applied to a temporary Enso table instead of the original one.
- Replaced a temporary log with proper simple logging of SQL statements into a file, if an Environment variable is set.
- Used that feature to test that no unexpected statements occur.
2023-10-10 13:16:06 +00:00
Radosław Waśko
6e0bd86753
Implement Table.lookup_and_replace for in-memory (#7979)
- Closes #7749 implementing the in-memory logic.
- Additional complications have surfaced regarding the Database logic, so it has been split off into a separate ticket: #7981
2023-10-10 10:42:06 +00:00
Jaroslav Tulach
a234e82ee9
Instrumenter to observe behavior of nodes with UUID (#7833)
Exposing instrumentation capabilities to Enso. Fixes #7683.
2023-10-10 02:36:59 +00:00
Pavel Marek
4db61210c0
Exporting non-existing symbols fails with compiler error (#7960)
Adds a new compiler analysis pass that ensures that all the symbols exported via `export ...` or `from ... export ...` statements exist. If not, generates an IR Error.

# Important Notes
We already have such a compiler pass for the version with imports in [ImportSymbolAnalysis.scala](https://github.com/enso-org/enso/blob/develop/engine/runtime/src/main/scala/org/enso/compiler/pass/analyse/ImportSymbolAnalysis.scala)
2023-10-09 09:48:43 +00:00
GregoryTravis
a90af9f844
Cache filtered children in XML (#7998)
We currently scan the entire child list for each call to children, num_children, and get_child_element.

Instead, we should cache the filtered child list on first access.
2023-10-06 19:44:01 +00:00
GregoryTravis
9ba7be20af
Basic XML support (#7947)
This PR includes
* Reading XML from a file, stream, or string
* Reading XML via Data.fetch
* Accessing the root element, element children, and attributes
* Accessing tag text contents
* Get tags by name
* Inner / Outer XML string
2023-10-06 17:52:19 +00:00
Radosław Waśko
0cd446432f
Fix inconsistency when building a Mixed column, fixes to Union (#7919)
- Fixes #7352 by remembering original value types in type inference mode to be able to reconstruct them for Mixed.
   - Added more benchmarks for comparing performance of constructing columns.
- Fixes missing implementations that caused `Table.union` crashing on some type pairs.
- Ensures that `Loss_Of_Integer_Precision` warning is not swallowed when numeric columns are unioned to create a `Float` column.
- Adds test for all of the above cases.
- Allow to output benchmark results to a CSV by setting an environment variable - useful for quickly comparing benchmarks, e.g. in Enso.
2023-10-03 20:33:34 +02:00
Radosław Waśko
08cd449a99
Fix NumberParser to avoid thousandSeparator==decimalPoint and prefer US decimal format (#7946)
Closes #7930
2023-10-03 20:07:54 +02:00
Radosław Waśko
3222e5af62
Avoid exponential growth of column names (#7934)
- Fixes #7933
- Avoids duplicating `]` as `]]` in generated column names - now column names grow linearly.
2023-10-02 16:05:28 +00:00
Jaroslav Tulach
6a59fa5e93
Meta.Type.find a type by FQN (#7885) 2023-10-02 17:02:00 +02:00
James Dunkerley
fb50eb7595
Using conversions in a few places (#7859)
- Shuffle a few `from`s into correct places:
- `Day_Of_Week.from` removing `Day_Of_Week_From` module.
- Adding short cut for `http` and `https` in `Data.read` so it calls onto `Data.fetch` giving a single entry point.
- Moved `URI` extensions from `Standard.Base.Data` module into `Standard.Base.Network.Extensions`.
- Added `post` extension for `URI`.
- Added `contains_key` to `JS_Object`.
- Restored `into` in `JS_Object`:
- Follows old logic populating a constructor.
- Will use conversion from `JS_Object` if present.
- Added automatic deserialization of `Date`, `Time_Of_Day` and `Date_Time` from JSON.
- Uses conversion from `JS_Object`.
- Added conversion from `Text` to a `HTTP_Method` and type checking where `HTTP_Method` used in public APIs.
- Added support for `Date`, `Time_Of_Day` and `Date_Time` in `Table.from_objects`.
- Added `expand_column` to `Table` to expand `JS_Object` to values.
- Add type checking for `Table` in `right` arguments (allowing `Column`s to be used).
- Use type checking in `Table.set` to allow for conversion to a `Column`.
- Remove some unused imports.
- Fix for bug in S3 edge case.
2023-10-02 14:54:22 +00:00
Cassandra-Clark
d7258abbf5
Add Text.substring to allow for an easy short hand of Text.take (start.up_to end) (#7913)
* Add Text.substring function and get_position helper function

For #7876 adds a Text.substring function which supports negative indexes and returns a part of a string from 0-based index 'start' and continuing for 'length'

* added substring and simplified get function

For #7876 adds a Text.substring function which supports negative indexes and returns a part of a string from 0-based index 'start' and continuing for 'length'.

Also simplified get function as it looped unnecessarily.

* Update distribution/lib/Standard/Base/0.0.0-dev/src/Data/Text/Extensions.enso

punctuation corrections

Co-authored-by: GregoryTravis <greg.m.travis@gmail.com>

* Update Text_Spec.enso

Added test for start index larger than string

* Update distribution/lib/Standard/Base/0.0.0-dev/src/Data/Text/Extensions.enso

updated Arguments: section to use consistent style

Co-authored-by: Radosław Waśko <radoslaw.wasko@enso.org>

* Update distribution/lib/Standard/Base/0.0.0-dev/src/Data/Text/Extensions.enso

updated Index_Out_Of_Bounds error to reference cached length

Co-authored-by: Radosław Waśko <radoslaw.wasko@enso.org>

* Removed Slice label and added changelog entry

* Re-added slice tag to substring

Per conversation with James, added slice back to substring

* Update CHANGELOG.md add link

---------

Co-authored-by: GregoryTravis <greg.m.travis@gmail.com>
Co-authored-by: Radosław Waśko <radoslaw.wasko@enso.org>
2023-09-29 10:57:57 -06:00
Pavel Marek
9f15b90caa
Implement Enso-specific assert (#7883)
Implement Enso-specific assert - `Runtime.assert` that works like asserts in any other runtime.

# Important Notes
- Enso-specific assertions are enabled when JVM assertions are enabled, or when `ENSO_ENABLE_ASSERTIONS` env var is not empty (See 72cd8361cb/engine/runtime/src/main/java/org/enso/interpreter/runtime/EnsoContext.java (L139))
2023-09-29 14:46:58 +00:00
Jaroslav Tulach
ffa036411d
Make sure Integer can be treated as Float (#7909) 2023-09-28 11:56:54 +02:00
Radosław Waśko
8d926166ea
Follow up improvements to Date_Time_Formatter (#7875)
- Closes #7872
- Also closes #7866
2023-09-28 09:38:00 +00:00
Radosław Waśko
c690559ec4
Implement auto_value_type operation (#7908)
Closes #6113
2023-09-27 15:45:34 +00:00
GregoryTravis
b03712390c
Improve HTTP tests (#7847)
* simple-httpbin encodes response using the Content-encoding header value
* Return sent body verbatim
2023-09-27 14:02:32 +00:00
Hubert Plociniczak
18b2491a41
Always log to console and file (#7825)
* Always log verbose to a file

The change adds an option by default to always log to a file with
verbose log level.
The implementation is a bit tricky because in the most common use-case
we have to always log in verbose mode to a socket and only later apply
the desired log levels. Previously socket appender would respect the
desired log level already before forwarding the log.

If by default we log to a file, verbose mode is simply ignored and does
not override user settings.

To test run `project-manager` with `ENSO_LOGSERVER_APPENDER=console` env
variable. That will output to the console with the default `INFO` level
and `TRACE` log level for the file.

* add docs

* changelog

* Address some PR requests

1. Log INFO level to CONSOLE by default
2. Change runner's default log level from ERROR to WARN

Took a while to figure out why the correct log level wasn't being passed
to the language server, therefore ignoring the (desired) verbose logs
from the log file.

* linter

* 3rd party uses log4j for logging

Getting rid of the warning by adding a log4j over slf4j bridge:
```
ERROR StatusLogger Log4j2 could not find a logging implementation. Please add log4j-core to the classpath. Using SimpleLogger to log to the console...
```

* legal review update

* Make sure tests use test resources

Having `application.conf` in `src/main/resources` and `test/resources`
does not guarantee that in Tests we will pick up the latter. Instead, by
default it seems to do some kind of merge of different configurations,
which is far from desired.

* Ensure native launcher test log to console only

Logging to console and (temporary) files is problematic for Windows.
The CI also revealed a problem with the native configuration because it
was not possible to modify the launcher via env variables as everything
was initialized during build time.

* Adapt to method changes

* Potentially deal with Windows failures
2023-09-26 11:32:04 +02:00
Radosław Waśko
12c4f2981d
More robust Date/Time format patterns parsing (#7826)
- Closes #7461 by introducing a `Date_Time_Formatter` type and making parsing date time formats more robust and safer.
- The default ('simple') set of patterns is slightly simplified and made case insensitive (except for `M/m` and `H/h`) to avoid the `YYYY` vs `yyyy` issues and make it less error prone.
- The `YYYY` now has the same meaning as `yyyy` in simple mode. The old meaning (week-based year) is moved to a _separate mode_, triggered by `Date_Time_Formatter.from_iso_week_date_pattern`.
- Full Java syntax, as well as custom-built Java `DateTimeFormatter` can also be used by `Date_Time_Formatter.from_java`.
- Text-based constants (e.g. `ISO_ZONED_DATE_TIME`) have now become methods on `Date_Time_Formatter`, e.g. `Date_Time_Formatter.iso_zoned_date_time`).
2023-09-22 10:12:18 +00:00
Jaroslav Tulach
5150c14afd
Avoid duplicated conversion & search target type scope (#7849) 2023-09-20 17:23:09 +02:00
James Dunkerley
74d1d0861c
S3 Read Access, Input Stream based reading (#7776)
- Added a `FileSystemSPI` allowing protocol resolution to a target type.
- Separated `Input_Stream` and `Output_Stream` from `File` to allow use in other spaces.
- `File_Format` types `read_web` changed to be `read_stream` working with `InputStream`.
- Added directory listing to `Auto_Detect` allowing for `Data.read` to list a folder.
- Adjusted HTTP to return an `InputStream` not a `byte[]`:
- `Response_Body` adjusted to wrap an `InputStream`.
- Added ability to materialize to either and in-memory vector (<4KB) or a temporary file.
- `Data.fetch` will materialize if not a recognized mime-type.
- Added `HTTP_Error` to handle IO exceptions from the stream.
- `Excel_Format` now supports mime-type and reading a stream.
- `Excel_Workbook` can now get a `Excel_Section` using `read_section`.
- Added S3 APIs:
- `parse_uri`: splits an S3 URI into bucket and key.
- `list_objects`: list the items in a S3 bucket with specified prefix.
- `read_bucket`: list prefixes and keys with a delimiter in a S3 bucket with specified prefix.
- `head`: either head_bucket (tests existance) or head_object API (reads object meta data).
- `get_object`: gets an object from S3 returning as a `Response_Body`.
- Added `S3_File` type acting like a `File`:
- No support for writing in this PR.
- **ToDo:** recursive listing, glob filtering, exists, size.
- Fixed a few invalid type signature line.
- Moved `create` methods for `Postgres_Connection` and `SQLite_Connection` into type instead of module.
- Renamed `Column_Fetcher.Builder` to `Column_Fetcher_Builder`.
- Fixed bug with `select_into` in Dry Run mode creating permanent tables.

**ToDo:** Unit tests.
2023-09-20 15:09:11 +00:00
GregoryTravis
b0c1f3b00e
New Data.post for sending a payload to a Web API (#7700) 2023-09-19 11:26:29 +00:00
Hubert Plociniczak
1ee3d8f4f0
Rename Decimal to Float (#7807)
Implements #6889.
2023-09-14 15:01:30 +00:00
Radosław Waśko
8b6e70b155
Support for BigInteger values in Table (#7715)
- Fixes #7354
- And also closes #7712
- Refactors how we handle numeric ops - ensuring that the 'kernels' are placed all in one place and selected based on storage types.
2023-09-12 13:18:04 +00:00
James Dunkerley
f0ae9bf9c5
Fixes issue writing to a dry run Excel File (#7763)
- Adds `size` to `File`.
- If file is empty, then create a new Excel file.
- Fixes dry run Excel write issue.
2023-09-08 08:52:00 +00:00
Radosław Waśko
7d424bf8a2
Implement Table.delete_rows. (#7709)
- Closes #7238
- Aligns `update_database_table` to a more consistent and clearer API - `update_rows`.
- Adds a `truncate_table` helper function, to pair up with `drop_table`. Both are `PRIVATE` for now.
- Adds tests for NULLs in keys in `update_rows` and `delete_rows`.
- The behaviour is sometimes unexpected, so instead these fail with `Null_Values_In_Key_Columns`.
- Adds a workaround for https://github.com/oracle/graal/issues/7359
- Adds a workaround for a related bug where a stack frame has no name (its `rootNode.getName() == null`).
- I could not track down this bug to provide a neat repro.
2023-09-07 11:07:53 +00:00
Radosław Waśko
87ce78615a
Change layout of local library search path in order to be able to move Round_Spec.enso back to Tests (#7634)
- Closes #7633
- Moves `Round_Spec.enso` from published `Standard.Test` into our `test/Tests` project; the `Table_Tests` that depend on it, simply `import enso_dev.Tests`.
- Changes the layout of the local libraries directory:
- It used to be `root/<namespace>/<name>`.
- Now it is `root/<dir>` - the namespace and name are now read from `package.yaml` instead.
- Adds the parent directory of the current project to the default `ENSO_LIBRARY_PATH`.
- It is treated as a secondary path, so the default `ENSO_HOME/lib` still takes precedence.
- This allows projects to reference and load 'sibling' projects easily - the only requirement is for the project to enable `prefer-local-libraries: true` or add the other local project to its edition. The edition resolution logic is **not changed**.
2023-09-01 20:20:04 +00:00
Jaroslav Tulach
1437a671e1
Introducing generic Any.to type conversion method (#7704) 2023-09-01 08:05:48 +02:00
GregoryTravis
061876e640
Add simple parts of Table.take and Table.drop functions to Database table (#7615)
Implements database Table and Column take/drop, except While and Sample.

Additional features and optimizations are in https://github.com/enso-org/enso/issues/7614.
2023-08-31 18:52:02 +00:00
Radosław Waśko
255b424b72
Add value_type to Column.from_vector and expected_value_type to Column.map and Column.zip (#7637)
- Closes #6111
- Aligns semantics of handling Mixed columns.
- Now, if an operation like `iif` or `fill_nothing` is given a `Mixed` column, the result will also be `Mixed` regardless of the `inferred_precise_value_type`.
- Enables a few old tests that were pending but could be enabled since the types work is advanced enough.
2023-08-31 13:20:49 +00:00
Jaroslav Tulach
6461e20870
Special support for Python Date/Time/Zone interop (#7617) 2023-08-25 10:27:16 +02:00
GregoryTravis
ddf18f212b
Handle writing to a relative file (#7638)
Fixes bug in writing to a non-absolute file (with backup).
2023-08-24 21:01:37 +00:00
Jaroslav Tulach
20e18d22df
More descriptive function information (#7629)
Fixes #7359 by printing more information about the function including partially applied arguments and over-saturated arguments.
2023-08-24 18:04:08 +00:00
James Dunkerley
7d83b3d7b4
Add GROUP to functions (#7622)
- Update list of groups to agreed list.
- Lower case `ALIAS` names to be consistent with function names.
- Add `GROUP` to methods.
- All constructors and functions have doc comments.
- Correct a few typos (e.g. `PRVIATE`).
- Mark some more things as `PRIVATE`.
- Use `ToDo:` and `Note:` consistently.
- Order tags in doc comment.

# Important Notes
We don't have all the doc comments on types and will want to add them in future,
2023-08-23 13:20:38 +00:00
Radosław Waśko
2385f5b357
Add size-limited strings and varying bit-width integer Value_Types to in-memory backend and check for ArithmeticOverflow in LongStorage (#7557)
- Closes #5159
- Now data downloaded from the database can keep the type much closer to the original type (like string length limits or smaller integer types).
- Cast also exposes these types.
- The integers are still all stored as 64-bit Java `long`s, we just check their bounds. Changing underlying storage for memory efficiency may come in the future: #6109
- Fixes #7565
- Fixes #7529 by checking for arithmetic overflow in in-memory integer arithmetic operations that could overflow. Adds a documentation note saying that the behaviour for Database backends is unspecified and depends on particular database.
2023-08-22 18:10:46 +00:00
Pavel Marek
a0086bb112
Ability to invoke all std benchmarks via jmh (#7519)
All the Enso benchmarks in `test/Benchmarks` can be invoked via JMH
2023-08-17 14:48:43 +02:00
GregoryTravis
c9d7c5cb2b
Convert in-memory Column.round to Java (#7521) 2023-08-16 14:45:23 +00:00
Jaroslav Tulach
aa0413e5a2
Use only Type instances as keys for State (#7585) 2023-08-16 15:54:17 +02:00
James Dunkerley
296c95d414
Fix for empty column on replace and out of memory catching for join and tab (#7593)
- Added a Panic.catch to catch heap memory error in joins and cross_tab.
- Adjusted column replace so type is correct.
2023-08-15 17:06:51 +00:00
Jaroslav Tulach
7a272ec152
Encapsulating array-like data and operations into a single package (#7544) 2023-08-15 13:00:47 +02:00
Radosław Waśko
8541a9e1ac
Improve generation of long operation in presence of column name length limit (#7556)
I planned to do this as part of #7428, but I forgot. Making up for that now.
2023-08-14 16:58:36 +00:00
GregoryTravis
d3436fae70
Implement Number.round as a builtin (#7460) 2023-08-14 15:43:39 +00:00
Radosław Waśko
b656b336c7
Report Loss_Of_Integer_Precision when an integer is not exactly representable as a float during conversion (#7509)
Closes #7353

I introduce a new type `WithAggregatedProblems`, because `WithProblems` was too simple - it only allowed to hold a `List<Problem>` but `AggregatedProblems` is more than that. Ideally we shouldn't multiply entities like this too much. We should probably unify all to use `WithAggregatedProblems` - but after starting this, I realised it will likely just take too much effort to do for this little PR. So instead, I created a follow-up task for this: #7514
2023-08-08 12:30:44 +00:00
GregoryTravis
758b3b31b9
Avoid indexing the table twice for Cross Tab (#7417)
Rewrites MultiValueIndex.makeCrossTabTable to build only a single index.
2023-08-04 21:14:18 +00:00
Radosław Waśko
bc9cde6543
Fix column naming edge cases - invalid and duplicated columns, case-insensitive name aliasing for case-insensitive backends (#7495)
- Fixes #7412
- Also adds tests and fixes some more edge cases:
- Ensures correct handling of existing Database tables whose column names may be invalid from Enso perspective, or clashing from Enso perspective (e.g. for most DBs `ś` and `s\u0301` are different names, but for Enso they are basically the same so this would cause issues - thus Enso now renames such columns when accessed (still using the correct column reference in the generated SQL under the hood).
2023-08-04 09:04:38 +00:00
GregoryTravis
037a687401
Expose Unicode normalization methods on Texts (#7425)
Exposes Text_Utils.normalize().
2023-08-03 18:07:00 +00:00
Radosław Waśko
c61c741476
Respect database backend naming limitations when generating table/column names and validate user-provided names to avoid silent name clashes; process JDBC warnings reported from backends (#7428)
- Closes #5951
- Ensures any SQL warnings reported by the database through the JDBC driver are processed and forwarded to the user.
- These warnings show issues like the implicit name truncation that this PR is also solving. It's good to make sure they are visible as they can help avoid and understand unexpected problems. They should not show up in most standard workflows.
- Adds simple history to our REPL.
2023-08-03 09:44:27 +00:00
GregoryTravis
628a51d8e2
Convert Number.round to Java (#7360) 2023-07-26 12:03:09 +00:00
James Dunkerley
7345f0fd9a
Speed up statistics (#7390)
- Allow `parse_to_columns` to take a `Regex` object.
- Add `pattern` to the `Regex` object.
- Add `column_names` to the `Row` object.
- Improve statistics performance.
- Add benchmarks for stats.

| Benchmark | Reference | New | Improvement |
| --- | --- | --- | --- |
| Max (by reduce) | 16.4ms | 16.3ms | - |
| Max (stats) | 703ms | 224ms | 68% |
| Sum (by reduce) | 38ms | 38ms | - |
| Sum (stats) | 753ms | 420ms | 44% |
| Variance (stats) | 745ms | 553s | 26% |

Also tried using a Ref approach for stats but as slower (7e13c45224).
2023-07-26 10:01:18 +00:00
Radosław Waśko
4b5a2e2176
Fixing operations on Mixed types (#7368)
- Fixes #7231
- Cleans up vectorized operations to distinguish unary and binary operations.
- Introduces MixedStorage which may pretend to be a more specialized storage on demand.
- Ensures that operations request a more specialized storage on right-hand side to ensure compatibility with reported inferred storage type.
- Ensures that a dataflow error returned by an Enso callback in Java is propagated as a polyglot exception and can be caught back in Enso
- Tests for comparison of Mixed storages with each other and other types
- Started using `Set` for `Filter_Condition.Is_In` for better performance.
- ~~Migrated `Column.map` and `Column.zip` to use the Java-to-Enso callbacks.~~
- This does not forward warnings. IMO we should not be losing them. We can switch and add a ticket to fix the warnings, but that would be a regression (current implementation handles them correctly). Instead, we should first gain some ability to work with warnings in polyglot. I created a ticket to get this figured out #7371
- ~~Trying to avoid conversions when calling Enso functions from Java.~~
- Needs extra care as dataflow errors may not be handled right then. So only works for simple functions that should not error.
- Not sure how much it really helps. [Benchmarks](https://github.com/enso-org/enso/pull/7270#issuecomment-1635618393) suggested it could improve the performance quite significantly, but the practical solution is not exactly the same as the one measured, so we may have to measure and tune it to get the best results.
- Created #7378 to track this.
2023-07-25 23:25:17 +00:00
GregoryTravis
1f6fcf189b
Implement replace on the Database Column (#7275)
Implements `replace` for database text columns, for text, regex, and column patterns.
2023-07-25 18:09:50 +00:00
James Dunkerley
2dc565b366
Fix failing test (#7394)
Fix a failing test.
2023-07-25 14:06:11 +00:00
Adam Obuchowicz
1d2371f986
Groups in DocTags (#7337)
Fixes #7336 in a quick way.

Next to the old way of defining groups, the library can just add `GROUP` tag to some entities, and it will be added to the group specified in tag's description.

The group name may be qualified (with project name, like `Standard.Base.Input/Output`) or just name - in the latter case, IDE will assume a group defined in the same library as the entity.

Also moved some entities from "export" list in package.yaml to GROUP tag to give an example. I didn't move all of those, as I assume the library team will reorganize those groups anyway.

### Important Notes

@jdunkerley @radeusgd @GregoryTravis When you will start specifying groups in tags, remember that:
* The groups still belongs to a concrete project; if some entity outside a project wants to be added to its group, the "qualified" name should be specified. See `Table.new` example in this PR.
* If the group name does not reflect any group in package.yaml **the tag is ignored**.
* A single entity may be only in a single group. If it's specified in both package.yaml and in tag, the tag takes precedence.

---------

Co-authored-by: Ilya Bogdanov <fumlead@gmail.com>
2023-07-24 15:54:16 +02:00
James Dunkerley
88f32d9b2a
Various small tickets... (#7367)
- Added `Text.length` into Text class so CB lists the built in.
- Added `File.starts_with` and tests for the built in method.
- Add `to_js_object` and `to_display_text` to `Regex`.
![image](https://github.com/enso-org/enso/assets/4699705/3b197c94-9c49-4bc5-a2cc-ce53b917942e)
- Add `to_js_object` and `to_display_text` to `Match`.
![image](https://github.com/enso-org/enso/assets/4699705/962ec4f2-324d-4f10-8ec0-932b093c6729)
- Remove the `bit_shift_l` alias from the built-ins.
- Add test and Enso wrapper for `Text.is_normalized`.
2023-07-23 09:04:11 +00:00
Radosław Waśko
56635c9a88
Add benchmarks comparing performance of Table operations 'vectorized' in Java vs performed in Enso (#7270)
The added benchmark is a basis for a performance investigation.

We compare the performance of the same operation run in Java vs Enso to see what is the overhead and try to get the Enso operations closer to the pure-Java performance.
2023-07-21 17:25:02 +00:00
Jaroslav Tulach
a5ec6a9e51
Bench builder API (#7324)
Designing new `Bench` API to _collect benchmarks_ first and only execute them then. This is a minimal change to allow  implementation of #7323  - e.g. ability to invoke a _single benchmark_ via JMH harness.

# Important Notes
This is just the basic API skeleton. It can be enhanced, if the basic properties (allowing integration with JMH) are kept. It is not intent of this PR to make the API 100% perfect and usable. Neither it is goal of this PR to update existing benchmarks to use it (74ac8d7 changes only one of them to demonstrate _it all works_ somehow). It is however expected that once this PR is integrated, the newly written benchmarks (like the ones from #7270) are going to use (or even enhance) the new API.
2023-07-19 09:18:28 +00:00
GregoryTravis
2fb5c3710b
Add Fallback to Prim_Text_Helper.compile_regex; accept Regex in Text.parse_to_table (#7297)
This PR does three related things:
- Fails more gracefully when a non-string is passed to compile_regex
- Don't pass a non-string to compile_regex
- Allow a Regex param to parse_to_table
2023-07-18 19:55:56 +00:00
James Dunkerley
fd0bdc86dd
Fix issue with rename_columns and revert order of parameter change on select_columns. (#7321)
The Regex change introduced some issues.
Added a test for missed case in `rename_columns` where using vector of pairs.
Reverted parameter order change for `select_columns`.
2023-07-18 13:30:23 +00:00
Pavel Marek
7264d81f2a
Builtin methods can support array-like arguments (#7235)
This PR modifies the builtin method processor such that it forbids arrays of non-primitive and non-guest objects in builtin methods. And provides a proper implementation for the builtin methods in `EnsoFile`.

- Remove last `to_array` calls from `File.enso`
2023-07-17 09:17:39 +02:00
James Dunkerley
aaa235fbad
Add drop down for replace, remove Column_Selector (#7295)
- Add dropdowns for `replace` functions.
- Retire `Column_Selector` type.
- Add `select_blank_columns` and `remove_blank_columns` functions to table types.
- Allow Regex to be used to pick columns.
2023-07-14 17:30:52 +00:00
Radosław Waśko
866283c0a8
Improve error message on Filter_Condition missing arguments in Table.filter (#7290)
In #7148 I improved the error message when a `Filter_Condition` constructor without arguments is provided to `Vector.filter` and its friends. This PR applies the same check to the `Table.filter`.

This is useful, because when we select a Filter_Condition from a widget, initially it does not have all its arguments applied. This used to lead to confusing errors being reported to the user, now, a much clearer error is shown:

![image](https://github.com/enso-org/enso/assets/1436948/19140a7b-d6fc-4292-81d3-dc6d61135cb9)
2023-07-14 08:00:13 +00:00
Radosław Waśko
620cc361ce
Add date_diff, date_add and date_part to scalar Enso date-time values. (#7273)
Followup of #7221, adding `date_diff`, `date_add` and `date_part` to scalar Enso date-time values.
2023-07-13 15:17:21 +00:00
Radosław Waśko
ca68dd94da
Adding new Date/Time operations (-, date_add, date_diff, date_part) (#7221)
- Adds `Column.date_diff` for computing date/time difference as integer multiply of some unit.
- Adds `Column.date_add` for shifting date/time by a unit.
- Adds `Column.date_part` for extracting various parts of the date/time value as integer.
- Adds widgets for the 3 methods above whose content depends on the column value type.
- Adds shorthands: `Column.hour`, `Column.minute` and `Column.second` to extract these date parts.
- Extends `Time_Period` with support for milli-, micro- and nano- seconds; and adapts functions taking `Time_Period` to support these wherever possible.
2023-07-13 12:56:54 +00:00
James Dunkerley
0adab6c68c
Round on a column was always adding a warning (#7246)
- Only warn if outside allowed range.
- Added `is_infinite` to In-Memory column.
- Allow integer value type for `is_nan` and `is_infinite`.
2023-07-10 17:35:23 +00:00
GregoryTravis
345d6b9cb1
Add cross_join support to Database Table (#7234) 2023-07-10 16:29:37 +00:00
James Dunkerley
1fb60df61b
Fixes from the live demo. (#7243)
- Removed defaults from `cross_tab`. It caused an out-of-heap space error when it attempted to build a 205k x 205k table. Now has a hard limit of 10,000 columns - we can increase this once we have more concrete test data.
![image](https://github.com/enso-org/enso/assets/4699705/bc38d41c-56dc-41bd-8a7c-fa89ecfa7f79)

- Adjusted the dropdowns on `Aggregate_Column` for `columns` and `order_by` to be dropdowns as nested Vector editors are not supported.
![image](https://github.com/enso-org/enso/assets/4699705/f4a7c7cc-6a21-462c-a39e-65fbab82c367)

- Altered `Aggregate_Column` so `new_name` now `new_name:Text=""` and not taking `Nothing` anymore. Makes it appear correctly in IDE.
![image](https://github.com/enso-org/enso/assets/4699705/196a49ba-4274-44bb-b876-0372c8f62746)

- Added dropdowns for `fill_empty`, `fill_nothing` and `replace` on `Table`.
![image](https://github.com/enso-org/enso/assets/4699705/9ee5cec2-82d5-4452-b650-67015ac9fee5)

- Added `replace` to Database table throwing `Unsupport_Database_Operation`.
2023-07-09 18:03:05 +00:00
GregoryTravis
bd26e95fd6
Add Table.replace; Change Text.replace to take a Text|Pattern, and remove the use_regex param. (#7223) 2023-07-06 16:13:11 +00:00