Commit Graph

472 Commits

Author SHA1 Message Date
AdRiley
0f688d0a25
Add newlines option to text cleanse/replace (#10761)
* Auto-commit work in progress before clean build on 2024-08-06 11:32:46

* Fixed Regex and additional test

* changelog

* .

* Make non-capturing
2024-08-06 18:59:54 +03:00
Radosław Waśko
3fd14642d9
Fix upload/delete transactions in Snowflake backend (#10738)
Fixes #10609 by rewriting all our upload-related operations to rely on `DDL_Transaction`, an abstraction that handles the 'transactionality' of `CREATE TABLE` statements depending on whether a given backend allows DDL statements within transactions. If it does not, transactionality is emulated by creating the tables outside of the transaction and then dropping them on rollback.
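
The emulation path described above can be sketched roughly as follows. This is an illustration in Python with `sqlite3`, not the Enso implementation: `EmulatedDDLTransaction`, `ddl_transaction` and `demo` are hypothetical names, and autocommit mode is used only to stand in for a backend whose DDL statements cannot be rolled back.

```python
import sqlite3
from contextlib import contextmanager


class EmulatedDDLTransaction:
    """Hypothetical helper: remembers the tables created during one operation."""

    def __init__(self, conn):
        self.conn = conn
        self.created = []

    def create_table(self, name, columns_sql):
        # Sketch only: name and columns_sql are assumed to be trusted input.
        self.conn.execute(f'CREATE TABLE "{name}" ({columns_sql})')
        self.created.append(name)


@contextmanager
def ddl_transaction(conn):
    tx = EmulatedDDLTransaction(conn)
    try:
        yield tx
    except Exception:
        # "Rollback": drop everything created so far, newest first.
        for name in reversed(tx.created):
            conn.execute(f'DROP TABLE IF EXISTS "{name}"')
        raise


def table_names(conn):
    rows = conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table'"
    ).fetchall()
    return sorted(row[0] for row in rows)


def demo():
    # isolation_level=None puts sqlite3 in autocommit mode, so every
    # CREATE TABLE takes effect immediately, as on a non-transactional backend.
    conn = sqlite3.connect(":memory:", isolation_level=None)
    with ddl_transaction(conn) as tx:
        tx.create_table("kept", "id INTEGER")
    try:
        with ddl_transaction(conn) as tx:
            tx.create_table("temp_upload", "id INTEGER")
            raise RuntimeError("simulated upload failure")
    except RuntimeError:
        pass
    return table_names(conn)
```

In `demo`, the table created by the failed operation is dropped while the table from the successful one survives.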
2024-08-06 08:14:44 +00:00
Kaz Wesley
aafdef1aeb
Improve parser contextualization (#10734)
2024-08-05 15:46:58 +00:00
Radosław Waśko
6ad3faf03b
Refactor Upload_Table to be more readable: split into separate smaller submodules (#10713)
- First step of #10609 - before I actually modify it, I decided I need to refactor the `Upload_Table` logic as it was quite convoluted. Doing this as a separate PR for easier review. A big 600+ line file was replaced by several smaller ones grouped by topics.
- Practically no changes apart from moving stuff into separate modules.
- One small change - added `Missing_Argument` to `SQL_Query` as I noticed that lack of defaults was giving rise to confusing errors when working with `query` in the GUI.

Before:
![image](https://github.com/user-attachments/assets/b586caec-f25c-406e-be5a-d402f10feb86)
After:
![image](https://github.com/user-attachments/assets/6b1d4206-05b1-4587-b3e1-43ca95ea7c2e)
![image](https://github.com/user-attachments/assets/58c098bd-db0c-4ee2-823c-bf5c9e758ce4)
2024-07-31 09:54:17 +00:00
James Dunkerley
74acc1de24
Tweaks from client Demo (#10685)
- Adjusted Filter_Condition removing keep/drop from basic filters.
- Fix Is_In to have selector.
- Fix for Date simple expressions.
- Add get_row to Table and DB_Table.
2024-07-26 18:44:36 +00:00
Radosław Waśko
46e8bab429
Fix aggregate after sort in DB (#10677)
* add test for https://github.com/enso-org/enso/issues/10321

* wrap in subquery if ordering was present to fix the bug

* adding one more test just to be sure

* fix import
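
The commit does not show the generated SQL, so the following is only a guess at the shape of the fix: when the query being aggregated already carries an ordering, it is wrapped in a subquery and the aggregate is computed over that. `aggregate_sql` and its parameters are hypothetical, and `sqlite3` is used only to show that the wrapped query is valid.

```python
import sqlite3


def aggregate_sql(table, order_by, aggregate_expr):
    """Build an aggregate query, wrapping an ordered source in a subquery."""
    if order_by:
        inner = f"SELECT * FROM {table} ORDER BY {order_by}"
        return f"SELECT {aggregate_expr} FROM ({inner}) AS sub"
    return f"SELECT {aggregate_expr} FROM {table}"


def run_demo():
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE t (x INTEGER, y INTEGER)")
    conn.executemany("INSERT INTO t VALUES (?, ?)", [(1, 3), (2, 2), (3, 1)])
    sql = aggregate_sql("t", "y", "SUM(x)")
    return sql, conn.execute(sql).fetchone()[0]
```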
2024-07-26 08:43:19 +01:00
GregoryTravis
f31c084f43
Implement in-memory and database mixed decimal column comparisons (#10614)
2024-07-25 21:27:19 +00:00
Radosław Waśko
ba8ae4502c
Fix date_diff in Snowflake (#10672)
- Closes #10438
- The results are again aligned across backends.
2024-07-25 16:16:33 +00:00
Kaz Wesley
e5b85bf16e
Space-precedence does not apply to value-level operators (#10597)
In a sequence of value-level operators, whitespace does not affect relative precedence. Functional operators still follow the space-precedence rules.

The "functional" operators are: `>> << |> |>> <| <<| : .`, application, and any operator containing `<-` or `->`. All other operators are considered value-level operators.

Asymmetric whitespace can still be used to form *operator sections* of value-level operators, e.g. `+2 * 3` is still equivalent to `x -> (x+2) * 3`.

Precedence of application is unchanged, so `f x+y` is still equivalent to `f (x + y)` and `f x+y * z` is still equivalent to `(f (x + y)) * z`.

Any attempt to use spacing to override value-level operator precedence will be caught by the new enso linter. Mixed spacing (for clarity) in value-operator expressions is allowed, as long as it is consistent with the precedences of the operators.

Closes #10366.

# Important Notes
Precedence warnings:
- The parser emits a warning if the whitespace in an expression is inconsistent with its effective precedence.
- A new enso linter can be run with `./run libraries lint`. It parses all `.enso` files in `distribution/lib` and `test`, and reports any errors or warnings. It can also be run on individual files: `cargo run --release --bin check_syntax -- file1 file2...` (the result may be easier to read than the `./run` output).
- The linter is also run as part of `./run lint`, so it is checked in CI.

Additional language change:
- The exponentiation operator (`^`) now has higher precedence than the multiplication class (`*`, `/`, `%`). This change did not affect any current enso files.

Library changes:
- The libraries have been updated. The new warnings were used to identify all affected code; the changes themselves have not been programmatically verified (in many cases their equivalence relies on the commutativity of string concatenation).
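
To illustrate the kind of inconsistency the precedence warning targets, here is a much simplified sketch in Python. It is not the parser's algorithm: the `PRECEDENCE` table, the regular expression and `spacing_warnings` are assumptions made for this example. It only looks at symmetric spacing around binary value-level operators between plain operands, and ignores operator sections, unary operators, parentheses, application and operators of equal precedence.

```python
import re

# Assumed precedence levels; `^` binds tighter than the multiplication class.
PRECEDENCE = {"+": 1, "-": 1, "*": 2, "/": 2, "%": 2, "^": 3}
OPERATOR = re.compile(r"(\s*)([+\-*/%^])(\s*)")


def spacing_warnings(expr):
    """Warn where a tightly written operator binds looser than a spaced neighbour."""
    # Each entry is (operator, True if it has whitespace on both sides).
    ops = [
        (m.group(2), bool(m.group(1)) and bool(m.group(3)))
        for m in OPERATOR.finditer(expr)
    ]
    warnings = []
    for (left, left_spaced), (right, right_spaced) in zip(ops, ops[1:]):
        if left_spaced == right_spaced:
            continue  # uniform spacing suggests no particular grouping
        tight, loose = (right, left) if left_spaced else (left, right)
        if PRECEDENCE[tight] < PRECEDENCE[loose]:
            warnings.append(
                f"'{tight}' is written tighter than '{loose}' but binds looser"
            )
    return warnings
```

Under these assumptions, `a + b*c` produces no warning, while `a+b * c` and `a*b ^ c` each produce one.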
2024-07-24 10:55:44 +00:00
Radosław Waśko
3536a18efd
Initial template for the Extra Tests workflow (#10636)
- Closes #10618
- adjusts some edge case tests in Snowflake
2024-07-24 07:33:51 +00:00
Radosław Waśko
ba56f8e89b
Snowflake Dialect - pt. 7 (#10612)
- Closes #9486
- All tests are succeeding or marked pending
- Created follow up tickets for things that still need to be addressed, including:
- Fixing upload / table update #10609
- Fixing `Count_Distinct` on Boolean columns #10611
- Running the tests on CI is not part of this PR - to be addressed separately
2024-07-23 06:58:11 +00:00
Radosław Waśko
7fd8701690
Snowflake Dialect pt. 6 - Union, Distinct and other improvements (#10576)
- Part of #9486
- Fixes `Table.union`, `merge` and `distinct` tests
- Replaces `distinct_on` in `Context` that was actually a Postgres specific addition leaking into the base with a more abstract `Context_Extension` mechanism.
- This allows us to implement the Snowflake-specific `DISTINCT` using `QUALIFY`.
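
As a rough illustration of the difference between the two dialects, the sketch below generates a Postgres-style `DISTINCT ON` query and a Snowflake-style `QUALIFY` query for the same key columns. The `distinct_sql` and `quote` helpers are hypothetical, and the SQL actually generated by the library may differ; the `ORDER BY` inside the window is arbitrary here and simply reuses the key columns.

```python
def quote(name):
    """Quote an identifier, doubling any embedded double quotes."""
    return '"' + name.replace('"', '""') + '"'


def distinct_sql(table, key_columns, dialect):
    keys = ", ".join(quote(c) for c in key_columns)
    if dialect == "postgres":
        return f"SELECT DISTINCT ON ({keys}) * FROM {quote(table)}"
    if dialect == "snowflake":
        # Keep one row per key group by filtering on a window function.
        return (
            f"SELECT * FROM {quote(table)} "
            f"QUALIFY ROW_NUMBER() OVER (PARTITION BY {keys} ORDER BY {keys}) = 1"
        )
    raise ValueError(f"unsupported dialect: {dialect}")
```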
2024-07-19 16:04:00 +00:00
James Dunkerley
2442ebc52e
Restore Encoding.Default. (#10567)
Following the fix of Input Stream, restore the encoding parts.

No significant performance impact on reading the client test data.
2024-07-16 16:49:46 +00:00
Radosław Waśko
a30b0c60eb
Snowflake Dialect pt. 5 (#10528)
- Related to #9486
- Batching of expression tests
- Fixing arithmetic by simplifying `%` and `/` operations
- Trying to share some more tables, sometimes improving performance sometimes not really
- Adding sorting and other fixes to tests to make them pass: Missing_Values_Spec, Filter_Spec, Map_Spec
- Fixing warnings related to materialization of Decimal->Integer, thus fixing Join_Spec.
2024-07-16 09:38:57 +00:00
James Dunkerley
cc1ac87c99
Linting, XML to_table and fix JSON viz for XML (#10554)
- Linting fixes.
- `XML_Document` and `XML_Element` have `to_table` method.
- Add `to_default_visualization_data` to `XML_Document` and `XML_Element`.
- Add support to Table viz to render.

![image](https://github.com/user-attachments/assets/f01a3508-443e-48db-ad4f-605094a04c2b)

![image](https://github.com/user-attachments/assets/c7573b68-7549-494f-9c59-ea240178c0eb)
2024-07-15 18:33:37 +00:00
Jaroslav Tulach
220b40a1cd
Enforce conversion method return type & introduce Comparable.new (#10468)
2024-07-11 06:58:51 +02:00
Radosław Waśko
48c17845a7
Fixing Database tests and Snowflake Dialect - part 3 out of ... (#10458)
- Related to #9486
- Fixes types in literal tables that are used throughout the tests
- Tries to make testing faster by disabling some edge cases, trying batching some queries, re-using the main connection and trying to re-use tables more
- Implements date/time type mapping and operations for Snowflake
- Updates type mapping to correctly reflect what Snowflake does
- Disables warnings for Integer->Decimal coercion as that's too annoying and implicitly understood in Snowflake
- Allows selecting a Decimal column with `..By_Type ..Integer` (only in Snowflake backend) because the Decimal column there is its 'de-facto' Integer column replacement.
2024-07-10 13:21:30 +00:00
AdRiley
ce6995c83f
Make docker instructions clearer (#10501)
2024-07-10 13:53:24 +01:00
James Dunkerley
8da06309e9
Date Time Pickers, Temporarily Disable Encoding.default (#10493)
- Widgets for Date_Time, Time_Of_Day and Time_Zone.
- Disable Encoding.default for now as it has a big performance impact on CSVs.

![image](https://github.com/enso-org/enso/assets/4699705/c1b936f0-3ab4-490c-8fe5-2310ef1ed079)

![image](https://github.com/enso-org/enso/assets/4699705/d5e29ec4-cc52-41e5-a532-17cd6dff34b9)

![image](https://github.com/enso-org/enso/assets/4699705/61455519-ea63-4275-9c7a-603714ff9f85)

![image](https://github.com/enso-org/enso/assets/4699705/48ccd3ad-5e15-49f9-87cd-4710ca559843)
2024-07-09 21:04:08 +00:00
GregoryTravis
71a6e2162e
Fix return type for Postgres Decimal division (#10479)
2024-07-09 15:22:14 +00:00
James Dunkerley
4b3e4ae15e
Rename Map to Dictionary and Set to Hashset. (#10474)
- Rename `Map` to `Dictionary`.
- Rename `Set` to `Hashset`.
- Add a deprecated placeholder for the static method of `Map`.
2024-07-09 09:12:23 +00:00
James Dunkerley
4c0647ea29
Stop publishing First/Last as constructors and use auto-scoping for take and drop. (#10467)
- Removes `First` and `Last` from the `Standard.Base` exports.
- Enable auto-scoping for all `Index_Sub_Range` and `Text_Sub_Range`.
- Update all use of those methods to use auto-scoping.
2024-07-08 10:26:30 +00:00
James Dunkerley
018d4c312f
Stop publishing Postgres constructor, update Postgres_Details.Postgres to Postgres.Server. (#10466)
![image](https://github.com/enso-org/enso/assets/4699705/6d0d4167-e97b-4765-8079-650ad091ce60)

- Rename `Postgres_Details` to `Postgres`.
- Rename `Postgres` constructor to `Server`.
- Update SPI.
- Linting issues (indent, missing doc comment)
2024-07-08 07:58:08 +00:00
James Dunkerley
c2c4b95116
Final step removing the Problem_Behavior publishing. (#10461)
- Remove publishing the constructors.
- Fix any missed use in libs.
- Alter tests to generally use auto-scoped calls.
- `on_incomparable` to `on_problems`.
2024-07-05 18:41:36 +00:00
AdRiley
fc93b4d121
Refactor database dialect types (#10437)
* Auto-commit work in progress before clean build on 2024-07-03 14:17:22

* Refactor

* Revert

* revert

* Code Review feedback

* Green

* 2 Red

* Green

* Renames

* Code review changes

* Code review changes
2024-07-05 13:08:25 +01:00
James Dunkerley
0661f17d1c
Tune Text.trim, fix for Text.split (#10445)
- Rename `Location.Start` to `Location.Left`.
- Rename `Location.End` to `Location.Right`.
- Use auto-scoping for `Location`.
- Tune widgets for `Text.trim`.
- Correct signature of `Text.split`.
- Adjust `generateLocallyUniqueIdent` to not fail on bad signature.
2024-07-04 22:24:56 +00:00
James Dunkerley
9a2aed92f1
Separating list from read function and other small tweaks. (#10434)
- Rename `Faker.string_value` to `Faker.text_value`.
- Remove `Regex.pattern_string` as duplicate of `Regex.pattern`.
- Sort the Date picker.
- Rename `Data.list_directory` to `Data.list`.
- Remove support for reading a directory.

![image](https://github.com/enso-org/enso/assets/4699705/b42bb3c9-e63b-49f2-8cdc-4666cb6d968e)

![image](https://github.com/enso-org/enso/assets/4699705/97f49891-5ae5-4f0a-9a41-6200888fcd86)
2024-07-03 22:00:53 +00:00
GregoryTravis
48fb999eb3
Implement Decimal support for Postgres backend (#10216)
* treat scale nothing as unspecified

* cast to decimal

* float int biginteger

* conversion failure ints

* loss of decimal precision

* precision loss for mixed column to float

* mixed columns

* loss of precision on inexact float conversion

* cleanup, reuse

* changelog

* review

* no fits bd

* no warning on 0.1 conversion

* fmt

* big_decimal_fetcher

* default fetcher and statement setting

* round-trip d

* fix warning

* expr +10

* double builder retype to bigdecimal

* Use BD fetcher for underspecified postgres numeric column, not inferred builder, and do not use biginteger builder for integral bigdecimal values

* fix tests

* fix test

* cast_op_type

* no-ops for other dialects

* Types

* sum + avg

* avg + sum test

* fix test

* update agg type inference test

* wip

* is_int8, stddev

* more doc, overflow check

* fmt

* finish round-trip test

* wip
2024-07-02 15:01:55 -04:00
AdRiley
132039a838
Rename env variables (#10336)
2024-06-28 11:38:22 +01:00
Radosław Waśko
db4f7ab3b5
Fixing Database tests and Snowflake Dialect - part 2 out of ... (#10319)
- Part of #9486
- Fixing our tests to not rely on deterministic ordering of created Tables in Database backends
- Before, SQLite and Postgres used to mostly return rows in the order they were inserted in, but Snowflake does not.
- Fixing various parts of Snowflake dialect.
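
A test that must not depend on row order can compare results as multisets of rows. The helper below is a minimal Python sketch of that idea; `same_rows_any_order` is a hypothetical name and is not taken from the test suite.

```python
from collections import Counter


def same_rows_any_order(actual, expected):
    """Compare two row collections, ignoring order but respecting duplicates."""
    return Counter(map(tuple, actual)) == Counter(map(tuple, expected))
```

Using a `Counter` rather than a `set` keeps duplicate rows significant, so a result with a row repeated twice does not match one where it appears once.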
2024-06-27 14:54:00 +00:00
James Dunkerley
d92078471b
Rename order_by to sort for Table and DB_Table. (#10372)
- Rename `order_by` to `sort` for `Table` and `DB_Table`.
- Added deprecated placeholder.
- Fixed a couple of minor deprecated mistakes.

![image](https://github.com/enso-org/enso/assets/4699705/96c32fa7-33e5-400a-9d3a-ebf330886911)
2024-06-26 17:46:09 +00:00
James Dunkerley
e6c8ec7ab5
Changes from session with Ned (#10349)
- Removed `second_row` and `second_column` from the `Table` and `DB_Table`.
- Added `first_value` and `last_value` to the `Table` and `DB_Table`.
- Fixed bug where negative index access wasn't allowed on `Column`.
- Added error if negative index access used on `DB_Column`. Tells user they have to materialize.
- Fix argument order for `Table.text_cleanse` and a couple of typo corrections.
- Rename `auto_value_type` to `auto_cast` on table and columns.
2024-06-24 12:47:14 +00:00
AdRiley
c324c78e23
Add duplicates component (#10323)
* Update existing behaviour to match new

* Add signatures

* Red test

* First test green

* sbt javafmtAll

* In-Memory working

* Not implemented for In-Db

* Docs

* Disable tests for in-db

* Changelog

* Code review changes

* Fix

* Fix

* Fix tests
2024-06-24 13:29:03 +03:00
James Dunkerley
791dba6729
Autoscoping for File_Format and minor tweaks. (#10348)
- Add `rename` ALIAS to `Table.use_first_row_as_names`.
- Add a shorthand to `Faker.string_value` to allow quick creation of fake values such as National Insurance numbers.
- Add `match` ALIAS for `Text.find` and `Text.find_all`.
- Auto scoping for `File_Format`. Demonstrates how to use it in a dynamic context.
- `SQLite_Format.For_File` renamed to `SQLite_Format.SQLite` (though kept for backwards compatibility).
- Fixed bug in `SQLite_Format` which was calling a non-existent constructor.

![image](https://github.com/enso-org/enso/assets/4699705/4506d27c-c1ff-4ad6-9276-53c2ae00de17)

![image](https://github.com/enso-org/enso/assets/4699705/9043ffb0-6740-42ba-91f8-ab0df555f20f)

![image](https://github.com/enso-org/enso/assets/4699705/03122fac-bdbb-4bcf-ac96-9491da41a8b9)

![image](https://github.com/enso-org/enso/assets/4699705/79122aac-a74a-435d-9849-ac3421f7d080)

![image](https://github.com/enso-org/enso/assets/4699705/54544de8-9aea-4dc6-bb4d-a7d8233c6814)
2024-06-24 08:28:54 +00:00
James Dunkerley
5042592d24
Some formatting issues and a few tweaks (#10298)
- Linting fixes.
- Adjust doc comments with `<` or `>` in plain text areas to use `&lt;` and `&gt;`.
- Refactor Statistics module and add auto-scoping.
- Add auto-scoping to `Text.to_case`.
- Fix type issue with `Table.get_value`.
- Private constructor for `Excel_Workbook` and move `xls_format` to public method.
- Add `fields` widget to `to_table`.
- Add `simple_expr` to make `Table.set` clearer.

![image](https://github.com/enso-org/enso/assets/4699705/3e21e800-142c-4006-a6c2-dd6196b76c9a)

![image](https://github.com/enso-org/enso/assets/4699705/d40dcbfd-a35e-4849-9e1a-f4a418d562dd)
2024-06-20 10:44:36 +00:00
Radosław Waśko
3a4784c226
Initial separation of Snowflake_Dialect from Postgres_Dialect (#10266)
- Part of #9486
- Building on top of initial work by @jdunkerley and finishing it
- Reverted the changes to the Postgres_Dialect from last Snowflake work and split the Snowflake_Dialect into a separate module.
- Moved from `rounding_decimal_places_not_allowed_for_floats` to `supports_float_round_decimal_places` (as too confusing).
- Added Snowflake_Dialect type.
- Extracted `Snowflake_Spec` into separate `Snowflake_Tests`
- It imports the common tests from `Table_Tests`.
- Some initial adaptations to make the snowflake dialect not-crash.
- Adding `Internals_Access` proxy to allow external implementations to access our internal data structures without directly exposing them to users. Users should not use these.
- Adding profiling of SQL to check performance.
2024-06-13 16:12:20 +00:00
AdRiley
fadb81abe6
Fix Text_Cleanse tests (#10263)
I hadn't connected the Text_Cleanse tests up properly and as a result they weren't running or working. This fixes that.
2024-06-13 08:25:32 +00:00
Radosław Waśko
41d02e95ef
Implement Windows-1252 fallback logic for Encoding.Default (#10190)
- Closes #10148
- [x] Tests for `Restartable_Input_Stream`, `peek_bytes` and `skip_n_bytes`.
- [x] Report `Managed_Resource` stack overflow bug: #10211
- [x] Followup possible optimization: #10220
- [x] Test use-case from blog.
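
The general idea of such a fallback can be sketched in Python as below. This is an assumption about the overall behaviour only: the actual change works on streams (see `Restartable_Input_Stream` and `peek_bytes` above), and `decode_with_fallback` is a hypothetical name.

```python
def decode_with_fallback(data: bytes) -> str:
    """Decode as UTF-8, falling back to Windows-1252 if the bytes are not valid UTF-8."""
    try:
        return data.decode("utf-8")
    except UnicodeDecodeError:
        # A few byte values are undefined in Windows-1252; they are replaced
        # with U+FFFD in this sketch rather than raising.
        return data.decode("cp1252", errors="replace")
```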
2024-06-10 10:49:26 +00:00
James Dunkerley
d938c96c55
Adding type annotations and enabling auto-scoping (#10173)
- Renamed `Missing_Required_Argument` to `Missing_Argument`, and added `throw` method.
- Add default widget to `Case_Sensitivity.Insensitive locale`.
- Switch to auto scoping for `parse_type_selector`.
- Add type annotation to various simple typed arguments in `Table` and `DB_Table`.
- Altered `Filter_Condition` to have `Missing_Argument` for all non-defaulted arguments.
- Added resolution of `Column_Ref` passed as auto-scoped to `Table_Ref`.
- Altered `Simple_Calculation` to have `Missing_Argument` for all non-defaulted arguments.
- Altered `Join_Condition` to have `Missing_Argument` for all non-defaulted arguments.
- Altered `Sort_Column` to have `Missing_Argument` for all non-defaulted arguments.
- Altered `Aggregate_Column` to have `Missing_Argument` for all non-defaulted arguments.

**rename_columns:**
![image](https://github.com/enso-org/enso/assets/4699705/08aaba0f-687a-450c-9781-8eadc062bd50)

**aggregate:**
![image](https://github.com/enso-org/enso/assets/4699705/c29e7944-1a1c-4020-9fe0-528d874b8049)

**join:**
![image](https://github.com/enso-org/enso/assets/4699705/50038166-e56d-48c5-9eeb-bd46fa415e46)

**set:**
![image](https://github.com/enso-org/enso/assets/4699705/bee2462a-dafb-4bd4-b102-ec73edb4fb93)
2024-06-10 07:52:32 +00:00
GregoryTravis
4aa3d52b60
Implement conversions for Decimal column (#10206)
* treat scale nothing as unspecified

* cast to decimal

* float int biginteger

* conversion failure ints

* loss of decimal precision

* precision loss for mixed column to float

* mixed columns

* loss of precision on inexact float conversion

* cleanup, reuse

* changelog

* review

* no fits bd

* no warning on 0.1 conversion

* fmt
2024-06-07 15:37:32 -04:00
AdRiley
fe6eafd06e
Add By_Type option to more components (#10183)
* reorder_columns

* format

* cast

* auto_value_types

* replace

* fill_nothing

* fill_empty

* text_replace

* text_cleanse

* Add By_Type

* Fix cast

* Fix more tests

* Fix the table helper

* Bug fix

* Remove check
2024-06-07 16:26:23 +01:00
GregoryTravis
5fad3558a6
BigDecimalBuilder and arithmetic operations. (#9950)
* hack

* make a column

* add

* no scale=0 on BD type

* a test

* wip

* 3 arithmetic ops

* /

* wip

* BigDecimalPowerOp

* wip

* mod test

* NumericBinaryOpReturningBigDecimal

* with scalar

* misc arithmetic tests

* fix integralBigDecimalToInteger

* mixed columns

* bigdecimal pow via double

* cleanup

* j2e on get

* arithmetic exception

* mod 0

* cleanup

* fmt

* changelog

* check type first

* merge

* mc error message

* add BD case to Builder.java

* fmt

* changelog

* add BD case to StorageConverter.java

* fmt

* fix test
2024-06-04 13:59:31 -04:00
AdRiley
7c35781a14
Add select/remove column by type (#10159)
* stash

* Working first pass

* Tests

* remove columns

* more tests

* More tests

* DB_Table

* In DB

* Deprecate

* Update documentation

* More docs

* Remove strict

* Remove tests

* Fix widgets

* Add doc

* Spaces

* Fix In-DB
2024-06-04 16:23:46 +01:00
marthasharkey
24d209abf9
Add Columns_To_Add type to move away from Integer | Nothing (#10152)
Adds a new type `Columns_To_Add` that replaces the `Integer | Nothing` type,
and adds a new widget to split_to_columns and tokenise_to_columns.
<img width="397" alt="image" src="https://github.com/enso-org/enso/assets/170310417/1b155682-0992-4cc0-8964-62b389ee7072">

Closes #10041
2024-06-04 15:06:08 +00:00
Radosław Waśko
7cf80f3196
Handle UTF BOM when decoding text (#10130)
- Improve BOM handling: detect and skip the BOM character, Default encoding that detects encoding based on BOM if present, warnings if unexpected BOM is encountered.
- Closes #9849
- Windows-1252 fallback will be done as a separate PR as it has additional complexity. Tracked in ticket #10148.
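
BOM detection of this kind can be sketched in Python with the constants from the standard `codecs` module. The function name and the choice of UTF-8 as the default are assumptions for this example; UTF-32 BOMs and the warnings mentioned above are left out.

```python
import codecs

# BOM byte sequences and the encoding each one implies.
BOMS = [
    (codecs.BOM_UTF8, "utf-8"),
    (codecs.BOM_UTF16_LE, "utf-16-le"),
    (codecs.BOM_UTF16_BE, "utf-16-be"),
]


def decode_detecting_bom(data, default="utf-8"):
    """Return (text, encoding), skipping a leading BOM if one is present."""
    for bom, encoding in BOMS:
        if data.startswith(bom):
            return data[len(bom):].decode(encoding), encoding
    return data.decode(default), default
```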
2024-06-04 13:22:19 +00:00
AdRiley
06327f8fde
Add statistic product (#10122)
Add Statistic.Product

![image](https://github.com/enso-org/enso/assets/1720119/f7fc7bb5-9efe-4dbe-9150-cd9e5101c553)
2024-05-31 09:29:52 +00:00
AdRiley
28bd5c522b
Add named_pattern and make it usable in Text_Replace (#10040)
Creates a new type Named_Expression and allows it to be used in text_replace

![image](https://github.com/enso-org/enso/assets/1720119/673a62a1-1ce5-4e1f-8289-0fa10e87b9de)

![image](https://github.com/enso-org/enso/assets/1720119/2f555e9b-d8c5-41e3-8000-d959d5818666)

![image](https://github.com/enso-org/enso/assets/1720119/d2ae2b0f-5bc4-4e9e-a391-402b58ee72d5)

![image](https://github.com/enso-org/enso/assets/1720119/3bf0547f-c2c7-4987-a45e-d922be2b0bae)
2024-05-27 13:01:26 +00:00
James Dunkerley
ab4b1f0f35
Add day_of_week and day_of_year to Column and DB_Column (#10081)
- Adds support for getting the weekday as an integer (1 Monday - 7 Sunday - ISO standard).
- Add support for getting the day of year.
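
For reference, the same ISO numbering (1 for Monday through 7 for Sunday) and day-of-year value can be obtained in Python's standard library as follows. This only illustrates the convention and is unrelated to the Enso implementation; the wrapper names are chosen to mirror the commit title.

```python
from datetime import date


def day_of_week(d: date) -> int:
    """ISO weekday: 1 is Monday, 7 is Sunday."""
    return d.isoweekday()


def day_of_year(d: date) -> int:
    """Ordinal day within the year, starting at 1 for 1 January."""
    return d.timetuple().tm_yday
```

For example, `date(2024, 5, 27)` is a Monday, so `day_of_week` gives 1 and `day_of_year` gives 148.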
2024-05-27 11:29:25 +00:00
James Dunkerley
d8059fd22c
Some tweaks following Steve's testings (#10042)
- Add ranged number widget to `at` and `get`.
- Add defaults to `at` and `get` picking the first item.
- PRIVATE on various Excel_Workbook methods. It still works like a connection but not shown in CB.
2024-05-27 09:04:29 +00:00
Radosław Waśko
9d75f79ff9
Cloud Integration updates: renames in file metadata structure, remove path resolver workaround, partial fix for datalink resolver issue (#10050)
- Supersedes #9966 as I wanted to test these changes in one go.
- Fixes #10037 caused by lack of CI check and my oversight (forgot to run full tests after a minor change).
- Fixes a regression after [file metadata fields were renamed](c09d856ac8 (diff-9f59b6a0ee3155efecdc70c1ea0c90ab5cde00b5623d84363118b1793f941c46R2037)).
- Fixes handling of creating new datalinks and using them after cache was cleared (e.g. workflow restart).
- This was caused by troubles with path resolver.
- The fix addresses the most common issue and adds a test for it (test flushes the caches to ensure path resolver is used instead of the cached value).
- Some related issues were discovered on the cloud side, tracked by https://github.com/enso-org/cloud-v2/issues/1252
2024-05-23 16:05:48 +00:00