Commit Graph

293 Commits

Author SHA1 Message Date
Radosław Waśko
7cf80f3196
Handle UTF BOM when decoding text (#10130)
- Improve BOM handling: detect and skip the BOM character, add a Default encoding that picks the encoding based on the BOM if present, and attach warnings if an unexpected BOM is encountered.
- Closes #9849
- Windows-1252 fallback will be done as a separate PR as it has additional complexity. Tracked in ticket #10148.
2024-06-04 13:22:19 +00:00
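To illustrate the BOM detection described in #10130, here is a minimal Python sketch. It is not the Enso implementation: the name `decode_with_bom` and the UTF-8 fallback are assumptions made for this example. It looks for a known BOM, skips it, and decodes with the matching encoding.

```python
import codecs

# Known BOMs. UTF-32 LE comes before UTF-16 LE because the UTF-32 LE BOM
# starts with the same two bytes as the UTF-16 LE one.
_BOMS = [
    (codecs.BOM_UTF32_LE, "utf-32-le"),
    (codecs.BOM_UTF32_BE, "utf-32-be"),
    (codecs.BOM_UTF8, "utf-8"),
    (codecs.BOM_UTF16_LE, "utf-16-le"),
    (codecs.BOM_UTF16_BE, "utf-16-be"),
]


def decode_with_bom(data: bytes, fallback: str = "utf-8") -> str:
    """Decode bytes, choosing the encoding from a leading BOM if present."""
    for bom, encoding in _BOMS:
        if data.startswith(bom):
            # Skip the BOM so it does not leak into the text as U+FEFF.
            return data[len(bom):].decode(encoding)
    # No BOM found: use the (assumed) fallback encoding.
    return data.decode(fallback)
```

For example, `decode_with_bom(codecs.BOM_UTF16_LE + "hi".encode("utf-16-le"))` yields the text without the BOM character.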
AdRiley
06327f8fde
Add statistic product (#10122)
Add Statistic.Product

![image](https://github.com/enso-org/enso/assets/1720119/f7fc7bb5-9efe-4dbe-9150-cd9e5101c553)
2024-05-31 09:29:52 +00:00
Radosław Waśko
233f28235a
Small fixes to Postgres integration (#10105)
- Better message when saving datalink in disabled Output context:
![image](https://github.com/enso-org/enso/assets/1436948/540d615b-79ff-4811-8262-a0475a7b6923)
Before it was:
![image](https://github.com/enso-org/enso/assets/1436948/51198bf1-1e50-41bc-a56b-f829bc32d09a)

- Hack to get Postgres widget to display connection options:
![image](https://github.com/enso-org/enso/assets/1436948/39f3db39-1163-4815-b59f-c629d812e2ab)
Previously, the `Postgres` constructor was created without any parameters, so the widget showed no parameters to modify.
2024-05-28 14:34:44 +00:00
James Dunkerley
ab4b1f0f35
Add day_of_week and day_of_year to Column and DB_Column (#10081)
- Adds support for getting the weekday as an integer (1 = Monday to 7 = Sunday, following the ISO standard).
- Adds support for getting the day of year.
2024-05-27 11:29:25 +00:00
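The numbering used by #10081 matches what Python's standard library calls the ISO weekday. A small sketch for comparison; the function names simply mirror the commit title and are not the Enso API:

```python
from datetime import date


def day_of_week(d: date) -> int:
    """ISO 8601 weekday: 1 for Monday through 7 for Sunday."""
    return d.isoweekday()


def day_of_year(d: date) -> int:
    """1-based ordinal of the day within its year."""
    return d.timetuple().tm_yday
```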
Jaroslav Tulach
16c1b74218
Enso Library Feature to execute (a bit of) Base_Tests (#9997)
2024-05-23 08:20:19 +02:00
Radosław Waśko
1e0649fda1
Improvements to Table.union (#9968)
- Closes #9952
2024-05-22 09:38:10 +00:00
Radosław Waśko
5f0a16c87c
Audit Logs for Postgres connections opened through a data link (#9873)
- Closes #9599
- Implemented API for sending audit logs to the cloud on a background thread.
- If the Postgres connection is opened through a datalink, its internal JDBC connection is replaced by a wrapper that reports executed queries to the audit log.
- Also introduces `EnsoMeta` - a helper Java class that can be used in our helper libraries to access Enso types.
- I have replaced the common pattern scattered throughout the codebase with calls to this 'library' to avoid repetitive code.
- Refactored `Table.display` to share code between in-memory and DB - it was needed as the function stopped working for `DB_Table` after making the `Table` constructor `private`.
- Clearer error when reading a SQLite database from a remote file (tells the user to download it first).
- Follow up - correlate asset id of the data link:
#9869
- Follow up - include project name (once bug is fixed):
#9875
- Some problems/improvements of the audit log:
- The audit log system is not yet ready for high throughput of logs
#9870
- The logs may be lost if `System.exit` is used
#9871
2024-05-11 08:54:33 +00:00
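The wrapper approach in #9873 can be pictured with a short Python sketch. Everything here is hypothetical and only illustrates the pattern: `AuditLog`, `AuditedConnection` and the `send` callable are names invented for the example, and SQLite stands in for the Postgres JDBC connection. Messages are queued and sent from a background thread so that queries are not delayed.

```python
import queue
import sqlite3
import threading


class AuditLog:
    """Queues messages and sends them from a background thread."""

    def __init__(self, send):
        self._send = send  # hypothetical callable that ships one message
        self._queue = queue.Queue()
        # Daemon thread: pending messages are lost if the process exits abruptly.
        threading.Thread(target=self._run, daemon=True).start()

    def report(self, message):
        self._queue.put(message)  # returns immediately

    def flush(self):
        self._queue.join()  # block until every queued message was handled

    def _run(self):
        while True:
            message = self._queue.get()
            try:
                self._send(message)
            except Exception:
                pass  # a real system would retry or record the failure
            finally:
                self._queue.task_done()


class AuditedConnection:
    """Wraps a DB-API connection and reports each executed query."""

    def __init__(self, inner, audit):
        self._inner = inner
        self._audit = audit

    def execute(self, sql, parameters=()):
        self._audit.report(f"query: {sql}")
        return self._inner.execute(sql, parameters)


sent = []
audit = AuditLog(sent.append)
connection = AuditedConnection(sqlite3.connect(":memory:"), audit)
row = connection.execute("SELECT 1 + 1").fetchone()
audit.flush()
```

The daemon thread also shows why logs can go missing on an abrupt exit, as noted in #9871: anything still in the queue is dropped unless `flush` runs first.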
AdRiley
e25ec96aaa
Add table running variance skew sd and kurtosis (#9854)
Adds support for Variance, Skew, Standard Deviation and Kurtosis to Table.Running.
2024-05-09 08:45:29 +00:00
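A running variance like the one added in #9854 can be computed in a single pass. The sketch below uses Welford's algorithm in Python; whether Enso uses this algorithm, and whether it reports the sample or the population variance, is not stated in the commit, so both are assumptions here (the sample variance is shown). Skew and kurtosis are left out to keep the example short.

```python
import math


def running_variance(values):
    """Yield the sample variance of the values seen so far (None for the first row)."""
    count, mean, m2 = 0, 0.0, 0.0
    for x in values:
        count += 1
        delta = x - mean
        mean += delta / count
        # m2 accumulates the sum of squared deviations from the current mean.
        m2 += delta * (x - mean)
        yield m2 / (count - 1) if count > 1 else None


def running_standard_deviation(values):
    """Yield the square root of each running variance."""
    for variance in running_variance(values):
        yield None if variance is None else math.sqrt(variance)
```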
AdRiley
15976a8505
Make table.Running return integer typed columns for min/max (#9853)
* New Tests

* Green

* Running min for longs

* Unsupported types test

* Revert

* Add support for all the integer types

* Another test
2024-05-07 10:49:12 +01:00
AdRiley
f647045214
Make excel writer work for all types (#9846)
* New Test

* Improve DateTime recognition

* Re-enable slow test

* If there is a time take it regardless of format

* Code Review Changes
2024-05-03 07:09:54 +01:00
AdRiley
5350b2d00d
Refactor add row number (#9822)
* Refactor add row number

* Refactor

* Green

* Green

* Remove dead code

* Cleanup

* Deduplicate check
2024-05-02 12:29:54 +01:00
James Dunkerley
d2e6ff260e
Restructure SQLite_Details. (#9832)
```
type SQLite_Details
    SQLite location:File|In_Memory

type In_Memory
```
to
```
type SQLite
    From_File location:File

    In_Memory
```

# Important Notes
Splits the In-Memory entry for Database Connect but still works nicely.

![image](https://github.com/enso-org/enso/assets/4699705/ec798ce0-9f41-4903-a2fd-722a9e37743c)

![image](https://github.com/enso-org/enso/assets/4699705/f233b055-893e-4c56-a23d-562e982560f6)
2024-05-01 22:15:41 +00:00
James Dunkerley
4d6d6f239c
Handle URL encoding automatically in query string. (#9823)
A small fix to automatically encode the query string.
Attaches a warning if needed.

![image](https://github.com/enso-org/enso/assets/4699705/032bdb59-6896-46c0-b970-f5a542cc6adf)

![image](https://github.com/enso-org/enso/assets/4699705/6b2075b9-3c98-4de2-8a34-c860ecd65d0c)
2024-04-30 22:03:46 +00:00
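As a rough illustration of what #9823 describes, the Python sketch below re-encodes the query string of a URL and reports whether anything changed, which is where a warning could be attached. The function name `encode_query` and the returned flag are assumptions made for the example; the Enso behaviour may differ in detail.

```python
from urllib.parse import parse_qsl, quote, urlencode, urlsplit, urlunsplit


def encode_query(url: str) -> tuple[str, bool]:
    """Return the URL with a percent-encoded query and a flag saying if it changed."""
    parts = urlsplit(url)
    # Decode the existing pairs, then encode them again consistently.
    pairs = parse_qsl(parts.query, keep_blank_values=True)
    encoded = urlencode(pairs, quote_via=quote)
    changed = encoded != parts.query
    return urlunsplit(parts._replace(query=encoded)), changed
```

A query that is already encoded passes through unchanged, so the flag stays `False` and no warning would be needed.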
AdRiley
d1bf4cb771
Add Ignored_Nothing_Values (#9770)
Add an `IgnoredNothing` warning for Table.Running

![image](https://github.com/enso-org/enso/assets/1720119/1941d278-2c33-43fe-a175-8bcc65bae51a)

![image](https://github.com/enso-org/enso/assets/1720119/b5f6b235-d939-4868-9490-de0f226ea1a2)

![image](https://github.com/enso-org/enso/assets/1720119/a1d617a6-a684-4cc1-be13-c4907d2e6876)
2024-04-30 13:30:40 +00:00
AdRiley
32c3f5f3e8
Make Table.should_equal and Column.should_equal consider NaN equal (#9799)
* Make Column.should_equal detect columns of different types and think nan==nan

* Refactor Table.should_equal

* More Column tests

* Adjust spacing

* Tests Green

* Check same number of columns

* Refactor

* Extra test

* Code Review Changes

* Fix

* Fix

* Fix tests

* Fix Tests

* Fix Test

* Fix test

* Code review change
2024-04-29 22:21:34 +01:00
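The NaN handling in #9799 matters because IEEE floating-point NaN is not equal to itself, so a plain element-wise comparison fails on columns that contain NaN. A minimal Python sketch of a comparison that treats two NaNs as equal; the helper names are hypothetical and columns are modelled as plain lists:

```python
import math


def values_equal(a, b) -> bool:
    """Equality that additionally treats NaN as equal to NaN."""
    if isinstance(a, float) and isinstance(b, float):
        if math.isnan(a) and math.isnan(b):
            return True
    return a == b


def columns_equal(left, right) -> bool:
    """Compare two columns element by element, after checking their lengths."""
    return len(left) == len(right) and all(
        values_equal(a, b) for a, b in zip(left, right)
    )
```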
Jaroslav Tulach
0d495ffd97
Make conversion of double to BigDecimal exact (#9740)
Resolves #9607 by computing `Number.hash` from the number converted to `Float` first. The `Float.to Decimal` conversion is also exact now, done via `new BigDecimal(double)`, while `Decimal.new` handles the user-friendly conversion. As a result, `Decimal.from 2.1 != Decimal.new 2.1`; that is the only way to keep hash codes and conversions consistent.
2024-04-25 11:22:50 +00:00
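Python's `decimal` module shows the same trade-off as #9740 and can serve as an analogy. Constructing a decimal from a float is exact, like `new BigDecimal(double)`, while constructing it from text gives the value as written. The variable names below are only illustrative, and the mapping to `Decimal.from` and `Decimal.new` is an analogy rather than a statement about Enso internals.

```python
from decimal import Decimal

# Exact value of the binary double closest to 2.1 (has many trailing digits).
exact = Decimal(2.1)

# The decimal value as a user would write it.
friendly = Decimal("2.1")

# The two differ, but the exact one stays consistent with the float it came from.
differs = exact != friendly
same_as_float = exact == 2.1
same_hash = hash(exact) == hash(2.1)
```

Only the exact conversion compares equal to the original float and shares its hash, which is the consistency the commit is after.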
James Dunkerley
fb9cf38914
Excel_Workbook.read_many (#9759)
- Some minor linting fixes.
- Adjust the `headers` parameter to use a dedicated type.
![image](https://github.com/enso-org/enso/assets/4699705/989f464d-df95-410e-a03b-36661f1c4a37)
- Fix bug with `read` on an `Excel_Workbook` so the error is handled more gracefully instead of panicking to the UI.
![image](https://github.com/enso-org/enso/assets/4699705/23b4575f-daad-4719-a5cc-30d064bd7f7a)
- Fix bug when writing to a file with an `Excel_Format` with an invalid extension which was causing a panic.
![image](https://github.com/enso-org/enso/assets/4699705/dc0e055c-c1b6-482f-b129-eb69f6554d72)
- Add `read_many` to `Excel_Workbook` allowing reading more than one sheet at a time.
2024-04-24 13:16:44 +00:00
AdRiley
4a97bfa31f
Add table running functionality for Sum, Mean, Min, Max. (#9577)
* Add Table.Running

* Code Review fixes

* Code Review changes

* Change null handling
2024-04-23 09:45:43 +01:00
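For readers unfamiliar with running statistics as in #9577, each output row holds the statistic of all rows up to and including it. A short Python sketch on a sample list (the data is made up for the example, and Nothing handling is not modelled):

```python
from itertools import accumulate

values = [3, 1, 4, 1, 5]

running_sum = list(accumulate(values))        # cumulative total
running_min = list(accumulate(values, min))   # smallest value so far
running_max = list(accumulate(values, max))   # largest value so far
# Mean so far: cumulative total divided by the number of rows seen.
running_mean = [total / n for n, total in enumerate(running_sum, start=1)]
```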
AdRiley
ceaba7f48d
Make excel writer work for custom types (#9752)
2024-04-20 10:34:06 +01:00
GregoryTravis
86ecd3e027
Add Decimal.floor, .ceil, and .trunc (#9694)
2024-04-17 18:42:38 +00:00
Radosław Waśko
fda41cbfd1
Writing Cloud files (#9686)
- Closes #9291
2024-04-16 14:01:03 +00:00
Radosław Waśko
bdda1830b7
Integrate Cloud path resolver (#9662)
- Closes #9363
- Cleans up the Cloud mock as it got a bit messy. It still implements the bare minimum to be able to test basic secret and auth handling logic 'offline' (added very simple path resolution, only handling the minimum set of cases for the tests to work).
- Adds first implementation of caching Cloud replies.
- Currently only caching the `Enso_User.current`. This is a simple one to cache because we do not expect it to ever change, so it can be safely cached for a long period of time (I chose 2h to make it still refresh from time to time while not being noticeable).
- We may try using this for caching other values in future PRs.
2024-04-12 13:03:09 +00:00
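The reply caching mentioned in #9662 is a time-to-live cache. A minimal Python sketch, assuming a 2 hour TTL as in the commit; the class name `TtlCache`, its method and the injectable clock are inventions for the example and do not reflect the Enso code:

```python
import time


class TtlCache:
    """Caches computed values and recomputes them once they are older than the TTL."""

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock  # injectable so the cache can be tested without waiting
        self._entries = {}   # key -> (time stored, value)

    def get_or_compute(self, key, compute):
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and now - entry[0] < self._ttl:
            return entry[1]  # still fresh: no request needed
        value = compute()
        self._entries[key] = (now, value)
        return value
```

With `TtlCache(2 * 60 * 60)`, a value such as the current user would be fetched once and then reused until two hours have passed.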
GregoryTravis
e3afa5561d
Add Decimal.round (#9672)
2024-04-11 15:47:50 +00:00
Radosław Waśko
5650c7aed2
Refactoring Enso_File to be path based (#9581)
- Closes #9289
- Ensures that we can refer through `Enso_File` to files that do not _yet_ exist - preparing us for implementing the Write functionalities for `Enso_File` (#9291).
2024-04-09 11:15:29 +00:00
Radosław Waśko
f2d6079ac4
Fix missing AWS region in S3 operations (#9546)
- Closes #9284
- Now our tests run without the default `AWS_` config, thus ensuring that the tested setups work in a clean environment.
- In the end, more complicated logic was needed for bucket access: apparently the AWS SDK only allows some bucket operations if the client is connected to the correct region, so detection of bucket regions had to be implemented.
- Added `AWS_Region` widget based on autoscoping.
- Fixed `AWS_Credential.profile_names` crashing if no AWS config was found. Now it returns no profiles if not found. Added a regression test.
2024-03-27 12:00:15 +00:00
Radosław Waśko
af5354b869
Data Link for reading Enso_File (#9525)
- Closes #9282
2024-03-27 04:17:07 +00:00
Radosław Waśko
6665c22eb9
Make data-links behave more like 'symlinks' (#9485)
- Closes #9324
2024-03-22 17:01:54 +00:00
James Dunkerley
283c0b61d9
Data link for Snowflake. (#9514)
Adds Snowflake to the Datalink APIs.
![image](https://github.com/enso-org/enso/assets/4699705/32bd347c-0b2b-47b5-bec2-5c939ecd0594)
2024-03-21 17:06:56 +00:00
James Dunkerley
2f0d99a1cb
Snowflake Connectivity (#9435)
* Initial connection to Snowflake via an account, username and password.

* Fix databases and schemas in Snowflake.
Add warehouses.

* Add warehouse.
Update schema dropdowns.

* Add ability to set warehouse and pass at connect.

* Fix for NPE in license review

* scalafmt

* Separate Snowflake from Database.

* Scala fmt.

* Legal Review

* Avoid using ARROW for snowflake.

* Tidy up Entity_Naming_Properties.

* Fix for separating Entity_Naming_Properties.

* Allow some tweaking of Postgres dialect to allow snowflake to use as well.

* Working on reading Date, Time and Date Times.

* Changelog.

* Java format.

* Make Snowflake Time and TimeStamp stuff work.
Move some responsibilities to Type_Mapping.

* fix

* Update distribution/lib/Standard/Database/0.0.0-dev/src/Connection/Connection.enso

Co-authored-by: Radosław Waśko <radoslaw.wasko@enso.org>

* PR comments.

* Last refactor for PR.

* Fix.

---------

Co-authored-by: Radosław Waśko <radoslaw.wasko@enso.org>
Co-authored-by: mergify[bot] <37929162+mergify[bot]@users.noreply.github.com>
2024-03-20 10:06:12 +00:00
Radosław Waśko
6e5b4d93a3
Implement refreshing the Cloud token in Enso libraries (#9390)
- Closes #9300
- Now the Enso libraries are themselves capable of refreshing the access token, thus there are no more problems if the token expires during a long-running workflow.
- Adds `get_optional_field` sibling to `get_required_field` for more unified parsing of JSON responses from the Cloud.
- Adds `expected_type` that checks the type of extracted fields. This way, if the response is malformed we get a nice Enso Cloud error telling us what is wrong with the payload instead of a `Type_Error` later down the line.
- Fixes `Test.expect_panic_with` to actually catch only panics. Before it used to also handle dataflow errors - but these have `.should_fail_with` instead. We should distinguish these scenarios.
2024-03-19 19:26:34 +00:00
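The field helpers from #9390 can be pictured as follows. This Python sketch borrows the names `get_required_field`, `get_optional_field` and `expected_type` from the commit message, but the signatures and the `CloudResponseError` type are assumptions made for illustration. The idea is that a malformed payload fails early with a clear message.

```python
class CloudResponseError(Exception):
    """Hypothetical error type for malformed Cloud payloads."""


def _check_type(key, value, expected_type):
    # Report a payload problem right away instead of a type error later on.
    if expected_type is not None and not isinstance(value, expected_type):
        raise CloudResponseError(
            f"Field '{key}' should be {expected_type.__name__}, "
            f"got {type(value).__name__}."
        )
    return value


def get_required_field(payload, key, expected_type=None):
    if key not in payload:
        raise CloudResponseError(f"Missing required field '{key}'.")
    return _check_type(key, payload[key], expected_type)


def get_optional_field(payload, key, default=None, expected_type=None):
    if key not in payload or payload[key] is None:
        return default
    return _check_type(key, payload[key], expected_type)
```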
GregoryTravis
9a9eff1aa6
Decimal type: constructors, comparisons, and arithmetic (#9272)
2024-03-15 21:13:41 +00:00
Radosław Waśko
e98306f170
Excel DataLink (#9346)
- Adds the Excel format as one of the formats supported when creating a data link.
- The data link can choose to read the file as a workbook, or read a sheet or range from it as a table, like `Excel_Format`.
- Also updated Delimited format dialog to allow customizing the quote style.
2024-03-11 16:12:12 +00:00
AdRiley
3ebf1340e8
Add write to xml document (#9299)
* First commit

* Add xml.write

* Add comment

* Changelog.md

* Code review changes

* Code review changes

* Update import
2024-03-06 17:13:28 +00:00
Radosław Waśko
e37862b09d
Implement a Data Link for Postgres (#9269)
- Closes #9124
2024-03-06 11:57:12 +00:00
AdRiley
8b889f0977
Make Table.To_Xml return a XML_Document (#9263)
As part of the XML improvements it makes more sense for Table.To_Xml to return an XML_Document.
2024-03-04 15:19:20 +00:00
Radosław Waśko
39af372bcd
Allow resolving enso:// URIs in Data.read and other places (#9225)
- Implements the core parts of #9048
- Currently the path resolution is done by resolving each segment, one by one - requiring as many API calls as there are segments in the path.
- This should be replaced in a followup PR, once https://github.com/enso-org/cloud-v2/issues/899 is implemented.
2024-03-02 16:04:30 +00:00
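The cost noted in #9225, one API call per path segment, is easy to see in a sketch. The Python below is hypothetical: `list_children` stands in for a single Cloud API call that returns a mapping from child name to asset id, and `"root"` is a made-up id for the starting directory.

```python
def resolve_path(path, list_children, root_id="root"):
    """Resolve a slash-separated path to an asset id, one lookup per segment.

    Returns the resolved id together with the number of lookups performed.
    """
    current = root_id
    calls = 0
    for segment in (s for s in path.split("/") if s):
        children = list_children(current)  # stands in for one API call
        calls += 1
        if segment not in children:
            raise FileNotFoundError(f"No asset named {segment!r}")
        current = children[segment]
    return current, calls
```

A path with three segments therefore needs three round trips, which is what a dedicated path resolution endpoint would avoid.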
James Dunkerley
964fdfd7ea
Align XML_Document and XML_Element APIs more. (#9233)
- Added `to_table` extensions on some core types.
- Added ICON to `Any.to`.
- Added ICON to `Column.info`, `Table.info`, `DB_Column.info` and `DB_Table.info`.
- Added defaults to `Table.cross_tab` and `DB_Table.cross_tab`.
- Added `name`, `get`, `at`, `inner_xml` and `outer_xml` to `XML_Document`.
- Added constants into left hand side of simple expressions.
- Added widget to `get` and `at` on `XML_Document` and `XML_Element`. (Some bug in annotation code with Dmitry)
- Altered `get` and `at` to not allow XPath and just get direct child/attribute values.
- Added `get_xpath` to `XML_Document`.
- Renamed `get_elements_by_tag_name` to `get_descendants_by_tag_name` and added new `get_children_by_tag_name`.
- Added `child_names` and `attribute_names` to `XML_Document` and `XML_Element`.
2024-03-01 17:42:44 +00:00
Radosław Waśko
4316709379
Implementing reading Data Links (#9215)
- Close #9123
2024-03-01 15:33:21 +00:00
James Dunkerley
8f2b9da664
IsNa to IsNothing, missing to Nothing in Table code. (#9154)
Starting to use Nothing everywhere...
2024-02-26 10:52:07 +00:00
James Dunkerley
0e2a91cfe1
Remove countMissing from Storage and replace with a new CountNothing operation. (#9137)
Removing another small piece of logic from the storages to its own operation.
2024-02-22 19:32:46 +00:00
James Dunkerley
ee66b9fb1d
Refactoring the Unary operations so uncoupled from Storage. (#9090)
In order to allow clever masking, slicing, filtering and arrow backing stores...

- Adding ColumnStorage interface with the base API a storage will need.
- Refactored each of the unary operations to a new `UnaryOperation` interface which makes them responsible for deciding if they can be executed.
2024-02-19 17:11:52 +00:00
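The design in #9090, where each operation decides for itself whether it can run on a given storage, might look roughly like this. The real code is Java; this Python sketch only shows the shape of the idea, and the method names `can_apply` and `apply` as well as both example operations are assumptions, with storages modelled as plain lists.

```python
from abc import ABC, abstractmethod


class UnaryOperation(ABC):
    """An operation that knows whether it supports a given storage."""

    @abstractmethod
    def can_apply(self, storage) -> bool: ...

    @abstractmethod
    def apply(self, storage): ...


class IsNothing(UnaryOperation):
    def can_apply(self, storage) -> bool:
        return True  # meaningful for any storage

    def apply(self, storage):
        return [value is None for value in storage]


class TextLength(UnaryOperation):
    def can_apply(self, storage) -> bool:
        return all(v is None or isinstance(v, str) for v in storage)

    def apply(self, storage):
        return [None if v is None else len(v) for v in storage]


def run(operation: UnaryOperation, storage):
    """Run the operation only if it declares support for this storage."""
    if not operation.can_apply(storage):
        raise TypeError(f"{type(operation).__name__} cannot be applied here")
    return operation.apply(storage)
```

The storage no longer needs to know which operations exist, which is the decoupling the commit title refers to.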
James Dunkerley
f2d2f73e89
Starting to refactor Storage and Operations (#9076)
Cleaning up some of the structures in Storage before working on UnaryOperations.

- Removed some legacy code: `countMask`, `Index` and `DefaultIndex`.
- Renamed `mask` to `applyFilter` on `Column` and `Storage`.
- Renamed `Table.mask` to `Table.filter`.
2024-02-15 18:21:07 +00:00
Pavel Marek
5919eda753
Fix incremental compilation of runtime/test (#8975)
2024-02-13 10:05:31 +01:00
AdRiley
9339672e0e
Remove _new and actually run the new tests (#9006)
Merge conflict on develop meant this one got left with a new_test.
2024-02-09 14:19:02 +00:00
AdRiley
e3f6ff1772
Add to_xml component (#8979)
Adds new to_xml component
2024-02-07 20:54:48 +00:00
James Dunkerley
0c39f8ec04
Allow Filter_Condition to be inverted. (#8861)
- Various linting fixes (doc comments and type annotations etc.).
- Add an action to determine if a `Filter_Condition` is keep or remove.

https://github.com/enso-org/enso/assets/4699705/69ba2bd3-8893-4237-acc4-eb01f534a209

- Remove `Not_In`, `Not_Contains` and `Not_Like` from `Filter_Condition`.

- Ability to use an `Expression` as a `Column_Ref`.

https://github.com/enso-org/enso/assets/4699705/16a2e030-f8f9-4f59-beca-2646f56fcb90
2024-02-07 14:36:14 +00:00
AdRiley
340a3eec4e
Split HashJoin to SimpleHashJoin and CompoundHashJoin (#8850)
Completes #8342. Creates a SimpleHashJoin and CompoundHashJoin.

# Important Notes
Creates SimpleHashJoin and CompoundHashJoin.

CompoundHashJoin is what was HashJoin.
SimpleHashJoin is a new implementation that only indexes the smaller of the two tables being joined.

The rest is refactoring and clean-up of the shared join code.
2024-02-01 18:48:44 +00:00
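The idea behind SimpleHashJoin in #8850, indexing only the smaller table and probing it with the larger one, can be sketched in a few lines of Python. This is an illustration only: tables are modelled as lists of dicts, the join is an inner equi-join on a single key, and the function name is made up.

```python
from collections import defaultdict


def simple_hash_join(left, right, key):
    """Inner equi-join of two lists of dicts, indexing only the smaller one."""
    build_is_right = len(right) < len(left)
    build, probe = (right, left) if build_is_right else (left, right)

    # Build phase: hash index over the smaller table.
    index = defaultdict(list)
    for row in build:
        index[row[key]].append(row)

    # Probe phase: stream the larger table through the index.
    result = []
    for probe_row in probe:
        for build_row in index.get(probe_row[key], []):
            # Keep (left, right) order regardless of which side was indexed.
            if build_is_right:
                result.append((probe_row, build_row))
            else:
                result.append((build_row, probe_row))
    return result
```

Memory use then scales with the smaller input, while the larger one is only iterated once.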
James Dunkerley
eeaddbc434
Add parser for line by line processing (#8719)
- Linting fixes and groups.
- Add `File.from that:Text` and use `File` conversions instead of taking both `File` and `Text` and calling `File.new`.
- Align Unix Epoch with the UTC timezone and add converting from long value to `Date_Time` using it.
- Add simple first logging API allowing writing to log messages from Enso.
- Fix minor style issue where a test type had an empty constructor.
- Added a `long` based array builder.
- Added `File_By_Line` to read a file line by line.
- Added "fast" JSON parser based off Jackson.
- Altered range `to_vector` to be a proxy Vector.
- Added `at` and `get` to `Database.Column`.
- Added `get` to `Table.Column`.
- Added ability to expand `Vector`, `Array`, `Range`, `Date_Range` to columns.
- Altered so `expand_to_column` default column name will be the same as the input column (i.e. no `Value` suffix).
- Added ability to expand `Map`, `JS_Object` and `Jackson_Object` to rows with two columns coming out (and extra key column).
- Fixed a bug where an integer index couldn't be used to expand to rows.
2024-02-01 07:29:50 +00:00
GregoryTravis
7436848e90
Implement relational NULL/Nothing for join for in-memory tables (#8849)
Implements relational NULL for join, for all `Join_Kind`s.
2024-01-29 16:19:07 +00:00
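Assuming "relational NULL" in #8849 refers to the SQL rule that a NULL key never matches anything, not even another NULL, an inner join under that rule can be sketched in Python as below. The function name is hypothetical, `None` plays the role of Nothing, and only the inner join is shown, not the other `Join_Kind`s.

```python
def inner_join_null_aware(left, right, key):
    """Inner join of two lists of dicts where None keys never match."""
    index = {}
    for row in right:
        if row[key] is not None:  # rows with a None key are never indexed
            index.setdefault(row[key], []).append(row)
    return [
        (l, r)
        for l in left
        if l[key] is not None    # and never probed
        for r in index.get(l[key], [])
    ]
```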
Radosław Waśko
ca4f98c78e
Adding tests and missing methods for Enso_File. (#8815)
- Closes #8808 - adds tests for various scenarios.
- Implements `size` using HEAD.
- Updates existing functions to changes in Cloud API.
- Adds stubs for `*_time` methods, `parent`, `path`.
- [x] TODO: resolve the `Enso_File.current_working_directory` from an environment variable.
- ~~TODO: recursive directory deletion?~~ left for later

# Important Notes
- Currently, the Cloud API does not offer an easy way to extract metadata for a file, in particular to get the parent folder from the file `id`.
- We should be able to get the parent, and stuff like creation/modified time.
- We need a way to resolve paths to asset ids, for `path` to work as well as `current_working_directory`.
- What is the environment variable that will be used to feed the `current_working_directory` property?
2024-01-26 19:04:42 +00:00