Commit Graph

337 Commits

Author SHA1 Message Date
James Dunkerley
6b544650b3
New NumberParser for Table parsing (#11499)
Replaces the regex-based number parser with a new parser that produces the same results by working out each part of the number format as it sees an example of it.

Close #7398 - performance of reading the large CSV now about 2s (down from 15-20s).
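
As a rough illustration of the "work out each part as it sees an example" idea, here is a minimal Python sketch. It is not the Enso `NumberParser`: the class name, the restriction to `.` and `,` separators, and the fallback to `.` while the format is still undecided are all assumptions made for this example.

```python
class InferringNumberParser:
    """Hypothetical sketch: learns the decimal/thousands separators from input."""

    def __init__(self):
        self.decimal = None    # learned decimal separator, "." or ","
        self.thousands = None  # learned thousands separator

    def _learn(self, text):
        seps = [c for c in text if c in ".,"]
        if self.decimal is not None or not seps:
            return
        if len(set(seps)) == 2:
            # Both kinds present: the last one must be the decimal separator.
            self.decimal = seps[-1]
        elif len(seps) > 1:
            # The same separator repeated can only be grouping thousands.
            self.decimal = "," if seps[0] == "." else "."
        else:
            return  # a single separator is ambiguous, so learn nothing yet
        self.thousands = "," if self.decimal == "." else "."

    def parse(self, text):
        text = text.strip()
        self._learn(text)
        decimal = self.decimal or "."  # assumed default while undecided
        thousands = "," if decimal == "." else "."
        return float(text.replace(thousands, "").replace(decimal, "."))
```

Once a value such as `1.234.567,5` has fixed the separators, later values like `12,5` are read with the learned format instead of being re-matched against every candidate pattern.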
2024-11-13 19:08:23 +00:00
Gregory Michael Travis
fb50a8f24f
HTTP cache size limit environment variables (#11530) 2024-11-13 13:40:54 -05:00
Radosław Waśko
e76fe907d3
Initial implementation of Data.read_many (#11490)
- Part of #11311
- Adds ability to read a list of files (Vector, Column, Table) into a Vector.
- Reading into a Table of objects or merged will come in a follow-up PR.
2024-11-08 19:03:47 +00:00
James Dunkerley
86c1cd9953
Support for 1904 date format. (#11496)
- Adds support for reading Excel workbooks in 1904 date format.
- When writing to a workbook in 1904 format, will write dates correctly.
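
For readers unfamiliar with the two Excel date systems, the following Python sketch shows how a serial day number maps to a calendar date in each. It is an illustration only, not the Enso implementation; the function name and the `date1904` flag are assumptions for this example.

```python
from datetime import date, timedelta

# Effective epoch of the 1900 system for serials >= 61, i.e. after the
# non-existent 1900-02-29 that Excel counts as serial 60.
EPOCH_1900 = date(1899, 12, 30)
# In the 1904 system serial 0 is 1904-01-01.
EPOCH_1904 = date(1904, 1, 1)


def excel_serial_to_date(serial, date1904=False):
    """Convert a whole-day Excel serial number to a date."""
    epoch = EPOCH_1904 if date1904 else EPOCH_1900
    return epoch + timedelta(days=serial)
```

The two systems differ by a constant 1462 days, so the same cell value read with the wrong setting lands about four years away from the intended date.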

![image](https://github.com/user-attachments/assets/c17cd65d-1a09-4aa8-a946-8d427a2b7c22)

![image](https://github.com/user-attachments/assets/66796dac-4271-4bd1-acb3-1127afb5ec0b)
2024-11-05 23:10:34 +00:00
James Dunkerley
c5734a8fc8
Improved Google Analytics integration (#11484)
- Enhanced Google Analytics API.
- Now published as a type with static methods not a module.
- Bump version and add Admin API.
- Moved the reading logic to Java from Enso.
- Add dependency on Standard Table allowing report to be built into a Java Table directly.
- New `Google_Credential.new` method.
![image](https://github.com/user-attachments/assets/54e3ad87-045f-4e40-b609-337d827c5d02)
- Ability to list accounts for a credential (`Google_Analytics.list_accounts`).
![image](https://github.com/user-attachments/assets/296c6dcc-3b24-43fa-b909-5e74c40d77a1)
- Ability to list properties (either for an account or for all) (`Google_Analytics.list_properties`).
![image](https://github.com/user-attachments/assets/e420c824-d08e-48d0-b21c-560b4c7c4809)
- Simple object structure of `Google_Analytics_Account`, `Google_Analytics_Property` and `Google_Analytics_Field` with some helper methods.
- Widget for `account`, `property` and `credentials`.
![image](https://github.com/user-attachments/assets/221c1450-964d-4fce-af8b-2273aa8739a1)
![image](https://github.com/user-attachments/assets/e1daf1dd-2ade-4c33-875c-4e3cb1544fe6)
![image](https://github.com/user-attachments/assets/cd37b018-4fad-4771-9f48-1448f0076ef9)
- Widget for `dimensions` and `metrics` with defaults and then reading from Admin API.
![image](https://github.com/user-attachments/assets/3a4b1d42-9555-499d-90da-04d7586ab4c1)
![image](https://github.com/user-attachments/assets/16efcb11-3547-4eaf-9f28-944fa21c4aa2)
- Added widget for `start_date` and `end_date` on `Google_Analytics.read`.
- Bug fix for `parse` with auto type by reordering to allow numeric dates to be parsed.
- **ToDo**: better exception handling.
2024-11-05 10:11:42 +00:00
Gregory Michael Travis
dc50a7e369
HTTP response caching, with TTL and LRU logic (#11342) 2024-10-30 12:50:35 +00:00
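
The commit title above names two eviction rules, TTL and LRU. A minimal Python sketch of how the two can be combined in one cache follows; the class, its parameters and the injectable clock are assumptions for illustration and do not describe the Enso HTTP cache itself.

```python
import time
from collections import OrderedDict


class TtlLruCache:
    """Hypothetical cache: entries expire after a TTL, oldest-used go first."""

    def __init__(self, max_entries, ttl_seconds, clock=time.monotonic):
        self.max_entries = max_entries
        self.ttl = ttl_seconds
        self.clock = clock
        self._entries = OrderedDict()  # key -> (expires_at, value)

    def get(self, key):
        item = self._entries.get(key)
        if item is None:
            return None
        expires_at, value = item
        if self.clock() >= expires_at:
            # TTL rule: a stale entry is dropped and treated as a miss.
            del self._entries[key]
            return None
        # LRU rule: a hit makes the entry the most recently used.
        self._entries.move_to_end(key)
        return value

    def put(self, key, value):
        self._entries[key] = (self.clock() + self.ttl, value)
        self._entries.move_to_end(key)
        while len(self._entries) > self.max_entries:
            # Evict from the least recently used end.
            self._entries.popitem(last=False)
```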
James Dunkerley
78d9e34840
Excel before 1900 and AWS signed requests. (#11373) 2024-10-28 20:20:06 +00:00
James Dunkerley
5f44c512a8
Handle mixed Date and Date Time column within Excel (#11349)
- If a column contains both Date and DateTime in Excel, we create a DateTime column.
- If the column contains numbers or text as well then we end up with a mixed column.
![image](https://github.com/user-attachments/assets/b0b98d1c-c5c5-41db-8af5-0c946d8a5b92)
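
The rule described above can be sketched in Python as follows. The type names are plain strings, and promoting a date to midnight of the same day is an assumption of this sketch, not something the commit states.

```python
from datetime import date, datetime


def infer_column_type(values):
    """Return "Date", "Date_Time" or "Mixed" for a column of cell values."""
    kinds = set()
    for v in values:
        if v is None:
            continue  # empty cells do not influence the type
        if isinstance(v, datetime):  # check first: datetime subclasses date
            kinds.add("Date_Time")
        elif isinstance(v, date):
            kinds.add("Date")
        else:
            kinds.add("Other")  # numbers, text, ...
    if "Other" in kinds or not kinds:
        return "Mixed"
    return "Date_Time" if "Date_Time" in kinds else "Date"


def widen_to_date_time(values):
    """Promote plain dates to midnight (assumed) so the column is uniform."""
    return [
        datetime(v.year, v.month, v.day)
        if isinstance(v, date) and not isinstance(v, datetime)
        else v
        for v in values
    ]
```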
2024-10-17 16:42:42 +00:00
Radosław Waśko
d75e20c1d2
Save Database connection as data link, SQL Server data link support (#11343)
- Closes #11294
2024-10-17 09:06:57 +00:00
James Dunkerley
96fa2ee35a
Small fixes.. (#11338)
- Handle calculated header rows in Excel.
![image](https://github.com/user-attachments/assets/e0b307b1-90c2-435b-8d78-9e1b0e8d3932)
- Better reporting of Incomparable Values (bug fix in handler, add catch to add_row_number and running).
![image](https://github.com/user-attachments/assets/9d2ee953-ae5f-45f3-b3fa-6d593529bfc9)
- Remove default from `tokenize` as it was generating hundreds of rows.
- Added error to `Data.read` if no path provided.
![image](https://github.com/user-attachments/assets/71c8cd5f-ec40-4d8c-9972-94aa6fb9d3de)
2024-10-16 17:31:26 +00:00
Radosław Waśko
2843dcbf4a
When connecting to a Postgres database through a datalink stored on Enso Cloud, its asset ID is included in the audit logs (#11291)
- Closes #9869
2024-10-10 15:18:47 +00:00
Radosław Waśko
3458fe4fe1
Accessing and modifying description and labels of Enso Cloud assets (#11255)
- Closes #11227
- Additionally, it should fix #11278 by ensuring that every scheduled message goes to the desired endpoint, by splitting each batch by endpoint.
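
Splitting a batch by endpoint amounts to a group-by on the destination. A small Python sketch, assuming each pending message is a hypothetical `(endpoint, payload)` pair:

```python
from collections import defaultdict


def split_batch_by_endpoint(messages):
    """Group (endpoint, payload) pairs so each sub-batch has one destination."""
    groups = defaultdict(list)
    for endpoint, payload in messages:
        groups[endpoint].append(payload)
    return dict(groups)
```

Each resulting list can then be sent in a single request to its own endpoint, with the original order of messages preserved within each group.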
2024-10-10 12:11:10 +00:00
Gregory Michael Travis
cce50fab3a
Extend the range of int/float arguments to outside the range of Java long, in ceil, floor, and truncate (#11135) 2024-10-07 11:36:17 -04:00
James Dunkerley
fd72ab7052
Remove some catch alls (#11254)
- Allow Interrupted Exceptions to float out of the web requests.
- Use `Type_Error` rather than Any when catching auto scoping resolving.
- Rename `Java_Exception` to `JException`
2024-10-04 21:32:56 +00:00
James Dunkerley
6ea716f1b3
Widget for Database.connect (#11216)
- Use auto-scoping for Redshift, SQLServer and Snowflake.
![image](https://github.com/user-attachments/assets/2f5ff24a-44f4-4e87-909a-e064b8653511)
- Fix for widgets on Header functions.
![image](https://github.com/user-attachments/assets/4384efcf-a4da-48b1-b571-1167ca8d0134)
- Move `Snowflake_Details` to `Standard.Snowflake.Connection` namespace to make widgets work.
- Add widgets to `Snowflake_Details`.
![image](https://github.com/user-attachments/assets/b51d0126-a768-4f4a-9d87-c42c8e91e26b)
- Typo fix for SQLServer SPI.
- Change SQLServer port to be an Integer and added default.
- Reordered parameters on SQLServer **(potentially breaking change)**.
- Added widgets to SQLServer.
![image](https://github.com/user-attachments/assets/1c744da4-7913-4a87-9e64-fc10442a06eb)
- Added widget for JDBC options (as well as conversion from Vector to options).
![image](https://github.com/user-attachments/assets/4958b1e4-4cbc-43e3-8381-64e5ce7ea8ff)
- Added header alias to `use_first_row_as_names`.
- Added various aliases to `read` and `write`.
2024-10-01 08:43:03 +00:00
Kaz Wesley
e587d564f8
Improve backend error handling (#11136)
- Fix debug logging for #11088 case--attempt to create an exception that is its own cause fails.
- In case the parser is used after closing, throw an `IllegalStateException` instead of UB. (This case is not known to occur and doesn't seem to be behind #11121, but we should handle it more safely if it does.)
2024-09-20 13:23:52 +00:00
Hubert Plociniczak
2c362ea519
More info when NPE is encountered (#11125)
We are seeing this problem almost daily and need more info rather urgently.
Related to #11088.
2024-09-18 23:14:36 +02:00
James Dunkerley
ce4c741af1
Add today, now and time to expressions (#10944)
* Add today, now and time to Expression.

* Move running and compute into Column as that allows them to be used in expressions.

* Fix bug.

* Fix exports.

* Java fmt.
2024-09-02 11:13:51 +01:00
GregoryTravis
ad9fa4b8b6
Add vectorized rounding operation to Decimal columns (#10912) 2024-08-31 07:06:12 +00:00
James Dunkerley
91226be378
Small tweaks from QA (#10941) 2024-08-31 09:04:52 +02:00
Radosław Waśko
50325b6a1d
Pending Audit Logs are sent in batches (#10918) 2024-08-30 15:10:54 +02:00
GregoryTravis
1804f317b2
Implement .floor, .ceil, .trunc for the in-memory Decimal column (#10887)
* wip

* wip

* test

* round pending

* changelog

* fix test

* fully enable tests

* fix test
2024-08-28 14:27:26 -04:00
GregoryTravis
8260a9587f
Column-level lexically-scoped CTE expressions (#10826)
This implements `DB_Column.with`, which uses `WITH ... AS` SQL clauses to remove duplicates in the generated SQL.

After a discussion with @radeusgd, we concluded that we will probably want a more complete CTE implementation, so this one is useful for now to deal with big queries (like `round`).
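
To illustrate the underlying SQL feature (not the SQL that `DB_Column.with` actually generates), the Python sketch below runs the same computation twice against an in-memory SQLite database: once with the shared subexpression repeated inline, and once bound to a name with `WITH ... AS`. The table, column and CTE names are made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (x REAL)")
conn.executemany("INSERT INTO t VALUES (?)", [(1.25,), (2.5,), (-3.75,)])

# Without a CTE the shared subexpression (x * 10) is spelled out twice.
inline_sql = "SELECT (x * 10) + ABS(x * 10) AS y FROM t"

# With a CTE it is written once under a short name and then referenced.
cte_sql = """
WITH scaled AS (SELECT x * 10 AS s FROM t)
SELECT s + ABS(s) AS y FROM scaled
"""

inline_rows = sorted(conn.execute(inline_sql).fetchall())
cte_rows = sorted(conn.execute(cte_sql).fetchall())
```

Both queries return the same rows; the saving grows with the size of the repeated expression and the number of times it is referenced.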

# Important Notes
Still to do in this PR:

- [x] Rename `with` to `let` (or something similar)
- [x] tests
- [x] documentation
- [x] remove `State` hack by moving query generation into a class and using a `Ref` field for scoping

Results on `round_float`:

|  | SQL length in characters (unprettified) | SQL length in lines (prettified) |
| --- | --- | --- |
| Without CTEs | 13193 | 851 |
| With CTEs | 3644 | 187 |

Compare the SQL:

[without-ctes.sql.txt](https://github.com/user-attachments/files/16629356/without-ctes.sql.txt)
[with-ctes.sql.txt](https://github.com/user-attachments/files/16629357/with-ctes.sql.txt)

Update, with name shortening:

|  | SQL length in characters (unprettified) | SQL length in lines (prettified) |
| --- | --- | --- |
| Without CTEs | 13193 | 853 |
| With CTEs | 2427 | 176 |

[without-cte.txt](https://github.com/user-attachments/files/16694328/without-cte.txt)
[with-cte.txt](https://github.com/user-attachments/files/16694327/with-cte.txt)
2024-08-28 18:23:51 +00:00
Radosław Waśko
ff5e4c4e0a
Include projectName in audit logs (#10892)
- Closes #9875
2024-08-27 13:13:28 +00:00
James Dunkerley
422fa8c16b
Adding support for creating Atoms in expressions (#10820)
- Enables the `..` autoscoping style for creating Atoms in expressions.
- Add type checking to methods in columns.
- Auto wrap returns from method in expressions into a column as needed.
- Remove `Time_Period.Day` to remove confusion.
2024-08-15 15:52:30 +00:00
Radosław Waśko
b1958f8aa3
Adding vectorized implementations to some Column operations (#10795)
- Part of #6256 - implements operations that could have been vectorized without changes to the overall infrastructure
2024-08-13 08:53:39 +00:00
James Dunkerley
b8c036c476
Initial Tableau Reading Support (#10733)
- Adds `Hyper_File` allowing reading a Tableau hyper file.
- Can read the schema and table list.
- Can read the structure of a table.
- Can read data into an Enso Table.
2024-08-07 09:23:05 +00:00
Radosław Waśko
3fd14642d9
Fix upload/delete transactions in Snowflake backend (#10738)
Fixes #10609 by rewriting all our upload-related operations to rely on `DDL_Transaction`, an abstraction that handles the 'transactionality' of `CREATE TABLE` statements depending on whether a given backend allows DDL statements within transactions. If it does not, transactionality is emulated by creating the tables outside of the transaction and dropping them on rollback.
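
The emulation strategy can be sketched in Python as a context manager. This is a simplified illustration, not Enso's `DDL_Transaction`: the `execute` callback, the `supports_transactional_ddl` flag and the `create_table` helper are hypothetical names, and the sketch ignores any data statements that would run alongside the DDL.

```python
from contextlib import contextmanager


@contextmanager
def ddl_transaction(execute, supports_transactional_ddl):
    """Yield a create_table helper whose effects are undone on failure."""
    created = []

    def create_table(name, columns_sql):
        execute(f"CREATE TABLE {name} ({columns_sql})")
        created.append(name)

    if supports_transactional_ddl:
        execute("BEGIN")
    try:
        yield create_table
    except BaseException:
        if supports_transactional_ddl:
            execute("ROLLBACK")
        else:
            # Emulated rollback: drop what was created, newest first.
            for name in reversed(created):
                execute(f"DROP TABLE {name}")
        raise
    else:
        if supports_transactional_ddl:
            execute("COMMIT")
```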
2024-08-06 08:14:44 +00:00
AdRiley
0c552489e3
Add Initial SQL Server support (#10624)
* Squash all commits to resolve merge conflicts

* Fix merge problems

* Merge fix

* Fix port

* Fix warning

* cargo fmt

* legal review

* Small fixes

* Update instructions

* Code review feedback

* Cleanup

* typo

* Fix

* Remove leftover snowflake code

* Remove comment

* Add underscore

* Type cleanup

* Code review fix

* Cleanup

* Add datetime roundtrip test

* add comment

* drop

* Refactor

* Refactor

* Fix merge

* Fix

* Fix

* fix

* Add comment
2024-07-30 11:13:08 +01:00
GregoryTravis
f31c084f43
Implement in-memory and database mixed decimal column comparisons (#10614) 2024-07-25 21:27:19 +00:00
Jaroslav Tulach
c20eab2af9
Detect compilation while benchmarking (#10574)
Enables `engine.TruffleCompilation` in `std-benchmarks`, collects the logs and dumps compilation info to `System.err` when a benchmark is influenced by dynamic compilation.
2024-07-18 15:49:16 +00:00
Radosław Waśko
632355f85b
Snowflake Dialect pt. 4 - reading a column of small integers as Integer type, other type mapping tests (#10518)
- Related to #9486
- Ensures that even though an integer column in Snowflake is represented by `Decimal` type, if the values are small enough, they are materialized as `Integer`.
- If the values are larger, they are still read in as `Decimal`.
- Adds tests for some other `Decimal` edge cases (various precisions and scales), and for `Float`.
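
A Python sketch of the materialization rule described above. Treating "small enough" as "fits in a signed 64-bit integer" is an assumption of this example, as is the function name.

```python
from decimal import Decimal

INT64_MIN, INT64_MAX = -(2**63), 2**63 - 1  # assumed bound for "small enough"


def materialize(values):
    """Return ints if every Decimal is integral and in range, else unchanged."""

    def fits(v):
        return v is None or (
            v == v.to_integral_value() and INT64_MIN <= v <= INT64_MAX
        )

    if all(fits(v) for v in values):
        return [None if v is None else int(v) for v in values]
    return list(values)
```

A single out-of-range or fractional value keeps the whole column as `Decimal`, so a column never ends up with a mix of the two representations.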
2024-07-11 20:14:46 +00:00
Jaroslav Tulach
220b40a1cd
Enforce conversion method return type & introduce Comparable.new (#10468) 2024-07-11 06:58:51 +02:00
Radosław Waśko
48c17845a7
Fixing Database tests and Snowflake Dialect - part 3 out of ... (#10458)
- Related to #9486
- Fixes types in literal tables that are used throughout the tests
- Tries to make testing faster by disabling some edge cases, trying batching some queries, re-using the main connection and trying to re-use tables more
- Implements date/time type mapping and operations for Snowflake
- Updates type mapping to correctly reflect what Snowflake does
- Disables warnings for Integer->Decimal coercion as that's too annoying and implicitly understood in Snowflake
- Allows selecting a Decimal column with `..By_Type ..Integer` (only in Snowflake backend) because the Decimal column there is its 'de-facto' Integer column replacement.
2024-07-10 13:21:30 +00:00
James Dunkerley
8da06309e9
Date Time Pickers, Temporarily Disable Encoding.default (#10493)
- Widgets for Date_Time, Time_Of_Day and Time_Zone.
- Disable Encoding.default for now as big performance impact on CSVs.

![image](https://github.com/enso-org/enso/assets/4699705/c1b936f0-3ab4-490c-8fe5-2310ef1ed079)

![image](https://github.com/enso-org/enso/assets/4699705/d5e29ec4-cc52-41e5-a532-17cd6dff34b9)

![image](https://github.com/enso-org/enso/assets/4699705/61455519-ea63-4275-9c7a-603714ff9f85)

![image](https://github.com/enso-org/enso/assets/4699705/48ccd3ad-5e15-49f9-87cd-4710ca559843)
2024-07-09 21:04:08 +00:00
Radosław Waśko
a3dc50fe1e
Replace presigned S3 URL with lambda request (#10456)
- Closes #10419
2024-07-09 09:36:10 +00:00
James Dunkerley
018d4c312f
Stop publishing Postgres constructor, update Postgres_Details.Postgres to Postgres.Server. (#10466)
![image](https://github.com/enso-org/enso/assets/4699705/6d0d4167-e97b-4765-8079-650ad091ce60)

- Rename `Postgres_Details` to `Postgres`.
- Rename `Postgres` constructor to `Server`.
- Update SPI.
- Linting issues (indent, missing doc comment)
2024-07-08 07:58:08 +00:00
GregoryTravis
48fb999eb3
Implement Decimal support for Postgres backend (#10216)
* treat scale nothing as unspecified

* cast to decimal

* float int biginteger

* conversion failure ints

* loss of decimal precision

* precision loss for mixed column to float

* mixed columns

* loss of precision on inexact float conversion

* cleanup, reuse

* changelog

* review

* no fits bd

* no warning on 0.1 conversion

* fmt

* big_decimal_fetcher

* default fetcher and statement setting

* round-trip d

* fix warning

* expr +10

* double builder retype to bigdecimal

* Use BD fetcher for underspecified postgres numeric column, not inferred builder, and do not use biginteger builder for integral bigdecimal values

* fix tests

* fix test

* cast_op_type

* no-ops for other dialects

* Types

* sum + avg

* avg + sum test

* fix test

* update agg type inference test

* wip

* is_int8, stddev

* more doc, overflow check

* fmt

* finish round-trip test

* wip
2024-07-02 15:01:55 -04:00
AdRiley
c324c78e23
Add duplicates component (#10323)
* Update existing behaviour to match new

* Add signatures

* Red test

* First test green

* sbt javafmtAll

* In-Memory working

* Not implemented for In-Db

* Docs

* Disable tests for in-db

* Changelog

* Code review changes

* Fix

* Fix

* Fix tests
2024-06-24 13:29:03 +03:00
Jaroslav Tulach
fe2cf49568
Run whole test/Base_Tests in native image runner (#10296) 2024-06-21 06:03:53 +02:00
Radosław Waśko
a8358512ad
Small fixes to Cloud Integration (#10303)
- Includes HTTP method in error message
- Does not do special handling for `403` status code - this was wrong and led to `Unauthorized` error when the real cause was lack of permissions in the Cloud. The errors should be more understandable now.
- Adds `projectSessionId` to audit log metadata.
- Fixes a test (`Secrets_Spec`) that did not have unique names and would fail if cleanup of previous runs failed (or if ran in parallel).
2024-06-18 09:41:33 +00:00
Radosław Waśko
41d02e95ef
Implement Windows-1252 fallback logic for Encoding.Default (#10190)
- Closes #10148
- [x] Tests for `Restartable_Input_Stream`, `peek_bytes` and `skip_n_bytes`.
- [x] Report `Managed_Resource` stack overflow bug: #10211
- [x] Followup possible optimization: #10220
- [x] Test use-case from blog.
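
The general shape of such a fallback, sketched in Python: try strict UTF-8 first and only fall back to Windows-1252 when the bytes are not valid UTF-8. This is an illustration under that assumption, not the Enso implementation; the function name and the returned encoding label are made up.

```python
def decode_with_fallback(data):
    """Decode bytes as UTF-8, falling back to Windows-1252 on invalid input."""
    try:
        return data.decode("utf-8"), "utf-8"
    except UnicodeDecodeError:
        # A few byte values are undefined in cp1252, hence errors="replace".
        return data.decode("cp1252", errors="replace"), "windows-1252"
```

Returning the encoding that was actually used lets the caller attach a warning when the fallback was taken.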
2024-06-10 10:49:26 +00:00
GregoryTravis
4aa3d52b60
Implement conversions for Decimal column (#10206)
* treat scale nothing as unspecified

* cast to decimal

* float int biginteger

* conversion failure ints

* loss of decimal precision

* precision loss for mixed column to float

* mixed columns

* loss of precision on inexact float conversion

* cleanup, reuse

* changelog

* review

* no fits bd

* no warning on 0.1 conversion

* fmt
2024-06-07 15:37:32 -04:00
GregoryTravis
5fad3558a6
BigDecimalBuilder and arithmetic operations. (#9950)
* hack

* make a column

* add

* no scale=0 on BD type

* a test

* wip

* 3 arithmetic ops

* /

* wip

* BigDecimalPowerOp

* wip

* mod test

* NumericBinaryOpReturningBigDecimal

* with scalar

* misc arithmetic tests

* fix integralBigDecimalToInteger

* mixed columns

* bigdecimal pow via double

* cleanup

* j2e on get

* arithmetic exception

* mod 0

* cleanup

* fmt

* changelog

* check type first

* merge

* mc error message

* add BD case to Builder.java

* fmt

* changelog

* add BD case to StorageConverter.java

* fmt

* fix test
2024-06-04 13:59:31 -04:00
Radosław Waśko
7cf80f3196
Handle UTF BOM when decoding text (#10130)
- Improve BOM handling: detect and skip the BOM character, add a Default encoding that detects the encoding based on the BOM if present, and warn if an unexpected BOM is encountered.
- Closes #9849
- Windows-1252 fallback will be done as a separate PR as it has additional complexity. Tracked in ticket #10148.
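
BOM detection can be sketched in Python with the constants from the standard `codecs` module. The function name and the UTF-8 default are assumptions; UTF-32 is left out because its little-endian BOM begins with the same bytes as the UTF-16 one and would need to be checked first.

```python
import codecs

# Byte order marks this sketch recognises, with the encoding each implies.
BOMS = [
    (codecs.BOM_UTF8, "utf-8"),
    (codecs.BOM_UTF16_LE, "utf-16-le"),
    (codecs.BOM_UTF16_BE, "utf-16-be"),
]


def decode_skipping_bom(data, default="utf-8"):
    """Pick the encoding from a leading BOM if present and strip the BOM."""
    for bom, encoding in BOMS:
        if data.startswith(bom):
            return data[len(bom):].decode(encoding)
    return data.decode(default)
```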
2024-06-04 13:22:19 +00:00
AdRiley
06327f8fde
Add statistic product (#10122)
Add Statistic.Product

![image](https://github.com/enso-org/enso/assets/1720119/f7fc7bb5-9efe-4dbe-9150-cd9e5101c553)
2024-05-31 09:29:52 +00:00
Radosław Waśko
233f28235a
Small fixes to Postgres integration (#10105)
- Better message when saving datalink in disabled Output context:
![image](https://github.com/enso-org/enso/assets/1436948/540d615b-79ff-4811-8262-a0475a7b6923)
Before it was:
![image](https://github.com/enso-org/enso/assets/1436948/51198bf1-1e50-41bc-a56b-f829bc32d09a)

- Hack to get Postgres widget to display connection options:
![image](https://github.com/enso-org/enso/assets/1436948/39f3db39-1163-4815-b59f-c629d812e2ab)
Before, the `Postgres` constructor was created without any parameters and it was not showing any parameters for modification.
2024-05-28 14:34:44 +00:00
James Dunkerley
ab4b1f0f35
Add day_of_week and day_of_year to Column and DB_Column (#10081)
- Adds support for getting the weekday as an integer (1 Monday - 7 Sunday - ISO standard).
- Add support for getting the day of year.
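
For reference, the ISO numbering mentioned above (1 for Monday through 7 for Sunday) is the same convention as Python's `date.isoweekday()`. The helper names below mirror the Enso method names only for readability and are not the Enso API.

```python
from datetime import date


def day_of_week(d):
    """ISO weekday: 1 = Monday ... 7 = Sunday."""
    return d.isoweekday()


def day_of_year(d):
    """1-based ordinal day within the year."""
    return d.timetuple().tm_yday
```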
2024-05-27 11:29:25 +00:00
Jaroslav Tulach
16c1b74218
Enso Library Feature to execute (a bit of) Base_Tests (#9997) 2024-05-23 08:20:19 +02:00
Radosław Waśko
1e0649fda1
Improvements to Table.union (#9968)
- Closes #9952
2024-05-22 09:38:10 +00:00