- Improve BOM handling: detect and skip the BOM character, Default encoding that detects encoding based on BOM if present, warnings if unexpected BOM is encountered.
- Closes#9849
- Windows-1252 fallback will be done as a separate PR as it has additional complexity. Tracked in ticket #10148.
As reported by our users, when using the AWS SSO, our code was failing with:
```
Execution finished with an error: To use Sso related properties in the 'xyz' profile, the 'sso' service module must be on the class path.
```
This PR adds the missing JARs to fix that.
Additionally it improves the license review tool UX a bit (parts of #9122):
- sorting the report by amount of problems, so that dependencies with unresolved problems appear at the top,
- semi-automatic helper button to rename package configurations after a version bump,
- button to remove stale entries from config (files or copyrights that disappeared after update),
- button to add custom copyright notice text straight from the report UI,
- button to set a file as the license for the project (creating the `custom-license` file automatically)
- ability to filter processed projects - e.g. `openLegalReviewReport AWS` will only run on the AWS subproject - saving time processing unchanged dependencies,
- updated the license search heuristic, fixing a problem with duplicates:
- if we had dependencies `netty-http` and `netty-http2`, because of a prefix-check logic, the notices for `netty-http` would also appear again for `netty-http2`, which is not valid. I have improved the heuristic to avoid these false positives and removed them from the current report.
- WIP: button to mark a license type as reviewed (not finished in this PR).
- Closes#9289
- Ensures that we can refer through `Enso_File` to files that do not _yet_ exist - preparing us for implementing the Write functionalities for `Enso_File` (#9291).
- Closes#9284
- Now our tests run without the default `AWS_` config, thus ensuring that the tested setups work in a clean environment.
- After all, more complicated logic was needed for buckets access - apparently the AWS SDK only allows for some operations on buckets to happen if the client is connected to the correct region. Thus detection of bucket regions had to be implemented.
- Added `AWS_Region` widget based on autoscoping.
- Fixed `AWS_Credential.profile_names` crashing if no AWS config was found. Now it returns no profiles if not found. Added a regression test.
* Initial connection to Snowflake via an account, username and password.
* Fix databases and schemas in Snowflake.
Add warehouses.
* Add warehouse.
Update schema dropdowns.
* Add ability to set warehouse and pass at connect.
* Fix for NPE in license review
* scalafmt
* Separate Snowflake from Database.
* Scala fmt.
* Legal Review
* Avoid using ARROW for snowflake.
* Tidy up Entity_Naming_Properties.
* Fix for separating Entity_Namimg_Properties.
* Allow some tweaking of Postgres dialect to allow snowflake to use as well.
* Working on reading Date, Time and Date Times.
* Changelog.
* Java format.
* Make Snowflake Time and TimeStamp stuff work.
Move some responsibilities to Type_Mapping.
* Make Snowflake Time and TimeStamp stuff work.
Move some responsibilities to Type_Mapping.
* fix
* Update distribution/lib/Standard/Database/0.0.0-dev/src/Connection/Connection.enso
Co-authored-by: Radosław Waśko <radoslaw.wasko@enso.org>
* PR comments.
* Last refactor for PR.
* Fix.
---------
Co-authored-by: Radosław Waśko <radoslaw.wasko@enso.org>
Co-authored-by: mergify[bot] <37929162+mergify[bot]@users.noreply.github.com>
- Closes#9300
- Now the Enso libraries are themselves capable of refreshing the access token, thus there is no more problems if the token expires during a long running workflow.
- Adds `get_optional_field` sibling to `get_required_field` for more unified parsing of JSON responses from the Cloud.
- Adds `expected_type` that checks the type of extracted fields. This way, if the response is malformed we get a nice Enso Cloud error telling us what is wrong with the payload instead of a `Type_Error` later down the line.
- Fixes `Test.expect_panic_with` to actually catch only panics. Before it used to also handle dataflow errors - but these have `.should_fail_with` instead. We should distinguish these scenarios.
- Adjusted `AWS_Credential` to have a `Default` and removed support for `Nothing` from functions.
- Renamed `Response_Body.to_file` to `Response_Body.write`.
- Add `write` to `Response`.
- Add `Table.get_value` and `DB_Table.get_value` allowing getting a single value from a table.
- Added `Data.download` allowing downloading from a URL to a file.
- Added text widget as input widget for `Data` methods.
![image](https://github.com/enso-org/enso/assets/4699705/fcfc8b1e-1197-4106-b8a7-43b1435327c0)
- 4 weeks ago we discussed with @jdunkerley we may want to move this module.
- `Data` is a submodule mostly for various data-structures etc., the cloud stuff did not fit that well in there.
- Closes#8723
- Adds some missing features that were needed to make this work:
- `Enso_File.create_directory` and `Enso_File.delete`, and basic tests for it
- Changes how `Enso_Secret.list` is obtained - using a different Cloud endpoint allows us to implement the desired logic, the default endpoint was giving us _all_ secrets which was not what we wanted here.
- Implements `Enso_Secret.update` and tests for it
# Important Notes
Notes describing any problems with the current Cloud API:
https://docs.google.com/document/d/1x8RUt3KkwyhlxGux7XUGfOdtFSAZV3fI9lSSqQ3XsXk/edit
Apparently, everything that was needed to make this feature work has already been implemented, although a few features needed workarounds on Enso side to work properly.
- Closes#8555
- Refactors the file format detection logic, compacting lots of repetitive logic for HTTP handling into helper functions.
- Some updates to CODEOWNERS.
- Closes#8111 by making sure that all Excel workbooks are read using a backing file (which should be more memory efficient).
- If the workbook is being opened from an input stream, that stream is materialized to a `Temporary_File`.
- Adds tests fetching Table formats from HTTP.
- Extends `simple-httpbin` with ability to serve files for our tests.
- Ensures that the `Infer` option on `Excel` format also works with streams, if content-type metadata is available (e.g. from HTTP headers).
- Implements a `Temporary_File` facility that can be used to create a temporary file that is deleted once all references to the `Temporary_File` instance are GCed.
- Add a `File_For_Read` type. Used for `File_Format` to read files.
- Added `Enso_User` representing the current user in `Enso_Cloud`.
- *Will be later able to list known users.*
- Added `Enso_Secret` representing a value defined in `Enso_Cloud`.
- Value not used within Enso only accessed within polyglot Java.
- Integrated into `Username_And_Password` and can be used within JDBC connections.
- Integrated into HTTP Headers so a secret can be used as a value.
- New `URI_With_Query` with the same API as `URI`. Supporting secrets in the value.
- *Will be integrated with AWS credentials.*
- Added `Enso_File` representing a file or a folder in the cloud.
- Support the same API as `File` (like the `S3_File`).
- *Will support `enso://` URI style access.*
- Closes#7981
- Adds a `RUNTIME_ERROR` operation into the DB dialect, that may be used to 'crash' a query if a condition is met - used to validate if `lookup_and_replace` invariants are still satisfied when the query is materialized.
- Removes old `Table_Helpers.is_table` and `same_backend` checks, in favour of the new way of checking this that relies on `Table.from` conversions, and is much simpler to use and also more robust.
- Fixes a bug where creating a temporary table could accidentally issue a `DROP` statement of the table name that the user provided, risking destruction of user data.
- Fortunately, the bad scenario was almost impossible, because the `DROP` statement was only issued _if_ we previously checked that the mentioned table _does not exist_ - dropping a nonexistent table does not do any harm.
- It could have been dangerous in a very unlikely scenario that a table was created just between the _existence check_ and the _drop_.
- After the fix the existence check and any modifications are done within a transaction to avoid interference from concurrent modifications, and the DROP is correctly applied to a temporary Enso table instead of the original one.
- Replaced a temporary log with proper simple logging of SQL statements into a file, if an Environment variable is set.
- Used that feature to test that no unexpected statements occur.
- Shuffle a few `from`s into correct places:
- `Day_Of_Week.from` removing `Day_Of_Week_From` module.
- Adding short cut for `http` and `https` in `Data.read` so it calls onto `Data.fetch` giving a single entry point.
- Moved `URI` extensions from `Standard.Base.Data` module into `Standard.Base.Network.Extensions`.
- Added `post` extension for `URI`.
- Added `contains_key` to `JS_Object`.
- Restored `into` in `JS_Object`:
- Follows old logic populating a constructor.
- Will use conversion from `JS_Object` if present.
- Added automatic deserialization of `Date`, `Time_Of_Day` and `Date_Time` from JSON.
- Uses conversion from `JS_Object`.
- Added conversion from `Text` to a `HTTP_Method` and type checking where `HTTP_Method` used in public APIs.
- Added support for `Date`, `Time_Of_Day` and `Date_Time` in `Table.from_objects`.
- Added `expand_column` to `Table` to expand `JS_Object` to values.
- Add type checking for `Table` in `right` arguments (allowing `Column`s to be used).
- Use type checking in `Table.set` to allow for conversion to a `Column`.
- Remove some unused imports.
- Fix for bug in S3 edge case.
- Added a `FileSystemSPI` allowing protocol resolution to a target type.
- Separated `Input_Stream` and `Output_Stream` from `File` to allow use in other spaces.
- `File_Format` types `read_web` changed to be `read_stream` working with `InputStream`.
- Added directory listing to `Auto_Detect` allowing for `Data.read` to list a folder.
- Adjusted HTTP to return an `InputStream` not a `byte[]`:
- `Response_Body` adjusted to wrap an `InputStream`.
- Added ability to materialize to either and in-memory vector (<4KB) or a temporary file.
- `Data.fetch` will materialize if not a recognized mime-type.
- Added `HTTP_Error` to handle IO exceptions from the stream.
- `Excel_Format` now supports mime-type and reading a stream.
- `Excel_Workbook` can now get a `Excel_Section` using `read_section`.
- Added S3 APIs:
- `parse_uri`: splits an S3 URI into bucket and key.
- `list_objects`: list the items in a S3 bucket with specified prefix.
- `read_bucket`: list prefixes and keys with a delimiter in a S3 bucket with specified prefix.
- `head`: either head_bucket (tests existance) or head_object API (reads object meta data).
- `get_object`: gets an object from S3 returning as a `Response_Body`.
- Added `S3_File` type acting like a `File`:
- No support for writing in this PR.
- **ToDo:** recursive listing, glob filtering, exists, size.
- Fixed a few invalid type signature line.
- Moved `create` methods for `Postgres_Connection` and `SQLite_Connection` into type instead of module.
- Renamed `Column_Fetcher.Builder` to `Column_Fetcher_Builder`.
- Fixed bug with `select_into` in Dry Run mode creating permanent tables.
**ToDo:** Unit tests.
- Fixes#7354
- And also closes#7712
- Refactors how we handle numeric ops - ensuring that the 'kernels' are placed all in one place and selected based on storage types.
- Update list of groups to agreed list.
- Lower case `ALIAS` names to be consistent with function names.
- Add `GROUP` to methods.
- All constructors and functions have doc comments.
- Correct a few typos (e.g. `PRVIATE`).
- Mark some more things as `PRIVATE`.
- Use `ToDo:` and `Note:` consistently.
- Order tags in doc comment.
# Important Notes
We don't have all the doc comments on types and will want to add them in future,
- Closes#5951
- Ensures any SQL warnings reported by the database through the JDBC driver are processed and forwarded to the user.
- These warnings show issues like the implicit name truncation that this PR is also solving. It's good to make sure they are visible as they can help avoid and understand unexpected problems. They should not show up in most standard workflows.
- Adds simple history to our REPL.