enso-org/enso - enso - gitea: Gitea Service

mirror of https://github.com/enso-org/enso.git synced 2024-12-30 05:42:08 +03:00

Author	SHA1	Message	Date
Radosław Waśko	5f0a16c87c	Audit Logs for Postgres connections opened through a data link (#9873 ) - Closes #9599 - Implemented API for sending audit logs to the cloud on a background thread. - If the Postgres connection is opened through a datalink, its internal JDBC connection is replaced by a wrapper that reports executed queries to the audit log. - Also introduces `EnsoMeta` - a helper Java class that can be used in our helper libraries to access Enso types. - I have replaced the common pattern scattered throughout the codebase with calls to this 'library' to avoid repetitive code. - Refactored `Table.display` to share code between in-memory and DB - it was needed as the function stopped working for `DB_Table` after making the `Table` constructor `private`. - Clearer error when reading a SQLite database from a remote file (tells the user to download it first). - Follow up - correlate asset id of the data link: #9869 - Follow up - include project name (once bug is fixed): #9875 - Some problems/improvements of the audit log: - The audit log system is not yet ready for high throughput of logs #9870 - The logs may be lost if `System.exit` is used #9871	2024-05-11 08:54:33 +00:00
AdRiley	e25ec96aaa	Add table running variance skew sd and kurtosis (#9854 ) Adds support for Variance, Skew, Standard Deviation and Kurtosis to Table.Running.	2024-05-09 08:45:29 +00:00
AdRiley	15976a8505	Make table.Running return integer typed columns for min/max (#9853 ) * New Tests * Green * Running min for longs * Unsupported types test * Revert * Add support for all the integer types * Another test	2024-05-07 10:49:12 +01:00
AdRiley	f647045214	Make excel writer work for all types (#9846 ) * New Test * Improve DateTime recognition * Re-enable slow test * If there is a time take it regardless of format * If there is a time take it regardless of format * Code Review Changes	2024-05-03 07:09:54 +01:00
AdRiley	5350b2d00d	Refactor add row number (#9822 ) * Refactor add row number * Refactor * Green * Green * Remove dead code * Cleanup * Deduplicate check	2024-05-02 12:29:54 +01:00
James Dunkerley	d2e6ff260e	Restructure `SQLite_Details`. (#9832 ) ``` type SQLite_Details SQLite location:File|In_Memory type In_Memory ``` to ``` type SQLite From_File location:File In_Memory ``` # Important Notes Splits the In-Memory entry for Database Connect but still works nicely. ![image](https://github.com/enso-org/enso/assets/4699705/ec798ce0-9f41-4903-a2fd-722a9e37743c) ![image](https://github.com/enso-org/enso/assets/4699705/f233b055-893e-4c56-a23d-562e982560f6)	2024-05-01 22:15:41 +00:00
James Dunkerley	4d6d6f239c	Handle URL encoding automatically in query string. (#9823 ) A small fix to automatically encode the query string. Attaches a warning if needed. ![image](https://github.com/enso-org/enso/assets/4699705/032bdb59-6896-46c0-b970-f5a542cc6adf) ![image](https://github.com/enso-org/enso/assets/4699705/6b2075b9-3c98-4de2-8a34-c860ecd65d0c)	2024-04-30 22:03:46 +00:00
AdRiley	d1bf4cb771	Add Ignored_Nothing_Values (#9770 ) Add a `IgnoredNothing` warning for Table.Running ![image](https://github.com/enso-org/enso/assets/1720119/1941d278-2c33-43fe-a175-8bcc65bae51a) ![image](https://github.com/enso-org/enso/assets/1720119/b5f6b235-d939-4868-9490-de0f226ea1a2) ![image](https://github.com/enso-org/enso/assets/1720119/a1d617a6-a684-4cc1-be13-c4907d2e6876)	2024-04-30 13:30:40 +00:00
AdRiley	32c3f5f3e8	Make Table.should_equal and Column.should_equal consider NaN equal (#9799 ) * Make Column.should_equal detect columns of different types and think nan==nan * Refactor Table.should_equal * More Column tests * Adjust spacing * Tests Green * Check same number of columns * Refactor * Extra test * Code Review Changes * Fix * Fix * Fix tests * Fix Tests * Fix Test * Fix test * Code review change	2024-04-29 22:21:34 +01:00
Jaroslav Tulach	0d495ffd97	Make conversion of double to BigDecimal exact (#9740 ) Resolves #9607 by computing `Number.hash` by converting given number to `Float` first and then computing the hash. Also the conversion from `Float.to Decimal` is exact - done via `new BigDecimal(double)`. There is `Decimal.new` that handles the user-friendly conversion. However as a result `Decimal.from 2.1 != Decimal.new 2.1` - that's the only way to ensure consistency between hash code and conversions.	2024-04-25 11:22:50 +00:00
James Dunkerley	fb9cf38914	`Excel_Workbook.read_many` (#9759 ) - Some minor linting fixes. - Adjust `headers` parameter to use a dedicated type. ![image](https://github.com/enso-org/enso/assets/4699705/989f464d-df95-410e-a03b-36661f1c4a37) - Fix bug with `read` on an `Excel_Workbook` so the error is handled more gracefully instead of panicking to the UI. ![image](https://github.com/enso-org/enso/assets/4699705/23b4575f-daad-4719-a5cc-30d064bd7f7a) - Fix bug when writing to a file with an `Excel_Format` with an invalid extension which was causing a panic. ![image](https://github.com/enso-org/enso/assets/4699705/dc0e055c-c1b6-482f-b129-eb69f6554d72) - Add `read_many` to `Excel_Workbook` allowing reading more than one sheet at a time.	2024-04-24 13:16:44 +00:00
AdRiley	4a97bfa31f	Add table running functionality for Sum, Mean, Min, Max. (#9577 ) * Add Table.Running * Code Review fixes * Code Review changes * Change null handling	2024-04-23 09:45:43 +01:00
AdRiley	ceaba7f48d	Make excel writer work for custom types (#9752 )	2024-04-20 10:34:06 +01:00
GregoryTravis	86ecd3e027	Add `Decimal.floor`, `.ceil`, and `.trunc` (#9694 )	2024-04-17 18:42:38 +00:00
Radosław Waśko	fda41cbfd1	Writing Cloud files (#9686 ) - Closes #9291	2024-04-16 14:01:03 +00:00
Radosław Waśko	bdda1830b7	Integrate Cloud path resolver (#9662 ) - Closes #9363 - Cleans up the Cloud mock as it got a bit messy. It still implements the bare minimum to be able to test basic secret and auth handling logic 'offline' (added very simple path resolution, only handling the minimum set of cases for the tests to work). - Adds first implementation of caching Cloud replies. - Currently only caching the `Enso_User.current`. This is a simple one to cache because we do not expect it to ever change, so it can be safely cached for a long period of time (I chose 2h to make it still refresh from time to time while not being noticeable). - We may try using this for caching other values in future PRs.	2024-04-12 13:03:09 +00:00
GregoryTravis	e3afa5561d	Add `Decimal.round` (#9672 )	2024-04-11 15:47:50 +00:00
Radosław Waśko	5650c7aed2	Refactoring `Enso_File` to be path based (#9581 ) - Closes #9289 - Ensures that we can refer through `Enso_File` to files that do not _yet_ exist - preparing us for implementing the Write functionalities for `Enso_File` (#9291).	2024-04-09 11:15:29 +00:00
Radosław Waśko	f2d6079ac4	Fix missing AWS region in S3 operations (#9546 ) - Closes #9284 - Now our tests run without the default `AWS_` config, thus ensuring that the tested setups work in a clean environment. - After all, more complicated logic was needed for buckets access - apparently the AWS SDK only allows for some operations on buckets to happen if the client is connected to the correct region. Thus detection of bucket regions had to be implemented. - Added `AWS_Region` widget based on autoscoping. - Fixed `AWS_Credential.profile_names` crashing if no AWS config was found. Now it returns no profiles if not found. Added a regression test.	2024-03-27 12:00:15 +00:00
Radosław Waśko	af5354b869	Data Link for reading `Enso_File` (#9525 ) - Closes #9282	2024-03-27 04:17:07 +00:00
Radosław Waśko	6665c22eb9	Make data-links behave more like 'symlinks' (#9485 ) - Closes #9324	2024-03-22 17:01:54 +00:00
James Dunkerley	283c0b61d9	Data link for Snowflake. (#9514 ) Adding in Snowflake into the Datalink APIs. ![image](https://github.com/enso-org/enso/assets/4699705/32bd347c-0b2b-47b5-bec2-5c939ecd0594)	2024-03-21 17:06:56 +00:00
James Dunkerley	2f0d99a1cb	Snowflake Connectivity (#9435 ) * Initial connection to Snowflake via an account, username and password. * Fix databases and schemas in Snowflake. Add warehouses. * Add warehouse. Update schema dropdowns. * Add ability to set warehouse and pass at connect. * Fix for NPE in license review * scalafmt * Separate Snowflake from Database. * Scala fmt. * Legal Review * Avoid using ARROW for snowflake. * Tidy up Entity_Naming_Properties. * Fix for separating Entity_Naming_Properties. * Allow some tweaking of Postgres dialect to allow snowflake to use as well. * Working on reading Date, Time and Date Times. * Changelog. * Java format. * Make Snowflake Time and TimeStamp stuff work. Move some responsibilities to Type_Mapping. * Make Snowflake Time and TimeStamp stuff work. Move some responsibilities to Type_Mapping. * fix * Update distribution/lib/Standard/Database/0.0.0-dev/src/Connection/Connection.enso Co-authored-by: Radosław Waśko <radoslaw.wasko@enso.org> * PR comments. * Last refactor for PR. * Fix. --------- Co-authored-by: Radosław Waśko <radoslaw.wasko@enso.org> Co-authored-by: mergify[bot] <37929162+mergify[bot]@users.noreply.github.com>	2024-03-20 10:06:12 +00:00
Radosław Waśko	6e5b4d93a3	Implement refreshing the Cloud token in Enso libraries (#9390 ) - Closes #9300 - Now the Enso libraries are themselves capable of refreshing the access token, thus there are no more problems if the token expires during a long running workflow. - Adds `get_optional_field` sibling to `get_required_field` for more unified parsing of JSON responses from the Cloud. - Adds `expected_type` that checks the type of extracted fields. This way, if the response is malformed we get a nice Enso Cloud error telling us what is wrong with the payload instead of a `Type_Error` later down the line. - Fixes `Test.expect_panic_with` to actually catch only panics. Before it used to also handle dataflow errors - but these have `.should_fail_with` instead. We should distinguish these scenarios.	2024-03-19 19:26:34 +00:00
GregoryTravis	9a9eff1aa6	`Decimal` type: constructors, comparisons, and arithmetic (#9272 )	2024-03-15 21:13:41 +00:00
Radosław Waśko	e98306f170	Excel DataLink (#9346 ) - Adds the Excel format as one of the formats supported when creating a data link. - The data link can choose to read the file as a workbook, or read a sheet or range from it as a table, like `Excel_Format`. - Also updated Delimited format dialog to allow customizing the quote style.	2024-03-11 16:12:12 +00:00
AdRiley	3ebf1340e8	Add write to xml document (#9299 ) * First commit * Add xml.write * Add comment * Changelog.md * Code review changes * Code review changes * Update import	2024-03-06 17:13:28 +00:00
Radosław Waśko	e37862b09d	Implement a Data Link for Postgres (#9269 ) - Closes #9124	2024-03-06 11:57:12 +00:00
AdRiley	8b889f0977	Make Table.To_Xml return a XML_Document (#9263 ) As part of the XML improvements it makes more sense for Table.To_Xml to return a XML_Document.	2024-03-04 15:19:20 +00:00
Radosław Waśko	39af372bcd	Allow resolving `enso://` URIs in `Data.read` and other places (#9225 ) - Implements the core parts of #9048 - Currently the path resolution is done by resolving each segment, one by one - requiring as many API calls as there are segments in the path. - This should be replaced in a followup PR, once https://github.com/enso-org/cloud-v2/issues/899 is implemented.	2024-03-02 16:04:30 +00:00
James Dunkerley	964fdfd7ea	Align XML_Document and XML_Element APIs more. (#9233 ) - Added `to_table` extensions on some core types. - Added ICON to `Any.to`. - Added ICON to `Column.info`, `Table.info`, `DB_Column.info` and `DB_Table.info`. - Added defaults to `Table.cross_tab` and `DB_Table.cross_tab`. - Added `name`, `get`, `at`, `inner_xml` and `outer_xml` to `XML_Document`. - Added constants into left hand side of simple expressions. - Added widget to `get` and `at` on `XML_Document` and `XML_Element`. (Some bug in annotation code with Dmitry) - Altered `get` and `at` to not allow XPath and just get direct child/attribute values. - Added `get_xpath` to `XML_Document`. - Renamed `get_elements_by_tag_name` to `get_descendants_by_tag_name` and added new `get_children_by_tag_name`. - Added `child_names` and `attribute_names` to `XML_Document` and `XML_Element`.	2024-03-01 17:42:44 +00:00
Radosław Waśko	4316709379	Implementing reading Data Links (#9215 ) - Close #9123	2024-03-01 15:33:21 +00:00
James Dunkerley	8f2b9da664	IsNa to IsNothing, missing to Nothing in Table code. (#9154 ) Starting to use Nothing everywhere...	2024-02-26 10:52:07 +00:00
James Dunkerley	0e2a91cfe1	Remove `countMissing` from Storage and replace with a new `CountNothing` operation. (#9137 ) Removing another small piece of logic from the storages to its own operation.	2024-02-22 19:32:46 +00:00
James Dunkerley	ee66b9fb1d	Refactoring the Unary operations so uncoupled from Storage. (#9090 ) In order to allow clever masking, slicing, filtering and arrow backing stores... - Adding ColumnStorage interface with the base API a storage will need. - Refactored each of the unary operations to a new `UnaryOperation` interface which makes them responsible for deciding if they can be executed.	2024-02-19 17:11:52 +00:00
James Dunkerley	f2d2f73e89	Starting to refactor Storage and Operations (#9076 ) Cleaning up some of the structures in Storage before working on UnaryOperations. - Removed some legacy code: `countMask`, `Index` and `DefaultIndex`. - Renamed `mask` to `applyFilter` on `Column` and `Storage`. - Renamed `Table.mask` to `Table.filter`.	2024-02-15 18:21:07 +00:00
Pavel Marek	5919eda753	Fix incremental compilation of runtime/test (#8975 )	2024-02-13 10:05:31 +01:00
AdRiley	9339672e0e	Remove _new and actually run the new tests (#9006 ) Merge conflict on develop meant this one got left with a new_test.	2024-02-09 14:19:02 +00:00
AdRiley	e3f6ff1772	Add to_xml component (#8979 ) Adds new to_xml component	2024-02-07 20:54:48 +00:00
James Dunkerley	0c39f8ec04	Allow Filter_Condition to be inverted. (#8861 ) - Various linting fixes (doc comments and type annotations etc.). - Add an action to determine if a `Filter_Condition` is keep or remove. https://github.com/enso-org/enso/assets/4699705/69ba2bd3-8893-4237-acc4-eb01f534a209 - Remove `Not_In`, `Not_Contains` and `Not_Like` from `Filter_Condition`. - Ability to use an `Expression` as a `Column_Ref`. https://github.com/enso-org/enso/assets/4699705/16a2e030-f8f9-4f59-beca-2646f56fcb90	2024-02-07 14:36:14 +00:00
AdRiley	340a3eec4e	Split HashJoin to SimpleHashJoin and CompoundHashJoin (#8850 ) Completes #8342 . Creates a SimpleHashJoin and CompoundHashJoin. # Important Notes Creates SimpleHashJoin and CompoundHashJoin. CompoundHashJoin is what was HashJoin. SimpleHashJoin is a new implementation that only indexes the smaller of the 2 tables being joined together. The rest is refactor and clean-up of the shared join code.	2024-02-01 18:48:44 +00:00
James Dunkerley	eeaddbc434	Add parser for line by line processing (#8719 ) - ✅Linting fixes and groups. - ✅Add `File.from that:Text` and use `File` conversions instead of taking both `File` and `Text` and calling `File.new`. - ✅Align Unix Epoch with the UTC timezone and add converting from long value to `Date_Time` using it. - ❌Add simple first logging API allowing writing to log messages from Enso. - ✅Fix minor style issue where a test type had an empty constructor. - ❌Added a `long` based array builder. - Added `File_By_Line` to read a file line by line. - Added "fast" JSON parser based off Jackson. - ✅Altered range `to_vector` to be a proxy Vector. - ✅Added `at` and `get` to `Database.Column`. - ✅Added `get` to `Table.Column`. - ✅Added ability to expand `Vector`, `Array` `Range`, `Date_Range` to columns. - ✅Altered so `expand_to_column` default column name will be the same as the input column (i.e. no `Value` suffix). - ✅Added ability to expand `Map`, `JS_Object` and `Jackson_Object` to rows with two columns coming out (and extra key column). - ✅ Fixed bug where couldn't use integer index to expand to rows.	2024-02-01 07:29:50 +00:00
GregoryTravis	7436848e90	Implement relational NULL/Nothing for join for in-memory tables (#8849 ) Implements relational NULL for join, for all `Join_Kind`s.	2024-01-29 16:19:07 +00:00
Radosław Waśko	ca4f98c78e	Adding tests and missing methods for `Enso_File`. (#8815 ) - Closes #8808 - adds tests for various scenarios. - Implements `size` using HEAD. - Updates existing functions to changes in Cloud API. - Adds stubs for `*_time` methods, `parent`, `path`. - [x] TODO: resolve the `Enso_File.current_working_directory` from an environment variable. - ~~TODO: recursive directory deletion?~~ left for later # Important Notes - Currently, the Cloud API does not offer an easy way to extract metadata for a file, in particular to get the parent folder from the file `id`. - We should be able to get the parent, and stuff like creation/modified time. - We need a way to resolve paths to asset ids, for `path` to work as well as `current_working_directory`. - What is the environment variable that will be used to feed the `current_working_directory` property?	2024-01-26 19:04:42 +00:00
James Dunkerley	0b6db5797c	Refactor OrderMask to avoid memory copying (#8863 ) Goal of this PR is to refactor the design of OrderMask and avoid copying arrays or lists wherever possible. We have removed a few legacy functions which were not being used. On a poor man's benchmark it seems to be quicker (13s vs 16s) and memory usage should be lower.	2024-01-26 11:16:16 +00:00
GregoryTravis	5eb3f3bd1d	Implement relational NULL semantics for Nothing for in-memory Column operations (#8816 ) Updates in-memory table column operations to treat Nothing as a relational NULL. This PR does not include changes to Table.join.	2024-01-24 17:02:45 +00:00
AdRiley	6eb00a7c5f	Remove Conditions Helper Class (#8842 ) * Tests green checkpoint * Remove ConditionsHelper * javafmtAll * Dedupe	2024-01-24 16:56:39 +00:00
Radosław Waśko	edfcfde11c	Tests and improvements for secrets in cloud subdirectories (#8791 ) - Closes #8723 - Adds some missing features that were needed to make this work: - `Enso_File.create_directory` and `Enso_File.delete`, and basic tests for it - Changes how `Enso_Secret.list` is obtained - using a different Cloud endpoint allows us to implement the desired logic, the default endpoint was giving us _all_ secrets which was not what we wanted here. - Implements `Enso_Secret.update` and tests for it # Important Notes Notes describing any problems with the current Cloud API: https://docs.google.com/document/d/1x8RUt3KkwyhlxGux7XUGfOdtFSAZV3fI9lSSqQ3XsXk/edit Apparently, everything that was needed to make this feature work has already been implemented, although a few features needed workarounds on Enso side to work properly.	2024-01-24 10:17:22 +00:00
Radosław Waśko	368e4867b4	Allow secrets in `AWS_Credential` (#8774 ) - Closes #8722	2024-01-19 19:00:56 +00:00
Radosław Waśko	14be36c401	Allow secrets in `Header.authorization_*` (#8761 ) - Closes #8739	2024-01-18 12:49:47 +00:00
AdRiley	b8e93b3cba	Add new text_left and text_right functions (#8691 ) Added text_left and text_right functions for in-memory and databases	2024-01-15 23:43:23 +00:00
Radosław Waśko	5b70ff25f7	Remove `set_user_info` from URI (#8738 ) I have added this in #8591, but I have realised it may not be a good idea to have it, so I am removing that particular change.	2024-01-15 17:35:17 +00:00
Radosław Waśko	f34abeda0c	Add tests for `Enso_Secret`s, update to new cloud API (#8736 ) - Closes #8556	2024-01-15 16:12:08 +00:00
Jaroslav Tulach	0e6952710a	Executing (parts of) Truffle TCK with Enso values (#8685 )	2024-01-12 07:21:16 +01:00
AdRiley	f31ecc7c87	Make fill_nothing take an empty string (#8643 ) * Add new test for required behaviour * Handle case where strArg is an empty string * More tests around fixed width field. Remove unneeded duplicate logic * javafmtAll * Further simplification * SQLite doesn't have full type system * SQLite doesn't have full type system	2024-01-10 11:59:10 +00:00
AdRiley	bf8dd1888c	Give file read its own helper widget for delimiters. (#8627 ) Give file read its own helper widget for delimiters. Remove newline add none. The file read delimiter is similar but different to the split one and so should have its own set of options.	2024-01-04 11:59:42 +00:00
James Dunkerley	ffa06c9476	Sort handling of Nothing within Column || and && (#8656 ) Follows the database logic: ![image](https://github.com/enso-org/enso/assets/4699705/328a0e36-5508-4c63-a60b-ac9a280cd93a) Results: ![image](https://github.com/enso-org/enso/assets/4699705/77d6bf82-21f8-4aed-b4c5-45e429798189)	2024-01-03 10:40:40 +00:00
Radosław Waśko	d41d48e8a0	Merge `URI_With_Query` into `URI`, extend API of `URI` (#8591 ) - Closes #8544 - Adds `reset_query_arguments` and `/` operators allowing to transform a URI. - Adding tests for handling of various edge cases.	2023-12-21 18:39:26 +00:00
AdRiley	cfe0cbe0c1	Add text_length to column for in-memory and database (#8606 ) Closes #8521 Adds text_length to Column	2023-12-21 11:31:13 +00:00
Radosław Waśko	d56b800c11	Remove the Apache dependency from `std-base` (#8571 ) - After [suggestion](https://github.com/enso-org/enso/pull/8497#discussion_r1429543815) from @JaroslavTulach I have tried reimplementing the URL encoding using just `URLEncode` builtin util. I will see if this does not complicate other followup improvements, but most likely all should work so we should be able to get rid of the unnecessary bloat.	2023-12-20 18:01:08 +00:00
Radosław Waśko	724f8d2a56	Add tests for Enso Cloud auth + simple API mock for `Enso_User` (#8511 ) - Closes #8354 - Extends `simple-httpbin` with a simple mock of the Cloud API (currently it checks the token and serves the `/users` endpoint). - Renames `simple-httpbin` to `http-test-helper`.	2023-12-19 17:41:09 +00:00
Radosław Waśko	940b8f7d51	Improving tests and edge cases for URI and HTTP (#8497 ) - Closes #8352 - ~~Proposed fix for #8493~~ - The temporary fix is deemed not viable. I will try to figure out a workaround and leave fixing #8493 to the engine team.	2023-12-15 17:58:45 +00:00
James Dunkerley	9e27b6487b	Minor fixes and tweak for Cloud APIs. (#8557 ) - Fix secret to at least be working again - Tweak to allow a MIMIC flow to work with value types (revisit in 2024).	2023-12-15 17:10:07 +00:00
Pavel Marek	c1098865f2	Update java formatter sbt plugin (#8543 ) Add a local clone of javaFormatter plugin. The upstream is not maintained anymore. And we need to update it to use the newest Google java formatter because the old one, that we use, cannot format sources with Java 8+ syntax. # Important Notes Update to Google java formatter 1.18.1 - https://github.com/google/google-java-format/releases/tag/v1.18.1	2023-12-15 14:45:23 +00:00
Pavel Marek	4b65e44ef3	EpbLanguage re-uses other TruffleContext support to run tests with assertions enabled (#7882 )	2023-12-15 13:31:32 +01:00
Radosław Waśko	b5c995a7bf	Reworking Excel support to allow for reading of big files (#8403 ) - Closes #8111 by making sure that all Excel workbooks are read using a backing file (which should be more memory efficient). - If the workbook is being opened from an input stream, that stream is materialized to a `Temporary_File`. - Adds tests fetching Table formats from HTTP. - Extends `simple-httpbin` with ability to serve files for our tests. - Ensures that the `Infer` option on `Excel` format also works with streams, if content-type metadata is available (e.g. from HTTP headers). - Implements a `Temporary_File` facility that can be used to create a temporary file that is deleted once all references to the `Temporary_File` instance are GCed.	2023-12-15 00:02:15 +00:00
Radosław Waśko	c6b6384fe6	Improve performance of anti-join (#8338 ) - Closes #8217	2023-11-24 02:44:57 +00:00
James Dunkerley	ecaca12df1	Integrating Enso Cloud with the libraries (part 1...) (#8006 ) - Add a `File_For_Read` type. Used for `File_Format` to read files. - Added `Enso_User` representing the current user in `Enso_Cloud`. - Will be later able to list known users. - Added `Enso_Secret` representing a value defined in `Enso_Cloud`. - Value not used within Enso only accessed within polyglot Java. - Integrated into `Username_And_Password` and can be used within JDBC connections. - Integrated into HTTP Headers so a secret can be used as a value. - New `URI_With_Query` with the same API as `URI`. Supporting secrets in the value. - Will be integrated with AWS credentials. - Added `Enso_File` representing a file or a folder in the cloud. - Support the same API as `File` (like the `S3_File`). - Will support `enso://` URI style access.	2023-11-20 23:21:14 +00:00
Pavel Marek	5a7ad6bfe4	Upgrade enso to GraalVM for jdk 21 (#7991 ) Upgrade to GraalVM JDK 21. ``` > java -version openjdk version "21" 2023-09-19 OpenJDK Runtime Environment GraalVM CE 21+35.1 (build 21+35-jvmci-23.1-b15) OpenJDK 64-Bit Server VM GraalVM CE 21+35.1 (build 21+35-jvmci-23.1-b15, mixed mode, sharing) ``` With SDKMan, download with `sdk install java 21-graalce`. # Important Notes - After this PR, one can theoretically run enso with any JRE with version at least 21. - Removed `sbt bootstrap` hack and all the other build time related hacks related to the handling of GraalVM distribution. - `project-manager` remains backward compatible - it can open older engines with runtimes. New engines no longer require a separate runtime to be downloaded. - sbt does not support compilation of `module-info.java` files in mixed projects - https://github.com/sbt/sbt/issues/3368 - Which means that we can have `module-info.java` files only for Java-only projects. - Anyway, we need just a single `module-info.class` in the resulting `runtime.jar` fat jar. - `runtime.jar` is assembled in `runtime-with-instruments` with a custom merge strategy (`sbt-assembly` plugin). Caching is disabled for custom merge strategies, which means that re-assembly of `runtime.jar` will be more frequent. - Engine distribution contains multiple JAR archives (modules) in `component` directory, along with `runner/runner.jar` that is hidden inside a nested directory. - The new entry point to the engine runner is [EngineRunnerBootLoader](https://github.com/enso-org/enso/pull/7991/files#diff-9ab172d0566c18456472aeb95c4345f47e2db3965e77e29c11694d3a9333a2aa) that contains a custom ClassLoader - to make sure that everything that does not have to be loaded from a module is loaded from `runner.jar`, which is not a module. - The new command line for launching the engine runner is in [distribution/bin/enso](https://github.com/enso-org/enso/pull/7991/files#diff-0b66983403b2c329febc7381cd23d45871d4d555ce98dd040d4d1e879c8f3725) - [Newest version of Frgaal](https://repo1.maven.org/maven2/org/frgaal/compiler/20.0.1/) (20.0.1) does not recognize `--source 21` option, only `--source 20`.	2023-11-17 18:02:36 +00:00
GregoryTravis	ea3d778456	Allow the creation of a constant column on an in-memory table with no rows. (#8218 )	2023-11-09 14:40:51 +00:00
Radosław Waśko	1b8b30a68d	Improve performance of `Join_Condition.Between` by sorting on one dimension (#8212 ) - Closes #5303 - Refactors `JoinStrategy` allowing us to 'stack' join strategies on top of each other (to some extent) - currently a `HashJoin` can be followed by another join strategy (currently `SortJoin`) - Adds benchmarks for join - Due to limitations of the sorting approach this will still not be as fast as possible for cases where there is more than 1 `Between` condition in a single query - trying to demonstrate that in benchmarks. - We can replace sorting by d-dimensional [RangeTrees](https://en.wikipedia.org/wiki/Range_tree) to get `O((n + m) log^d n + k)` performance (where `n` and `m` are sizes of joined tables, `d` is the number of `Between` conditions used in the query and `k` is the result set size). - Follow up ticket for consideration later: #8216 - Closes #8215 - After all, it turned out that `TreeSet` was problematic (because of not enough flexibility with duplicate key handling), so the simplest solution was to immediately implement this sub-task. - Closes #8204 - Unrelated, but I ran into this here: adds type checks to other arguments of `set`. - Before, passing a Column as `new_name` (i.e. mistakenly messing up the order of arguments) led to a hard-to-understand error, `Method `if_then_else` of type Column could not be found.`; now it fails with the type error `expected Text got Column`.	2023-11-08 12:59:55 +00:00
Radosław Waśko	237aae33c7	Simplify internal logic of `Table.order_by`, avoid unnecessary warning (#8221 ) - Fixes #8213	2023-11-06 11:00:01 +00:00
GregoryTravis	1480f50207	Overhaul the random number and item generation code (#8127 ) Rewrite most of Random.enso.	2023-10-31 15:25:37 +00:00
Radosław Waśko	79011bd550	Implement `Table.lookup_and_replace` in Database (#8146 ) - Closes #7981 - Adds a `RUNTIME_ERROR` operation into the DB dialect, that may be used to 'crash' a query if a condition is met - used to validate if `lookup_and_replace` invariants are still satisfied when the query is materialized. - Removes old `Table_Helpers.is_table` and `same_backend` checks, in favour of the new way of checking this that relies on `Table.from` conversions, and is much simpler to use and also more robust.	2023-10-31 15:19:55 +00:00
Radosław Waśko	0c278391fe	Test and improve handling of `Date_Time with_timezone=False` in Postgres (#8114 ) - Fixes #8049 - Adds tests for handling of Date_Time upload/download in Postgres. - Adds tests for edge cases of handling of Decimal and Binary types in Postgres.	2023-10-21 21:35:13 +00:00
Radosław Waśko	8172896065	Support `Previous_Value` in `fill_nothing` and `fill_missing` (#8105 ) - Adds `Previous_Value` to `fill_nothing` and `fill_empty`, as requested by #7192.	2023-10-20 13:18:53 +00:00
Radosław Waśko	93a31fcc8b	Add benchmarks related to `add_row_number` performance investigation (#8091 ) - Follow-up of #8055 - Adds a benchmark comparing performance of Enso Map and Java HashMap in two scenarios - _only incremental_ updates (like `Vector.distinct`) and _replacing_ updates (like keeping a counter for each key). These benchmarks can be used as a metric for #8090	2023-10-18 17:21:59 +00:00
Radosław Waśko	e9fa12763e	Improve performance of `add_row_number` (#8076 ) Fixes #8055	2023-10-17 00:42:35 +00:00
Radosław Waśko	08b717eb54	Refactor Table problem handling to a more robust and hopefully cleaner approach (#7879 ) Closes #7514	2023-10-16 15:09:08 +00:00
GregoryTravis	f18d1323e1	Add Table.expand_to_rows to allow flattening vector and array values in table (#8042 ) # Important Notes Also includes a fix for a reallocation bug in `InferredBuilder`.	2023-10-13 20:54:06 +00:00
Radosław Waśko	cd84ac16ce	Restructure `Table.from_objects` to use conversions (#8020 ) Closes #7957	2023-10-11 22:25:18 +00:00
somebody1234	826127d8ff	Eliminate line feeds from `XML.outer_xml` on Windows (#8013 ) - Closes #7999 # Important Notes None	2023-10-10 23:21:34 +00:00
Radosław Waśko	6e0bd86753	Implement `Table.lookup_and_replace` for in-memory (#7979 ) - Closes #7749 implementing the in-memory logic. - Additional complications have surfaced regarding the Database logic, so it has been split off into a separate ticket: #7981	2023-10-10 10:42:06 +00:00
GregoryTravis	9ba7be20af	Basic XML support (#7947 ) This PR includes * Reading XML from a file, stream, or string * Reading XML via Data.fetch * Accessing the root element, element children, and attributes * Accessing tag text contents * Get tags by name * Inner / Outer XML string	2023-10-06 17:52:19 +00:00
Radosław Waśko	0cd446432f	Fix inconsistency when building a Mixed column, fixes to Union (#7919 ) - Fixes #7352 by remembering original value types in type inference mode to be able to reconstruct them for Mixed. - Added more benchmarks for comparing performance of constructing columns. - Fixes missing implementations that caused `Table.union` crashing on some type pairs. - Ensures that `Loss_Of_Integer_Precision` warning is not swallowed when numeric columns are unioned to create a `Float` column. - Adds test for all of the above cases. - Allow to output benchmark results to a CSV by setting an environment variable - useful for quickly comparing benchmarks, e.g. in Enso.	2023-10-03 20:33:34 +02:00
Radosław Waśko	08cd449a99	Fix `NumberParser` to avoid `thousandSeparator==decimalPoint` and prefer US decimal format (#7946 ) Closes #7930	2023-10-03 20:07:54 +02:00
Radosław Waśko	8d926166ea	Follow up improvements to `Date_Time_Formatter` (#7875 ) - Closes #7872 - Also closes #7866	2023-09-28 09:38:00 +00:00
Radosław Waśko	c690559ec4	Implement `auto_value_type` operation (#7908 ) Closes #6113	2023-09-27 15:45:34 +00:00
Radosław Waśko	12c4f2981d	More robust Date/Time format patterns parsing (#7826 ) - Closes #7461 by introducing a `Date_Time_Formatter` type and making parsing date time formats more robust and safer. - The default ('simple') set of patterns is slightly simplified and made case insensitive (except for `M/m` and `H/h`) to avoid the `YYYY` vs `yyyy` issues and make it less error prone. - The `YYYY` now has the same meaning as `yyyy` in simple mode. The old meaning (week-based year) is moved to a _separate mode_, triggered by `Date_Time_Formatter.from_iso_week_date_pattern`. - Full Java syntax, as well as custom-built Java `DateTimeFormatter` can also be used by `Date_Time_Formatter.from_java`. - Text-based constants (e.g. `ISO_ZONED_DATE_TIME`) have now become methods on `Date_Time_Formatter`, e.g. `Date_Time_Formatter.iso_zoned_date_time`.	2023-09-22 10:12:18 +00:00
Jaroslav Tulach	ad34a701e4	Upgrading to Frgaal compiler 20.0.1 (#7860 )	2023-09-22 09:58:19 +02:00
James Dunkerley	74d1d0861c	S3 Read Access, Input Stream based reading (#7776 ) - Added a `FileSystemSPI` allowing protocol resolution to a target type. - Separated `Input_Stream` and `Output_Stream` from `File` to allow use in other spaces. - `File_Format` types `read_web` changed to be `read_stream` working with `InputStream`. - Added directory listing to `Auto_Detect` allowing for `Data.read` to list a folder. - Adjusted HTTP to return an `InputStream` not a `byte[]`: - `Response_Body` adjusted to wrap an `InputStream`. - Added ability to materialize to either an in-memory vector (<4KB) or a temporary file. - `Data.fetch` will materialize if not a recognized mime-type. - Added `HTTP_Error` to handle IO exceptions from the stream. - `Excel_Format` now supports mime-type and reading a stream. - `Excel_Workbook` can now get an `Excel_Section` using `read_section`. - Added S3 APIs: - `parse_uri`: splits an S3 URI into bucket and key. - `list_objects`: list the items in an S3 bucket with specified prefix. - `read_bucket`: list prefixes and keys with a delimiter in an S3 bucket with specified prefix. - `head`: either head_bucket (tests existence) or head_object API (reads object metadata). - `get_object`: gets an object from S3 returning as a `Response_Body`. - Added `S3_File` type acting like a `File`: - No support for writing in this PR. - ToDo: recursive listing, glob filtering, exists, size. - Fixed a few invalid type signature lines. - Moved `create` methods for `Postgres_Connection` and `SQLite_Connection` into type instead of module. - Renamed `Column_Fetcher.Builder` to `Column_Fetcher_Builder`. - Fixed bug with `select_into` in Dry Run mode creating permanent tables. ToDo: Unit tests.	2023-09-20 15:09:11 +00:00
Hubert Plociniczak	1ee3d8f4f0	Rename Decimal to Float (#7807 ) Implements #6889.	2023-09-14 15:01:30 +00:00
Radosław Waśko	8b6e70b155	Support for BigInteger values in Table (#7715 ) - Fixes #7354 - And also closes #7712 - Refactors how we handle numeric ops - ensuring that the 'kernels' are placed all in one place and selected based on storage types.	2023-09-12 13:18:04 +00:00
Radosław Waśko	255b424b72	Add `value_type` to `Column.from_vector` and `expected_value_type` to `Column.map` and `Column.zip` (#7637 ) - Closes #6111 - Aligns semantics of handling Mixed columns. - Now, if an operation like `iif` or `fill_nothing` is given a `Mixed` column, the result will also be `Mixed` regardless of the `inferred_precise_value_type`. - Enables a few old tests that were pending but could be enabled since the types work is advanced enough.	2023-08-31 13:20:49 +00:00
Radosław Waśko	2385f5b357	Add size-limited strings and varying bit-width integer Value_Types to in-memory backend and check for ArithmeticOverflow in LongStorage (#7557 ) - Closes #5159 - Now data downloaded from the database can keep the type much closer to the original type (like string length limits or smaller integer types). - Cast also exposes these types. - The integers are still all stored as 64-bit Java `long`s, we just check their bounds. Changing underlying storage for memory efficiency may come in the future: #6109 - Fixes #7565 - Fixes #7529 by checking for arithmetic overflow in in-memory integer arithmetic operations that could overflow. Adds a documentation note saying that the behaviour for Database backends is unspecified and depends on particular database.	2023-08-22 18:10:46 +00:00
GregoryTravis	c9d7c5cb2b	Convert in-memory Column.round to Java (#7521 )	2023-08-16 14:45:23 +00:00
Jaroslav Tulach	7a272ec152	Encapsulating array-like data and operations into a single package (#7544 )	2023-08-15 13:00:47 +02:00
Radosław Waśko	b656b336c7	Report `Loss_Of_Integer_Precision` when an integer is not exactly representable as a float during conversion (#7509 ) Closes #7353 I introduce a new type `WithAggregatedProblems`, because `WithProblems` was too simple - it only allowed to hold a `List<Problem>` but `AggregatedProblems` is more than that. Ideally we shouldn't multiply entities like this too much. We should probably unify all to use `WithAggregatedProblems` - but after starting this, I realised it will likely just take too much effort to do for this little PR. So instead, I created a follow-up task for this: #7514	2023-08-08 12:30:44 +00:00
Pavel Marek	8e49255d92	Invoke all Enso benchmarks via JMH (#7101 ) # Important Notes #### The Plot - there used to be two kinds of benchmarks: in Java and in Enso - those in Java got quite a good treatment - there even are results updated daily: https://enso-org.github.io/engine-benchmark-results/ - the benchmarks written in Enso used to be 2nd class citizen #### The Revelation This PR has the potential to fix it all! - It designs new [Bench API](`88fd6fb988`) ready for non-batch execution - It allows for _single benchmark in a dedicated JVM_ execution - It provides a simple way to wrap such an Enso benchmark as a Java benchmark - thus the results of Enso and Java benchmarks are [now unified](https://github.com/enso-org/enso/pull/7101#discussion_r1257504440) Long live _single benchmarking infrastructure for Java and Enso_!	2023-08-07 12:39:01 +00:00
GregoryTravis	758b3b31b9	Avoid indexing the table twice for Cross Tab (#7417 ) Rewrites MultiValueIndex.makeCrossTabTable to build only a single index.	2023-08-04 21:14:18 +00:00

337 Commits