enso-org/enso

mirror of https://github.com/enso-org/enso.git synced 2024-11-27 04:02:59 +03:00

Author	SHA1	Message	Date
Dmitry Bushev	6249c79ffd	Update sbt-java-formatter plugin (#7011 ) Update java formatter plugin. The new version can remove unused imports.	2023-06-12 14:18:48 +00:00
James Dunkerley	578ba59f1d	Use US Locale for Date and Time parsing and formatting (#6967 ) Sorts out parsing and printing long form names of months and weekdays.	2023-06-06 21:44:25 +00:00
Radosław Waśko	1931e9e51f	Workaround for `to_date_time` type errors (#6964 ) Related to #6912 It essentially solves it by removing any builtins that would take an EnsoDate/EnsoTimeOfDay/EnsoTimeZone and replacing them with Java utils that do the same operation. This is not a proper solution - the builtin conversion is still invalid for the date/time types - but at this moment we may just no longer use the invalid conversion so it is much less of an issue. We still need to be aware of this if we want to introduce builtins taking date/time in the future.	2023-06-06 20:28:11 +00:00
GregoryTravis	912fbce97b	Reimplement Column.truncate, .ceil, and .floor as vectorized Java ops (#6941 ) Reimplement these in Java. Benchmarks: Before: Column.truncate floats average: 124.4ms Column.ceil floats average: 121.47ms Column.floor floats average: 120.18ms Column.truncate ints average: 124.78ms Column.ceil ints average: 120.41ms Column.floor ints average: 102.35ms After (boxed): Column.truncate floats average: 3.75ms Column.ceil floats average: 2.25ms Column.floor floats average: 1.89ms Column.truncate ints average: 2ms Column.ceil ints average: 1.77ms Column.floor ints average: 1.74ms After (unboxed): Column.truncate floats average: 3.32ms Column.ceil floats average: 2.15ms Column.floor floats average: 1.69ms Column.truncate ints average: 1.74ms Column.ceil ints average: 1.61ms Column.floor ints average: 1.99ms	2023-06-06 18:07:12 +00:00
Radosław Waśko	d44b1250b7	Implement `Table.add_row_number` (#6890 ) Closes #5227 # Important Notes - This lays first steps towards #6292 - we get pure Enso variants of MultiValueKey. - Another part refactors `LongStorage` into `AbstractLongStorage` allowing it to provide alternative implementations of the underlying storage, in our case `LongRangeStorage` generating the values ad-hoc and `LongConstantStorage` - currently unused but in the future it can be adapted to support constant columns (once we implement similar facilities for other types).	2023-06-02 10:13:13 +00:00
GregoryTravis	0337180384	Add rounding functions to the Column type (#6817 )	2023-06-01 20:06:23 +00:00
Radosław Waśko	c3e771c75c	Allow casting a Mixed column into a concrete type (#6777 ) Follow-up of #6711 Closes #6838	2023-05-26 13:25:53 +00:00
Radosław Waśko	447786a304	Implement `cast` for Table and Column (#6711 ) Closes #6112	2023-05-19 10:00:20 +00:00
Radosław Waśko	cd7fb73232	Add `Date_Range` (#6621 ) Closes #6543	2023-05-11 16:03:02 +00:00
GregoryTravis	4ba8409def	Add format to the in-memory Column (#6538 ) # Important Notes Also updates .format in date types. Some rearrangement of date formatting builtins / Java libraries.	2023-05-09 08:47:40 +00:00
James Dunkerley	bc0db18a6e	Small changes from Book Club issues (#6533 ) - Add dropdown to tokenize and split `column`. - Remove the custom `Join_Kind` dropdown. - Adjust split and tokenize names to start numbering from 1, not 0. - Add JS_Object serialization for Period. - Add `days_until` and `until` to `Date`. - Add `Date_Period.Day` and create `next` and `previous` on `Date`. - Use simple names with `File_Format` dropdown. - Avoid using `Main.enso` based imports in `Standard.Base.Data.Map` and `Standard.Base.Data.Text.Helpers`. - Remove an incorrect import from `Standard.Database.Data.Table`. From #6587: A few small changes, lots of lines because this affected lots of tests: - `Table.join` now defaults to `Join_Kind.Left_Outer`, to avoid losing rows in the left table unexpectedly. If the user really wants to have an Inner join, they can switch to it. - `Table.join` now defaults to joining columns by name not by index - it looks in the right table for a column with the same name as the first column in left table. - Missing Input Column errors now specify which table they refer to in the join. - The unique name suffix in column renaming / default column names when loading from file is now a space instead of underscore.	2023-05-06 10:10:24 +00:00
Radosław Waśko	41a8257e8d	Separating Redshift connector from `Database` library into a new `AWS` library (#6550 ) Related to #5777	2023-05-04 17:36:51 +00:00
James Dunkerley	6b0c682b08	Add Execution Context control to Text.write (#6459 ) - Adjusted `Context.is_enabled` to support default argument (moved built in so can have defaults). - Made `environment` case-insensitive. - Bug fix for play button. - Short hand to execute within an enabled context. - Forbid file writing if the Output context is disabled with a `Forbidden_Operation` error. - Add temporary file support via `File.create_temporary_file` which is deleted on exit of JVM. - Execution Context first pass in `Text.write`. - Added dry run warning. - Writes to a temporary file if disabled. - Created a `DryRunFileManager` which will create and manage the temporary files. - Added `format` dropdown to `File.read` and `Data.read`. - Renamed `JSON_File` to `JSON_Format` to be consistent. (still to unit test).	2023-04-29 08:39:18 +00:00
James Dunkerley	0c7c3bdeaf	Fix for the massive number of warnings when renaming with invalid names. (#6450 ) * Rename makeUnique overloads to avoid issue when Nothing is passed. Suspend warnings when building the output table to avoid mass warning duplication. * Add test for mixed invalid names. Adjust so a single warning attached. * PR comments.	2023-04-27 14:51:59 +01:00
Radosław Waśko	a43d524336	Add typechecks to Aggregate and Cross Tab (#6380 ) Follow up of #6298 as it grew too much. Adds the needed typechecks to aggregate operations. Ensures that the DB operations report `Floating_Point_Equality` warning consistently with in-memory.	2023-04-24 08:55:54 +00:00
Radosław Waśko	8db2ad51a1	Adding typechecks to Column Operations (#6298 ) Closes #6106	2023-04-21 12:20:12 +00:00
James Dunkerley	0350762386	Add `replace`, `trim` to Column. Better number parsing. (#6253 ) - Add `replace` with same syntax as on `Text` to an in-memory `Column`. - Add `trim` with same syntax as on `Text` to an in-memory `Column`. - Add `trim` to in-database `Column`. - Added `is_supported` to dialects and exposed the dialect consistently on the `Connection`. - Add `write_table` support to `JSON_File` allowing `Table.write` to write JSON. - Updated the parsing for integers and decimals: - Support for currency symbols. - Support for brackets for negative numbers. - Automatic detection of decimal points and thousand separators. - Tighter rules for scientific and thousand separated numbers. - Remove `replace_text` from `Table`. - Remove `write_json` from `Table`.	2023-04-20 16:04:59 +00:00
Radosław Waśko	f5db35af07	Adjust `{Table\|Column}.parse` to use `Value_Type` (#6213 ) Closes #5660	2023-04-06 10:58:55 +00:00
Jaroslav Tulach	4805193428	Text.to_display_text is (shortened) identity (#6174 ) Fixes #5971.	2023-04-05 19:53:07 +00:00
GregoryTravis	d9bc5246ba	Remove old (Java) Regex library and replace with new (Truffle) library. (#6195 )	2023-04-04 19:58:26 +00:00
GregoryTravis	fb77f42fd5	Update `Text.split` to take a `Vector Text` parameter (#6156 ) Allows you to pass a vector of delimiters to `split`.	2023-04-04 14:44:47 +00:00
James Dunkerley	f26bcf6ab6	Small issues from working with Ned (#6160 ) - `Process.run` now returns a `Process_Result` allowing the easy capture of stdout and stderr. - Joining a column with a column name does not warn if adding just the prefix. - Stop the table viz from changing case and adding spaces to the headers.	2023-04-03 13:01:42 +00:00
Radosław Waśko	6ddcb553e5	Date/time support for Postgres. Year/month/day operations on Columns. (#6153 ) Closes #6115	2023-03-31 18:37:04 +00:00
Radosław Waśko	6f86115498	Proper implementation of Value Types in Table (#6073 ) This is the first part of the #5158 umbrella task. It closes #5158, follow-up tasks are listed as a comment in the issue. - Updates all prototype methods dealing with `Value_Type` with a proper implementation. - Adds a more precise mapping from in-memory storage to `Value_Type`. - Adds a dialect-dependent mapping between `SQL_Type` and `Value_Type`. - Removes obsolete methods and constants on `SQL_Type` that were not portable. - Ensures that in the Database backend, operation results are computed based on what the Database is meaning to return (by asking the Database about expected types of each operation). - But also ensures that the result types are sane. - While SQLite does not officially support a BOOLEAN affinity, we add a set of type overrides to our operations to ensure that Boolean operations will return Boolean values and will not be changed to integers as SQLite would suggest. - Some methods in SQLite fallback to a NUMERIC affinity unnecessarily, so stuff like `max(text, text)` will keep the `text` type instead of falling back to numeric as SQLite would suggest. - Adds ability to use custom fetch / builder logic for various types, so that we can support vendor specific types (for example, Postgres dates). # Important Notes - There are some TODOs left in the code. I'm still aligning follow-up tasks - once done I will try to add references to relevant tasks in them.	2023-03-31 16:16:18 +00:00
GregoryTravis	6b9cbeacb2	Implement Regular Expression replace and update `Text.replace` to the new API (#5959 ) Re-implement replace on top of Truffle regex.	2023-03-28 06:13:12 +00:00
James Dunkerley	bf2545fa04	Use new common parse method throwing fewer exceptions. (#6075 ) Avoiding exceptions by not using parseBest. Time now in CLI is 1.15s for 500k rows vs 1.65s in GUI. CLI: ![image](https://user-images.githubusercontent.com/4699705/227711266-bc005b0d-5011-450f-964b-65dd2e437c2e.png) GUI: ![image](https://user-images.githubusercontent.com/4699705/227711259-f7ddda29-86c7-4eef-a002-4bf0bda6063f.png) Added it as a function in the shared library so it is used by both engine and polyglot.	2023-03-27 11:02:10 +00:00
James Dunkerley	58f2c7643f	Use new Enso Hash Codes and Comparable (#6060 ) Enables `distinct`, `aggregate` and `cross_tab` to use the Enso hashing and equality operations. Also, I rewired the way the ObjectComparators are obtained in polyglot code to be more consistent. Add Comparator for `Day_Of_Week`, `Header`, `SQL_Type`, `Image` and `Matrix`. Also, removed the custom `==` from these types as needed. (Closes #5626)	2023-03-24 15:02:25 +00:00
Radosław Waśko	952beba8d1	Fix `cross_tab` column naming edge cases, add `fill_empty` (#5863 ) Closes #5151 and adds some additional tests for `cross_tab` that verify duplicated and invalid names. I decided that for empty or `Nothing` names, instead of replacing them with `Column` and implicitly losing connection with the value that was in the column, we should just error on such values. To make handling of these easier, `fill_empty` was added allowing to easily replace the empty values with something else. Also, `{is,fill}_missing` was renamed to `{is,fill}_nothing` to align with `Filter_Condition.Is_Nothing`.	2023-03-11 11:58:54 +00:00
Radosław Waśko	263c3ad651	Add a `common-polyglot-core-utils` project (#5855 ) Adds a common project that allows sharing code between the `runtime` and `std-bits`. Due to classpath separation and the way it is compiled, the classes will be duplicated - we will have one copy for the `runtime` classpath and another copy as a small JAR for `Standard.Base` library. This is still much better than having the code duplicated - now at least we have a single source of truth for the shared implementations. Due to the copying we should not expand this project too much, but I encourage to put here any methods that would otherwise require us to copy the code itself. This may be a good place to put parts of the hashing logic to then allow sharing the logic between the `runtime` and the `MultiValueKey` in the `Table` library (cc: @Akirathan).	2023-03-11 09:27:26 +00:00
Radosław Waśko	91ef8acf35	Review generated Column names (#5850 ) Closes #5583 and closes #5157	2023-03-10 19:07:58 +00:00
Radosław Waśko	62e57f5557	Test some Mismatched Quote edge cases in Delimited reader (#5810 ) Follow-up to #5113 - I add some more edge case tests as we discussed with @jdunkerley When debugging some quoting issues, I also realised the current `Mismatched_Quote` error provided not enough information. So I amended it to at least include some context indicating which was the 'offending' cell.	2023-03-10 15:47:57 +00:00
James Dunkerley	299bfd6b7d	Fixes from the Demo on 2nd March (#5823 ) - Fix issue with Geo Map viz. - Handle invalid format strings better in `Data_Formatter`. - New constants for the ISO format strings (and a special ENSO_ZONED_DATE_TIME) - Consistent Date Time format for parsing in all places. - Avoid throwing exception in datetime parsing. - Support for milliseconds (well nanoseconds) in Date_Time and Time_Of_Day. - `Column.map` stays within Enso. - Allow `Aggregate_Column.Group_By` in `cross_tab` group_by parameter.	2023-03-07 20:58:00 +00:00
Pavel Marek	b6e2319fcc	Comparators support partial ordering (#5778 )	2023-03-07 04:16:38 +00:00
Radosław Waśko	2d29456ed1	Review File/Data read and read_text warnings (#5799 ) Closes #5113 Fixes a bug where read-only files would be overwritten if File.write was used in backup mode, and added tests to avoid such regression. To implement it, introduced a `is_writable` property on `File`.	2023-03-06 03:43:38 +00:00
James Dunkerley	01fc34c18a	Improving Expression Support for In Database (#5790 ) - Adjust Excel Workbook write behaviour. - Support Nothing / Null constants. - Deduce the type of arithmetic operations and `iif`. - Allow Date_Time constants, treating as local timezone. - Removed the `to_column_name` and `ensure_sane_name` code.	2023-03-03 12:03:05 +00:00
Radosław Waśko	793eafc866	Improve Table.parse_values API (#5692 ) Closes #5111	2023-02-24 13:35:01 +00:00
James Dunkerley	652b8d5db3	Update `rename_columns` to new API design, add `first_row`, `second_row` and `last_row` functions to the table. (#5719 ) - Updates the `rename_columns` API. - Add `first_row`, `second_row` and `last_row` to the Table types. - New option for reading only last row of ResultSet.	2023-02-23 19:42:45 +00:00
Radosław Waśko	4dcf802831	Ensure that warnings are preserved on Nothing values passing back to Enso through polyglot boundary (#5677 ) Fixes #5672 # Important Notes - Added a subproject `enso-test-java-helpers` which allows the in-Enso tests to add Java helpers for testing.	2023-02-17 13:38:26 +00:00
Radosław Waśko	3027c6f3a2	Ensure entries containing newlines are quoted when writing Delimited Files (#5652 ) Fixes #5638	2023-02-17 00:57:48 +00:00
James Dunkerley	1bc27501e6	Remove `Column` type from Aggregate_Column, simplify Column_Selector, some new `File_Format`s (#5646 ) - Updated `Widget.Vector_Editor` ready for use by IDE team. - Added `get` to `Row` to make API more aligned. - Added `first_column`, `second_column` and `last_column` to `Table` APIs. - Adjusted `Column_Selector` and associated methods to have simpler API. - Removed `Column` from `Aggregate_Column` constructors. - Added new `Excel_Workbook` type and added to `Excel_Section`. - Added new `SQLiteFormatSPI` and `SQLite_Format`. - Added new `ImageFormatSPI` and `Image_Format`.	2023-02-16 15:15:49 +00:00
Radosław Waśko	a02eab451e	Implement basic warnings for column arithmetic, review warnings on expressions and `filter` (#5605 ) Closes #5109 # Important Notes - Currently the tests pass for the in-memory parts of Common_Table_Operations, but still some stuff not working on DB backends - in progress.	2023-02-14 09:33:04 +00:00
James Dunkerley	1c821e22cf	Some fixes from the Anagrams experiment. (#5592 ) - Fixes the display of Date, Time_Of_Day and Date_Time so it doesn't wrap. - Adjust serialization of large integer values for JS and display within table. - Workaround for issue with using `.lines` in the Table (new bug filed). - Disabled warning on no specified `separator` on `Concatenate`. Does not include fix for aggregation on integer values outside of `long` range.	2023-02-08 22:17:00 +00:00
Radosław Waśko	4f90946d1e	Rework Invalid Aggregations (#5579 ) Closes #5108	2023-02-08 18:39:09 +00:00
Radosław Waśko	778d28fba3	Table with no columns is not valid, No_Output_Columns is always an error (#4073 ) Implements https://www.pivotaltracker.com/story/show/184226020	2023-01-25 02:40:23 +00:00
Radosław Waśko	d2e57edc8b	Add Table.cross_join and Table.zip to In-Memory Table (#4063 ) Implements https://www.pivotaltracker.com/story/show/184239059	2023-01-23 13:19:52 +00:00
Radosław Waśko	8853053020	Division in Columns within InDB is integer based if both columns are integers (#4057 ) Fixes https://www.pivotaltracker.com/story/show/184073099 # Important Notes - Since now the only operator on columns for division, `/`, returns floats, it may be worth creating an additional `div` operator exposing integer division. But that will be done as a separate task aligning column operator APIs.	2023-01-17 20:29:25 +00:00
Radosław Waśko	082e0bfd0d	Add `Table.union` to the In-Memory Table. (#4052 ) Implements https://www.pivotaltracker.com/story/show/183854144	2023-01-17 00:34:57 +00:00
Radosław Waśko	0088096a58	Implement Distinct for the Database backends (#4027 ) Implements https://www.pivotaltracker.com/story/show/182307281	2023-01-11 22:46:54 +00:00
Radosław Waśko	8c661fdb74	Database Joins (#4007 ) Implements https://www.pivotaltracker.com/story/show/184032869 # Important Notes - Currently we get failures in Full joins on Postgres which show a more serious problem - amending equality to ensure that `[NULL = NULL] == True` breaks hash/merge based indexing - so such joins will be extremely inefficient. All our joins currently rely on this notion of equality which will mean all of our DB joins will be extremely inefficient. - We need to find a solution that will support nulls and still work OK with indices (but after exploring a few approaches: `COALESCE(a = b, a IS NULL AND b is NULL)`, `a IS NOT DISTINCT FROM b`, `(a = b) OR (a IS NULL AND b is NULL)`; all of which did not work (they all result in `ERROR: FULL JOIN is only supported with merge-joinable or hash-joinable join conditions`) I'm less certain that it is possible. Alternatively, we may need to change the NULL semantics to align it with SQL - this seems like likely the simpler solution, allowing us to generate simple, reliable SQL - the NULL=NULL solution will be cornering us into nasty workarounds very dependent on the particular backend.	2023-01-05 10:36:22 +00:00
Dmitry Bushev	1e5e2327ab	Improve performance of Text.compare_to (#4012 ) PR adds a flag to `Text` implementation tracking whether it is in a FCD normal form. Then this information can be used in the `Normalizer.compare` method. \| Benchmark name \| Old (ms) \| With flag (ms) \| --- \| --- \| --- \| Unicode very short \| 40.29 \| 40.04 \| Unicode medium \| 9.07 \| 1.99 \| Unicode big - random \| 115.39 \| 0.35 \| Unicode big - early difference \| 107.02 \| 0.54 \| Unicode big - late difference \| 749.81 \| 94.73 \| ASCII very short \| 28.13 \| 31.13 \| ASCII medium \| 4.58 \| 2.26 \| ASCII big - random \| 42.68 \| 0.26 \| ASCII big - early difference \| 30.91 \| 0.32 \| ASCII big - late difference \| 66.29 \| 42.72 Full benchmark output. [bench_old.txt](https://github.com/enso-org/enso/files/10325202/bench_old.txt) [bench_new.txt](https://github.com/enso-org/enso/files/10325201/bench_new.txt)	2023-01-02 17:09:03 +00:00


170 Commits