enso-org/enso - enso - gitea: Gitea Service

mirror of https://github.com/enso-org/enso.git synced 2024-12-28 11:25:30 +03:00

Author	SHA1	Message	Date
GregoryTravis	061876e640	Add simple parts of Table.take and Table.drop functions to Database table (#7615 ) Implements database Table and Column take/drop, except While and Sample. Additional features and optimizations are in https://github.com/enso-org/enso/issues/7614.	2023-08-31 18:52:02 +00:00
Radosław Waśko	255b424b72	Add `value_type` to `Column.from_vector` and `expected_value_type` to `Column.map` and `Column.zip` (#7637 ) - Closes #6111 - Aligns semantics of handling Mixed columns. - Now, if an operation like `iif` or `fill_nothing` is given a `Mixed` column, the result will also be `Mixed` regardless of the `inferred_precise_value_type`. - Enables a few old tests that were pending but could be enabled since the types work is advanced enough.	2023-08-31 13:20:49 +00:00
Jaroslav Tulach	6461e20870	Special support for Python Date/Time/Zone interop (#7617 )	2023-08-25 10:27:16 +02:00
GregoryTravis	ddf18f212b	Handle writing to a relative file (#7638 ) Fixes bug in writing to a non-absolute file (with backup).	2023-08-24 21:01:37 +00:00
Jaroslav Tulach	20e18d22df	More descriptive function information (#7629 ) Fixes #7359 by printing more information about the function including partially applied arguments and over-saturated arguments.	2023-08-24 18:04:08 +00:00
James Dunkerley	7d83b3d7b4	Add GROUP to functions (#7622 ) - Update list of groups to agreed list. - Lower case `ALIAS` names to be consistent with function names. - Add `GROUP` to methods. - All constructors and functions have doc comments. - Correct a few typos (e.g. `PRVIATE`). - Mark some more things as `PRIVATE`. - Use `ToDo:` and `Note:` consistently. - Order tags in doc comment. # Important Notes We don't have all the doc comments on types and will want to add them in future,	2023-08-23 13:20:38 +00:00
Radosław Waśko	2385f5b357	Add size-limited strings and varying bit-width integer Value_Types to in-memory backend and check for ArithmeticOverflow in LongStorage (#7557 ) - Closes #5159 - Now data downloaded from the database can keep the type much closer to the original type (like string length limits or smaller integer types). - Cast also exposes these types. - The integers are still all stored as 64-bit Java `long`s; we just check their bounds. Changing the underlying storage for memory efficiency may come in the future: #6109 - Fixes #7565 - Fixes #7529 by checking for arithmetic overflow in in-memory integer arithmetic operations that could overflow. Adds a documentation note saying that the behaviour for Database backends is unspecified and depends on the particular database.	2023-08-22 18:10:46 +00:00
Pavel Marek	a0086bb112	Ability to invoke all std benchmarks via jmh (#7519 ) All the Enso benchmarks in `test/Benchmarks` can be invoked via JMH	2023-08-17 14:48:43 +02:00
GregoryTravis	c9d7c5cb2b	Convert in-memory Column.round to Java (#7521 )	2023-08-16 14:45:23 +00:00
Jaroslav Tulach	aa0413e5a2	Use only Type instances as keys for State (#7585 )	2023-08-16 15:54:17 +02:00
James Dunkerley	296c95d414	Fix for empty column on replace and out of memory catching for join and tab (#7593 ) - Added a Panic.catch to catch heap memory error in joins and cross_tab. - Adjusted column replace so type is correct.	2023-08-15 17:06:51 +00:00
Jaroslav Tulach	7a272ec152	Encapsulating array-like data and operations into a single package (#7544 )	2023-08-15 13:00:47 +02:00
Radosław Waśko	8541a9e1ac	Improve generation of long operation in presence of column name length limit (#7556 ) I planned to do this as part of #7428, but I forgot. Making up for that now.	2023-08-14 16:58:36 +00:00
GregoryTravis	d3436fae70	Implement Number.round as a builtin (#7460 )	2023-08-14 15:43:39 +00:00
Radosław Waśko	b656b336c7	Report `Loss_Of_Integer_Precision` when an integer is not exactly representable as a float during conversion (#7509 ) Closes #7353. I introduce a new type `WithAggregatedProblems`, because `WithProblems` was too simple: it could only hold a `List<Problem>`, but `AggregatedProblems` is more than that. Ideally we shouldn't multiply entities like this too much. We should probably unify everything to use `WithAggregatedProblems`, but after starting this, I realised it would likely take too much effort for this little PR. So instead, I created a follow-up task for this: #7514	2023-08-08 12:30:44 +00:00
GregoryTravis	758b3b31b9	Avoid indexing the table twice for Cross Tab (#7417 ) Rewrites MultiValueIndex.makeCrossTabTable to build only a single index.	2023-08-04 21:14:18 +00:00
Radosław Waśko	bc9cde6543	Fix column naming edge cases - invalid and duplicated columns, case-insensitive name aliasing for case-insensitive backends (#7495 ) - Fixes #7412 - Also adds tests and fixes some more edge cases: - Ensures correct handling of existing Database tables whose column names may be invalid from Enso perspective, or clashing from Enso perspective (e.g. for most DBs `ś` and `s\u0301` are different names, but for Enso they are basically the same, so this would cause issues). Enso now renames such columns when accessed, still using the correct column reference in the generated SQL under the hood.	2023-08-04 09:04:38 +00:00
GregoryTravis	037a687401	Expose Unicode normalization methods on Texts (#7425 ) Exposes Text_Utils.normalize().	2023-08-03 18:07:00 +00:00
Radosław Waśko	c61c741476	Respect database backend naming limitations when generating table/column names and validate user-provided names to avoid silent name clashes; process JDBC warnings reported from backends (#7428 ) - Closes #5951 - Ensures any SQL warnings reported by the database through the JDBC driver are processed and forwarded to the user. - These warnings show issues like the implicit name truncation that this PR is also solving. It's good to make sure they are visible as they can help avoid and understand unexpected problems. They should not show up in most standard workflows. - Adds simple history to our REPL.	2023-08-03 09:44:27 +00:00
GregoryTravis	628a51d8e2	Convert Number.round to Java (#7360 )	2023-07-26 12:03:09 +00:00
James Dunkerley	7345f0fd9a	Speed up statistics (#7390 ) - Allow `parse_to_columns` to take a `Regex` object. - Add `pattern` to the `Regex` object. - Add `column_names` to the `Row` object. - Improve statistics performance. - Add benchmarks for stats. \| Benchmark \| Reference \| New \| Improvement \| \| --- \| --- \| --- \| --- \| \| Max (by reduce) \| 16.4ms \| 16.3ms \| - \| \| Max (stats) \| 703ms \| 224ms \| 68% \| \| Sum (by reduce) \| 38ms \| 38ms \| - \| \| Sum (stats) \| 753ms \| 420ms \| 44% \| \| Variance (stats) \| 745ms \| 553ms \| 26% \| Also tried using a Ref approach for stats but it was slower (`7e13c45224`).	2023-07-26 10:01:18 +00:00
Radosław Waśko	4b5a2e2176	Fixing operations on Mixed types (#7368 ) - Fixes #7231 - Cleans up vectorized operations to distinguish unary and binary operations. - Introduces MixedStorage which may pretend to be a more specialized storage on demand. - Ensures that operations request a more specialized storage on right-hand side to ensure compatibility with reported inferred storage type. - Ensures that a dataflow error returned by an Enso callback in Java is propagated as a polyglot exception and can be caught back in Enso - Tests for comparison of Mixed storages with each other and other types - Started using `Set` for `Filter_Condition.Is_In` for better performance. - ~~Migrated `Column.map` and `Column.zip` to use the Java-to-Enso callbacks.~~ - This does not forward warnings. IMO we should not be losing them. We can switch and add a ticket to fix the warnings, but that would be a regression (current implementation handles them correctly). Instead, we should first gain some ability to work with warnings in polyglot. I created a ticket to get this figured out #7371 - ~~Trying to avoid conversions when calling Enso functions from Java.~~ - Needs extra care as dataflow errors may not be handled right then. So only works for simple functions that should not error. - Not sure how much it really helps. [Benchmarks](https://github.com/enso-org/enso/pull/7270#issuecomment-1635618393) suggested it could improve the performance quite significantly, but the practical solution is not exactly the same as the one measured, so we may have to measure and tune it to get the best results. - Created #7378 to track this.	2023-07-25 23:25:17 +00:00
GregoryTravis	1f6fcf189b	Implement replace on the Database Column (#7275 ) Implements `replace` for database text columns, for text, regex, and column patterns.	2023-07-25 18:09:50 +00:00
James Dunkerley	2dc565b366	Fix failing test (#7394 ) Fix a failing test.	2023-07-25 14:06:11 +00:00
Adam Obuchowicz	1d2371f986	Groups in DocTags (#7337 ) Fixes #7336 in a quick way. Next to the old way of defining groups, the library can just add a `GROUP` tag to some entities, and each will be added to the group specified in the tag's description. The group name may be qualified (with project name, like `Standard.Base.Input/Output`) or just a name - in the latter case, the IDE will assume a group defined in the same library as the entity. Also moved some entities from the "export" list in package.yaml to the GROUP tag to give an example. I didn't move all of those, as I assume the library team will reorganize those groups anyway. ### Important Notes @jdunkerley @radeusgd @GregoryTravis When you start specifying groups in tags, remember that: * Each group still belongs to a concrete project; if some entity outside a project wants to be added to its group, the "qualified" name should be specified. See the `Table.new` example in this PR. * If the group name does not match any group in package.yaml, the tag is ignored. * A single entity may be in only a single group. If it's specified both in package.yaml and in a tag, the tag takes precedence. --------- Co-authored-by: Ilya Bogdanov <fumlead@gmail.com>	2023-07-24 15:54:16 +02:00
James Dunkerley	88f32d9b2a	Various small tickets... (#7367 ) - Added `Text.length` into Text class so CB lists the built in. - Added `File.starts_with` and tests for the built in method. - Add `to_js_object` and `to_display_text` to `Regex`. ![image](https://github.com/enso-org/enso/assets/4699705/3b197c94-9c49-4bc5-a2cc-ce53b917942e) - Add `to_js_object` and `to_display_text` to `Match`. ![image](https://github.com/enso-org/enso/assets/4699705/962ec4f2-324d-4f10-8ec0-932b093c6729) - Remove the `bit_shift_l` alias from the built-ins. - Add test and Enso wrapper for `Text.is_normalized`.	2023-07-23 09:04:11 +00:00
Radosław Waśko	56635c9a88	Add benchmarks comparing performance of Table operations 'vectorized' in Java vs performed in Enso (#7270 ) The added benchmark is a basis for a performance investigation. We compare the performance of the same operation run in Java vs Enso to see what is the overhead and try to get the Enso operations closer to the pure-Java performance.	2023-07-21 17:25:02 +00:00
Jaroslav Tulach	a5ec6a9e51	Bench builder API (#7324 ) Designing new `Bench` API to _collect benchmarks_ first and only then execute them. This is a minimal change to allow implementation of #7323 - e.g. the ability to invoke a _single benchmark_ via the JMH harness. # Important Notes This is just the basic API skeleton. It can be enhanced, if the basic properties (allowing integration with JMH) are kept. It is not the intent of this PR to make the API 100% perfect and usable. Nor is it the goal of this PR to update existing benchmarks to use it (`74ac8d7` changes only one of them to demonstrate _it all works_ somehow). It is however expected that once this PR is integrated, the newly written benchmarks (like the ones from #7270) are going to use (or even enhance) the new API.	2023-07-19 09:18:28 +00:00
GregoryTravis	2fb5c3710b	Add Fallback to Prim_Text_Helper.compile_regex; accept Regex in Text.parse_to_table (#7297 ) This PR does three related things: - Fails more gracefully when a non-string is passed to compile_regex - Stops passing a non-string to compile_regex - Allows a Regex param to parse_to_table	2023-07-18 19:55:56 +00:00
James Dunkerley	fd0bdc86dd	Fix issue with rename_columns and revert order of parameter change on select_columns. (#7321 ) The Regex change introduced some issues. Added a test for missed case in `rename_columns` where using vector of pairs. Reverted parameter order change for `select_columns`.	2023-07-18 13:30:23 +00:00
Pavel Marek	7264d81f2a	Builtin methods can support array-like arguments (#7235 ) This PR modifies the builtin method processor such that it forbids arrays of non-primitive and non-guest objects in builtin methods. And provides a proper implementation for the builtin methods in `EnsoFile`. - Remove last `to_array` calls from `File.enso`	2023-07-17 09:17:39 +02:00
James Dunkerley	aaa235fbad	Add drop down for replace, remove Column_Selector (#7295 ) - Add dropdowns for `replace` functions. - Retire `Column_Selector` type. - Add `select_blank_columns` and `remove_blank_columns` functions to table types. - Allow Regex to be used to pick columns.	2023-07-14 17:30:52 +00:00
Radosław Waśko	866283c0a8	Improve error message on `Filter_Condition` missing arguments in `Table.filter` (#7290 ) In #7148 I improved the error message when a `Filter_Condition` constructor without arguments is provided to `Vector.filter` and its friends. This PR applies the same check to the `Table.filter`. This is useful, because when we select a Filter_Condition from a widget, initially it does not have all its arguments applied. This used to lead to confusing errors being reported to the user, now, a much clearer error is shown: ![image](https://github.com/enso-org/enso/assets/1436948/19140a7b-d6fc-4292-81d3-dc6d61135cb9)	2023-07-14 08:00:13 +00:00
Radosław Waśko	620cc361ce	Add `date_diff`, `date_add` and `date_part` to scalar Enso date-time values. (#7273 ) Followup of #7221, adding `date_diff`, `date_add` and `date_part` to scalar Enso date-time values.	2023-07-13 15:17:21 +00:00
Radosław Waśko	ca68dd94da	Adding new Date/Time operations (`-`, `date_add`, `date_diff`, `date_part`) (#7221 ) - Adds `Column.date_diff` for computing date/time difference as an integer multiple of some unit. - Adds `Column.date_add` for shifting date/time by a unit. - Adds `Column.date_part` for extracting various parts of the date/time value as integer. - Adds widgets for the 3 methods above whose content depends on the column value type. - Adds shorthands: `Column.hour`, `Column.minute` and `Column.second` to extract these date parts. - Extends `Time_Period` with support for milli-, micro- and nano- seconds; and adapts functions taking `Time_Period` to support these wherever possible.	2023-07-13 12:56:54 +00:00
James Dunkerley	0adab6c68c	Round on a column was always adding a warning (#7246 ) - Only warn if outside allowed range. - Added `is_infinite` to In-Memory column. - Allow integer value type for `is_nan` and `is_infinite`.	2023-07-10 17:35:23 +00:00
GregoryTravis	345d6b9cb1	Add cross_join support to Database Table (#7234 )	2023-07-10 16:29:37 +00:00
James Dunkerley	1fb60df61b	Fixes from the live demo. (#7243 ) - Removed defaults from `cross_tab`. It caused an out-of-heap space error when it attempted to build a 205k x 205k table. Now has a hard limit of 10,000 columns - we can increase this once we have more concrete test data. ![image](https://github.com/enso-org/enso/assets/4699705/bc38d41c-56dc-41bd-8a7c-fa89ecfa7f79) - Adjusted the dropdowns on `Aggregate_Column` for `columns` and `order_by` to be dropdowns as nested Vector editors are not supported. ![image](https://github.com/enso-org/enso/assets/4699705/f4a7c7cc-6a21-462c-a39e-65fbab82c367) - Altered `Aggregate_Column` so `new_name` is now `new_name:Text=""` and no longer takes `Nothing`. Makes it appear correctly in IDE. ![image](https://github.com/enso-org/enso/assets/4699705/196a49ba-4274-44bb-b876-0372c8f62746) - Added dropdowns for `fill_empty`, `fill_nothing` and `replace` on `Table`. ![image](https://github.com/enso-org/enso/assets/4699705/9ee5cec2-82d5-4452-b650-67015ac9fee5) - Added `replace` to Database table throwing `Unsupported_Database_Operation`.	2023-07-09 18:03:05 +00:00
GregoryTravis	bd26e95fd6	Add Table.replace; Change Text.replace to take a Text\|Pattern, and remove the use_regex param. (#7223 )	2023-07-06 16:13:11 +00:00
James Dunkerley	7749286c69	Tidy up the imports using script (#7220 ) Ordering the imports to test a script.	2023-07-06 14:22:50 +00:00
GregoryTravis	6eb46afb40	Do not rename column on fill_nothing and add version to the Table allowing filling multiple (include fill_empty as well). (#7166 ) Updated Column.fill_nothing and .fill_empty, and added the same to Table. (Both in-memory and db.)	2023-07-05 17:20:23 +00:00
Radosław Waśko	78545b4402	Add safepoints to standard libraries Java polyglot helpers (#7183 ) Closes #7129	2023-07-05 14:12:13 +00:00
GregoryTravis	966f8b773a	Combine Regex and Pattern (#7172 ) Merge Pattern into Regex.	2023-07-05 13:51:53 +00:00
Radosław Waśko	2d73277238	Fix a bug that somehow went under CI (#7204 )	2023-07-05 08:54:27 +00:00
James Dunkerley	4fbe7e3830	Remove `Array.new` and `Array.copy` and move Vector functions to builtins. (#7147 ) - Removed Array methods: `new`, `copy` and `new_[1234]`. - New builtins for `Vector.insert`, `Vector.remove` and `Vector.flatten`. - Replaced `Vector_Builder` use of `Array.copy` to a `Vector.Builder` approach.	2023-07-03 12:41:41 +00:00
Radosław Waśko	4ccf3566ce	Implement `add_row_number` for Database backends, fix primary key inference for SQLite (#7174 ) Closes #6921 and also closes #7037	2023-07-03 11:51:42 +00:00
Dmitry Bushev	bb862141e5	Truncate long error messages (#7180 ) close #6958 # Important Notes On the screenshot, the `max_length` is set to 10 to illustrate the new behavior. ![2023-06-30-205255_758x483_scrot](https://github.com/enso-org/enso/assets/357683/0b593b12-4469-49fd-a2e5-216ce54eb264)	2023-06-30 19:06:19 +00:00
GregoryTravis	c866aa7fb5	parse_to_columns should generate at least one row for a non-match (#7171 )	2023-06-30 18:10:33 +00:00
GregoryTravis	550d146493	Add round, ceil, floor, truncate to the In-Database Column type (#6988 )	2023-06-30 16:47:40 +00:00
Radosław Waśko	6eac095579	Add support for `Filter_Condition` in `any`, `all`, `find`, `partition` and `index_of` (#7148 ) Closes #6628	2023-06-30 16:06:01 +00:00

464 Commits