132 Commits

Author SHA1 Message Date
Radosław Waśko
93a31fcc8b
Add benchmarks related to add_row_number performance investigation (#8091)
- Follow-up of #8055
- Adds a benchmark comparing performance of Enso Map and Java HashMap in two scenarios - _only incremental_ updates (like `Vector.distinct`) and _replacing_ updates (like keeping a counter for each key). These benchmarks can be used as a metric for #8090
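The two update patterns named above can be illustrated with a plain Python `dict`. This is only a sketch of the access patterns, not the benchmark code from the PR; the function names `distinct` and `count_per_key` are hypothetical.

```python
def distinct(values):
    """Incremental-only updates: each key is inserted at most once."""
    seen = {}
    for v in values:
        if v not in seen:
            seen[v] = True
    return list(seen)


def count_per_key(values):
    """Replacing updates: the entry for an existing key is overwritten repeatedly."""
    counts = {}
    for v in values:
        counts[v] = counts.get(v, 0) + 1
    return counts


data = ["a", "b", "a", "c", "b", "a"]
print(distinct(data))
print(count_per_key(data))
```

The first pattern only ever grows the map, while the second keeps rewriting existing entries, so the two can stress a map implementation quite differently.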
2023-10-18 17:21:59 +00:00
Radosław Waśko
e9fa12763e
Improve performance of add_row_number (#8076)
Fixes #8055
2023-10-17 00:42:35 +00:00
James Dunkerley
fac9e7a420
Expand capabilities of Table.set and better dropdown support (#8005)
- Adds the ability to use numbers, date/time and Boolean values as constants in `set`.
- `Table.set` can take a `Column_Operation`, allowing for deriving of a new column based on other columns.
- Added `Column_Ref` type to refer to a column in `filter`.
2023-10-13 16:03:28 +00:00
Radosław Waśko
0cd446432f
Fix inconsistency when building a Mixed column, fixes to Union (#7919)
- Fixes #7352 by remembering original value types in type inference mode to be able to reconstruct them for Mixed.
   - Added more benchmarks for comparing performance of constructing columns.
- Fixes missing implementations that caused `Table.union` crashing on some type pairs.
- Ensures that `Loss_Of_Integer_Precision` warning is not swallowed when numeric columns are unioned to create a `Float` column.
- Adds tests for all of the above cases.
- Allows outputting benchmark results to a CSV file by setting an environment variable, which is useful for quickly comparing benchmarks, e.g. in Enso.
2023-10-03 20:33:34 +02:00
Radosław Waśko
8d926166ea
Follow up improvements to Date_Time_Formatter (#7875)
- Closes #7872
- Also closes #7866
2023-09-28 09:38:00 +00:00
GregoryTravis
b0c1f3b00e
New Data.post for sending a payload to a Web API (#7700)
2023-09-19 11:26:29 +00:00
Hubert Plociniczak
1ee3d8f4f0
Rename Decimal to Float (#7807)
Implements #6889.
2023-09-14 15:01:30 +00:00
Radosław Waśko
7d424bf8a2
Implement Table.delete_rows. (#7709)
- Closes #7238
- Aligns `update_database_table` to a more consistent and clearer API - `update_rows`.
- Adds a `truncate_table` helper function, to pair up with `drop_table`. Both are `PRIVATE` for now.
- Adds tests for NULLs in keys in `update_rows` and `delete_rows`.
   - The behaviour is sometimes unexpected, so instead these fail with `Null_Values_In_Key_Columns`.
- Adds a workaround for https://github.com/oracle/graal/issues/7359
- Adds a workaround for a related bug where a stack frame has no name (its `rootNode.getName() == null`).
   - I could not track down this bug to provide a neat repro.
2023-09-07 11:07:53 +00:00
Radosław Waśko
87ce78615a
Change layout of local library search path in order to be able to move Round_Spec.enso back to Tests (#7634)
- Closes #7633
- Moves `Round_Spec.enso` from published `Standard.Test` into our `test/Tests` project; the `Table_Tests` that depend on it, simply `import enso_dev.Tests`.
- Changes the layout of the local libraries directory:
   - It used to be `root/<namespace>/<name>`.
   - Now it is `root/<dir>`; the namespace and name are read from `package.yaml` instead.
- Adds the parent directory of the current project to the default `ENSO_LIBRARY_PATH`.
   - It is treated as a secondary path, so the default `ENSO_HOME/lib` still takes precedence.
   - This allows projects to reference and load 'sibling' projects easily - the only requirement is for the project to enable `prefer-local-libraries: true` or add the other local project to its edition. The edition resolution logic is **not changed**.
2023-09-01 20:20:04 +00:00
James Dunkerley
7d83b3d7b4
Add GROUP to functions (#7622)
- Update list of groups to agreed list.
- Lower case `ALIAS` names to be consistent with function names.
- Add `GROUP` to methods.
- All constructors and functions have doc comments.
- Correct a few typos (e.g. `PRVIATE`).
- Mark some more things as `PRIVATE`.
- Use `ToDo:` and `Note:` consistently.
- Order tags in doc comment.

# Important Notes
We don't have doc comments on all types yet and will want to add them in the future.
2023-08-23 13:20:38 +00:00
Pavel Marek
a0086bb112
Ability to invoke all std benchmarks via jmh (#7519)
All the Enso benchmarks in `test/Benchmarks` can be invoked via JMH
2023-08-17 14:48:43 +02:00
GregoryTravis
c9d7c5cb2b
Convert in-memory Column.round to Java (#7521)
2023-08-16 14:45:23 +00:00
James Dunkerley
296c95d414
Fix for empty column on replace and out of memory catching for join and tab (#7593)
- Added a Panic.catch to catch heap memory error in joins and cross_tab.
- Adjusted column replace so type is correct.
2023-08-15 17:06:51 +00:00
Radosław Waśko
b656b336c7
Report Loss_Of_Integer_Precision when an integer is not exactly representable as a float during conversion (#7509)
Closes #7353

I introduce a new type `WithAggregatedProblems`, because `WithProblems` was too simple: it could only hold a `List<Problem>`, but `AggregatedProblems` is more than that. Ideally we shouldn't multiply entities like this too much. We should probably unify everything to use `WithAggregatedProblems`, but after starting this I realised it would likely take too much effort for this small PR. So instead, I created a follow-up task for this: #7514
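The condition named in the commit title, an integer that is not exactly representable as a float, can be checked with a round trip. The sketch below uses Python's 64-bit floats as a stand-in; the helper name `loses_precision` is hypothetical and this is not the Enso implementation.

```python
def loses_precision(n: int) -> bool:
    """Return True if converting n to a 64-bit float changes its value."""
    return int(float(n)) != n


print(loses_precision(2**53))      # 2**53 is exactly representable
print(loses_precision(2**53 + 1))  # 2**53 + 1 is rounded to a neighbouring float
```

A 64-bit float has a 53-bit significand, so every integer up to 2**53 in magnitude survives the round trip, while larger integers may not.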
2023-08-08 12:30:44 +00:00
Radosław Waśko
bc9cde6543
Fix column naming edge cases - invalid and duplicated columns, case-insensitive name aliasing for case-insensitive backends (#7495)
- Fixes #7412
- Also adds tests and fixes some more edge cases:
   - Ensures correct handling of existing Database tables whose column names may be invalid or clashing from Enso's perspective. For example, for most DBs `ś` and `s\u0301` are different names, but for Enso they are basically the same, so this would cause issues. Enso now renames such columns when accessed, while still using the correct column reference in the generated SQL under the hood.
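The `ś` versus `s\u0301` clash can be reproduced in a few lines. The sketch below assumes Unicode NFC normalization as a stand-in for how Enso treats the two names as equal; the exact comparison Enso uses is not stated in the commit message.

```python
import unicodedata

composed = "\u015b"     # 'ś' as a single code point
decomposed = "s\u0301"  # 's' followed by a combining acute accent

# A byte-wise or code-point-wise comparison sees two different names.
raw_equal = composed == decomposed

# After canonical normalization both spellings collapse to the same string.
normalized_equal = (
    unicodedata.normalize("NFC", composed)
    == unicodedata.normalize("NFC", decomposed)
)

print(raw_equal, normalized_equal)
```

A backend that compares names code point by code point can therefore hold both columns in one table, which is the situation the renaming handles.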
2023-08-04 09:04:38 +00:00
Radosław Waśko
c61c741476
Respect database backend naming limitations when generating table/column names and validate user-provided names to avoid silent name clashes; process JDBC warnings reported from backends (#7428)
- Closes #5951
- Ensures any SQL warnings reported by the database through the JDBC driver are processed and forwarded to the user.
   - These warnings show issues like the implicit name truncation that this PR is also solving. It's good to make sure they are visible as they can help avoid and understand unexpected problems. They should not show up in most standard workflows.
- Adds simple history to our REPL.
2023-08-03 09:44:27 +00:00
Radosław Waśko
4b5a2e2176
Fixing operations on Mixed types (#7368)
- Fixes #7231
- Cleans up vectorized operations to distinguish unary and binary operations.
- Introduces MixedStorage which may pretend to be a more specialized storage on demand.
- Ensures that operations request a more specialized storage on right-hand side to ensure compatibility with reported inferred storage type.
- Ensures that a dataflow error returned by an Enso callback in Java is propagated as a polyglot exception and can be caught back in Enso.
- Tests for comparison of Mixed storages with each other and other types.
- Started using `Set` for `Filter_Condition.Is_In` for better performance.
- ~~Migrated `Column.map` and `Column.zip` to use the Java-to-Enso callbacks.~~
   - This does not forward warnings. IMO we should not be losing them. We can switch and add a ticket to fix the warnings, but that would be a regression (current implementation handles them correctly). Instead, we should first gain some ability to work with warnings in polyglot. I created a ticket to get this figured out #7371
- ~~Trying to avoid conversions when calling Enso functions from Java.~~
   - Needs extra care as dataflow errors may not be handled right then. So only works for simple functions that should not error.
   - Not sure how much it really helps. [Benchmarks](https://github.com/enso-org/enso/pull/7270#issuecomment-1635618393) suggested it could improve the performance quite significantly, but the practical solution is not exactly the same as the one measured, so we may have to measure and tune it to get the best results.
   - Created #7378 to track this.
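The `Set`-based `Filter_Condition.Is_In` change listed above relies on a general idea: build a hash set from the allowed values once, then test each row against it. A minimal Python sketch of that idea, with a hypothetical helper name `filter_is_in`, might look like this. It is not the Enso or Java code from the PR.

```python
def filter_is_in(column, allowed):
    """Keep the values of `column` that occur in `allowed`."""
    allowed_set = set(allowed)  # built once; membership checks are O(1) on average
    return [v for v in column if v in allowed_set]


print(filter_is_in([1, 2, 3, 4, 2], [2, 4]))
```

With a list instead of a set, each membership check would scan the allowed values, so the total cost would grow with the product of both lengths.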
2023-07-25 23:25:17 +00:00
Radosław Waśko
56635c9a88
Add benchmarks comparing performance of Table operations 'vectorized' in Java vs performed in Enso (#7270)
The added benchmark is a basis for a performance investigation.

We compare the performance of the same operation run in Java vs Enso to see what is the overhead and try to get the Enso operations closer to the pure-Java performance.
2023-07-21 17:25:02 +00:00
Jaroslav Tulach
a5ec6a9e51
Bench builder API (#7324)
Designs a new `Bench` API to _collect benchmarks_ first and only then execute them. This is a minimal change to allow implementation of #7323 - e.g. the ability to invoke a _single benchmark_ via the JMH harness.
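The collect-first, execute-later idea can be sketched as follows. All names here (`BenchBuilder`, `add`, `run_single`, `run_all`) are hypothetical and chosen for illustration; they are not the Enso `Bench` API.

```python
import time


class BenchBuilder:
    """Hypothetical builder that registers benchmarks without running them."""

    def __init__(self):
        self._benchmarks = {}

    def add(self, name, fn):
        self._benchmarks[name] = fn
        return self  # allow chaining

    def names(self):
        return list(self._benchmarks)

    def run_single(self, name, iterations=3):
        """Run only the named benchmark and return its timings in seconds."""
        fn = self._benchmarks[name]
        timings = []
        for _ in range(iterations):
            start = time.perf_counter()
            fn()
            timings.append(time.perf_counter() - start)
        return timings

    def run_all(self, iterations=3):
        return {name: self.run_single(name, iterations) for name in self._benchmarks}


calls = []  # records which benchmark bodies actually ran
bench = (
    BenchBuilder()
    .add("sum", lambda: calls.append("sum") or sum(range(1000)))
    .add("sort", lambda: calls.append("sort") or sorted(range(1000, 0, -1)))
)
```

Because registration and execution are separate, an external harness can list the registered names and invoke exactly one of them, for example `bench.run_single("sum")`.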

# Important Notes
This is just the basic API skeleton. It can be enhanced, as long as the basic properties (allowing integration with JMH) are kept. It is not the intent of this PR to make the API 100% perfect and usable. Nor is it the goal of this PR to update existing benchmarks to use it (74ac8d7 changes only one of them to demonstrate that _it all works_ somehow). It is, however, expected that once this PR is integrated, newly written benchmarks (like the ones from #7270) will use (or even enhance) the new API.
2023-07-19 09:18:28 +00:00
James Dunkerley
7749286c69
Tidy up the imports using script (#7220)
Ordering the imports to test a script.
2023-07-06 14:22:50 +00:00
James Dunkerley
4fbe7e3830
Remove Array.new and Array.copy and move Vector functions to builtins. (#7147)
- Removed Array methods: `new`, `copy` and `new_[1234]`.
- New builtins for `Vector.insert`, `Vector.remove` and `Vector.flatten`.
- Replaced `Vector_Builder` use of `Array.copy` to a `Vector.Builder` approach.
2023-07-03 12:41:41 +00:00
James Dunkerley
56688ec1e7
Minor fixes. (#7122)
Mostly stuff to tidy up the static methods in the CB.

- Remove default pattern from `parse_to_table` (caused IDE to freeze).
- Rename any `_` arguments to what they are.
- Merge `Date.now` into `Date.today`
- Merge the Interval constructors into a single constructor.
- Hide various methods.
2023-06-27 18:18:15 +00:00
Radosław Waśko
2bac9cc844
Execution Context integration for Database write operations (#7072)
Closes #6887
2023-06-27 15:51:21 +00:00
James Dunkerley
937651f696
Code Clean Up, Fix Weird Namespace, S3 List Objects and Read Object (#7114)
Mostly a tidy up as part of looking over the function catalogue for groups.
Sorted some whitespace issues.
2023-06-24 23:18:58 +00:00
Pavel Marek
67821bf8df
Add compiler pass that discovers ambiguous imports (#6868)
Add a new compiler pass that analyses duplicated and ambiguous symbols from imports
2023-06-14 12:18:57 +02:00
Radosław Waśko
d9ed63fb89
Implement Insert update action for update_database_table. (#6990)
This adds the spec for all update actions, but implements only the common input validation framework and `Insert`. Tests for the remaining actions are marked as pending; these will be implemented in a subsequent PR.
2023-06-14 00:14:32 +00:00
GregoryTravis
912fbce97b
Reimplement Column.truncate, .ceil, and .floor as vectorized Java ops (#6941)
Reimplement these in Java.

Benchmarks:

Before:

Column.truncate floats average: 124.4ms
Column.ceil floats average: 121.47ms
Column.floor floats average: 120.18ms
Column.truncate ints average: 124.78ms
Column.ceil ints average: 120.41ms
Column.floor ints average: 102.35ms

After (boxed):

Column.truncate floats average: 3.75ms
Column.ceil floats average: 2.25ms
Column.floor floats average: 1.89ms
Column.truncate ints average: 2ms
Column.ceil ints average: 1.77ms
Column.floor ints average: 1.74ms

After (unboxed):
Column.truncate floats average: 3.32ms
Column.ceil floats average: 2.15ms
Column.floor floats average: 1.69ms
Column.truncate ints average: 1.74ms
Column.ceil ints average: 1.61ms
Column.floor ints average: 1.99ms
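
The speedup implied by these figures can be computed directly. The snippet below uses only the "Before" and "After (unboxed)" averages listed above, in milliseconds; the dictionary names are just for illustration.

```python
before = {
    "truncate floats": 124.4, "ceil floats": 121.47, "floor floats": 120.18,
    "truncate ints": 124.78, "ceil ints": 120.41, "floor ints": 102.35,
}
after_unboxed = {
    "truncate floats": 3.32, "ceil floats": 2.15, "floor floats": 1.69,
    "truncate ints": 1.74, "ceil ints": 1.61, "floor ints": 1.99,
}

# Ratio of the old average time to the new one, per operation.
speedups = {name: before[name] / after_unboxed[name] for name in before}
for name, ratio in speedups.items():
    print(f"Column.{name}: {ratio:.1f}x faster")
```

On these numbers every operation ends up between roughly 37x and 75x faster than before.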
2023-06-06 18:07:12 +00:00
James Dunkerley
62fecfa474
Widgets, Vector as Column, Viz Fixes and Rename Columns (#6768)
- Fix a couple of bugs in the Table viz: rounding of bottom div, missing character, not including row count as an option.
- Add better JSON format for `Row`, add support for visualization in the Table viz both for `Vector Row` or `Row`.
- Fix some type signature errors.
- Move `Column_Format` to `Standard.Table.Internal`.
- Move `format_widget` to `File_Format.default_widget` and sort the signature of `Widget` methods.
- Added utility to make `Single_Choice` widgets.
- Added dropdown for delimiter on split methods.
- Removed `default_widget` from `Problem_Behavior` and `Filter_Condition`.
- Altered signature and widgets for table functions.
- Added `to_column` extension to allow easy conversion of Range and Vector to Column.
- Added `compute`, `compute_bulk`, `running` to Column to allow statistic computation.
- Added drop down for `Table.write` format parameter.
- Added drop down for `Table.rename_columns`.
- Added support for Vector of pairs for renaming columns.
- Added check when making a map from Vector if not 2 items.

![image](https://github.com/enso-org/enso/assets/4699705/beed257c-efe3-44a3-9e3a-041354701735)
2023-05-19 23:24:47 +00:00
Radosław Waśko
a9a464af37
Implement simple variants of parse for the Database backend (#6731)
Implements the simplest `parse` scenarios for the Database backend.

Before #6711 these could have been done by `cast`, but in #6711 the APIs were unified to allow casting only to the same set of types in both the in-memory and Database backends. Converting Text to other types is supposed to be done by `parse` and not `cast`, so the ability to use `cast` for rudimentary parsing was removed from the Database backend to make it consistent with in-memory. As a result, the Database backend now lacks even the simplest Text->Int/Text->Date support. To alleviate that, the simple scenarios for `parse` are implemented (there is no support for format customization yet; it boils down to a cast under the hood).
2023-05-19 22:11:23 +00:00
Radosław Waśko
447786a304
Implement cast for Table and Column (#6711)
Closes #6112
2023-05-19 10:00:20 +00:00
Dmitry Bushev
706791779b
SuggestionBuilder needs to send ascribedType of constructor parameters (#6655)
close #6611

Changelog:
- update: run compiler passes on the `ascribedType` field of the constructor arguments
- update: suggestion builder uses the type information attached to `ascribedType`
- feat: resolve qualified names in type signatures
2023-05-13 18:33:03 +00:00
Radosław Waśko
d8b926922a
Improve Non_Unique_Primary_Key error, split file format detection into read/write, improve SQLite format detection (#6604)
Closes #6437
Related to #6410

- Add example duplicate row to `Non_Unique_Primary_Key`.
- Ensure `File.read` fails if the file does not exist, always.
- Ensure SQLite fails if file is empty or nonexistent or malformed.
- Split file format detection into read and write modes, so that the read mode can depend on actual file _contents_.
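Content-based detection for SQLite can rest on a documented fact: every SQLite 3 database file starts with the 16-byte header string `SQLite format 3\0`. The sketch below shows a check along those lines; the helper name `looks_like_sqlite` is hypothetical, and the PR's actual detection logic may differ.

```python
import os
import sqlite3
import tempfile

SQLITE_MAGIC = b"SQLite format 3\x00"  # first 16 bytes of a SQLite 3 database file


def looks_like_sqlite(path: str) -> bool:
    """Return True only if the file exists and starts with the SQLite header."""
    if not os.path.isfile(path):
        return False
    with open(path, "rb") as f:
        return f.read(len(SQLITE_MAGIC)) == SQLITE_MAGIC


with tempfile.TemporaryDirectory() as tmp:
    # A real database: creating a table forces the header to be written.
    db_path = os.path.join(tmp, "example.db")
    conn = sqlite3.connect(db_path)
    conn.execute("CREATE TABLE t (x INTEGER)")
    conn.commit()
    conn.close()

    # An empty file that merely has a database-like name.
    empty_path = os.path.join(tmp, "empty.db")
    open(empty_path, "wb").close()

    results = {
        "real database": looks_like_sqlite(db_path),
        "empty file": looks_like_sqlite(empty_path),
        "missing file": looks_like_sqlite(os.path.join(tmp, "missing.db")),
    }

print(results)
```

A check like this only makes sense in read mode, since in write mode the target file may not exist yet, which matches the read/write split described above.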
2023-05-09 17:15:44 +00:00
James Dunkerley
6b0c682b08
Add Execution Context control to Text.write (#6459)
- Adjusted `Context.is_enabled` to support default argument (moved built in so can have defaults).
- Made `environment` case-insensitive.
- Bug fix for play button.
- Shorthand to execute within an enabled context.
- Forbid file writing if the Output context is disabled with a `Forbidden_Operation` error.
- Add temporary file support via `File.create_temporary_file` which is deleted on exit of JVM.
- Execution Context first pass in `Text.write`.
   - Added dry run warning.
   - Writes to a temporary file if disabled.
   - Created a `DryRunFileManager` which will create and manage the temporary files.
- Added `format` dropdown to `File.read` and `Data.read`.
- Renamed `JSON_File` to `JSON_Format` to be consistent.

(still to unit test).
2023-04-29 08:39:18 +00:00
Hubert Plociniczak
ae3f9025e3
Invoke instance methods for Any overrides (#6441)
This change modifies method dispatch for methods that override Any's definitions. When an overridden method is invoked statically, we call Any's method to stay consistent.
This change primarily addresses the plethora of problems related to `to_text` invocations. It does not attempt to completely modify method dispatch logic.

Closes #6300.
2023-04-28 07:18:37 +00:00
Radosław Waśko
a43d524336
Add typechecks to Aggregate and Cross Tab (#6380)
Follow up of #6298 as it grew too much. Adds the needed typechecks to aggregate operations. Ensures that the DB operations report `Floating_Point_Equality` warning consistently with in-memory.
2023-04-24 08:55:54 +00:00
Radosław Waśko
8db2ad51a1
Adding typechecks to Column Operations (#6298)
Closes #6106
2023-04-21 12:20:12 +00:00
Pavel Marek
b42e910280
sort handles incomparable values (#5998)
* Update type ascriptions in some operators in Any

* Add @GenerateUncached to AnyToTextNode.

Will be used in another node with @GenerateUncached.

* Add tests for "sort handles incomparable types"

* Vector.sort handles incomparable types

* Implement sort handling for different comparators

* Comparison operators in Any do not throw Type_Error

* Fix some issues in Ordering_Spec

* Remove the remaining comparison operator overrides for numbers.

* Consolidate all sorting functionality into a single builtin node.

* Fix warnings attachment in sort

* PrimitiveValuesComparator handles other types than primitives

* Fix byFunc calling

* on function can be called from the builtin

* Fix build of native image

* Update changelog

* Add VectorSortTest

* Builtin method should not throw DataflowError.

If yes, the message is discarded (a bug?)

* TypeOfNode may not return only Type

* UnresolvedSymbol is not supported as `on` argument to Vector.sort_builtin

* Fix docs

* Fix bigint spec in LessThanNode

* Small fixes

* Small fixes

* Nothings and Nans are sorted at the end of default comparator group.

But not at the whole end of the resulting vector.

* Fix checking of `by` parameter - now accepts functions with default arguments.

* Fix changelog formatting

* Fix imports in DebuggingEnsoTest

* Remove Array.sort_builtin

* Add comparison operators to micro-distribution

* Remove Array.sort_builtin

* Replace Incomparable_Values by Type_Error in some tests

* Add on_incomparable argument to Vector.sort_builtin

* Fix after merge - Array.sort delegates to Vector.sort

* Add more tests for problem_behavior on Vector.sort

* SortVectorNode throws only Incomparable_Values.

* Delete Collections helper class

* Add test for expected failure for custom incomparable values

* Cosmetics.

* Fix test expecting different comparators warning

* isNothing is checked via interop

* Remove TruffleLogger from SortVectorNode

* Small review refactorings

* Revert "Remove the remaining comparison operator overrides for numbers."

This reverts commit 0df66b1080.

* Improve bench_download.py tool's `--compare` functionality.

- Output table is sorted by benchmark labels.
- Do not fail when there are different benchmark labels in both runs.

* Wrap potential interop values with `HostValueToEnsoNode`

* Use alter function in Vector_Spec

* Update docs

* Invalid comparison throws Incomparable_Values rather than Type_Error

* Number comparison builtin methods return Nothing in case of incomparables
2023-04-16 16:40:12 +02:00
Radosław Waśko
f5db35af07
Adjust {Table|Column}.parse to use Value_Type (#6213)
Closes #5660
2023-04-06 10:58:55 +00:00
Radosław Waśko
6ddcb553e5
Date/time support for Postgres. Year/month/day operations on Columns. (#6153)
Closes #6115
2023-03-31 18:37:04 +00:00
Radosław Waśko
6f86115498
Proper implementation of Value Types in Table (#6073)
This is the first part of the #5158 umbrella task. It closes #5158; follow-up tasks are listed as a comment in the issue.

- Updates all prototype methods dealing with `Value_Type` with a proper implementation.
- Adds a more precise mapping from in-memory storage to `Value_Type`.
- Adds a dialect-dependent mapping between `SQL_Type` and `Value_Type`.
- Removes obsolete methods and constants on `SQL_Type` that were not portable.
- Ensures that in the Database backend, operation result types are computed based on what the Database intends to return (by asking the Database about the expected types of each operation).
   - But also ensures that the result types are sane.
   - While SQLite does not officially support a BOOLEAN affinity, we add a set of type overrides to our operations to ensure that Boolean operations return Boolean values and are not changed to integers as SQLite would suggest.
   - Some methods in SQLite fall back to a NUMERIC affinity unnecessarily, so stuff like `max(text, text)` will keep the `text` type instead of falling back to numeric as SQLite would suggest.
- Adds ability to use custom fetch / builder logic for various types, so that we can support vendor specific types (for example, Postgres dates).
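
The SQLite Boolean point in the list above is easy to observe with Python's built-in `sqlite3` module: a comparison evaluates to an integer, not to a distinct Boolean type. This only demonstrates the SQLite behaviour that motivates the overrides; it does not show how the Enso overrides are implemented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite has no separate Boolean storage class: comparisons yield the integers 0 or 1.
bool_type = conn.execute("SELECT typeof(1 = 1)").fetchone()[0]
bool_value = conn.execute("SELECT 1 = 1").fetchone()[0]
conn.close()

print(bool_type, bool_value)
```

A library that trusted the type reported by SQLite alone would therefore label the result of a Boolean operation as an integer column.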

# Important Notes
- There are some TODOs left in the code. I'm still aligning follow-up tasks - once done I will try to add references to relevant tasks in them.
2023-03-31 16:16:18 +00:00
Hubert Plociniczak
8c6fd60aaf
Detect conflicts between exported types and FQNs (#5986)
Exporting types named the same as the module where they are defined in `Main` modules of library components may lead to accidental name conflicts. This became apparent when trying to access `Problem_Behavior` module via a fully qualified name and the compiler rejected it. This is due to the fact that `Main` module exported `Error` type defined in `Standard.Base.Error` module, thus making it impossible to access any other submodules of `Standard.Base.Error` via a fully qualified name.

This change adds a warning to FullyQualifiedNames pass that detects any such future problems.
While only `Error` module was affected, it was widely used in the stdlib, hence the number of changes.

Closes #5902.

# Important Notes
I left out the potential conflict in micro-distribution, thus ensuring we actually detect and report the warning.
2023-03-21 21:09:41 +00:00
James Dunkerley
7c9b9ead8e
Fix up some type signatures... (#5979)
Align any type signatures with a mismatch in count between types and arguments.
2023-03-17 11:53:23 +00:00
Jaroslav Tulach
8bbdd1af5b
Meta.is_a consistent with case-type-of check (#5853)
Removing special handling of `AtomConstructor` in `Meta.is_a` check.

# Important Notes
A lot of tests are about to fail. Many of them indirectly call `Meta.is_a` with a constructor rather than type.
2023-03-10 07:41:04 +00:00
Radosław Waśko
2d29456ed1
Review File/Data read and read_text warnings (#5799)
Closes #5113

Fixes a bug where read-only files would be overwritten if `File.write` was used in backup mode, and adds tests to avoid such a regression. To implement it, introduces an `is_writable` property on `File`.
2023-03-06 03:43:38 +00:00
Radosław Waśko
793eafc866
Improve Table.parse_values API (#5692)
Closes #5111
2023-02-24 13:35:01 +00:00
GregoryTravis
3a09ee88f6
Wip/gmt/match find only text (#5721)
Rename is_match + match to match + find (respectively), and remove all non-regexp functionality.

Regexp flags and Match_Mode are also no longer supported by these methods.
2023-02-23 09:47:10 +00:00
Radosław Waśko
a02eab451e
Implement basic warnings for column arithmetic, review warnings on expressions and filter (#5605)
Closes #5109

# Important Notes
- Currently the tests pass for the in-memory parts of Common_Table_Operations, but some things are still not working on DB backends (in progress).
2023-02-14 09:33:04 +00:00
Radosław Waśko
b9dbfd036f
First steps of the Problem Handling refactor to the new design (#4086)
Implements:
- https://www.pivotaltracker.com/story/show/184226137
- https://www.pivotaltracker.com/story/show/184226434
- https://www.pivotaltracker.com/story/show/184226462
2023-01-30 16:48:06 +00:00
Radosław Waśko
d2e57edc8b
Add Table.cross_join and Table.zip to In-Memory Table (#4063)
Implements https://www.pivotaltracker.com/story/show/184239059
2023-01-23 13:19:52 +00:00
Hubert Plociniczak
3379ce51f2
Report failed name resolutions in type signatures (#4030)
The compiler performed name resolution of literals in type signatures but would silently fail to report any problems.
This meant that wrong names or forgotten imports could sneak into the stdlib.

This change introduces 2 main changes:
1) failed name resolutions are appended in the `TypeNames` pass
2) the `GatherDiagnostics` pass also collects and reports failures from the type signatures IR

Updated the stdlib so that it passes with the correct gatekeepers in place.
2023-01-09 10:35:36 +00:00