Commit Graph

256 Commits

Author SHA1 Message Date
Radosław Waśko
0088096a58
Implement Distinct for the Database backends (#4027)
Implements https://www.pivotaltracker.com/story/show/182307281
2023-01-11 22:46:54 +00:00
Hubert Plociniczak
ae0889e843
Make ArrayOverBuffer behave like an Array/Array.sort no longer mutates the Array (#4022)
Most of the problems with accessing `ArrayOverBuffer` have been resolved by using `CoerceArrayNode` (https://github.com/enso-org/enso/pull/3817). In `Array.sort` we still however specialized on Array which wasn't compatible with `ArrayOverBuffer`. Similarly sorting JS or Python arrays wouldn't work.

Added a specialization to `Array.sort` to deal with that case. A generic specialization (with `hasArrayElements`) not only handles `ArrayOverBuffer` but also polyglot arrays coming from JS or Python. We could have an additional specialization for `ArrayOverBuffer` only (removed in the last commit) that returns `ArrayOverBuffer` rather than `Array` although that adds additional complexity which so far is unnecessary.

Also fixed an example in `Array.enso` by providing a default argument.
2023-01-09 17:49:49 +00:00
Jaroslav Tulach
41b2aac39f
Removing Unsafe.set_atom_field (#4023)
Introducing `Meta.atom_with_hole` to create an `Atom` _with a hole_ that is then _safely_ filled in later.
2023-01-09 13:39:14 +00:00
Hubert Plociniczak
3379ce51f2
Report failed name resolutions in type signatures (#4030)
Compiler performed name resolution of literals in type signatures but would silently fail to report any problems.
This meant that wrong names or forgotten imports would sneak in to stdlib.

This change introduces 2 main changes:
1) failed name resolutions are appended in `TypeNames` pass
2) `GatherDiagnostics` pass also collects and reports failures from type
signatures IR

Updated stdlib so that it passes given the correct gatekeepers in place.
2023-01-09 10:35:36 +00:00
Radosław Waśko
8c661fdb74
Database Joins (#4007)
Implements https://www.pivotaltracker.com/story/show/184032869

# Important Notes
- Currently we get failures in Full joins on Postgres which show a more serious problem - amending equality to ensure that `[NULL = NULL] == True` breaks hash/merge based indexing - so such joins will be extremely inefficient. All our joins currently rely on this notion of equality which will mean all of our DB joins will be extremely inefficient.
- We need to find a solution that will support nulls and still work OK with indices (but after exploring a few approaches: `COALESCE(a = b, a IS NULL AND b is NULL)`, `a IS NOT DISTINCT FROM b`, `(a = b) OR (a IS NULL AND b is NULL)`; all of which did not work (they all result in `ERROR: FULL JOIN is only supported with merge-joinable or hash-joinable join conditions`) I'm less certain that it is possible. Alternatively, we may need to change the NULL semantics to align it with SQL - this seems like likely the simpler solution, allowing us to generate simple, reliable SQL - the NULL=NULL solution will be cornering us into nasty workarounds very dependent on the particular backend.
2023-01-05 10:36:22 +00:00
Dmitry Bushev
1e5e2327ab
Improve performance of Text.compare_to (#4012)
PR adds a flag to `Text` implementation tracking whether it is in a FCD normal form. Then this information can be used in the `Normalizer.compare` method.



| Benchmark name | Old (ms) | With flag (ms)
| --- | --- | ---
| Unicode very short | 40.29 | 40.04
| Unicode medium | 9.07 | 1.99
| Unicode big - random | 115.39 | 0.35
| Unicode big - early difference | 107.02 | 0.54
| Unicode big - late difference | 749.81 | 94.73
| ASCII very short | 28.13 | 31.13
| ASCII medium | 4.58 | 2.26
| ASCII big - random | 42.68 | 0.26
| ASCII big - early difference | 30.91 | 0.32
| ASCII big - late difference | 66.29 | 42.72

Full benchmark output.
[bench_old.txt](https://github.com/enso-org/enso/files/10325202/bench_old.txt)
[bench_new.txt](https://github.com/enso-org/enso/files/10325201/bench_new.txt)
2023-01-02 17:09:03 +00:00
Pavel Marek
e6838bc90d
Convert Any.== to a builtin (#3956)
`Any.==` is a builtin method. The semantics is the same as it used to be, except that we no longer assume `x == y` iff `Meta.is_same_object x y`, which used to be the case and caused failures in table tests.

# Important Notes
Measurements from `EqualsBenchmarks` shows that the performance of `Any.==` for recursive atoms increased by roughly 20%, and the performance for primitive types stays roughly the same.
2022-12-29 21:20:00 +00:00
Dmitry Bushev
74742d3267
Make To Text Conversion Identity for Text (#4009)
First part of fixing `Text.to_text`.
- add: `pretty` method for pretty printing.
- update: make `Text.to_text` conversion identity for Text

In the next iterations `to_text` will be gradually replaced with `to Text` conversion once the related issues with conversions are fixed.
2022-12-29 12:21:24 +00:00
Jaroslav Tulach
7252af6d62
Enso.getMetaObject, Type.isMetaInstance and Meta.is_a consolidation (#3949)
Implements `getMetaObject` and related messages from Truffle interop for Enso values and types. Turns `Meta.is_a` into builtin and re-uses the same functionality.

# Important Notes
Adds `ValueGenerator` testing infrastructure to provide unified access to special Enso values and builtin types that can be reused by other tests, not just `MetaIsATest` and `MetaObjectTest`.
2022-12-22 08:00:06 +00:00
Radosław Waśko
c0c0abe4fe
Add benchmarks comparing ArrayProxy with elements generated ad-hoc with a regular Vector (#3831) 2022-12-21 20:15:39 +00:00
James Dunkerley
d7eb8bdcdc
Workaround the JS serialisation issue for errors. (#4000)
Currently, getting a PolyglotProxy error as a JavaScript string is being passed.
This works around the problem by concatenating an empty string.
2022-12-20 18:22:25 +00:00
James Dunkerley
579d3fc397
Adds Date, Time_Of_Day and Date_Time support to Excel IO (#3997)
- Allow date time inputs from Excel.
- Enables disabled test.
- Fix for Map.==.
- Allow nulls in crosstab name.
2022-12-20 16:12:00 +00:00
James Dunkerley
ace459ed53
Let JavaScript parse JSON and write JSON ... (#3987)
Use JavaScript to parse and serialise to JSON. Parses to native Enso object.
- `.to_json` now returns a `Text` of the JSON.
- Json methods now `parse`, `stringify` and `from_pairs`.
- New `JSON_Object` representing a JavaScript Object.
- `.to_js_object` allows for types to custom serialize. Returning a `JS_Object`.
- Default JSON format for Atom now has a `type` and `constructor` property (or method to call for as needed to deserialise).
- Removed `.into` support for now.
- Added JSON File Format and SPI to allow `Data.read` to work.
- Added `Data.fetch` API for easy Web download.
- Default visualization for JS Object trunctes, and made Vector default truncate children too.

Fixes defect where types with no constructor crashed on `to_json` (e.g. `Matching_Mode.Last.to_json`.
Adjusted default visualisation for Vector, so it doesn't serialise an array of arrays forever.
Likewise, JS_Object default visualisation is truncated to a small subset.

New convention:
- `.get` returns `Nothing` if a key or index is not present. Takes an `other` argument allowing control of default.
- `.at` error if key or index is not present.
- `Nothing` gains a `get` method allowing for easy propagation.
2022-12-20 10:33:46 +00:00
Hubert Plociniczak
49204e92cf
Simplify exception handling for polyglot exceptions (#3981)
This removes the special handling of polyglot exceptions and allows matching on Java exceptions in the same way as for any other types.

`Polyglot_Error`, `Panic.catch_java` and `Panic.catch_primitive` are gone

The change mostly deals with the backslash of removing `Polyglot_Error` and two `Panic` methods.
`Panic.catch` was implemented as a builtin instead of delegating to `Panic.catch_primitive` builtin that is now gone.

This fixes https://www.pivotaltracker.com/story/show/182844611
2022-12-19 19:16:43 +00:00
Hubert Plociniczak
d09dd1f697
Allow for dry-running benchmark test (#3989)
When integrated with CI, will guard against any compilation failures. Can be enabled via env var `ENSO_BENCHMARK_TEST_DRY_RUN="True"`:
```
ENSO_BENCHMARK_TEST_DRY_RUN="True" built-distribution/enso-engine-0.0.0-dev-linux-amd64/enso-0.0.0-dev/bin/enso --run test/Benchmarks
```

# Important Notes
/cc @mwu-tow this could be run only on linux PRs in CI
2022-12-16 15:35:49 +00:00
Jaroslav Tulach
4b4167fc06
Curried and lambda function invocation should be both fast (#3979)
Benchmark to compare _curried and lambda_ based function invocations and a fix to make _curried_ invocation (at least) as fast as the _lambda_ one. Allows us to use _curried invocations_ in standard library again without loosing any speed.

# Important Notes
Execute as:
```
sbt:runtime> benchOnly CurriedFunctionBenchmarks
```
Prior to subsequent bugfixes in this PR the benchmark results were:
- `averageCurried` runs in 0.290 ms
- `averageLambda` runs in 0.122 ms

e.g. _curried invocations_ is more than twice slow. That confirms our findings from the `Array_Proxy` vector benchmarks. The problem is that _function object is not compilation final_. After fixing it we have following results:
- `averageCurried` runs in 0.102 ms
- `averageLambda` runs in 0.111 ms

e.g. both operations are of similar complexity.
2022-12-16 07:12:24 +00:00
Radosław Waśko
b9bf958f2c
Efficient joining for Equals and Equals_Ignore_Case using a hashmap (#3978)
- Implemented https://www.pivotaltracker.com/story/show/183913276
- Refactored MultiValueIndex and MultiValueKeys to be more type-safe and more direct about using ordered or unordered maps.
- Added performance tests ensuring we use an efficient algorithm for the joins (the tests will fail for a full O(N*M) scan).
- Removed some duplicate code in the Table library.
- Added optional coloring of test results in terminal to make failures easier to spot.
2022-12-14 22:56:20 +00:00
James Dunkerley
77fe69dfd9
JSON Improvements, small Table stuff, Statistic in Enso not Java and few other minor bits. (#3964)
- Aligned `compare_to` so returns `Type_Error` if `that` is wrong type for `Text`, `Ordering` and `Duration`.
- Add `empty_object`, `empty_array`. `get_or_else`, `at`, `field_names` and `length` to `Json`.
- Fix `Json` serialisation of NaN and Infinity (to "null").
- Added `length`, `at` and `to_vector` to Pair (allowing it to be treated as a Vector).
- Added `running_fold` to the `Vector` and `Range`.
- Added `first` and `last` to the `Vector.Builder`.
- Allow `order_by` to take a single `Sort_Column` or have a mix of `Text` and `Sort_Column.Name` in a `Vector`.
- Allow `select_columns_helper` to take a `Text` value. Allows for a single field in group_by in cross_tab.
- Added `Patch` and `Custom` to HTTP_Method.
- Added running `Statistic` calculation and moved more of the logic from Java to Enso. Performance seems similar to pure Java version now.
2022-12-14 19:40:27 +00:00
Marian Stramm
f0a9f72428
Add with_clue support to test framework (#3786)
Add `Test.with_clue` function similar to ScalaTest's [with_clue](https://www.scalatest.org/user_guide/using_assertions).
This is useful for tests where the assertion depends on context which is not clear from it's operands.

### Important Notes

Using with_clue changes `State Clue` which is extracted by the `fail` function.
The State is introduced by `Test.specify` when the test is not pending.

There should be no changes to the public API apart form the addition of `Test.with_clue`.
2022-12-13 13:46:50 +01:00
Jaroslav Tulach
ec047d6ac1
Speeding up Array_Proxy.from_proxy_object twice (#3969)
Using lambda instead of higher order function.

# Important Notes
Running:
```
sbt:runtime> benchOnly VectorBenchmarks.averageOverArrayProxy
```
speeds up from `0.038 ms/op` to `0.016 ms/op` on my computer. Which seems good enough.
2022-12-10 10:35:14 +00:00
Radosław Waśko
8e880e430b
Improve basic join implementation (#3958)
Implements https://www.pivotaltracker.com/story/show/183913232

# Important Notes
Added counts of succeeded/failed tests within a group and global summary, to easier see how many tests failed.
2022-12-09 00:55:07 +00:00
James Dunkerley
11e07f8676
Use the MultiValueIndex for the JoinStrategy. (#3959)
Use the MultiValueStrategy for pure equals Joins.
2022-12-08 12:24:53 +00:00
James Dunkerley
da0dc253cb
Fix order by Text (#3957)
Mistake in the definition.
2022-12-07 19:16:32 +00:00
James Dunkerley
8f30fcf376
Fix issue where constructor was becoming a value. Fixes to_json. (#3955)
If a constructor is fully specified the Meta constructor was becoming the atom in a few places.

This broke to_json and hence viz.
2022-12-07 17:20:56 +00:00
Hubert Plociniczak
0855b74875
Vector should preserve warnings (#3938)
* Sequence literal (Vector) should preserve warnings

When Vector was created via a sequence literal, we simply dropped any
associated any warnings associated with it.
This change propagates Warnings during the creation of the Vector.
Ideally, it would be sufficient to propagate warnings from the
individual elements to the underlying storage but doesn't go well with
`Vector.fromArray`.

* update changelog

* Array-like structures preserver warnings

Added a WarningsLibrary that exposes `hasWarnings` and `getWarnings`
messages. That way we can have a single storage that defines how to
extract warnings from an Array and the others just delegate to it.

This simplifies logic added to sequence literals to handle warnings.

* Ensure polyglot method calls are warning-free

Since warnings are no longer automatically extracted from Array-like
structures, we delay the operation until an actual polyglot method call
is performed.

Discovered a bug in `Warning.detach_selected_warnings` which was missing
any usage or tests.

* nits

* Support multi-dimensional Vectors with warnings

* Propagate warnings from case branches

* nit

* Propagate all vector warnings when reading element

Previously, accessing an element of an Array-like structure would only
return warnings of that element or of the structure itself.
Now, accessing an element also returns warnings from all its elements as
well.
2022-12-07 11:10:11 +01:00
James Dunkerley
4cbd72a4eb
Some more tidying based on remaining tickets and PR comments. (#3946)
- Moved `to_default_visualization_data` to `Standard.Visualization`.
- Remove the use of `is_a` in favour of case statements.
- Stop exporting Standard.Base.Error.Common.
- Separate errors to own files.
- Change constructors to be called `Error`.
- Rename `Caught_Panic.Caught_Panic_Data` -> `Caught_Panic.Panic`.
- Rename `Project_Description.Project_Description_Data` ->`Project_Description.Value`
- Rename `Regex_Matcher.Regex_Matcher_Data` -> `Regex_Matcher.Value` (can't come up with anything better!).
- Rename `Range.Value` -> `Range.Between`.
- Rename `Interval.Value` -> `Interval.Between`.
- Rename `Column.Column_Data` -> `Column.Value`.
- Rename `Table.Table_Data` -> `Table.Value`.
- Align all the Error types in Table.
- Removed GEO Json bits from Table.
- `Json.to_table` doesn't have the GEO bits anymore.
- Added `Json.geo_json_to_table` to add the functions back in.

# Important Notes
No more exports from anywhere but Main!
No more `_Data` constructors!
2022-12-06 18:35:18 +00:00
James Dunkerley
0ad70c6332
Tidy Standard.Base part 5 of n ... (hopefully the end...) (#3929)
- Moved `Any`, `Error` and `Panic` to `Standard.Base`.
- Separated `Json` and `Range` extensions into own modules.
- Tidied `Case`, `Case_Sensitivity`, `Encoding`, `Matching`, `Regex_Matcher`, `Span`, `Text_Matcher`, `Text_Ordering` and `Text_Sub_Range` in `Standard.Base.Data.Text`.
- Tidied `Standard.Base.Data.Text.Extensions` and stopped it re-exporting anything.
- Tidied `Regex_Mode`. Renamed `Option` to `Regex_Option` and added type to export.
- Tidied up `Regex` space.
- Tidied up `Meta` space.
- Remove `Matching` from export.
- Moved `Standard.Base.Data.Boolean` to `Standard.Base.Boolean`.

# Important Notes
- Moved `to_json` and `to_default_visualization_data` from base types to extension methods.
2022-12-02 18:08:14 +00:00
Edward Kmett
d9cdb32121
implement compare_to for Ordering (#3936)
Ordering should itself be ordered.

# Important Notes
Implement the obvious Ord instance for Ordering and supply a test.
2022-12-01 18:42:46 +00:00
Jaroslav Tulach
099f045178
Integer.parse and Decimal.parse improvements (#3934)
Converting `Integer.parse` into a builtin and making sure it can parse big values like `100!`. Adding `locale` parameter to `Locale.parse` and making sure it parses `32,5` as `32.5` double in Czech locale.
2022-12-01 11:25:28 +00:00
Hubert Plociniczak
06bd69436b
Import modules' extension methods only with unqualified import statements (#3906)
# Important Notes
Note that one cannot
```
import Standard.Table as Table_Module
```
because of the 2-component name restriction that gets desugared to `Standard.Table.Main` and we have to write
```
import Standard.Table.Main as Table_Module
```
in a few places. Once we move `Json.to_table` extension this can be improved.
2022-12-01 10:13:34 +00:00
Hubert Plociniczak
20c22f2422
from/all import must not include module in name resolution (#3931)
It appears that when were doing
`from XYZ import all`
the module `XYZ` was also being taken into account during name resolution.
This was unfortunate and became problematic when one had a type with the same name defined in it.
During pattern matching one could not simply do
```
from XYZ import all
...
case ... of
_ : XYZ -> ...
```
since the compiler would complain that we try to pattern match on a type but give it a module.

The module is now excluded from the name resolution, when importing everything from the module.
It appears that this "feature" was used in a number of our tests, so they had to be adapted.
This fixes task 4 in https://www.pivotaltracker.com/story/show/183833055
2022-11-30 16:28:57 +00:00
James Dunkerley
c37eade954
Use Vector.new for fill and tuning... (#3744)
- Make `Vector.fill` use the `Vector.new` method.
- Tuning of some Range methods to try and get better performance.

| Test | Old | Current | Change |
| --- | --- | --- | --- |
| New Vector | 77.5 | 72.5 | 94% |
| Fill Constant | 71.8 | 42.1 | 59% |
| Fill Random | 156.5 | 124.2 | 79% |
| Append Single | 13.3 | 3.9 | 29% |
| Append Large | 13.0 | 4.9 | 38% |
| Sum | 146.4 | 122.3 | 84% |
| Drop First 20 and Sum | 148.0 | 132.7 | 90% |
| Drop Last 20 and Sum | 145.3 | 138.0 | 95% |
| Filter | 79.4 | 68.5 | 86% |
| Filter With Index | 152.9 | 158.5 | 104% |
| Map & Filter | 438.0 | 440.7 | 101% |
| Partition | 256.4 | 296.7 | 116% |
| Partition With Index | 410.0 | 392.0 | 96% |
| Each | 117.4 | 103.8 | 88% |
2022-11-30 09:20:07 +00:00
James Dunkerley
4518f8303d
Implementing transpose and cross_tab for the InMemory table. (#3919)
- Adds transpose and cross_tab to the In-Memory table.
- Cross Tab is built on top of aggregate and hence allows for expressions and has same error trapping as in aggregate.

# Important Notes
Only basic tests have been implemented. Error and warning tests will be added as a follow up task.
2022-11-30 01:19:25 +00:00
Radosław Waśko
85cbf7d9f9
Initial (naive) implementation for in memory join (#3918)
Implements https://www.pivotaltracker.com/story/show/183854123

It features a naive full scan join and only allows equality conditions. More advanced conditions and better optimized algorithms will be implemented in a subsequent PR.
2022-11-29 19:37:31 +00:00
James Dunkerley
4e30b3036d
Tidy Standard.Base part 4 of n ... (#3898)
- Export all for `Problem_Behavior` (allowing for Report_Warning, Report_Error and Ignore to be trivially used).
- Renamed `Range.Range_Data` to `Range.Value` moved to using `up_to` wherever possible.
- Reviewed `Function`, `IO`, `Polyglot`, `Random`, `Runtime`, `System`.
- `File` now published as type. Some static methods moved to `Data` others into type. Removed `read_bytes` static.
- New `Data` module for reading input data in one place (e.g. `Data.read_file`) will add `Data.connect` later.
- Added `Random` module to the exports.
- Move static methods into `Warning` type and exporting the type not the module.

# Important Notes
- Sorted a few imports into order (ordering by direct import in project, then by from import in project then polyglot and finally self imports).
2022-11-25 02:00:16 +00:00
Hubert Plociniczak
4c5868cf0e
Fix imports for Index_Sub_Range's constructors (#3902)
The change that is now allowed due to https://github.com/enso-org/enso/pull/3897

This fixes the first problem mentioned in https://www.pivotaltracker.com/n/projects/2539304/stories/183833055
2022-11-24 22:36:58 +00:00
Dmitry Bushev
5a0aad16eb
Check that type names are resolved (#3895)
RuntimeStdlibTest now checks that names in type signatures are qualified (aka poor man's typechecker)

Will be merged after the stdlib tidying by James is complete.
2022-11-21 22:33:36 +00:00
James Dunkerley
93fee3a51f
Tidy Standard.Base part 3 of n ... (#3893)
Here we go again...
- Tidied up `Pair` and stopped exporting `Pair_Data`. Adjusted so type exported.
- Tidy imports for `Json`, `Json.Internal`, `Locale`.
- Tidy imports Ordering.*. Export `Sort_Direction` and `Case_Sensitivity` as types.
- Move methods of `Statistics` into `Statistic`. Publishing the types not the module.
- Added a `compute` to a `Rank_Method`.
- Tidied the `Regression` module.
- Move methods of `Date`, `Date_Time`, `Duration`, `Time_Of_Day` and `Time_Zone` into type. Publishing types not modules.
- Added exporting `Period`, `Date_Period` and `Time_Period` as types. Static methods moved into types.

# Important Notes
- Move `compare_to_ignore_case`, `equals_ignore_case` and `to_case_insensitive_key` from Extensions into `Text`.
- Hiding polyglot java imports from export all in `Main.enso`.
2022-11-21 15:30:18 +00:00
James Dunkerley
99bacc5c06
Tidy Standard.Base Part 2 of n... (#3889)
- Moved static methods into `Locale` type. Publishing type not module.
- Stop publishing `Nil` and `Cons` from `List`.
- Tidied up `Json` and merged static in to type. Sorted out various type signatures which used a `Constructor`. Now exporting type and extensions.
- Tidied up `Noise` and merge `Generator` into file. Export type not module.
- Moved static method of `Map` into type. Publishing type not module.

# Important Notes
- Move `Text.compare_to` into `Text`.
- Move `Text.to_json` into `Json`.
2022-11-19 08:01:45 +00:00
Radosław Waśko
5b6fd74929
Data analysts should be able to Text.match, Text.match_all, Text.is_match to find or check matches (#3841)
Implements https://www.pivotaltracker.com/story/show/181266092

# Important Notes
Also renaming `Text.location_of` and `Text.location_of_all` to `Text.locate` and `Text.locate_all`.
2022-11-18 22:17:42 +00:00
James Dunkerley
14dbe7287b
Tidy Standard.Base Part 1 of n... (#3884)
* Tidy Bound and Interval.

* Fix Interval tests.

* Fix Interval tests.

* Restructure Index_Sub_Range to new Type/Statics.

* Adjust for Vector exported as a type and static methods on it.

* Tidy Maybe.

* Fix issue with Line_Ending_Style.

* Revert Filter_Condition change.
Fix benchmark test issue.
Tidy imports on Index_Sub_Range.

* Revert Filter_Condition change.
Fix benchmark test issue.
Tidy imports on Index_Sub_Range.

* Can't export constructors unless exported from type in module.

* Fix failing tests.
2022-11-18 08:57:41 +00:00
James Dunkerley
c868ed5efe
Some minor fixes (#3874)
- Allow `Map` to store a `Nothing` key (fixes `Vector.distinct` with a `Nothing`).
- Add `column_names` method to `Table` as a shorthand.
- Return data flow error when comparing with Nothing (not a Panic or a Polyglot exception).
- Allow milli and micro second for DateTime and Time Of Day

# Important Notes
- Added a load of tests for the various comparison operators to Numbers_Spec.
2022-11-17 07:11:18 +00:00
Hubert Plociniczak
7b0759f8b3
Don't add module's builtins to the scope of a builtin type (#3791)
It appears that we were always adding builtin methods to the scope of the module and the builtin type that shared the same name.
This resulted in some methods being accidentally available even though they shouldn't.

This change treats differently builtins of types and modules and introduces auto-registration feature for builtins.
By default all builtin methods are registered with a type, unless explicitly defined in the annotation property.
Builtin methods that are auto-registered do not have to be explicitly defined and are registered with the underlying type.
Registration correctly infers the right type, depending whether we deal with static or instance methods.

Builtin methods that are not auto-registered have to be explicitly defined **always**. Modules' builtin methods are the prime example.

# Important Notes
Builtins now carry information whether they are static or not (inferred from the lack of `self` parameter).
They also carry a `autoRegister` property to determine if a builtin method should be automatically registered with the type.
2022-11-16 10:23:52 +00:00
Kaz Wesley
a1db36b57c
Support mixed constructors/bindings in types (#3870)
Libraries: Revert changes that were necessitated by a new rule we have decided not to introduce.

Parser:
- Support mixed constructors/bindings in types.
- Disallow zero-length hex sequences in character escapes: `\x`, `\u`, `\u{}`, `\U`, `\U{}` are no longer legal synonyms for `\0` (matches old parser behavior).
2022-11-14 20:24:07 +00:00
Jaroslav Tulach
ecd1fdc3f8
Caching the grapheme_length of a Text (#3864)
Computing length of a text takes time. Let's cache it after first computation.

# Important Notes
Wrote `StringBenchmarks` that sums lengths of (the same) `Text` present many time in a `Vector`. Initially it took `383.673 ms` per operation. Then it took `0.031 ms/op`. Looks like the `length` calls are returning instantly as they get cached.
2022-11-14 15:53:10 +00:00
James Dunkerley
45276b243d
Expanding Derived Columns and Expression Syntax (#3782)
- Added expression ANTLR4 grammar and sbt based build.
- Added expression support to `set` and `filter` on the Database and InMemory `Table`.
- Added expression support to `aggregate` on the Database and InMemory `Table`.
- Removed old aggregate functions (`sum`, `max`, `min` and `mean`) from `Column` types.
- Adjusted database `Column` `+` operator to do concatenation (`||`) when text types.
- Added power operator `^` to both `Column` types.
- Adjust `iif` to allow for columns to be passed for `when_true` and `when_false` parameters.
- Added `is_present` to database `Column` type.
- Added `coalesce`, `min` and `max` functions to both `Column` types performing row based operation.
- Added support for `Date`, `Time_Of_Day` and `Date_Time` constants in database.
- Added `read` method to InMemory `Column` returning `self` (or a slice).

# Important Notes
- Moved approximate type computation to `SQL_Type`.
- Fixed issue in `LongNumericOp` where it was always casting to a double.
- Removed `head` from InMemory Table (still has `first` method).
2022-11-08 15:57:59 +00:00
James Dunkerley
b5881efdf0
Allow integers for take and drop. (#3854)
Allows passing an integer to take or drop as a shorthand.
2022-11-04 14:03:28 +00:00
Radosław Waśko
438a284346
Implement missing Table.take/drop While (#3840)
Implements https://www.pivotaltracker.com/story/show/183677982
2022-11-01 12:11:50 +00:00
Kaz Wesley
330612119a
Parse the standard library (#3830)
Fix bugs in `TreeToIr` (rewrite) and parser. Implement more undocumented features in parser. Emulate some old parser bugs and quirks for compatibility.

Changes in libs:
- Fix some bugs.
- Clean up some odd syntaxes that the old parser translates idiosyncratically.
- Constructors are now required to precede methods.

# Important Notes
Out of 221 files:
- 215 match the old parser
- 6 contain complex types the old parser is known not to handle correctly

So, compared to the old parser, the new parser parses 103% of files correctly.
2022-10-31 16:19:12 +00:00
Radosław Waśko
f60e9e9d8e
Add a Visualization to the Table.Row type (#3837)
Follow up to https://www.pivotaltracker.com/story/show/182307026

When working in the IDE I noticed that the default vis is bad, so this should make it better.
2022-10-31 14:20:13 +00:00