360 Commits

Author SHA1 Message Date
James Dunkerley
a3de3c6128
Use ArraySlice to slice a Vector (#3724)
Use an `ArraySlice` to slice `Vector`.
Avoids memory copying for the slice function.
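
To illustrate why a slice view avoids copying, here is a minimal Python sketch. It is not the Enso `ArraySlice` implementation; the `SliceView` class and its method names are hypothetical and only model the idea of keeping a start and end offset over shared backing storage.

```python
class SliceView:
    """Read-only window over a backing sequence (hypothetical stand-in for an array slice)."""

    def __init__(self, backing, start, end):
        if not 0 <= start <= end <= len(backing):
            raise IndexError("slice bounds out of range")
        self._backing = backing  # shared with the source, never copied
        self._start = start
        self._end = end

    def __len__(self):
        return self._end - self._start

    def __getitem__(self, index):
        if not 0 <= index < len(self):
            raise IndexError("index out of range")
        # Translate the view-relative index into the backing storage.
        return self._backing[self._start + index]

    def slice(self, start, end):
        """Slice the view again; the result still points at the original storage."""
        if not 0 <= start <= end <= len(self):
            raise IndexError("slice bounds out of range")
        return SliceView(self._backing, self._start + start, self._start + end)


data = list(range(100))
dropped = SliceView(data, 20, len(data))  # analogous to "Drop First 20"
total = sum(dropped)                      # iterates through __getitem__
```

Creating the view costs the same regardless of the slice length, which is consistent with the "Drop First 20 and Sum" and "Drop Last 20 and Sum" rows in the table.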

# Important Notes
| Test | Ref | New |
| --- | --- | --- |
| New Vector | 71.9 | 71.0 |
| Append Single | 26.0 | 27.7 |
| Append Large | 15.1 | 14.9 |
| Sum | 156.4 | 165.8 |
| Drop First 20 and Sum | 171.2 | 165.3 |
| Drop Last 20 and Sum | 170.7 | 163.0 |
| Filter | 76.9 | 76.9 |
| Filter With Index | 166.3 | 168.3 |
| Partition | 278.5 | 273.8 |
| Partition With Index | 392.0 | 393.7 |
| Each | 101.9 | 102.7 |

- Note: the performance of New and Append has got slower from previous tests.
2022-09-23 15:13:16 +00:00
James Dunkerley
6f54e80970
Adjust Database connection to use query/read to access data. (#3727)
Adjust Database connection API to align with new [design](https://github.com/enso-org/design/blob/wip/jd/database-read/epics/basic-libraries/database-read/design.md#querying-tables).
- `query` replaces the old `access_table` and is expanded to support raw SQL queries.
- `read` replaces `execute_query` and matches the API of `query`.
- `to_dataframe` is renamed to `read`.

# Important Notes
Added support for `++` to concatenate a Text without wrapping in a `SQL.Code`.
2022-09-23 07:35:08 +00:00
Radosław Waśko
e9ebc663c1
Add business days functions to Date and Date_Time (#3726)
Implements https://www.pivotaltracker.com/story/show/183082087

# Important Notes
- Removed unnecessary invocations of `Error.throw` improving performance of `Vector.distinct`. The time of the `add_work_days and work_days_until should be consistent with each other` test suite came down from 15s to 3s after the changes.
2022-09-22 08:31:15 +00:00
Dmitry Bushev
4443ccc0a9
Fix expression updates for builtin types (#3721)
Changelog:
- add missing cases to runtime Types check
- create an appropriate test suite
2022-09-19 13:56:51 +00:00
James Dunkerley
d6346e9d66
Renaming various constructors and moving types around for Database. (#3715)
Repairing the constructor names following the types work. Some general tidying up as well.

- Remove `Standard.Database.Data.Column.Aggregate_Column_Builder`.
- Remove `Standard.Database.Data.Dialect.Dialect.Dialect_Data`.
- Remove unused imports and update some type definitions.
- Rename `Postgres.Postgres_Data` => `Postgres_Options.Postgres`.
- Rename `Redshift.Redshift_Data` => `Redshift_Options.Redshift`.
- Rename `SQLite.SQLite_Data` => `SQLite_Options.SQLite`.
- Rename `Credentials.Credentials_Data` => `Credentials.Username_And_Password`.
- Rename `Sql` to `SQL` across the board.
- Merge `Standard.Database.Data.Internal` into `Standard.Database.Internal`.
- Move dialects into `Internal` and merge the function in `Helpers` into `Base_Generator`.
2022-09-19 12:39:40 +00:00
Radosław Waśko
8fa8d12cc3
String functionality in std-table should use std-base (#3717)
Implements https://www.pivotaltracker.com/story/show/181754646
2022-09-17 14:38:02 +00:00
Hubert Plociniczak
0e5df935d3
Don't rename imported Main module that only imports names (#3710)
It turns out that for a two-part import we had special code that would a) add the `Main` submodule and b) add an explicit rename.

b) is problematic because sometimes we only want to import specific names.
E.g.,
```
from Bar.Foo import Bar, Baz
```
would be translated to
```
from Bar.Foo.Main as Foo import Bar, Baz
```
and it should only be translated to
```
from Bar.Foo.Main import Bar, Baz
```

This change detects this scenario and does not add renames in that case.

Fixes [183276486](https://www.pivotaltracker.com/story/show/183276486).
2022-09-16 13:01:06 +00:00
Radosław Waśko
5ed388930e
Additional tests for handling Dates in Table (#3707)
Resolves https://www.pivotaltracker.com/story/show/183285801

@JaroslavTulach suggested the current implementation may not handle these correctly, which suggests that the logic is not completely trivial - so I added a test to ensure that it works as we'd expect. Fortunately, it did work - but it's good to keep the tests to avoid regressions.
2022-09-15 23:18:19 +00:00
James Dunkerley
0126f02e7b
Restructure File.read into the new design (#3701)
Changes following Marcin's work. Should be back to very similar public API as before.

- Add an "interface" type: `Standard.Base.System.File_Format.File_Format`.
- All `File_Format` types now have a `can_read` method to decide if they can read a file.
- Move `Standard.Table.IO.File_Format.Text.Text_Data` to `Standard.Base.System.File_Format.Plain_Text_Format.Plain_Text`.
- Move `Standard.Table.IO.File_Format.Bytes` to `Standard.Base.System.File_Format.Bytes`.
- Move `Standard.Table.IO.File_Format.Infer` to `Standard.Base.System.File_Format.Infer`. **(doesn't belong here...)**
- Move `Standard.Table.IO.File_Format.Unsupported_File_Type` to `Standard.Base.Error.Common.Unsupported_File_Type`.
- Add `Infer`, `File_Format`, `Bytes`, `Plain_Text`, `Plain_Text_Format` to `Standard.Base` exports.
- Fold extension methods of `Standard.Base.Meta.Unresolved_Symbol` into type.
- Move `Standard.Table.IO.File_Format.Auto` to `Standard.Table.IO.Auto_Detect.Auto_Detect`.
- Added a `types` Vector of all the built in formats.
- `Auto_Detect` asks each type if they `can_read` a file.
- Broke up and moved `Standard.Table.IO.Excel` into `Standard.Table.Excel`:
  - Moved `Standard.Table.IO.File_Format.Excel.Excel_Data` to `Standard.Table.Excel.Excel_Format.Excel_Format.Excel`.
  - Renamed `Sheet` to `Worksheet`.
  - Internal types `Reader` and `Writer` providing the actual read and write methods.
- Created `Standard.Table.Delimited` with similar structure to `Standard.Table.Excel`:
  - Moved `Standard.Table.IO.File_Format.Delimited.Delimited_Data` to `Standard.Table.Delimited.Delimited_Format.Delimited_Format.Delimited`.
  - Moved `Standard.Table.IO.Quote_Style` to `Standard.Table.Delimited.Quote_Style`.
  - Moved the `Reader` and `Writer` internal types into here. Renamed methods to have unique names.
- Add `Aggregate_Column`, `Auto_Detect`, `Delimited`, `Delimited_Format`, `Excel`, `Excel_Format`, `Sheet_Names`, `Range_Names`, `Worksheet` and `Cell_Range` to `Standard.Table` exports.
2022-09-15 14:48:46 +00:00
Radosław Waśko
b304402d8e
Add Period Start and End functions to Date and DateTime (#3695)
Implements https://www.pivotaltracker.com/story/show/183081152
2022-09-13 09:51:08 +00:00
Hubert Plociniczak
fba5047acc
Improved Vector/Array interop (#3667)
`Vector` type is now a builtin type. This requires a bunch of additional builtin methods for its creation:
- Use `Vector.from_array` to convert any array-like structure into a `Vector` [by copy](f628b28f5f)
- Use (already existing) `Vector.from_polyglot_array` to convert any array-like structure into a `Vector` **without** copying
- Use (already existing) `Vector.fill 1 item` to create a singleton `Vector`

Additionally, for pattern matching purposes, we had to implement a `VectorBranchNode`. Use the following to match on `x` being an instance of the `Vector` type:
```
import Standard.Base.Data.Vector

size = case x of
    Vector.Vector -> x.length
    _ -> 0
```

Finally, the `VectorLiterals` pass, which transforms `[1,2,3]` to (roughly)
```
a1 = 1
a2 = 2
a3 = 3
Vector (Array (a1,a2, a3))
```
had to be modified to generate
```
a1 = 1
a2 = 2
a3 = 3
Vector.from_array (Array (a1, a2, a3))
```
instead to accommodate the API changes. As of 025acaa676 all the known CI checks pass. Let's start the review.

# Important Notes
Matching in `case` statement is currently done via `Vector_Data`. Use:
```
case x of
    Vector.Vector_Data -> True
```
until a better alternative is found.
2022-09-13 03:07:17 +00:00
James Dunkerley
4c82b657de
Tidy up type signatures and error types (#3693)
Small clean up PR.

- Aligns a few type signatures with their functions.
- Some formatting fixes.
- Remove a few unused types.
- Make error extension functions be standard methods.
2022-09-09 11:11:46 +00:00
James Dunkerley
2b425f8e08
Restructuring Database.Connection to allow for database specific types. (#3632)
- Added `databases`, `database`, `set_database`.
- Added `schemas`, `schema`, `set_schema`.
- Added `table_types`.
- Added `tables`.
- Moved the vast majority of the connection work into a lower level `JDBC_Connection` object.
- `Connection` represents the standard API for database connections and provides a base JDBC implementation.
- `SQLite_Connection` has the `Connection` API but with custom `databases` and `schemas` methods for SQLite.
- `Postgres_Connection` has the `Connection` API but with custom `set_database`, `databases`, `set_schema` and `schemas` methods for Postgres.
- Updated `Redshift` - no public API change.
2022-09-07 17:32:28 +00:00
Radosław Waśko
551100af3b
Add Table.distinct function to In-Memory table (#3684)
Implements https://www.pivotaltracker.com/story/show/182307143

# Important Notes
- Modified standard library Java helpers dependencies so that `std-table` module depends on `std-base`, as a provided dependency. This is allowed, because `std-table` is used by the `Standard.Table` Enso module which depends on `Standard.Base` which ensures that the `std-base` is loaded onto the classpath, thus whenever `std-table` is loaded by `Standard.Table`, so is `std-base`. Thus we can rely on classes from `std-base` and its dependencies being _provided_ on the classpath. Thanks to that we can use utilities like `Text_Utils` also in `std-table`, avoiding code duplication. Additional advantage of that is that we don't need to specify ICU4J as a separate dependency for `std-table`, since it is 'taken' from `std-base` already - so we avoid including it in our build packages twice.
2022-09-07 12:28:41 +00:00
Radosław Waśko
eafba079d9
Make In Memory Table Aggregator types more specific where possible (#3679)
Many aggregation types fell back to the general `Any` type where they could have used the type of input column - for example `First` of a column of integers is guaranteed to fit the `Integer` storage type, so it doesn't have to fall back to `Any`. This PR fixes that and adds a test that checks this.
2022-09-05 09:17:41 +00:00
Radosław Waśko
65140f48ca
Add storage support for Date, Time and DateTime to InMemory table (#3673)
Implements https://www.pivotaltracker.com/story/show/183080911
2022-08-31 22:06:29 +00:00
Marcin Kostrzewa
4fc6dcced0
Get rid of free-floating atoms. Everything has a type now! (#3671)
This is a step towards the new language spec. The `type` keyword now means something. So we now have
```
type Maybe a
    Some (from_some : a)
    None
```
as a thing one may write. Also `Some` and `None` are not standalone types now – only `Maybe` is.
This is halfway to static methods – we still allow for things like `Number + Number` for backwards compatibility. It will disappear in the next PR.

The concept of a type is now used for method dispatch – with great impact on interpreter code density.

Some APIs in the stdlib may require re-thinking. I take it this is going to be up to the libraries team – some choices are not as good with a semantically different language. I've strived to update the stdlib with minimal changes – to make sure it still works as it did.

It is worth mentioning the conflicting constructor name convention I've used: if `Foo` only has one constructor, previously named `Foo`, we now have:
```
type Foo
    Foo_Data f1 f2 f3
```

This is now necessary, because we still don't have proper statics. When they arrive, this can be changed (quite easily, with SED) to use them, and figure out the actual convention then.

I have also reworked large parts of the builtins system, because it did not work at all with the new concepts.

It also exposes the type variants in SuggestionBuilder, which was the original tiny PR this was based on.

PS I'm so sorry for the size of this. No idea how this could have been smaller. It's a breaking language change after all.
2022-08-30 22:54:53 +00:00
Radosław Waśko
e6e4692692
DataFormatter should infer datetime from values without seconds (#3668)
Fixes https://www.pivotaltracker.com/story/show/183033133
2022-08-26 21:10:52 +00:00
Radosław Waśko
d7ebc4a338
Add Table.take and Table.drop functions to In-Memory table (#3647)
Implements https://www.pivotaltracker.com/story/show/182307347
2022-08-26 19:41:36 +00:00
James Dunkerley
a20d43390e
Adding DateTime part functions (#3669)
- Added `Zone`, `Date_Time` and `Time_Of_Day` to `Standard.Base`.
- Renamed `Zone` to `Time_Zone`.
- Added `century`.
- Added `is_leap_year`.
- Added `length_of_year`.
- Added `length_of_month`.
- Added `quarter`.
- Added `day_of_year`.
- Added `Day_Of_Week` type and `day_of_week` function.
- Updated `week_of_year` to support ISO.

# Important Notes
- Had to pass locale to formatter for date/time tests to work on my PC.
- Changed default of `week_of_year` to use ISO.
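
As a point of reference for what ISO week numbering means, the following Python sketch uses the standard library's `date.isocalendar()`. It does not call the Enso `week_of_year` function; it only shows the ISO 8601 rule that week 1 is the week containing the year's first Thursday, so the first days of January can belong to the last week of the previous year.

```python
from datetime import date

# 2021-01-01 was a Friday, so under ISO 8601 it still belongs to
# the last week (week 53) of ISO year 2020.
new_years_day = date(2021, 1, 1)
iso_year, iso_week = new_years_day.isocalendar()[:2]

# 2021-01-04 was the first Monday of 2021 and starts ISO week 1.
first_monday_week = date(2021, 1, 4).isocalendar()[1]
```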
2022-08-26 15:47:58 +00:00
Radosław Waśko
fd318cfa96
Remove Array.set_at (#3634)
Implements https://www.pivotaltracker.com/story/show/182879865

# Important Notes
Note that removing `set_at` still does not make our arrays fully immutable - `Array.copy` can still be used to mutate them.
2022-08-26 09:34:33 +00:00
Hubert Plociniczak
d87a32d019
Builtin Date_Time, Time_Of_Day, Zone (#3658)
* Builtin Date_Time, Time_Of_Day, Zone

Improved polyglot support for Date_Time (formerly Time), Time_Of_Day and
Zone. This follows the pattern introduced for Enso Date.

Minor caveat - in tests for Date, had to bend a lot for JS Date to pass.
This is because JS Date is not really only a Date, but also a Time and
Timezone, previously we just didn't consider the latter.
Also, JS Date does not deal well with setting timezones so the trick I
used is to first call foreign function returning a polyglot JS Date,
which is converted to ZonedDateTime and only then set the correct
timezone. That way none of the existing tests had to be changed or
special cased.

Additionally, JS deals with milliseconds rather than nanoseconds so
there is loss in precision, as noted in Time_Spec.

* Add tests for Java's LocalTime

* changelog

* Make date formatters in table happy

* PR review, add more tests for zone

* More tests and fixed a bug in column reader

Column reader didn't take into account timezone but that was a mistake
since then it wouldn't map to Enso's Date_Time.
Added tests that check it now.

* remove redundant conversion

* Update distribution/lib/Standard/Base/0.0.0-dev/src/Data/Time.enso

Co-authored-by: Radosław Waśko <radoslaw.wasko@enso.org>

* First round of addressing PR review

* don't leak java exceptions in Zone

* Move Date_Time to top-level module

* PR review

Co-authored-by: Radosław Waśko <radoslaw.wasko@enso.org>
Co-authored-by: Jaroslav Tulach <jaroslav.tulach@enso.org>
2022-08-24 12:31:29 +02:00
Hubert Plociniczak
4b9c91626e
Use Vector.from_polyglot_array to make Vectors backed by polyglot arrays (#3628)
Use Proxy_Polyglot_Array as a proxy for polyglot arrays, thus unifying
the way the underlying array is accessed in Vector.

Used the opportunity to cleanup builtin lookup, which now actually
respects what is defined in the body of @Builtin_Method annotation.

Also discovered that polyglot null values (in JS, Python and R) were leaking to Enso.
Fixed that by doing explicit translation to `Nothing`.

https://www.pivotaltracker.com/story/show/181123986
2022-08-23 21:13:16 +00:00
Jaroslav Tulach
2b9352d2fc
Lazy scatterplot for Vector & Table (#3655)
First of all this PR demonstrates how to implement _lazy visualization_:
- one needs to write/enhance Enso visualization libraries - this PR adds two optional parameters (`bounds` and `limit`) to `process_to_json_text` function.
- the `process_to_json_text` can be tested by standard Enso test harness which this PR also does
- then one has to modify JavaScript on the IDE side to construct `setPreprocessor` expression using the optional parameters

The idea of _scatter plot lazy visualization_ is to limit the number of points the IDE requests. Initially the limit is set to `limit=1024`. The `Scatter_Plot.enso` then processes the data and selects/generates the `limit` subset. Right now it includes the `min` and `max` on both the `x` and `y` axes plus randomly chosen points up to the `limit`.

![Zooming In](https://user-images.githubusercontent.com/26887752/185336126-f4fbd914-7fd8-4f0b-8377-178095401f46.png)

The D3 visualization widget is capable of _zooming in_. When that happens the JavaScript widget composes a new expression with `bounds` set to the newly visible area. By calling `setPreprocessor` the engine recomputes the visualization data, filters out any data outside of the `bounds` and selects another `limit` points from the new data. The IDE visualization then updates itself to display these more detailed data. Users can zoom in to see the smallest detail where the number of points gets below `limit` or they can select _Fit all_ to see all the data without any `bounds`.
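
The selection strategy described above can be sketched in Python as follows. This is not the code in `Scatter_Plot.enso`: the function name `select_points`, the `(x_min, y_min, x_max, y_max)` layout of `bounds`, and the `seed` parameter are assumptions made for illustration, and the sketch assumes `limit` is at least 4 so the extremes fit.

```python
import random


def select_points(points, limit=1024, bounds=None, seed=None):
    """Pick at most `limit` (x, y) points, always keeping the x/y extremes.

    `bounds` is a hypothetical (x_min, y_min, x_max, y_max) tuple; points
    outside it are dropped before sampling, as when the user zooms in.
    """
    if bounds is not None:
        x_min, y_min, x_max, y_max = bounds
        points = [p for p in points
                  if x_min <= p[0] <= x_max and y_min <= p[1] <= y_max]
    if len(points) <= limit:
        return list(points)  # few enough points: send everything

    indices = range(len(points))
    # Indices of the min/max points on both axes (may coincide).
    extremes = {
        min(indices, key=lambda i: points[i][0]),
        max(indices, key=lambda i: points[i][0]),
        min(indices, key=lambda i: points[i][1]),
        max(indices, key=lambda i: points[i][1]),
    }
    rest = [i for i in indices if i not in extremes]
    rng = random.Random(seed)
    extra = rng.sample(rest, max(0, limit - len(extremes)))
    # Preserve the original ordering of the selected points.
    return [points[i] for i in sorted(extremes | set(extra))]


pts = [(i, (i * 37) % 101) for i in range(5000)]
overview = select_points(pts, limit=100, seed=1)
zoomed = select_points(pts, limit=100, bounds=(10, 0, 20, 200))
```

Keeping the extremes means the axis ranges of the sampled plot match those of the full data, while the random remainder is what the note on misleading samples refers to.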

# Important Notes
Randomly selecting `limit` samples from the dataset may be misleading. Probably implementing _k-means clustering_ (where `k=limit`) would generate more representative approximation.
2022-08-23 12:12:22 +00:00
James Dunkerley
684adcb7fb
Tidy up the default imports for Standard.Table (#3660)
- Removed various unnecessary `Standard.Base` imports still left behind.
- Added `Regex` to default `Standard.Base`.
- Removed aliasing from the examples as it is no longer needed (case coercion no longer occurs).
- Remove `import Standard.Table` from within the Table library (directly importing types).
- Reviewed what was in `Standard.Database` - a few tweaks and removals.
- Removed various un-needed aliasing following Hubert's import work.
2022-08-22 19:21:54 +00:00
Radosław Waśko
bcca7f10d9
Add key functions to Table to make it act as [Column] (#3644)
Implements https://www.pivotaltracker.com/story/show/181370836
2022-08-18 12:33:02 +00:00
Hubert Plociniczak
68f9fce21a
Use Java's LocalDate for parsing date in tests (#3650)
Rather than using `Date.parse`, which is already being tested in other
tests, we use `LocalDate.parse`. Making use of a helper class to
mitigate API differences.
2022-08-17 09:34:31 +00:00
Radosław Waśko
fbf6c800f1
Short hand version for order_by (#3643)
Implements https://www.pivotaltracker.com/story/show/182868310
2022-08-16 15:41:37 +00:00
Hubert Plociniczak
8575b76b0a
Support pattern matching on constants (#3641)
This change adds support for matching on constants by:
1) extending the parser to allow literals in patterns
2) generating a branch node for literals

Related to https://www.pivotaltracker.com/story/show/182743559
2022-08-12 13:18:58 +00:00
Radosław Waśko
3dca738cf7
Add Vector.take and Vector.drop functions (#3629)
Implements https://www.pivotaltracker.com/story/show/182307048
2022-08-10 16:02:02 +00:00
Hubert Plociniczak
42dbd8bb59
Allow for importing methods (#3633)
Importing individual methods didn't work as advertised because parser
would allow them but later drop that information. This slipped by because we never had mixed atoms and methods in stdlib.

# Important Notes
Added some basic tests but we need to ensure that the new parser allows for this.
@jdunkerley will be adding some changes to stdlib that will be testing this functionality as well.
2022-08-05 16:25:51 +00:00
Radosław Waśko
0a2fea925c
Create Index_Sub_Range type and update Text.take and Text.drop (#3617)
2022-08-03 11:41:34 +00:00
Hubert Plociniczak
d59714a29d
Support module imports using a qualified name (#3608)
This change allows for importing modules using a qualified name and deals with any conflicts on the way.
Given a module C defined at `A/B/C.enso` with
```
type C
    type C a
```
it is now possible to import it as
```
import project.A
...
val x = A.B.C 10
```

Given a module located at `A/B/C/D.enso`, we will generate
intermediate, synthetic, modules that only import and export the successor module along the path.
For example, the contents of a synthetic module B will look like
```
import <namespace>.<pkg-name>.A.B.C
export <namespace>.<pkg-name>.A.B.C
```
If module B is defined already by the developer, the compiler will _inject_ the above statements to the IR.
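
As a rough illustration of the synthetic modules described above, this Python sketch lists the import/export pair each intermediate module would contain for a given file path. It is not the compiler's implementation; the function name `synthetic_modules` and the sample namespace and package names are made up for the example.

```python
def synthetic_modules(namespace, package, path):
    """Map each intermediate module on `path` to its import/export statements."""
    suffix = ".enso"
    if path.endswith(suffix):
        path = path[:-len(suffix)]
    parts = path.split("/")
    prefix = f"{namespace}.{package}"

    modules = {}
    # Every module except the last one re-exports its successor on the path.
    for depth in range(1, len(parts)):
        module = ".".join(parts[:depth])
        successor = ".".join(parts[:depth + 1])
        modules[module] = [
            f"import {prefix}.{successor}",
            f"export {prefix}.{successor}",
        ]
    return modules


mods = synthetic_modules("local", "My_Project", "A/B/C/D.enso")
```

Here `mods["A.B"]` holds the two statements shown for module B, with the placeholders filled in by the sample names.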

Also removed the last elements of some lowercase name resolution that managed to survive recent
changes (`Meta.Enso_Project` would now be ambiguous with `enso_project` method).

Finally, added a pass that detects shadowing of the synthetic module by the type defined along the path.
We print a warning in such a situation.

Related to https://www.pivotaltracker.com/n/projects/2539304

# Important Notes
There was an additional request to fix the annoying problem with `from` imports that would always bring
the module into the scope. The changes in stdlib demonstrate how it is now possible to avoid the workaround of
```
from X.Y.Z as Z_Module import A, B
```
(i.e. `as Z_Module` part is almost always unnecessary).
2022-07-29 14:19:07 +00:00
Hubert Plociniczak
f63e40df1b
Explicit self (#3569)
This change modifies the current language by requiring explicit `self` parameter declaration
for methods. Methods without `self` parameter in the first position should be treated as statics
although that is not yet part of this PR. We add an implicit self to all methods.
This obviously required updating the whole stdlib and its components, tests etc but the change
is pretty straightforward in the diff.

Notice that this change **does not** change method dispatch, which was removed in the last changes.
This was done on purpose to simplify the implementation for now. We will likely still remove all
those implicit selfs to bring true statics.
Minor caveat - since `main` doesn't actually need self, already removed that which simplified
a lot of code.
2022-07-27 17:45:36 +00:00
James Dunkerley
a54a7d5553
Tidying up what is in Standard.Base (#3603)
- Added several of the types from the new APIs to the Standard.Base export.
- Removed Syntax_Error types for Regex and Uri and used the common one.
2022-07-27 13:28:00 +00:00
Radosław Waśko
ee91656f30
Remove duplicate Line_Ending_Style and update defaults (#3597)
Implements https://www.pivotaltracker.com/story/show/182749831
2022-07-27 09:43:51 +00:00
James Dunkerley
7090e1fb91
Docker file for testing Postgres SSL and updated Postgres Spec (#3607)
Adds a Dockerfile and `CreatePostgresSSL.sh` script, which makes an Alpine based Postgres server with a self signed certificate. The script will drop the generated `rootCA.crt` into the `data/transient` folder.

This can then be included in the test by setting the environment variable `ENSO_DATABASE_TEST_CA_CERT_FILE`.

Test has been updated to check the various SSL connection modes.
2022-07-26 13:28:43 +00:00
James Dunkerley
be311457bd
Add Linear Regression support for Vectors. (#3601)
Adds least squares regression APIs. Covers the basic 4 trend line types from Excel (doesn't cover Polynomial or Moving Average).
Removes the old `Model` from the `Standard.Table`.
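
For readers unfamiliar with least squares fitting, here is a minimal Python sketch of the linear case, plus an exponential fit obtained by fitting a line to `ln y`. It is not the Enso API added by this commit; the function names `fit_linear` and `fit_exponential` are hypothetical.

```python
import math


def fit_linear(xs, ys):
    """Return (slope, intercept) minimising the squared error of y = slope * x + intercept."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equally sized samples with at least 2 points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("all x values are identical")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def fit_exponential(xs, ys):
    """Return (a, b) for y = a * exp(b * x); requires all y > 0."""
    # Taking logs turns the model into ln y = ln a + b * x, a linear fit.
    b, ln_a = fit_linear(xs, [math.log(y) for y in ys])
    return math.exp(ln_a), b


slope, intercept = fit_linear([0, 1, 2, 3], [1, 3, 5, 7])
a, b = fit_exponential([0, 1, 2], [2.0, 2 * math.e, 2 * math.e ** 2])
```

The same transform-then-fit approach extends to logarithmic and power trend lines by taking the logarithm of `x`, or of both `x` and `y`.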
2022-07-22 08:41:17 +00:00
Radosław Waśko
16fd038c1a
Add support for .pgpass to PostgreSQL (#3593)
Implements https://www.pivotaltracker.com/story/show/182582924
2022-07-21 13:32:37 +00:00
Jaroslav Tulach
4465d63dd8
Improved polyglot Date support (#3559)
Significantly improves the polyglot Date support (as introduced by #3374). It enhances the `Date_Spec` to run it in four flavors:
- with Enso Date (as of now)
- with JavaScript Date
- with JavaScript Date wrapped in (JavaScript) array
- with Java LocalDate allocated directly

The code is then improved by necessary modifications to make the `Date_Spec` pass.

# Important Notes
In [#181755990](https://www.pivotaltracker.com/n/projects/2539304/stories/181755990) (_Review and improve InMemory Table support for Dates, Times, DateTimes, BigIntegers_), James requested that the following program work:
```
foreign js dateArr = """
    return [1, new Date(), 7]

main =
    IO.println <| (dateArr.at 1).week_of_year
```
The program works with the changes provided here and prints `27` as of today.

@jdunkerley has provided tests for proper behavior of date in `Table` and `Column`. Those tests are working as of [f16d07e](f16d07e640). One just needs to accept `List<Value>` and then query `Value` for `isDate()` when needed.

Last round of changes is related to **exception handling**. 8b686b12bd makes sure `makePolyglotError` accepts only polyglot values. Then it wraps plain Java exceptions into `WrapPlainException` with `has_type` method - 60da5e70ed - the remaining changes in the PR are only trying to get all tests working in the new setup.

The support for `Time` isn't part of this PR yet.
2022-07-21 06:32:40 +00:00
James Dunkerley
5e4083978f
Type name case fixes: (#3590)
- MacOS => Mac_OS
- PostgreSQL => Postgres
- SQLite => SQLite (align a few)
- InMemory => In_Memory
- PointData => Point_Data
- Io_Error => IO_Error
- Standard.Table.Io => Standard.Table.IO

In Tests:
- MyError => My_Error
- NotFoo => Not_Foo
2022-07-19 14:09:09 +00:00
Radosław Waśko
fc110659db
Implement should_succeed (#3586)
Implements https://www.pivotaltracker.com/story/show/182709976
2022-07-14 19:58:44 +00:00
Radosław Waśko
35ddd2a89e
Add new options to the Delimited format (#3581)
Implements https://www.pivotaltracker.com/story/show/182662195 and https://www.pivotaltracker.com/story/show/182651884
2022-07-14 15:01:26 +00:00
James Dunkerley
9578dc1e43
Move write_bytes to be part of Vector. (#3583)
Updates `write_bytes` API to be part of `Vector` and to conform to `write` APIs.

# Important Notes
Ensures the file is not touched if the byte array is invalid.
2022-07-14 11:30:40 +00:00
James Dunkerley
e41936f436
Additional tests for Excel Append (#3580)
Add some additional scenarios to Excel append tests:
- Non-A1 start
- Name duplication
- Hitting another range

# Important Notes
Also fixed a warning in the Image library.
2022-07-13 13:02:39 +00:00
James Dunkerley
2527a7bdb2
Update SQLite, PostgreSQL and Redshift drivers (#3571)
Updated the SQLite, PostgreSQL and Redshift drivers.

# Important Notes
Updated the API for Redshift and confirmed it can connect without the ini file workaround.
2022-07-11 18:39:16 +00:00
Radosław Waśko
df10e4ba7c
Add appending support for Delimited files (#3573)
Implements https://www.pivotaltracker.com/story/show/182309839
2022-07-11 12:36:01 +00:00
Jaroslav Tulach
735053c218
Implementing basic functions (#3554)
The language specification suggests adding [five basic functions to the standard library](https://github.com/enso-org/design/blob/wip/wd/enso-spec/epics/enso-spec-1.0/05.%20Functions.md#useful-functions-in-the-standard-library): `identity`, `flip`, `const`, `curry` & `uncurry`.

# Important Notes
The new functions are being added into existing `Function.enso` file. That may not be the best place, but it is not clear from the [design spec](https://github.com/enso-org/design/blob/wip/wd/enso-spec/epics/enso-spec-1.0/05.%20Functions.md#useful-functions-in-the-standard-library) how they are supposed to be imported. I can move them wherever needed.

Documentation is provided for each of the functions, but I am not sure how to verify it is correct. Do we generate the documentation for stdlib somehow?
2022-07-11 10:30:44 +00:00
Radosław Waśko
28513a3389
Allow filtering caught error type in Error.catch (#3574)
More and more often I need a way to only recover a specific type of a dataflow error (in a similar manner as with panics). So the API for `Error.catch` has been amended to more closely resemble `Panic.catch`, allowing to handle only specific types of dataflow errors, passing others through unchanged. The default is `Any`, meaning all errors are caught by default, and the behaviour of `x.catch` remains unchanged.
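
The behaviour described above can be modelled in Python as follows. This is only an analogy, not Enso code: the `DataflowError` class, the argument order of `catch`, and the use of `object` as a stand-in for `Any` are assumptions made for the sketch.

```python
class DataflowError:
    """Value that carries an error payload instead of raising it."""

    def __init__(self, payload):
        self.payload = payload

    def catch(self, handler, error_type=object):
        # `object` plays the role of `Any`: by default every payload matches.
        if isinstance(self.payload, error_type):
            return handler(self.payload)
        # Non-matching errors are passed through unchanged.
        return self


failure = DataflowError(ValueError("bad input"))

recovered = failure.catch(lambda payload: "recovered", error_type=ValueError)
untouched = failure.catch(lambda payload: "recovered", error_type=KeyError)
caught_by_default = failure.catch(lambda payload: str(payload))
```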
2022-07-11 08:26:44 +00:00
Radosław Waśko
d8dddf40c6
Fix Meta.Polyglot.get_language (#3568)
2022-07-07 13:29:38 +00:00
Hubert Plociniczak
22fc03ca3d
Fix here leftovers (#3567)
Hooking up Reporting_Stream_Encoder_Spec reveals missing renaming.
Also removed one more `here.` that wasn't run in tests.

Kudos to @radeusgd for finding those!
2022-07-07 12:02:18 +00:00
Hubert Plociniczak
96e50648dd
Remove 'here' and make method name resolution case-sensitive (#3538)
Modified UppercaseNames to now resolve methods without an explicit `here` to point to the current module.
`here` was also often used instead of `self` which was allowed by the compiler.
Therefore UppercaseNames pass is now GlobalNames and does some extra work -
it translates method calls without an explicit target into proper applications.

# Important Notes
There was a long-standing bug in scopes usage when compiling standalone expressions.
This resulted in AliasAnalysis generating incorrect graphs and manifested itself only in unit tests
and when running `eval`, thus being a bit hard to locate.
See `runExpression` for details.

Additionally, method name resolution is now case-sensitive.

Obsolete passes like UndefinedVariables and ModuleThisToHere were removed. All tests have been adapted.
2022-07-07 10:31:06 +00:00
James Dunkerley
16e6f2fa08
Adding Append support to Excel.Write (#3558)
Adds support for appending to an existing Excel table.

# Important Notes
- Renamed `Column_Mapping` to `Column_Name_Mapping`
- Changed new type name to `Map_Column`
- Added last modified time and creation time to `File`.
2022-07-07 06:41:33 +00:00
Radosław Waśko
7c94fa6a77
Custom Encoding support when writing Delimited files (#3564)
Implements https://www.pivotaltracker.com/story/show/182545847
2022-07-07 00:20:00 +00:00
James Dunkerley
5174cc6ece
Update Database.connect to match new API (#3542)
Initial work restructuring the `Database.connect` API
- New SQLite API with support for InMemory.
- Updated PostgreSQL API with SSL and Client Certificate Support.
- Updated Redshift API.

# Important Notes
Follow up tasks:
- PostgreSQL SSL additional testing.
- Driver version updating.
- `.pgpass` support.
2022-07-04 20:26:44 +00:00
James Dunkerley
4ca2097488
Adding write support to File_Format.Excel (#3551)
Support for writing tables to Excel.

# Important Notes
Has custom support for the Error mode, as it allows appending a new table to the file in this mode.
2022-07-04 18:32:16 +00:00
Jaroslav Tulach
096d8fdca0
Avoid crashing the engine when JS debugger statement is used (#3547)
Try the following Enso program:
```
main =
    here.debug

foreign js debug = """
    debugger;

```
it crashes the engine with exception:
```
Execution finished with an error: java.lang.ClassCastException: class com.oracle.truffle.js.runtime.builtins.JSFunctionObject$Unbound cannot be cast to class org.enso.interpreter.runtime.callable.CallerInfo (com.oracle.truffle.js.runtime.builtins.JSFunctionObject$Unbound and org.enso.interpreter.runtime.callable.CallerInfo are in unnamed module of loader com.oracle.graalvm.locator.GraalVMLocator$GuestLangToolsLoader @55cb6996)
at <java> org.enso.interpreter.runtime.callable.function.Function$ArgumentsHelper.getCallerInfo(Function.java:352)
at <java> org.enso.interpreter.instrument.ReplDebuggerInstrument$ReplExecutionEventNodeImpl.onEnter(ReplDebuggerInstrument.java:179)
at <java> org.graalvm.truffle/com.oracle.truffle.api.instrumentation.ProbeNode$EventProviderChainNode.innerOnEnter(ProbeNode.java:1397)
at <java> org.graalvm.truffle/com.oracle.truffle.api.instrumentation.ProbeNode$EventChainNode.onEnter(ProbeNode.java:912)
at <java> org.graalvm.truffle/com.oracle.truffle.api.instrumentation.ProbeNode.onEnter(ProbeNode.java:216)
at <java> com.oracle.truffle.js.nodes.JavaScriptNodeWrapper.execute(JavaScriptNodeWrapper.java:44)
at <java> com.oracle.truffle.js.nodes.control.DiscardResultNode.execute(DiscardResultNode.java:88)
at <java> com.oracle.truffle.js.nodes.function.FunctionBodyNode.execute(FunctionBodyNode.java:73)
at <java> com.oracle.truffle.js.nodes.JavaScriptNodeWrapper.execute(JavaScriptNodeWrapper.java:45)
at <java> com.oracle.truffle.js.nodes.function.FunctionRootNode.executeInRealm(FunctionRootNode.java:150)
at <java> com.oracle.truffle.js.runtime.JavaScriptRealmBoundaryRootNode.execute(JavaScriptRealmBoundaryRootNode.java:93)
at <js> <js> poly_enso_eval(Unknown)
at <epb> <epb> null(Unknown)
at <enso> Prg.debug(Prg.enso:27-28)
```
2022-06-25 05:58:24 +00:00
Hubert Plociniczak
8f9a0b33d5
Fix System.nanoTime definition (#3543)
While redesigning builtins, a bug was accidentally introduced.
It was discovered while trying to run `test/Benchmarks`.
2022-06-24 10:46:31 +00:00
Radosław Waśko
972b34d1a9
Implement value formatting and writing new files in Delimited format. (#3528)
Implements https://www.pivotaltracker.com/story/show/182309429 and https://www.pivotaltracker.com/story/show/182309573
2022-06-23 16:51:52 +00:00
James Dunkerley
7a2d304fa0
Update Excel reading API (#3523)
- Remove `from_xls` and `from_xlsx`.
- Add `headers` support to `File_Format.Excel`.
- Altered default read for Excel to be the first sheet.
- Altered behavior so that single cells grow down and right when reading sheet.
- Altered `Excel_Range` so that it knows whether it is a single cell or a 1x1 range address.

# Important Notes
- Renamed `Range` to `Cell_Range` to avoid name clash.
2022-06-21 13:39:32 +00:00
Hubert Plociniczak
22a371a9c6
Substitute this with self (#3524)
A semi-manual s/this/self applied to the whole standard library.
Related to https://www.pivotaltracker.com/story/show/182328601

In the compiler, promoted the use of constants instead of hardcoded
`this`/`self` whenever possible.

# Important Notes
The PR **does not** require explicit `self` parameter declaration for methods as this part
of the design is still under consideration.
2022-06-21 10:53:52 +00:00
Marcin Kostrzewa
ec3fa32fec
introduce a micro stdlib for testing (#3531)
This introduces a tiny alternative to our stdlib, that can be used for testing the interpreter. There are 2 main advantages of such a solution:
1. Performance: on my machine, `runtime-with-instruments/test` drops from 146s to 65s, while `runtime/test` drops from 165s to 51s. >6 mins total becoming <2 mins total is awesome. This alone means I'll drink less coffee in these breaks and will be healthier.
2. Better separation of concepts – currently working on a feature that breaks _all_ enso code. The dependency of interpreter tests on the stdlib means I have no means of incremental testing – ALL of stdlib must compile. This is horrible, rendered my work impossible, and resulted in this PR.
2022-06-16 10:25:24 +00:00
James Dunkerley
a0c6fa9c96
Removing old functions and tidy up of Table types (#3519)
- Removed `select` method.
- Removed `group` method.
- Removed `Aggregate_Table` type.
- Removed `Order_Rule` type.
- Removed `sort` method from Table.
- Expanded comments on `order_by`.
- Update comment on `aggregate` on Database.
- Update Visualisation to use new APIs.
- Updated Data Science examples to use new APIs.
- Moved Examples test out of Tests to own test.

# Important Notes
Need to get Examples_Tests added to CI.
2022-06-14 13:37:20 +00:00
Radosław Waśko
e83c36d9d6
Add scaffolding for Table.write function (#3521)
Implements https://www.pivotaltracker.com/story/show/182309559

This task implements common scaffolding for the `Table.write`, so that the particular implementations for Delimited and Excel file formats can be done in parallel.
2022-06-14 11:29:03 +00:00
Hubert Plociniczak
fd46e84e8d
Towards a full-blown builtins DSL (part 3) (#3471)
Auto-generate all builtin methods for builtin `File` type from method signatures.
Similarly, for `ManagedResource` and `Warning`.
Additionally, support for specializations for overloaded and non-overloaded methods is added.
Coverage can be tracked by the number of hard-coded builtin classes that are now deleted.

## Important notes

Notice how `type File` now lacks `prim_file` field and we were able to get rid of all of those
propagating method calls without writing a single builtin node class.
Similarly `ManagedResource` and `Warning` are now builtins and `Prim_Warnings` stub is now gone.
2022-06-13 11:48:34 +00:00
Radosław Waśko
a04825a5ce
Add Text.write Function (#3518)
Implements https://www.pivotaltracker.com/story/show/182309026
2022-06-13 09:11:46 +00:00
James Dunkerley
e97d27e1e0
Adjusting First and Last order_by to use Sort_Column_Selector (#3517) 2022-06-10 09:59:03 +00:00
James Dunkerley
8afba43add
Implement In-Memory Table order_by (#3515)
Implemented the `order_by` function with support for all modes of operation.
Added support for case insensitive natural order.
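
As a rough illustration of what "case insensitive natural order" means, the Java sketch below compares digit runs by numeric value and all other characters case-insensitively. The class `NaturalOrder` and its method are hypothetical names made up for this example; this is not the Enso implementation, and details such as locale handling or Unicode normalization are deliberately left out.

```java
import java.math.BigInteger;

// Hypothetical helper for illustration only; not part of the Enso codebase.
final class NaturalOrder {
    private static boolean isAsciiDigit(char c) {
        return c >= '0' && c <= '9';
    }

    // Compares two strings so that digit runs are ordered numerically
    // ("file2" before "file10") and letters are compared ignoring case.
    static int compare(String a, String b) {
        int i = 0, j = 0;
        while (i < a.length() && j < b.length()) {
            char ca = a.charAt(i), cb = b.charAt(j);
            if (isAsciiDigit(ca) && isAsciiDigit(cb)) {
                // Find the end of each digit run.
                int ei = i, ej = j;
                while (ei < a.length() && isAsciiDigit(a.charAt(ei))) ei++;
                while (ej < b.length() && isAsciiDigit(b.charAt(ej))) ej++;
                // BigInteger avoids overflow on very long digit runs.
                int c = new BigInteger(a.substring(i, ei))
                        .compareTo(new BigInteger(b.substring(j, ej)));
                if (c != 0) return c;
                i = ei;
                j = ej;
            } else {
                int c = Character.compare(
                        Character.toLowerCase(ca), Character.toLowerCase(cb));
                if (c != 0) return c;
                i++;
                j++;
            }
        }
        // The string with characters left over sorts last.
        return Integer.compare(a.length() - i, b.length() - j);
    }

    public static void main(String[] args) {
        String[] names = {"file10", "File2", "file1"};
        java.util.Arrays.sort(names, NaturalOrder::compare);
        System.out.println(String.join(", ", names));
    }
}
```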

# Important Notes
- Improved MultiValueIndex/Key to not create loads of arrays.
- Adjusted HashCode for MultiValueKey to have a simple algorithm.
- Added Text_Utils.compare_normalized_ignoring_case to allow for case insensitive comparisons.
- Fixed issues with ObjectComparator and added some unit tests for it.
2022-06-08 12:30:50 +00:00
Radosław Waśko
2af970fe52
Basic changes to File_Format (#3516)
Implements https://www.pivotaltracker.com/story/show/182308987
2022-06-08 09:53:18 +00:00
Radosław Waśko
a382e0c15e
Improve database Table.order_by (#3514)
Implements https://www.pivotaltracker.com/story/show/182195405

Adds support for the Postgres dialect and simple case insensitive collation for SQLite.
2022-06-07 12:31:55 +00:00
Radosław Waśko
7d94efa6f2
Implement Table.order_by for SQLite and the common scaffolding for all backends (#3502)
Implements the common and SQLite parts of https://www.pivotaltracker.com/story/show/182195405
2022-06-06 10:56:52 +00:00
Hubert Plociniczak
e43325bfe1
Short-circuiting || and && (#3492)
Short-circuiting || and && is typically taken for granted
by users of other PLs. This change makes it happen for Enso.
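
For readers unfamiliar with the term, the Java sketch below shows the behaviour in question: the right operand is passed as a suspended computation and is only evaluated when the left operand does not already decide the result. The names `ShortCircuit`, `or`, `and` and `expensive` are invented for this illustration and say nothing about how the Enso interpreter implements it.

```java
import java.util.function.BooleanSupplier;

// Hypothetical illustration of short-circuit semantics; not Enso code.
final class ShortCircuit {
    // Counts how often the right-hand operand was actually evaluated.
    static int evaluations = 0;

    // Evaluates `right` only when `left` is false.
    static boolean or(boolean left, BooleanSupplier right) {
        return left ? true : right.getAsBoolean();
    }

    // Evaluates `right` only when `left` is true.
    static boolean and(boolean left, BooleanSupplier right) {
        return left ? right.getAsBoolean() : false;
    }

    // Stand-in for a costly or side-effecting operand.
    static boolean expensive() {
        evaluations++;
        return true;
    }

    public static void main(String[] args) {
        or(true, ShortCircuit::expensive);   // right side skipped
        and(false, ShortCircuit::expensive); // right side skipped
        or(false, ShortCircuit::expensive);  // right side evaluated once
        System.out.println("right operand evaluated " + evaluations + " time(s)");
    }
}
```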

Related to https://www.pivotaltracker.com/story/show/182261401
2022-06-02 16:58:38 +00:00
James Dunkerley
ba5d6823a9
Merge the Unique Name Strategy with NameDeduplicator (#3490)
- Merge the two approaches and make them consistent
- Add warning support into Reader

# Important Notes
- Added support for JUnit format XML generation on tests. Use `ENSO_TEST_JUNIT_DIR`
2022-06-01 12:52:23 +00:00
Michał Wawrzyniec Urbańczyk
9842b0e5f0
Allow trailing space in ENSO_HTTP_TEST_HTTPBIN_URL URL. (#3500) 2022-06-01 10:32:03 +03:00
James Dunkerley
1aa0bb3552
Rank Data, Correlation, Covariance, R Squared (#3484)
- Added new `Statistic`s: Covariance, Pearson, Spearman, R Squared
- Added `covariance_matrix` function
- Added `pearson_correlation` function to compute correlation matrix
- Added `rank_data` and Rank_Method type to create rankings of a Vector
- Added `spearman_correlation` function to compute Spearman Rank correlation matrix
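
To show how these statistics relate to each other, here is a minimal Java sketch: Pearson correlation on raw values, and Spearman correlation computed as Pearson correlation on ranks. The class and method names are hypothetical, and the tie-handling rule (ties receive the average of their 1-based ranks) is an assumption made for the example; the commit does not say which `Rank_Method` is the default.

```java
import java.util.Arrays;

// Hypothetical sketch; not the Enso standard library implementation.
final class Correlation {
    // Pearson correlation coefficient of two equally sized samples.
    static double pearson(double[] x, double[] y) {
        double mx = Arrays.stream(x).average().orElse(Double.NaN);
        double my = Arrays.stream(y).average().orElse(Double.NaN);
        double sxy = 0, sxx = 0, syy = 0;
        for (int i = 0; i < x.length; i++) {
            double dx = x[i] - mx, dy = y[i] - my;
            sxy += dx * dy;
            sxx += dx * dx;
            syy += dy * dy;
        }
        return sxy / Math.sqrt(sxx * syy);
    }

    // Assumed tie rule: tied values share the average of their 1-based ranks.
    static double[] averageRanks(double[] v) {
        int n = v.length;
        Integer[] order = new Integer[n];
        for (int i = 0; i < n; i++) order[i] = i;
        // Sort indices by the values they point at.
        Arrays.sort(order, (a, b) -> Double.compare(v[a], v[b]));
        double[] ranks = new double[n];
        int i = 0;
        while (i < n) {
            int j = i;
            // Extend j over the run of values equal to v[order[i]].
            while (j + 1 < n && v[order[j + 1]] == v[order[i]]) j++;
            double avg = (i + j) / 2.0 + 1.0;
            for (int k = i; k <= j; k++) ranks[order[k]] = avg;
            i = j + 1;
        }
        return ranks;
    }

    // Spearman rank correlation: Pearson correlation of the ranks.
    static double spearman(double[] x, double[] y) {
        return pearson(averageRanks(x), averageRanks(y));
    }

    public static void main(String[] args) {
        double[] x = {1, 2, 3};
        double[] y = {1, 8, 27};
        System.out.println("pearson  = " + pearson(x, y));
        System.out.println("spearman = " + spearman(x, y));
    }
}
```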

# Important Notes
- Added `Panic.throw_wrapped_if_error` and `Panic.handle_wrapped_dataflow_error` to help with errors within a loop.
- Removed `Array.set_at` use from `Table.Vector_Builder`
2022-05-30 17:13:06 +00:00
Radosław Waśko
f0f3a343eb
Adjust Table.sort_columns to use Text_Ordering design (#3487)
Implements https://www.pivotaltracker.com/story/show/182195306
2022-05-30 12:26:29 +00:00
Radosław Waśko
db611e1581
Remove obsolete Csv reading module (#3482)
Completes https://www.pivotaltracker.com/story/show/182037405

# Important Notes
- Some tests had to be adapted to the new parsing logic.
2022-05-28 10:01:14 +00:00
Radosław Waśko
8828d801ea
Implement Table from Text conversion (#3478)
Implements https://www.pivotaltracker.com/story/show/181824168
2022-05-26 12:04:25 +00:00
Radosław Waśko
7f572bf3e4
The user should be able to have the headers Inferred when reading a Delimited file (#3472)
Implements https://www.pivotaltracker.com/story/show/181986831
2022-05-25 13:29:17 +00:00
Radosław Waśko
ec1b072824
Integrate value parsing with Delimited file reading (#3463)
Implements https://www.pivotaltracker.com/story/show/182200028
2022-05-24 17:59:00 +02:00
Radosław Waśko
ff7700ebb1
Automatic inference of value types when parsing table columns (#3462)
Implements https://www.pivotaltracker.com/story/show/182199966
2022-05-20 15:08:36 +00:00
Radosław Waśko
0073f461d9
Fix Dataflow Error propagation for Builtins accepting primitives (#3400)
[ci no changelog needed]

Fixes https://www.pivotaltracker.com/story/show/181652841
2022-05-19 15:25:30 +00:00
Radosław Waśko
8430ce2625
Parsing values with known types (#3455)
Implements https://www.pivotaltracker.com/story/show/181824146
2022-05-18 15:27:48 +00:00
Jaroslav Tulach
78e7d69198
Generator of natural numbers yields IllegalStateException (#3440) 2022-05-18 11:48:29 +00:00
Hubert Plociniczak
6b6b1430bc
Cleanup Ref - get/put (#3457)
The change promotes static methods of `Ref`, `get` and `put`, to be
methods of `Ref` type.
The change also removes `Ref` module from the default namespace.
Had to mostly c&p functional dispatch for now, in order for the methods
to be found. Will auto-generate that code as part of builtins system.

Related to https://www.pivotaltracker.com/story/show/182138899
2022-05-17 10:26:36 +00:00
James Dunkerley
4f3a76817c
Statistics on a Vector (#3442)
- Implements various statistics on Vector

# Important Notes
Some minor codebase improvements:
- Some tweaks to Any/Nothing to improve performance
- Fixed bug in ObjectComparator
- Added if_nothing
- Removed Group_By_Key
2022-05-11 13:25:06 +00:00
Radosław Waśko
64f178f7a8
Delimited File Encoding (#3430)
Implements https://www.pivotaltracker.com/story/show/181998375
2022-05-10 22:44:05 +00:00
James Dunkerley
078c665a60
File_Format.Excel work (#3425)
- Read in Excel files following the specification.
- Support for XLSX and XLS formats.
- Ability to select ranges and sheets.
- Skip Rows and Row Limits.

# Important Notes
- Minor fix to DelimitedReader for Windows
2022-05-06 13:21:10 +00:00
Hubert Plociniczak
4bbabc00be
Move Builtin Types and Methods to stdlib (#3363)
This PR replaces hard-coded `@Builtin_Method` and `@Builtin_Type` nodes in Builtins with an automated solution
that a) collects metadata from such annotations b) generates `BuiltinTypes` c) registers builtin methods with corresponding
constructors.
The main differences are:
1) The owner of the builtin method does not necessarily have to be a builtin type
2) You can now mix regular methods and builtin ones in stdlib 
3) No need to keep track of builtin methods and types in various places and register them by hand (a source of many typos or omissions, as was found during the process of this PR)

Related to #181497846
Benchmarks also execute within the margin of error.

### Important Notes

The PR got a bit large over time as I was moving various builtin types and finding various corner cases.
Most of the changes however are rather simple c&p from Builtins.enso to the corresponding stdlib module.
Here is the list of the most crucial updates:
- `engine/runtime/src/main/java/org/enso/interpreter/runtime/builtin/Builtins.java` - the core of the changes. We no longer register individual builtin constructors and their methods by hand. Instead, the information about those is read from 2 metadata files generated by annotation processors. When the builtin method is encountered in stdlib, we do not ignore the method. Instead we look it up in the list of registered functions (see `getBuiltinFunction` and `IrToTruffle`)
- `engine/runtime/src/main/java/org/enso/interpreter/runtime/callable/atom/AtomConstructor.java` has now information whether it corresponds to the builtin type or not.
- `engine/runtime/src/main/scala/org/enso/compiler/codegen/RuntimeStubsGenerator.scala` - when runtime stubs generator encounters a builtin type, based on the @Builtin_Type annotation, it looks up an existing constructor for it and registers it in the provided scope, rather than creating a new one. The scope of the constructor is also changed to the one coming from stdlib, while ensuring that synthetic methods (for fields) also get assigned correctly
- `engine/runtime/src/main/scala/org/enso/compiler/codegen/IrToTruffle.scala` - when a builtin method is encountered in stdlib we don't generate a new function node for it, instead we look it up in the list of registered builtin methods. Note that Integer and Number present a bit of a challenge because they list a whole bunch of methods that don't have a corresponding method (instead delegating to small/big integer implementations).
During the translation new atom constructors get initialized but we don't want to do it for builtins which have gone through the process earlier, hence the exception
- `lib/scala/interpreter-dsl/src/main/java/org/enso/interpreter/dsl/MethodProcessor.java` - @Builtin_Method processor not only generates the actual code for nodes but also collects and writes the info about them (name, class, params) to a metadata file that is read during builtins initialization
- `lib/scala/interpreter-dsl/src/main/java/org/enso/interpreter/dsl/MethodProcessor.java` - @Builtin_Method processor no longer generates only (root) nodes but also collects and writes the info about them (name, class, params) to a metadata file that is read during builtins initialization
- `lib/scala/interpreter-dsl/src/main/java/org/enso/interpreter/dsl/TypeProcessor.java` - Similar to MethodProcessor but handles @Builtin_Type annotations. It doesn't, **yet**, generate any builtin objects.  It also collects the names, as present in stdlib, if any, so that we can generate the names automatically (see generated `types/ConstantsGen.java`)
- `engine/runtime/src/main/java/org/enso/interpreter/node/expression/builtin` - various classes annotated with @BuiltinType to ensure that the atom constructor is always properly registered for the builtin. Note that in order to support type fields in those, the annotation takes an optional `params` parameter (comma separated).
- `engine/runtime/src/bench/scala/org/enso/interpreter/bench/fixtures/semantic/AtomFixtures.scala` - drop manual creation of test list which seemed to be a relic of the old design
2022-05-05 20:18:06 +02:00
Radosław Waśko
8219dca400
Improve support for reading Delimited files (#3424)
Implements https://www.pivotaltracker.com/story/show/181823957
2022-04-29 17:12:19 +00:00
Radosław Waśko
14257d07aa
Data analysts should be able to use Text.split, Text.lines and Text.words to break up strings (#3415)
Implements https://www.pivotaltracker.com/story/show/181266184

### Important Notes

Changed example image download to only proceed if the file did not exist before, thus cutting down on build time (the build used to download it _every_ time, which completely failed the build if the network was down). A redownload can be forced by performing a fresh repository checkout.
2022-04-26 17:22:53 +02:00
Radosław Waśko
fecaa81551
Review Range and Interval, resolve infinite loop issue (#3408)
Implements: https://www.pivotaltracker.com/story/show/181652841
2022-04-20 16:22:01 +00:00
James Dunkerley
5a6b6749cc
Restructuring for File.read (#3390)
- Added Encoding type
- Added `Text.bytes`, `Text.from_bytes` with Encoding support
- Renamed `File.read` to `File.read_text`
- Renamed `File.write` to `File.write_text`
- Added Encoding support to `File.read_text` and `File.write_text`
- Added warnings to invalid encodings
2022-04-19 16:50:03 +00:00
Jaroslav Tulach
ab692b3b74
Enso Date shall be converted to java.time.LocalDate when passed to Java (#3374) 2022-04-15 06:02:05 +02:00
Radosław Waśko
0ea5dc2a6f
Data analysts should be able to use Text.replace to substitute parts of the text (#3393)
Implements https://www.pivotaltracker.com/story/show/181266274
2022-04-13 19:21:47 +00:00
Radosław Waśko
891f064a6a
Extend Aggregate_Spec test suite with tests for missed edge-cases to ensure the feature is well-tested on all backends (#3383)
Implements https://www.pivotaltracker.com/story/show/181805693 and finishes the basic set of features of the Aggregate component.

Still not all aggregations are supported everywhere, because for example SQLite has quite limited support for aggregations. Currently the workaround is to bring the table into memory (if possible) and perform the computation locally. Later on, we may add more complex generator features to emulate the missing aggregations with complex sub-queries.
2022-04-12 11:02:01 +00:00
Marcin Kostrzewa
4e51f31eb7
Always call defaulted atom arguments (#3358)
Solves the issue of defaulted args not being called in atoms. Doesn't solve the more general function issue.
2022-04-08 08:21:59 +00:00
James Dunkerley
bade0c31de
First and Last ordering (#3380)
Add the missing `order_by` support to First and Last aggregations for InMemory table.
2022-04-06 12:36:46 +00:00
Radosław Waśko
a71db71645
Adding most of remaining aggregates to Database Table (#3375) 2022-04-06 10:06:50 +00:00
Nikita Pekin
42ac28d0de
Add benchmark for Text.reverse with strings of varying length (#3381)
This pull request adds a benchmark for the `Text.reverse` function added in #3377 as part of https://www.pivotaltracker.com/n/projects/2539304/stories/181265419.

Per discussion with @jdunkerley on Discord it is useful to have this benchmark as this is a low-level item we want to track.
2022-04-05 17:59:21 +00:00
Nikita Pekin
22e3941371
Data analysts should be able to reverse strings using Text.reverse (#3377)
This commit implements `Text.reverse` as an extension on `Text`.
`Text.reverse` reverses strings. For example: `"Hello World!".reverse`
results in `"!dlroW olleH"`.

Strings are reversed by their Extended Grapheme Clusters not by their
characters. This has some performance implications because we need to
find these grapheme cluster boundaries when iterating. To do so,
`BreakIterator.getCharacterInstance` is used.
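
The idea can be sketched in Java as follows. This sketch uses the JDK's `java.text.BreakIterator`, which is an assumption: the engine may well use a different `BreakIterator` (for example ICU's), and how completely extended grapheme clusters such as emoji sequences are recognised depends on the implementation and JDK version. The class name `GraphemeReverse` is made up for the example.

```java
import java.text.BreakIterator;

// Hypothetical helper for illustration; not the Enso implementation.
final class GraphemeReverse {
    // Reverses text cluster by cluster instead of char by char, so that
    // a base character and its combining marks stay together.
    static String reverse(String text) {
        BreakIterator it = BreakIterator.getCharacterInstance();
        it.setText(text);
        StringBuilder out = new StringBuilder(text.length());
        // Start at the last boundary and walk backwards.
        int end = it.last();
        for (int start = it.previous();
                start != BreakIterator.DONE;
                start = it.previous()) {
            // Append the cluster [start, end) in its original internal order.
            out.append(text, start, end);
            end = start;
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(reverse("Hello World!"));
    }
}
```

Finding the boundaries is the extra cost mentioned above: a plain character reversal could index directly, while this loop has to ask the iterator for each cluster boundary.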

Implements: https://www.pivotaltracker.com/n/projects/2539304/stories/181265419
2022-04-05 16:45:56 +00:00
James Dunkerley
a4dbc9a37b
Moving Aggregation to Java (#3364) 2022-04-04 09:12:48 +00:00
Radosław Waśko
43265f10a8
Implement Error-Handling for Database aggregations, unify some error helpers across backends (#3371) 2022-03-31 12:10:22 +00:00
Radosław Waśko
20be5516a5
Aggregates in the Database library - MVP (#3353)
Implements infrastructure for new aggregations in the Database. It comes with only some basic aggregations and limited error-handling. More aggregations and problem handling will be added in subsequent PRs.

# Important Notes
This introduces basic aggregations using our existing codegen and sets-up our testing infrastructure to be able to use the same aggregate tests as in-memory backend for the database backends.

Many aggregations are not yet implemented - they will be added in subsequent tasks.

There are some TODOs left - they will be addressed in the next tasks.
2022-03-28 15:51:37 +00:00
Radosław Waśko
85a5770b7f
Quick-fix for Error.to_text CCE (#3357)
This is just a quick fix addressing an issue which was making debugging problematic.

The proper solution to the broader issue described at https://github.com/enso-org/enso/issues/1538#issuecomment-789645573 still needs to be done.
2022-03-24 13:12:53 +00:00
Radosław Waśko
85c09e7414
Make Resource.bracket not run the action if initializer failed with a dataflow error (#3356) 2022-03-23 16:36:35 +01:00
James Dunkerley
02bcfbb2a8
Refactor Aggregate Column (#3349)
- Make it easier to understand the computations.
- Fix issue with First.
- Improve quote handling in Concatenate
- Added validation and warnings to input
2022-03-22 18:18:46 +00:00
Hubert Plociniczak
66e2135b0d
Initialize AtomConstructor's fields via local vars (#3330)
The mechanism follows a similar approach to what is being done in functions
with default arguments.
Additionally since InstantiateAtomNode wasn't a subtype of EnsoRootNode it
couldn't be used in the application, which was the primary reason for
issue #181449213.
Alternatively InstantiateAtomNode could have been enhanced to extend
EnsoRootNode rather than RootNode to carry scope info but the former
seemed simpler.

See test cases for previously crashing and invalid cases.
2022-03-21 09:15:14 +00:00
Radosław Waśko
cc7333812d
The library developer should be able to handle specific types of Panics while passing through others (#3344)
Implements https://www.pivotaltracker.com/story/show/181569176

Also ensures that Dataflow Errors have proper stack traces (earlier they did not point at the right location).
2022-03-18 16:57:06 +00:00
Radosław Waśko
08183f59f2
Minor fixes for Text (#3340)
* Avoid unnecessary copies

* Add tests for conversions

* Add guidelines for Text tests

Co-authored-by: mergify[bot] <37929162+mergify[bot]@users.noreply.github.com>
2022-03-15 16:11:46 +00:00
James Dunkerley
6c1c4554f5
Refactor table.group_by to table.aggregate (#3339)
Following UX work move to `table.aggregate` function.
2022-03-15 15:23:36 +01:00
Radosław Waśko
dedd1eac96
Refactor library warnings to use the new system (#3337)
Implements https://www.pivotaltracker.com/story/show/181536964
2022-03-15 12:52:57 +01:00
Radosław Waśko
247b284316
Data analysts should be able to use Text.location_of to find indexes within string using various matchers (#3324)
Implements https://www.pivotaltracker.com/n/projects/2539304/stories/181266029
2022-03-12 19:42:00 +00:00
Marcin Kostrzewa
4653bfeeab
Decorate values with arbitrary warnings (#3248) 2022-03-09 16:40:02 +01:00
James Dunkerley
65465fb8ef
Restructuring the Faker type and creating tests for Group_By (#3318)
- Added Minimum, Maximum, Longest, Shortest, Mode, Percentile
- Added first and last to Map
- Restructured Faker type more in line with FakerJS
- Created 2,500 row data set
- Tests for group_by
- Performance tests for group_by
2022-03-09 10:31:02 +00:00
Hubert Plociniczak
f92108158c
Added compare_to to True/False (#3317) 2022-03-08 14:24:04 +01:00
Hubert Plociniczak
8bdca89917
New Text.insert function (#3311)
Implements https://www.pivotaltracker.com/n/projects/2539304
2022-03-04 16:40:34 +01:00
James Dunkerley
fb68f18739
Within Vector, use Array.Copy wherever possible (#3236)
Following the Slice and Array.Copy experiment, took just the Array.Copy parts out and built into the Vector class.

This gives big performance wins in common operations:

| Test | Ref | New |
| --- | --- | --- |
| New Vector | 41.5 | 41.4 |
| Append Single | 26.6 | 4.2 |
| Append Large | 26.6 | 4.2 |
| Sum | 230.1 | 99.1 |
| Drop First 20 and Sum | 343.5 | 96.9 |
| Drop Last 20 and Sum | 311.7 | 96.9 |
| Filter | 240.2 | 92.5 |
| Filter With Index | 364.9 | 237.2 |
| Partition | 772.6 | 280.4 |
| Partition With Index | 912.3 | 427.9 |
| Each | 110.2 | 113.3 |

*Benchmarks run on an AWS EC2 r5a.xlarge with 1,000,000 item count, 100 iteration size run 10 times.*

# Important Notes
Have generally tried to push the `@Tail_Call` down from the Vector class and move to calling functions on the range class.

- Expanded benchmarks on Vector
- Added `take` method to Vector
- Added `each_with_index` method to Vector
- Added `filter_with_index` method to Vector
2022-03-03 15:40:48 +00:00
Radosław Waśko
500aed9d86
Fix the Test library ignoring dataflow errors (#3312)
Fixes https://www.pivotaltracker.com/story/show/181369176
2022-03-03 11:02:13 +01:00
James Dunkerley
ad1130587d
Updating Text.repeat and adding Text.* (#3310)
Updating the `Text.repeat` function:
- fix issue with negative count
- add * operator

Add tests of the function.
2022-03-02 19:00:47 +00:00
Radosław Waśko
40c851bf8b
Text.pad and Text.trim (#3309)
Implements https://www.pivotaltracker.com/story/show/181265516
2022-03-02 17:19:39 +00:00
James Dunkerley
738a691662
Table.group_by (#3305)
Functioning group_by based on Enso Map.

# Important Notes
This is an initial version which will be used to establish the API.
The grouping map will need to be moved to Java code for performance.
2022-03-01 16:18:11 +00:00
Radosław Waśko
0d96f59f44
Data analysts should be able to use Text.to_case to change the case of Text values (#3302)
* Move to_upper_case and to_lower_case into to_case

* Add an export, not sure about it

* Implement title case

TODO: some more tests would be good

* Add more tests

* explain title case

* fix todo

* changelog
2022-02-28 23:20:41 +00:00
Radosław Waśko
b03416f907
Update Column_Selector and Column_Mapping to use Matcher over Matching_Strategy (#3299)
Implements https://www.pivotaltracker.com/story/show/181339748
2022-02-25 18:39:10 +00:00
Radosław Waśko
2ae636f63c
Data analysts should be able to use Text.starts_with and Text.ends_with (#3292)
Implements https://www.pivotaltracker.com/story/show/181265900
2022-02-23 16:48:33 +00:00
James Dunkerley
2e2c5562a8
Text.take and Text.drop (#3287)
Implementation of the Text take and drop APIs
- Added `Range.contains` function
- Added `Text_Sub_Range` type
- Added `Text_Utils.index_of` and `Text_Utils.last_index_of` based on ICU StringSearcher
2022-02-22 18:50:59 +00:00
Radosław Waśko
ae9d51555f
Data analysts should be able to use Text.contains to check for substring using various matcher techniques. (#3285)
* Add matching mode definitions

* Add stub for new method API and an initial test suite

* Fix tests, implement exact matching

* Implement Regex matching

* changelog

* Add benchmarks

* Workaround for case insensitive regex locale support

* minor tweaks

* Unify Case_Insensitive

* Update edge cases

* Fix other affected places

* minor style change

* Add a problematic test

* Add a regex test for a similar situation

* Migrate to StringSearch

* Add test cases for scharfes S edge case

* Add problematic Regex Unicode normalization test

* Document the regex accents peculiarity

* Do not apply the normalization in ASCII only mode

* cr
2022-02-22 15:41:56 +00:00
Radosław Waśko
14f57271a2
Ensure that Text.compare_to compares strings according to grapheme clusters (#3282)
https://www.pivotaltracker.com/story/show/181175238
2022-02-17 17:09:41 +00:00
James Dunkerley
7afc8c48c5
Adding Integer.Parse (#3283)
* Integer parse via Longs

* Integer parse via Longs

* Benchmark for Number Parse

* CHANGELOG.md and Natural Order

* Expanded test set

* Number base tests

* Few more negative tests
2022-02-17 15:04:00 +00:00
James Dunkerley
68b85dea82
Improvement to the Natural Order Sort (#3276)
* Improved Natural Order
Data generator for benchmarking

* Missing Import
Benchmark script

* Update Natural_Order.enso

Restore missing ToDo

* Changelog

* PR Comments

* PR Comments

* Additional comments.

* Correction
2022-02-16 17:40:33 +00:00
Marcin Kostrzewa
67b4e59506
Properly expose stacktraces and related data to user code (#3271) 2022-02-16 10:36:19 +03:00
Radosław Waśko
fbf747d6cf
Implement Vector.flatten (#3259) 2022-02-15 16:16:08 +01:00
James Dunkerley
585afd83ce
Adding Text.at and Text.is_digit functions (#3269)
* Add Text.at function

* Add tests for Text.at

* Add tests for Text.is_digit

* Change log

* Avoid memory allocation
2022-02-14 09:03:55 +00:00
Edward Kmett
0c25ee736c
Upgrade Truffle and Graal to Version 21.3.0 (#3258) 2022-02-11 19:05:13 +03:00
James Dunkerley
1814d3c4f1
Data analysts should be able to transform a Table using the rename_columns functions (#3249)
* Implement Natural_Order and sort_columns

* Starting on Rename

Align Column_Mapping

Add By_Position
Separating off the validation for By_Index so can reuse for rename

By_Position implemented

By_Index implemented
Adjusted behaviour following discussion with Ned, so that renames dominate untouched columns.

Moving to validation style checks for problems

Putting accumulator back

Rename work

* Add Range.find

* More work

* Regex support
Tidy of Unique Name Strategy

* Fix Regex support

* Warning messages
Tests for Unique Naming Strategy
Table rename working

* Database Table rename_columns
Fix for Table
**Must follow up on slice**

* Some tests

* More tests

* Complete test set
(and associated fixes)

* Functional use_first_row_as_names
Tests to go...

* Test for use_first_row_as_names

* Change log

* trailing space

Co-authored-by: Radosław Waśko <radoslaw.wasko@enso.org>
2022-02-11 10:18:51 +00:00
Marcin Kostrzewa
ee8df25fd5
Fix vector sorting with TCO comparators (#3256) 2022-02-09 22:17:43 +01:00
Radosław Waśko
8b24336604
Data analysts should be able to reorder columns into name order using sort_columns functions (#3250) 2022-02-08 17:28:46 +01:00
Edward Kmett
8a70debb59
Implement conversions (#180312665) (#3227)
* Implement conversions

start wip branch for conversion methods for collaborating with marcin

add conversions to MethodDispatchLibrary (wip)

start MethodDispatchLibrary implementations

conversions for atoms and functions

Implement a bunch of missing conversion lookups

final bug fixes for merged methoddispatchlibrary implementations

UnresolvedConversion.resolveFor

progress on invokeConversion

start extracting constructors (still not working)

fix a bug

add some initial conversion tests

fix a bug in qualified name resolution, test conversions across modules

implement error reporting, discover a ton of ignored errors...

start fixing errors that we exposed in the standard library

fix remaining standard lib type errors not caused by the inability to parse type signatures for operators

TODO: fix type signatures for operators. all of them are broken

fix type signature parsing for operators

test cases for meta & polyglot

play nice with polyglot

start pretending unresolved conversions are unresolved symbols

treat UnresolvedConversions as UnresolvedSymbols in enso user land

* update RELEASES.md

* disable test error about from conversions being tail calls. (pivotal issue #181113110)

* add changelog entry

* fix OverloadsResolutionTest

* fix MethodDefinitionsTest

* fix DataflowAnalysisTest

* the field name for a from conversion must be 'that'. Fix remaining tests that aren't ExpressionUpdates vs. ExecutionUpdate behavioral changes

* fix ModuleThisToHereTest

* feat: suppress compilation errors from Builtins

* Revert "feat: suppress compilation errors from Builtins"

This reverts commit 63d069bd4f.

* fix tests

* fix: formatting

Co-authored-by: Dmitry Bushev <bushevdv@gmail.com>
Co-authored-by: Marcin Kostrzewa <marckostrzewa@gmail.com>
2022-02-06 04:02:09 -05:00
Radosław Waśko
d3c0f968fa
Data analysts should be able to transform a Table using the remove_columns and reorder_columns functions (#3240) 2022-02-03 15:18:47 +01:00
Radosław Waśko
b5fc87e618
Data analysts should be able to transform a Table using the select_columns function (#3230)
* Utility for mapping errors and warnings
* Implement By_Index
* Expose select_columns in InMem and DB. Need testing
* checkpoint: writing tests
* Fix minor issues, mock warning mapping for testing purposes
* Improve By_Index error handling
* A helper for testing problem handling
* More error handling
* docs
* changelog
* Fix matching test
* Add SQLite tests
* cleanup after test
* Rework problem handling
* small refactor
* add examples
* Add more test cases for regex matching
* Fix Regex.Pattern.matches to match full string
* "Fix" tests
2022-02-02 09:04:06 +00:00
Radosław Waśko
cfdb33bc68
Improve Vector (#3232) 2022-01-25 18:29:39 +01:00
James Dunkerley
8387375d83
Moving distinct to Map (#3229)
* Moving distinct to Map

* Mixed Type Comparable Wrapper

* Missing Bracket
Still an issue with `Integer` in the mixed vector test

* PR comments

* Use naive approach for mixed types

* Enable pending test

* Performance timing function

* Handle incomparable types cleanly

* Tidy up the time_execution function

* PR comments.

* Change log
2022-01-25 09:57:30 +00:00
Radosław Waśko
107128aeec
A library developer should be able to select matching names given a list (#3220) 2022-01-20 11:11:43 +01:00
Michał Wawrzyniec Urbańczyk
ed0e918bff
Fix the new engine CI workflow (#180855729) (#3219)
Co-authored-by: Radosław Waśko <radoslaw.wasko@enso.org>
2022-01-17 19:21:34 +01:00
Radosław Waśko
66082ea554
The user should be able to remove duplicate elements from a Vector (#3224) 2022-01-17 12:51:56 +03:00
Dmitry Bushev
c14a2d8169
Fix codec spec (#3185) 2021-12-09 15:01:47 +03:00
Dmitry Bushev
93f7362199
Set Locale in Tests (#3158) 2021-11-16 17:18:25 +03:00
Ara Adkins
337f6c8ad4
Implement linear regression on tables (#2003) 2021-09-29 15:33:18 +01:00
Ara Adkins
d6465e9e97
Implement a --compile command for the engine runner (#1998) 2021-09-24 12:24:44 +01:00
Ara Adkins
1cd2706ba8
Load IR Caches from Disk (#1996) 2021-09-18 13:48:13 +01:00