212 Commits

Author SHA1 Message Date
Radosław Waśko
a71db71645
Adding most of remaining aggregates to Database Table (#3375) 2022-04-06 10:06:50 +00:00
Nikita Pekin
42ac28d0de
Add benchmark for Text.reverse with strings of varying length (#3381)
This pull request adds a benchmark for the `Text.reverse` function added in #3377 as part of https://www.pivotaltracker.com/n/projects/2539304/stories/181265419.

Per discussion with @jdunkerley on Discord it is useful to have this benchmark as this is a low-level item we want to track.
2022-04-05 17:59:21 +00:00
Nikita Pekin
22e3941371
Data analysts should be able to reverse strings using Text.reverse (#3377)
This commit implements `Text.reverse` as an extension on `Text`.
`Text.reverse` reverses strings. For example: `"Hello World!".reverse`
results in `"!dlroW olleH"`.

Strings are reversed by their Extended Grapheme Clusters not by their
characters. This has some performance implications because we need to
find these grapheme cluster boundaries when iterating. To do so,
`BreakIterator.getCharacterInstance` is used.

Implements: https://www.pivotaltracker.com/n/projects/2539304/stories/181265419
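
The grapheme-cluster approach described above can be sketched in Java roughly as follows. This is a minimal illustration, not the code from the pull request: the commit only names `BreakIterator.getCharacterInstance`, so the use of the JDK's `java.text.BreakIterator` (rather than, say, ICU4J's) is an assumption, and the class and method names are hypothetical.

```java
import java.text.BreakIterator;

// Hypothetical helper, for illustration only.
final class GraphemeReverse {
    // Reverses text cluster by cluster, so a base character and its
    // combining marks stay together instead of being split apart.
    static String reverse(String text) {
        // Assumption: the JDK's character-boundary iterator is used here.
        BreakIterator boundaries = BreakIterator.getCharacterInstance();
        boundaries.setText(text);

        StringBuilder reversed = new StringBuilder(text.length());
        // Start at the final boundary and walk backwards, copying each
        // cluster [start, end) to the output in reverse order.
        int end = boundaries.last();
        for (int start = boundaries.previous();
                start != BreakIterator.DONE;
                start = boundaries.previous()) {
            reversed.append(text, start, end);
            end = start;
        }
        return reversed.toString();
    }
}
```

With this sketch, `GraphemeReverse.reverse("Hello World!")` gives `"!dlroW olleH"`, matching the example in the commit message. Locating the boundaries is extra work compared with reversing `char` by `char`, which is the performance implication the message mentions.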
2022-04-05 16:45:56 +00:00
James Dunkerley
a4dbc9a37b
Moving Aggregation to Java (#3364) 2022-04-04 09:12:48 +00:00
Radosław Waśko
43265f10a8
Implement Error-Handling for Database aggregations, unify some error helpers across backends (#3371) 2022-03-31 12:10:22 +00:00
Radosław Waśko
20be5516a5
Aggregates in the Database library - MVP (#3353)
Implements infrastructure for new aggregations in the Database. It comes with only some basic aggregations and limited error-handling. More aggregations and problem handling will be added in subsequent PRs.

# Important Notes
This introduces basic aggregations using our existing codegen and sets up our testing infrastructure so that the database backends can use the same aggregate tests as the in-memory backend.

Many aggregations are not yet implemented - they will be added in subsequent tasks.

There are some TODOs left - they will be addressed in the next tasks.
2022-03-28 15:51:37 +00:00
Radosław Waśko
85a5770b7f
Quick-fix for Error.to_text CCE (#3357)
This is just a quick fix addressing an issue which was making debugging problematic.

The proper solution to the broader issue described at https://github.com/enso-org/enso/issues/1538#issuecomment-789645573 still needs to be done.
2022-03-24 13:12:53 +00:00
Radosław Waśko
85c09e7414
Make Resource.bracket not run the action if initializer failed with a dataflow error (#3356) 2022-03-23 16:36:35 +01:00
James Dunkerley
02bcfbb2a8
Refactor Aggregate Column (#3349)
- Make it easier to understand the computations.
- Fix issue with First.
- Improve quote handling in Concatenate
- Added validation and warnings to input
2022-03-22 18:18:46 +00:00
Hubert Plociniczak
66e2135b0d
Initialize AtomConstructor's fields via local vars (#3330)
The mechanism follows a similar approach to what is done in functions
with default arguments.
Additionally since InstantiateAtomNode wasn't a subtype of EnsoRootNode it
couldn't be used in the application, which was the primary reason for
issue #181449213.
Alternatively InstantiateAtomNode could have been enhanced to extend
EnsoRootNode rather than RootNode to carry scope info but the former
seemed simpler.

See test cases for previously crashing and invalid cases.
2022-03-21 09:15:14 +00:00
Radosław Waśko
cc7333812d
The library developer should be able to handle specific types of Panics while passing through others (#3344)
Implements https://www.pivotaltracker.com/story/show/181569176

Also ensures that Dataflow Errors have proper stack traces (earlier they did not point at the right location).
2022-03-18 16:57:06 +00:00
Radosław Waśko
08183f59f2
Minor fixes for Text (#3340)
* Avoid unnecessary copies

* Add tests for conversions

* Add guidelines for Text tests

Co-authored-by: mergify[bot] <37929162+mergify[bot]@users.noreply.github.com>
2022-03-15 16:11:46 +00:00
James Dunkerley
6c1c4554f5
Refactor table.group_by to table.aggregate (#3339)
Following UX work, move to the `table.aggregate` function.
2022-03-15 15:23:36 +01:00
Radosław Waśko
dedd1eac96
Refactor library warnings to use the new system (#3337)
Implements https://www.pivotaltracker.com/story/show/181536964
2022-03-15 12:52:57 +01:00
Radosław Waśko
247b284316
Data analysts should be able to use Text.location_of to find indexes within string using various matchers (#3324)
Implements https://www.pivotaltracker.com/n/projects/2539304/stories/181266029
2022-03-12 19:42:00 +00:00
Marcin Kostrzewa
4653bfeeab
Decorate values with arbitrary warnings (#3248) 2022-03-09 16:40:02 +01:00
James Dunkerley
65465fb8ef
Restructuring the Faker type and creating tests for Group_By (#3318)
- Added Minimum, Maximum, Longest, Shortest, Mode, Percentile
- Added first and last to Map
- Restructured Faker type more in line with FakerJS
- Created 2,500 row data set
- Tests for group_by
- Performance tests for group_by
2022-03-09 10:31:02 +00:00
Hubert Plociniczak
f92108158c
Added compare_to to True/False (#3317) 2022-03-08 14:24:04 +01:00
Hubert Plociniczak
8bdca89917
New Text.insert function (#3311)
Implements https://www.pivotaltracker.com/n/projects/2539304
2022-03-04 16:40:34 +01:00
James Dunkerley
fb68f18739
Within Vector, use Array.Copy wherever possible (#3236)
Following the Slice and Array.Copy experiment, took just the Array.Copy parts out and built them into the Vector class.

This gives big performance wins in common operations:

| Test | Ref | New |
| --- | --- | --- |
| New Vector | 41.5 | 41.4 |
| Append Single | 26.6 | 4.2 |
| Append Large | 26.6 | 4.2 |
| Sum | 230.1 | 99.1 |
| Drop First 20 and Sum | 343.5 | 96.9 |
| Drop Last 20 and Sum | 311.7 | 96.9 |
| Filter | 240.2 | 92.5 |
| Filter With Index | 364.9 | 237.2 |
| Partition | 772.6 | 280.4 |
| Partition With Index | 912.3 | 427.9 |
| Each | 110.2 | 113.3 |

*Benchmarks run on an AWS EC2 r5a.xlarge with 1,000,000 item count, 100 iteration size run 10 times.*

# Important Notes
Have generally tried to push the `@Tail_Call` down from the Vector class and move to calling functions on the range class.

- Expanded benchmarks on Vector
- Added `take` method to Vector
- Added `each_with_index` method to Vector
- Added `filter_with_index` method to Vector
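
The gain from bulk copying can be illustrated with a short Java sketch. It assumes that `Array.Copy` behaves like a bulk array copy comparable to `System.arraycopy`; that mapping, and the class and method names below, are assumptions for illustration and are not taken from the pull request.

```java
// Hypothetical helper, for illustration only.
final class BulkCopyAppend {
    // Element-by-element append: one loop iteration per existing item.
    static Object[] appendLoop(Object[] items, Object item) {
        Object[] result = new Object[items.length + 1];
        for (int i = 0; i < items.length; i++) {
            result[i] = items[i];
        }
        result[items.length] = item;
        return result;
    }

    // Bulk append: the existing items are copied in a single call,
    // which the runtime can typically perform as one block copy.
    static Object[] appendBulk(Object[] items, Object item) {
        Object[] result = new Object[items.length + 1];
        System.arraycopy(items, 0, result, 0, items.length);
        result[items.length] = item;
        return result;
    }
}
```

Both methods return the same array contents; only the copying strategy differs.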
2022-03-03 15:40:48 +00:00
Radosław Waśko
500aed9d86
Fix the Test library ignoring dataflow errors (#3312)
Fixes https://www.pivotaltracker.com/story/show/181369176
2022-03-03 11:02:13 +01:00
James Dunkerley
ad1130587d
Updating Text.repeat and adding Text.* (#3310)
Updating the `Text.repeat` function:
- fix issue with negative count
- add * operator

Add tests of the function.
2022-03-02 19:00:47 +00:00
Radosław Waśko
40c851bf8b
Text.pad and Text.trim (#3309)
Implements https://www.pivotaltracker.com/story/show/181265516
2022-03-02 17:19:39 +00:00
James Dunkerley
738a691662
Table.group_by (#3305)
Functioning group_by based on Enso Map.

# Important Notes
This is an initial version which will be used to establish the API.
The grouping map will need to be moved to Java code for performance.
2022-03-01 16:18:11 +00:00
Radosław Waśko
0d96f59f44
Data analysts should be able to use Text.to_case to change the case of Text values (#3302)
* Move to_upper_case and to_lower_case into to_case

* Add an export, not sure about it

* Implement title case

TODO: some more tests would be good

* Add more tests

* explain title case

* fix todo

* changelog
2022-02-28 23:20:41 +00:00
Radosław Waśko
b03416f907
Update Column_Selector and Column_Mapping to use Matcher over Matching_Strategy (#3299)
Implements https://www.pivotaltracker.com/story/show/181339748
2022-02-25 18:39:10 +00:00
Radosław Waśko
2ae636f63c
Data analysts should be able to use Text.starts_with and Text.ends_with (#3292)
Implements https://www.pivotaltracker.com/story/show/181265900
2022-02-23 16:48:33 +00:00
James Dunkerley
2e2c5562a8
Text.take and Text.drop (#3287)
Implementation of the Text take and drop APIs
- Added `Range.contains` function
- Added `Text_Sub_Range` type
- Added `Text_Utils.index_of` and `Text_Utils.last_index_of` based on ICU StringSearcher
2022-02-22 18:50:59 +00:00
Radosław Waśko
ae9d51555f
Data analysts should be able to use Text.contains to check for substring using various matcher techniques. (#3285)
* Add matching mode definitions

* Add stub for new method API and an initial test suite

* Fix tests, implement exact matching

* Implement Regex matching

* changelog

* Add benchmarks

* Workaround for case insensitive regex locale support

* minor tweaks

* Unify Case_Insensitive

* Update edge cases

* Fix other affected places

* minor style change

* Add a problematic test

* Add a regex test for a similar situation

* Migrate to StringSearch

* Add test cases for scharfes S edge case

* Add problematic Regex Unicode normalization test

* Document the regex accents peculiarity

* Do not apply the normalization in ASCII only mode

* cr
2022-02-22 15:41:56 +00:00
Radosław Waśko
14f57271a2
Ensure that Text.compare_to compares strings according to grapheme clusters (#3282)
https://www.pivotaltracker.com/story/show/181175238
2022-02-17 17:09:41 +00:00
James Dunkerley
7afc8c48c5
Adding Integer.Parse (#3283)
* Integer parse via Longs

* Integer parse via Longs

* Benchmark for Number Parse

* CHANGELOG.md and Natural Order

* Expanded test set

* Number base tests

* Few more negative tests
2022-02-17 15:04:00 +00:00
James Dunkerley
68b85dea82
Improvement to the Natural Order Sort (#3276)
* Improved Natural Order
Data generator for benchmarking

* Missing Import
Benchmark script

* Update Natural_Order.enso

Restore missing ToDo

* Changelog

* PR Comments

* PR Comments

* Additional comments.

* Correction
2022-02-16 17:40:33 +00:00
Marcin Kostrzewa
67b4e59506
Properly expose stacktraces and related data to user code (#3271) 2022-02-16 10:36:19 +03:00
Radosław Waśko
fbf747d6cf
Implement Vector.flatten (#3259) 2022-02-15 16:16:08 +01:00
James Dunkerley
585afd83ce
Adding Text.at and Text.is_digit functions (#3269)
* Add Text.at function

* Add tests for Text.at

* Add tests for Text.is_digit

* Change log

* Avoid memory allocation
2022-02-14 09:03:55 +00:00
Edward Kmett
0c25ee736c
Upgrade Truffle and Graal to Version 21.3.0 (#3258) 2022-02-11 19:05:13 +03:00
James Dunkerley
1814d3c4f1
Data analysts should be able to transform a Table using the rename_columns functions (#3249)
* Implement Natural_Order and sort_columns

* Starting on Rename

Align Column_Mapping

Add By_Position
Separating off the validation for By_Index so can reuse for rename

By_Position implemented

By_Index implemented
Adjusted behaviour following discussion with Ned, so that renames dominate untouched columns.

Moving to validation style checks for problems

Putting accumulator back

Rename work

* Add Range.find

* More work

* Regex support
Tidy of Unique Name Strategy

* Fix Regex support

* Warning messages
Tests for Unique Naming Strategy
Table rename working

* Database Table rename_columns
Fix for Table
**Must follow up on slice**

* Some tests

* More tests

* Complete test set
(and associated fixes)

* Functional use_first_row_as_names
Tests to go...

* Test for use_first_row_as_names

* Change log

* trailing space

Co-authored-by: Radosław Waśko <radoslaw.wasko@enso.org>
2022-02-11 10:18:51 +00:00
Marcin Kostrzewa
ee8df25fd5
Fix vector sorting with TCO comparators (#3256) 2022-02-09 22:17:43 +01:00
Radosław Waśko
8b24336604
Data analysts should be able to reorder columns into name order using sort_columns functions (#3250) 2022-02-08 17:28:46 +01:00
Edward Kmett
8a70debb59
Implement conversions (#180312665) (#3227)
* Implement conversions

start wip branch for conversion methods for collaborating with marcin

add conversions to MethodDispatchLibrary (wip)

start MethodDispatchLibrary implementations

conversions for atoms and functions

Implement a bunch of missing conversion lookups

final bug fixes for merged methoddispatchlibrary implementations

UnresolvedConversion.resolveFor

progress on invokeConversion

start extracting constructors (still not working)

fix a bug

add some initial conversion tests

fix a bug in qualified name resolution, test conversions across modules

implement error reporting, discover a ton of ignored errors...

start fixing errors that we exposed in the standard library

fix remaining standard lib type errors not caused by the inability to parse type signatures for operators

TODO: fix type signatures for operators. all of them are broken

fix type signature parsing for operators

test cases for meta & polyglot

play nice with polyglot

start pretending unresolved conversions are unresolved symbols

treat UnresolvedConversions as UnresolvedSymbols in enso user land

* update RELEASES.md

* disable test error about from conversions being tail calls. (pivotal issue #181113110)

* add changelog entry

* fix OverloadsResolutionTest

* fix MethodDefinitionsTest

* fix DataflowAnalysisTest

* the field name for a from conversion must be 'that'. Fix remaining tests that aren't ExpressionUpdates vs. ExecutionUpdate behavioral changes

* fix ModuleThisToHereTest

* feat: suppress compilation errors from Builtins

* Revert "feat: suppress compilation errors from Builtins"

This reverts commit 63d069bd4f.

* fix tests

* fix: formatting

Co-authored-by: Dmitry Bushev <bushevdv@gmail.com>
Co-authored-by: Marcin Kostrzewa <marckostrzewa@gmail.com>
2022-02-06 04:02:09 -05:00
Radosław Waśko
d3c0f968fa
Data analysts should be able to transform a Table using the remove_columns and reorder_columns functions (#3240) 2022-02-03 15:18:47 +01:00
Radosław Waśko
b5fc87e618
Data analysts should be able to transform a Table using the select_columns function (#3230)
* Utility for mapping errors and warnings
* Implement By_Index
* Expose select_columns in InMem and DB. Need testing
* checkpoint: writing tests
* Fix minor issues, mock warning mapping for testing purposes
* Improve By_Index error handling
* A helper for testing problem handling
* More error handling
* docs
* changelog
* Fix matching test
* Add SQLite tests
* cleanup after test
* Rework problem handling
* small refactor
* add examples
* Add more test cases for regex matching
* Fix Regex.Pattern.matches to match full string
* "Fix" tests
2022-02-02 09:04:06 +00:00
Radosław Waśko
cfdb33bc68
Improve Vector (#3232) 2022-01-25 18:29:39 +01:00
James Dunkerley
8387375d83
Moving distinct to Map (#3229)
* Moving distinct to Map

* Mixed Type Comparable Wrapper

* Missing Bracket
Still an issue with `Integer` in the mixed vector test

* PR comments

* Use naive approach for mixed types

* Enable pending test

* Performance timing function

* Handle incomparable types cleanly

* Tidy up the time_execution function

* PR comments.

* Change log
2022-01-25 09:57:30 +00:00
Radosław Waśko
107128aeec
A library developer should be able to select matching names given a list (#3220) 2022-01-20 11:11:43 +01:00
Michał Wawrzyniec Urbańczyk
ed0e918bff
Fix the new engine CI workflow (#180855729) (#3219)
Co-authored-by: Radosław Waśko <radoslaw.wasko@enso.org>
2022-01-17 19:21:34 +01:00
Radosław Waśko
66082ea554
The user should be able to remove duplicate elements from a Vector (#3224) 2022-01-17 12:51:56 +03:00
Dmitry Bushev
c14a2d8169
Fix codec spec (#3185) 2021-12-09 15:01:47 +03:00
Dmitry Bushev
93f7362199
Set Locale in Tests (#3158) 2021-11-16 17:18:25 +03:00
Ara Adkins
337f6c8ad4
Implement linear regression on tables (#2003) 2021-09-29 15:33:18 +01:00
Ara Adkins
d6465e9e97
Implement a --compile command for the engine runner (#1998) 2021-09-24 12:24:44 +01:00
Ara Adkins
1cd2706ba8
Load IR Caches from Disk (#1996) 2021-09-18 13:48:13 +01:00
Ara Adkins
ab8b2a2d4a
Implement writing of IR caches (#1991) 2021-09-08 17:15:42 +01:00
Marcin Kostrzewa
4f4e472ddf
Statistical functions (#1990) 2021-09-06 14:48:09 +02:00
Marcin Kostrzewa
a81257b402
Google Spreadsheet Reading (#1976) 2021-09-03 21:41:12 +02:00
Ara Adkins
c12cab9bd9
Add Column.set_index (#1982) 2021-09-02 10:30:02 +01:00
Marcin Kostrzewa
b73e5e84b3
Redshift Connector (#1985) 2021-09-02 11:28:49 +02:00
Ara Adkins
c18fe2d750
Provide regex support on Text (#1968) 2021-08-23 12:09:51 +01:00
Marcin Kostrzewa
4536ed9f9b
Stdlib Improvements (#1963) 2021-08-19 14:55:15 +02:00
Radosław Waśko
385464d0f0
Implement Files.list (#1961) 2021-08-18 21:26:22 +02:00
Marcin Kostrzewa
98eab2873e
Allow specifying a cell range when reading spreadsheets (#1954) 2021-08-16 17:01:33 +02:00
Marcin Kostrzewa
ad0b677ed8
Entry point for writing tables (#1946) 2021-08-12 15:16:24 +02:00
Ara Adkins
eb7e7d0872
Propagate dataflow errors in host and polyglot (#1941) 2021-08-11 17:05:23 +01:00
Marcin Kostrzewa
ca8252c9cf
Table to JSON serialization (#1937) 2021-08-10 15:35:51 +02:00
Ara Adkins
4e9043c395
Make the time types orderable (#1916) 2021-08-02 15:10:00 +01:00
Marcin Kostrzewa
9ce6eb0560
Write XLSX files (#1906) 2021-07-28 13:51:27 +02:00
Marcin Kostrzewa
ca52757c10
CSV Writing (#1894) 2021-07-22 15:13:00 +02:00
Marcin Kostrzewa
f55d66cb2c
XLS(X) Reading (#1879) 2021-07-20 13:32:19 +02:00
Marcin Kostrzewa
334a022ffd
Import syntax including namespace (#1806) 2021-06-24 12:42:24 +02:00
Ara Adkins
5a3775e028
Add syntactic support for conversion definitions (#1815) 2021-06-23 18:29:13 +01:00
Marcin Kostrzewa
b4709ab529
Default visualization definitions (#1786) 2021-06-08 08:12:02 +02:00
Ara Adkins
3890abe6fa
Update the protocol to support streaming files (#1757) 2021-05-26 15:08:41 +01:00
Ara Adkins
a8fce2421a
Remove the EPB GIL for GraalPython (#1747) 2021-05-20 08:29:22 +01:00
Dmitry Bushev
1a6b67d361
Add a .to_json conversion for Error (#1742) 2021-05-14 14:22:51 +01:00
Ara Adkins
48bcebc723
Update to GraalVM 21.1.0 (#1738) 2021-05-14 13:08:39 +01:00
Ara Adkins
c4c483683e
Improve error types in the standard library (#1734) 2021-05-11 10:19:30 +01:00
Ara Adkins
74b1fe9d23
Finish updating the standard library examples (#1731) 2021-05-06 16:55:26 +01:00
Dmitry Bushev
46725e07c3
Remove reflective access when loading OpenCV (#1727) 2021-05-05 17:26:01 +01:00
Dmitry Bushev
24d299d90e
HTTP Library Updates (#1722)
Misc fixes to HTTP library
2021-05-04 18:59:45 +03:00
Ara Adkins
66599fda25
Enhance examples for Standard.Base.* (#1714) 2021-05-04 09:49:53 +01:00
Ara Adkins
6060d31c79
Update examples for Standard.Base.Data.* (#1707) 2021-04-29 11:27:16 +01:00
Ara Adkins
3080d8f6f7
Add .sum to Vector (#1702) 2021-04-28 10:47:57 +01:00
Ara Adkins
170514b9d2
Fix some naming for Maybe (#1666) 2021-04-13 11:38:59 +01:00
Ara Adkins
8b0588939e
Fix some implementations for the Vector constructors (#1650) 2021-04-06 20:06:34 +01:00
Radosław Waśko
117ca51921
Improve how indexing in Table works (#1643) 2021-04-01 14:39:31 +01:00
Ara Adkins
9585080ab8
Clean up the standard library docs (#1641) 2021-04-01 12:20:36 +01:00
Michał Wawrzyniec Urbańczyk
8d77a565eb
Case Insensitive Dataframe Support in Visualizations (#1634)
Ref https://github.com/enso-org/ide/issues/1391
2021-04-01 10:05:17 +02:00
Radosław Waśko
444ae39d28
Inlining Helper for Benchmarks (#1638) 2021-03-31 17:11:34 +02:00
Dmitry Bushev
5cfd9284be
Convert GeoJSON to Table (#1632) 2021-03-30 15:06:22 +01:00
Ara Adkins
6ee0c19d53
Implement additional methods for table (#1628) 2021-03-29 17:34:06 +01:00
Radosław Waśko
301672df24
Fix a Bug in the Database Join Implementation (#1614) 2021-03-26 00:34:16 +01:00
Michał Wawrzyniec Urbańczyk
5b57960da3
Histogram and Scatterplot visualizations support for Table (#1608) 2021-03-25 17:47:22 +01:00
Dmitry Bushev
534ed305fc
Image Processing Library Prototype (#1450)
Add the Standard.Image library.
2021-03-23 13:16:43 +03:00
Marcin Kostrzewa
d97c7f51a4
Make the process library more IDE-friendly (#1591) 2021-03-18 15:45:02 +00:00
Radosław Waśko
49b30f2e9d
Database Visualization Support (#1582) 2021-03-18 14:28:52 +01:00
Ara Adkins
e4e16a3da3
Fix the array visualisation and misc crashes (#1588) 2021-03-17 16:34:53 +00:00
Radosław Waśko
21f667323e
PostgreSQL Support in Database Library (#1565)
Co-authored-by: Marcin Kostrzewa <marckostrzewa@gmail.com>
2021-03-16 17:53:04 +01:00
Ara Adkins
96697ddc97
Fix a crash due to shadowed project names (#1571) 2021-03-16 12:45:19 +00:00
Radosław Waśko
5f8af886e5
Connection and Materialization in the Database Library (#1546) 2021-03-09 19:52:42 +01:00
Marcin Kostrzewa
f298fbd3cf
R Interop (#1559) 2021-03-09 16:19:05 +01:00