enso-org/enso

mirror of https://github.com/enso-org/enso.git synced 2024-12-02 14:33:27 +03:00

Author	SHA1	Message	Date
James Dunkerley	8afba43add	Implement In-Memory Table order_by (#3515 ) Implemented the `order_by` function with support for all modes of operation. Added support for case insensitive natural order. # Important Notes - Improved MultiValueIndex/Key to not create loads of arrays. - Adjusted HashCode for MultiValueKey to have a simple algorithm. - Added Text_Utils.compare_normalized_ignoring_case to allow for case insensitive comparisons. - Fixed issues with ObjectComparator and added some unit tests for it.	2022-06-08 12:30:50 +00:00
Radosław Waśko	2af970fe52	Basic changes to File_Format (#3516 ) Implements https://www.pivotaltracker.com/story/show/182308987	2022-06-08 09:53:18 +00:00
Radosław Waśko	a382e0c15e	Improve database `Table.order_by` (#3514 ) Implements https://www.pivotaltracker.com/story/show/182195405 Adds support for the Postgres dialect and simple case insensitive collation for SQLite.	2022-06-07 12:31:55 +00:00
Radosław Waśko	7d94efa6f2	Implement `Table.order_by` for SQLite and the common scaffolding for all backends (#3502 ) Implements the common and SQLite parts of https://www.pivotaltracker.com/story/show/182195405	2022-06-06 10:56:52 +00:00
James Dunkerley	ba5d6823a9	Merge the Unique Name Strategy with NameDeduplicator (#3490 ) - Merge the two approaches and make them consistent - Add warning support into Reader # Important Notes - Added support for JUnit format XML generation on tests. Use `ENSO_TEST_JUNIT_DIR`	2022-06-01 12:52:23 +00:00
Radosław Waśko	f0f3a343eb	Adjust Table.sort_columns to use Text_Ordering design (#3487 ) Implements https://www.pivotaltracker.com/story/show/182195306	2022-05-30 12:26:29 +00:00
Radosław Waśko	db611e1581	Remove obsolete `Csv` reading module (#3482 ) Completes https://www.pivotaltracker.com/story/show/182037405 # Important Notes - Some tests had to be adapted to the new parsing logic.	2022-05-28 10:01:14 +00:00
Radosław Waśko	8828d801ea	Implement Table from Text conversion (#3478 ) Implements https://www.pivotaltracker.com/story/show/181824168	2022-05-26 12:04:25 +00:00
Radosław Waśko	7f572bf3e4	The user should be able to have the headers Inferred when reading a Delimited file (#3472 ) Implements https://www.pivotaltracker.com/story/show/181986831	2022-05-25 13:29:17 +00:00
Radosław Waśko	ec1b072824	Integrate value parsing with Delimited file reading (#3463 ) Implements https://www.pivotaltracker.com/story/show/182200028	2022-05-24 17:59:00 +02:00
Radosław Waśko	ff7700ebb1	Automatic inference of value types when parsing table columns (#3462 ) Implements https://www.pivotaltracker.com/story/show/182199966	2022-05-20 15:08:36 +00:00
Radosław Waśko	8430ce2625	Parsing values with known types (#3455 ) Implements https://www.pivotaltracker.com/story/show/181824146	2022-05-18 15:27:48 +00:00
Hubert Plociniczak	6b6b1430bc	Cleanup Ref - get/put (#3457 ) The change promotes static methods of `Ref`, `get` and `put`, to be methods of `Ref` type. The change also removes `Ref` module from the default namespace. Had to mostly c&p functional dispatch for now, in order for the methods to be found. Will auto-generate that code as part of builtins system. Related to https://www.pivotaltracker.com/story/show/182138899	2022-05-17 10:26:36 +00:00
James Dunkerley	4f3a76817c	Statistics on a Vector (#3442 ) - Implements various statistics on Vector # Important Notes Some minor codebase improvements: - Some tweaks to Any/Nothing to improve performance - Fixed bug in ObjectComparator - Added if_nothing - Removed Group_By_Key	2022-05-11 13:25:06 +00:00
Radosław Waśko	64f178f7a8	Delimited File Encoding (#3430 ) Implements https://www.pivotaltracker.com/story/show/181998375	2022-05-10 22:44:05 +00:00
James Dunkerley	078c665a60	File_Format.Excel work (#3425 ) - Read in Excel files following the specification. - Support for XLSX and XLS formats. - Ability to select ranges and sheets. - Skip Rows and Row Limits. # Important Notes - Minor fix to DelimitedReader for Windows	2022-05-06 13:21:10 +00:00
Radosław Waśko	8219dca400	Improve support for reading Delimited files (#3424 ) Implements https://www.pivotaltracker.com/story/show/181823957	2022-04-29 17:12:19 +00:00
Radosław Waśko	14257d07aa	Data analysts should be able to use `Text.split`, `Text.lines` and `Text.words` to break up strings (#3415 ) Implements https://www.pivotaltracker.com/story/show/181266184 ### Important Notes Changed example image download to only proceed if the file did not exist before - thus cutting on the build time (the build used to download it _every_ time - which completely failed the build if network is down). A redownload can be forced by performing a fresh repository checkout.	2022-04-26 17:22:53 +02:00
James Dunkerley	5a6b6749cc	Restructuring for File.read (#3390 ) - Added Encoding type - Added `Text.bytes`, `Text.from_bytes` with Encoding support - Renamed `File.read` to `File.read_text` - Renamed `File.write` to `File.write_text` - Added Encoding support to `File.read_text` and `File.write_text` - Added warnings to invalid encodings	2022-04-19 16:50:03 +00:00
Radosław Waśko	0ea5dc2a6f	Data analysts should be able to use `Text.replace` to substitute parts of the text (#3393 ) Implements https://www.pivotaltracker.com/story/show/181266274	2022-04-13 19:21:47 +00:00
Radosław Waśko	891f064a6a	Extend Aggregate_Spec test suite with tests for missed edge-cases to ensure the feature is well-tested on all backends (#3383 ) Implements https://www.pivotaltracker.com/story/show/181805693 and finishes the basic set of features of the Aggregate component. Still not all aggregations are supported everywhere, because for example SQLite has quite limited support for aggregations. Currently the workaround is to bring the table into memory (if possible) and perform the computation locally. Later on, we may add more complex generator features to emulate the missing aggregations with complex sub-queries.	2022-04-12 11:02:01 +00:00
James Dunkerley	bade0c31de	First and Last ordering (#3380 ) Add the missing `order_by` support to First and Last aggregations for InMemory table.	2022-04-06 12:36:46 +00:00
Radosław Waśko	a71db71645	Adding most of remaining aggregates to Database Table (#3375 )	2022-04-06 10:06:50 +00:00
James Dunkerley	a4dbc9a37b	Moving Aggregation to Java (#3364 )	2022-04-04 09:12:48 +00:00
Radosław Waśko	43265f10a8	Implement Error-Handling for Database aggregations, unify some error helpers across backends (#3371 )	2022-03-31 12:10:22 +00:00
Radosław Waśko	20be5516a5	Aggregates in the Database library - MVP (#3353 ) Implements infrastructure for new aggregations in the Database. It comes with only some basic aggregations and limited error-handling. More aggregations and problem handling will be added in subsequent PRs. # Important Notes This introduces basic aggregations using our existing codegen and sets-up our testing infrastructure to be able to use the same aggregate tests as in-memory backend for the database backends. Many aggregations are not yet implemented - they will be added in subsequent tasks. There are some TODOs left - they will be addressed in the next tasks.	2022-03-28 15:51:37 +00:00
James Dunkerley	02bcfbb2a8	Refactor Aggregate Column (#3349 ) - Make it easier to understand the computations. - Fix issue with First. - Improve quote handling in Concatenate - Added validation and warnings to input	2022-03-22 18:18:46 +00:00
James Dunkerley	6c1c4554f5	Refactor table.group_by to table.aggregate (#3339 ) Following UX work move to `table.aggregate` function.	2022-03-15 15:23:36 +01:00
Radosław Waśko	dedd1eac96	Refactor library warnings to use the new system (#3337 ) Implements https://www.pivotaltracker.com/story/show/181536964	2022-03-15 12:52:57 +01:00
James Dunkerley	65465fb8ef	Restructuring the Faker type and creating tests for Group_By (#3318 ) - Added Minimum, Maximum, Longest, Shortest, Mode, Percentile - Added first and last to Map - Restructured Faker type more in line with FakerJS - Created 2,500 row data set - Tests for group_by - Performance tests for group_by	2022-03-09 10:31:02 +00:00
James Dunkerley	738a691662	Table.group_by (#3305 ) Functioning group_by based on Enso Map. # Important Notes This is an initial version which will be used to establish the API. The grouping map will need to be moved to Java code for performance.	2022-03-01 16:18:11 +00:00
Radosław Waśko	b03416f907	Update Column_Selector and Column_Mapping to use Matcher over Matching_Strategy (#3299 ) Implements https://www.pivotaltracker.com/story/show/181339748	2022-02-25 18:39:10 +00:00
Radosław Waśko	ae9d51555f	Data analysts should be able to use `Text.contains` to check for substring using various matcher techniques. (#3285 ) * Add matching mode definitions * Add stub for new method API and an initial test suite * Fix tests, implement exact matching * Implement Regex matching * changelog * Add benchmarks * Workaround for case insensitive regex locale support * minor tweaks * Unify Case_Insensitive * Update edge cases * Fix other affected places * minor style change * Add a problematic test * Add a regex test for a similar situation * Migrate to StringSearch * Add test cases for scharfes S edge case * Add problematic Regex Unicode normalization test * Document the regex accents peculiarity * Do not apply the normalization in ASCII only mode * cr	2022-02-22 15:41:56 +00:00
James Dunkerley	1814d3c4f1	Data analysts should be able to transform a Table using the rename_columns functions (#3249 ) * Implement Natural_Order and sort_columns * Starting on Rename Align Column_Mapping Add By_Position Separating off the validation for By_Index so can reuse for rename By_Position implemented By_Index implemented Adjusted behaviour following discussion with Ned, so that renames dominate untouched columns. Moving to validation style checks for problems Putting accumulator back Rename work * Add Range.find * More work * Regex support Tidy of Unique Name Strategy * Fix Regex support * Warning messages Tests for Unique Naming Strategy Table rename working * Database Table rename_columns Fix for Table Must follow up on slice * Some tests * More tests * Complete test set (and associated fixes) * Functional use_first_row_as_names Tests to go... * Test for use_first_row_as_names * Change log * trailing space Co-authored-by: Radosław Waśko <radoslaw.wasko@enso.org>	2022-02-11 10:18:51 +00:00
Radosław Waśko	8b24336604	Data analysts should be able to reorder columns into name order using sort_columns functions (#3250 )	2022-02-08 17:28:46 +01:00
Radosław Waśko	d3c0f968fa	Data analysts should be able to transform a Table using the remove_columns and reorder_columns functions (#3240 )	2022-02-03 15:18:47 +01:00
Radosław Waśko	b5fc87e618	Data analysts should be able to transform a Table using the select_columns function (#3230 ) * Utility for mapping errors and warnings * Implement By_Index * Expose select_columns in InMem and DB. Need testing * checkpoint: writing tests * Fix minor issues, mock warning mapping for testing purposes * Improve By_Index error handling * A helper for testing problem handling * More error handling * docs * changelog * Fix matching test * Add SQLite tests * cleanup after test * Rework problem handling * small refactor * add examples * Add more test cases for regex matching * Fix Regex.Pattern.matches to match full string * "Fix" tests	2022-02-02 09:04:06 +00:00
Radosław Waśko	107128aeec	A library developer should be able to select matching names given a list (#3220 )	2022-01-20 11:11:43 +01:00
Ara Adkins	337f6c8ad4	Implement linear regression on tables (#2003 )	2021-09-29 15:33:18 +01:00
Marcin Kostrzewa	4f4e472ddf	Statistical functions (#1990 )	2021-09-06 14:48:09 +02:00
Ara Adkins	c12cab9bd9	Add `Column.set_index` (#1982 )	2021-09-02 10:30:02 +01:00
Marcin Kostrzewa	4536ed9f9b	Stdlib Improvements (#1963 )	2021-08-19 14:55:15 +02:00
Marcin Kostrzewa	98eab2873e	Allow specifying a cell range when reading spreadsheets (#1954 )	2021-08-16 17:01:33 +02:00
Marcin Kostrzewa	ad0b677ed8	Entry point for writing tables (#1946 )	2021-08-12 15:16:24 +02:00
Marcin Kostrzewa	ca8252c9cf	Table to JSON serialization (#1937 )	2021-08-10 15:35:51 +02:00
Marcin Kostrzewa	9ce6eb0560	Write XLSX files (#1906 )	2021-07-28 13:51:27 +02:00
Marcin Kostrzewa	ca52757c10	CSV Writing (#1894 )	2021-07-22 15:13:00 +02:00
Marcin Kostrzewa	f55d66cb2c	XLS(X) Reading (#1879 )	2021-07-20 13:32:19 +02:00
Marcin Kostrzewa	334a022ffd	Import syntax including namespace (#1806 )	2021-06-24 12:42:24 +02:00
Marcin Kostrzewa	b4709ab529	Default visualization definitions (#1786 )	2021-06-08 08:12:02 +02:00
Ara Adkins	c4c483683e	Improve error types in the standard library (#1734 )	2021-05-11 10:19:30 +01:00
Ara Adkins	6060d31c79	Update examples for Standard.Base.Data.* (#1707 )	2021-04-29 11:27:16 +01:00
Radosław Waśko	117ca51921	Improve how indexing in Table works (#1643 )	2021-04-01 14:39:31 +01:00
Ara Adkins	9585080ab8	Clean up the standard library docs (#1641 )	2021-04-01 12:20:36 +01:00
Dmitry Bushev	5cfd9284be	Convert GeoJSON to Table (#1632 )	2021-03-30 15:06:22 +01:00
Ara Adkins	6ee0c19d53	Implement additional methods for table (#1628 )	2021-03-29 17:34:06 +01:00
Radosław Waśko	49b30f2e9d	Database Visualization Support (#1582 )	2021-03-18 14:28:52 +01:00
Ara Adkins	96697ddc97	Fix a crash due to shadowed project names (#1571 )	2021-03-16 12:45:19 +00:00
Radosław Waśko	5f8af886e5	Connection and Materialization in the Database Library (#1546 )	2021-03-09 19:52:42 +01:00
Marcin Kostrzewa	3dd348c1be	Table: Fix bool column sorting (#1505 )	2021-02-24 17:36:24 +01:00
Marcin Kostrzewa	14dd4006bb	Table API: concatenation, index access, column aggregation, API unification (#1489 )	2021-02-18 16:00:19 +01:00
Marcin Kostrzewa	05945ede90	Table Visualization Fixes (#1476 )	2021-02-15 09:55:54 +01:00
Marcin Kostrzewa	93b6680d4f	Sorting Tables (#1471 )	2021-02-11 16:50:07 +01:00
Ara Adkins	af1aab35aa	Improve dataflow errors in the standard library (#1446 )	2021-02-02 12:31:33 +00:00
Marcin Kostrzewa	197190ceeb	Remove UFCS (#1398 )	2021-01-14 21:53:04 +01:00
Marcin Kostrzewa	b751dfb3ec	Table: grouping (#1392 )	2021-01-11 17:05:06 +01:00
Radosław Waśko	58346917eb	Implement Some Vectorized Text Operations And Dropping Missing (#1381 )	2021-01-04 14:24:08 +01:00
Radosław Waśko	ab51bffd87	Implement fill_missing (#1372 )	2020-12-22 23:10:27 +01:00
Marcin Kostrzewa	bf37754428	Table: maps, zips & more builtins (#1356 )	2020-12-16 11:23:23 +01:00
Marcin Kostrzewa	a40989e7c6	Table: Indexes & Joins (#1317 )	2020-11-30 16:21:55 +01:00
Marcin Kostrzewa	ab2c5ed097	Tables: column mapping & masking (#1297 )	2020-11-18 15:09:43 +01:00
Marcin Kostrzewa	f420dd3702	Rename Unit to Nothing (#1269 )	2020-11-06 12:44:11 +01:00
Marcin Kostrzewa	150771c0e2	Simple CSV parser (#1268 )	2020-11-05 16:53:50 +01:00

423 Commits