enso-org/enso - enso - gitea: Gitea Service

mirror of https://github.com/enso-org/enso.git synced 2024-12-25 06:44:32 +03:00

Author	SHA1	Message	Date
James Dunkerley	7a2d304fa0	Update Excel reading API (#3523 ) - Remove `from_xls` and `from_xlsx`. - Add `headers` support to `File_Format.Excel`. - Altered default read for Excel to be the first sheet. - Altered behavior so that single cells grow down and right when reading sheet. - Altered `Excel_Range` so knows if single cell or 1x1 range address. # Important Notes - Renamed `Range` to `Cell_Range` to avoid name clash.	2022-06-21 13:39:32 +00:00
James Dunkerley	a0c6fa9c96	Removing old functions and tidy up of Table types (#3519 ) - Removed `select` method. - Removed `group` method. - Removed `Aggregate_Table` type. - Removed `Order_Rule` type. - Removed `sort` method from Table. - Expanded comments on `order_by`. - Update comment on `aggregate` on Database. - Update Visualisation to use new APIs. - Updated Data Science examples to use new APIs. - Moved Examples test out of Tests to own test. # Important Notes Need to get Examples_Tests added to CI.	2022-06-14 13:37:20 +00:00
James Dunkerley	e97d27e1e0	Adjusting First and Last order_by to use Sort_Column_Selector (#3517 )	2022-06-10 09:59:03 +00:00
James Dunkerley	8afba43add	Implement In-Memory Table order_by (#3515 ) Implemented the `order_by` function with support for all modes of operation. Added support for case insensitive natural order. # Important Notes - Improved MultiValueIndex/Key to not create loads of arrays. - Adjusted HashCode for MultiValueKey to have a simple algorithm. - Added Text_Utils.compare_normalized_ignoring_case to allow for case insensitive comparisons. - Fixed issues with ObjectComparator and added some unit tests for it.	2022-06-08 12:30:50 +00:00
Radosław Waśko	2af970fe52	Basic changes to File_Format (#3516 ) Implements https://www.pivotaltracker.com/story/show/182308987	2022-06-08 09:53:18 +00:00
James Dunkerley	ba5d6823a9	Merge the Unique Name Strategy with NameDeduplicator (#3490 ) - Merge the two approaches and makes them consistent - Add warning support into Reader # Important Notes - Added support for JUnit format XML generation on tests. Use `ENSO_TEST_JUNIT_DIR`	2022-06-01 12:52:23 +00:00
James Dunkerley	1aa0bb3552	Rank Data, Correlation, Covariance, R Squared (#3484 ) - Added new `Statistic`s: Covariance, Pearson, Spearman, R Squared - Added `covariance_matrix` function - Added `pearson_correlation` function to compute correlation matrix - Added `rank_data` and Rank_Method type to create rankings of a Vector - Added `spearman_correlation` function to compute Spearman Rank correlation matrix # Important Notes - Added `Panic.throw_wrapped_if_error` and `Panic.handle_wrapped_dataflow_error` to help with errors within a loop. - Removed `Array.set_at` use from `Table.Vector_Builder`	2022-05-30 17:13:06 +00:00
Radosław Waśko	db611e1581	Remove obsolete `Csv` reading module (#3482 ) Completes https://www.pivotaltracker.com/story/show/182037405 # Important Notes - Some tests had to be adapted to the new parsing logic.	2022-05-28 10:01:14 +00:00
Radosław Waśko	7f572bf3e4	The user should be able to have the headers Inferred when reading a Delimited file (#3472 ) Implements https://www.pivotaltracker.com/story/show/181986831	2022-05-25 13:29:17 +00:00
Hubert Plociniczak	4918ccb5a3	Make sure formatting is applied to std-bits projects (#3477 ) @radeusgd discovered that no formatting was being applied to std-bits projects. This was caused by the fact that `enso` project didn't aggregate them. Compilation and packaging still worked because one relied on the output of some tasks but `sbt> javafmtAll` didn't apply it to `std-bits`. # Important Notes Apart from `build.sbt` no manual changes were made.	2022-05-25 09:26:50 +00:00
Radosław Waśko	ec1b072824	Integrate value parsing with Delimited file reading (#3463 ) Implements https://www.pivotaltracker.com/story/show/182200028	2022-05-24 17:59:00 +02:00
Radosław Waśko	ff7700ebb1	Automatic inference of value types when parsing table columns (#3462 ) Implements https://www.pivotaltracker.com/story/show/182199966	2022-05-20 15:08:36 +00:00
Radosław Waśko	8430ce2625	Parsing values with known types (#3455 ) Implements https://www.pivotaltracker.com/story/show/181824146	2022-05-18 15:27:48 +00:00
James Dunkerley	4f3a76817c	Statistics on a Vector (#3442 ) - Implements various statistics on Vector # Important Notes Some minor codebase improvements: - Some tweaks to Any/Nothing to improve performance - Fixed bug in ObjectComparator - Added if_nothing - Removed Group_By_Key	2022-05-11 13:25:06 +00:00
Radosław Waśko	64f178f7a8	Delimited File Encoding (#3430 ) Implements https://www.pivotaltracker.com/story/show/181998375	2022-05-10 22:44:05 +00:00
James Dunkerley	078c665a60	File_Format.Excel work (#3425 ) - Read in Excel files following the specification. - Support for XLSX and XLS formats. - Ability to select ranges and sheets. - Skip Rows and Row Limits. # Important Notes - Minor fix to DelimitedReader for Windows	2022-05-06 13:21:10 +00:00
Radosław Waśko	8219dca400	Improve support for reading Delimited files (#3424 ) Implements https://www.pivotaltracker.com/story/show/181823957	2022-04-29 17:12:19 +00:00
Radosław Waśko	14257d07aa	Data analysts should be able to use `Text.split`, `Text.lines` and `Text.words` to break up strings (#3415 ) Implements https://www.pivotaltracker.com/story/show/181266184 ### Important Notes Changed example image download to only proceed if the file did not exist before - thus cutting on the build time (the build used to download it _every_ time - which completely failed the build if network is down). A redownload can be forced by performing a fresh repository checkout.	2022-04-26 17:22:53 +02:00
James Dunkerley	5a6b6749cc	Restructuring for File.read (#3390 ) - Added Encoding type - Added `Text.bytes`, `Text.from_bytes` with Encoding support - Renamed `File.read` to `File.read_text` - Renamed `File.write` to `File.write_text` - Added Encoding support to `File.read_text` and `File.write_text` - Added warnings to invalid encodings	2022-04-19 16:50:03 +00:00
Radosław Waśko	0ea5dc2a6f	Data analysts should be able to use `Text.replace` to substitute parts of the text (#3393 ) Implements https://www.pivotaltracker.com/story/show/181266274	2022-04-13 19:21:47 +00:00
Radosław Waśko	891f064a6a	Extend Aggregate_Spec test suite with tests for missed edge-cases to ensure the feature is well-tested on all backends (#3383 ) Implements https://www.pivotaltracker.com/story/show/181805693 and finishes the basic set of features of the Aggregate component. Still not all aggregations are supported everywhere, because for example SQLite has quite limited support for aggregations. Currently the workaround is to bring the table into memory (if possible) and perform the computation locally. Later on, we may add more complex generator features to emulate the missing aggregations with complex sub-queries.	2022-04-12 11:02:01 +00:00
James Dunkerley	bade0c31de	First and Last ordering (#3380 ) Add the missing `order_by` support to First and Last aggregations for InMemory table.	2022-04-06 12:36:46 +00:00
James Dunkerley	a4dbc9a37b	Moving Aggregation to Java (#3364 )	2022-04-04 09:12:48 +00:00
James Dunkerley	02bcfbb2a8	Refactor Aggregate Column (#3349 ) - Make it easier to understand the computations. - Fix issue with First. - Improve quote handling in Concatenate - Added validation and warnings to input	2022-03-22 18:18:46 +00:00
Radosław Waśko	247b284316	Data analysts should be able to use `Text.location_of` to find indexes within string using various matchers (#3324 ) Implements https://www.pivotaltracker.com/n/projects/2539304/stories/181266029	2022-03-12 19:42:00 +00:00
Hubert Plociniczak	ac5c02ed8c	Use `.isEmpty()` instead of `.length() == 0` (#3314 ) Minor nit - since String is a CharSequence it's advisable to use the corresponding method for checking the condition rather than writing it by hand.	2022-03-04 16:41:48 +01:00
Radosław Waśko	40c851bf8b	Text.pad and Text.trim (#3309 ) Implements https://www.pivotaltracker.com/story/show/181265516	2022-03-02 17:19:39 +00:00
Radosław Waśko	2ae636f63c	Data analysts should be able to use `Text.starts_with` and `Text.ends_with` (#3292 ) Implements https://www.pivotaltracker.com/story/show/181265900	2022-02-23 16:48:33 +00:00
James Dunkerley	2e2c5562a8	Text.take and Text.drop (#3287 ) Implementation of the Text take and drop APIs - Added `Range.contains` function - Added `Text_Sub_Range` type - Added `Text_Utils.index_of` and `Text_Utils.last_index_of` based on ICU StringSearcher	2022-02-22 18:50:59 +00:00
Radosław Waśko	ae9d51555f	Data analysts should be able to use `Text.contains` to check for substring using various matcher techniques. (#3285 ) * Add matching mode definitions * Add stub for new method API and an initial test suite * Fix tests, implement exact matching * Implement Regex matching * changelog * Add benchmarks * Workaround for case insensitive regex locale support * minor tweaks * Unify Case_Insensitive * Update edge cases * Fix other affected places * minor style change * Add a problematic test * Add a regex test for a similar situation * Migrate to StringSearch * Add test cases for scharfes S edge case * Add problematic Regex Unicode normalization test * Document the regex accents peculiarity * Do not apply the normalization in ASCII only mode * cr	2022-02-22 15:41:56 +00:00
Radosław Waśko	14f57271a2	Ensure that `Text.compare_to` compares strings according to grapheme clusters (#3282 ) https://www.pivotaltracker.com/story/show/181175238	2022-02-17 17:09:41 +00:00
James Dunkerley	1814d3c4f1	Data analysts should be able to transform a Table using the rename_columns functions (#3249 ) * Implement Natural_Order and sort_columns * Starting on Rename Align Column_Mapping Add By_Position Separating off the validation for By_Index so can reuse for rename By_Position implemented By_Index implemented Adjusted behaviour following discussion with Ned, so that renames dominate untouched columns. Moving to validation style checks for problems Putting accumulator back Rename work * Add Range.find * More work * Regex support Tidy of Unique Name Strategy * Fix Regex support * Warning messages Tests for Unique Naming Strategy Table rename working * Database Table rename_columns Fix for Table Must follow up on slice * Some tests * More tests * Complete test set (and associated fixes) * Functional use_first_row_as_names Tests to go... * Test for use_first_row_as_names * Change log * trailing space Co-authored-by: Radosław Waśko <radoslaw.wasko@enso.org>	2022-02-11 10:18:51 +00:00
Marcin Kostrzewa	a81257b402	Google Spreadsheet Reading (#1976 )	2021-09-03 21:41:12 +02:00
Ara Adkins	c12cab9bd9	Add `Column.set_index` (#1982 )	2021-09-02 10:30:02 +01:00
Marcin Kostrzewa	b73e5e84b3	Redshift Connector (#1985 )	2021-09-02 11:28:49 +02:00
Ara Adkins	c18fe2d750	Provide regex support on `Text` (#1968 )	2021-08-23 12:09:51 +01:00
Marcin Kostrzewa	4536ed9f9b	Stdlib Improvements (#1963 )	2021-08-19 14:55:15 +02:00
Marcin Kostrzewa	98eab2873e	Allow specifying a cell range when reading spreadsheets (#1954 )	2021-08-16 17:01:33 +02:00
Marcin Kostrzewa	7c45b92462	Allow malformed CSVs with too many headers in the CSV parser (#1942 )	2021-08-12 10:47:28 +02:00
Ara Adkins	7fe27ad6ff	Fix a bounds-checking bug in CSV parsing (#1914 )	2021-08-02 13:00:13 +01:00
Marcin Kostrzewa	9ce6eb0560	Write XLSX files (#1906 )	2021-07-28 13:51:27 +02:00
Marcin Kostrzewa	ca52757c10	CSV Writing (#1894 )	2021-07-22 15:13:00 +02:00
Marcin Kostrzewa	f55d66cb2c	XLS(X) Reading (#1879 )	2021-07-20 13:32:19 +02:00
Marcin Kostrzewa	334a022ffd	Import syntax including namespace (#1806 )	2021-06-24 12:42:24 +02:00
Dmitry Bushev	46725e07c3	Remove reflective access when loading OpenCV (#1727 )	2021-05-05 17:26:01 +01:00
Ara Adkins	6060d31c79	Update examples for Standard.Base.Data.* (#1707 )	2021-04-29 11:27:16 +01:00
Radosław Waśko	117ca51921	Improve how indexing in Table works (#1643 )	2021-04-01 14:39:31 +01:00
Ara Adkins	6ee0c19d53	Implement additional methods for table (#1628 )	2021-03-29 17:34:06 +01:00
Dmitry Bushev	565d74188b	fix: color conversion (#1612 ) Update the visualization function with the BGR-> RGB conversion.	2021-03-25 08:35:59 +03:00
Dmitry Bushev	534ed305fc	Image Processing Library Prototype (#1450 ) Add the Standard.Image library.	2021-03-23 13:16:43 +03:00
Radosław Waśko	49b30f2e9d	Database Visualization Support (#1582 )	2021-03-18 14:28:52 +01:00
Radosław Waśko	21f667323e	PostgreSQL Support in Database Library (#1565 ) Co-authored-by: Marcin Kostrzewa <marckostrzewa@gmail.com>	2021-03-16 17:53:04 +01:00
Ara Adkins	96697ddc97	Fix a crash due to shadowed project names (#1571 )	2021-03-16 12:45:19 +00:00
Radosław Waśko	6544c2478d	Implement the first part of the database library (#1475 )	2021-02-25 13:48:18 +00:00
Radosław Waśko	58346917eb	Implement Some Vectorized Text Operations And Dropping Missing (#1381 )	2021-01-04 14:24:08 +01:00
Ara Adkins	2c12a18b25	Implement sorting for `Vector` (#1349 )	2020-12-15 14:20:59 +00:00
Ara Adkins	e62f6796fe	Add the ability to split Text on word boundaries (#1302 )	2020-11-20 13:29:34 +00:00
Ara Adkins	fbe1f4c439	Implement better splitting for Text (#1298 )	2020-11-19 13:28:03 +00:00
Marcin Kostrzewa	ab2c5ed097	Tables: column mapping & masking (#1297 )	2020-11-18 15:09:43 +01:00
Ara Adkins	bc8a22e279	Add further standard library improvements (#1290 )	2020-11-16 12:56:31 +00:00
Dmitry Bushev	11e4241921	HTTP Library (#1220 ) Add `Base.Net.Http` library	2020-10-27 14:45:10 +03:00
Marcin Kostrzewa	c0de753d95	JSON Library (#1241 )	2020-10-23 14:16:48 +02:00
Marcin Kostrzewa	207aaaccf5	Map Implementation (#1222 )	2020-10-20 13:43:04 +02:00
Radosław Waśko	0a9e2a42ce	Automate License Information Gathering (#1198 )	2020-10-09 16:19:58 +02:00
Marcin Kostrzewa	05f4cc2e7c	Files API (#1204 )	2020-10-09 14:05:22 +02:00
Dmitry Bushev	72bf87c648	Implement Enso Time Library (#1171 ) Add `Base.Time` module. The module wraps `java.time` data types and provides utility Enso methods to work with them.	2020-10-09 10:40:54 +03:00
Marcin Kostrzewa	a1748c3978	Enso's Text Type (#1166 )	2020-09-30 13:33:57 +02:00

117 Commits