44 Commits

Author SHA1 Message Date
James Dunkerley
65465fb8ef
Restructuring the Faker type and creating tests for Group_By (#3318)
- Added Minimum, Maximum, Longest, Shortest, Mode, Percentile
- Added first and last to Map
- Restructured Faker type to be more in line with FakerJS
- Created 2,500 row data set
- Tests for group_by
- Performance tests for group_by
2022-03-09 10:31:02 +00:00
James Dunkerley
738a691662
Table.group_by (#3305)
Functioning group_by based on Enso Map.

# Important Notes
This is an initial version which will be used to establish the API.
The grouping map will need to be moved to Java code for performance.
2022-03-01 16:18:11 +00:00
Radosław Waśko
b03416f907
Update Column_Selector and Column_Mapping to use Matcher over Matching_Strategy (#3299)
Implements https://www.pivotaltracker.com/story/show/181339748
2022-02-25 18:39:10 +00:00
Radosław Waśko
ae9d51555f
Data analysts should be able to use Text.contains to check for substring using various matcher techniques. (#3285)
* Add matching mode definitions

* Add stub for new method API and an initial test suite

* Fix tests, implement exact matching

* Implement Regex matching

* changelog

* Add benchmarks

* Workaround for case-insensitive regex locale support

* minor tweaks

* Unify Case_Insensitive

* Update edge cases

* Fix other affected places

* minor style change

* Add a problematic test

* Add a regex test for a similar situation

* Migrate to StringSearch

* Add test cases for scharfes S edge case

* Add problematic Regex Unicode normalization test

* Document the regex accents peculiarity

* Do not apply the normalization in ASCII only mode

* cr
2022-02-22 15:41:56 +00:00
James Dunkerley
1814d3c4f1
Data analysts should be able to transform a Table using the rename_columns functions (#3249)
* Implement Natural_Order and sort_columns

* Starting on Rename

Align Column_Mapping

Add By_Position
Separating off the validation for By_Index so it can be reused for rename

By_Position implemented

By_Index implemented
Adjusted behaviour following discussion with Ned, so that renames dominate untouched columns.

Moving to validation style checks for problems

Putting accumulator back

Rename work

* Add Range.find

* More work

* Regex support
Tidy of Unique Name Strategy

* Fix Regex support

* Warning messages
Tests for Unique Naming Strategy
Table rename working

* Database Table rename_columns
Fix for Table
**Must follow up on slice**

* Some tests

* More tests

* Complete test set
(and associated fixes)

* Functional use_first_row_as_names
Tests to go...

* Test for use_first_row_as_names

* Change log

* trailing space

Co-authored-by: Radosław Waśko <radoslaw.wasko@enso.org>
2022-02-11 10:18:51 +00:00
Radosław Waśko
8b24336604
Data analysts should be able to reorder columns into name order using sort_columns functions (#3250) 2022-02-08 17:28:46 +01:00
Radosław Waśko
d3c0f968fa
Data analysts should be able to transform a Table using the remove_columns and reorder_columns functions (#3240) 2022-02-03 15:18:47 +01:00
Radosław Waśko
b5fc87e618
Data analysts should be able to transform a Table using the select_columns function (#3230)
* Utility for mapping errors and warnings
* Implement By_Index
* Expose select_columns in InMem and DB. Need testing
* checkpoint: writing tests
* Fix minor issues, mock warning mapping for testing purposes
* Improve By_Index error handling
* A helper for testing problem handling
* More error handling
* docs
* changelog
* Fix matching test
* Add SQLite tests
* cleanup after test
* Rework problem handling
* small refactor
* add examples
* Add more test cases for regex matching
* Fix Regex.Pattern.matches to match full string
* "Fix" tests
2022-02-02 09:04:06 +00:00
Radosław Waśko
107128aeec
A library developer should be able to select matching names given a list (#3220) 2022-01-20 11:11:43 +01:00
Ara Adkins
337f6c8ad4
Implement linear regression on tables (#2003) 2021-09-29 15:33:18 +01:00
Marcin Kostrzewa
4f4e472ddf
Statistical functions (#1990) 2021-09-06 14:48:09 +02:00
Ara Adkins
c12cab9bd9
Add Column.set_index (#1982) 2021-09-02 10:30:02 +01:00
Marcin Kostrzewa
4536ed9f9b
Stdlib Improvements (#1963) 2021-08-19 14:55:15 +02:00
Marcin Kostrzewa
98eab2873e
Allow specifying a cell range when reading spreadsheets (#1954) 2021-08-16 17:01:33 +02:00
Marcin Kostrzewa
ad0b677ed8
Entry point for writing tables (#1946) 2021-08-12 15:16:24 +02:00
Marcin Kostrzewa
ca8252c9cf
Table to JSON serialization (#1937) 2021-08-10 15:35:51 +02:00
Marcin Kostrzewa
9ce6eb0560
Write XLSX files (#1906) 2021-07-28 13:51:27 +02:00
Marcin Kostrzewa
ca52757c10
CSV Writing (#1894) 2021-07-22 15:13:00 +02:00
Marcin Kostrzewa
f55d66cb2c
XLS(X) Reading (#1879) 2021-07-20 13:32:19 +02:00
Marcin Kostrzewa
334a022ffd
Import syntax including namespace (#1806) 2021-06-24 12:42:24 +02:00
Marcin Kostrzewa
b4709ab529
Default visualization definitions (#1786) 2021-06-08 08:12:02 +02:00
Ara Adkins
c4c483683e
Improve error types in the standard library (#1734) 2021-05-11 10:19:30 +01:00
Ara Adkins
6060d31c79
Update examples for Standard.Base.Data.* (#1707) 2021-04-29 11:27:16 +01:00
Radosław Waśko
117ca51921
Improve how indexing in Table works (#1643) 2021-04-01 14:39:31 +01:00
Ara Adkins
9585080ab8
Clean up the standard library docs (#1641) 2021-04-01 12:20:36 +01:00
Dmitry Bushev
5cfd9284be
Convert GeoJSON to Table (#1632) 2021-03-30 15:06:22 +01:00
Ara Adkins
6ee0c19d53
Implement additional methods for table (#1628) 2021-03-29 17:34:06 +01:00
Radosław Waśko
49b30f2e9d
Database Visualization Support (#1582) 2021-03-18 14:28:52 +01:00
Ara Adkins
96697ddc97
Fix a crash due to shadowed project names (#1571) 2021-03-16 12:45:19 +00:00
Radosław Waśko
5f8af886e5
Connection and Materialization in the Database Library (#1546) 2021-03-09 19:52:42 +01:00
Marcin Kostrzewa
3dd348c1be
Table: Fix bool column sorting (#1505) 2021-02-24 17:36:24 +01:00
Marcin Kostrzewa
14dd4006bb
Table API: concatenation, index access, column aggregation, API unification (#1489) 2021-02-18 16:00:19 +01:00
Marcin Kostrzewa
05945ede90
Table Visualization Fixes (#1476) 2021-02-15 09:55:54 +01:00
Marcin Kostrzewa
93b6680d4f
Sorting Tables (#1471) 2021-02-11 16:50:07 +01:00
Ara Adkins
af1aab35aa
Improve dataflow errors in the standard library (#1446) 2021-02-02 12:31:33 +00:00
Marcin Kostrzewa
197190ceeb
Remove UFCS (#1398) 2021-01-14 21:53:04 +01:00
Marcin Kostrzewa
b751dfb3ec
Table: grouping (#1392) 2021-01-11 17:05:06 +01:00
Radosław Waśko
58346917eb
Implement Some Vectorized Text Operations And Dropping Missing (#1381) 2021-01-04 14:24:08 +01:00
Radosław Waśko
ab51bffd87
Implement fill_missing (#1372) 2020-12-22 23:10:27 +01:00
Marcin Kostrzewa
bf37754428
Table: maps, zips & more builtins (#1356) 2020-12-16 11:23:23 +01:00
Marcin Kostrzewa
a40989e7c6
Table: Indexes & Joins (#1317) 2020-11-30 16:21:55 +01:00
Marcin Kostrzewa
ab2c5ed097
Tables: column mapping & masking (#1297) 2020-11-18 15:09:43 +01:00
Marcin Kostrzewa
f420dd3702
Rename Unit to Nothing (#1269) 2020-11-06 12:44:11 +01:00
Marcin Kostrzewa
150771c0e2
Simple CSV parser (#1268) 2020-11-05 16:53:50 +01:00