mirror of
https://github.com/enso-org/enso.git
Implement Table.replace for the in-memory backend (#8935)
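
The following is a minimal usage sketch, not part of the original commit message. It mirrors the tests added in this commit and assumes the in-memory backend; the names `by_table`, `by_map` and `mapping` are illustrative only.

```enso
from Standard.Base import all
from Standard.Table import all

main =
    table = Table.new [['x', [1, 2, 3, 4, 2]], ['y', ['a', 'b', 'c', 'd', 'e']]]

    # Lookup given as a table: by default, column 0 holds the old values
    # and column 1 holds the new values.
    lookup_table = Table.new [['x', [2, 1, 4, 3]], ['z', [20, 10, 40, 30]]]
    by_table = table.replace lookup_table 'x'

    # The same mapping given as a Map. In this case `from_column` and
    # `to_column` must not be passed, or an `Illegal_Argument` error is thrown.
    mapping = Map.from_vector [[2, 20], [1, 10], [4, 40], [3, 30]]
    by_map = table.replace mapping 'x'

    # Going by the tests in this commit, both results should hold
    # x = [10, 20, 30, 40, 20], with column 'y' unchanged.
    [by_table, by_map]
```

Rows of `table` whose 'x' value has no entry in the lookup are left unchanged unless `allow_unmatched_rows=False` is passed, in which case an `Unmatched_Rows_In_Lookup` error is raised.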
parent e0ba39ed3e
commit 6554972b7d
@@ -611,6 +611,7 @@
   `Filter_Condition`.][8865]
 - [Added `File_By_Line` type allowing processing a file line by line. New faster
   JSON parser based off Jackson.][8719]
+- [Implemented `Table.replace` for the in-memory backend.][8935]
 
 [debug-shortcuts]:
   https://github.com/enso-org/enso/blob/develop/app/gui/docs/product/shortcuts.md#debug
@@ -878,6 +879,7 @@
 [8816]: https://github.com/enso-org/enso/pull/8816
 [8849]: https://github.com/enso-org/enso/pull/8849
 [8865]: https://github.com/enso-org/enso/pull/8865
+[8935]: https://github.com/enso-org/enso/pull/8935
 
 #### Enso Compiler
 
@@ -1403,7 +1403,7 @@ type Table
          - If a column that is being updated from the lookup table has a type
            that is not compatible with the type of the corresponding column in
            this table, a `No_Common_Type` error is raised.
-         - If a key column contains `Nothing` values, either in the lookup table,
+         - If a key column contains `Nothing` values in the lookup table,
            a `Null_Values_In_Key_Columns` error is raised.
          - If `allow_unmatched_rows` is `False` and there are rows in this table
            that do not have a matching row in the lookup table, an
@@ -1420,6 +1420,85 @@ type Table
         Helpers.ensure_same_connection "table" [self, lookup_table] <|
             Lookup_Query_Helper.build_lookup_query self lookup_table key_columns add_new_columns allow_unmatched_rows on_problems
 
+    ## ALIAS find replace
+       GROUP Standard.Base.Calculations
+       ICON join
+       Replaces values in `column` using `lookup_table` to specify a
+       mapping from old to new values.
+
+       Arguments:
+       - lookup_table: the table to use as a mapping from old to new values. A
+         `Map` can also be used here (in which case passing `from_column` or
+         `to_column` is disallowed and will throw an `Illegal_Argument` error).
+       - column: the column within `self` to perform the replace on.
+       - from_column: the column within `lookup_table` to match against `column`
+         in `self`.
+       - to_column: the column within `lookup_table` to get new values from.
+       - allow_unmatched_rows: Specifies how to handle missing rows in the lookup.
+         If `False`, an `Unmatched_Rows_In_Lookup` error is raised.
+         If `True` (the default), rows without a match are left unchanged and
+         keep their original values.
+       - on_problems: Specifies how to handle problems if they occur, reporting
+         them as warnings by default.
+
+       ? Result Ordering
+
+         When operating in-memory, this operation preserves the order of rows
+         from this table (unlike `join`).
+         In the Database backend, there are no guarantees related to ordering of
+         results.
+
+       ? Error Conditions
+
+         - If this table or the lookup table is lacking any of the columns
+           specified by `from_column`, `to_column`, or `column`, a
+           `Missing_Input_Columns` error is raised.
+         - If a single row is matched by multiple entries in the lookup table,
+           a `Non_Unique_Key` error is raised.
+         - If a column that is being updated from the lookup table has a type
+           that is not compatible with the type of the corresponding column in
+           this table, a `No_Common_Type` error is raised.
+         - If a key column contains `Nothing` values in the lookup table,
+           a `Null_Values_In_Key_Columns` error is raised.
+         - If `allow_unmatched_rows` is `False` and there are rows in this table
+           that do not have a matching row in the lookup table, an
+           `Unmatched_Rows_In_Lookup` error is raised.
+         - The following problems may be reported according to the `on_problems`
+           setting:
+           - If `column` or `from_column` is of a floating-point type,
+             a `Floating_Point_Equality` problem is reported.
+
+       > Example
+         Replace values in column 'x' using a lookup table.
+
+             table = Table.new [['x', [1, 2, 3, 4]], ['y', ['a', 'b', 'c', 'd']], ['z', ['e', 'f', 'g', 'h']]]
+             #   | x | y | z
+             # ---+---+---+---
+             # 0 | 1 | a | e
+             # 1 | 2 | b | f
+             # 2 | 3 | c | g
+             # 3 | 4 | d | h
+
+             lookup_table = Table.new [['old_x', [1, 2, 3, 4]], ['new_x', [10, 20, 30, 40]]]
+             #   | old_x | new_x
+             # ---+-------+-------
+             # 0 | 1     | 10
+             # 1 | 2     | 20
+             # 2 | 3     | 30
+             # 3 | 4     | 40
+
+             result = table.replace lookup_table 'x'
+             #   | x  | y | z
+             # ---+----+---+---
+             # 0 | 10 | a | e
+             # 1 | 20 | b | f
+             # 2 | 30 | c | g
+             # 3 | 40 | d | h
+    replace : Table | Map -> (Text | Integer) -> (Text | Integer) -> (Text | Integer) -> Boolean -> Problem_Behavior -> Table ! Missing_Input_Columns | Non_Unique_Key | Unmatched_Rows_In_Lookup
+    replace self lookup_table:(Table | Map) column:(Text | Integer) from_column:(Text | Integer)=0 to_column:(Text | Integer)=1 allow_unmatched_rows:Boolean=True on_problems:Problem_Behavior=Problem_Behavior.Report_Warning =
+        _ = [lookup_table, column, from_column, to_column, allow_unmatched_rows, on_problems]
+        Error.throw (Unsupported_Database_Operation.Error "Table.replace is not implemented yet for the Database backends.")
+
     ## ALIAS join by row position
        GROUP Standard.Base.Calculations
        ICON dataframes_join
@@ -1913,7 +1913,7 @@ type Table
          - If a column that is being updated from the lookup table has a type
            that is not compatible with the type of the corresponding column in
            this table, a `No_Common_Type` error is raised.
-         - If a key column contains `Nothing` values, either in the lookup table,
+         - If a key column contains `Nothing` values in the lookup table,
            a `Null_Values_In_Key_Columns` error is raised.
          - If `allow_unmatched_rows` is `False` and there are rows in this table
            that do not have a matching row in the lookup table, an
@@ -1959,6 +1959,112 @@ type Table
             java_table = LookupJoin.lookupAndReplace java_keys java_descriptions allow_unmatched_rows java_problem_aggregator
             Table.Value java_table
 
+    ## ALIAS find replace
+       GROUP Standard.Base.Calculations
+       ICON join
+       Replaces values in `column` using `lookup_table` to specify a
+       mapping from old to new values.
+
+       Arguments:
+       - lookup_table: the table to use as a mapping from old to new values. A
+         `Map` can also be used here (in which case passing `from_column` or
+         `to_column` is disallowed and will throw an `Illegal_Argument` error).
+       - column: the column within `self` to perform the replace on.
+       - from_column: the column within `lookup_table` to match against `column`
+         in `self`.
+       - to_column: the column within `lookup_table` to get new values from.
+       - allow_unmatched_rows: Specifies how to handle missing rows in the lookup.
+         If `False`, an `Unmatched_Rows_In_Lookup` error is raised.
+         If `True` (the default), rows without a match are left unchanged and
+         keep their original values.
+       - on_problems: Specifies how to handle problems if they occur, reporting
+         them as warnings by default.
+
+       ? Result Ordering
+
+         When operating in-memory, this operation preserves the order of rows
+         from this table (unlike `join`).
+         In the Database backend, there are no guarantees related to ordering of
+         results.
+
+       ? Error Conditions
+
+         - If this table or the lookup table is lacking any of the columns
+           specified by `from_column`, `to_column`, or `column`, a
+           `Missing_Input_Columns` error is raised.
+         - If a single row is matched by multiple entries in the lookup table,
+           a `Non_Unique_Key` error is raised.
+         - If a column that is being updated from the lookup table has a type
+           that is not compatible with the type of the corresponding column in
+           this table, a `No_Common_Type` error is raised.
+         - If a key column contains `Nothing` values in the lookup table,
+           a `Null_Values_In_Key_Columns` error is raised.
+         - If `allow_unmatched_rows` is `False` and there are rows in this table
+           that do not have a matching row in the lookup table, an
+           `Unmatched_Rows_In_Lookup` error is raised.
+         - The following problems may be reported according to the `on_problems`
+           setting:
+           - If `column` or `from_column` is of a floating-point type,
+             a `Floating_Point_Equality` problem is reported.
+
+       > Example
+         Replace values in column 'x' using a lookup table.
+
+             table = Table.new [['x', [1, 2, 3, 4]], ['y', ['a', 'b', 'c', 'd']], ['z', ['e', 'f', 'g', 'h']]]
+             #   | x | y | z
+             # ---+---+---+---
+             # 0 | 1 | a | e
+             # 1 | 2 | b | f
+             # 2 | 3 | c | g
+             # 3 | 4 | d | h
+
+             lookup_table = Table.new [['old_x', [1, 2, 3, 4]], ['new_x', [10, 20, 30, 40]]]
+             #   | old_x | new_x
+             # ---+-------+-------
+             # 0 | 1     | 10
+             # 1 | 2     | 20
+             # 2 | 3     | 30
+             # 3 | 4     | 40
+
+             result = table.replace lookup_table 'x'
+             #   | x  | y | z
+             # ---+----+---+---
+             # 0 | 10 | a | e
+             # 1 | 20 | b | f
+             # 2 | 30 | c | g
+             # 3 | 40 | d | h
+    @column Widget_Helpers.make_column_name_selector
+    @from_column Widget_Helpers.make_column_name_selector
+    @to_column Widget_Helpers.make_column_name_selector
+    replace : Table | Map -> (Text | Integer) -> (Text | Integer | Nothing) -> (Text | Integer | Nothing) -> Boolean -> Problem_Behavior -> Table ! Missing_Input_Columns | Non_Unique_Key | Unmatched_Rows_In_Lookup
+    replace self lookup_table:(Table | Map) column:(Text | Integer) from_column:(Text | Integer | Nothing)=Nothing to_column:(Text | Integer | Nothing)=Nothing allow_unmatched_rows:Boolean=True on_problems:Problem_Behavior=Problem_Behavior.Report_Warning =
+        case lookup_table of
+            _ : Map ->
+                if from_column.is_nothing.not || to_column.is_nothing.not then Error.throw (Illegal_Argument.Error "If a Map is provided as the lookup_table, then from_column and to_column should not also be specified.") else
+                    self.replace (map_to_lookup_table lookup_table 'from' 'to') column 'from' 'to' allow_unmatched_rows=allow_unmatched_rows on_problems=on_problems
+            _ : Table ->
+                from_column_resolved = from_column.if_nothing 0
+                to_column_resolved = to_column.if_nothing 1
+                selected_lookup_columns = lookup_table.select_columns [from_column_resolved, to_column_resolved]
+                self.select_columns column . if_not_error <| selected_lookup_columns . if_not_error <|
+                    unique = self.column_naming_helper.create_unique_name_strategy
+                    unique.mark_used (self.column_names)
+
+                    ## We perform a `merge` into `column`, using a duplicate of `column`
+                       as the key column to join with `from_column`.
+
+                    duplicate_key_column_name = unique.make_unique "duplicate_key"
+                    duplicate_key_column = self.at column . rename duplicate_key_column_name
+                    self_with_duplicate = self.set duplicate_key_column set_mode=Set_Mode.Add
+
+                    ## Create a lookup table with just `to_column` and `from_column`,
+                       renamed to match the base table's `column` and its duplicate,
+                       respectively.
+                    lookup_table_renamed = selected_lookup_columns . rename_columns (Map.from_vector [[from_column_resolved, duplicate_key_column_name], [to_column_resolved, column]])
+
+                    merged = self_with_duplicate.merge lookup_table_renamed duplicate_key_column_name add_new_columns=False allow_unmatched_rows=allow_unmatched_rows on_problems=on_problems
+                    merged.remove_columns duplicate_key_column_name
+
     ## ALIAS join by row position
        GROUP Standard.Base.Calculations
        ICON dataframes_join
@@ -2701,6 +2807,13 @@ concat_columns column_set all_tables result_type result_row_count on_problems =
     sealed_storage = storage_builder.seal
     Column.from_storage column_set.name sealed_storage
 
+## PRIVATE
+   A helper that creates a two-column table from a map.
+map_to_lookup_table : Map Any Any -> Text -> Text -> Table
+map_to_lookup_table map key_column value_column =
+    keys_and_values = map.to_vector
+    Table.new [[key_column, keys_and_values.map .first], [value_column, keys_and_values.map .second]]
+
 ## PRIVATE
    Conversion method to a Table from a Column.
 Table.from (that:Column) = that.to_table
@@ -19,7 +19,7 @@ type Data
 
     setup create_connection_fn =
         Data.Value (create_connection_fn Nothing)
-
+
     teardown self = self.connection.close
 
 
@@ -0,0 +1,112 @@
+from Standard.Base import all
+import Standard.Base.Errors.Illegal_Argument.Illegal_Argument
+
+from Standard.Table import all
+from Standard.Table.Errors import all
+
+from Standard.Database import all
+
+from Standard.Test_New import all
+
+from project.Common_Table_Operations.Util import run_default_backend
+import project.Util
+
+main = run_default_backend add_specs
+
+type Data
+    Value ~connection
+
+    setup create_connection_fn =
+        Data.Value (create_connection_fn Nothing)
+
+    teardown self = self.connection.close
+
+
+add_specs suite_builder setup =
+    prefix = setup.prefix
+    create_connection_fn = setup.create_connection_func
+    suite_builder.group prefix+"Table.replace" group_builder->
+        data = Data.setup create_connection_fn
+
+        group_builder.teardown <|
+            data.teardown
+
+        table_builder cols =
+            setup.table_builder cols connection=data.connection
+
+        group_builder.specify "should be able to replace values via a lookup table, using from/to column defaults" <|
+            table = table_builder [['x', [1, 2, 3, 4, 2]], ['y', ['a', 'b', 'c', 'd', 'e']]]
+            lookup_table = table_builder [['x', [2, 1, 4, 3]], ['z', [20, 10, 40, 30]]]
+            expected = table_builder [['x', [10, 20, 30, 40, 20]], ['y', ['a', 'b', 'c', 'd', 'e']]]
+            result = table.replace lookup_table 'x'
+            result . should_equal expected
+
+        group_builder.specify "should be able to replace values via a lookup table, specifying from/to columns" <|
+            table = table_builder [['x', [1, 2, 3, 4, 2]], ['y', ['a', 'b', 'c', 'd', 'e']]]
+            lookup_table = table_builder [['d', [4, 5, 6, 7]], ['x', [2, 1, 4, 3]], ['d2', [5, 6, 7, 8]], ['z', [20, 10, 40, 30]]]
+            expected = table_builder [['x', [10, 20, 30, 40, 20]], ['y', ['a', 'b', 'c', 'd', 'e']]]
+            result = table.replace lookup_table 'x' 'x' 'z'
+            result . should_equal expected
+
+        group_builder.specify "should be able to replace values via a lookup table provided as a Map" <|
+            table = table_builder [['x', [1, 2, 3, 4, 2]], ['y', ['a', 'b', 'c', 'd', 'e']]]
+            lookup_table = Map.from_vector [[2, 20], [1, 10], [4, 40], [3, 30]]
+            expected = table_builder [['x', [10, 20, 30, 40, 20]], ['y', ['a', 'b', 'c', 'd', 'e']]]
+            result = table.replace lookup_table 'x'
+            result . should_equal expected
+
+        group_builder.specify "should fail with Missing_Input_Columns if the specified columns do not exist" <|
+            table = table_builder [['x', [1, 2, 3, 4]], ['y', ['a', 'b', 'c', 'd']]]
+            lookup_table = table_builder [['x', [2, 1, 4, 3]], ['z', [20, 10, 40, 30]]]
+            table.replace lookup_table 'q' 'x' 'z' . should_fail_with Missing_Input_Columns
+            table.replace lookup_table 'x' 'q' 'z' . should_fail_with Missing_Input_Columns
+            table.replace lookup_table 'x' 'x' 'q' . should_fail_with Missing_Input_Columns
+
+        group_builder.specify "can allow unmatched rows" <|
+            table = table_builder [['x', [1, 2, 3, 4]], ['y', ['a', 'b', 'c', 'd']]]
+            lookup_table = table_builder [['x', [4, 3, 1]], ['z', [40, 30, 10]]]
+            expected = table_builder [['x', [10, 2, 30, 40]], ['y', ['a', 'b', 'c', 'd']]]
+            result = table.replace lookup_table 'x'
+            result . should_equal expected
+
+        group_builder.specify "fails on unmatched rows" <|
+            table = table_builder [['x', [1, 2, 3, 4]], ['y', ['a', 'b', 'c', 'd']]]
+            lookup_table = table_builder [['x', [4, 3, 1]], ['z', [40, 30, 10]]]
+            table.replace lookup_table 'x' allow_unmatched_rows=False . should_fail_with Unmatched_Rows_In_Lookup
+
+        group_builder.specify "fails on non-unique keys" <|
+            table = table_builder [['x', [1, 2, 3, 4]], ['y', ['a', 'b', 'c', 'd']]]
+            lookup_table = table_builder [['x', [2, 1, 4, 1, 3]], ['z', [20, 10, 40, 11, 30]]]
+            table.replace lookup_table 'x' . should_fail_with Non_Unique_Key
+
+        group_builder.specify "should avoid name clashes in the (internally) generated column name" <|
+            table = table_builder [['duplicate_key', [1, 2, 3, 4]], ['y', ['a', 'b', 'c', 'd']]]
+            lookup_table = table_builder [['x', [2, 1, 4, 3]], ['z', [20, 10, 40, 30]]]
+            expected = table_builder [['duplicate_key', [10, 20, 30, 40]], ['y', ['a', 'b', 'c', 'd']]]
+            result = table.replace lookup_table 'duplicate_key'
+            result . should_equal expected
+
+        group_builder.specify "(edge-case) should allow lookup with itself" <|
+            table = table_builder [['x', [2, 1, 4, 3]], ['y', [20, 10, 40, 30]]]
+            expected = table_builder [['x', [20, 10, 40, 30]], ['y', [20, 10, 40, 30]]]
+            result = table.replace table 'x'
+            result . should_equal expected
+
+        group_builder.specify "should not merge columns other than the one specified in the `column` param" <|
+            table = table_builder [['x', [1, 2, 3, 4, 2]], ['y', ['a', 'b', 'c', 'd', 'e']], ['q', [4, 5, 6, 7, 8]]]
+            lookup_table = table_builder [['x', [2, 1, 4, 3]], ['z', [20, 10, 40, 30]], ['q', [40, 50, 60, 70]]]
+            expected = table_builder [['x', [10, 20, 30, 40, 20]], ['y', ['a', 'b', 'c', 'd', 'e']], ['q', [4, 5, 6, 7, 8]]]
+            result = table.replace lookup_table 'x'
+            result . should_equal expected
+
+        group_builder.specify "should fail on null key values in lookup table" <|
+            table = table_builder [['x', [1, 2, 3, 4, 2]], ['y', ['a', 'b', 'c', 'd', 'e']]]
+            lookup_table = table_builder [['x', [2, 1, Nothing, 3]], ['z', [20, 10, 40, 30]]]
+            table.replace lookup_table 'x' . should_fail_with Null_Values_In_Key_Columns
+
+        group_builder.specify "should not allow from/to_column to be specified if the argument is a Map" <|
+            table = table_builder [['x', [1, 2, 3, 4, 2]], ['y', ['a', 'b', 'c', 'd', 'e']]]
+            lookup_table = Map.from_vector [[2, 20], [1, 10], [4, 40], [3, 30]]
+            table.replace lookup_table 'x' from_column=8 . should_fail_with Illegal_Argument
+            table.replace lookup_table 'x' to_column=9 . should_fail_with Illegal_Argument
+            table.replace lookup_table 'x' from_column=8 to_column=9 . should_fail_with Illegal_Argument