Update the aggregate API to take a separate group_by (#9027)

Separates `Group_By` out of the column definitions in `aggregate` into its own `group_by` argument.
![image](https://github.com/enso-org/enso/assets/4699705/6b4f03bc-1c4a-4582-b38a-ba528ae94167)

The old API is still supported, with a deprecation warning attached:
![image](https://github.com/enso-org/enso/assets/4699705/0cc42ff7-6047-41a5-bb99-c717d06d0d93)

Widgets have been updated with `Group_By` removed from the dropdown.
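
Before and after, in code form (a sketch; `table` and the column names are illustrative):

```
# New API: the grouping keys are passed separately from the aggregations.
table.aggregate ["Category"] [Aggregate_Column.Count, Aggregate_Column.Sum "Price"]

# Old API: still accepted, but the result now carries a `Deprecated` warning.
table.aggregate [Aggregate_Column.Group_By "Category", Aggregate_Column.Count, Aggregate_Column.Sum "Price"]
```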
James Dunkerley 2024-02-13 10:23:59 +00:00 committed by GitHub
parent f7a84d06e4
commit 8c197f325b
34 changed files with 507 additions and 469 deletions

View File

@@ -615,6 +615,7 @@
 - [Allow removing rows using a Filter_Condition.][8861]
 - [Added `Table.to_xml`.][8979]
 - [Implemented Write support for `S3_File`.][8921]
+- [Separate `Group_By` from `columns` into new argument on `aggregate`.][9027]

 [debug-shortcuts]:
   https://github.com/enso-org/enso/blob/develop/app/gui/docs/product/shortcuts.md#debug
@@ -886,6 +887,7 @@
 [8861]: https://github.com/enso-org/enso/pull/8861
 [8979]: https://github.com/enso-org/enso/pull/8979
 [8921]: https://github.com/enso-org/enso/pull/8921
+[9027]: https://github.com/enso-org/enso/pull/9027

 #### Enso Compiler

View File

@@ -0,0 +1,13 @@
+import project.Data.Text.Text
+
+## A warning that an API is deprecated.
+type Deprecated
+    ## PRIVATE
+    Warning type_name:Text method_name:Text message:Text=""
+
+    ## PRIVATE
+       Pretty prints the Deprecated warning.
+    to_display_text : Text
+    to_display_text self =
+        if self.message.is_empty then ("Deprecated: " + self.type_name + "." + self.method_name + " is deprecated.") else self.message
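
For reference, this is how the new warning type gets attached in the `aggregate` implementations later in this commit; the snippet below is a minimal sketch where `result` stands for the table being returned:

```
# Attach a deprecation warning to a value; with an empty message,
# `to_display_text` falls back to the "Deprecated: Type.method is deprecated." form.
warning = Deprecated.Warning "Standard.Database.Data.Aggregate_Column.Aggregate_Column" "Group_By"
Warning.attach warning result
```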

View File

@@ -6,6 +6,7 @@ import Standard.Base.Errors.Common.Additional_Warnings
 import Standard.Base.Errors.Common.Incomparable_Values
 import Standard.Base.Errors.Common.Index_Out_Of_Bounds
 import Standard.Base.Errors.Common.Type_Error
+import Standard.Base.Errors.Deprecated.Deprecated
 import Standard.Base.Errors.File_Error.File_Error
 import Standard.Base.Errors.Illegal_Argument.Illegal_Argument
 import Standard.Base.Errors.Illegal_State.Illegal_State
@@ -1720,10 +1721,14 @@ type Table
     GROUP Standard.Base.Calculations
     ICON sigma
-    Aggregates the rows in a table using any `Group_By` entries in columns.
-    The columns argument specifies which additional aggregations to perform and to return.
+    Aggregates the rows in a table using `group_by` columns.
+    The columns argument specifies which additional aggregations to perform
+    and to return.

     Arguments:
+    - group_by: Vector of column identifiers to group by. These will be
+      included at the start of the resulting table. If no columns are
+      specified a single row will be returned with the aggregate columns.
     - columns: Vector of `Aggregate_Column` specifying the aggregated table.
       Expressions can be used within the aggregate column to perform more
       complicated calculations.
@@ -1745,7 +1750,7 @@ type Table
     - If a column index is out of range, a `Missing_Input_Columns` is
       reported according to the `on_problems` setting, unless
       `error_on_missing_columns` is set to `True`, in which case it is
-      raised as an error. Problems resolving `Group_By` columns are
+      raised as an error. Problems resolving `group_by` columns are
       reported as dataflow errors regardless of these settings, as a
       missing grouping will completely change semantics of the query.
     - If a column selector is given as a `Text` and it does not match any
@@ -1753,7 +1758,7 @@ type Table
       `Invalid_Aggregate_Column` error is raised according to the
       `on_problems` settings (unless `error_on_missing_columns` is set to
       `True` in which case it will always be an error). Problems resolving
-      `Group_By` columns are reported as dataflow errors regardless of
+      `group_by` columns are reported as dataflow errors regardless of
       these settings, as a missing grouping will completely change
       semantics of the query.
     - If an aggregation fails, an `Invalid_Aggregation` dataflow error is
@@ -1771,63 +1776,73 @@ type Table
     - If there are more than 10 issues with a single column,
       an `Additional_Warnings`.

+    > Example
+      Count all the rows
+
+          table.aggregate columns=[Aggregate_Column.Count]
+
     > Example
       Group by the Key column, count the rows

-          table.aggregate [Aggregate_Column.Group_By "Key", Aggregate_Column.Count]
+          table.aggregate ["Key"] [Aggregate_Column.Count]
+    @group_by Widget_Helpers.make_column_name_vector_selector
     @columns Widget_Helpers.make_aggregate_column_vector_selector
-    aggregate : Vector Aggregate_Column -> Boolean -> Problem_Behavior -> Table ! No_Output_Columns | Invalid_Aggregate_Column | Invalid_Column_Names | Duplicate_Output_Column_Names | Floating_Point_Equality | Invalid_Aggregation | Unquoted_Delimiter | Additional_Warnings
-    aggregate self columns (error_on_missing_columns=False) (on_problems=Report_Warning) =
-        validated = Aggregate_Column_Helper.prepare_aggregate_columns self.column_naming_helper columns self error_on_missing_columns=error_on_missing_columns
-        key_columns = validated.key_columns
-        key_problems = key_columns.flat_map internal_column->
-            column = self.make_column internal_column
-            case column.value_type.is_floating_point of
-                True -> [Floating_Point_Equality.Error column.name]
-                False -> []
-        on_problems.attach_problems_before validated.problems+key_problems <|
-            resolved_aggregates = validated.valid_columns
-            key_expressions = key_columns.map .expression
-            new_ctx = self.context.set_groups key_expressions
-            problem_builder = Problem_Builder.new
-            ## TODO [RW] here we will perform as many fetches as there are
-               aggregate columns, but technically we could perform just one
-               fetch fetching all column types - TODO we should do that. We can
-               do it here by creating a builder that will gather all requests
-               from the executed callbacks and create Lazy references that all
-               point to a single query.
-               See #6118.
-            infer_from_database_callback expression =
-                SQL_Type_Reference.new self.connection self.context expression
-            dialect = self.connection.dialect
-            type_mapping = dialect.get_type_mapping
-            infer_return_type op_kind columns expression =
-                type_mapping.infer_return_type infer_from_database_callback op_kind columns expression
-            results = resolved_aggregates.map p->
-                agg = p.second
-                new_name = p.first
-                result = Aggregate_Helper.make_aggregate_column self agg new_name dialect infer_return_type problem_builder
-                ## If the `result` did contain an error, we catch it to be
-                   able to store it in a vector and then we will partition the
-                   created columns and failures.
-                result.catch Any error->
-                    DB_Wrapped_Error.Value error
-            partitioned = results.partition (_.is_a DB_Wrapped_Error)
-            ## When working on join we may encounter further issues with having
-               aggregate columns exposed directly, it may be useful to re-use
-               the `lift_aggregate` method to push the aggregates into a
-               subquery.
-            new_columns = partitioned.second
-            problem_builder.attach_problems_before on_problems <|
-                problems = partitioned.first.map .value
-                on_problems.attach_problems_before problems <|
-                    handle_no_output_columns =
-                        first_problem = if problems.is_empty then Nothing else problems.first
-                        Error.throw (No_Output_Columns.Error first_problem)
-                    if new_columns.is_empty then handle_no_output_columns else
-                        self.updated_context_and_columns new_ctx new_columns subquery=True
+    aggregate : Vector (Integer | Text | Regex | Aggregate_Column) | Text | Integer | Regex -> Vector Aggregate_Column -> Boolean -> Problem_Behavior -> Table ! No_Output_Columns | Invalid_Aggregate_Column | Invalid_Column_Names | Duplicate_Output_Column_Names | Floating_Point_Equality | Invalid_Aggregation | Unquoted_Delimiter | Additional_Warnings
+    aggregate self group_by=[] columns=[] (error_on_missing_columns=False) (on_problems=Report_Warning) =
+        normalized_group_by = Vector.unify_vector_or_element group_by
+        if normalized_group_by.is_empty && columns.is_empty then Error.throw (No_Output_Columns.Error "At least one column must be specified.") else
+            validated = Aggregate_Column_Helper.prepare_aggregate_columns self.column_naming_helper normalized_group_by columns self error_on_missing_columns=error_on_missing_columns
+            key_columns = validated.key_columns
+            key_problems = key_columns.flat_map internal_column->
+                column = self.make_column internal_column
+                case column.value_type.is_floating_point of
+                    True -> [Floating_Point_Equality.Error column.name]
+                    False -> []
+            on_problems.attach_problems_before validated.problems+key_problems <|
+                resolved_aggregates = validated.valid_columns
+                key_expressions = key_columns.map .expression
+                new_ctx = self.context.set_groups key_expressions
+                problem_builder = Problem_Builder.new
+                ## TODO [RW] here we will perform as many fetches as there are
+                   aggregate columns, but technically we could perform just one
+                   fetch fetching all column types - TODO we should do that. We can
+                   do it here by creating a builder that will gather all requests
+                   from the executed callbacks and create Lazy references that all
+                   point to a single query.
+                   See #6118.
+                infer_from_database_callback expression =
+                    SQL_Type_Reference.new self.connection self.context expression
+                dialect = self.connection.dialect
+                type_mapping = dialect.get_type_mapping
+                infer_return_type op_kind columns expression =
+                    type_mapping.infer_return_type infer_from_database_callback op_kind columns expression
+                results = resolved_aggregates.map p->
+                    agg = p.second
+                    new_name = p.first
+                    result = Aggregate_Helper.make_aggregate_column self agg new_name dialect infer_return_type problem_builder
+                    ## If the `result` did contain an error, we catch it to be
+                       able to store it in a vector and then we will partition the
+                       created columns and failures.
+                    result.catch Any error->(DB_Wrapped_Error.Value error)
+                partitioned = results.partition (_.is_a DB_Wrapped_Error)
+                ## When working on join we may encounter further issues with having
+                   aggregate columns exposed directly, it may be useful to re-use
+                   the `lift_aggregate` method to push the aggregates into a
+                   subquery.
+                new_columns = partitioned.second
+                problem_builder.attach_problems_before on_problems <|
+                    problems = partitioned.first.map .value
+                    on_problems.attach_problems_before problems <|
+                        handle_no_output_columns =
+                            first_problem = if problems.is_empty then Nothing else problems.first
+                            Error.throw (No_Output_Columns.Error first_problem)
+                        if new_columns.is_empty then handle_no_output_columns else
+                            result = self.updated_context_and_columns new_ctx new_columns subquery=True
+                            if validated.old_style.not then result else
+                                Warning.attach (Deprecated.Warning "Standard.Database.Data.Aggregate_Column.Aggregate_Column" "Group_By" "Deprecated: `Group_By` constructor has been deprecated, use the `group_by` argument instead.") result

     ## GROUP Standard.Base.Calculations
        ICON dataframe_map_row
@@ -1939,7 +1954,7 @@ type Table
          A | Example | France

     @group_by Widget_Helpers.make_column_name_vector_selector
     @name_column Widget_Helpers.make_column_name_selector
-    @values (Widget_Helpers.make_aggregate_column_selector include_group_by=False)
+    @values Widget_Helpers.make_aggregate_column_selector
     cross_tab : Vector (Integer | Text | Regex | Aggregate_Column) | Text | Integer | Regex -> (Text | Integer) -> Aggregate_Column | Vector Aggregate_Column -> Problem_Behavior -> Table ! Missing_Input_Columns | Invalid_Aggregate_Column | Floating_Point_Equality | Invalid_Aggregation | Unquoted_Delimiter | Additional_Warnings | Invalid_Column_Names
     cross_tab self group_by name_column values=Aggregate_Column.Count (on_problems=Report_Warning) =
         ## Avoid unused arguments warning. We cannot rename arguments to `_`,

View File

@@ -64,7 +64,7 @@ type Context
        - groups: a list of grouping expressions, for each entry a GROUP BY is
         added, the resulting query can then directly include only the
         grouped-by columns or aggregate expressions.
-      - limit: an optional maximum number of elements that the equery should
+      - limit: an optional maximum number of elements that the query should
        return.
     Value (from_spec : From_Spec) (where_filters : Vector SQL_Expression) (orders : Vector Order_Descriptor) (groups : Vector SQL_Expression) (limit : Nothing | Integer) (distinct_on : Nothing | Vector SQL_Expression)

View File

@@ -228,8 +228,8 @@ type Non_Unique_Key_Recipe
    Creates a `Non_Unique_Key` error containing information about an
    example group violating the uniqueness constraint.
 raise_duplicated_primary_key_error source_table primary_key original_panic =
-    agg = source_table.aggregate [Aggregate_Column.Count]+(primary_key.map Aggregate_Column.Group_By)
-    filtered = agg.filter column=0 (Filter_Condition.Greater than=1)
+    agg = source_table.aggregate primary_key [Aggregate_Column.Count]
+    filtered = agg.filter column=-1 (Filter_Condition.Greater than=1)
     materialized = filtered.read max_rows=1 warn_if_more_rows=False
     case materialized.row_count == 0 of
         ## If we couldn't find a duplicated key, we give up the translation and
@@ -239,8 +239,8 @@ raise_duplicated_primary_key_error source_table primary_key original_panic =
         True -> Panic.throw original_panic
         False ->
             row = materialized.first_row.to_vector
-            example_count = row.first
-            example_entry = row.drop 1
+            example_count = row.last
+            example_entry = row.drop (Last 1)
             Error.throw (Non_Unique_Key.Error primary_key example_entry example_count)

 ## PRIVATE
@@ -619,15 +619,15 @@ check_duplicate_key_matches_for_delete target_table tmp_table key_columns allow_
    Checks if any rows identified by `key_columns` have more than one match between two tables.
 check_multiple_rows_match left_table right_table key_columns ~continuation =
     joined = left_table.join right_table on=key_columns join_kind=Join_Kind.Inner
-    counted = joined.aggregate [Aggregate_Column.Count]+(key_columns.map (Aggregate_Column.Group_By _))
-    duplicates = counted.filter 0 (Filter_Condition.Greater than=1)
+    counted = joined.aggregate key_columns [Aggregate_Column.Count]
+    duplicates = counted.filter -1 (Filter_Condition.Greater than=1)
     example = duplicates.read max_rows=1 warn_if_more_rows=False
     case example.row_count == 0 of
         True -> continuation
         False ->
             row = example.first_row . to_vector
-            offending_key = row.drop 1
-            count = row.first
+            offending_key = row.drop (Last 1)
+            count = row.last
             Error.throw (Multiple_Target_Rows_Matched_For_Update.Error offending_key count)

 ## PRIVATE
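
A note on the two helpers above: under the old API the `Group_By` keys were appended after `Count`, so the count was column 0 and the key was `row.drop 1`; under the new API the grouping keys come first and the count is the last column, hence `filter -1` (or `column=-1`), `row.last`, and `row.drop (Last 1)`. A minimal sketch of the new column layout (the table and column names are illustrative):

```
counted = table.aggregate ["id"] [Aggregate_Column.Count]
# Key columns come first, aggregates after: ["id", "Count"]
duplicates = counted.filter -1 (Filter_Condition.Greater than=1)
```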

View File

@@ -4,11 +4,11 @@ import project.Data.Sort_Column.Sort_Column

 ## Defines an Aggregate Column
 type Aggregate_Column
-    ## Specifies a column to group the rows by.
+    ## PRIVATE
+       Specifies a column to group the rows by. Deprecated but used internally.

        Arguments:
-       - column: the column (specified by name, expression or index) to group
-         by.
+       - column: the column (either name, expression or index) to group by.
        - new_name: name of new column.
     Group_By (column:Text|Integer|Any) (new_name:Text="") # Any needed because of 6866

View File

@@ -10,6 +10,7 @@ import Standard.Base.Errors.Common.Index_Out_Of_Bounds
 import Standard.Base.Errors.Common.No_Such_Method
 import Standard.Base.Errors.Common.Out_Of_Memory
 import Standard.Base.Errors.Common.Type_Error
+import Standard.Base.Errors.Deprecated.Deprecated
 import Standard.Base.Errors.File_Error.File_Error
 import Standard.Base.Errors.Illegal_Argument.Illegal_Argument
 import Standard.Base.Errors.Unimplemented.Unimplemented
@@ -658,10 +659,14 @@ type Table
     GROUP Standard.Base.Calculations
     ICON sigma
-    Aggregates the rows in a table using any `Group_By` entries in columns.
-    The columns argument specifies which additional aggregations to perform and to return.
+    Aggregates the rows in a table using `group_by` columns.
+    The columns argument specifies which additional aggregations to perform
+    and to return.

     Arguments:
+    - group_by: Vector of column identifiers to group by. These will be
+      included at the start of the resulting table. If no columns are
+      specified a single row will be returned with the aggregate columns.
     - columns: Vector of `Aggregate_Column` specifying the aggregated table.
       Expressions can be used within the aggregate column to perform more
       complicated calculations.
@@ -681,7 +686,7 @@ type Table
     - If a column index is out of range, a `Missing_Input_Columns` is
       reported according to the `on_problems` setting, unless
       `error_on_missing_columns` is set to `True`, in which case it is
-      raised as an error. Problems resolving `Group_By` columns are
+      raised as an error. Problems resolving `group_by` columns are
       reported as dataflow errors regardless of these settings, as a
       missing grouping will completely change semantics of the query.
     - If a column selector is given as a `Text` and it does not match any
@@ -689,7 +694,7 @@ type Table
       `Invalid_Aggregate_Column` problem is raised according to the
       `on_problems` settings (unless `error_on_missing_columns` is set to
       `True` in which case it will always be an error). Problems resolving
-      `Group_By` columns are reported as dataflow errors regardless of
+      `group_by` columns are reported as dataflow errors regardless of
       these settings, as a missing grouping will completely change
       semantics of the query.
     - If an aggregation fails, an `Invalid_Aggregation` dataflow error is
@@ -707,22 +712,31 @@ type Table
     - If there are more than 10 issues with a single column,
       an `Additional_Warnings`.

+    > Example
+      Count all the rows
+
+          table.aggregate columns=[Aggregate_Column.Count]
+
     > Example
       Group by the Key column, count the rows

-          table.aggregate [Aggregate_Column.Group_By "Key", Aggregate_Column.Count]
+          table.aggregate ["Key"] [Aggregate_Column.Count]
+    @group_by Widget_Helpers.make_column_name_vector_selector
     @columns Widget_Helpers.make_aggregate_column_vector_selector
-    aggregate : Vector Aggregate_Column -> Boolean -> Problem_Behavior -> Table ! No_Output_Columns | Invalid_Aggregate_Column | Invalid_Column_Names | Duplicate_Output_Column_Names | Floating_Point_Equality | Invalid_Aggregation | Unquoted_Delimiter | Additional_Warnings
-    aggregate self columns (error_on_missing_columns=False) (on_problems=Report_Warning) =
-        validated = Aggregate_Column_Helper.prepare_aggregate_columns self.column_naming_helper columns self error_on_missing_columns=error_on_missing_columns
-        on_problems.attach_problems_before validated.problems <| Illegal_Argument.handle_java_exception <|
-            java_key_columns = validated.key_columns.map .java_column
-            Java_Problems.with_problem_aggregator on_problems java_problem_aggregator->
-                index = self.java_table.indexFromColumns java_key_columns java_problem_aggregator
-                new_columns = validated.valid_columns.map c->(Aggregate_Column_Helper.java_aggregator c.first c.second)
-                java_table = index.makeTable new_columns
-                Table.Value java_table
+    aggregate : Vector (Integer | Text | Regex | Aggregate_Column) | Text | Integer | Regex -> Vector Aggregate_Column -> Boolean -> Problem_Behavior -> Table ! No_Output_Columns | Invalid_Aggregate_Column | Invalid_Column_Names | Duplicate_Output_Column_Names | Floating_Point_Equality | Invalid_Aggregation | Unquoted_Delimiter | Additional_Warnings
+    aggregate self group_by=[] columns=[] (error_on_missing_columns=False) (on_problems=Report_Warning) =
+        normalized_group_by = Vector.unify_vector_or_element group_by
+        if normalized_group_by.is_empty && columns.is_empty then Error.throw (No_Output_Columns.Error "At least one column must be specified.") else
+            validated = Aggregate_Column_Helper.prepare_aggregate_columns self.column_naming_helper normalized_group_by columns self error_on_missing_columns=error_on_missing_columns
+            on_problems.attach_problems_before validated.problems <| Illegal_Argument.handle_java_exception <|
+                java_key_columns = validated.key_columns.map .java_column
+                Java_Problems.with_problem_aggregator on_problems java_problem_aggregator->
+                    index = self.java_table.indexFromColumns java_key_columns java_problem_aggregator
+                    new_columns = validated.valid_columns.map c->(Aggregate_Column_Helper.java_aggregator c.first c.second)
+                    java_table = index.makeTable new_columns
+                    if validated.old_style.not then Table.Value java_table else
+                        Warning.attach (Deprecated.Warning "Standard.Database.Data.Aggregate_Column.Aggregate_Column" "Group_By" "Deprecated: `Group_By` constructor has been deprecated, use the `group_by` argument instead.") (Table.Value java_table)

     ## ALIAS sort
        GROUP Standard.Base.Selections
@@ -2448,7 +2462,7 @@ type Table
          A | Example | France

     @group_by Widget_Helpers.make_column_name_vector_selector
     @name_column Widget_Helpers.make_column_name_selector
-    @values (Widget_Helpers.make_aggregate_column_selector include_group_by=False)
+    @values Widget_Helpers.make_aggregate_column_selector
     cross_tab : Vector (Integer | Text | Regex | Aggregate_Column) | Text | Integer | Regex -> (Text | Integer) -> Aggregate_Column | Vector Aggregate_Column -> Problem_Behavior -> Table ! Missing_Input_Columns | Invalid_Aggregate_Column | Floating_Point_Equality | Invalid_Aggregation | Unquoted_Delimiter | Additional_Warnings | Invalid_Column_Names
     cross_tab self group_by name_column values=Aggregate_Column.Count (on_problems=Report_Warning) = Out_Of_Memory.handle_java_exception "cross_tab" <|
         columns_helper = self.columns_helper
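
Both implementations normalize `group_by` with `Vector.unify_vector_or_element`, and the signature also admits a bare `Text | Integer | Regex`, so a single selector can be passed without wrapping it in a vector. A sketch:

```
# These two calls are equivalent:
table.aggregate "Key" [Aggregate_Column.Count]
table.aggregate ["Key"] [Aggregate_Column.Count]
```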

View File

@@ -1,5 +1,6 @@
 from Standard.Base import all hiding First, Last
 import Standard.Base.Data.Vector.No_Wrap
+import Standard.Base.Errors.Illegal_Argument.Illegal_Argument
 from Standard.Base.Runtime import assert

 import project.Data.Aggregate_Column.Aggregate_Column
@@ -31,69 +32,76 @@ polyglot java import org.enso.table.aggregations.StandardDeviation as StandardDe
 polyglot java import org.enso.table.aggregations.Sum as SumAggregator

 ## Result type for aggregate_columns validation
-   - key_columns: Vector of Columns from the table to group by
-   - valid_columns: Table structure to build as pairs of unique column name and Aggregate_Column
-   - problems: Set of any problems when validating the input
+   - key_columns: Vector of Columns from the table to group by.
+   - valid_columns: Table structure to build as pairs of unique column name and Aggregate_Column.
+   - problems: Set of any problems when validating the input.
+   - old_style: Boolean indicating if the input was in the old style.
 type Validated_Aggregate_Columns
     ## PRIVATE
-    Value (key_columns:(Vector Column)) (valid_columns:(Vector (Pair Text Aggregate_Column))) (problems:(Vector Any))
+    Value (key_columns:(Vector Column)) (valid_columns:(Vector (Pair Text Aggregate_Column))) (problems:(Vector Any)) (old_style:Boolean)

 ## PRIVATE
    Prepares an aggregation input for further processing:
    - resolves the column descriptors, reporting any issues,
    - ensures that the output names are unique,
    - finds the key columns.
-prepare_aggregate_columns : Column_Naming_Helper -> Vector Aggregate_Column -> Table -> Boolean -> Validated_Aggregate_Columns
-prepare_aggregate_columns naming_helper aggregates table error_on_missing_columns =
+prepare_aggregate_columns : Column_Naming_Helper -> Vector (Integer | Text | Regex | Aggregate_Column) | Text | Integer | Regex -> Vector Aggregate_Column -> Table -> Boolean -> Validated_Aggregate_Columns
+prepare_aggregate_columns naming_helper group_by aggregates table error_on_missing_columns =
     is_a_key c = case c of
         Aggregate_Column.Group_By _ _ -> True
         _ -> False
-    keys = aggregates.filter is_a_key
-    # Key resolution always errors on missing, regardless of any settings.
-    keys_problem_builder = Problem_Builder.new error_on_missing_columns=True
-    resolved_keys = keys.map (resolve_aggregate table keys_problem_builder)
-    ## Since `keys_problem_builder` has `error_on_missing_columns` set to `True`,
-       any missing columns will be reported as errors. Therefore, we can assume
-       that all the columns were present.
-    keys_problem_builder.attach_problems_before Problem_Behavior.Report_Error <|
-        assert (resolved_keys.contains Nothing . not)
-        problem_builder = Problem_Builder.new error_on_missing_columns=error_on_missing_columns
-        valid_resolved_aggregate_columns = aggregates.map on_problems=No_Wrap (resolve_aggregate table problem_builder) . filter x-> x.is_nothing.not
-        # Grouping Key
-        key_columns = resolved_keys.map .column
-        unique_key_columns = key_columns.distinct (on = .name)
-        # Resolve Names
-        unique = naming_helper.create_unique_name_strategy
-        ## First pass ensures the custom names specified by the user are unique.
-           The second pass resolves the default names, ensuring that they do not
-           clash with the user-specified names (ensuring that user-specified names
-           take precedence).
-        pass_1 = valid_resolved_aggregate_columns.map on_problems=No_Wrap c-> if c.new_name == "" then "" else
-            # Verify if the user-provided name is valid and if not, throw an error.
-            naming_helper.ensure_name_is_valid c.new_name <|
-                unique.make_unique c.new_name
-        renamed_columns = pass_1.map_with_index i->name->
-            agg = valid_resolved_aggregate_columns.at i
-            new_name = if name != "" then name else unique.make_unique (default_aggregate_column_name agg)
-            Pair.new new_name agg
-        # Build Problems Output
-        case renamed_columns.is_empty of
-            True ->
-                ## First, we try to raise any warnings that may have caused the
-                   lack of columns, promoted to errors.
-                problem_builder.attach_problems_before Problem_Behavior.Report_Error <|
-                    ## If none were found, we raise a generic error (this may
-                       happen primarily when an empty list is provided to the
-                       aggregate method).
-                    Error.throw No_Output_Columns.Error
-            False ->
-                problem_builder.report_unique_name_strategy unique
-                Validated_Aggregate_Columns.Value unique_key_columns renamed_columns problem_builder.get_problemset_throwing_distinguished_errors
+    ## Resolve old style aggregate into new style
+    old_style = aggregates.is_empty && group_by.any (g-> g.is_a Aggregate_Column)
+    if old_style.not && group_by.any is_a_key then Error.throw (Invalid_Aggregation.Error "`columns` should not contain a `Group_By`.") else
+        keys = if old_style then group_by.filter is_a_key else group_by.map (Aggregate_Column.Group_By _ "")
+        # Key resolution always errors on missing, regardless of any settings.
+        keys_problem_builder = Problem_Builder.new error_on_missing_columns=True
+        resolved_keys = keys.map (resolve_aggregate table keys_problem_builder)
+        ## Since `keys_problem_builder` has `error_on_missing_columns` set to `True`,
+           any missing columns will be reported as errors. Therefore, we can assume
+           that all the columns were present.
+        keys_problem_builder.attach_problems_before Problem_Behavior.Report_Error <|
+            assert (resolved_keys.contains Nothing . not)
+            problem_builder = Problem_Builder.new error_on_missing_columns=error_on_missing_columns
+            columns = if old_style then group_by else keys+aggregates
+            valid_resolved_aggregate_columns = columns.map on_problems=No_Wrap (resolve_aggregate table problem_builder) . filter x-> x.is_nothing.not
+            # Grouping Key
+            key_columns = resolved_keys.map .column
+            unique_key_columns = key_columns.distinct (on = .name)
+            # Resolve Names
+            unique = naming_helper.create_unique_name_strategy
+            ## First pass ensures the custom names specified by the user are unique.
+               The second pass resolves the default names, ensuring that they do not
+               clash with the user-specified names (ensuring that user-specified names
+               take precedence).
+            pass_1 = valid_resolved_aggregate_columns.map on_problems=No_Wrap c-> if c.new_name == "" then "" else
+                # Verify if the user-provided name is valid and if not, throw an error.
+                naming_helper.ensure_name_is_valid c.new_name <|
+                    unique.make_unique c.new_name
+            renamed_columns = pass_1.map_with_index i->name->
+                agg = valid_resolved_aggregate_columns.at i
+                new_name = if name != "" then name else unique.make_unique (default_aggregate_column_name agg)
+                Pair.new new_name agg
+            # Build Problems Output
+            case renamed_columns.is_empty of
+                True ->
+                    ## First, we try to raise any warnings that may have caused the
+                       lack of columns, promoted to errors.
+                    problem_builder.attach_problems_before Problem_Behavior.Report_Error <|
+                        ## If none were found, we raise a generic error (this may
+                           happen primarily when an empty list is provided to the
+                           aggregate method).
+                        Error.throw No_Output_Columns.Error
+                False ->
+                    problem_builder.report_unique_name_strategy unique
+                    Validated_Aggregate_Columns.Value unique_key_columns renamed_columns problem_builder.get_problemset_throwing_distinguished_errors old_style

 ## PRIVATE
    Defines the default name of an `Aggregate_Column`.
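
To summarize the detection logic above: the input counts as old style only when `columns` is empty and `group_by` contains `Aggregate_Column` values, while mixing a `Group_By` into a new-style call is rejected. A sketch of the three cases:

```
# New style: plain selectors in group_by.
table.aggregate ["Key"] [Aggregate_Column.Count]

# Old style: Aggregate_Column values in the first vector and no `columns`;
# accepted, flagged via `old_style`, and later warned as deprecated.
table.aggregate [Aggregate_Column.Group_By "Key", Aggregate_Column.Count]

# Rejected: a `Group_By` selector alongside new-style `columns`.
table.aggregate [Aggregate_Column.Group_By "Key"] [Aggregate_Column.Count]
```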

View File

@@ -230,7 +230,7 @@ type Table_Column_Helper
             Blank_Selector.All_Cells -> Aggregate_Column.Minimum _
         aggregates = blanks.map blanks_col-> col_aggregate blanks_col.name

-        aggregate_result = just_indicators.aggregate aggregates on_problems=Problem_Behavior.Report_Error
+        aggregate_result = just_indicators.aggregate columns=aggregates on_problems=Problem_Behavior.Report_Error
         materialized_result = self.materialize <| aggregate_result.catch Any error->
             msg = "Unexpected dataflow error has been thrown in an `select_blank_columns_helper`. This is a bug in the Table library. The unexpected error was: "+error.to_display_text
             Panic.throw (Illegal_State.Error message=msg cause=error)

View File

@@ -16,13 +16,12 @@ from project.Extensions.Table_Conversions import all
 ## PRIVATE
    Make an aggregate column selector.
-make_aggregate_column_selector : Table -> Display -> Boolean -> Widget
-make_aggregate_column_selector table display=Display.Always include_group_by=True =
+make_aggregate_column_selector : Table -> Display -> Widget
+make_aggregate_column_selector table display=Display.Always =
     col_names_selector = make_column_name_selector table display=Display.Always
     column_widget = ["column", col_names_selector]

     fqn = Meta.get_qualified_type_name Aggregate_Column
-    group_by = if include_group_by then [Option "Group By" fqn+".Group_By" [column_widget]] else []
     count = Option "Count" fqn+".Count"

     ## Currently can't support nested vector editors so using single picker
@@ -56,7 +55,7 @@ make_aggregate_column_selector table display=Display.Always include_group_by=Tru
     maximum = Option "Maximum" fqn+".Maximum" [column_widget]
     minimum = Option "Minimum" fqn+".Minimum" [column_widget]

-    Single_Choice display=display values=(group_by+[count, count_distinct, first, last, count_not_nothing, count_nothing, count_not_empty, count_empty, concatenate, shortest, longest, sum, average, median, percentile, mode, standard_deviation, maximum, minimum])
+    Single_Choice display=display values=[count, count_distinct, first, last, count_not_nothing, count_nothing, count_not_empty, count_empty, concatenate, shortest, longest, sum, average, median, percentile, mode, standard_deviation, maximum, minimum]

 ## PRIVATE
    Make an Aggregate_Column list editor
@@ -64,7 +63,7 @@ make_aggregate_column_vector_selector : Table -> Display -> Widget
 make_aggregate_column_vector_selector table display=Display.Always =
     item_editor = make_aggregate_column_selector table display=Display.Always
     # TODO this is a workaround for a dropdown issue
-    Vector_Editor item_editor=item_editor item_default="(Aggregate_Column.Group_By)" display=display
+    Vector_Editor item_editor=item_editor item_default="(Aggregate_Column.Count)" display=display

 ## PRIVATE
    Make a column name selector.

View File

@@ -10,13 +10,13 @@ Table.build_ai_prompt self =
     aggs = ["Count","Average","Sum","Median","First","Last","Maximum","Minimum"]
     joins = ["Inner","Left_Outer","Right_Outer","Full","Left_Exclusive","Right_Exclusive"]
     examples = """
-        Table["id","category","Unit Price","Stock"];goal=get product count by category==>>`aggregate [Aggregate_Column.Group_By "category", Aggregate_Column.Count Nothing]`
+        Table["id","category","Unit Price","Stock"];goal=get product count by category==>>`aggregate ["category"] [Aggregate_Column.Count]`
         Table["ID","Unit Price","Stock"];goal=order by how many items are available==>>`order_by ["Stock"]`
         Table["Name","Enrolled Year"];goal=select people who enrolled between 2015 and 2018==>>`filter_by_expression "[Enrolled Year] >= 2015 && [Enrolled Year] <= 2018`
         Table["Number of items","client name","city","unit price"];goal=compute the total value of each order==>>`set "[Number of items] * [unit price]" "total value"`
-        Table["Number of items","client name","CITY","unit price","total value"];goal=compute the average order value by city==>>`aggregate [Aggregate_Column.Group_By "CITY", Aggregate_Column.Average "total value"]`
+        Table["Number of items","client name","CITY","unit price","total value"];goal=compute the average order value by city==>>`aggregate ["CITY"] [Aggregate_Column.Average "total value"]`
         Table["Area Code", "number"];goal=get full phone numbers==>>`set "'+1 (' + [Area Code] + ') ' + [number]" "full phone number"`
-        Table["Name","Grade","Subject"];goal=rank students by their average grade==>>`aggregate [Aggregate_Column.Group_By "Name", Aggregate_Column.Average "Grade" "Average Grade"] . order_by [Sort_Column.Name "Average Grade" Sort_Direction.Descending]`
+        Table["Name","Grade","Subject"];goal=rank students by their average grade==>>`aggregate ["Name"] [Aggregate_Column.Average "Grade" "Average Grade"] . order_by [Sort_Column.Name "Average Grade" Sort_Direction.Descending]`
         Table["Country","Prime minister name","2018","2019","2020","2021"];goal=pivot yearly GDP values to rows==>>`transpose ["Country", "Prime minister name"] "Year" "GDP"`
         Table["Size","Weight","Width","stuff","thing"];goal=only select size and thing of each record==>>`select_columns ["Size", "thing"]`
         Table["ID","Name","Count"];goal=join it with var_17==>>`join var_17 Join_Kind.Inner`

View File

@@ -48,6 +48,6 @@ collect_benches = Bench.build builder->
         group_builder.specify "Sort_Table" <|
             data.table.order_by "X"
         group_builder.specify "Group_And_Sort" <|
-            data.table.aggregate [Aggregate_Column.Group_By "X", Aggregate_Column.Last "Y" order_by="Y"]
+            data.table.aggregate ["X"] [Aggregate_Column.Last "Y" order_by="Y"]

 main = collect_benches . run_main

View File

@@ -1,7 +1,6 @@
 from Standard.Base import all hiding First, Last
-from Standard.Table import Table
-from Standard.Table.Data.Aggregate_Column.Aggregate_Column import all
+from Standard.Table import Table, Aggregate_Column

 from Standard.Test import Bench, Faker

@@ -34,67 +33,67 @@ collect_benches = Bench.build builder->
     builder.group "Table_Aggregate" options group_builder->
         group_builder.specify "Count_table" <|
-            data.table.aggregate [Count]
+            data.table.aggregate [Aggregate_Column.Count]
         group_builder.specify "Max_table" <|
-            data.table.aggregate [Maximum "ValueWithNothing"]
+            data.table.aggregate [Aggregate_Column.Maximum "ValueWithNothing"]
         group_builder.specify "Sum_table" <|
-            data.table.aggregate [Sum "ValueWithNothing"]
+            data.table.aggregate [Aggregate_Column.Sum "ValueWithNothing"]
         group_builder.specify "Count_Distinct_table" <|
-            data.table.aggregate [Count_Distinct "Index"]
+            data.table.aggregate [Aggregate_Column.Count_Distinct "Index"]
         group_builder.specify "StDev_table" <|
-            data.table.aggregate [Standard_Deviation "Value"]
+            data.table.aggregate [Aggregate_Column.Standard_Deviation "Value"]
         group_builder.specify "Median_table" <|
-            data.table.aggregate [Median "Value"]
+            data.table.aggregate [Aggregate_Column.Median "Value"]
         group_builder.specify "Mode_table" <|
-            data.table.aggregate [Mode "Index"]
+            data.table.aggregate [Aggregate_Column.Mode "Index"]
         group_builder.specify "Count_grouped" <|
-            data.table.aggregate [Group_By "Index", Count]
+            data.table.aggregate ["Index"] [Aggregate_Column.Count]
         group_builder.specify "Max_grouped" <|
-            data.table.aggregate [Group_By "Index", Maximum "ValueWithNothing"]
+            data.table.aggregate ["Index"] [Aggregate_Column.Maximum "ValueWithNothing"]
         group_builder.specify "Sum_grouped" <|
-            data.table.aggregate [Group_By "Index", Sum "ValueWithNothing"]
+            data.table.aggregate ["Index"] [Aggregate_Column.Sum "ValueWithNothing"]
         group_builder.specify "Count_Distinct_grouped" <|
-            data.table.aggregate [Group_By "Index", Count_Distinct "Code"]
+            data.table.aggregate ["Index"] [Aggregate_Column.Count_Distinct "Code"]
         group_builder.specify "StDev_grouped" <|
-            data.table.aggregate [Group_By "Index", Standard_Deviation "Value"]
+            data.table.aggregate ["Index"] [Aggregate_Column.Standard_Deviation "Value"]
         group_builder.specify "Median_grouped" <|
-            data.table.aggregate [Group_By "Index", Median "Value"]
+            data.table.aggregate ["Index"] [Aggregate_Column.Median "Value"]
         group_builder.specify "Mode_grouped" <|
-            data.table.aggregate [Group_By "Index", Mode "Index"]
+            data.table.aggregate ["Index"] [Aggregate_Column.Mode "Index"]
         group_builder.specify "Count_2_level_groups" <|
-            data.table.aggregate [Group_By "Index", Group_By "Flag", Count]
+            data.table.aggregate ["Index", "Flag"] [Aggregate_Column.Count]
         group_builder.specify "Max_2_level_groups" <|
-            data.table.aggregate [Group_By "Index", Group_By "Flag", Maximum "ValueWithNothing"]
+            data.table.aggregate ["Index", "Flag"] [Aggregate_Column.Maximum "ValueWithNothing"]
         group_builder.specify "Sum_2_level_groups" <|
-            data.table.aggregate [Group_By "Index", Group_By "Flag", Sum "ValueWithNothing"]
+            data.table.aggregate ["Index", "Flag"] [Aggregate_Column.Sum "ValueWithNothing"]
         group_builder.specify "Count_Distinct_2_level_groups" <|
-            data.table.aggregate [Group_By "Index", Group_By "Flag", Count_Distinct "Code"]
+            data.table.aggregate ["Index", "Flag"] [Aggregate_Column.Count_Distinct "Code"]
         group_builder.specify "StDev_2_level_groups" <|
-            data.table.aggregate [Group_By "Index", Group_By "Flag", Standard_Deviation "Value"]
+            data.table.aggregate ["Index", "Flag"] [Aggregate_Column.Standard_Deviation "Value"]
         group_builder.specify "Median_2_level_groups" <|
-            data.table.aggregate [Group_By "Index", Group_By "Flag", Median "Value"]
+            data.table.aggregate ["Index", "Flag"] [Aggregate_Column.Median "Value"]
         group_builder.specify "Mode_2_level_groups" <|
-            data.table.aggregate [Group_By "Index", Group_By "Flag", Mode "Index"]
+            data.table.aggregate ["Index", "Flag"] [Aggregate_Column.Mode "Index"]

 main = collect_benches . run_main

View File

@@ -1,7 +1,6 @@
 from Standard.Base import all
-from Standard.Table import Table
-from Standard.Table.Data.Aggregate_Column.Aggregate_Column import Count, Sum
+from Standard.Table import Table, Aggregate_Column

 from Standard.Test import Bench, Faker

@@ -42,7 +41,7 @@ collect_benches = Bench.build builder->
     data = Data.create num_rows
     builder.group ("CrossTab_" + num_rows.to_text) options group_builder->
-        specify group_by name_column values=[Count] =
+        specify group_by name_column values=[Aggregate_Column.Count] =
             name = (group_by.join '_') + "_" + name_column + "_" + (values.map .to_text . join "_")
             group_builder.specify (normalize_name name) <|
                 data.table.cross_tab group_by name_column values
@@ -53,7 +52,7 @@ collect_benches = Bench.build builder->
         specify ["type"] "size"
         specify ["store"] "size"
         specify ["size"] "store"
-        specify ["size"] "store" values=[Count, Sum "price"]
+        specify ["size"] "store" values=[Aggregate_Column.Count, Aggregate_Column.Sum "price"]

 normalize_name : Text -> Text

View File

@@ -14,7 +14,7 @@ type Boxed_Total_Aggregate
     Instance text_column

     current_aggregate_implementation self =
-        self.text_column.to_table.aggregate [Aggregate_Column.Longest 0] . at 0 . at 0
+        self.text_column.to_table.aggregate [] [Aggregate_Column.Longest 0] . at 0 . at 0

     java_loop self =
         SimpleStorageAggregateHelpers.longestText self.text_column.java_column.getStorage
@@ -46,7 +46,7 @@ type Primitive_Total_Aggregate
     Instance int_column

     current_aggregate_implementation self =
-        self.int_column.to_table.aggregate [Aggregate_Column.Sum 0] . at 0 . at 0
+        self.int_column.to_table.aggregate [] [Aggregate_Column.Sum 0] . at 0 . at 0

     java_loop self =
         long_storage = self.int_column.java_column.getStorage

View File

@@ -4,8 +4,7 @@ import Standard.Base.Errors.Illegal_Argument.Illegal_Argument
 import Standard.Database.Extensions.Upload_Database_Table
 import Standard.Database.Extensions.Upload_In_Memory_Table
-from Standard.Table import Sort_Column
-from Standard.Table.Data.Aggregate_Column.Aggregate_Column import Group_By, Sum
+from Standard.Table import Sort_Column, Aggregate_Column
 from Standard.Table.Errors import Missing_Input_Columns, Duplicate_Output_Column_Names, Floating_Point_Equality

 from Standard.Test import all

@@ -118,7 +117,7 @@ add_specs suite_builder setup =
     group_builder.specify "Should work correctly after aggregation" <|
         t0 = table_builder [["X", ["a", "b", "a", "c"]], ["Y", [1, 2, 4, 8]]]
-        t1 = t0.aggregate [Group_By "X", Sum "Y"]
+        t1 = t0.aggregate ["X"] [Aggregate_Column.Sum "Y"]
         t2 = t1.order_by "X" . add_row_number

         t2.at "X" . to_vector . should_equal ['a', 'b', 'c']

View File

@@ -89,7 +89,7 @@ add_specs suite_builder setup =
     group_builder.specify "case insensitive name collisions - aggregate" <|
         t1 = table_builder [["X", [2, 1, 3, 2]]]
-        t2 = t1.aggregate [Aggregate_Column.Maximum "X" "A", Aggregate_Column.Minimum "X" "a"]
+        t2 = t1.aggregate columns=[Aggregate_Column.Maximum "X" "A", Aggregate_Column.Minimum "X" "a"]
         case is_case_sensitive of
             True ->

View File

@@ -1,7 +1,7 @@
 from Standard.Base import all
 import Standard.Base.Errors.Illegal_Argument.Illegal_Argument

-from Standard.Table.Data.Aggregate_Column.Aggregate_Column import Average, Count, Group_By, Sum, Concatenate
+from Standard.Table import Aggregate_Column
 import Standard.Table.Data.Expression.Expression_Error
 from Standard.Table.Errors import all
@@ -53,7 +53,7 @@ add_specs suite_builder setup =
         t1.at "z" . to_vector . should_equal [2]

     group_builder.specify "should allow a different aggregate" <|
-        t1 = data.table.cross_tab [] "Key" values=[Sum "Value"]
+        t1 = data.table.cross_tab [] "Key" values=[Aggregate_Column.Sum "Value"]
         t1.column_names . should_equal ["x", "y", "z"]
         t1.row_count . should_equal 1
         t1.at "x" . to_vector . should_equal [10]
@@ -61,7 +61,7 @@ add_specs suite_builder setup =
         t1.at "z" . to_vector . should_equal [17]

     group_builder.specify "should allow a custom expression for the aggregate" <|
-        t1 = data.table.cross_tab [] "Key" values=[Sum "[Value]*[Value]"]
+        t1 = data.table.cross_tab [] "Key" values=[Aggregate_Column.Sum "[Value]*[Value]"]
         t1.column_names . should_equal ["x", "y", "z"]
         t1.row_count . should_equal 1
         t1.at "x" . to_vector . should_equal [30]
@@ -94,7 +94,7 @@ add_specs suite_builder setup =
         t1.at "z" . to_vector . should_equal [1, 1]

     group_builder.specify "should allow a grouping by Aggregate_Column" <|
-        t1 = data.table2.cross_tab [Group_By "Group"] "Key"
+        t1 = data.table2.cross_tab [Aggregate_Column.Group_By "Group"] "Key"
         t1.column_names . should_equal ["Group", "x", "y", "z"]
         t1.row_count . should_equal 2
         t1.at "Group" . to_vector . should_equal ["A", "B"]
@@ -102,11 +102,11 @@ add_specs suite_builder setup =
         t1.at "y" . to_vector . should_equal [2, 1]
         t1.at "z" . to_vector . should_equal [1, 1]

-        data.table2.cross_tab [Sum "Group"] "Key" . should_fail_with Illegal_Argument
+        data.table2.cross_tab [Aggregate_Column.Sum "Group"] "Key" . should_fail_with Illegal_Argument

     group_builder.specify "should allow a grouping by Aggregate_Colum, with some empty bins" <|
         table3 = table_builder [["Group", ["B","A","B","A","A"]], ["Key", ["x", "y", "y", "y", "z"]], ["Value", [4, 5, 6, 7, 8]]]
-        t1 = table3.cross_tab [Group_By "Group"] "Key"
+        t1 = table3.cross_tab [Aggregate_Column.Group_By "Group"] "Key"
         t1.column_names . should_equal ["Group", "x", "y", "z"]
         t1.row_count . should_equal 2
         t1.at "Group" . to_vector . should_equal ["A", "B"]
@@ -127,7 +127,7 @@ add_specs suite_builder setup =
         t2.column_names . should_equal ["Group", "x", "y", "z"]

     group_builder.specify "should allow multiple values aggregates" <|
-        t1 = data.table.cross_tab [] "Key" values=[Count, Sum "Value"]
+        t1 = data.table.cross_tab [] "Key" values=[Aggregate_Column.Count, Aggregate_Column.Sum "Value"]
         t1.column_names . should_equal ["x Count", "x Sum", "y Count", "y Sum", "z Count", "z Sum"]
         t1.row_count . should_equal 1
         t1.at "x Count" . to_vector . should_equal [4]
@@ -156,31 +156,31 @@ add_specs suite_builder setup =
         err2.catch.criteria . should_equal [42]

     group_builder.specify "should fail if aggregate values contain missing columns" <|
-        err1 = data.table.cross_tab [] "Key" values=[Count, Sum "Nonexistent Value", Sum "Value", Sum "OTHER"]
+        err1 = data.table.cross_tab [] "Key" values=[Aggregate_Column.Count, Aggregate_Column.Sum "Nonexistent Value", Aggregate_Column.Sum "Value", Aggregate_Column.Sum "OTHER"]
         err1.should_fail_with Invalid_Aggregate_Column
         err1.catch.name . should_equal "Nonexistent Value"

-        err2 = data.table.cross_tab [] "Key" values=[Count, Sum "Nonexistent Value", Sum "Value", Sum 42]
+        err2 = data.table.cross_tab [] "Key" values=[Aggregate_Column.Count, Aggregate_Column.Sum "Nonexistent Value", Aggregate_Column.Sum "Value", Aggregate_Column.Sum 42]
         err2.should_fail_with Missing_Input_Columns
         err2.catch.criteria . should_equal [42]

     group_builder.specify "should fail if aggregate values contain invalid expressions" <|
-        err1 = data.table.cross_tab [] "Key" values=[Sum "[MISSING]*10"]
+        err1 = data.table.cross_tab [] "Key" values=[Aggregate_Column.Sum "[MISSING]*10"]
         err1.should_fail_with Invalid_Aggregate_Column
         err1.catch.name . should_equal "[MISSING]*10"
         err1.catch.expression_error . should_equal (No_Such_Column.Error "MISSING")

-        err2 = data.table.cross_tab [] "Key" values=[Sum "[[["]
+        err2 = data.table.cross_tab [] "Key" values=[Aggregate_Column.Sum "[[["]
         err2.should_fail_with Invalid_Aggregate_Column
         err2.catch.name . should_equal "[[["
         err2.catch.expression_error . should_be_a Expression_Error.Syntax_Error

     group_builder.specify "should not allow Group_By for values" <|
-        err1 = data.table.cross_tab [] "Key" values=[Count, Group_By "Value"] on_problems=Problem_Behavior.Ignore
+        err1 = data.table.cross_tab [] "Key" values=[Aggregate_Column.Count, Aggregate_Column.Group_By "Value"] on_problems=Problem_Behavior.Ignore
         err1.should_fail_with Illegal_Argument

     group_builder.specify "should gracefully handle duplicate aggregate names" <|
-        action = data.table.cross_tab [] "Key" values=[Count new_name="Agg1", Sum "Value" new_name="Agg1"] on_problems=_
+        action = data.table.cross_tab [] "Key" values=[Aggregate_Column.Count new_name="Agg1", Aggregate_Column.Sum "Value" new_name="Agg1"] on_problems=_
         tester table =
             table.column_names . should_equal ["x Agg1", "x Agg1 1", "y Agg1", "y Agg1 1", "z Agg1", "z Agg1 1"]
         problems = [Duplicate_Output_Column_Names.Error ["x Agg1", "y Agg1", "z Agg1"]]
@@ -235,11 +235,11 @@ add_specs suite_builder setup =
     t = table_builder [["Key", ["a", "a", "b", "b"]], ["ints", [1, 2, 3, 4]], ["texts", ["a", "b", "c", "d"]]]
[Problem_Behavior.Report_Error, Problem_Behavior.Report_Warning, Problem_Behavior.Ignore].each pb-> Test.with_clue "Problem_Behavior="+pb.to_text+" " <| [Problem_Behavior.Report_Error, Problem_Behavior.Report_Warning, Problem_Behavior.Ignore].each pb-> Test.with_clue "Problem_Behavior="+pb.to_text+" " <|
t1 = t.cross_tab [] "Key" values=[Average "texts"] on_problems=pb t1 = t.cross_tab [] "Key" values=[Aggregate_Column.Average "texts"] on_problems=pb
t1.should_fail_with Invalid_Value_Type t1.should_fail_with Invalid_Value_Type
t2 = t.cross_tab [] "Key" values=[Sum "texts"] on_problems=pb t2 = t.cross_tab [] "Key" values=[Aggregate_Column.Sum "texts"] on_problems=pb
t2.should_fail_with Invalid_Value_Type t2.should_fail_with Invalid_Value_Type
t3 = t.cross_tab [] "Key" values=[Concatenate "ints"] on_problems=pb t3 = t.cross_tab [] "Key" values=[Aggregate_Column.Concatenate "ints"] on_problems=pb
t3.should_fail_with Invalid_Value_Type t3.should_fail_with Invalid_Value_Type
group_builder.specify "should return predictable types" <| group_builder.specify "should return predictable types" <|
@ -250,7 +250,7 @@ add_specs suite_builder setup =
t1.at "a" . value_type . is_integer . should_be_true t1.at "a" . value_type . is_integer . should_be_true
t1.at "b" . value_type . is_integer . should_be_true t1.at "b" . value_type . is_integer . should_be_true
t2 = table.cross_tab [] "Int" values=[Average "Float", Concatenate "Text"] . sort_columns t2 = table.cross_tab [] "Int" values=[Aggregate_Column.Average "Float", Aggregate_Column.Concatenate "Text"] . sort_columns
t2.column_names . should_equal ["1 Average Float", "1 Concatenate Text", "2 Average Float", "2 Concatenate Text"] t2.column_names . should_equal ["1 Average Float", "1 Concatenate Text", "2 Average Float", "2 Concatenate Text"]
t2.at "1 Average Float" . value_type . is_floating_point . should_be_true t2.at "1 Average Float" . value_type . is_floating_point . should_be_true
t2.at "1 Concatenate Text" . value_type . is_text . should_be_true t2.at "1 Concatenate Text" . value_type . is_text . should_be_true
@ -263,7 +263,7 @@ add_specs suite_builder setup =
r1.should_fail_with Invalid_Column_Names r1.should_fail_with Invalid_Column_Names
r1.catch.to_display_text . should_contain "cannot contain the NUL character" r1.catch.to_display_text . should_contain "cannot contain the NUL character"
r2 = data.table2.cross_tab [] "Key" values=[Average "Value" new_name='x\0'] r2 = data.table2.cross_tab [] "Key" values=[Aggregate_Column.Average "Value" new_name='x\0']
r2.print r2.print
r2.should_fail_with Invalid_Column_Names r2.should_fail_with Invalid_Column_Names
r2.catch.to_display_text . should_contain "cannot contain the NUL character" r2.catch.to_display_text . should_contain "cannot contain the NUL character"
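The pattern in the hunks above is mechanical: with the unqualified constructor imports removed, every aggregate constructor is now spelled through the `Aggregate_Column` type. A minimal before/after sketch, assuming a `table` with `Key` and `Value` columns:

    # before: relied on unqualified imports such as `Count` and `Sum`
    table.cross_tab [] "Key" values=[Count, Sum "Value"]
    # after: constructors fully qualified, no special imports needed
    table.cross_tab [] "Key" values=[Aggregate_Column.Count, Aggregate_Column.Sum "Value"]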
View File
@ -2,7 +2,6 @@ from Standard.Base import all
# We hide the table constructor as instead we are supposed to use `table_builder` which is backend-agnostic. # We hide the table constructor as instead we are supposed to use `table_builder` which is backend-agnostic.
from Standard.Table import all hiding Table from Standard.Table import all hiding Table
from Standard.Table.Data.Aggregate_Column.Aggregate_Column import Group_By, Count, Sum
from Standard.Test import all from Standard.Test import all
@ -54,7 +53,7 @@ add_specs suite_builder setup =
t1 = table_builder [["Count", [1, 2, 3]], ["Class", ["X", "Y", "Z"]]] t1 = table_builder [["Count", [1, 2, 3]], ["Class", ["X", "Y", "Z"]]]
t2 = table_builder [["Letter", ["A", "B", "A", "A", "C", "A", "C", "D", "D", "B", "B"]]] t2 = table_builder [["Letter", ["A", "B", "A", "A", "C", "A", "C", "D", "D", "B", "B"]]]
t3 = t2.aggregate [Group_By "Letter", Count] t3 = t2.aggregate ["Letter"] [Aggregate_Column.Count]
t4 = t3.join t1 on="Count" join_kind=Join_Kind.Left_Outer |> materialize |> _.order_by "Letter" t4 = t3.join t1 on="Count" join_kind=Join_Kind.Left_Outer |> materialize |> _.order_by "Letter"
t4.columns.map .name . should_equal ["Letter", "Count", "Right Count", "Class"] t4.columns.map .name . should_equal ["Letter", "Count", "Right Count", "Class"]
rows = t4.rows . map .to_vector rows = t4.rows . map .to_vector
@ -66,7 +65,7 @@ add_specs suite_builder setup =
group_builder.specify "aggregates and distinct" <| group_builder.specify "aggregates and distinct" <|
t2 = table_builder [["Letter", ["A", "B", "A", "A", "C", "C"]], ["Points", [2, 5, 2, 1, 10, 3]]] t2 = table_builder [["Letter", ["A", "B", "A", "A", "C", "C"]], ["Points", [2, 5, 2, 1, 10, 3]]]
t3 = t2.aggregate [Group_By "Letter", Sum "Points"] t3 = t2.aggregate ["Letter"] [Aggregate_Column.Sum "Points"]
t4 = t3.distinct "Sum Points" |> materialize |> _.order_by "Sum Points" t4 = t3.distinct "Sum Points" |> materialize |> _.order_by "Sum Points"
t4.columns.map .name . should_equal ["Letter", "Sum Points"] t4.columns.map .name . should_equal ["Letter", "Sum Points"]
t4.row_count . should_equal 2 t4.row_count . should_equal 2
@ -81,7 +80,7 @@ add_specs suite_builder setup =
group_builder.specify "aggregates and filtering" <| group_builder.specify "aggregates and filtering" <|
t2 = table_builder [["Letter", ["A", "B", "A", "A", "C", "C", "B"]], ["Points", [2, 5, 2, 1, 10, 3, 0]]] t2 = table_builder [["Letter", ["A", "B", "A", "A", "C", "C", "B"]], ["Points", [2, 5, 2, 1, 10, 3, 0]]]
t3 = t2.aggregate [Group_By "Letter", Sum "Points"] t3 = t2.aggregate ["Letter"] [Aggregate_Column.Sum "Points"]
t4 = t3.filter "Sum Points" (Filter_Condition.Equal 5) |> materialize |> _.order_by "Letter" t4 = t3.filter "Sum Points" (Filter_Condition.Equal 5) |> materialize |> _.order_by "Letter"
t4.columns.map .name . should_equal ["Letter", "Sum Points"] t4.columns.map .name . should_equal ["Letter", "Sum Points"]
rows = t4.rows . map .to_vector rows = t4.rows . map .to_vector
@ -90,7 +89,7 @@ add_specs suite_builder setup =
group_builder.specify "aggregates and ordering" <| group_builder.specify "aggregates and ordering" <|
t1 = table_builder [["Letter", ["C", "A", "B", "A", "A", "C", "C", "B"]], ["Points", [0, -100, 5, 2, 1, 10, 3, 0]]] t1 = table_builder [["Letter", ["C", "A", "B", "A", "A", "C", "C", "B"]], ["Points", [0, -100, 5, 2, 1, 10, 3, 0]]]
t2 = t1.aggregate [Group_By "Letter", Sum "Points"] t2 = t1.aggregate ["Letter"] [Aggregate_Column.Sum "Points"]
t3 = t2.order_by "Sum Points" |> materialize t3 = t2.order_by "Sum Points" |> materialize
t3.columns.map .name . should_equal ["Letter", "Sum Points"] t3.columns.map .name . should_equal ["Letter", "Sum Points"]
t3.at "Letter" . to_vector . should_equal ["A", "B", "C"] t3.at "Letter" . to_vector . should_equal ["A", "B", "C"]
@ -194,7 +193,7 @@ add_specs suite_builder setup =
vt1.should_be_a (Value_Type.Char ...) vt1.should_be_a (Value_Type.Char ...)
vt1.variable_length.should_be_true vt1.variable_length.should_be_true
t4 = t3.aggregate [Aggregate_Column.Shortest "X", Aggregate_Column.Group_By "Y"] t4 = t3.aggregate ["Y"] [Aggregate_Column.Shortest "X"]
vt2 = t4.at "Shortest X" . value_type vt2 = t4.at "Shortest X" . value_type
Test.with_clue "t4[X].value_type="+vt2.to_display_text+": " <| Test.with_clue "t4[X].value_type="+vt2.to_display_text+": " <|
vt2.should_be_a (Value_Type.Char ...) vt2.should_be_a (Value_Type.Char ...)
@ -219,7 +218,7 @@ add_specs suite_builder setup =
c.value_type.variable_length.should_be_true c.value_type.variable_length.should_be_true
t2 = t1.set c "C" t2 = t1.set c "C"
t3 = t2.aggregate [Aggregate_Column.Shortest "C"] t3 = t2.aggregate columns=[Aggregate_Column.Shortest "C"]
t3.at "Shortest C" . to_vector . should_equal ["b"] t3.at "Shortest C" . to_vector . should_equal ["b"]
vt = t3.at "Shortest C" . value_type vt = t3.at "Shortest C" . value_type
Test.with_clue "t3[C].value_type="+vt.to_display_text+": " <| Test.with_clue "t3[C].value_type="+vt.to_display_text+": " <|
View File
@ -1,7 +1,6 @@
from Standard.Base import all from Standard.Base import all
from Standard.Table import Value_Type, Column_Ref, Previous_Value, Blank_Selector from Standard.Table import Value_Type, Column_Ref, Previous_Value, Blank_Selector
from Standard.Table.Data.Aggregate_Column.Aggregate_Column import Count_Distinct
from Standard.Table.Errors import all from Standard.Table.Errors import all
from Standard.Database.Errors import Unsupported_Database_Operation from Standard.Database.Errors import Unsupported_Database_Operation
View File
@ -4,7 +4,7 @@ import Standard.Base.Errors.Common.Index_Out_Of_Bounds
import Standard.Base.Errors.Common.Type_Error import Standard.Base.Errors.Common.Type_Error
import Standard.Base.Errors.Illegal_Argument.Illegal_Argument import Standard.Base.Errors.Illegal_Argument.Illegal_Argument
from Standard.Table.Data.Aggregate_Column.Aggregate_Column import Group_By, Sum from Standard.Table import Aggregate_Column
from Standard.Table.Errors import all from Standard.Table.Errors import all
from Standard.Test import all from Standard.Test import all
@ -254,7 +254,7 @@ add_specs suite_builder setup =
group_builder.specify "Should work correctly after aggregation" <| group_builder.specify "Should work correctly after aggregation" <|
t0 = table_builder [["X", ["a", "b", "a", "c"]], ["Y", [1, 2, 4, 8]]] t0 = table_builder [["X", ["a", "b", "a", "c"]], ["Y", [1, 2, 4, 8]]]
t1 = t0.aggregate [Group_By "X", Sum "Y"] t1 = t0.aggregate ["X"] [Aggregate_Column.Sum "Y"]
t2 = t1.order_by "X" . take 2 t2 = t1.order_by "X" . take 2
t2.at "X" . to_vector . should_equal ['a', 'b'] t2.at "X" . to_vector . should_equal ['a', 'b']
View File
@ -1,8 +1,7 @@
from Standard.Base import all from Standard.Base import all
import Standard.Base.Errors.Illegal_State.Illegal_State import Standard.Base.Errors.Illegal_State.Illegal_State
from Standard.Table import Sort_Column, Value_Type, Blank_Selector from Standard.Table import Sort_Column, Value_Type, Blank_Selector, Aggregate_Column
from Standard.Table.Data.Aggregate_Column.Aggregate_Column import all hiding First, Last
from Standard.Table.Errors import No_Input_Columns_Selected, Missing_Input_Columns, No_Such_Column from Standard.Table.Errors import No_Input_Columns_Selected, Missing_Input_Columns, No_Such_Column
from Standard.Database import all from Standard.Database import all
@ -158,12 +157,12 @@ add_specs suite_builder =
data.teardown data.teardown
group_builder.specify "should allow to count rows" <| group_builder.specify "should allow to count rows" <|
code = data.t1.aggregate [Group_By "A" "A grp", Count "counter"] . to_sql . prepare code = data.t1.aggregate ["A"] [Aggregate_Column.Count "counter"] . to_sql . prepare
code . should_equal ['SELECT "T1"."A grp" AS "A grp", "T1"."counter" AS "counter" FROM (SELECT "T1"."A" AS "A grp", COUNT(*) AS "counter" FROM "T1" AS "T1" GROUP BY "T1"."A") AS "T1"', []] code . should_equal ['SELECT "T1"."A" AS "A", "T1"."counter" AS "counter" FROM (SELECT "T1"."A" AS "A", COUNT(*) AS "counter" FROM "T1" AS "T1" GROUP BY "T1"."A") AS "T1"', []]
group_builder.specify "should allow to group by multiple fields" <| group_builder.specify "should allow to group by multiple fields" <|
code = data.t1.aggregate [Sum "A" "sum_a", Group_By "C", Group_By "B" "B grp"] . to_sql . prepare code = data.t1.aggregate ["C", "B"] [Aggregate_Column.Sum "A" "sum_a"] . to_sql . prepare
code . should_equal ['SELECT "T1"."sum_a" AS "sum_a", "T1"."C" AS "C", "T1"."B grp" AS "B grp" FROM (SELECT SUM("T1"."A") AS "sum_a", "T1"."C" AS "C", "T1"."B" AS "B grp" FROM "T1" AS "T1" GROUP BY "T1"."C", "T1"."B") AS "T1"', []] code . should_equal ['SELECT "T1"."C" AS "C", "T1"."B" AS "B", "T1"."sum_a" AS "sum_a" FROM (SELECT "T1"."C" AS "C", "T1"."B" AS "B", SUM("T1"."A") AS "sum_a" FROM "T1" AS "T1" GROUP BY "T1"."C", "T1"."B") AS "T1"', []]
main = main =
suite = Test.build suite_builder-> suite = Test.build suite_builder->
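The expected SQL changes because the new `group_by` argument takes plain column names, so a custom alias such as "A grp" (previously written `Group_By "A" "A grp"`) can no longer be attached to a grouping column, which therefore keeps its original name. A sketch of the new call and the shape of the SQL it prepares, abridged from the test above:

    code = data.t1.aggregate ["A"] [Aggregate_Column.Count "counter"] . to_sql . prepare
    # SELECT "T1"."A" AS "A", COUNT(*) AS "counter" FROM ... GROUP BY "T1"."A"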
View File
@ -2,8 +2,7 @@ from Standard.Base import all
import Standard.Base.Errors.Common.Index_Out_Of_Bounds import Standard.Base.Errors.Common.Index_Out_Of_Bounds
import Standard.Base.Errors.Illegal_Argument.Illegal_Argument import Standard.Base.Errors.Illegal_Argument.Illegal_Argument
from Standard.Table import Table, Sort_Column from Standard.Table import Table, Sort_Column, Aggregate_Column
from Standard.Table.Data.Aggregate_Column.Aggregate_Column import all hiding First, Last
from Standard.Table.Errors import all from Standard.Table.Errors import all
from Standard.Database import all from Standard.Database import all
@ -424,15 +423,15 @@ add_specs (suite_builder : Suite_Builder) (prefix : Text) (create_connection_fn
group_builder.specify "should allow counting group sizes and elements" <| group_builder.specify "should allow counting group sizes and elements" <|
## Names set to lower case to avoid issue with Redshift where columns are ## Names set to lower case to avoid issue with Redshift where columns are
returned in lower case. returned in lower case.
aggregates = [Count "count", Count_Not_Nothing "price" "count not nothing price", Count_Nothing "price" "count nothing price"] aggregates = [Aggregate_Column.Count "count", Aggregate_Column.Count_Not_Nothing "price" "count not nothing price", Aggregate_Column.Count_Nothing "price" "count nothing price"]
t1 = determinize_by "name" (data.t9.aggregate ([Group_By "name"] + aggregates) . read) t1 = determinize_by "name" (data.t9.aggregate ["name"] aggregates . read)
t1.at "name" . to_vector . should_equal ["bar", "baz", "foo", "quux", "zzzz"] t1.at "name" . to_vector . should_equal ["bar", "baz", "foo", "quux", "zzzz"]
t1.at "count" . to_vector . should_equal [2, 1, 5, 1, 7] t1.at "count" . to_vector . should_equal [2, 1, 5, 1, 7]
t1.at "count not nothing price" . to_vector . should_equal [2, 1, 3, 0, 5] t1.at "count not nothing price" . to_vector . should_equal [2, 1, 3, 0, 5]
t1.at "count nothing price" . to_vector . should_equal [0, 0, 2, 1, 2] t1.at "count nothing price" . to_vector . should_equal [0, 0, 2, 1, 2]
t2 = data.t9.aggregate aggregates . read t2 = data.t9.aggregate [] aggregates . read
t2.at "count" . to_vector . should_equal [16] t2.at "count" . to_vector . should_equal [16]
t2.at "count not nothing price" . to_vector . should_equal [11] t2.at "count not nothing price" . to_vector . should_equal [11]
t2.at "count nothing price" . to_vector . should_equal [5] t2.at "count nothing price" . to_vector . should_equal [5]
@ -440,16 +439,16 @@ add_specs (suite_builder : Suite_Builder) (prefix : Text) (create_connection_fn
group_builder.specify "should allow simple arithmetic aggregations" <| group_builder.specify "should allow simple arithmetic aggregations" <|
## Names set to lower case to avoid issue with Redshift where columns are ## Names set to lower case to avoid issue with Redshift where columns are
returned in lower case. returned in lower case.
aggregates = [Sum "price" "sum price", Sum "quantity" "sum quantity", Average "price" "avg price"] aggregates = [Aggregate_Column.Sum "price" "sum price", Aggregate_Column.Sum "quantity" "sum quantity", Aggregate_Column.Average "price" "avg price"]
## TODO: check the datatypes ## TODO: check the datatypes
t1 = determinize_by "name" (data.t9.aggregate ([Group_By "name"] + aggregates) . read) t1 = determinize_by "name" (data.t9.aggregate ["name"] aggregates . read)
t1.at "name" . to_vector . should_equal ["bar", "baz", "foo", "quux", "zzzz"] t1.at "name" . to_vector . should_equal ["bar", "baz", "foo", "quux", "zzzz"]
t1.at "sum price" . to_vector . should_equal [100.5, 6.7, 1, Nothing, 2] t1.at "sum price" . to_vector . should_equal [100.5, 6.7, 1, Nothing, 2]
t1.at "sum quantity" . to_vector . should_equal [80, 40, 120, 70, 2] t1.at "sum quantity" . to_vector . should_equal [80, 40, 120, 70, 2]
t1.at "avg price" . to_vector . should_equal [50.25, 6.7, (1/3), Nothing, (2/5)] t1.at "avg price" . to_vector . should_equal [50.25, 6.7, (1/3), Nothing, (2/5)]
t2 = data.t9.aggregate aggregates . read t2 = data.t9.aggregate [] aggregates . read
t2.at "sum price" . to_vector . should_equal [110.2] t2.at "sum price" . to_vector . should_equal [110.2]
t2.at "sum quantity" . to_vector . should_equal [312] t2.at "sum quantity" . to_vector . should_equal [312]
t2.at "avg price" . to_vector . should_equal [(110.2 / 11)] t2.at "avg price" . to_vector . should_equal [(110.2 / 11)]
View File
@ -68,7 +68,7 @@ add_specs (suite_builder : Suite_Builder) (prefix : Text) (create_connection_fn
group_builder.specify "will return Nothing for composite tables (join, aggregate)" group_builder.specify "will return Nothing for composite tables (join, aggregate)"
data.db_table_with_key.join data.db_table_with_key . default_ordering . should_equal Nothing data.db_table_with_key.join data.db_table_with_key . default_ordering . should_equal Nothing
data.db_table_with_key.aggregate [Aggregate_Column.Group_By "X"] . default_ordering . should_equal Nothing data.db_table_with_key.aggregate ["X"] . default_ordering . should_equal Nothing
group_builder.specify "will return the ordering determined by order_by" <| group_builder.specify "will return the ordering determined by order_by" <|
v1 = data.db_table_with_key.order_by ["Y", Sort_Column.Name "X" Sort_Direction.Descending] . default_ordering v1 = data.db_table_with_key.order_by ["Y", Sort_Column.Name "X" Sort_Direction.Descending] . default_ordering
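Since `columns` now defaults to an empty vector, grouping alone is a valid call and yields just the group-by columns, replacing the old `[Aggregate_Column.Group_By "X"]` spelling:

    data.db_table_with_key.aggregate ["X"]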
View File
@ -188,7 +188,7 @@ add_specs suite_builder prefix create_connection_func =
src = Table.new [[name_1, [1, 2, 3]], [name_2, [4, 5, 6]]] src = Table.new [[name_1, [1, 2, 3]], [name_2, [4, 5, 6]]]
t1 = src.select_into_database_table data.connection (Name_Generator.random_name "long-column-names") temporary=True t1 = src.select_into_database_table data.connection (Name_Generator.random_name "long-column-names") temporary=True
# We create 2 Maximum columns that if wrongly truncated will have the same name, introducing possible ambiguity to further queries. # We create 2 Maximum columns that if wrongly truncated will have the same name, introducing possible ambiguity to further queries.
t2 = t1.aggregate [Aggregate_Column.Group_By name_1, Aggregate_Column.Maximum name_2, Aggregate_Column.Maximum name_1] t2 = t1.aggregate [name_1] [Aggregate_Column.Maximum name_2, Aggregate_Column.Maximum name_1]
# The newly added column would by default have a name exceeding the limit, if there's one - and its 'dumbly' truncated name will clash with the already existing column. # The newly added column would by default have a name exceeding the limit, if there's one - and its 'dumbly' truncated name will clash with the already existing column.
t3 = t1.set (t1.at name_1 * t1.at name_2) t3 = t1.set (t1.at name_1 * t1.at name_2)
@ -372,7 +372,7 @@ add_specs suite_builder prefix create_connection_func =
src = Table.new [["X", [1, 2, 3]]] src = Table.new [["X", [1, 2, 3]]]
db_table = src.select_into_database_table data.connection (Name_Generator.random_name "long-column-names") temporary=True db_table = src.select_into_database_table data.connection (Name_Generator.random_name "long-column-names") temporary=True
long_name = "a" * (max_column_name_length + 1) long_name = "a" * (max_column_name_length + 1)
r = db_table.aggregate [Aggregate_Column.Maximum "X" new_name=long_name] r = db_table.aggregate columns=[Aggregate_Column.Maximum "X" new_name=long_name]
r.should_fail_with Name_Too_Long r.should_fail_with Name_Too_Long
r.catch.entity_kind . should_equal "column" r.catch.entity_kind . should_equal "column"
r.catch.name . should_equal long_name r.catch.name . should_equal long_name
@ -382,7 +382,7 @@ add_specs suite_builder prefix create_connection_func =
name_b = "x" * (max_column_name_length - 1) + "B" name_b = "x" * (max_column_name_length - 1) + "B"
src = Table.new [[name_a, [1, 2, 3]], [name_b, [4, 5, 6]]] src = Table.new [[name_a, [1, 2, 3]], [name_b, [4, 5, 6]]]
db_table = src.select_into_database_table data.connection (Name_Generator.random_name "long-column-names") temporary=True db_table = src.select_into_database_table data.connection (Name_Generator.random_name "long-column-names") temporary=True
t2 = db_table.aggregate [Aggregate_Column.Maximum name_a, Aggregate_Column.Maximum name_b] t2 = db_table.aggregate columns=[Aggregate_Column.Maximum name_a, Aggregate_Column.Maximum name_b]
w1 = Problems.expect_warning Truncated_Column_Names t2 w1 = Problems.expect_warning Truncated_Column_Names t2
w1.original_names . should_contain_the_same_elements_as ["Maximum "+name_a, "Maximum "+name_b] w1.original_names . should_contain_the_same_elements_as ["Maximum "+name_a, "Maximum "+name_b]
@ -397,7 +397,7 @@ add_specs suite_builder prefix create_connection_func =
src2 = Table.new (names.map_with_index i-> name-> [name, [100 + i, 200 + i]]) src2 = Table.new (names.map_with_index i-> name-> [name, [100 + i, 200 + i]])
db_table2 = src2.select_into_database_table data.connection (Name_Generator.random_name "long-column-names") temporary=True db_table2 = src2.select_into_database_table data.connection (Name_Generator.random_name "long-column-names") temporary=True
Problems.assume_no_problems db_table2 Problems.assume_no_problems db_table2
t3 = db_table2.aggregate (names.map name-> Aggregate_Column.Maximum name) t3 = db_table2.aggregate columns=(names.map name-> Aggregate_Column.Maximum name)
w2 = Problems.expect_warning Truncated_Column_Names t3 w2 = Problems.expect_warning Truncated_Column_Names t3
w2.original_names . should_contain_the_same_elements_as (names.map name-> "Maximum " + name) w2.original_names . should_contain_the_same_elements_as (names.map name-> "Maximum " + name)
t3.column_names . should_contain_the_same_elements_as w2.truncated_names t3.column_names . should_contain_the_same_elements_as w2.truncated_names
View File
@ -4,8 +4,7 @@ import Standard.Base.Errors.Illegal_State.Illegal_State
import Standard.Base.Runtime.Ref.Ref import Standard.Base.Runtime.Ref.Ref
import Standard.Table.Data.Type.Value_Type.Bits import Standard.Table.Data.Type.Value_Type.Bits
from Standard.Table import Table, Value_Type from Standard.Table import Table, Value_Type, Aggregate_Column
from Standard.Table.Data.Aggregate_Column.Aggregate_Column import all hiding First, Last
from Standard.Table.Errors import Invalid_Column_Names, Inexact_Type_Coercion, Duplicate_Output_Column_Names from Standard.Table.Errors import Invalid_Column_Names, Inexact_Type_Coercion, Duplicate_Output_Column_Names
import Standard.Database.Data.Column.Column import Standard.Database.Data.Column.Column
@ -231,7 +230,7 @@ postgres_specific_spec suite_builder create_connection_fn db_name setup =
i.at "Value Type" . to_vector . should_equal [default_text, Value_Type.Integer, Value_Type.Boolean, Value_Type.Float] i.at "Value Type" . to_vector . should_equal [default_text, Value_Type.Integer, Value_Type.Boolean, Value_Type.Float]
group_builder.specify "should return Table information, also for aggregated results" <| group_builder.specify "should return Table information, also for aggregated results" <|
i = data.t.aggregate [Concatenate "strs", Sum "ints", Count_Distinct "bools"] . info i = data.t.aggregate columns=[Aggregate_Column.Concatenate "strs", Aggregate_Column.Sum "ints", Aggregate_Column.Count_Distinct "bools"] . info
i.at "Column" . to_vector . should_equal ["Concatenate strs", "Sum ints", "Count Distinct bools"] i.at "Column" . to_vector . should_equal ["Concatenate strs", "Sum ints", "Count Distinct bools"]
i.at "Items Count" . to_vector . should_equal [1, 1, 1] i.at "Items Count" . to_vector . should_equal [1, 1, 1]
i.at "Value Type" . to_vector . should_equal [default_text, Value_Type.Decimal, Value_Type.Integer] i.at "Value Type" . to_vector . should_equal [default_text, Value_Type.Decimal, Value_Type.Integer]
@ -277,19 +276,19 @@ postgres_specific_spec suite_builder create_connection_fn db_name setup =
data.teardown data.teardown
group_builder.specify "Concatenate, Shortest and Longest" <| group_builder.specify "Concatenate, Shortest and Longest" <|
r = data.t.aggregate [Concatenate "txt", Shortest "txt", Longest "txt"] r = data.t.aggregate columns=[Aggregate_Column.Concatenate "txt", Aggregate_Column.Shortest "txt", Aggregate_Column.Longest "txt"]
r.columns.at 0 . value_type . should_equal default_text r.columns.at 0 . value_type . should_equal default_text
r.columns.at 1 . value_type . should_equal default_text r.columns.at 1 . value_type . should_equal default_text
r.columns.at 2 . value_type . should_equal default_text r.columns.at 2 . value_type . should_equal default_text
group_builder.specify "Counts" <| group_builder.specify "Counts" <|
r = data.t.aggregate [Count, Count_Empty "txt", Count_Not_Empty "txt", Count_Distinct "i1", Count_Not_Nothing "i2", Count_Nothing "i3"] r = data.t.aggregate columns=[Aggregate_Column.Count, Aggregate_Column.Count_Empty "txt", Aggregate_Column.Count_Not_Empty "txt", Aggregate_Column.Count_Distinct "i1", Aggregate_Column.Count_Not_Nothing "i2", Aggregate_Column.Count_Nothing "i3"]
r.column_count . should_equal 6 r.column_count . should_equal 6
r.columns.each column-> r.columns.each column->
column.value_type . should_equal Value_Type.Integer column.value_type . should_equal Value_Type.Integer
group_builder.specify "Sum" <| group_builder.specify "Sum" <|
r = data.t.aggregate [Sum "i1", Sum "i2", Sum "i3", Sum "i4", Sum "r1", Sum "r2"] r = data.t.aggregate columns=[Aggregate_Column.Sum "i1", Aggregate_Column.Sum "i2", Aggregate_Column.Sum "i3", Aggregate_Column.Sum "i4", Aggregate_Column.Sum "r1", Aggregate_Column.Sum "r2"]
r.columns.at 0 . value_type . should_equal Value_Type.Integer r.columns.at 0 . value_type . should_equal Value_Type.Integer
r.columns.at 1 . value_type . should_equal Value_Type.Integer r.columns.at 1 . value_type . should_equal Value_Type.Integer
r.columns.at 2 . value_type . should_equal Value_Type.Decimal r.columns.at 2 . value_type . should_equal Value_Type.Decimal
@ -298,7 +297,7 @@ postgres_specific_spec suite_builder create_connection_fn db_name setup =
r.columns.at 5 . value_type . should_equal (Value_Type.Float Bits.Bits_64) r.columns.at 5 . value_type . should_equal (Value_Type.Float Bits.Bits_64)
group_builder.specify "Average" <| group_builder.specify "Average" <|
r = data.t.aggregate [Average "i1", Average "i2", Average "i3", Average "i4", Average "r1", Average "r2"] r = data.t.aggregate columns=[Aggregate_Column.Average "i1", Aggregate_Column.Average "i2", Aggregate_Column.Average "i3", Aggregate_Column.Average "i4", Aggregate_Column.Average "r1", Aggregate_Column.Average "r2"]
r.columns.at 0 . value_type . should_equal Value_Type.Decimal r.columns.at 0 . value_type . should_equal Value_Type.Decimal
r.columns.at 1 . value_type . should_equal Value_Type.Decimal r.columns.at 1 . value_type . should_equal Value_Type.Decimal
r.columns.at 2 . value_type . should_equal Value_Type.Decimal r.columns.at 2 . value_type . should_equal Value_Type.Decimal
View File
@ -111,7 +111,7 @@ add_specs suite_builder create_connection_fn =
t.evaluate_expression 'is_nan([d])' . value_type . should_equal Value_Type.Boolean t.evaluate_expression 'is_nan([d])' . value_type . should_equal Value_Type.Boolean
t.evaluate_expression 'is_nothing([a])' . value_type . should_equal Value_Type.Boolean t.evaluate_expression 'is_nothing([a])' . value_type . should_equal Value_Type.Boolean
t2 = t.aggregate [Aggregate_Column.Group_By "b", Aggregate_Column.Sum "a", Aggregate_Column.Maximum "a", Aggregate_Column.Count_Not_Nothing "c", Aggregate_Column.Concatenate "b", Aggregate_Column.Count, (Aggregate_Column.First "c" order_by="a")] t2 = t.aggregate ["b"] [Aggregate_Column.Sum "a", Aggregate_Column.Maximum "a", Aggregate_Column.Count_Not_Nothing "c", Aggregate_Column.Concatenate "b", Aggregate_Column.Count, (Aggregate_Column.First "c" order_by="a")]
t2.at "b" . value_type . should_equal default_text t2.at "b" . value_type . should_equal default_text
t2.at "Sum a" . value_type . should_equal (Value_Type.Integer Bits.Bits_64) t2.at "Sum a" . value_type . should_equal (Value_Type.Integer Bits.Bits_64)
t2.at "Maximum a" . value_type . should_equal (Value_Type.Integer Bits.Bits_16) t2.at "Maximum a" . value_type . should_equal (Value_Type.Integer Bits.Bits_16)
View File
@ -88,7 +88,7 @@ add_specs suite_builder =
t.evaluate_expression 'is_empty([b])' . value_type . should_equal Value_Type.Boolean t.evaluate_expression 'is_empty([b])' . value_type . should_equal Value_Type.Boolean
t.evaluate_expression 'is_nothing([a])' . value_type . should_equal Value_Type.Boolean t.evaluate_expression 'is_nothing([a])' . value_type . should_equal Value_Type.Boolean
t2 = t.aggregate [Aggregate_Column.Group_By "b", Aggregate_Column.Sum "a", Aggregate_Column.Maximum "a", Aggregate_Column.Minimum "d", Aggregate_Column.Count_Not_Nothing "c", Aggregate_Column.Concatenate "b", Aggregate_Column.Count] t2 = t.aggregate ["b"] [Aggregate_Column.Sum "a", Aggregate_Column.Maximum "a", Aggregate_Column.Minimum "d", Aggregate_Column.Count_Not_Nothing "c", Aggregate_Column.Concatenate "b", Aggregate_Column.Count]
t2.at "b" . value_type . should_equal Value_Type.Char t2.at "b" . value_type . should_equal Value_Type.Char
t2.at "Sum a" . value_type . should_equal Value_Type.Integer t2.at "Sum a" . value_type . should_equal Value_Type.Integer
t2.at "Maximum a" . value_type . should_equal Value_Type.Integer t2.at "Maximum a" . value_type . should_equal Value_Type.Integer
View File
@ -489,7 +489,7 @@ add_specs suite_builder make_new_connection prefix persistent_connector=True =
db_table_1 = t1.select_into_database_table data.connection (Name_Generator.random_name "source-table-1") temporary=True primary_key=Nothing db_table_1 = t1.select_into_database_table data.connection (Name_Generator.random_name "source-table-1") temporary=True primary_key=Nothing
db_table_2 = db_table_1.set "[Y] + 100 * [X]" "C1" . set '"constant_text"' "C2" db_table_2 = db_table_1.set "[Y] + 100 * [X]" "C1" . set '"constant_text"' "C2"
db_table_3 = db_table_1.aggregate [Aggregate_Column.Group_By "X", Aggregate_Column.Sum "[Y]*[Y]" "C3"] . set "[X] + 1" "X" db_table_3 = db_table_1.aggregate ["X"] [Aggregate_Column.Sum "[Y]*[Y]" "C3"] . set "[X] + 1" "X"
db_table_4 = db_table_2.join db_table_3 join_kind=Join_Kind.Left_Outer db_table_4 = db_table_2.join db_table_3 join_kind=Join_Kind.Left_Outer
db_table_4.is_trivial_query . should_fail_with Table_Not_Found db_table_4.is_trivial_query . should_fail_with Table_Not_Found
View File
@ -6,7 +6,6 @@ import Standard.Base.Errors.Common.Type_Error
import Standard.Base.Errors.Illegal_Argument.Illegal_Argument import Standard.Base.Errors.Illegal_Argument.Illegal_Argument
from Standard.Table import Table, Column, Sort_Column, Aggregate_Column, Blank_Selector from Standard.Table import Table, Column, Sort_Column, Aggregate_Column, Blank_Selector
from Standard.Table.Data.Aggregate_Column.Aggregate_Column import all hiding First, Last
import Standard.Table.Data.Type.Value_Type.Value_Type import Standard.Table.Data.Type.Value_Type.Value_Type
from Standard.Table.Errors import Invalid_Column_Names, Duplicate_Output_Column_Names, No_Input_Columns_Selected, Missing_Input_Columns, No_Such_Column, Floating_Point_Equality, Invalid_Value_Type, Row_Count_Mismatch from Standard.Table.Errors import Invalid_Column_Names, Duplicate_Output_Column_Names, No_Input_Columns_Selected, Missing_Input_Columns, No_Such_Column, Floating_Point_Equality, Invalid_Value_Type, Row_Count_Mismatch
@ -656,19 +655,19 @@ add_specs suite_builder =
objects = ["objects", [My.Data 0 1, My.Data 0 1, My.Data 2 2, My.Data 2 2]] objects = ["objects", [My.Data 0 1, My.Data 0 1, My.Data 2 2, My.Data 2 2]]
table = Table.new [dates, texts, mixed, ints, floats, objects] table = Table.new [dates, texts, mixed, ints, floats, objects]
t1 = table.aggregate [Group_By "dates", Shortest "texts", Aggregate_Column.First "texts", Aggregate_Column.First "objects", Aggregate_Column.First "ints", Aggregate_Column.Last "mixed"] t1 = table.aggregate ["dates"] [Aggregate_Column.Shortest "texts", Aggregate_Column.First "texts", Aggregate_Column.First "objects", Aggregate_Column.First "ints", Aggregate_Column.Last "mixed"]
t1.info.at "Column" . to_vector . should_equal ["dates", "Shortest texts", "First texts", "First objects", "First ints", "Last mixed"] t1.info.at "Column" . to_vector . should_equal ["dates", "Shortest texts", "First texts", "First objects", "First ints", "Last mixed"]
t1.info.at "Value Type" . to_vector . should_equal [Value_Type.Date, Value_Type.Char, Value_Type.Char, Value_Type.Mixed, Value_Type.Integer, Value_Type.Mixed] t1.info.at "Value Type" . to_vector . should_equal [Value_Type.Date, Value_Type.Char, Value_Type.Char, Value_Type.Mixed, Value_Type.Integer, Value_Type.Mixed]
t2 = table.aggregate [Mode "dates", Count_Not_Nothing "objects", Count_Distinct "texts", Minimum "ints", Maximum "floats"] t2 = table.aggregate [] [Aggregate_Column.Mode "dates", Aggregate_Column.Count_Not_Nothing "objects", Aggregate_Column.Count_Distinct "texts", Aggregate_Column.Minimum "ints", Aggregate_Column.Maximum "floats"]
t2.info.at "Column" . to_vector . should_equal ["Mode dates", "Count Not Nothing objects", "Count Distinct texts", "Minimum ints", "Maximum floats"] t2.info.at "Column" . to_vector . should_equal ["Mode dates", "Count Not Nothing objects", "Count Distinct texts", "Minimum ints", "Maximum floats"]
t2.info.at "Value Type" . to_vector . should_equal [Value_Type.Date, Value_Type.Integer, Value_Type.Integer, Value_Type.Integer, Value_Type.Float] t2.info.at "Value Type" . to_vector . should_equal [Value_Type.Date, Value_Type.Integer, Value_Type.Integer, Value_Type.Integer, Value_Type.Float]
t3 = table.aggregate [Group_By "texts", Group_By "ints", Aggregate_Column.Last "floats"] t3 = table.aggregate ["texts", "ints"] [Aggregate_Column.Last "floats"]
t3.info.at "Column" . to_vector . should_equal ["texts", "ints", "Last floats"] t3.info.at "Column" . to_vector . should_equal ["texts", "ints", "Last floats"]
t3.info.at "Value Type" . to_vector . should_equal [Value_Type.Char, Value_Type.Integer, Value_Type.Float] t3.info.at "Value Type" . to_vector . should_equal [Value_Type.Char, Value_Type.Integer, Value_Type.Float]
t4 = table.aggregate [Group_By "mixed", Sum "ints", Sum "floats"] t4 = table.aggregate ["mixed"] [Aggregate_Column.Sum "ints", Aggregate_Column.Sum "floats"]
t4.info.at "Column" . to_vector . should_equal ["mixed", "Sum ints", "Sum floats"] t4.info.at "Column" . to_vector . should_equal ["mixed", "Sum ints", "Sum floats"]
t4.info.at "Value Type" . to_vector . should_equal [Value_Type.Mixed, Value_Type.Float, Value_Type.Float] t4.info.at "Value Type" . to_vector . should_equal [Value_Type.Mixed, Value_Type.Float, Value_Type.Float]
@ -676,20 +675,20 @@ add_specs suite_builder =
texts = ["texts", ['ściana', 'ściana', 'łąka', 's\u0301ciana', 'ła\u0328ka', 'sciana']] texts = ["texts", ['ściana', 'ściana', 'łąka', 's\u0301ciana', 'ła\u0328ka', 'sciana']]
ints = ["ints", [1, 2, 4, 8, 16, 32]] ints = ["ints", [1, 2, 4, 8, 16, 32]]
table = Table.new [texts, ints] table = Table.new [texts, ints]
r1 = table.aggregate [Group_By "texts", Sum "ints"] . order_by ([Sort_Column.Name "texts"]) r1 = table.aggregate ["texts"] [Aggregate_Column.Sum "ints"] . order_by ([Sort_Column.Name "texts"])
r1.at "texts" . to_vector . should_equal ['sciana', 'ściana', 'łąka'] r1.at "texts" . to_vector . should_equal ['sciana', 'ściana', 'łąka']
r1.at "Sum ints" . to_vector . should_equal [32, 11, 20] r1.at "Sum ints" . to_vector . should_equal [32, 11, 20]
r2 = table.aggregate [Count_Distinct "texts"] r2 = table.aggregate columns=[Aggregate_Column.Count_Distinct "texts"]
r2.at "Count Distinct texts" . to_vector . should_equal [3] r2.at "Count Distinct texts" . to_vector . should_equal [3]
group_builder.specify "should be able to aggregate over enso Types" <| group_builder.specify "should be able to aggregate over enso Types" <|
weekday_table = Table.new [["days", [Day_Of_Week.Monday, Day_Of_Week.Monday, Day_Of_Week.Monday, Day_Of_Week.Tuesday, Day_Of_Week.Sunday]], ["group", [1,1,2,1,2]]] weekday_table = Table.new [["days", [Day_Of_Week.Monday, Day_Of_Week.Monday, Day_Of_Week.Monday, Day_Of_Week.Tuesday, Day_Of_Week.Sunday]], ["group", [1,1,2,1,2]]]
r1 = weekday_table.aggregate [Group_By "days"] . order_by "days" r1 = weekday_table.aggregate ["days"] . order_by "days"
r1.at "days" . to_vector . should_equal [Day_Of_Week.Sunday, Day_Of_Week.Monday, Day_Of_Week.Tuesday] r1.at "days" . to_vector . should_equal [Day_Of_Week.Sunday, Day_Of_Week.Monday, Day_Of_Week.Tuesday]
r2 = weekday_table.aggregate [Group_By "group", Minimum "days" "min", Maximum "days" "max"] . order_by "group" r2 = weekday_table.aggregate ["group"] [Aggregate_Column.Minimum "days" "min", Aggregate_Column.Maximum "days" "max"] . order_by "group"
r2.at "group" . to_vector . should_equal [1, 2] r2.at "group" . to_vector . should_equal [1, 2]
r2.at "min" . to_vector . should_equal [Day_Of_Week.Monday, Day_Of_Week.Sunday] r2.at "min" . to_vector . should_equal [Day_Of_Week.Monday, Day_Of_Week.Sunday]
r2.at "max" . to_vector . should_equal [Day_Of_Week.Tuesday, Day_Of_Week.Monday] r2.at "max" . to_vector . should_equal [Day_Of_Week.Tuesday, Day_Of_Week.Monday]
View File
@ -69,7 +69,7 @@ add_specs suite_builder =
json = make_json header=["A"] data=[['a', 'a']] all_rows=3 ixes_header=[] ixes=[] json = make_json header=["A"] data=[['a', 'a']] all_rows=3 ixes_header=[] ixes=[]
vis . should_equal json vis . should_equal json
g = data.t.aggregate [Aggregate_Column.Group_By "A", Aggregate_Column.Group_By "B", Aggregate_Column.Average "C"] . at "Average C" g = data.t.aggregate ["A", "B"] [Aggregate_Column.Average "C"] . at "Average C"
vis2 = Visualization.prepare_visualization g 1 vis2 = Visualization.prepare_visualization g 1
json2 = make_json header=["Average C"] data=[[4.0]] all_rows=2 ixes_header=[] ixes=[] json2 = make_json header=["Average C"] data=[[4.0]] all_rows=2 ixes_header=[] ixes=[]
vis2 . should_equal json2 vis2 . should_equal json2
View File
@ -36,7 +36,7 @@ main =
operator25 = Table.from_rows operator27 [operator27] operator25 = Table.from_rows operator27 [operator27]
operator28 = operator25.first_row operator28 = operator25.first_row
operator29 = operator28.at 'current_implementation' operator29 = operator28.at 'current_implementation'
operator30 = operator22.aggregate [(Aggregate_Column.Group_By 'Approach'), (Aggregate_Column.Average 'Time')] operator30 = operator22.aggregate ['Approach'] [Aggregate_Column.Average 'Time']
operator31 = operator30.filter 'Approach' (Filter_Condition.Equal operator29) operator31 = operator30.filter 'Approach' (Filter_Condition.Equal operator29)
operator32 = operator31.at 'Average Time' . at 0 operator32 = operator31.at 'Average Time' . at 0
operator33 = operator30.set "[Average Time] / "+operator32.to_text "Percentage" operator33 = operator30.set "[Average Time] / "+operator32.to_text "Percentage"
View File
@ -1,16 +1,12 @@
from Standard.Base import all from Standard.Base import all
from Standard.Base.Data.Filter_Condition import Filter_Condition
from Standard.Base.Data.Map import Map
from Standard.Base.Data.Time.Date import Date
from Standard.Table import all from Standard.Table import all
from Standard.Database import all from Standard.Database import all
from Standard.Table.Data.Aggregate_Column import Aggregate_Column
import Standard.Visualization import Standard.Visualization
main = main =
operator4 = [Aggregate_Column.Maximum "commit_timestamp"] operator4 = [Aggregate_Column.Maximum "commit_timestamp"]
operator11 = [Aggregate_Column.Minimum "commit_timestamp"] operator11 = [Aggregate_Column.Minimum "commit_timestamp"]
operator13 = [Aggregate_Column.Group_By "label"] operator13 = ["label"]
number1 = 26 number1 = 26
text1 = "benchs.csv" text1 = "benchs.csv"
operator5 = enso_project.data / text1 operator5 = enso_project.data / text1
@ -18,8 +14,8 @@ main =
operator7 = operator6.row_count operator7 = operator6.row_count
operator1 = operator6.column_names operator1 = operator6.column_names
operator9 = operator6.at 'commit_timestamp' operator9 = operator6.at 'commit_timestamp'
operator3 = operator6.aggregate operator4 operator3 = operator6.aggregate [] operator4
operator8 = operator6.aggregate operator11 operator8 = operator6.aggregate [] operator11
operator12 = operator6.aggregate operator13 operator12 = operator6.aggregate operator13
operator17 = operator12.at 'label' operator17 = operator12.at 'label'
operator18 = operator17.to_vector operator18 = operator17.to_vector
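Because `group_by` is the first positional argument, a vector that previously held `[Aggregate_Column.Group_By "label"]` shrinks to plain names and the call site itself needs no change:

    operator13 = ["label"]
    operator12 = operator6.aggregate operator13   # group-by only, no aggregations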