Update the aggregate API to take a separate group_by (#9027)

Separate out the `Group_By` from the column definition in `aggregate`.
![image](https://github.com/enso-org/enso/assets/4699705/6b4f03bc-1c4a-4582-b38a-ba528ae94167)
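A minimal before-and-after sketch of the call shape (mirroring the examples in the updated docstrings):

    # Old style, now deprecated - grouping mixed into the aggregate list:
    table.aggregate [Aggregate_Column.Group_By "Key", Aggregate_Column.Count]
    # New style - grouping keys passed as a separate first argument:
    table.aggregate ["Key"] [Aggregate_Column.Count]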

The old API is still supported, with a deprecation warning attached:
![image](https://github.com/enso-org/enso/assets/4699705/0cc42ff7-6047-41a5-bb99-c717d06d0d93)

The widgets have been updated: `Group_By` has been removed from the dropdown.
James Dunkerley 2024-02-13 10:23:59 +00:00 committed by GitHub
parent f7a84d06e4
commit 8c197f325b
34 changed files with 507 additions and 469 deletions

View File

@ -615,6 +615,7 @@
- [Allow removing rows using a Filter_Condition.][8861]
- [Added `Table.to_xml`.][8979]
- [Implemented Write support for `S3_File`.][8921]
- [Separate `Group_By` from `columns` into new argument on `aggregate`.][9027]
[debug-shortcuts]:
https://github.com/enso-org/enso/blob/develop/app/gui/docs/product/shortcuts.md#debug
@ -886,6 +887,7 @@
[8861]: https://github.com/enso-org/enso/pull/8861
[8979]: https://github.com/enso-org/enso/pull/8979
[8921]: https://github.com/enso-org/enso/pull/8921
[9027]: https://github.com/enso-org/enso/pull/9027
#### Enso Compiler

View File

@ -0,0 +1,13 @@
import project.Data.Text.Text
## A warning that an API is deprecated.
type Deprecated
## PRIVATE
Warning type_name:Text method_name:Text message:Text=""
## PRIVATE
Pretty prints the Deprecated warning.
to_display_text : Text
to_display_text self =
if self.message.is_empty then ("Deprecated: " + self.type_name + "." + self.method_name + " is deprecated.") else self.message
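A minimal usage sketch, assuming the `Warning.attach` mechanism used elsewhere in this commit (`My_Type`, `old_method` and `compute_value` are hypothetical):

    import Standard.Base.Errors.Deprecated.Deprecated
    result = compute_value
    # With an empty message the warning renders via `to_display_text` as:
    # "Deprecated: My_Type.old_method is deprecated."
    Warning.attach (Deprecated.Warning "My_Type" "old_method") result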

View File

@ -6,6 +6,7 @@ import Standard.Base.Errors.Common.Additional_Warnings
import Standard.Base.Errors.Common.Incomparable_Values
import Standard.Base.Errors.Common.Index_Out_Of_Bounds
import Standard.Base.Errors.Common.Type_Error
import Standard.Base.Errors.Deprecated.Deprecated
import Standard.Base.Errors.File_Error.File_Error
import Standard.Base.Errors.Illegal_Argument.Illegal_Argument
import Standard.Base.Errors.Illegal_State.Illegal_State
@ -1720,10 +1721,14 @@ type Table
GROUP Standard.Base.Calculations
ICON sigma
Aggregates the rows in a table using any `Group_By` entries in columns.
The columns argument specifies which additional aggregations to perform and to return.
Aggregates the rows in a table using `group_by` columns.
The columns argument specifies which additional aggregations to perform
and to return.
Arguments:
- group_by: Vector of column identifiers to group by. These will be
included at the start of the resulting table. If no columns are
specified a single row will be returned with the aggregate columns.
- columns: Vector of `Aggregate_Column` specifying the aggregated table.
Expressions can be used within the aggregate column to perform more
complicated calculations.
@ -1745,7 +1750,7 @@ type Table
- If a column index is out of range, a `Missing_Input_Columns` is
reported according to the `on_problems` setting, unless
`error_on_missing_columns` is set to `True`, in which case it is
raised as an error. Problems resolving `Group_By` columns are
raised as an error. Problems resolving `group_by` columns are
reported as dataflow errors regardless of these settings, as a
missing grouping will completely change semantics of the query.
- If a column selector is given as a `Text` and it does not match any
@ -1753,7 +1758,7 @@ type Table
`Invalid_Aggregate_Column` error is raised according to the
`on_problems` settings (unless `error_on_missing_columns` is set to
`True` in which case it will always be an error). Problems resolving
`Group_By` columns are reported as dataflow errors regardless of
`group_by` columns are reported as dataflow errors regardless of
these settings, as a missing grouping will completely change
semantics of the query.
- If an aggregation fails, an `Invalid_Aggregation` dataflow error is
@ -1771,63 +1776,73 @@ type Table
- If there are more than 10 issues with a single column,
an `Additional_Warnings`.
> Example
Count all the rows
table.aggregate columns=[Aggregate_Column.Count]
> Example
Group by the Key column, count the rows
table.aggregate [Aggregate_Column.Group_By "Key", Aggregate_Column.Count]
table.aggregate ["Key"] [Aggregate_Column.Count]
@group_by Widget_Helpers.make_column_name_vector_selector
@columns Widget_Helpers.make_aggregate_column_vector_selector
aggregate : Vector Aggregate_Column -> Boolean -> Problem_Behavior -> Table ! No_Output_Columns | Invalid_Aggregate_Column | Invalid_Column_Names | Duplicate_Output_Column_Names | Floating_Point_Equality | Invalid_Aggregation | Unquoted_Delimiter | Additional_Warnings
aggregate self columns (error_on_missing_columns=False) (on_problems=Report_Warning) =
validated = Aggregate_Column_Helper.prepare_aggregate_columns self.column_naming_helper columns self error_on_missing_columns=error_on_missing_columns
key_columns = validated.key_columns
key_problems = key_columns.flat_map internal_column->
column = self.make_column internal_column
case column.value_type.is_floating_point of
True -> [Floating_Point_Equality.Error column.name]
False -> []
on_problems.attach_problems_before validated.problems+key_problems <|
resolved_aggregates = validated.valid_columns
key_expressions = key_columns.map .expression
new_ctx = self.context.set_groups key_expressions
problem_builder = Problem_Builder.new
## TODO [RW] here we will perform as many fetches as there are
aggregate columns, but technically we could perform just one
fetch fetching all column types - TODO we should do that. We can
do it here by creating a builder that will gather all requests
from the executed callbacks and create Lazy references that all
point to a single query.
See #6118.
infer_from_database_callback expression =
SQL_Type_Reference.new self.connection self.context expression
dialect = self.connection.dialect
type_mapping = dialect.get_type_mapping
infer_return_type op_kind columns expression =
type_mapping.infer_return_type infer_from_database_callback op_kind columns expression
results = resolved_aggregates.map p->
agg = p.second
new_name = p.first
result = Aggregate_Helper.make_aggregate_column self agg new_name dialect infer_return_type problem_builder
## If the `result` did contain an error, we catch it to be
able to store it in a vector and then we will partition the
created columns and failures.
result.catch Any error->
DB_Wrapped_Error.Value error
aggregate : Vector (Integer | Text | Regex | Aggregate_Column) | Text | Integer | Regex -> Vector Aggregate_Column -> Boolean -> Problem_Behavior -> Table ! No_Output_Columns | Invalid_Aggregate_Column | Invalid_Column_Names | Duplicate_Output_Column_Names | Floating_Point_Equality | Invalid_Aggregation | Unquoted_Delimiter | Additional_Warnings
aggregate self group_by=[] columns=[] (error_on_missing_columns=False) (on_problems=Report_Warning) =
normalized_group_by = Vector.unify_vector_or_element group_by
if normalized_group_by.is_empty && columns.is_empty then Error.throw (No_Output_Columns.Error "At least one column must be specified.") else
validated = Aggregate_Column_Helper.prepare_aggregate_columns self.column_naming_helper normalized_group_by columns self error_on_missing_columns=error_on_missing_columns
partitioned = results.partition (_.is_a DB_Wrapped_Error)
key_columns = validated.key_columns
key_problems = key_columns.flat_map internal_column->
column = self.make_column internal_column
case column.value_type.is_floating_point of
True -> [Floating_Point_Equality.Error column.name]
False -> []
on_problems.attach_problems_before validated.problems+key_problems <|
resolved_aggregates = validated.valid_columns
key_expressions = key_columns.map .expression
new_ctx = self.context.set_groups key_expressions
problem_builder = Problem_Builder.new
## TODO [RW] here we will perform as many fetches as there are
aggregate columns, but technically we could perform just one
fetch fetching all column types - TODO we should do that. We can
do it here by creating a builder that will gather all requests
from the executed callbacks and create Lazy references that all
point to a single query.
See #6118.
infer_from_database_callback expression =
SQL_Type_Reference.new self.connection self.context expression
dialect = self.connection.dialect
type_mapping = dialect.get_type_mapping
infer_return_type op_kind columns expression =
type_mapping.infer_return_type infer_from_database_callback op_kind columns expression
results = resolved_aggregates.map p->
agg = p.second
new_name = p.first
result = Aggregate_Helper.make_aggregate_column self agg new_name dialect infer_return_type problem_builder
## If the `result` did contain an error, we catch it to be
able to store it in a vector and then we will partition the
created columns and failures.
result.catch Any error->(DB_Wrapped_Error.Value error)
## When working on join we may encounter further issues with having
aggregate columns exposed directly, it may be useful to re-use
the `lift_aggregate` method to push the aggregates into a
subquery.
new_columns = partitioned.second
problem_builder.attach_problems_before on_problems <|
problems = partitioned.first.map .value
on_problems.attach_problems_before problems <|
handle_no_output_columns =
first_problem = if problems.is_empty then Nothing else problems.first
Error.throw (No_Output_Columns.Error first_problem)
if new_columns.is_empty then handle_no_output_columns else
self.updated_context_and_columns new_ctx new_columns subquery=True
partitioned = results.partition (_.is_a DB_Wrapped_Error)
## When working on join we may encounter further issues with having
aggregate columns exposed directly, it may be useful to re-use
the `lift_aggregate` method to push the aggregates into a
subquery.
new_columns = partitioned.second
problem_builder.attach_problems_before on_problems <|
problems = partitioned.first.map .value
on_problems.attach_problems_before problems <|
handle_no_output_columns =
first_problem = if problems.is_empty then Nothing else problems.first
Error.throw (No_Output_Columns.Error first_problem)
if new_columns.is_empty then handle_no_output_columns else
result = self.updated_context_and_columns new_ctx new_columns subquery=True
if validated.old_style.not then result else
Warning.attach (Deprecated.Warning "Standard.Database.Data.Aggregate_Column.Aggregate_Column" "Group_By" "Deprecated: `Group_By` constructor has been deprecated, use the `group_by` argument instead.") result
## GROUP Standard.Base.Calculations
ICON dataframe_map_row
@ -1939,7 +1954,7 @@ type Table
A | Example | France
@group_by Widget_Helpers.make_column_name_vector_selector
@name_column Widget_Helpers.make_column_name_selector
@values (Widget_Helpers.make_aggregate_column_selector include_group_by=False)
@values Widget_Helpers.make_aggregate_column_selector
cross_tab : Vector (Integer | Text | Regex | Aggregate_Column) | Text | Integer | Regex -> (Text | Integer) -> Aggregate_Column | Vector Aggregate_Column -> Problem_Behavior -> Table ! Missing_Input_Columns | Invalid_Aggregate_Column | Floating_Point_Equality | Invalid_Aggregation | Unquoted_Delimiter | Additional_Warnings | Invalid_Column_Names
cross_tab self group_by name_column values=Aggregate_Column.Count (on_problems=Report_Warning) =
## Avoid unused arguments warning. We cannot rename arguments to `_`,
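On the Database backend the legacy call shape keeps working but now carries a `Deprecated` warning; a sketch (column names hypothetical):

    t = table.aggregate [Aggregate_Column.Group_By "Key", Aggregate_Column.Count]
    # t holds the same result as table.aggregate ["Key"] [Aggregate_Column.Count],
    # with a Deprecated warning attached naming the Group_By constructor.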

View File

@ -64,7 +64,7 @@ type Context
- groups: a list of grouping expressions, for each entry a GROUP BY is
added, the resulting query can then directly include only the
grouped-by columns or aggregate expressions.
- limit: an optional maximum number of elements that the equery should
- limit: an optional maximum number of elements that the query should
return.
Value (from_spec : From_Spec) (where_filters : Vector SQL_Expression) (orders : Vector Order_Descriptor) (groups : Vector SQL_Expression) (limit : Nothing | Integer) (distinct_on : Nothing | Vector SQL_Expression)

View File

@ -228,8 +228,8 @@ type Non_Unique_Key_Recipe
Creates a `Non_Unique_Key` error containing information about an
example group violating the uniqueness constraint.
raise_duplicated_primary_key_error source_table primary_key original_panic =
agg = source_table.aggregate [Aggregate_Column.Count]+(primary_key.map Aggregate_Column.Group_By)
filtered = agg.filter column=0 (Filter_Condition.Greater than=1)
agg = source_table.aggregate primary_key [Aggregate_Column.Count]
filtered = agg.filter column=-1 (Filter_Condition.Greater than=1)
materialized = filtered.read max_rows=1 warn_if_more_rows=False
case materialized.row_count == 0 of
## If we couldn't find a duplicated key, we give up the translation and
@ -239,8 +239,8 @@ raise_duplicated_primary_key_error source_table primary_key original_panic =
True -> Panic.throw original_panic
False ->
row = materialized.first_row.to_vector
example_count = row.first
example_entry = row.drop 1
example_count = row.last
example_entry = row.drop (Last 1)
Error.throw (Non_Unique_Key.Error primary_key example_entry example_count)
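The index flip above (`column=0` to `column=-1`, `row.first`/`row.drop 1` to `row.last`/`row.drop (Last 1)`) follows from the new argument order: the `group_by` keys are placed at the start of the resulting table, so the `Count` column moves from first to last. A sketch of the resulting layout (names hypothetical):

    agg = source_table.aggregate ["id"] [Aggregate_Column.Count]
    # Resulting columns: ["id", "Count"] - keys first, aggregates last,
    # so the count is now addressed as column -1 and row.last.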
## PRIVATE
@ -619,15 +619,15 @@ check_duplicate_key_matches_for_delete target_table tmp_table key_columns allow_
Checks if any rows identified by `key_columns` have more than one match between two tables.
check_multiple_rows_match left_table right_table key_columns ~continuation =
joined = left_table.join right_table on=key_columns join_kind=Join_Kind.Inner
counted = joined.aggregate [Aggregate_Column.Count]+(key_columns.map (Aggregate_Column.Group_By _))
duplicates = counted.filter 0 (Filter_Condition.Greater than=1)
counted = joined.aggregate key_columns [Aggregate_Column.Count]
duplicates = counted.filter -1 (Filter_Condition.Greater than=1)
example = duplicates.read max_rows=1 warn_if_more_rows=False
case example.row_count == 0 of
True -> continuation
False ->
row = example.first_row . to_vector
offending_key = row.drop 1
count = row.first
offending_key = row.drop (Last 1)
count = row.last
Error.throw (Multiple_Target_Rows_Matched_For_Update.Error offending_key count)
## PRIVATE

View File

@ -4,11 +4,11 @@ import project.Data.Sort_Column.Sort_Column
## Defines an Aggregate Column
type Aggregate_Column
## Specifies a column to group the rows by.
## PRIVATE
Specifies a column to group the rows by. Deprecated but used internally.
Arguments:
- column: the column (specified by name, expression or index) to group
by.
- column: the column (either name, expression or index) to group by.
- new_name: name of new column.
Group_By (column:Text|Integer|Any) (new_name:Text="") # Any needed because of 6866
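Although now `PRIVATE`, the constructor keeps its shape and is still built internally when resolving the new `group_by` argument, as in this resolution step added later in the commit:

    keys = group_by.map (Aggregate_Column.Group_By _ "")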

View File

@ -10,6 +10,7 @@ import Standard.Base.Errors.Common.Index_Out_Of_Bounds
import Standard.Base.Errors.Common.No_Such_Method
import Standard.Base.Errors.Common.Out_Of_Memory
import Standard.Base.Errors.Common.Type_Error
import Standard.Base.Errors.Deprecated.Deprecated
import Standard.Base.Errors.File_Error.File_Error
import Standard.Base.Errors.Illegal_Argument.Illegal_Argument
import Standard.Base.Errors.Unimplemented.Unimplemented
@ -658,10 +659,14 @@ type Table
GROUP Standard.Base.Calculations
ICON sigma
Aggregates the rows in a table using any `Group_By` entries in columns.
The columns argument specifies which additional aggregations to perform and to return.
Aggregates the rows in a table using `group_by` columns.
The columns argument specifies which additional aggregations to perform
and to return.
Arguments:
- group_by: Vector of column identifiers to group by. These will be
included at the start of the resulting table. If no columns are
specified a single row will be returned with the aggregate columns.
- columns: Vector of `Aggregate_Column` specifying the aggregated table.
Expressions can be used within the aggregate column to perform more
complicated calculations.
@ -681,7 +686,7 @@ type Table
- If a column index is out of range, a `Missing_Input_Columns` is
reported according to the `on_problems` setting, unless
`error_on_missing_columns` is set to `True`, in which case it is
raised as an error. Problems resolving `Group_By` columns are
raised as an error. Problems resolving `group_by` columns are
reported as dataflow errors regardless of these settings, as a
missing grouping will completely change semantics of the query.
- If a column selector is given as a `Text` and it does not match any
@ -689,7 +694,7 @@ type Table
`Invalid_Aggregate_Column` problem is raised according to the
`on_problems` settings (unless `error_on_missing_columns` is set to
`True` in which case it will always be an error). Problems resolving
`Group_By` columns are reported as dataflow errors regardless of
`group_by` columns are reported as dataflow errors regardless of
these settings, as a missing grouping will completely change
semantics of the query.
- If an aggregation fails, an `Invalid_Aggregation` dataflow error is
@ -707,22 +712,31 @@ type Table
- If there are more than 10 issues with a single column,
an `Additional_Warnings`.
> Example
Count all the rows
table.aggregate columns=[Aggregate_Column.Count]
> Example
Group by the Key column, count the rows
table.aggregate [Aggregate_Column.Group_By "Key", Aggregate_Column.Count]
table.aggregate ["Key"] [Aggregate_Column.Count]
@group_by Widget_Helpers.make_column_name_vector_selector
@columns Widget_Helpers.make_aggregate_column_vector_selector
aggregate : Vector Aggregate_Column -> Boolean -> Problem_Behavior -> Table ! No_Output_Columns | Invalid_Aggregate_Column | Invalid_Column_Names | Duplicate_Output_Column_Names | Floating_Point_Equality | Invalid_Aggregation | Unquoted_Delimiter | Additional_Warnings
aggregate self columns (error_on_missing_columns=False) (on_problems=Report_Warning) =
validated = Aggregate_Column_Helper.prepare_aggregate_columns self.column_naming_helper columns self error_on_missing_columns=error_on_missing_columns
aggregate : Vector (Integer | Text | Regex | Aggregate_Column) | Text | Integer | Regex -> Vector Aggregate_Column -> Boolean -> Problem_Behavior -> Table ! No_Output_Columns | Invalid_Aggregate_Column | Invalid_Column_Names | Duplicate_Output_Column_Names | Floating_Point_Equality | Invalid_Aggregation | Unquoted_Delimiter | Additional_Warnings
aggregate self group_by=[] columns=[] (error_on_missing_columns=False) (on_problems=Report_Warning) =
normalized_group_by = Vector.unify_vector_or_element group_by
if normalized_group_by.is_empty && columns.is_empty then Error.throw (No_Output_Columns.Error "At least one column must be specified.") else
validated = Aggregate_Column_Helper.prepare_aggregate_columns self.column_naming_helper normalized_group_by columns self error_on_missing_columns=error_on_missing_columns
on_problems.attach_problems_before validated.problems <| Illegal_Argument.handle_java_exception <|
java_key_columns = validated.key_columns.map .java_column
Java_Problems.with_problem_aggregator on_problems java_problem_aggregator->
index = self.java_table.indexFromColumns java_key_columns java_problem_aggregator
new_columns = validated.valid_columns.map c->(Aggregate_Column_Helper.java_aggregator c.first c.second)
java_table = index.makeTable new_columns
Table.Value java_table
on_problems.attach_problems_before validated.problems <| Illegal_Argument.handle_java_exception <|
java_key_columns = validated.key_columns.map .java_column
Java_Problems.with_problem_aggregator on_problems java_problem_aggregator->
index = self.java_table.indexFromColumns java_key_columns java_problem_aggregator
new_columns = validated.valid_columns.map c->(Aggregate_Column_Helper.java_aggregator c.first c.second)
java_table = index.makeTable new_columns
if validated.old_style.not then Table.Value java_table else
Warning.attach (Deprecated.Warning "Standard.Database.Data.Aggregate_Column.Aggregate_Column" "Group_By" "Deprecated: `Group_By` constructor has been deprecated, use the `group_by` argument instead.") (Table.Value java_table)
## ALIAS sort
GROUP Standard.Base.Selections
@ -2448,7 +2462,7 @@ type Table
A | Example | France
@group_by Widget_Helpers.make_column_name_vector_selector
@name_column Widget_Helpers.make_column_name_selector
@values (Widget_Helpers.make_aggregate_column_selector include_group_by=False)
@values Widget_Helpers.make_aggregate_column_selector
cross_tab : Vector (Integer | Text | Regex | Aggregate_Column) | Text | Integer | Regex -> (Text | Integer) -> Aggregate_Column | Vector Aggregate_Column -> Problem_Behavior -> Table ! Missing_Input_Columns | Invalid_Aggregate_Column | Floating_Point_Equality | Invalid_Aggregation | Unquoted_Delimiter | Additional_Warnings | Invalid_Column_Names
cross_tab self group_by name_column values=Aggregate_Column.Count (on_problems=Report_Warning) = Out_Of_Memory.handle_java_exception "cross_tab" <|
columns_helper = self.columns_helper
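Since both `group_by` and `columns` now default to empty vectors, a call specifying neither fails up front instead of yielding an empty table; a sketch of the guard shown above:

    table.aggregate
    # => No_Output_Columns error: "At least one column must be specified."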

View File

@ -1,5 +1,6 @@
from Standard.Base import all hiding First, Last
import Standard.Base.Data.Vector.No_Wrap
import Standard.Base.Errors.Illegal_Argument.Illegal_Argument
from Standard.Base.Runtime import assert
import project.Data.Aggregate_Column.Aggregate_Column
@ -31,69 +32,76 @@ polyglot java import org.enso.table.aggregations.StandardDeviation as StandardDe
polyglot java import org.enso.table.aggregations.Sum as SumAggregator
## Result type for aggregate_columns validation
- key_columns: Vector of Columns from the table to group by
- valid_columns: Table structure to build as pairs of unique column name and Aggregate_Column
- problems: Set of any problems when validating the input
- key_columns: Vector of Columns from the table to group by.
- valid_columns: Table structure to build as pairs of unique column name and Aggregate_Column.
- problems: Set of any problems when validating the input.
- old_style: Boolean indicating if the input was in the old style.
type Validated_Aggregate_Columns
## PRIVATE
Value (key_columns:(Vector Column)) (valid_columns:(Vector (Pair Text Aggregate_Column))) (problems:(Vector Any))
Value (key_columns:(Vector Column)) (valid_columns:(Vector (Pair Text Aggregate_Column))) (problems:(Vector Any)) (old_style:Boolean)
## PRIVATE
Prepares an aggregation input for further processing:
- resolves the column descriptors, reporting any issues,
- ensures that the output names are unique,
- finds the key columns.
prepare_aggregate_columns : Column_Naming_Helper -> Vector Aggregate_Column -> Table -> Boolean -> Validated_Aggregate_Columns
prepare_aggregate_columns naming_helper aggregates table error_on_missing_columns =
prepare_aggregate_columns : Column_Naming_Helper -> Vector (Integer | Text | Regex | Aggregate_Column) | Text | Integer | Regex -> Vector Aggregate_Column -> Table -> Boolean -> Validated_Aggregate_Columns
prepare_aggregate_columns naming_helper group_by aggregates table error_on_missing_columns =
is_a_key c = case c of
Aggregate_Column.Group_By _ _ -> True
_ -> False
keys = aggregates.filter is_a_key
# Key resolution always errors on missing, regardless of any settings.
keys_problem_builder = Problem_Builder.new error_on_missing_columns=True
resolved_keys = keys.map (resolve_aggregate table keys_problem_builder)
## Resolve old style aggregate into new style
old_style = aggregates.is_empty && group_by.any (g-> g.is_a Aggregate_Column)
if old_style.not && group_by.any is_a_key then Error.throw (Invalid_Aggregation.Error "`columns` should not contain a `Group_By`.") else
keys = if old_style then group_by.filter is_a_key else group_by.map (Aggregate_Column.Group_By _ "")
## Since `keys_problem_builder` has `error_on_missing_columns` set to `True`,
any missing columns will be reported as errors. Therefore, we can assume
that all the columns were present.
keys_problem_builder.attach_problems_before Problem_Behavior.Report_Error <|
assert (resolved_keys.contains Nothing . not)
problem_builder = Problem_Builder.new error_on_missing_columns=error_on_missing_columns
valid_resolved_aggregate_columns = aggregates.map on_problems=No_Wrap (resolve_aggregate table problem_builder) . filter x-> x.is_nothing.not
# Key resolution always errors on missing, regardless of any settings.
keys_problem_builder = Problem_Builder.new error_on_missing_columns=True
resolved_keys = keys.map (resolve_aggregate table keys_problem_builder)
# Grouping Key
key_columns = resolved_keys.map .column
unique_key_columns = key_columns.distinct (on = .name)
## Since `keys_problem_builder` has `error_on_missing_columns` set to `True`,
any missing columns will be reported as errors. Therefore, we can assume
that all the columns were present.
keys_problem_builder.attach_problems_before Problem_Behavior.Report_Error <|
assert (resolved_keys.contains Nothing . not)
problem_builder = Problem_Builder.new error_on_missing_columns=error_on_missing_columns
columns = if old_style then group_by else keys+aggregates
valid_resolved_aggregate_columns = columns.map on_problems=No_Wrap (resolve_aggregate table problem_builder) . filter x-> x.is_nothing.not
# Resolve Names
unique = naming_helper.create_unique_name_strategy
## First pass ensures the custom names specified by the user are unique.
The second pass resolves the default names, ensuring that they do not
clash with the user-specified names (ensuring that user-specified names
take precedence).
pass_1 = valid_resolved_aggregate_columns.map on_problems=No_Wrap c-> if c.new_name == "" then "" else
# Verify if the user-provided name is valid and if not, throw an error.
naming_helper.ensure_name_is_valid c.new_name <|
unique.make_unique c.new_name
renamed_columns = pass_1.map_with_index i->name->
agg = valid_resolved_aggregate_columns.at i
new_name = if name != "" then name else unique.make_unique (default_aggregate_column_name agg)
Pair.new new_name agg
# Grouping Key
key_columns = resolved_keys.map .column
unique_key_columns = key_columns.distinct (on = .name)
# Build Problems Output
case renamed_columns.is_empty of
True ->
## First, we try to raise any warnings that may have caused the
lack of columns, promoted to errors.
problem_builder.attach_problems_before Problem_Behavior.Report_Error <|
## If none were found, we raise a generic error (this may
happen primarily when an empty list is provided to the
aggregate method).
Error.throw No_Output_Columns.Error
False ->
problem_builder.report_unique_name_strategy unique
Validated_Aggregate_Columns.Value unique_key_columns renamed_columns problem_builder.get_problemset_throwing_distinguished_errors
# Resolve Names
unique = naming_helper.create_unique_name_strategy
## First pass ensures the custom names specified by the user are unique.
The second pass resolves the default names, ensuring that they do not
clash with the user-specified names (ensuring that user-specified names
take precedence).
pass_1 = valid_resolved_aggregate_columns.map on_problems=No_Wrap c-> if c.new_name == "" then "" else
# Verify if the user-provided name is valid and if not, throw an error.
naming_helper.ensure_name_is_valid c.new_name <|
unique.make_unique c.new_name
renamed_columns = pass_1.map_with_index i->name->
agg = valid_resolved_aggregate_columns.at i
new_name = if name != "" then name else unique.make_unique (default_aggregate_column_name agg)
Pair.new new_name agg
# Build Problems Output
case renamed_columns.is_empty of
True ->
## First, we try to raise any warnings that may have caused the
lack of columns, promoted to errors.
problem_builder.attach_problems_before Problem_Behavior.Report_Error <|
## If none were found, we raise a generic error (this may
happen primarily when an empty list is provided to the
aggregate method).
Error.throw No_Output_Columns.Error
False ->
problem_builder.report_unique_name_strategy unique
Validated_Aggregate_Columns.Value unique_key_columns renamed_columns problem_builder.get_problemset_throwing_distinguished_errors old_style
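The `old_style` flag drives the deprecation path: a call counts as legacy only when `columns` is empty and `group_by` itself contains `Aggregate_Column` values, while mixing the two styles is rejected outright. A sketch of the accepted and rejected shapes (names hypothetical):

    table.aggregate ["Key"] [Aggregate_Column.Count]    # new style
    table.aggregate "Key" [Aggregate_Column.Count]      # single selector, unified into a vector
    # Old style: still accepted, with a Deprecated warning attached:
    table.aggregate [Aggregate_Column.Group_By "Key", Aggregate_Column.Count]
    # Mixed style: rejected with Invalid_Aggregation
    # ("`columns` should not contain a `Group_By`."):
    table.aggregate [Aggregate_Column.Group_By "Key"] [Aggregate_Column.Count]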
## PRIVATE
Defines the default name of an `Aggregate_Column`.

View File

@ -230,7 +230,7 @@ type Table_Column_Helper
Blank_Selector.All_Cells -> Aggregate_Column.Minimum _
aggregates = blanks.map blanks_col-> col_aggregate blanks_col.name
aggregate_result = just_indicators.aggregate aggregates on_problems=Problem_Behavior.Report_Error
aggregate_result = just_indicators.aggregate columns=aggregates on_problems=Problem_Behavior.Report_Error
materialized_result = self.materialize <| aggregate_result.catch Any error->
msg = "Unexpected dataflow error has been thrown in an `select_blank_columns_helper`. This is a bug in the Table library. The unexpected error was: "+error.to_display_text
Panic.throw (Illegal_State.Error message=msg cause=error)

View File

@ -16,13 +16,12 @@ from project.Extensions.Table_Conversions import all
## PRIVATE
Make an aggregate column selector.
make_aggregate_column_selector : Table -> Display -> Boolean -> Widget
make_aggregate_column_selector table display=Display.Always include_group_by=True =
make_aggregate_column_selector : Table -> Display -> Widget
make_aggregate_column_selector table display=Display.Always =
col_names_selector = make_column_name_selector table display=Display.Always
column_widget = ["column", col_names_selector]
fqn = Meta.get_qualified_type_name Aggregate_Column
group_by = if include_group_by then [Option "Group By" fqn+".Group_By" [column_widget]] else []
count = Option "Count" fqn+".Count"
## Currently can't support nested vector editors so using single picker
@ -56,7 +55,7 @@ make_aggregate_column_selector table display=Display.Always include_group_by=Tru
maximum = Option "Maximum" fqn+".Maximum" [column_widget]
minimum = Option "Minimum" fqn+".Minimum" [column_widget]
Single_Choice display=display values=(group_by+[count, count_distinct, first, last, count_not_nothing, count_nothing, count_not_empty, count_empty, concatenate, shortest, longest, sum, average, median, percentile, mode, standard_deviation, maximum, minimum])
Single_Choice display=display values=[count, count_distinct, first, last, count_not_nothing, count_nothing, count_not_empty, count_empty, concatenate, shortest, longest, sum, average, median, percentile, mode, standard_deviation, maximum, minimum]
## PRIVATE
Make an Aggregate_Column list editor
@ -64,7 +63,7 @@ make_aggregate_column_vector_selector : Table -> Display -> Widget
make_aggregate_column_vector_selector table display=Display.Always =
item_editor = make_aggregate_column_selector table display=Display.Always
# TODO this is a workaround for a dropdown issue
Vector_Editor item_editor=item_editor item_default="(Aggregate_Column.Group_By)" display=display
Vector_Editor item_editor=item_editor item_default="(Aggregate_Column.Count)" display=display
## PRIVATE
Make a column name selector.

View File

@ -10,13 +10,13 @@ Table.build_ai_prompt self =
aggs = ["Count","Average","Sum","Median","First","Last","Maximum","Minimum"]
joins = ["Inner","Left_Outer","Right_Outer","Full","Left_Exclusive","Right_Exclusive"]
examples = """
Table["id","category","Unit Price","Stock"];goal=get product count by category==>>`aggregate [Aggregate_Column.Group_By "category", Aggregate_Column.Count Nothing]`
Table["id","category","Unit Price","Stock"];goal=get product count by category==>>`aggregate ["category"] [Aggregate_Column.Count]`
Table["ID","Unit Price","Stock"];goal=order by how many items are available==>>`order_by ["Stock"]`
Table["Name","Enrolled Year"];goal=select people who enrolled between 2015 and 2018==>>`filter_by_expression "[Enrolled Year] >= 2015 && [Enrolled Year] <= 2018`
Table["Number of items","client name","city","unit price"];goal=compute the total value of each order==>>`set "[Number of items] * [unit price]" "total value"`
Table["Number of items","client name","CITY","unit price","total value"];goal=compute the average order value by city==>>`aggregate [Aggregate_Column.Group_By "CITY", Aggregate_Column.Average "total value"]`
Table["Number of items","client name","CITY","unit price","total value"];goal=compute the average order value by city==>>`aggregate ["CITY"] [Aggregate_Column.Average "total value"]`
Table["Area Code", "number"];goal=get full phone numbers==>>`set "'+1 (' + [Area Code] + ') ' + [number]" "full phone number"`
Table["Name","Grade","Subject"];goal=rank students by their average grade==>>`aggregate [Aggregate_Column.Group_By "Name", Aggregate_Column.Average "Grade" "Average Grade"] . order_by [Sort_Column.Name "Average Grade" Sort_Direction.Descending]`
Table["Name","Grade","Subject"];goal=rank students by their average grade==>>`aggregate ["Name"] [Aggregate_Column.Average "Grade" "Average Grade"] . order_by [Sort_Column.Name "Average Grade" Sort_Direction.Descending]`
Table["Country","Prime minister name","2018","2019","2020","2021"];goal=pivot yearly GDP values to rows==>>`transpose ["Country", "Prime minister name"] "Year" "GDP"`
Table["Size","Weight","Width","stuff","thing"];goal=only select size and thing of each record==>>`select_columns ["Size", "thing"]`
Table["ID","Name","Count"];goal=join it with var_17==>>`join var_17 Join_Kind.Inner`

View File

@ -48,6 +48,6 @@ collect_benches = Bench.build builder->
group_builder.specify "Sort_Table" <|
data.table.order_by "X"
group_builder.specify "Group_And_Sort" <|
data.table.aggregate [Aggregate_Column.Group_By "X", Aggregate_Column.Last "Y" order_by="Y"]
data.table.aggregate ["X"] [Aggregate_Column.Last "Y" order_by="Y"]
main = collect_benches . run_main

View File

@ -1,7 +1,6 @@
from Standard.Base import all hiding First, Last
from Standard.Table import Table
from Standard.Table.Data.Aggregate_Column.Aggregate_Column import all
from Standard.Table import Table, Aggregate_Column
from Standard.Test import Bench, Faker
@ -34,67 +33,67 @@ collect_benches = Bench.build builder->
builder.group "Table_Aggregate" options group_builder->
group_builder.specify "Count_table" <|
data.table.aggregate [Count]
data.table.aggregate [Aggregate_Column.Count]
group_builder.specify "Max_table" <|
data.table.aggregate [Maximum "ValueWithNothing"]
data.table.aggregate [Aggregate_Column.Maximum "ValueWithNothing"]
group_builder.specify "Sum_table" <|
data.table.aggregate [Sum "ValueWithNothing"]
data.table.aggregate [Aggregate_Column.Sum "ValueWithNothing"]
group_builder.specify "Count_Distinct_table" <|
data.table.aggregate [Count_Distinct "Index"]
data.table.aggregate [Aggregate_Column.Count_Distinct "Index"]
group_builder.specify "StDev_table" <|
data.table.aggregate [Standard_Deviation "Value"]
data.table.aggregate [Aggregate_Column.Standard_Deviation "Value"]
group_builder.specify "Median_table" <|
data.table.aggregate [Median "Value"]
data.table.aggregate [Aggregate_Column.Median "Value"]
group_builder.specify "Mode_table" <|
data.table.aggregate [Mode "Index"]
data.table.aggregate [Aggregate_Column.Mode "Index"]
group_builder.specify "Count_grouped" <|
data.table.aggregate [Group_By "Index", Count]
data.table.aggregate ["Index"] [Aggregate_Column.Count]
group_builder.specify "Max_grouped" <|
data.table.aggregate [Group_By "Index", Maximum "ValueWithNothing"]
data.table.aggregate ["Index"] [Aggregate_Column.Maximum "ValueWithNothing"]
group_builder.specify "Sum_grouped" <|
data.table.aggregate [Group_By "Index", Sum "ValueWithNothing"]
data.table.aggregate ["Index"] [Aggregate_Column.Sum "ValueWithNothing"]
group_builder.specify "Count_Distinct_grouped" <|
data.table.aggregate [Group_By "Index", Count_Distinct "Code"]
data.table.aggregate ["Index"] [Aggregate_Column.Count_Distinct "Code"]
group_builder.specify "StDev_grouped" <|
data.table.aggregate [Group_By "Index", Standard_Deviation "Value"]
data.table.aggregate ["Index"] [Aggregate_Column.Standard_Deviation "Value"]
group_builder.specify "Median_grouped" <|
data.table.aggregate [Group_By "Index", Median "Value"]
data.table.aggregate ["Index"] [Aggregate_Column.Median "Value"]
group_builder.specify "Mode_grouped" <|
data.table.aggregate [Group_By "Index", Mode "Index"]
data.table.aggregate ["Index"] [Aggregate_Column.Mode "Index"]
group_builder.specify "Count_2_level_groups" <|
data.table.aggregate [Group_By "Index", Group_By "Flag", Count]
data.table.aggregate ["Index", "Flag"] [Aggregate_Column.Count]
group_builder.specify "Max_2_level_groups" <|
data.table.aggregate [Group_By "Index", Group_By "Flag", Maximum "ValueWithNothing"]
data.table.aggregate ["Index", "Flag"] [Aggregate_Column.Maximum "ValueWithNothing"]
group_builder.specify "Sum_2_level_groups" <|
data.table.aggregate [Group_By "Index", Group_By "Flag", Sum "ValueWithNothing"]
data.table.aggregate ["Index", "Flag"] [Aggregate_Column.Sum "ValueWithNothing"]
group_builder.specify "Count_Distinct_2_level_groups" <|
data.table.aggregate [Group_By "Index", Group_By "Flag", Count_Distinct "Code"]
data.table.aggregate ["Index", "Flag"] [Aggregate_Column.Count_Distinct "Code"]
group_builder.specify "StDev_2_level_groups" <|
data.table.aggregate [Group_By "Index", Group_By "Flag", Standard_Deviation "Value"]
data.table.aggregate ["Index", "Flag"] [Aggregate_Column.Standard_Deviation "Value"]
group_builder.specify "Median_2_level_groups" <|
data.table.aggregate [Group_By "Index", Group_By "Flag", Median "Value"]
data.table.aggregate ["Index", "Flag"] [Aggregate_Column.Median "Value"]
group_builder.specify "Mode_2_level_groups" <|
data.table.aggregate [Group_By "Index", Group_By "Flag", Mode "Index"]
data.table.aggregate ["Index", "Flag"] [Aggregate_Column.Mode "Index"]
main = collect_benches . run_main

View File

@ -1,7 +1,6 @@
from Standard.Base import all
from Standard.Table import Table
from Standard.Table.Data.Aggregate_Column.Aggregate_Column import Count, Sum
from Standard.Table import Table, Aggregate_Column
from Standard.Test import Bench, Faker
@ -42,7 +41,7 @@ collect_benches = Bench.build builder->
data = Data.create num_rows
builder.group ("CrossTab_" + num_rows.to_text) options group_builder->
specify group_by name_column values=[Count] =
specify group_by name_column values=[Aggregate_Column.Count] =
name = (group_by.join '_') + "_" + name_column + "_" + (values.map .to_text . join "_")
group_builder.specify (normalize_name name) <|
data.table.cross_tab group_by name_column values
@ -53,7 +52,7 @@ collect_benches = Bench.build builder->
specify ["type"] "size"
specify ["store"] "size"
specify ["size"] "store"
specify ["size"] "store" values=[Count, Sum "price"]
specify ["size"] "store" values=[Aggregate_Column.Count, Aggregate_Column.Sum "price"]
normalize_name : Text -> Text

View File

@ -14,7 +14,7 @@ type Boxed_Total_Aggregate
Instance text_column
current_aggregate_implementation self =
self.text_column.to_table.aggregate [Aggregate_Column.Longest 0] . at 0 . at 0
self.text_column.to_table.aggregate [] [Aggregate_Column.Longest 0] . at 0 . at 0
java_loop self =
SimpleStorageAggregateHelpers.longestText self.text_column.java_column.getStorage
@ -46,7 +46,7 @@ type Primitive_Total_Aggregate
Instance int_column
current_aggregate_implementation self =
self.int_column.to_table.aggregate [Aggregate_Column.Sum 0] . at 0 . at 0
self.int_column.to_table.aggregate [] [Aggregate_Column.Sum 0] . at 0 . at 0
java_loop self =
long_storage = self.int_column.java_column.getStorage

View File

@ -4,8 +4,7 @@ import Standard.Base.Errors.Illegal_Argument.Illegal_Argument
import Standard.Database.Extensions.Upload_Database_Table
import Standard.Database.Extensions.Upload_In_Memory_Table
from Standard.Table import Sort_Column
from Standard.Table.Data.Aggregate_Column.Aggregate_Column import Group_By, Sum
from Standard.Table import Sort_Column, Aggregate_Column
from Standard.Table.Errors import Missing_Input_Columns, Duplicate_Output_Column_Names, Floating_Point_Equality
from Standard.Test import all
@ -118,7 +117,7 @@ add_specs suite_builder setup =
group_builder.specify "Should work correctly after aggregation" <|
t0 = table_builder [["X", ["a", "b", "a", "c"]], ["Y", [1, 2, 4, 8]]]
t1 = t0.aggregate [Group_By "X", Sum "Y"]
t1 = t0.aggregate ["X"] [Aggregate_Column.Sum "Y"]
t2 = t1.order_by "X" . add_row_number
t2.at "X" . to_vector . should_equal ['a', 'b', 'c']

View File

@ -89,7 +89,7 @@ add_specs suite_builder setup =
group_builder.specify "case insensitive name collisions - aggregate" <|
t1 = table_builder [["X", [2, 1, 3, 2]]]
t2 = t1.aggregate [Aggregate_Column.Maximum "X" "A", Aggregate_Column.Minimum "X" "a"]
t2 = t1.aggregate columns=[Aggregate_Column.Maximum "X" "A", Aggregate_Column.Minimum "X" "a"]
case is_case_sensitive of
True ->

View File

@ -1,7 +1,7 @@
from Standard.Base import all
import Standard.Base.Errors.Illegal_Argument.Illegal_Argument
from Standard.Table.Data.Aggregate_Column.Aggregate_Column import Average, Count, Group_By, Sum, Concatenate
from Standard.Table import Aggregate_Column
import Standard.Table.Data.Expression.Expression_Error
from Standard.Table.Errors import all
@ -53,7 +53,7 @@ add_specs suite_builder setup =
t1.at "z" . to_vector . should_equal [2]
group_builder.specify "should allow a different aggregate" <|
t1 = data.table.cross_tab [] "Key" values=[Sum "Value"]
t1 = data.table.cross_tab [] "Key" values=[Aggregate_Column.Sum "Value"]
t1.column_names . should_equal ["x", "y", "z"]
t1.row_count . should_equal 1
t1.at "x" . to_vector . should_equal [10]
@ -61,7 +61,7 @@ add_specs suite_builder setup =
t1.at "z" . to_vector . should_equal [17]
group_builder.specify "should allow a custom expression for the aggregate" <|
t1 = data.table.cross_tab [] "Key" values=[Sum "[Value]*[Value]"]
t1 = data.table.cross_tab [] "Key" values=[Aggregate_Column.Sum "[Value]*[Value]"]
t1.column_names . should_equal ["x", "y", "z"]
t1.row_count . should_equal 1
t1.at "x" . to_vector . should_equal [30]
@ -94,7 +94,7 @@ add_specs suite_builder setup =
t1.at "z" . to_vector . should_equal [1, 1]
group_builder.specify "should allow a grouping by Aggregate_Column" <|
t1 = data.table2.cross_tab [Group_By "Group"] "Key"
t1 = data.table2.cross_tab [Aggregate_Column.Group_By "Group"] "Key"
t1.column_names . should_equal ["Group", "x", "y", "z"]
t1.row_count . should_equal 2
t1.at "Group" . to_vector . should_equal ["A", "B"]
@ -102,11 +102,11 @@ add_specs suite_builder setup =
t1.at "y" . to_vector . should_equal [2, 1]
t1.at "z" . to_vector . should_equal [1, 1]
data.table2.cross_tab [Sum "Group"] "Key" . should_fail_with Illegal_Argument
data.table2.cross_tab [Aggregate_Column.Sum "Group"] "Key" . should_fail_with Illegal_Argument
group_builder.specify "should allow a grouping by Aggregate_Colum, with some empty bins" <|
table3 = table_builder [["Group", ["B","A","B","A","A"]], ["Key", ["x", "y", "y", "y", "z"]], ["Value", [4, 5, 6, 7, 8]]]
t1 = table3.cross_tab [Group_By "Group"] "Key"
t1 = table3.cross_tab [Aggregate_Column.Group_By "Group"] "Key"
t1.column_names . should_equal ["Group", "x", "y", "z"]
t1.row_count . should_equal 2
t1.at "Group" . to_vector . should_equal ["A", "B"]
@ -127,7 +127,7 @@ add_specs suite_builder setup =
t2.column_names . should_equal ["Group", "x", "y", "z"]
group_builder.specify "should allow multiple values aggregates" <|
t1 = data.table.cross_tab [] "Key" values=[Count, Sum "Value"]
t1 = data.table.cross_tab [] "Key" values=[Aggregate_Column.Count, Aggregate_Column.Sum "Value"]
t1.column_names . should_equal ["x Count", "x Sum", "y Count", "y Sum", "z Count", "z Sum"]
t1.row_count . should_equal 1
t1.at "x Count" . to_vector . should_equal [4]
@ -156,31 +156,31 @@ add_specs suite_builder setup =
err2.catch.criteria . should_equal [42]
group_builder.specify "should fail if aggregate values contain missing columns" <|
err1 = data.table.cross_tab [] "Key" values=[Count, Sum "Nonexistent Value", Sum "Value", Sum "OTHER"]
err1 = data.table.cross_tab [] "Key" values=[Aggregate_Column.Count, Aggregate_Column.Sum "Nonexistent Value", Aggregate_Column.Sum "Value", Aggregate_Column.Sum "OTHER"]
err1.should_fail_with Invalid_Aggregate_Column
err1.catch.name . should_equal "Nonexistent Value"
err2 = data.table.cross_tab [] "Key" values=[Count, Sum "Nonexistent Value", Sum "Value", Sum 42]
err2 = data.table.cross_tab [] "Key" values=[Aggregate_Column.Count, Aggregate_Column.Sum "Nonexistent Value", Aggregate_Column.Sum "Value", Aggregate_Column.Sum 42]
err2.should_fail_with Missing_Input_Columns
err2.catch.criteria . should_equal [42]
group_builder.specify "should fail if aggregate values contain invalid expressions" <|
err1 = data.table.cross_tab [] "Key" values=[Sum "[MISSING]*10"]
err1 = data.table.cross_tab [] "Key" values=[Aggregate_Column.Sum "[MISSING]*10"]
err1.should_fail_with Invalid_Aggregate_Column
err1.catch.name . should_equal "[MISSING]*10"
err1.catch.expression_error . should_equal (No_Such_Column.Error "MISSING")
err2 = data.table.cross_tab [] "Key" values=[Sum "[[["]
err2 = data.table.cross_tab [] "Key" values=[Aggregate_Column.Sum "[[["]
err2.should_fail_with Invalid_Aggregate_Column
err2.catch.name . should_equal "[[["
err2.catch.expression_error . should_be_a Expression_Error.Syntax_Error
group_builder.specify "should not allow Group_By for values" <|
err1 = data.table.cross_tab [] "Key" values=[Count, Group_By "Value"] on_problems=Problem_Behavior.Ignore
err1 = data.table.cross_tab [] "Key" values=[Aggregate_Column.Count, Aggregate_Column.Group_By "Value"] on_problems=Problem_Behavior.Ignore
err1.should_fail_with Illegal_Argument
group_builder.specify "should gracefully handle duplicate aggregate names" <|
action = data.table.cross_tab [] "Key" values=[Count new_name="Agg1", Sum "Value" new_name="Agg1"] on_problems=_
action = data.table.cross_tab [] "Key" values=[Aggregate_Column.Count new_name="Agg1", Aggregate_Column.Sum "Value" new_name="Agg1"] on_problems=_
tester table =
table.column_names . should_equal ["x Agg1", "x Agg1 1", "y Agg1", "y Agg1 1", "z Agg1", "z Agg1 1"]
problems = [Duplicate_Output_Column_Names.Error ["x Agg1", "y Agg1", "z Agg1"]]
@ -235,11 +235,11 @@ add_specs suite_builder setup =
t = table_builder [["Key", ["a", "a", "b", "b"]], ["ints", [1, 2, 3, 4]], ["texts", ["a", "b", "c", "d"]]]
[Problem_Behavior.Report_Error, Problem_Behavior.Report_Warning, Problem_Behavior.Ignore].each pb-> Test.with_clue "Problem_Behavior="+pb.to_text+" " <|
t1 = t.cross_tab [] "Key" values=[Average "texts"] on_problems=pb
t1 = t.cross_tab [] "Key" values=[Aggregate_Column.Average "texts"] on_problems=pb
t1.should_fail_with Invalid_Value_Type
t2 = t.cross_tab [] "Key" values=[Sum "texts"] on_problems=pb
t2 = t.cross_tab [] "Key" values=[Aggregate_Column.Sum "texts"] on_problems=pb
t2.should_fail_with Invalid_Value_Type
t3 = t.cross_tab [] "Key" values=[Concatenate "ints"] on_problems=pb
t3 = t.cross_tab [] "Key" values=[Aggregate_Column.Concatenate "ints"] on_problems=pb
t3.should_fail_with Invalid_Value_Type
group_builder.specify "should return predictable types" <|
@ -250,7 +250,7 @@ add_specs suite_builder setup =
t1.at "a" . value_type . is_integer . should_be_true
t1.at "b" . value_type . is_integer . should_be_true
t2 = table.cross_tab [] "Int" values=[Average "Float", Concatenate "Text"] . sort_columns
t2 = table.cross_tab [] "Int" values=[Aggregate_Column.Average "Float", Aggregate_Column.Concatenate "Text"] . sort_columns
t2.column_names . should_equal ["1 Average Float", "1 Concatenate Text", "2 Average Float", "2 Concatenate Text"]
t2.at "1 Average Float" . value_type . is_floating_point . should_be_true
t2.at "1 Concatenate Text" . value_type . is_text . should_be_true
@ -263,7 +263,7 @@ add_specs suite_builder setup =
r1.should_fail_with Invalid_Column_Names
r1.catch.to_display_text . should_contain "cannot contain the NUL character"
r2 = data.table2.cross_tab [] "Key" values=[Average "Value" new_name='x\0']
r2 = data.table2.cross_tab [] "Key" values=[Aggregate_Column.Average "Value" new_name='x\0']
r2.print
r2.should_fail_with Invalid_Column_Names
r2.catch.to_display_text . should_contain "cannot contain the NUL character"

View File

@ -2,7 +2,6 @@ from Standard.Base import all
# We hide the table constructor as instead we are supposed to use `table_builder` which is backend-agnostic.
from Standard.Table import all hiding Table
from Standard.Table.Data.Aggregate_Column.Aggregate_Column import Group_By, Count, Sum
from Standard.Test import all
@ -54,7 +53,7 @@ add_specs suite_builder setup =
t1 = table_builder [["Count", [1, 2, 3]], ["Class", ["X", "Y", "Z"]]]
t2 = table_builder [["Letter", ["A", "B", "A", "A", "C", "A", "C", "D", "D", "B", "B"]]]
t3 = t2.aggregate [Group_By "Letter", Count]
t3 = t2.aggregate ["Letter"] [Aggregate_Column.Count]
t4 = t3.join t1 on="Count" join_kind=Join_Kind.Left_Outer |> materialize |> _.order_by "Letter"
t4.columns.map .name . should_equal ["Letter", "Count", "Right Count", "Class"]
rows = t4.rows . map .to_vector
@ -66,7 +65,7 @@ add_specs suite_builder setup =
group_builder.specify "aggregates and distinct" <|
t2 = table_builder [["Letter", ["A", "B", "A", "A", "C", "C"]], ["Points", [2, 5, 2, 1, 10, 3]]]
t3 = t2.aggregate [Group_By "Letter", Sum "Points"]
t3 = t2.aggregate ["Letter"] [Aggregate_Column.Sum "Points"]
t4 = t3.distinct "Sum Points" |> materialize |> _.order_by "Sum Points"
t4.columns.map .name . should_equal ["Letter", "Sum Points"]
t4.row_count . should_equal 2
@ -81,7 +80,7 @@ add_specs suite_builder setup =
group_builder.specify "aggregates and filtering" <|
t2 = table_builder [["Letter", ["A", "B", "A", "A", "C", "C", "B"]], ["Points", [2, 5, 2, 1, 10, 3, 0]]]
t3 = t2.aggregate [Group_By "Letter", Sum "Points"]
t3 = t2.aggregate ["Letter"] [Aggregate_Column.Sum "Points"]
t4 = t3.filter "Sum Points" (Filter_Condition.Equal 5) |> materialize |> _.order_by "Letter"
t4.columns.map .name . should_equal ["Letter", "Sum Points"]
rows = t4.rows . map .to_vector
@ -90,7 +89,7 @@ add_specs suite_builder setup =
group_builder.specify "aggregates and ordering" <|
t1 = table_builder [["Letter", ["C", "A", "B", "A", "A", "C", "C", "B"]], ["Points", [0, -100, 5, 2, 1, 10, 3, 0]]]
t2 = t1.aggregate [Group_By "Letter", Sum "Points"]
t2 = t1.aggregate ["Letter"] [Aggregate_Column.Sum "Points"]
t3 = t2.order_by "Sum Points" |> materialize
t3.columns.map .name . should_equal ["Letter", "Sum Points"]
t3.at "Letter" . to_vector . should_equal ["A", "B", "C"]
@ -194,7 +193,7 @@ add_specs suite_builder setup =
vt1.should_be_a (Value_Type.Char ...)
vt1.variable_length.should_be_true
t4 = t3.aggregate [Aggregate_Column.Shortest "X", Aggregate_Column.Group_By "Y"]
t4 = t3.aggregate ["Y"] [Aggregate_Column.Shortest "X"]
vt2 = t4.at "Shortest X" . value_type
Test.with_clue "t4[X].value_type="+vt2.to_display_text+": " <|
vt2.should_be_a (Value_Type.Char ...)
@ -219,7 +218,7 @@ add_specs suite_builder setup =
c.value_type.variable_length.should_be_true
t2 = t1.set c "C"
t3 = t2.aggregate [Aggregate_Column.Shortest "C"]
t3 = t2.aggregate columns=[Aggregate_Column.Shortest "C"]
t3.at "Shortest C" . to_vector . should_equal ["b"]
vt = t3.at "Shortest C" . value_type
Test.with_clue "t3[C].value_type="+vt.to_display_text+": " <|

View File

@ -1,7 +1,6 @@
from Standard.Base import all
from Standard.Table import Value_Type, Column_Ref, Previous_Value, Blank_Selector
from Standard.Table.Data.Aggregate_Column.Aggregate_Column import Count_Distinct
from Standard.Table.Errors import all
from Standard.Database.Errors import Unsupported_Database_Operation

View File

@ -4,7 +4,7 @@ import Standard.Base.Errors.Common.Index_Out_Of_Bounds
import Standard.Base.Errors.Common.Type_Error
import Standard.Base.Errors.Illegal_Argument.Illegal_Argument
from Standard.Table.Data.Aggregate_Column.Aggregate_Column import Group_By, Sum
from Standard.Table import Aggregate_Column
from Standard.Table.Errors import all
from Standard.Test import all
@ -254,7 +254,7 @@ add_specs suite_builder setup =
group_builder.specify "Should work correctly after aggregation" <|
t0 = table_builder [["X", ["a", "b", "a", "c"]], ["Y", [1, 2, 4, 8]]]
t1 = t0.aggregate [Group_By "X", Sum "Y"]
t1 = t0.aggregate ["X"] [Aggregate_Column.Sum "Y"]
t2 = t1.order_by "X" . take 2
t2.at "X" . to_vector . should_equal ['a', 'b']

View File

@ -1,8 +1,7 @@
from Standard.Base import all
import Standard.Base.Errors.Illegal_State.Illegal_State
from Standard.Table import Sort_Column, Value_Type, Blank_Selector
from Standard.Table.Data.Aggregate_Column.Aggregate_Column import all hiding First, Last
from Standard.Table import Sort_Column, Value_Type, Blank_Selector, Aggregate_Column
from Standard.Table.Errors import No_Input_Columns_Selected, Missing_Input_Columns, No_Such_Column
from Standard.Database import all
@ -158,12 +157,12 @@ add_specs suite_builder =
data.teardown
group_builder.specify "should allow to count rows" <|
code = data.t1.aggregate [Group_By "A" "A grp", Count "counter"] . to_sql . prepare
code . should_equal ['SELECT "T1"."A grp" AS "A grp", "T1"."counter" AS "counter" FROM (SELECT "T1"."A" AS "A grp", COUNT(*) AS "counter" FROM "T1" AS "T1" GROUP BY "T1"."A") AS "T1"', []]
code = data.t1.aggregate ["A"] [Aggregate_Column.Count "counter"] . to_sql . prepare
code . should_equal ['SELECT "T1"."A" AS "A", "T1"."counter" AS "counter" FROM (SELECT "T1"."A" AS "A", COUNT(*) AS "counter" FROM "T1" AS "T1" GROUP BY "T1"."A") AS "T1"', []]
group_builder.specify "should allow to group by multiple fields" <|
code = data.t1.aggregate [Sum "A" "sum_a", Group_By "C", Group_By "B" "B grp"] . to_sql . prepare
code . should_equal ['SELECT "T1"."sum_a" AS "sum_a", "T1"."C" AS "C", "T1"."B grp" AS "B grp" FROM (SELECT SUM("T1"."A") AS "sum_a", "T1"."C" AS "C", "T1"."B" AS "B grp" FROM "T1" AS "T1" GROUP BY "T1"."C", "T1"."B") AS "T1"', []]
code = data.t1.aggregate ["C", "B"] [Aggregate_Column.Sum "A" "sum_a"] . to_sql . prepare
code . should_equal ['SELECT "T1"."C" AS "C", "T1"."B" AS "B", "T1"."sum_a" AS "sum_a" FROM (SELECT "T1"."C" AS "C", "T1"."B" AS "B", SUM("T1"."A") AS "sum_a" FROM "T1" AS "T1" GROUP BY "T1"."C", "T1"."B") AS "T1"', []]
main =
suite = Test.build suite_builder->

View File

@ -2,8 +2,7 @@ from Standard.Base import all
import Standard.Base.Errors.Common.Index_Out_Of_Bounds
import Standard.Base.Errors.Illegal_Argument.Illegal_Argument
from Standard.Table import Table, Sort_Column
from Standard.Table.Data.Aggregate_Column.Aggregate_Column import all hiding First, Last
from Standard.Table import Table, Sort_Column, Aggregate_Column
from Standard.Table.Errors import all
from Standard.Database import all
@ -424,15 +423,15 @@ add_specs (suite_builder : Suite_Builder) (prefix : Text) (create_connection_fn
group_builder.specify "should allow counting group sizes and elements" <|
## Names set to lower case to avoid issue with Redshift where columns are
returned in lower case.
aggregates = [Count "count", Count_Not_Nothing "price" "count not nothing price", Count_Nothing "price" "count nothing price"]
aggregates = [Aggregate_Column.Count "count", Aggregate_Column.Count_Not_Nothing "price" "count not nothing price", Aggregate_Column.Count_Nothing "price" "count nothing price"]
t1 = determinize_by "name" (data.t9.aggregate ([Group_By "name"] + aggregates) . read)
t1 = determinize_by "name" (data.t9.aggregate ["name"] aggregates . read)
t1.at "name" . to_vector . should_equal ["bar", "baz", "foo", "quux", "zzzz"]
t1.at "count" . to_vector . should_equal [2, 1, 5, 1, 7]
t1.at "count not nothing price" . to_vector . should_equal [2, 1, 3, 0, 5]
t1.at "count nothing price" . to_vector . should_equal [0, 0, 2, 1, 2]
t2 = data.t9.aggregate aggregates . read
t2 = data.t9.aggregate [] aggregates . read
t2.at "count" . to_vector . should_equal [16]
t2.at "count not nothing price" . to_vector . should_equal [11]
t2.at "count nothing price" . to_vector . should_equal [5]
@ -440,16 +439,16 @@ add_specs (suite_builder : Suite_Builder) (prefix : Text) (create_connection_fn
group_builder.specify "should allow simple arithmetic aggregations" <|
## Names set to lower case to avoid issue with Redshift where columns are
returned in lower case.
aggregates = [Sum "price" "sum price", Sum "quantity" "sum quantity", Average "price" "avg price"]
aggregates = [Aggregate_Column.Sum "price" "sum price", Aggregate_Column.Sum "quantity" "sum quantity", Aggregate_Column.Average "price" "avg price"]
## TODO can check the datatypes
t1 = determinize_by "name" (data.t9.aggregate ([Group_By "name"] + aggregates) . read)
t1 = determinize_by "name" (data.t9.aggregate ["name"] aggregates . read)
t1.at "name" . to_vector . should_equal ["bar", "baz", "foo", "quux", "zzzz"]
t1.at "sum price" . to_vector . should_equal [100.5, 6.7, 1, Nothing, 2]
t1.at "sum quantity" . to_vector . should_equal [80, 40, 120, 70, 2]
t1.at "avg price" . to_vector . should_equal [50.25, 6.7, (1/3), Nothing, (2/5)]
t2 = data.t9.aggregate aggregates . read
t2 = data.t9.aggregate [] aggregates . read
t2.at "sum price" . to_vector . should_equal [110.2]
t2.at "sum quantity" . to_vector . should_equal [312]
t2.at "avg price" . to_vector . should_equal [(110.2 / 11)]

View File

@ -68,7 +68,7 @@ add_specs (suite_builder : Suite_Builder) (prefix : Text) (create_connection_fn
group_builder.specify "will return Nothing for composite tables (join, aggregate)"
data.db_table_with_key.join data.db_table_with_key . default_ordering . should_equal Nothing
data.db_table_with_key.aggregate [Aggregate_Column.Group_By "X"] . default_ordering . should_equal Nothing
data.db_table_with_key.aggregate ["X"] . default_ordering . should_equal Nothing
group_builder.specify "will return the ordering determined by order_by" <|
v1 = data.db_table_with_key.order_by ["Y", Sort_Column.Name "X" Sort_Direction.Descending] . default_ordering
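Because `columns` now defaults to an empty vector, grouping without any aggregates is a valid call and effectively computes the distinct key combinations; a hedged sketch:

    distinct_x = data.db_table_with_key.aggregate ["X"]
    # One row per distinct "X". Such composite results carry no implicit
    # row order, which is why default_ordering is Nothing here.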

View File

@ -188,7 +188,7 @@ add_specs suite_builder prefix create_connection_func =
src = Table.new [[name_1, [1, 2, 3]], [name_2, [4, 5, 6]]]
t1 = src.select_into_database_table data.connection (Name_Generator.random_name "long-column-names") temporary=True
# We create 2 Maximum columns that, if wrongly truncated, will have the same name, introducing possible ambiguity in further queries.
t2 = t1.aggregate [Aggregate_Column.Group_By name_1, Aggregate_Column.Maximum name_2, Aggregate_Column.Maximum name_1]
t2 = t1.aggregate [name_1] [Aggregate_Column.Maximum name_2, Aggregate_Column.Maximum name_1]
# The newly added column would by default have a name exceeding the limit (if there is one), and its 'dumbly' truncated name will clash with the already existing column.
t3 = t1.set (t1.at name_1 * t1.at name_2)
@ -372,7 +372,7 @@ add_specs suite_builder prefix create_connection_func =
src = Table.new [["X", [1, 2, 3]]]
db_table = src.select_into_database_table data.connection (Name_Generator.random_name "long-column-names") temporary=True
long_name = "a" * (max_column_name_length + 1)
r = db_table.aggregate [Aggregate_Column.Maximum "X" new_name=long_name]
r = db_table.aggregate columns=[Aggregate_Column.Maximum "X" new_name=long_name]
r.should_fail_with Name_Too_Long
r.catch.entity_kind . should_equal "column"
r.catch.name . should_equal long_name
@ -382,7 +382,7 @@ add_specs suite_builder prefix create_connection_func =
name_b = "x" * (max_column_name_length - 1) + "B"
src = Table.new [[name_a, [1, 2, 3]], [name_b, [4, 5, 6]]]
db_table = src.select_into_database_table data.connection (Name_Generator.random_name "long-column-names") temporary=True
t2 = db_table.aggregate [Aggregate_Column.Maximum name_a, Aggregate_Column.Maximum name_b]
t2 = db_table.aggregate columns=[Aggregate_Column.Maximum name_a, Aggregate_Column.Maximum name_b]
w1 = Problems.expect_warning Truncated_Column_Names t2
w1.original_names . should_contain_the_same_elements_as ["Maximum "+name_a, "Maximum "+name_b]
@ -397,7 +397,7 @@ add_specs suite_builder prefix create_connection_func =
src2 = Table.new (names.map_with_index i-> name-> [name, [100 + i, 200 + i]])
db_table2 = src2.select_into_database_table data.connection (Name_Generator.random_name "long-column-names") temporary=True
Problems.assume_no_problems db_table2
t3 = db_table2.aggregate (names.map name-> Aggregate_Column.Maximum name)
t3 = db_table2.aggregate columns=(names.map name-> Aggregate_Column.Maximum name)
w2 = Problems.expect_warning Truncated_Column_Names t3
w2.original_names . should_contain_the_same_elements_as (names.map name-> "Maximum " + name)
t3.column_names . should_contain_the_same_elements_as w2.truncated_names
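When nothing is grouped, the tests above skip the first positional argument by naming the second; a small sketch of the two equivalent spellings (table and column names assumed):

    r1 = db_table.aggregate columns=[Aggregate_Column.Maximum "X"]
    r2 = db_table.aggregate [] [Aggregate_Column.Maximum "X"]
    # Both aggregate the whole table; only the spelling differs.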

View File

@ -4,8 +4,7 @@ import Standard.Base.Errors.Illegal_State.Illegal_State
import Standard.Base.Runtime.Ref.Ref
import Standard.Table.Data.Type.Value_Type.Bits
from Standard.Table import Table, Value_Type
from Standard.Table.Data.Aggregate_Column.Aggregate_Column import all hiding First, Last
from Standard.Table import Table, Value_Type, Aggregate_Column
from Standard.Table.Errors import Invalid_Column_Names, Inexact_Type_Coercion, Duplicate_Output_Column_Names
import Standard.Database.Data.Column.Column
@ -231,7 +230,7 @@ postgres_specific_spec suite_builder create_connection_fn db_name setup =
i.at "Value Type" . to_vector . should_equal [default_text, Value_Type.Integer, Value_Type.Boolean, Value_Type.Float]
group_builder.specify "should return Table information, also for aggregated results" <|
i = data.t.aggregate [Concatenate "strs", Sum "ints", Count_Distinct "bools"] . info
i = data.t.aggregate columns=[Aggregate_Column.Concatenate "strs", Aggregate_Column.Sum "ints", Aggregate_Column.Count_Distinct "bools"] . info
i.at "Column" . to_vector . should_equal ["Concatenate strs", "Sum ints", "Count Distinct bools"]
i.at "Items Count" . to_vector . should_equal [1, 1, 1]
i.at "Value Type" . to_vector . should_equal [default_text, Value_Type.Decimal, Value_Type.Integer]
@ -277,19 +276,19 @@ postgres_specific_spec suite_builder create_connection_fn db_name setup =
data.teardown
group_builder.specify "Concatenate, Shortest and Longest" <|
r = data.t.aggregate [Concatenate "txt", Shortest "txt", Longest "txt"]
r = data.t.aggregate columns=[Aggregate_Column.Concatenate "txt", Aggregate_Column.Shortest "txt", Aggregate_Column.Longest "txt"]
r.columns.at 0 . value_type . should_equal default_text
r.columns.at 1 . value_type . should_equal default_text
r.columns.at 2 . value_type . should_equal default_text
group_builder.specify "Counts" <|
r = data.t.aggregate [Count, Count_Empty "txt", Count_Not_Empty "txt", Count_Distinct "i1", Count_Not_Nothing "i2", Count_Nothing "i3"]
r = data.t.aggregate columns=[Aggregate_Column.Count, Aggregate_Column.Count_Empty "txt", Aggregate_Column.Count_Not_Empty "txt", Aggregate_Column.Count_Distinct "i1", Aggregate_Column.Count_Not_Nothing "i2", Aggregate_Column.Count_Nothing "i3"]
r.column_count . should_equal 6
r.columns.each column->
column.value_type . should_equal Value_Type.Integer
group_builder.specify "Sum" <|
r = data.t.aggregate [Sum "i1", Sum "i2", Sum "i3", Sum "i4", Sum "r1", Sum "r2"]
r = data.t.aggregate columns=[Aggregate_Column.Sum "i1", Aggregate_Column.Sum "i2", Aggregate_Column.Sum "i3", Aggregate_Column.Sum "i4", Aggregate_Column.Sum "r1", Aggregate_Column.Sum "r2"]
r.columns.at 0 . value_type . should_equal Value_Type.Integer
r.columns.at 1 . value_type . should_equal Value_Type.Integer
r.columns.at 2 . value_type . should_equal Value_Type.Decimal
@ -298,7 +297,7 @@ postgres_specific_spec suite_builder create_connection_fn db_name setup =
r.columns.at 5 . value_type . should_equal (Value_Type.Float Bits.Bits_64)
group_builder.specify "Average" <|
r = data.t.aggregate [Average "i1", Average "i2", Average "i3", Average "i4", Average "r1", Average "r2"]
r = data.t.aggregate columns=[Aggregate_Column.Average "i1", Aggregate_Column.Average "i2", Aggregate_Column.Average "i3", Aggregate_Column.Average "i4", Aggregate_Column.Average "r1", Aggregate_Column.Average "r2"]
r.columns.at 0 . value_type . should_equal Value_Type.Decimal
r.columns.at 1 . value_type . should_equal Value_Type.Decimal
r.columns.at 2 . value_type . should_equal Value_Type.Decimal
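The result type follows the input column rather than the aggregate itself, as the expectations above show; a compressed sketch under the same fixture:

    r = data.t.aggregate columns=[Aggregate_Column.Sum "i1", Aggregate_Column.Sum "r1"]
    # Summing the integer column stays integral; summing the float column stays Float.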

View File

@ -111,7 +111,7 @@ add_specs suite_builder create_connection_fn =
t.evaluate_expression 'is_nan([d])' . value_type . should_equal Value_Type.Boolean
t.evaluate_expression 'is_nothing([a])' . value_type . should_equal Value_Type.Boolean
t2 = t.aggregate [Aggregate_Column.Group_By "b", Aggregate_Column.Sum "a", Aggregate_Column.Maximum "a", Aggregate_Column.Count_Not_Nothing "c", Aggregate_Column.Concatenate "b", Aggregate_Column.Count, (Aggregate_Column.First "c" order_by="a")]
t2 = t.aggregate ["b"] [Aggregate_Column.Sum "a", Aggregate_Column.Maximum "a", Aggregate_Column.Count_Not_Nothing "c", Aggregate_Column.Concatenate "b", Aggregate_Column.Count, (Aggregate_Column.First "c" order_by="a")]
t2.at "b" . value_type . should_equal default_text
t2.at "Sum a" . value_type . should_equal (Value_Type.Integer Bits.Bits_64)
t2.at "Maximum a" . value_type . should_equal (Value_Type.Integer Bits.Bits_16)

View File

@ -88,7 +88,7 @@ add_specs suite_builder =
t.evaluate_expression 'is_empty([b])' . value_type . should_equal Value_Type.Boolean
t.evaluate_expression 'is_nothing([a])' . value_type . should_equal Value_Type.Boolean
t2 = t.aggregate [Aggregate_Column.Group_By "b", Aggregate_Column.Sum "a", Aggregate_Column.Maximum "a", Aggregate_Column.Minimum "d", Aggregate_Column.Count_Not_Nothing "c", Aggregate_Column.Concatenate "b", Aggregate_Column.Count]
t2 = t.aggregate ["b"] [Aggregate_Column.Sum "a", Aggregate_Column.Maximum "a", Aggregate_Column.Minimum "d", Aggregate_Column.Count_Not_Nothing "c", Aggregate_Column.Concatenate "b", Aggregate_Column.Count]
t2.at "b" . value_type . should_equal Value_Type.Char
t2.at "Sum a" . value_type . should_equal Value_Type.Integer
t2.at "Maximum a" . value_type . should_equal Value_Type.Integer

View File

@ -489,7 +489,7 @@ add_specs suite_builder make_new_connection prefix persistent_connector=True =
db_table_1 = t1.select_into_database_table data.connection (Name_Generator.random_name "source-table-1") temporary=True primary_key=Nothing
db_table_2 = db_table_1.set "[Y] + 100 * [X]" "C1" . set '"constant_text"' "C2"
db_table_3 = db_table_1.aggregate [Aggregate_Column.Group_By "X", Aggregate_Column.Sum "[Y]*[Y]" "C3"] . set "[X] + 1" "X"
db_table_3 = db_table_1.aggregate ["X"] [Aggregate_Column.Sum "[Y]*[Y]" "C3"] . set "[X] + 1" "X"
db_table_4 = db_table_2.join db_table_3 join_kind=Join_Kind.Left_Outer
db_table_4.is_trivial_query . should_fail_with Table_Not_Found

View File

@ -6,7 +6,6 @@ import Standard.Base.Errors.Common.Type_Error
import Standard.Base.Errors.Illegal_Argument.Illegal_Argument
from Standard.Table import Table, Column, Sort_Column, Aggregate_Column, Blank_Selector
from Standard.Table.Data.Aggregate_Column.Aggregate_Column import all hiding First, Last
import Standard.Table.Data.Type.Value_Type.Value_Type
from Standard.Table.Errors import Invalid_Column_Names, Duplicate_Output_Column_Names, No_Input_Columns_Selected, Missing_Input_Columns, No_Such_Column, Floating_Point_Equality, Invalid_Value_Type, Row_Count_Mismatch
@ -656,19 +655,19 @@ add_specs suite_builder =
objects = ["objects", [My.Data 0 1, My.Data 0 1, My.Data 2 2, My.Data 2 2]]
table = Table.new [dates, texts, mixed, ints, floats, objects]
t1 = table.aggregate [Group_By "dates", Shortest "texts", Aggregate_Column.First "texts", Aggregate_Column.First "objects", Aggregate_Column.First "ints", Aggregate_Column.Last "mixed"]
t1 = table.aggregate ["dates"] [Aggregate_Column.Shortest "texts", Aggregate_Column.First "texts", Aggregate_Column.First "objects", Aggregate_Column.First "ints", Aggregate_Column.Last "mixed"]
t1.info.at "Column" . to_vector . should_equal ["dates", "Shortest texts", "First texts", "First objects", "First ints", "Last mixed"]
t1.info.at "Value Type" . to_vector . should_equal [Value_Type.Date, Value_Type.Char, Value_Type.Char, Value_Type.Mixed, Value_Type.Integer, Value_Type.Mixed]
t2 = table.aggregate [Mode "dates", Count_Not_Nothing "objects", Count_Distinct "texts", Minimum "ints", Maximum "floats"]
t2 = table.aggregate [] [Aggregate_Column.Mode "dates", Aggregate_Column.Count_Not_Nothing "objects", Aggregate_Column.Count_Distinct "texts", Aggregate_Column.Minimum "ints", Aggregate_Column.Maximum "floats"]
t2.info.at "Column" . to_vector . should_equal ["Mode dates", "Count Not Nothing objects", "Count Distinct texts", "Minimum ints", "Maximum floats"]
t2.info.at "Value Type" . to_vector . should_equal [Value_Type.Date, Value_Type.Integer, Value_Type.Integer, Value_Type.Integer, Value_Type.Float]
t3 = table.aggregate [Group_By "texts", Group_By "ints", Aggregate_Column.Last "floats"]
t3 = table.aggregate ["texts", "ints"] [Aggregate_Column.Last "floats"]
t3.info.at "Column" . to_vector . should_equal ["texts", "ints", "Last floats"]
t3.info.at "Value Type" . to_vector . should_equal [Value_Type.Char, Value_Type.Integer, Value_Type.Float]
t4 = table.aggregate [Group_By "mixed", Sum "ints", Sum "floats"]
t4 = table.aggregate ["mixed"] [Aggregate_Column.Sum "ints", Aggregate_Column.Sum "floats"]
t4.info.at "Column" . to_vector . should_equal ["mixed", "Sum ints", "Sum floats"]
t4.info.at "Value Type" . to_vector . should_equal [Value_Type.Mixed, Value_Type.Float, Value_Type.Float]
@ -676,20 +675,20 @@ add_specs suite_builder =
texts = ["texts", ['ściana', 'ściana', 'łąka', 's\u0301ciana', 'ła\u0328ka', 'sciana']]
ints = ["ints", [1, 2, 4, 8, 16, 32]]
table = Table.new [texts, ints]
r1 = table.aggregate [Group_By "texts", Sum "ints"] . order_by ([Sort_Column.Name "texts"])
r1 = table.aggregate ["texts"] [Aggregate_Column.Sum "ints"] . order_by ([Sort_Column.Name "texts"])
r1.at "texts" . to_vector . should_equal ['sciana', 'ściana', 'łąka']
r1.at "Sum ints" . to_vector . should_equal [32, 11, 20]
r2 = table.aggregate [Count_Distinct "texts"]
r2 = table.aggregate columns=[Aggregate_Column.Count_Distinct "texts"]
r2.at "Count Distinct texts" . to_vector . should_equal [3]
group_builder.specify "should be able to aggregate over enso Types" <|
weekday_table = Table.new [["days", [Day_Of_Week.Monday, Day_Of_Week.Monday, Day_Of_Week.Monday, Day_Of_Week.Tuesday, Day_Of_Week.Sunday]], ["group", [1,1,2,1,2]]]
r1 = weekday_table.aggregate [Group_By "days"] . order_by "days"
r1 = weekday_table.aggregate ["days"] . order_by "days"
r1.at "days" . to_vector . should_equal [Day_Of_Week.Sunday, Day_Of_Week.Monday, Day_Of_Week.Tuesday]
r2 = weekday_table.aggregate [Group_By "group", Minimum "days" "min", Maximum "days" "max"] . order_by "group"
r2 = weekday_table.aggregate ["group"] [Aggregate_Column.Minimum "days" "min", Aggregate_Column.Maximum "days" "max"] . order_by "group"
r2.at "group" . to_vector . should_equal [1, 2]
r2.at "min" . to_vector . should_equal [Day_Of_Week.Monday, Day_Of_Week.Sunday]
r2.at "max" . to_vector . should_equal [Day_Of_Week.Tuesday, Day_Of_Week.Monday]

View File

@ -69,7 +69,7 @@ add_specs suite_builder =
json = make_json header=["A"] data=[['a', 'a']] all_rows=3 ixes_header=[] ixes=[]
vis . should_equal json
g = data.t.aggregate [Aggregate_Column.Group_By "A", Aggregate_Column.Group_By "B", Aggregate_Column.Average "C"] . at "Average C"
g = data.t.aggregate ["A", "B"] [Aggregate_Column.Average "C"] . at "Average C"
vis2 = Visualization.prepare_visualization g 1
json2 = make_json header=["Average C"] data=[[4.0]] all_rows=2 ixes_header=[] ixes=[]
vis2 . should_equal json2

View File

@ -36,7 +36,7 @@ main =
operator25 = Table.from_rows operator27 [operator27]
operator28 = operator25.first_row
operator29 = operator28.at 'current_implementation'
operator30 = operator22.aggregate [(Aggregate_Column.Group_By 'Approach'), (Aggregate_Column.Average 'Time')]
operator30 = operator22.aggregate ['Approach'] [Aggregate_Column.Average 'Time']
operator31 = operator30.filter 'Approach' (Filter_Condition.Equal operator29)
operator32 = operator31.at 'Average Time' . at 0
operator33 = operator30.set "[Average Time] / "+operator32.to_text "Percentage"

View File

@ -1,16 +1,12 @@
from Standard.Base import all
from Standard.Base.Data.Filter_Condition import Filter_Condition
from Standard.Base.Data.Map import Map
from Standard.Base.Data.Time.Date import Date
from Standard.Table import all
from Standard.Database import all
from Standard.Table.Data.Aggregate_Column import Aggregate_Column
import Standard.Visualization
main =
operator4 = [Aggregate_Column.Maximum "commit_timestamp"]
operator11 = [Aggregate_Column.Minimum "commit_timestamp"]
operator13 = [Aggregate_Column.Group_By "label"]
operator13 = ["label"]
number1 = 26
text1 = "benchs.csv"
operator5 = enso_project.data / text1
@ -18,8 +14,8 @@ main =
operator7 = operator6.row_count
operator1 = operator6.column_names
operator9 = operator6.at 'commit_timestamp'
operator3 = operator6.aggregate operator4
operator8 = operator6.aggregate operator11
operator3 = operator6.aggregate [] operator4
operator8 = operator6.aggregate [] operator11
operator12 = operator6.aggregate operator13
operator17 = operator12.at 'label'
operator18 = operator17.to_vector
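Worth noting in this script: since `group_by` is now the first positional argument, `operator6.aggregate operator13` needed no change once `operator13` became a plain list of names, while the aggregate-only calls gained an explicit empty grouping. The two shapes, with names from the script:

    grouped    = operator6.aggregate ["label"]                                         # grouping only
    summarised = operator6.aggregate [] [Aggregate_Column.Maximum "commit_timestamp"]  # aggregates only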