# Exploring Table operation performance

These benchmarks compare various approaches to computing operations on Table columns, in order to establish which best practices we should follow and to identify avenues for optimizing the language and the Table implementation.

These benchmarks are not meant to track the performance of the current implementation itself; that is the job of a separate project, `Table_Benchmarks`.

## Structure

Currently, the benchmarks are split into a few files, each exploring a separate topic, like mapping a single column, combining two columns with some operation, or computing an aggregate operation over a column. Each file may define a few Enso types, each representing a separate benchmark. Usually there are two benchmarks for each operation type: one dealing with a primitive value type like integers (`long` on the Java side) and another dealing with a reference type like `String` or `Date`. We expect their performance characteristics to differ, e.g. because Java can use `long` without boxing, so we compare them separately.

Each Enso type for a given benchmark contains multiple methods which represent various 'approaches' to computing the same operation.

Each benchmark run has a name that consists of the type that defines it, a dot, and the method representing the particular approach, e.g. `Boxed_Map_Test.enso_map_as_vector`.
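
To make that layout concrete, below is a minimal, hypothetical sketch of such a type. The type name mirrors `Boxed_Map_Test`, but the imports, operation and method bodies are illustrative and are not copied from the actual benchmark sources.

```
from Standard.Base import all
from Standard.Table import Column

type Boxed_Map_Test
    # One approach: map an Enso function directly over the column.
    enso_map_column column =
        column.map (txt-> txt + "!")

    # Another approach: convert the column to a Vector, map it in Enso and
    # rebuild a column from the result.
    enso_map_as_vector column =
        mapped = column.to_vector . map (txt-> txt + "!")
        Column.from_vector column.name mapped
```

A run exercising the second method would then be reported under the name `Boxed_Map_Test.enso_map_as_vector`.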

## Running

The runner is deliberately simple: if any options need to be customized, the Enso file itself has to be modified. One can run the whole project to execute all the benchmarks, or run a single file to execute only the benchmarks defined in it.

## Analysis

The output of the benchmarks should be saved to a file. That file can then be loaded using the Enso workflow in `tools/performance/benchmark-analysis`.

The workflow is tuned for analysing these comparative benchmarks.

At the top, one can select which file is to be analyzed. Below that, a dropdown allows selecting one particular benchmark (represented by its type, e.g. `Boxed_Map_Test`). With a benchmark selected, a scatter plot visualization compares the various approaches of that benchmark, showing the runtimes of subsequent iterations. The first 40 iterations (the number is easy to customize in the workflow) are then dropped to ensure each benchmark is sufficiently warmed up. Finally, a table shows the average runtime of each approach and how the approaches compare relative to each other; another dropdown selects the benchmark that will be used as the reference point (100%) for the average runtime comparison.
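
The core of that analysis can be sketched in a few lines of Enso. This is only an illustration: it assumes the results were loaded into a table with hypothetical `Label`, `Iteration` and `Runtime` columns, and the exact `aggregate` argument shape may differ between library versions; the real workflow in `tools/performance/benchmark-analysis` performs the equivalent steps interactively.

```
from Standard.Base import all
from Standard.Table import all

analyze_results file_path warmup_iterations=40 =
    results = Data.read file_path
    # Drop the warm-up iterations before computing summary statistics.
    warmed_up = results.filter "Iteration" (Filter_Condition.Greater warmup_iterations)
    # Average runtime of each approach; the workflow additionally divides these
    # averages by the average of a user-selected reference approach to obtain
    # the relative (100%-based) comparison.
    warmed_up.aggregate [Aggregate_Column.Group_By "Label", Aggregate_Column.Average "Runtime"]
```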