Fixes performance problems observed when creating/resolving errors (#6674):
|before|after|
|---|---|
|![vokoscreenNG-2023-06-09_08-49-46.webm](https://github.com/enso-org/enso/assets/1047859/a0048b32-4906-41cd-8899-6e2543ef6942)|![vokoscreenNG-2023-06-09_08-50-54.webm](https://github.com/enso-org/enso/assets/1047859/fef81512-ad89-4418-ae10-d54de94d96ea)|
This also helps with #6637, although I haven't been able to reproduce the degree of slowness shown there so I can't confirm that this resolves that issue.
# Important Notes
- Disable visualizations until shown. [Faster startup, and all graph changes.]
- 6x faster message deserialization. [Saves 400ms when making a change with many visualizations open.]
- Fast edge recoloring. [Saves 100-150ms when disconnecting an edge in Orders.]
- Add a checked implementation of a `profiler` data structure, used instead of the fast `unsafe` version when `debug-assertions` are enabled.
# Important Notes
The mouse handling changes involve an unfortunate huge hack, where we enable mouse events on the mouse shape during box selection. That way we know for sure that no other shape will be able to receive mouse enter event. Then the list editor widget is modified to only actually respond to events when its background is hovered. We will definitely want a more proper way to handle mouse event contention, but it's definitely out of scope for current bugfixing.
This PR fixes#6560.
The fix has a few elements:
1) Bumps the Engine requirement to the latest release, namely `2023.1.1`.
2) Changed the logic of checking whether a given version matches the requirement. Previously, we relied on `VersionReq` from `semver` crate which did not behave intuitively when the required version had a prerelease suffix. Now we rely directly on Semantic Versioning rules of precedence.
3) Code cleanups, including deduplicating 3 copies of the version-checking code, and moving some tests to more sensible places.
Implement new Enso documentation parser; remove old Scala Enso parser.
Performance: Total time parsing documentation is now ~2ms.
# Important Notes
- Doc parsing is now done only in the frontend.
- Some engine tests had never been switched to the new parser. We should investigate tests that don't pass after the switch: #5894.
- The option to run the old searcher has been removed, as it is obsolete and was already broken before this (see #5909).
- Some interfaces used only by the old searcher have been removed.
@hubertp has reported in #5620 that sometimes enabling visualization does not send "attachVisualization" message to the engine.
The actual cause was simply because it was already attached. Updating the default visualizations (when receiving information about type) updated the preprocessors, what caused in turn attaching visualization.
That was a bug, of course. This PR fixes it: now we don't update any visualization if it's hidden.
This PR changes build script's `ide watch` and `ide start` commands, so they don't use `electron-builder` to package. Instead, they invoke `electron` directly, significantly reducing time overhead.
`ide watch` will now start Electron process, while continuously rebuilding gui and the client in the background. Changes can be puilled by reloading within the electron, or closing the electron and letting it start once again. To stop, the script should be interrupted with `Ctrl+C`.
This PR updates the build script:
* fixed issue where program version check was not properly triggering;
* improved `git-clean` command to correctly clear Scala artifacts;
* added `run.ps1` wrapper to the build script that works better with PowerShell than `run.cmd`;
* increased timeouts to work around failures on macOS nightly builds;
* replaced depracated GitHub Actions APIs (set-output) with their new equivalents;
* workaround for issue with electron builder (python2 lookup) on newer macOS runner images;
* GUI and backend dispatches to cloud were completed;
* release workflow allows creating RC releases.
When running the profiling run-graph and flamegraph demo scenes, if a profile file is not found in the directory served over http, fall back to generating demo data.
This PR reenables code signing and notarization on macOS.
[ci no changelog needed]
# Important Notes
* electron-builder has been bumped, mostly to avoid missing Python issue. A workaround for a regression with Windows installer is provided as a patch.
Provide a JNI dynamic-library interface to `enso_parser`.
# Important Notes
- The library can be built with: `cargo build -p enso-parser-jni`.
- A new `org.enso.syntax2.Parser` API is implemented on top of the JNI interface provided by `enso-parser-jni`.
- We are using the `jni` crate, since apparently Java cannot just call C-ABI functions. The crate is not well-maintained. I came across an obviously-unsound `safe` function, and found it was reported over a year ago, with a PR to fix: jni-rs/jni-rs#303. However our needs are simple. We can't trust any safety guarantees they imply, but I think we are unlikely to encounter any logic bugs using the basic bindings.
Implement generation of Java AST types from the Rust AST type definitions, with support for deserializing in Java syntax trees created in Rust.
### New Libraries
#### `enso-reflect`
Implements a `#[derive(Reflect)]` macro to enable runtime analysis of datatypes. Macro interface includes helper attributes; **the Rust types and the `reflect` attributes applied to them fully determine the Java types** ultimately produced (by `enso-metamodel`). This is the most important API, as it is used in the subject crates (`enso-parser`, and dependencies with types used in the AST). [Module docs](https://github.com/enso-org/enso/blob/wip/kw/parser/ast-transpiler/lib/rust/reflect/macros/src/lib.rs).
#### `enso-metamodel`
Provides data models for data models in Rust/Java/Meta (a highly-abstracted language-independent model--I have referred to it before as the "generic representation", but that was an overloaded term).
The high-level interface consists of operations on data models, and between them. For example, the only operations needed by [the binary that drives datatype transpilation](https://github.com/enso-org/enso/blob/wip/kw/parser/ast-transpiler/lib/rust/parser/generate-java/src/main.rs) are: `rust::to_meta`, `java::from_meta`, `java::transform::optional_to_null`, `java::to_syntax`.
The low-level interface consists of direct usage of the datatypes; this is used by [the module that implements some serialization overrides](https://github.com/enso-org/enso/blob/wip/kw/parser/ast-transpiler/lib/rust/parser/generate-java/src/serialization.rs) (so that the Java interface to `Code` references can produce `String`s on demand based on serialized offset/length pairs). The serialization override mechanism is based on customizing, not replacing, the generated deserialization methods, so as to be as robust as possible to changes in the Rust source or in the transpilation process.
### Important Notes
- Rust/Java serialization is exhaustively tested for structural compatibility. A function [`metamodel::meta::serialization::testcases`](https://github.com/enso-org/enso/blob/wip/kw/parser/ast-transpiler/lib/rust/metamodel/src/meta/serialization.rs) uses `reflect`-derived data to generate serialized representations of ASTs to use as test cases. Its should-accept cases cover every type a tree can contain; it also produces a representative set of should-reject cases. A Rust `#[test]` confirms that these cases are accepted/rejected as expected, and generated Java tests (see Binaries below) check the generated Java deserialization code against the same test cases.
- Deserializing `Code` is untested. The mechanism is in place (in Rust, we serialize only the offset/length of the `Cow`; in Java, during deserialization we obtain a context object holding a buffer for all string data; the accessor generated in Java uses the buffer and the offset/length to return `String`s), but it will be easier to test once we have implemented actually parsing something and instantiating the `Cow`s with source code.
- `#[tagged_enum]` [now supports](https://github.com/enso-org/enso/blob/wip/kw/parser/ast-transpiler/lib/rust/shapely/macros/src/tagged_enum.rs#L36-L51) control over what is done with container-level attributes; they can be applied to the container and variants (default), only to the container, or only to variants.
- Generation of `sealed` classes is supported, but currently disabled by `TARGET_VERSION` in `metamodel::java::syntax` so that tests don't require Java 15 to run. (The same logic is run either way; there is a shallow difference in output.)
### Binaries
The `enso-parser-generate-java` crate defines several binaries:
- `enso-parser-generate-java`: Performs the transpilation; after integration, this will be invoked by the build script.
- `java-tests`: Generates the Java code that tests format deserialization; after integration this command will be invoked by the build script, and its Java output compiled and run during testing.
- `graph-rust`/`graph-meta`/`graph-java`: Produce GraphViz representations of data models in different typesystems; these are for developing and understanding model transformations.
Until integration, a **script regenerates the Java and runs the format tests: `./tools/parser_generate_java.sh`**. The generated code can be browsed in `target/generated_java`.
* fixes a regression for watching ide in dev profile;
* add support for passing cargo-watch options;
* general improvements and cleanups around watch commands.
### Pull Request Description
Using the new tooling (#3491), I investigated the **performance / compile-time tradeoff** of different codegen options for release mode builds. By scripting the testing procedure, I was able to explore many possible combinations of options, which is important because their interactions (on both application performance and build time) are complex. I found **two candidate profiles** that offer specific advantages over the current `release` settings (`baseline`):
- `thin16`: Supports incremental compiles in 1/3 the time of `baseline` in common cases. Application runs about 2% slower than `baseline`.
- `fat1-O4`: Application performs 13% better than `baseline`. Compile time is almost 3x `baseline`, and non-incremental.
(See key in first chart for the settings defining these profiles.)
We can build faster or run faster, though not in the same build. Because the effect sizes are large enough to be impactful to developer and user experience, respectively, I think we should consider having it both ways. We could **split the `release` profile** into two profiles to serve different purposes:
- `release`: A profile that supports fast developer iteration, while offering realistic performance.
- `production`: A maximally-optimized profile, for nightly builds and actual releases.
Since `wasm-pack` doesn't currently support custom profiles (rustwasm/wasm-pack#1111), we can't use a Cargo profile for `production`; however, we can implement our own profile by overriding rustc flags.
### Performance details
![perf](https://user-images.githubusercontent.com/1047859/170788530-ab6d7910-5253-4a2b-b432-8bfa0b4735ba.png)
As you can see, `thin16` is slightly slower than `baseline`; `fat1-O4` is dramatically faster.
<details>
<summary>Methodology (click to show)</summary>
I developed a procedure for benchmarking "whole application" performance, using the new "open project" workflow (which opens the IDE and loads a complex project), and some statistical analysis to account for variance. To gather this data:
Build the application with profiling:
`./run.sh ide build --profiling-level=debug`
Run the `open_project` workflow repeatedly:
`for i in $(seq 0 9); do dist/ide/linux-unpacked/enso --entry-point profile --workflow open_project --save-profile open_project_thin16_${i}.json; done`
For each profile recorded, take the new `total_self_time` output of the `intervals` tool; gather into CSV:
`echo $(for i in $(seq 0 9); do target/rust/debug/intervals < open_project_thin16_${i}.json | tail -n1 | awk '{print $2}'; do`
(Note that the output of intervals should not be considered stable; this command may need modification in the future. Eventually it would be nice to support formatted outputs...)
The data is ready to graph. I used the `boxplot` method of the [seaborn](https://seaborn.pydata.org/index.html) package, in order to show the distribution of data.
</details>
#### Build times
![thin16](https://user-images.githubusercontent.com/1047859/170788539-1578e41b-bc30-4f30-9b71-0b0181322fa5.png)
In the case of changing a file in `enso-prelude`, with the current `baseline` settings rebuilding takes over 3 minutes. With the `thin16` settings, the same rebuild completes in 40 seconds.
(To gather this data on different hardware or in the future, just run the new `bench-build.sh` script for each case to be measured.)
- Change dev profile settings. Improves build performance; will not affect anything else. Details below.
- Introduce script for benchmarking various incremental builds. Usage is explained in the script comments.
- Add a line to `intervals` showing total main-thread CPU work logged in a profile; this can be used to compare the results of optimizations (I'll be starting a discussion informed by that data separately; this change just enables the tooling to report it).
Implement a command that launches the application, runs a series of steps (a "workflow"), writes a profile to a file, and exits.
See: [#181775808](https://www.pivotaltracker.com/story/show/181775808)
# Important Notes
- The command to capture run and profile is used like: `./run profile --workflow=new_project --save-profile=out.json`. Defining some more workflows (collapse nodes, create node and edit value) comes next; they are implemented with the same infrastructure as the integration-tests.
- The `--save-profile` option can also be used when profiling interactively; when the option is provided, capturing a profile with the hotkey will write a file instead of dumping the data to the devtools console.
- If the IDE panics, the error message is now printed to the console that invoked the process, as well as the devtools console. (If a batch workflow fails, this allows us to see why.)
- New functionality (writing profile files, quitting on command, logging to console) relies on Electron APIs. These APIs are implemented in `index.js`, bridged to the render process in `preload.js`, and wrapped for use in Rust in a `debug_api` crate.
See: [#181837344](https://www.pivotaltracker.com/story/show/181837344).
I've separated this PR from some deeper changes I'm making to the profile format, because the changeset was getting too complex. The new APIs and tools in this PR are fully-implemented, except the profile format is too simplistic--it doesn't currently support headers that are needed to determine the relative timings of events from different processes.
- Adds basic support for profile files containing data collected by multiple processes.
- Implements `api_events_to_profile`, a tool for converting backend message logs (#3392) to the `profiler` format so they can be merged with frontend profiles (currently they can be merged with `cat`, but the next PR will introduce a merge tool).
- Introduces `message_beanpoles`, a simple tool that diagrams timing relationships between frontend and backend messages.
### Important Notes
- All TODOs introduced here will be addressed in the next PR that defines the new format.
- Introduced a new crate, `enso_profiler_enso_data`, to be used by profile consumers that need to refer to Enso application datatypes to interpret metadata.
- Introduced a `ProfileBuilder` abstraction for writing the JSON profile format; partially decouples the runtime event log structures from the format definition.
- Introducing the conversion performed for `ProfilerBuilder` uncovered that the `.._with_same_start!` low-level `profiler` APIs don't currently work; they return `Started<_>` profilers, but that is inconsistent with the stricter data model that I introduced when I implemented `profiler_data`; they need to return profilers in a created, unstarted state. Low-level async profilers have not been a priority, but once #3382 merges we'll have a way to render their data, which will be really useful because async profilers capture *why* we're doing things. I'll bring up scheduling this in the next performance meeting.
In this branch:
* The workaround for cursor-not-being-updated-after-closing-searcher bug (discovered while testing #3278) is reverted.
* The proper fix was introduced: created an abstraction for EnsoGL component, which, when dropping, will not immediately drop the FRP network and model, but instead put it into the Garbage Collector. The Collector ensures, that all "component hiding" effects and events will be handled, and drops FRP network and model only after that.
* I run clippy for wasm32 target out of curiosity. There was one warning, and I fixed it on this branch.
Use a new algorithm for placement of new nodes in cases when:
- a) there is no selected node, and the `TAB` key is pressed while the mouse pointer is near an existing node (especially in an area below an existing node);
- b) a connection is dragged out from an existing node and dropped near the node (especially in an area below the node).
In both cases mentioned above, the new node will now be placed in a location suggested by an internal algorithm, aligned to existing nodes. Specifically, the placement algorithm used is similar to when pressing `TAB` with a node selected.
For more details, see: https://www.pivotaltracker.com/story/show/181076066
# Important Notes
- Visible visualizations enabled with the "eye icon" button are treated as part of a node. (In case of nodes with errors, visualizations are not visible, and are not treated as part of a node.)
API for storing metadata.
See: https://www.pivotaltracker.com/story/show/181149277
# Important Notes
**New APIs**:
- Storing metadata is implemented with `profiler::MetadataLogger`.
- A full metadata storage/retrieval example is in [the top-level doctests](https://github.com/enso-org/enso/blob/wip/kw/profiling-metadata-api/lib/rust/profiler/data/src/lib.rs) for profiler::data, a crate which implements an API for profiling data consumers (it abstracts away the low-level details of the event log, and checks its invariants in the process) [after review of this new API here I'll open a PR to add it to the design doc].
**Implementation**:
- `profiler::Event` is parameterized by a metadata type, so that different types of metadata can be dependency-injected into it.
- A data consumer defines its metadata type as an enum of all the kinds of metadata it is interested in.
- Producing the metadata enum is accomplished without defining its type (which would require dependencies from around the app): A `MetadataLogger` internally use a serialization helper `Variant` to serialize its variant of the metadata enum without knowledge of the other possible variants.
**Performance impact**: still in the low ns/measurement range, comparable to pushing to a vec.
*Note*: `LocalVecBuilder` is currently present under the name `Log`, which is accurate but probably too overloaded. I'd like to find the right name for it, document it with examples, and move it to its own crate under data-structures, but I don't want doing that to hold up this PR.
* profiling instrumentation
* Support native testing with mock impl of `mod js`
* Add benchmarks
* Wrapper: support methods.
* `#[profile]`: work in any context
* feature-gate lineno info that breaks IDE
* Support async; more docs; add perf analysis
* docs & formatting
This change adds utility code for calculating summaries from multiple samples (snapshots) of EnsoGL runtime stats values.
This internal feature is expected to be used by Enso IDE performance profiling tools, which are planned to be added in the near future.
https://www.pivotaltracker.com/story/show/181093920
A demo scene named `stats` was added, showcasing how to perform calculations using the new tools. Currently, the summary calculations in the scene work only when the EnsoGL stats Monitor Panel is visible; this is planned to be improved in a future task (https://www.pivotaltracker.com/story/show/181093601).
- Note: the stats aggregation code is intended to be later used in Enso IDE's main rendering loop, so it needs to have very good performance characteristics.
- Due to that, `Accumulator` was designed to only use simple addition arithmetic, and be constant-memory once created.