Removes a bulk of rust crates that we no longer need, but that added significant install, build and testing time to the Rust parser.
Most significantly, removed `enso-web` and `enso-shapely`, and got rid of many no longer necessary `#![feature]`s. Moved two still used proc-macros from shapely to prelude. The last remaining usage of `web-sys` is within the logger (`console.log`), but we may actually want to keep that one.
This PR updates the Rust toolchain to recent nightly.
Most of the changes are related to fixing newly added warnings and adjusting the feature flags. Also the formatter changed its behavior slightly, causing some whitespace changes.
Other points:
* Changed debug level of the `buildscript` profile to `lint-tables-only` — this should improve the build times and space usage somewhat.
* Moved lint configuration to the worksppace `Cargo.toml` definition. Adjusted the formatter appropriately.
* Removed auto-generated IntelliJ run configurations, as they are not useful anymore.
* Added a few trivial stdlib nightly functions that were removed to our codebase.
* Bumped many dependencies but still not all:
* `clap` bump encountered https://github.com/clap-rs/clap/issues/5407 — for now the warnings were silenced by the lint config.
* `octocrab` — our forked diverged to far with the original, needs more refactoring.
* `derivative` — is unmaintained and has no updated version, despite introducing warnings in the generated code. There is no direct replacement.
Generate TS bindings and lazy deserialization for the parser types.
# Important Notes
- The new API is imported into `ffi.ts`, but not yet used.
- I have tested the generated code in isolation, but cannot commit tests as we are not currently able to load WASM modules when running in `vitest`.
Libraries: Revert changes that were necessitated by a new rule we have decided not to introduce.
Parser:
- Support mixed constructors/bindings in types.
- Disallow zero-length hex sequences in character escapes: `\x`, `\u`, `\u{}`, `\U`, `\U{}` are no longer legal synonyms for `\0` (matches old parser behavior).
Fix bugs in `TreeToIr` (rewrite) and parser. Implement more undocumented features in parser. Emulate some old parser bugs and quirks for compatibility.
Changes in libs:
- Fix some bugs.
- Clean up some odd syntaxes that the old parser translates idiosyncratically.
- Constructors are now required to precede methods.
# Important Notes
Out of 221 files:
- 215 match the old parser
- 6 contain complex types the old parser is known not to handle correctly
So, compared to the old parser, the new parser parses 103% of files correctly.
Parse text literals. See: https://www.pivotaltracker.com/story/show/182496940
# Important Notes
- The left-trimming algorithm (https://github.com/enso-org/design/blob/wip/wd/enso-spec/epics/enso-spec-1.0/04.%20Expressions.md#inline-and-block-text-literals) requires two passes over the sequence of text segments. This implementation performs one pass while parsing (identifying the correct amount of trim). The other pass (applying the trim) can be done when building the value of the quoted string: Trim the amount of whitespace identified by the `trim` field off of the whitespace of each `TextSection` (the value will not exceed the amount of whitespace found in the tokens' offsets, except for tokens with 0 offset, in which case no trimming is necessary/possible).
Implements:
- UUIDs: https://www.pivotaltracker.com/story/show/182931137
- Comments: https://www.pivotaltracker.com/story/show/182981779
- Type annotations and signatures: https://www.pivotaltracker.com/story/show/182497454
- Fix getter names (https://github.com/enso-org/enso/pull/3627#discussion_r940887460).
# Important Notes
- I can't fully test UUIDs; I have tested that the data obtained in Rust matches my understanding of how the format is supposed to work. What remains to be tested is that the data in Java matches the way the old parser handles the format. So @JaroslavTulach, let me know if you see any cases where I'm not returning the same values.
- This implementation of type annotations and signatures accepts any expression in type context. It would probably be nice to narrow this down at some point, but for now I have no design info on what specifically should be allowed in type expressions; this implementation should be at least an incremental improvement.
Based on usage; I believe this handles every case in current `.enso` files.
# Important Notes
- `import` is a built-in macro, so an import statement parses as a `MultiSegmentApp`.
- Every `import` syntax will have a segment whose leading keyword is `import`; however `import` macros can be identified more efficiently by looking at only the first keyword. A `MultiSegmentApp` is an import if and only if its first keyword is in the set { "polyglot", "from", "import" }.
This PR contains all work for finishing integration of first Component List Panel in the IDE:
* It adds a stub for the whole Component Browser View. The documentation panel is re-used from the old searcher.
* It has the presenter implementation, integrating the view with Hierarchical Component List from the controller.
* It extends the View API, so the integration is possible, making use of Component Group Set wrapper.
* The selection integration was also merged into this PR, because it depended on the API extension mentioned above. However, we should avoid such practice in the future.
https://user-images.githubusercontent.com/3919101/177816427-8c4285b4-8941-4048-a400-52f4acf77a9f.mp4
# Important Notes
There are some known issues, to-be-fixed in the future.
* The performance is bad. It should be improved with new text::Area, and the decent one shall come with [GridView inside component browser](https://www.pivotaltracker.com/story/show/182561072)
* There is no keyboard navigation. It should also be delivered with [GridView](https://www.pivotaltracker.com/story/show/182561072).
* The Favorites section is not [filtered out by node source type](https://www.pivotaltracker.com/story/show/182661634).
implement simple variable assignments and function definitions.
This implements:
- https://www.pivotaltracker.com/story/show/182497122
- https://www.pivotaltracker.com/story/show/182497144 (the code blocks are not created yet, but the function declaration is recognized.)
# Important Notes
- Introduced S-expression-based tests, and pretty-printing-roundtrip testing.
- Started writing tests for TypeDef based on the examples in the issue. None of them parse successfully.
- Fixed Number tokenizing.
- Moved most contents of parser's `main.rs` to `lib.rs` (fixes a warning).
Implement generation of Java AST types from the Rust AST type definitions, with support for deserializing in Java syntax trees created in Rust.
### New Libraries
#### `enso-reflect`
Implements a `#[derive(Reflect)]` macro to enable runtime analysis of datatypes. Macro interface includes helper attributes; **the Rust types and the `reflect` attributes applied to them fully determine the Java types** ultimately produced (by `enso-metamodel`). This is the most important API, as it is used in the subject crates (`enso-parser`, and dependencies with types used in the AST). [Module docs](https://github.com/enso-org/enso/blob/wip/kw/parser/ast-transpiler/lib/rust/reflect/macros/src/lib.rs).
#### `enso-metamodel`
Provides data models for data models in Rust/Java/Meta (a highly-abstracted language-independent model--I have referred to it before as the "generic representation", but that was an overloaded term).
The high-level interface consists of operations on data models, and between them. For example, the only operations needed by [the binary that drives datatype transpilation](https://github.com/enso-org/enso/blob/wip/kw/parser/ast-transpiler/lib/rust/parser/generate-java/src/main.rs) are: `rust::to_meta`, `java::from_meta`, `java::transform::optional_to_null`, `java::to_syntax`.
The low-level interface consists of direct usage of the datatypes; this is used by [the module that implements some serialization overrides](https://github.com/enso-org/enso/blob/wip/kw/parser/ast-transpiler/lib/rust/parser/generate-java/src/serialization.rs) (so that the Java interface to `Code` references can produce `String`s on demand based on serialized offset/length pairs). The serialization override mechanism is based on customizing, not replacing, the generated deserialization methods, so as to be as robust as possible to changes in the Rust source or in the transpilation process.
### Important Notes
- Rust/Java serialization is exhaustively tested for structural compatibility. A function [`metamodel::meta::serialization::testcases`](https://github.com/enso-org/enso/blob/wip/kw/parser/ast-transpiler/lib/rust/metamodel/src/meta/serialization.rs) uses `reflect`-derived data to generate serialized representations of ASTs to use as test cases. Its should-accept cases cover every type a tree can contain; it also produces a representative set of should-reject cases. A Rust `#[test]` confirms that these cases are accepted/rejected as expected, and generated Java tests (see Binaries below) check the generated Java deserialization code against the same test cases.
- Deserializing `Code` is untested. The mechanism is in place (in Rust, we serialize only the offset/length of the `Cow`; in Java, during deserialization we obtain a context object holding a buffer for all string data; the accessor generated in Java uses the buffer and the offset/length to return `String`s), but it will be easier to test once we have implemented actually parsing something and instantiating the `Cow`s with source code.
- `#[tagged_enum]` [now supports](https://github.com/enso-org/enso/blob/wip/kw/parser/ast-transpiler/lib/rust/shapely/macros/src/tagged_enum.rs#L36-L51) control over what is done with container-level attributes; they can be applied to the container and variants (default), only to the container, or only to variants.
- Generation of `sealed` classes is supported, but currently disabled by `TARGET_VERSION` in `metamodel::java::syntax` so that tests don't require Java 15 to run. (The same logic is run either way; there is a shallow difference in output.)
### Binaries
The `enso-parser-generate-java` crate defines several binaries:
- `enso-parser-generate-java`: Performs the transpilation; after integration, this will be invoked by the build script.
- `java-tests`: Generates the Java code that tests format deserialization; after integration this command will be invoked by the build script, and its Java output compiled and run during testing.
- `graph-rust`/`graph-meta`/`graph-java`: Produce GraphViz representations of data models in different typesystems; these are for developing and understanding model transformations.
Until integration, a **script regenerates the Java and runs the format tests: `./tools/parser_generate_java.sh`**. The generated code can be browsed in `target/generated_java`.