Commit Graph

40 Commits

Author SHA1 Message Date
Kaz Wesley
4d4a2990a0
Distinguish assignment/thunk by statement context (#11324)
Align `Assignment`/`Function` distinction in AST with compiler's implemented semantics:
- The ambiguous case `funcOrVar = expression` is now parsed as a `Function` when in a `Type` definition or in the top level of a module. I.e. it is an `Assignment` in contexts where the RHS is evaluated immediately when the binding is evaluated, and a `Function` in contexts where the RHS is evaluated each time the bound name is evaluated.
- `Assignment` statements now may only occur in function bodies.

Correcting this distinction lays the groundwork for #11302.

Other changes:
- Fixed incorrect source code locations for negative literals and negated expressions.

# Important Notes
New APIs:
- The parser now exposes a `parse_block` entry point, which allows parsing input lines as if in the body of a function. The previous entry point has been renamed to `parse_module`.
2024-10-18 17:54:55 +00:00
Kaz Wesley
289198127f
Stateless parser API (#11147)
Stateless (static) parser interface. Buffer-reuse optimization is now hidden within `Parser` implementation. Fixes #11121 and prevents similar bugs.

# Important Notes
- Also simplify `EnsoParser` API, exposing only a higher-level interface.
2024-09-27 15:58:02 +00:00
Hubert Plociniczak
7c413298fb
More info when critical failure occurs (#11092)
* More info when critical failure occurs

Log problematic module to help with debugging critical failure.

* One more exception

* s/System.err/Logger.error/

* maybe append slf4j deps

* fix what looks like a long standing typo

`GeneratedFormatTests.java` not `GeneratedFormatTests..java`

* one more typo

* Fix directory where to look for classpath

`./run java-gen test --skip-version-check` now works. At least locally.

* Local is fine, CI is not. More temporary debugging...

* Ensure project's managedClasspath is exported

Running java tests requires us knowing all additional dependencies as
they have to be added to the classpath manually. That can only be
ensured by invoking the right sbt target.

* Move sbt call after graalvm setup

* removing CI debugging

* Apply suggestions from code review

Co-authored-by: Kaz Wesley <kaz@lambdaverse.org>

---------

Co-authored-by: Kaz Wesley <kaz@lambdaverse.org>
2024-09-23 10:59:56 +02:00
Kaz Wesley
e587d564f8
Improve backend error handling (#11136)
- Fix debug logging for #11088 case--attempt to create an exception that is its own cause fails.
- In case the parser is used after closing, throw an `IllegalStateException` instead of UB. (This case is not known to occur and doesn't seem to be behind the #11121, but we should handle it more safely if it does.)
2024-09-20 13:23:52 +00:00
Paweł Grabarz
f14b79f8cf
Rust bump, reduce dependencices (#10803)
Updated rust version, removed some unnecessary or problematic dependencies. Ported some changes from bazel branch.
2024-08-13 23:16:55 +00:00
Kaz Wesley
aafdef1aeb
Improve parser contextualization (#10734) 2024-08-05 15:46:58 +00:00
Kaz Wesley
8b48637691
Provide syntax warnings to Java (#10645)
Translate syntax warnings and attach to IR when translating operator applications.

We should ensure that all Trees are checked for warnings and every warning is attached to some IR. That would require a bit of refactoring: In TreeToIr, we could define helpers wrapping every IR constructor and accepting a `Tree` parameter. The `Tree` could be used to populate the `IdentifiedLocation` when constructing the IR type, and then to attach all warnings after constructing the IR object.

# Important Notes
- Update JNI dependency.
- Introduces a `cargo bench` runner for parser.
2024-07-24 17:54:23 +00:00
Jaroslav Tulach
515d8238bb
Support for --jvm option in Enso runner (#10374)
Addresses one of two concerns of #5298 - adds support for `--jvm` argument to allow us to switch from _native image_ built Enso binary (as developed by #10126) to regular JVM based Enso execution. This change _doesn't affect production builds_. The _native executable_ continues to be only built by `engine-runner/buildNativeImage` which is tested on CI, but not in the production jobs.
2024-07-06 07:02:20 +00:00
Jaroslav Tulach
fe2cf49568
Run whole test/Base_Tests in native image runner (#10296) 2024-06-21 06:03:53 +02:00
Jaroslav Tulach
dee9e079d4
Enso language support with parser in VSCode, IGV, etc. (#7054)
Outline view and completions for Enso code in VSCode.

# Important Notes
This PR provides the necessary infrastructure for building VSCode extension that includes `enso_parser` library compiled for all supported platforms.

VSCode extension can now use libraries from `sbt` that are `publishM2`-ready. To make that possible a documentation must have been provided and fixed for those modules - hence so many changes in `.scala` classes.

<img width="862" alt="image" src="https://github.com/enso-org/enso/assets/26887752/7374bf41-bdc6-4322-b562-85a2e761de2a">

Last, but not least. The outline view and completions display something.
2024-06-14 14:01:37 +00:00
Hubert Plociniczak
ca8d715d5a
Hotfix for finding parser library (#10123)
* Hotfix for finding parser library

Since ydoc is now started by language server, parser is initialized
differently and attempts to find `libenso_parser.so` in
`component/runner` rather than `component` directory.

* Add fallbacks

* fix native image build
2024-05-29 23:43:53 +02:00
Jaroslav Tulach
16c1b74218
Enso Library Feature to execute (a bit of) Base_Tests (#9997) 2024-05-23 08:20:19 +02:00
Dmitry Bushev
5995a00958
Run ydoc-server with GraalVM (#9528)
part of #7954

# Important Notes
The workflow is:
- `$ npm install` -- just in case
- `$ npm --workspace=enso-gui2 run build-ydoc-server-polyglot` -- build the `ydocServer.js` bundle
- `$ sbt ydoc-server/assembly` -- build the ydoc server jar
- `env POLYGLOT_YDOC_SERVER=true npm --workspace=enso-gui2 run dev` -- run the dev server with the polyglot ydoc server. Providing `POLYGLOT_YDOC_SERVER_DEBUG=true` env variable enables the chrome debugger
2024-05-02 06:28:57 +00:00
Michał Wawrzyniec Urbańczyk
90bbee352e
Bump Rust Toolchain (#9517)
This PR updates the Rust toolchain to recent nightly.

Most of the changes are related to fixing newly added warnings and adjusting the feature flags. Also the formatter changed its behavior slightly, causing some whitespace changes.

Other points:
* Changed debug level of the `buildscript` profile to `lint-tables-only` — this should improve the build times and space usage somewhat.
* Moved lint configuration to the worksppace `Cargo.toml` definition. Adjusted the formatter appropriately.
* Removed auto-generated IntelliJ run configurations, as they are not useful anymore.
* Added a few trivial stdlib nightly functions that were removed to our codebase.
* Bumped many dependencies but still not all:
* `clap` bump encountered https://github.com/clap-rs/clap/issues/5407 — for now the warnings were silenced by the lint config.
* `octocrab` — our forked diverged to far with the original, needs more refactoring.
* `derivative` — is unmaintained and has no updated version, despite introducing warnings in the generated code. There is no direct replacement.
2024-03-24 23:45:55 +00:00
Kaz Wesley
ce042569b0
line:col positions in parser (#8203)
Add `line:column` information to source code references produced by the parser. This information will be used by GUI2 as part of the solution to #8134.

# Important Notes
- `parse_all_enso_files.sh` has been used to ensure this doesn't affect tree structures.
- `parse_all_enso_files.sh` now checks emitted locations for consistency, and has been used to verify that all line:col references match the values found by an independent scan of the source up to the given UTF8 position.
2023-11-08 16:53:39 +00:00
Dmitry Bushev
6249c79ffd
Update sbt-java-formatter plugin (#7011)
Update java formatter plugin. The new version can remove unused imports.
2023-06-12 14:18:48 +00:00
Michael Mauderer
349cc210e0
Bump rustc to nightly-2023-01-12 (#4053)
Bump rustc nightly-2022-08-30 and fix new errors and lints.
https://www.pivotaltracker.com/story/show/184229094
2023-02-02 23:05:25 +00:00
Wojciech Daniło
ce5b078130
Dependency cleaning (#4092) 2023-01-27 23:39:37 +01:00
Michael Mauderer
c7c555314a
Bump rustc nightly-2022-11-22 (#3911) 2022-11-30 03:16:25 +01:00
Wojciech Daniło
717325f472
removing optional dependencies from prelude (#3922) 2022-11-28 12:42:31 +01:00
Michał Wawrzyniec Urbańczyk
6b25cfa91f
Build hotfix (#3916) 2022-11-25 14:05:57 +01:00
Jaroslav Tulach
5ce173316b
More improvements that work with both parsers (#3868) 2022-11-12 02:34:14 +01:00
Marcin Kostrzewa
23e04f905f
Another attempt at M1 compilation (#3859) 2022-11-09 15:26:25 +00:00
Jaroslav Tulach
c2633bc137
Metadata, in context and imports (#3856)
Another set of improvements extracted from #3611. This time it includes a fix to the Rust part of the parser.

# Important Notes
After digging into metadata parsing I realized the positions used to query the BTree data structure are wrong. This PR tries to address that by re-arranging the order of serialized fields and passing `startCode` and `endCode` locations in.

Originally I though I need changes on the Rust side to support `in` operator. Turned out I can do that just with changes on the Java side.

Qualified names in imports were missing UUIDs. Fixed now.
2022-11-07 19:05:19 +00:00
Jaroslav Tulach
85f71cbfc8
Including enso_parser library in the engine distribution (#3842)
Make sure `libenso_parser.so`, `.dll` or `.dylib` are packaged and included when `sbt buildEngineDistribution`.

# Important Notes
There was [a discussion](https://discord.com/channels/401396655599124480/1036562819644141598) about proper location of the library. It was concluded that _"there's no functional difference between a dylib and a jar."_ and as such the library is placed in `component` folder.

Currently the old parser is still used for parsing. This PR just integrates the build system changes and makes us ready for smooth flipping of the parser in the future as part of #3611.
2022-11-02 17:13:53 +00:00
Jaroslav Tulach
21fe0b8865
Process documentation in type definitions (#3806)
Another part of #3611 ready for integration into `develop` branch.

# Important Notes
Test `org.enso.compiler.EnsoCompilerTest.testTestGroup` is ignored as it has problems with source offsets - identifiers don't have the appropriate names due to `Tree.codeRepr()` being _off_.
2022-10-19 09:19:42 +00:00
Michał Wawrzyniec Urbańczyk
ad69eeb4ad
Build script merge (#3743)
Merged the build script into main repository. Some related cleanups.
2022-10-10 23:38:48 +02:00
Kaz Wesley
44a031f9f0
Parser: Full constructor syntax for type definitions; Field syntax; Complex operator sections; Template functions; Text improvements; Operator methods; eliminate Unsupported; better ArgumentDefinitions (#3716)
I believe all parse failures remaining after these changes are because the new parser is intentionally stricter about some things. I'll be reviewing those failures and opening a bug to change the library/tests code.

Implements:
- https://www.pivotaltracker.com/story/show/182941610: full type def syntax
- https://www.pivotaltracker.com/story/show/182497490: field syntax
- https://www.pivotaltracker.com/story/show/182497395: complex operator sections
- https://www.pivotaltracker.com/story/show/182497236: template functions
- `codeRepr` without leading whitespace
- text literals: interpret escape sequences in lexer
- the multiline text-literal left-trim algorithm
- type operator-methods
- the `<=` operator is no longer treated as a modifier
- https://www.pivotaltracker.com/story/show/183315038: eliminate Unsupported
- use ArgumentDefinition for type constructor arguments
- more detailed ArgumentDefinition type
2022-10-05 04:45:31 +00:00
Wojciech Daniło
61546a7ade
Wip/wdanilo/widgets 182746060 (#3678) 2022-10-04 04:51:27 +02:00
Jaroslav Tulach
9134f9b2d7
EnsoCompilerTest to verify compatibility of parsers (#3723)
Adding new _compatibility test_ `EnsoCompilerTest` to verify the new Rust based parser can produce the same `IR` as the original `AST` based one. The simplest way to execute the test from an empty repository is:
```bash
enso$ sbt bootstrap
enso$ sbt "testOnly *EnsoCompilerTest"
```

There are [GitHub Actions run](https://github.com/enso-org/enso/actions/runs/3087664644/jobs/4993266212#step:9:5187) on Linux as well as [run on Windows](https://github.com/enso-org/enso/actions/runs/3087664644/jobs/4993266370#step:9:5254) that show `EnsoCompilerTest` is being executed by the CI (good, as that means `.so` was properly built and linked to the JVM running the test). The [linux](https://github.com/enso-org/enso/actions/runs/3087664644/jobs/4993266212#step:9:5187) as well as [windows](https://github.com/enso-org/enso/actions/runs/3087664644/jobs/4993266370#step:9:5254) runs also demonstrate that failures in the `EnsoCompilerTest` suite fail the CI.

# Important Notes
Right now [there are five test failures](https://github.com/enso-org/enso/actions/runs/3087664644/jobs/4993266212#step:9:5187) - waiting for @kazcw to make sure `codeRepr()` doesn't contain spaces. However, as this PR is more about the infrastructure, I am disabling the currently failing tests in [031169b](031169bd05)
2022-09-20 15:50:27 +00:00
Kaz Wesley
d8f274158a
Parser: Named and default arguments; Text interpolation; Escape sequences (#3709)
* named and default arguments

* text interpolation and escapes

* work around a limitation of Java
2022-09-14 22:32:28 -07:00
Kaz Wesley
605bd08e8d
Parser: Utf16, recursive spans, toString, lambdas, case expressions, operator precedence, array and tuple literals, numeric literals (#3706)
Implements:
- https://www.pivotaltracker.com/story/show/182807114 - Utf16
- https://www.pivotaltracker.com/story/show/182931097 - recursive span info
- https://www.pivotaltracker.com/story/show/182940917 - readable `toString`
- https://www.pivotaltracker.com/story/show/182497196 - lambdas
- https://www.pivotaltracker.com/story/show/182497518 - case expressions
- https://www.pivotaltracker.com/story/show/182497344 - operator precedence and associativity
- https://www.pivotaltracker.com/story/show/182497111 - array and tuple literals
- https://www.pivotaltracker.com/story/show/182496909 - numeric literals
2022-09-14 18:09:58 +00:00
Kaz Wesley
1e3b9a3624
Parse text literals (#3681)
Parse text literals. See: https://www.pivotaltracker.com/story/show/182496940

# Important Notes
- The left-trimming algorithm (https://github.com/enso-org/design/blob/wip/wd/enso-spec/epics/enso-spec-1.0/04.%20Expressions.md#inline-and-block-text-literals) requires two passes over the sequence of text segments. This implementation performs one pass while parsing (identifying the correct amount of trim). The other pass (applying the trim) can be done when building the value of the quoted string: Trim the amount of whitespace identified by the `trim` field off of the whitespace of each `TextSection` (the value will not exceed the amount of whitespace found in the tokens' offsets, except for tokens with 0 offset, in which case no trimming is necessary/possible).
2022-09-03 06:38:06 +00:00
Kaz Wesley
c3f758e0dc
Parser: Parse UUIDs; implement comments in AST; implement type annotations and signatures; fix field names (#3653)
Implements:
- UUIDs: https://www.pivotaltracker.com/story/show/182931137
- Comments: https://www.pivotaltracker.com/story/show/182981779
- Type annotations and signatures: https://www.pivotaltracker.com/story/show/182497454
- Fix getter names (https://github.com/enso-org/enso/pull/3627#discussion_r940887460).

# Important Notes
- I can't fully test UUIDs; I have tested that the data obtained in Rust matches my understanding of how the format is supposed to work. What remains to be tested is that the data in Java matches the way the old parser handles the format. So @JaroslavTulach, let me know if you see any cases where I'm not returning the same values.
- This implementation of type annotations and signatures accepts any expression in type context. It would probably be nice to narrow this down at some point, but for now I have no design info on what specifically should be allowed in type expressions; this implementation should be at least an incremental improvement.
2022-09-03 03:15:27 +00:00
Kaz Wesley
60b1dce79e
Parser: hide internal APIs in generated Java (#3605)
Now that there's a public `org.enso.syntax2.Parser` interface (after #3599), make APIs that don't need to be exposed package-private.
2022-08-09 23:32:49 +02:00
Kaz Wesley
796b1b5b82
Parser: implement import (#3627)
Based on usage; I believe this handles every case in current `.enso` files.

# Important Notes
- `import` is a built-in macro, so an import statement parses as a `MultiSegmentApp`.
- Every `import` syntax will have a segment whose leading keyword is `import`; however `import` macros can be identified more efficiently by looking at only the first keyword. A `MultiSegmentApp` is an import if and only if its first keyword is in the set { "polyglot", "from", "import" }.
2022-08-02 15:09:20 +00:00
Kaz Wesley
c525b201b9
Parser: don't panic for any standard library files (#3609) 2022-07-28 19:17:33 +02:00
Kaz Wesley
c670718e3c
JNI bindings for enso-parser (#3599)
Provide a JNI dynamic-library interface to `enso_parser`.

# Important Notes
- The library can be built with: `cargo build -p enso-parser-jni`.
- A new `org.enso.syntax2.Parser` API is implemented on top of the JNI interface provided by `enso-parser-jni`.
- We are using the `jni` crate, since apparently Java cannot just call C-ABI functions. The crate is not well-maintained. I came across an obviously-unsound `safe` function, and found it was reported over a year ago, with a PR to fix: jni-rs/jni-rs#303. However our needs are simple. We can't trust any safety guarantees they imply, but I think we are unlikely to encounter any logic bugs using the basic bindings.
2022-07-25 14:24:21 +00:00
Kaz Wesley
3b99e18f94
Code blocks (#3585) 2022-07-20 16:53:20 +02:00
Kaz Wesley
bc66078251
Parser: Transpile Rust AST types to Java types (#3555)
Implement generation of Java AST types from the Rust AST type definitions, with support for deserializing in Java syntax trees created in Rust.

### New Libraries

#### `enso-reflect`

Implements a `#[derive(Reflect)]` macro to enable runtime analysis of datatypes. Macro interface includes helper attributes; **the Rust types and the `reflect` attributes applied to them fully determine the Java types** ultimately produced (by `enso-metamodel`). This is the most important API, as it is used in the subject crates (`enso-parser`, and dependencies with types used in the AST). [Module docs](https://github.com/enso-org/enso/blob/wip/kw/parser/ast-transpiler/lib/rust/reflect/macros/src/lib.rs).

#### `enso-metamodel`

Provides data models for data models in Rust/Java/Meta (a highly-abstracted language-independent model--I have referred to it before as the "generic representation", but that was an overloaded term).

The high-level interface consists of operations on data models, and between them. For example, the only operations needed by [the binary that drives datatype transpilation](https://github.com/enso-org/enso/blob/wip/kw/parser/ast-transpiler/lib/rust/parser/generate-java/src/main.rs) are: `rust::to_meta`, `java::from_meta`, `java::transform::optional_to_null`, `java::to_syntax`.

The low-level interface consists of direct usage of the datatypes; this is used by [the module that implements some serialization overrides](https://github.com/enso-org/enso/blob/wip/kw/parser/ast-transpiler/lib/rust/parser/generate-java/src/serialization.rs) (so that the Java interface to `Code` references can produce `String`s on demand based on serialized offset/length pairs). The serialization override mechanism is based on customizing, not replacing, the generated deserialization methods, so as to be as robust as possible to changes in the Rust source or in the transpilation process.

### Important Notes

- Rust/Java serialization is exhaustively tested for structural compatibility. A function [`metamodel::meta::serialization::testcases`](https://github.com/enso-org/enso/blob/wip/kw/parser/ast-transpiler/lib/rust/metamodel/src/meta/serialization.rs) uses `reflect`-derived data to generate serialized representations of ASTs to use as test cases. Its should-accept cases cover every type a tree can contain; it also produces a representative set of should-reject cases. A Rust `#[test]` confirms that these cases are accepted/rejected as expected, and generated Java tests (see Binaries below) check the generated Java deserialization code against the same test cases.
- Deserializing `Code` is untested. The mechanism is in place (in Rust, we serialize only the offset/length of the `Cow`; in Java, during deserialization we obtain a context object holding a buffer for all string data; the accessor generated in Java uses the buffer and the offset/length to return `String`s), but it will be easier to test once we have implemented actually parsing something and instantiating the `Cow`s with source code.
- `#[tagged_enum]` [now supports](https://github.com/enso-org/enso/blob/wip/kw/parser/ast-transpiler/lib/rust/shapely/macros/src/tagged_enum.rs#L36-L51) control over what is done with container-level attributes; they can be applied to the container and variants (default), only to the container, or only to variants.
- Generation of `sealed` classes is supported, but currently disabled by `TARGET_VERSION` in `metamodel::java::syntax` so that tests don't require Java 15 to run. (The same logic is run either way; there is a shallow difference in output.)

### Binaries

The `enso-parser-generate-java` crate defines several binaries:
- `enso-parser-generate-java`: Performs the transpilation; after integration, this will be invoked by the build script.
- `java-tests`: Generates the Java code that tests format deserialization; after integration this command will be invoked by the build script, and its Java output compiled and run during testing.
- `graph-rust`/`graph-meta`/`graph-java`: Produce GraphViz representations of data models in different typesystems; these are for developing and understanding model transformations. 

Until integration, a **script regenerates the Java and runs the format tests: `./tools/parser_generate_java.sh`**. The generated code can be browsed in `target/generated_java`.
2022-07-07 04:46:42 +02:00