mirror of
https://github.com/enso-org/enso.git
synced 2024-11-22 22:10:15 +03:00
f0de43a970
Working on compiler IR is a daunting task. I have therefore added a new system property `enso.compiler.dumpIr` that will help with that. It dumps the encountered IRs to the `ir-dumps` directory in the [GraphViz](www.graphviz.org) format. More info in the updated docs. Note that all the functionality to dump IRs to `dot` files was already implemented. This PR just adds the command line option and updates the docs.

# Important Notes

- `--dump-graphs` cmd line option is removed as per [Jaroslav's request](https://github.com/enso-org/enso/pull/10740#pullrequestreview-2216676140).
- To dump graphs, use `-Dgraal.Dump=Truffle:2` system property passed via `JAVA_OPTS` env var.

If you run `env JAVA_OPTS='-Denso.compiler.dumpIr=true' enso --run tmp.enso` where `tmp.enso` is, e.g.:

```
from Standard.Base import all
main = 42
```

You will then have something like:

```
$ ls ir-dumps
Standard.Base.Data.Filter_Condition.dot      Standard.Base.Data.Time.dot               Standard.Base.System.Advanced.dot       Standard.Base.Warning.dot
Standard.Base.Data.Locale.dot                Standard.Base.Enso_Cloud.Enso_File.dot    Standard.Base.System.File.Advanced.dot  tmp.dot
Standard.Base.Data.Numeric.dot               Standard.Base.Errors.dot                  Standard.Base.System.File.dot
Standard.Base.Data.Numeric.Internal.dot      Standard.Base.Network.HTTP.Internal.dot   Standard.Base.System.File.Generic.dot
Standard.Base.Data.Text.Regex.Internal.dot   Standard.Base.Runtime.dot                 Standard.Base.System.Internal.dot
```

You can then visualize any of these with `dot -Tsvg -O ir-dumps/tmp.dot`. An example of how that could look: ![image.svg](https://github.com/user-attachments/assets/26ab8415-72cf-46da-bc63-f475e9fa628e)
217 lines
13 KiB
Markdown
---
layout: developer-doc
title: Runtime Guide
category: summary
tags: [contributing, guide, graal, truffle]
order: 7
---

# Runtime Guide

## GraalVM and Truffle

### Papers

1. [One VM To Rule Them All](http://lafo.ssw.uni-linz.ac.at/papers/2013_Onward_OneVMToRuleThemAll.pdf)
   a high-level overview of what GraalVM is and how it works.
2. [Practical Partial Evaluation for High-Performance Dynamic Language Runtimes](https://chrisseaton.com/rubytruffle/pldi17-truffle/pldi17-truffle.pdf)
   an introduction to basic Truffle concepts, including Polymorphic Inline
   Caches and other frequently used techniques.
3. [Fast, Flexible, Polyglot Instrumentation Support for Debuggers and other Tools](https://arxiv.org/pdf/1803.10201.pdf)
   an introduction to the Truffle instrumentation framework – this is what
   Enso's runtime server for the IDE uses.
4. [Cross-Language Interoperability in a Multi-Language Runtime](https://chrisseaton.com/truffleruby/cross-language-interop.pdf)
   an introduction to how Truffle cross-language interop works.
5. [The whole list of publications](https://www.graalvm.org/community/publications/)
   because something may be useful at some point.
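
To give a feel for what a polymorphic inline cache does before you read the second paper, here is a minimal plain-Java sketch. It is not Truffle code: `InlineCacheSketch`, `slowLookup` and the cache limit are made-up names for illustration. In a real Truffle interpreter the same idea is normally expressed through `@Specialization` guards and `@Cached` parameters, with the caching code generated by the DSL.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;

/** Plain-Java illustration of a polymorphic inline cache; not Truffle API. */
final class InlineCacheSketch {
  /** One cache entry: a guard (the receiver class) and the resolved target. */
  private static final class Entry {
    final Class<?> type;
    final Function<Object, String> target;

    Entry(Class<?> type, Function<Object, String> target) {
      this.type = type;
      this.target = target;
    }
  }

  private final List<Entry> entries = new ArrayList<>();
  private final int limit;
  int slowLookups = 0; // counts how often the slow path was taken

  InlineCacheSketch(int limit) {
    this.limit = limit;
  }

  /** Dispatches on the receiver's class, caching at most `limit` targets. */
  String call(Object receiver) {
    for (Entry e : entries) {
      if (e.type == receiver.getClass()) {
        return e.target.apply(receiver); // cache hit: no lookup needed
      }
    }
    Function<Object, String> target = slowLookup(receiver.getClass());
    if (entries.size() < limit) {
      entries.add(new Entry(receiver.getClass(), target)); // still polymorphic
    }
    // Once the cache is full, every uncached type pays for the slow lookup.
    return target.apply(receiver);
  }

  /** Stand-in for an expensive method resolution. */
  private Function<Object, String> slowLookup(Class<?> type) {
    slowLookups++;
    if (type == Integer.class) return r -> "int:" + r;
    if (type == String.class) return r -> "text:" + r;
    return r -> "other:" + r;
  }

  public static void main(String[] args) {
    InlineCacheSketch site = new InlineCacheSketch(2);
    System.out.println(site.call(1));
    System.out.println(site.call(2));
    System.out.println(site.call("a"));
    System.out.println("slow lookups: " + site.slowLookups);
  }
}
```

The second `Integer` call hits the cache, so only two slow lookups happen for the three calls in `main`.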

### Tutorials

1. [The list of Truffle docs on specific topics](https://github.com/oracle/graal/tree/master/truffle/docs)
   Certain more advanced topics are covered in these, use as needed.
   1. [Optimizing Tutorial](https://github.com/oracle/graal/blob/master/truffle/docs/Optimizing.md)
      you'll want to read this one for sure.
   2. [TruffleLibrary Tutorial](https://github.com/oracle/graal/blob/master/truffle/docs/TruffleLibraries.md)
      this is an important architectural concept for building Truffle
      interpreters. We wish we knew about this sooner and we recommend you
      structure the interpreter around this in the future.
2. [A tutorial on building a LISP in Truffle](https://cesquivias.github.io/blog/2015/01/15/writing-a-language-in-truffle-part-4-adding-features-the-truffle-way/)
   It's a 4-part tutorial, linked is part 4, start with part 2 (part 1 is not
   about Truffle). This one is important, even though it is old and uses stale
   APIs – it will still highlight the most important concepts, in particular the
   way Enso deals with lexical scopes and Tail Call Optimization.
3. [Simple Language](https://github.com/graalvm/simplelanguage) this is an
   implementation of a very simple toy language. Read it for basic understanding
   of simple Truffle concepts.

### Tips and Tricks

1. Familiarize yourself with
   [IGV](https://www.graalvm.org/graalvm-as-a-platform/language-implementation-framework/Profiling/).
   It's a horrible tool. It's clunky, ugly, and painful to use. It has also
   saved us more times than we can count, so it is definitely worth investing
   the time to understand it. Download
   [Enso Language Support for IGV](../tools/enso4igv/README.md). Use
   [this tutorial](https://shopify.engineering/understanding-programs-using-graphs)
   (and
   [the follow up post](https://chrisseaton.com/truffleruby/basic-truffle-graphs/))
   to familiarize yourself with the representation.
2. Use our sbt
   [`withDebug`](https://github.com/enso-org/enso/blob/develop/project/WithDebugCommand.scala)
   utility. Familiarize yourself with the different options. It is a useful
   helper for running your programs and microbenchmarks with different Truffle
   debugging options.
3. Use [hsdis](https://github.com/liuzhengyang/hsdis/) for printing the
   generated assembly – you can often spot obvious problems with compilations.
   That being said, IGV (with
   [Enso Language Support](../tools/enso4igv/README.md)) is usually the better
   tool, if you take a look at the later compilation stages.
4. Pay attention to making things `final` and `@CompilationFinal`. This is the
   most important way Graal does constant-folding. Whenever a loop bound can be
   compilation final, take advantage (and use `@ExplodeLoop`).
5. Read the generated code for the nodes generated by the DSL. Learning the DSL
   is quite difficult and the documentation is sorely lacking. It is best to
   experiment with different kinds of `@Specialization` and read the generated
   code. Without this understanding, it's way too easy to introduce very subtle
   bugs to the language semantics.
6. Join the [GraalVM Slack server](https://www.graalvm.org/slack-invitation/).
   All the authors are there and they will happily help and answer any
   questions.
7. Be aware that Truffle Instrumentation is more constrained than it could be,
   because it wants to be language agnostic. The Enso runtime server is
   Enso-specific and therefore you may be better served in the future by rolling
   your own instrumentation. Read the instrumentation sources; it will help you
   understand how non-magical it actually is.
8. Clone the sources of Truffle and TruffleRuby. Set them up as projects in your
   IDE. Read the code when in doubt. Truffle documentation is really lacking
   sometimes, even though it is improving.
9. Understand the boundary between the language-side APIs (see e.g.
   `InteropLibrary`) and embedder side (see `Value`). You want to make sure you
   use the proper APIs in the proper places in the codebase. As a rule of thumb:
   all code in the `runtime` project is language/instrumentation-side. All code
   elsewhere is embedder-side. In particular, the only Graal dependency in
   embedder code should be `graal-sdk`. If you find yourself pulling things like
   `truffle-api`, you've done something wrong. Similarly, if you ever import
   anything from `org.graalvm.polyglot` in the language code, you're doing
   something wrong.
10. Avoid
    [deoptimizations](https://www.graalvm.org/22.2/graalvm-as-a-platform/language-implementation-framework/Optimizing/#debugging-deoptimizations).
    Understanding IGV graphs can be a very time-consuming and complex process
    (even with the help of [Enso tooling for IGV](../tools/enso4igv/README.md)).
    Sometimes it is sufficient to only look at the compilation traces to
    discover repeated or unnecessary deoptimizations which can significantly
    affect overall performance of your program. You can tell the runner to
    generate compilation traces via additional options:
    ```
    JAVA_OPTS="-Dpolyglot.engine.TracePerformanceWarnings=all -Dpolyglot.engine.TraceTransferToInterpreter=true -Dpolyglot.engine.TraceDeoptimizeFrame=true -Dpolyglot.engine.TraceCompilation=true -Dpolyglot.engine.TraceCompilationDetails=true"
    ```
    Make sure you print trace logs by using `--log-level TRACE`.
11. Occasionally a piece of code runs slower than we anticipated. Analyzing
    Truffle inlining traces may reveal locations that one thought would be
    inlined but Truffle decided otherwise. Rewriting such locations to builtin
    methods or a more inliner-friendly representation can significantly improve
    the performance. You can tell the runner to generate inlining traces via
    additional options:
    ```
    JAVA_OPTS="-Dpolyglot.engine.TraceInlining=true -Dpolyglot.engine.TraceInliningDetails=true"
    ```
    Make sure you print trace logs by using `--log-level TRACE`. See
    [documentation](https://www.graalvm.org/22.2/graalvm-as-a-platform/language-implementation-framework/Inlining/#call-tree-states)
    for the explanation of inlining decisions.
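
The option strings from the last two tips are long and easy to mistype, so it can help to assemble them programmatically. The following is only a sketch: `TraceOptionsSketch` and its `prepare` helper are hypothetical, and the `enso --run tmp.enso` command line is an assumption about how the runner is started locally. Only the `-Dpolyglot.engine.*` flags and `--log-level TRACE` are taken from the tips above. The process is configured but deliberately not started.

```java
import java.util.List;

/** Hypothetical helper that prepares a runner invocation with tracing enabled. */
final class TraceOptionsSketch {
  // Flags copied from the deoptimization tip above.
  static final List<String> DEOPT_FLAGS = List.of(
      "-Dpolyglot.engine.TracePerformanceWarnings=all",
      "-Dpolyglot.engine.TraceTransferToInterpreter=true",
      "-Dpolyglot.engine.TraceDeoptimizeFrame=true",
      "-Dpolyglot.engine.TraceCompilation=true",
      "-Dpolyglot.engine.TraceCompilationDetails=true");

  // Flags copied from the inlining tip above.
  static final List<String> INLINING_FLAGS = List.of(
      "-Dpolyglot.engine.TraceInlining=true",
      "-Dpolyglot.engine.TraceInliningDetails=true");

  /** Builds, but does not start, a process with JAVA_OPTS set to the given flags. */
  static ProcessBuilder prepare(List<String> flags, String... command) {
    ProcessBuilder builder = new ProcessBuilder(command);
    builder.environment().put("JAVA_OPTS", String.join(" ", flags));
    builder.inheritIO(); // trace output should end up on the console
    return builder;
  }

  public static void main(String[] args) {
    // The command line below is an assumption; adjust it to your setup.
    ProcessBuilder builder =
        prepare(INLINING_FLAGS, "enso", "--run", "tmp.enso", "--log-level", "TRACE");
    System.out.println("JAVA_OPTS=" + builder.environment().get("JAVA_OPTS"));
    System.out.println(String.join(" ", builder.command()));
  }
}
```

Calling `start()` on the returned builder would launch the runner; that step is left out here so the sketch has no side effects.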

## Code & Internal Documentation Map

Other than the subsections here, go through the
[existing documentation](https://github.com/enso-org/enso/tree/develop/docs).

### Entry Points

1. See `Main` in `engine-runner` and `Language` in `runtime`. The former is the
   embedder-side entry point, the latter the language-side one. They do a bit of
   ping-pong through the polyglot APIs. That is unfortunate, as this API is
   stringly typed. Therefore, chase the usages of method-name constants to jump
   between the language-side implementations and the embedder-side calls.
   Alternatively, step through the flow in a debugger.
2. Look at the `MainModule` in `language-server` and `RuntimeServerInstrument`
   in `runtime`. This is the entry point for the IDE, with the language/embedder
   boundary as usual, but with a server-like message exchange instead of
   polyglot API use.
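
The "stringly typed" ping-pong from the first entry point can be pictured with a small plain-Java analogy. Nothing here is the actual polyglot API: `MethodNames`, `LanguageSide`, `EmbedderSide` and `invokeMember` are hypothetical stand-ins, and so are the constant values. The point is only that both sides agree on a string constant, so searching for usages of that constant connects an embedder-side call with its language-side implementation.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

/** Shared method-name constants; the only link between the two sides. */
final class MethodNames {
  static final String EVAL_EXPRESSION = "eval_expression"; // hypothetical name
  static final String GET_NAME = "get_name"; // hypothetical name

  private MethodNames() {}
}

/** Stand-in for the language side: implementations registered under string keys. */
final class LanguageSide {
  private final Map<String, Function<String, Object>> members = new HashMap<>();

  LanguageSide() {
    members.put(MethodNames.EVAL_EXPRESSION, arg -> "evaluated(" + arg + ")");
    members.put(MethodNames.GET_NAME, arg -> "Main");
  }

  /** Looks a member up by name; a misspelled name only fails at run time. */
  Object invokeMember(String name, String argument) {
    Function<String, Object> impl = members.get(name);
    if (impl == null) {
      throw new IllegalArgumentException("Unknown member: " + name);
    }
    return impl.apply(argument);
  }
}

/** Stand-in for the embedder side, which only knows the constants. */
final class EmbedderSide {
  public static void main(String[] args) {
    LanguageSide module = new LanguageSide();
    System.out.println(module.invokeMember(MethodNames.EVAL_EXPRESSION, "1 + 1"));
    System.out.println(module.invokeMember(MethodNames.GET_NAME, ""));
  }
}
```

Because the compiler cannot check the string, using the shared constant everywhere (rather than a literal) is what keeps "find usages" reliable.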

### Compiler

Look at `Compiler` in `runtime`. It is the main compiler class and the flow
should be straightforward. A high level overview is: the compiler alternates
between running module-local passes (currently in 3 groups) and global join
points, where information flows between modules.
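
The alternation described above can be sketched as a loop. This is a plain-Java illustration under assumptions, not the structure of the real `Compiler` class: `ModuleSketch`, `CompilerFlowSketch`, the pass names and the "join" step that records per-module information are all invented for the example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;
import java.util.function.Consumer;

/** Hypothetical module: a name plus a log of what happened to it. */
final class ModuleSketch {
  final String name;
  final List<String> log = new ArrayList<>();

  ModuleSketch(String name) {
    this.name = name;
  }
}

final class CompilerFlowSketch {
  /** Creates a module-local pass that only touches the module it is given. */
  static Consumer<ModuleSketch> pass(String passName) {
    return module -> module.log.add(passName);
  }

  /** Runs each group on every module, then joins before the next group starts. */
  static Set<String> compile(
      List<ModuleSketch> modules, List<List<Consumer<ModuleSketch>>> groups) {
    Set<String> globalInfo = new TreeSet<>();
    int groupIndex = 0;
    for (List<Consumer<ModuleSketch>> group : groups) {
      groupIndex++;
      for (ModuleSketch module : modules) {
        for (Consumer<ModuleSketch> p : group) {
          p.accept(module); // module-local work
        }
      }
      // Global join point: the place where information flows between modules.
      for (ModuleSketch module : modules) {
        globalInfo.add(module.name + "@group" + groupIndex);
        module.log.add("join" + groupIndex);
      }
    }
    return globalInfo;
  }

  public static void main(String[] args) {
    List<ModuleSketch> modules = List.of(new ModuleSketch("A"), new ModuleSketch("B"));
    List<List<Consumer<ModuleSketch>>> groups = List.of(
        List.of(pass("desugar")),
        List.of(pass("resolveNames"), pass("analyse")),
        List.of(pass("optimise")));
    System.out.println(compile(modules, groups));
    System.out.println(modules.get(0).log);
  }
}
```

The important property is the ordering: no module starts the next group before every module has finished the current one and the join has run.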

### Interpreter

There are a few very convoluted spots in the interpreter, with non-trivial
design choices. Here's a list with some explanations:

1. **Function Call Flow**: It is quite difficult to efficiently call an Enso
   function. Enso allows passing arguments by name, supports currying and
   eta-expansion, and defaulted argument values. It also has to deal with
   polyglot method calls. And it has to be instrumentable, to enable the "enter
   a method via call site" functionality of the IDE. Start reading from
   `ApplicationNode` and follow the execute methods (or `@Specialization`s).
   There's a lot of them, but don't get too scared. It is also outlined
   [here](https://github.com/enso-org/enso/blob/develop/docs/runtime/function-call-flow.md).
2. **Polyglot Code**: While for some languages (Java, Ruby and Python) it is
   straightforward and very Truffle-like, for others (JS and R) it becomes
   tricky. The reason is that Truffle places strong limitations on threading in
   these languages and it is impossible to call JS and R from a multithreaded
   language context (like Enso's). For this reason, we have a special, internal
   sub-language, running on 2 separate Truffle contexts, exposing the single
   threaded languages in a safe way (through a GIL). The language is called EPB
   (Enso Polyglot Bridge) and lives in
   [this subtree](https://github.com/enso-org/enso/tree/develop/engine/runtime/src/main/java/org/enso/interpreter/epb).
   To really understand it, you'll need to familiarize yourself with what a
   [TruffleContext](https://www.graalvm.org/truffle/javadoc/com/oracle/truffle/api/TruffleContext.html)
   is and how it relates to polyglot and language contexts (oh, and also get
   ready to work with about 7 different meanings of the word `Context`...).
3. **Threading & Safepoints**: Enso has its own safepointing system and a thread
   manager. The job of the thread manager is to halt all the executing threads
   when needed. Safepoints are polled during normal code execution (usually at
   the start of every non-inlined method call and at each iteration of a TCO
   loop). See
   [the source](https://github.com/enso-org/enso/blob/develop/engine/runtime/src/main/java/org/enso/interpreter/runtime/ThreadManager.java).
4. **Resource Finalization**: Enso exposes a system for automatic resource
   finalization. This is non-trivial on the JVM and is handled in the
   [ResourceManager](https://github.com/enso-org/enso/blob/develop/engine/runtime/src/main/java/org/enso/interpreter/runtime/ResourceManager.java).
5. **Builtin Definitions**: Certain basic functions and types are exposed
   directly from the interpreter. They currently are all bundled in a virtual
   module called `Standard.Builtins`. See
   [the Builtins class](https://github.com/enso-org/enso/blob/develop/engine/runtime/src/main/java/org/enso/interpreter/runtime/builtin/Builtins.java)
   to see how that module is constructed. There's also a Java-side
   annotation-driven DSL for automatic generation of builtin method boilerplate.
   See nodes in
   [this tree](https://github.com/enso-org/enso/tree/develop/engine/runtime/src/main/java/org/enso/interpreter/runtime/builtin)
   to get an idea of how it works. Also
   [read the doc](https://github.com/enso-org/enso/blob/develop/docs/runtime/builtin-base-methods.md).
6. **Standard Library Sources**: These are very non-magical – just plain old
   Enso projects that get shipped with every compiler release. They live
   [in this tree](https://github.com/enso-org/enso/tree/develop/distribution/lib/Standard)
   and are tested through
   [these projects](https://github.com/enso-org/enso/tree/develop/test). The
   standard library also makes heavy use of host interop. The Java methods used
   by the standard library are located in
   [this directory](https://github.com/enso-org/enso/tree/develop/std-bits).
7. **Microbenchmarks**: There are some microbenchmarks for tiny Enso programs
   for basic language constructs. They are located in
   [this directory](https://github.com/enso-org/enso/tree/develop/engine/runtime/src/bench).
   They can be run through `sbt runtime-benchmarks/bench`. Each run will
   generate (or append to) the `bench-report.xml` file. See
   [Benchmarks](infrastructure/benchmarks.md) for more information about the
   benchmarking infrastructure, and
   [Enso Language Support](../tools/enso4igv/README.md) for the IGV tooling.
8. **Tests**: There are scalatests that comprehensively test all of the language
   semantics and compiler passes. These are run with
   `sbt runtime-integration-tests/test`. For newer functionalities, we prefer
   adding tests to the `Tests` project in the standard library tests. At this
   point, Enso is mature enough to self-test.
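
The GIL idea behind the polyglot bridge (item 2 above) can be illustrated without any Truffle machinery. The sketch below is plain Java under assumptions: `SingleThreadedGuest` stands in for a language that must never be entered by two threads at once, and `GilSketch` is a hypothetical wrapper. The real EPB works with separate Truffle contexts rather than a simple wrapper object, so treat this as an analogy only.

```java
import java.util.concurrent.locks.ReentrantLock;
import java.util.function.Function;

/** Stand-in for a guest language that is not safe to enter concurrently. */
final class SingleThreadedGuest {
  private int counter = 0; // deliberately unsynchronized

  int increment() {
    return ++counter;
  }

  int value() {
    return counter;
  }
}

/** Hypothetical wrapper that serializes all access through one global lock. */
final class GilSketch {
  private final ReentrantLock gil = new ReentrantLock();
  private final SingleThreadedGuest guest = new SingleThreadedGuest();

  /** Runs the action while holding the lock, so only one thread is "inside". */
  <T> T withGuest(Function<SingleThreadedGuest, T> action) {
    gil.lock();
    try {
      return action.apply(guest);
    } finally {
      gil.unlock();
    }
  }

  /** Enters the guest from several threads and returns the final counter value. */
  static int runDemo(int threads, int callsPerThread) {
    GilSketch bridge = new GilSketch();
    Thread[] workers = new Thread[threads];
    for (int i = 0; i < threads; i++) {
      workers[i] = new Thread(() -> {
        for (int j = 0; j < callsPerThread; j++) {
          bridge.withGuest(SingleThreadedGuest::increment);
        }
      });
      workers[i].start();
    }
    try {
      for (Thread worker : workers) {
        worker.join();
      }
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      throw new IllegalStateException("interrupted while waiting for workers", e);
    }
    return bridge.withGuest(SingleThreadedGuest::value);
  }

  public static void main(String[] args) {
    System.out.println(runDemo(4, 1000));
  }
}
```

Without the lock, the unsynchronized counter could lose updates; with it, every increment is observed, at the cost of the guest never running in parallel.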

### Language Server

Talk to Dmitry! He's the main maintainer of this part.