Commit Graph

12 Commits

Author SHA1 Message Date
Stephen Compall
a51d0db8ff
set scalac -Xsource:2.13 -Ypartial-unification globally (#6469)
* add -Xsource:2.13, -Ypartial-unification to common_scalacopts

* add now-referenced scalaz-core where needed

* work around bad type signatures in scalatest Aggregating, Containing

* unused Any suppression

* work around bad partial-unification wrought by type alias

* remove unused Conversions import

- not required in 4f68cfc480 either, so unsure how it's survived this long

* work around Future.traverse; remove unused show import

* no changelog

CHANGELOG_BEGIN
CHANGELOG_END

* remove unused bounds

* remove -Ypartial-unification and -Xsource:2.13 where they were explicitly passed

* longer comment on what the options do

- suggested by @stefanobaghino-da; thanks

* forget Future.traverse, just use scalaz, it knows how to do this
2020-06-24 16:51:24 -04:00
Remy
1b1b4eab2c
Speedy: clean machine builder name (#6427)
* Address comment martin made in #6368

* changelog

CHANGELOG_BEGIN
CHANGELOG_END
2020-06-19 14:58:13 +02:00
Remy
5a9e7ebc7c
Speedy: refactor Machine builders (#6368)
In this PR we cleanup the constructor for the speedy Machine.

* We remove the `case`  keyword since `Machine` is a stateful class,
* We replace the pre-existing builders with
  + one generic builder `Machine.apply`,
  + scenario specific builder,

CHANGELOG_BEGIN
CHANGELOG_END
2020-06-18 15:39:55 +02:00
nickchapman-da
61aa7039ef
Disable stack-traces in the speedy performance benchmarks. (#6358)
changelog_begin
changelog_end
2020-06-16 07:01:49 +00:00
nickchapman-da
cfa66ecd07
Speedy computational benchmarks. (#6239)
This PR adds some DAML speed benchmarks which focus on the computational aspects of DAML.

The first benchmark is `nfib`. The speed is reported in nfibs/micro-second.

The second benchmark is `json-parser`.  We have a short pipeline: JSON AST is constructed
to represent an arithmetic expression. The AST is converted to its string representation.
The JSON string is then parsed back to AST using the JSON parser (which is defined using
parser combinators defined in the benchmark code). Finally the arithmetic expression
embedded in the JSON AST is evaluated. We report the speed in k-chars/second.

The speed tests are designed to be quick and easy to run. Both tests scale exponentially
in their integer argument, and so are easy to tune so each iteration takes about half a
second. The are run like this:

```
bazel run daml-lf/interpreter/perf:nfib
bazel run daml-lf/interpreter/perf:speed-json-parser
```

For interest, the speeds I see on my dev machine are:

- nfib: 1.35 nfibs/us
- json-parser: 27 k/s

changelog_begin
changelog_end
2020-06-05 15:30:19 +01:00
nickchapman-da
44d0128f29
Add control to speedy compiler to choose if stack-trace support occurs at runtime. (#6070)
Choices for `stacktracing` are `NoStackTrace` / `FullStackTrace`.

Adapt code so the selection is made by the original caller:
- `engine`
- `scenario-service`
- `repl-service`
- `daml-script` runner
etc

Currently, all callers pass `FullStackTrace` (the existing behaviour), except for the
exploration dev-code: `daml-lf/interpreter/perf/src/main/scala/com/daml/lf/explore`.

The idea is that once this control is in place, we can discuss if we can change how we
might expose it to the user, and/or perhaps change the default behaviour to have
`stacktracing` disabled.

changelog_begin
changelog_end
2020-05-22 16:36:03 +01:00
nickchapman-da
70201afc55
Make the Lightweight Pretty Printing more helpful. (#6069)
- reduce verbosity of PAP
- see inside applications & lambdas.
- see rhss/body of lets
- see scrut of case (maybe later we add alts!)
- see the name of the definition ref
- see bools
- see LOC(and-nested-expression) -- interesting where the LOCs are places
- see head of kont stack

changelog_begin
changelog_end
2020-05-21 15:41:22 +00:00
nickchapman-da
3e4c67613a
Speedy explore dar (#6066)
Explore the execution of the speedy machine on small examples taken from a dar file
In particular interested in how the execution proceeds for standard patterns of recursion.

changelog_begin
changelog_end
2020-05-21 11:06:44 +00:00
nickchapman-da
ee4f89378d
Speedy Tail Call Optimization (#6003)
* Speedy Tail call optimization

The goal of this PR is to achieve Tail call optimization #5767

Tail call optimization means that tail-calls execute without consuming resources. In particular, they must not consume stack space.

Speedy has two stacks: The `env`-stack and the `kontStack`. For an optimized tail call in Speedy, we must not extend either. In Speedy, all function calls are executed via the code `enterFullyAppliedFunction`. The behaviour of this code (prior to this PR) is as follows:

(1) Push the values of all args and free-variables on the env-stack (because that's where the code expects to find them), and
(2) Push a KPop continuation on the kontStack, which will restore the env-stack to its original size before returning to the code which made the function call.

We must stop doing both these things. We achieve this as follows:

(1) We address function args and free-vars via a new machine component: the current `frame`.
(2) We make continuations responsible for restoring their own environment.

As well as achieving proper tail calls, we also gain a performance improvement by (a) removing the many pushes to the env-stack, and (b) never having to push (and then later re-enter) a KPop continuation. The args array and the free-vars array already existed, so there is no additional cost associated with constructing these array. The only extra costs (which are smaller than the gains) are that we must manage the new `frame` component of the machine, and we must record frame/env-size information in continuations so they can restore their environment.

To make use of the frame, we need to identify (at compile time) the run-time location for every variable in a speedy expression. This is done during the `closureConvert` phase. At run-time, an environment is now composed of both the existing env-stack and the frame. The only values which now live on the env-stack are those introduced by let-bindings and pattern-match-destructuring. All other are found in the frame.

Changes to SEExpr:
- Introduce a new expression form `SELoc`, with 3 sub classes: SELocS/SELocA/SELocF to represent the run-time location of a variable.
- SELocS/A/F execute by calling corresponding lookup function in Speedy: getEnv(Stack,Arg,Free).
- SEMakeClo takes a list of SELoc instead of list of int.
- During closure conversion all SEVar are replaced by an SELocS/A/F.
- SEVar are not allowed to exist at run-time (just as SEAbs may not exist).
- We adapt the synthesised code for SEBuiltinRecursiveDefinition: FoldL, FoldR, EqualList

It is worth noting the prior code also had the notion of before/after closureConvert, but SEVar was used for both meanings: Prior to closureConvert it meant the relative-index of the enclosing binder (lambda,let,..). After closureConvert it meant the relative-offset from the top of the env-stack where the value would be found at run-time. These are not quite the same! Now we have different sub-types (SEVar vs SELoc), this change of mode is made more explicit.

Run-time changes:
- Use the existing `KFun` continuation as the new `Frame` component.
- `KFun` allows access to both the args of the current application, and the free-vars of the current closure.
- A variable is looked up by it's run-time location (SELocS/A/F)
- A function application is executed (`enterFullyAppliedFunction`), by setting the machine's `frame` component to the new current `KFun`.
- When a continuation (KArg, KMatch, KPushTo, KCatch) is pushed, we record the current Frame and current stack depth within the continuation, so when it is entered, it can call `restoreEnv` to restore the environment to the state when the continuation was pushed.

Changes to Compiler:

- The required changes are to the `closureConvert` and `validate`.
- `closureConvert` `remaps` is now a `Map` from the `SEVar`s relative-index to `SELoc`
- `validate` now tracks 3-ints (maxS,masA,maxF)

changelog_begin
changelog_end

* changes for Remy

* Changes for Martin

* test designed explicitly to blow if the free variables are captured incorrectly

* address more comments

* improve comment about shift in Compiler
2020-05-20 12:37:52 +01:00
Martin Huschenbett
caedc72551
Implement a simple profiler for DAML scenarios (#5957)
* Implement a simple profiler for DAML scenarios

The profiler runs a single scenario and records timing information when
each function (and some other closures) are entered and left. The
resulting information can be visualized as a flamegraph using
[speedscope](https://www.speedscope.app/).

The profiler works by instrumenting the CEK machine at the heart of
DAML Engine. Unfortunetaly, this causes a very small overhead on
non-profiling runs too. However, in my benchmarks I could not measure
any significant impact on the overall runtime at all. More precisely,
the overhead is as follows:

Every closure now has an additional field called `label`. In
non-profiling runs this field is always set to `null`. This field needs
to be allocated, copied whenever we copy a closure and scanned during
garbage collection. Additionally, whenever we enter a closure, we check
this field and whenever it is _not_ `null`, i.e. never during
non-profiling runs, we record an "open event" and set up a hook for the
corresponding "close event". Thus, the additional cost during
non-profiling runs are a single pointer comparison and a jump beyond
the "then branch".

Since this is still very much in active development, there are no
documentation, other than an entry in a README, and no tests yet. They
will come before we promote this. However, the UX will look very
different then since we already have plans to significantly change it.

CHANGELOG_BEGIN
CHANGELOG_END

* Run scalafmt

* Make profiling argument to PureCompiledPackges optional

* Fix a bunch of tests

CHANGELOG_BEGIN
CHANGELOG_END

* scalafmt is so annoying

CHANGELOG_BEGIN
CHANGELOG_END

* Apply simple suggestions

CHANGELOG_BEGIN
CHANGELOG_END
2020-05-18 17:20:44 +00:00
Remy
deedec9b9f
Engine: Remove optionality of contract ID Seeding. (#5966)
CHANGELOG_BEGIN
CHANGELOG_END
2020-05-13 20:29:14 +02:00
nickchapman-da
4823053441
Add instrumentation and lightweight tracing to Speedy CEK machine (#5819)
- Disabled by default.
- Enabled with flags: `enableInstrumentation` and `enableLightweightStepTracing'

The instrumentation:
- tracks #pushes and maximum-depth for the `kont`-stack and the `env`ironment
- counts #steps
- classifies  each step:
    - for `CtrlExpr`, what kind of expression is it?
    - for `CtrlValue`, what kind of continuation is at the head of the `kont`-stack?
- the information is printed when the machine finalizes

The lightweight tracing shows an abbreviated output line per step:
- the control expression / value
- the size of the environment
- the depth of the `kont`-stack

changelog_begin
changelog_end
2020-05-11 11:10:02 +01:00