daml/unreleased/multi-package.md
Samuel Williams 2e17f19786
Add multi-package brief doc (#17849)
* Add multi-package brief doc

* Add introduction to CLI usage

* Because --enable-multi-package=yes by default, remove references to flag

* Address Curtis comments by moving multi-package.yaml in trees

* Apply review comments

---------

Co-authored-by: Dylan Thinnes <dylan.thinnes@digitalasset.com>
2023-12-15 13:37:53 +00:00

19 KiB

Multi-Package Builds

The multi-package builds support is a new optional feature for building and managing multiple interdependent packages at once. It is implemented as a configuration file and a set of flags, and also serves to enable future improvements to the IDE for multi-package projects.

Using multi-package builds, developers can:

  • Configure daml build to automatically rebuild DARs used in data dependencies when the source code corresponding to those DARs changes.
  • Build all of the packages in a project simultaneously using the new daml build --all command.
  • Clean all build artifacts in a project using daml clean --all.

Overview

The developer begins by configuring which packages are to be built automatically by placing a new configuration file multi-package.yaml at the root of their project:

multi-package.yaml

packages:
- <path to first package>
- <path to second package>

The developer then runs daml build --all from the root of their project to build all of the packages listed in the multi-package.yaml file, or they run daml build within a package to build it and its dependencies.

The rest of this document expands on use cases and usage of the CLI and the multi-package.yaml file.

Migrating from Previous Workflows

Because multi-package builds are a purely opt-in additive behaviour over existing daml build, if you don't plan to use multi-package builds you do not need to change anything about your package configurations.

If you do plan to use multi-package files, your daml.yaml files will not need updating in general because the multi-package.yaml needs only a path to a folder with a daml.yaml.

There is one caveat: multi-package.yaml assumes that all dependent packages refer to DAR files by the locations where daml build would normally put them. This means that if you have shell scripts that move a package's DAR file after build, you will need use the --output flag instead, so that multi-package.yaml can statically find the DAR's expected output location.

Problem Statement

Frequently when developing Daml, we have the notion of our "project", which contains all of the code related our Daml application. Within this project, we may have multiple Daml packages, each with its own daml.yaml.

For example, say a developer is developing an application with some data model, and some logic over that model in a separate package. To make changes, the developer simultaneously works on two packages Logic and Model, where Logic depends on the build artifact of Model, Model.dar.

> tree
.
├── package-Logic
│   ├── daml/...
│   └── daml.yaml
└── package-Model
    ├── .daml/dist/package-Model-1.0.0.dar
    ├── daml/...
    └── daml.yaml

Without multi-package, if the developer makes changes to the source code of both Logic and Model, the developer must invoke daml build from within ./package-Model to produce package-Model-1.0.0.dar before invoking daml build within ./package-Logic. If the developer forgets to compile package Model first, package Logic will build against a stale version package-Model-1.0.0.dar that does not reflect the latest changes to Model's source code.

This has a number of drawbacks:

  • The developer must navigate directories and invoke daml build twice.
  • The developer must always remember to build in order to propagate changes to package-Logic-1.0.0.dar.
  • If the developer forgets, compilation may succeed without warnings due to the stale package-Model-1.0.0.dar, where it would fail if package-Model-1.0.0.dar were up to date.

Correct Usage Without Multi-Package

> # ... make some changes to Model ...
> cd ./package-Model/  # Navigate to package-Model
> daml build           # Build Model.dar
> cd ../package-Logic/ # Navigate to package-Logic
> daml build           # Build Logic.dar

Using Multi-Package as a Solution

Using multi-package builds, the developer can register multiple packages as interdependent parts of our "project". Whenever a package A in our project refers to the DAR of another package B in our project, any daml build on package A will automatically rebuild B's DAR from source if it has changed via a second invocation of daml build (See Caching for details on what "changed" means here).

The directories containing the daml.yaml for the various packages that should be part of our project are listed with relative paths under the packages header in the multi-package.yaml.

Example

For our example project, we would register package-Model and package-Logic as members of our project. We do this by placing a multi-package.yaml file at the root of our project, with two relative paths to packages Model and Logic.

multi-package.yaml

packages:
- ./package-Logic
- ./package-Model

Our new tree will look something like:

> tree
.
├── multi-package.yaml
├── package-Logic
│   ├── daml/...
│   └── daml.yaml
└── package-Model
    ├── .daml/dist/package-Model-1.0.0.dar
    ├── daml/...
    └── daml.yaml

Now that both package-Logic and package-Model are in the project, any attempt to build package-Logic will detect that the Model.dar data dependency belongs to package-Model, and automatically re-builds it.

Correct Usage With Multi-Package:

> # ... make some changes to Model ...
> cd ./package-Logic/ # Navigate to package-Logic
> daml build # Build package Logic
...
Dependency "package-Model" is stale, rebuilding...
...
Building "package-Logic"...
Done.

Using multi-package build, the developer runs only one call to daml build, does not need to remember to build each package in order, and can guarantee that changes to dependencies are always propagated to packages.

Building All Packages

To build all packages in the project, provide the --all flag. This will build every package listed in the specified multi-package.yaml. Note that when using this flag, daml build does not need to be invoked from a package directory.

Finding the multi-package.yaml

Both daml build and daml clean will need to know where to find your multi-package.yaml, as it is common to invoke them from further into your project structure than the root.

Put briefly, the discovery logic closely mimics that of the daml.yaml, and resolves as follows:

  1. If a path is passed in using --multi-package-path PATH, we use the multi-package.yaml available at that path. (This is the multi-package equivalent to --project-root)
  2. Otherwise, the CLI (either daml build or daml clean) will search up the directory tree starting from the directory where it was invoked. We immediately return the first multi-package.yaml we find.
  3. If no file is found, multi-build is not used.

More on managing multiple multi-package.yaml files in Project Discovery in Nested Projects.

Adding a New Package to your Project

We continue the example above to demonstrate the process of adding a new package to a multi-package project. The current project structure is as follows:

> tree
.
├── multi-package.yaml
├── package-Logic
│   ├── .daml/dist/package-Logic-1.0.0.dar
│   ├── daml/...
│   └── daml.yaml
└── package-Model
    ├── daml/...
    ├── daml/.dist/package-Model-1.0.0.dar
    └── daml.yaml

We add a new package to our tree by running daml new --template=empty-skeleton package-Tests. This creates an empty package named package-Tests alongside our existing package-Logic and package-Model packages.

> tree
.
├── multi-package.yaml
├── package-Logic
├── package-Model
└── package-Tests

In the newly created package-Tests/daml.yaml, we remove the sandbox-options and add a dependency on the DAR for Logic.

data-dependencies:
- ../package-Logic/.daml/dist/package-Logic-1.0.0.dar

Finally, we'll want to tie the new package into our project to take advantage of the multi-package feature - we do this by adding package-Testing to the multi-package.yaml in the same way Logic and Model have been added.

multi-package.yaml

  packages:
  - ./package-Logic
  - ./package-Model
+ - ./package-Tests

Running Multi-Package Tests

Assuming we then write our tests, we can build our test package in one of two ways:

  • Run daml build from the package-Testing directory to build the Tests package and its dependency Logic package.
  • Run daml build --all from the root of the project to build all three packages in our project, including the Tests package.

Once we have built the package, we can run daml test from the package-Testing directory to run the up-to-date tests.

Behaviour for Packages Not Listed In multi-package.yaml

Packages that are not listed in the multi-package.yaml file will not be automatically recompiled, regardless of whether their source is available in the project's source tree.

For example, say we add a vendor library to our source tree. We build the vendor library from source in isolation, and refer to its vendor-library.dar as a data dependency from within package-Logic/daml.yaml.

> tree
.
├── multi-package.yaml
├── package-Logic
│   ├── daml/...
│   └── daml.yaml
├── package-Model
│   ├── daml/...
│   ├── daml/.dist/package-Model-1.0.0.dar
│   └── daml.yaml
└── vendor-library
    ├── daml/...
    ├── daml/.dist/vendor-library-1.0.0.dar
    └── daml.yaml

If we do not update multi-package.yaml, builds of package Logic will not try to rebuild vendor-library.dar, even though its source and vendor-library/daml.yaml are clearly available.

> # ... make some changes to vendor-library ...
> cd ./package-Logic/ # Navigate to package-Logic
> daml build # Build package Logic
...
Building "package-Logic"... # We DO NOT rebuild vendor-library
Done.

This allows developers to distinguish packages that are part of a project from those that are not, without risking recompilation of a vendored package whose upstream they don't control.

Working with Multiple Projects

In some cases, we have nested projects. For example, say we have two separate GitHub repositories, application and library, in which we work and change the source regularly. The application repository depends on the library-repository by pointing at DAR files within the library repository.

.
├── application-repository
│   ├── .git
│   ├── multi-package.yaml
│   ├── application-logic
│   │   └── daml.yaml
│   └── application-tests
│       └── daml.yaml
└── library-repository
    ├── .git
    ├── multi-package.yaml
    ├── library-logic
    │   └── daml.yaml
    └── library-tests
        └── daml.yaml

application-repository/application-logic/daml.yaml

version: 1.0.0
...
data-dependencies:
- ../../library-respository/library-logic/.daml/dist/library-logic-1.0.0.daml

Both repositories have their own multi-package.yaml which points to their respective logic and tests packages, so that we can work effectively in each repository on its own.

> cd library-repository
> daml build --all
...
Building "library-logic"...
Building "library-tests"...
Done.
> cd application-repository
> daml build --all
...
Building "application-logic"...
Building "application-tests"...
Done.

However, occasionally we may want to make changes to both repositories simultaneously. In the example case above, neither repository is aware of the other, so builds run from within the application repository will not rebuild dependencies within the library repository.

> cd library-repository
> editor library-logic/... # make some changes
> cd ../application-repository
> daml build --all
Nothing to do. # changes from library-logic are NOT picked up and NOT rebuilt

In this case, we'd like invocations of build from inside the application repository to be aware of changes to packages inside the library respository, and rebuild those packages if necessary.

For this case, we provide the ability to include other multi-package.yaml files into yours using the projects field.

application-repository/multi-package.yaml

packages:
- ./application-logic
- ./application-tests
# Add the path to the library-repository folder, which contains a multi-package.yaml file we'd like to include
projects:
- ../library-repository

Once a local multi-package.yaml file's projects field is populated with a folder containing an external multi-package.yaml, all dependencies mentioned in the external YAML will also be included in the local multi-package.

> cd library-repository
> editor library-logic/... # make some changes
> cd ../application-repository
> daml build --all
Building "library-logic"... # changes from library-logic ARE picked up and rebuilt
Building "application-logic" # the application-logic package is rebuilt because its library-logic dependency has changed.

Project Discovery with Multiple Files

Because of the projects: field, a project might be composed of many multi-package.yaml files. Depending on which multi-package.yaml file your call to build refers to, a different set of packages is added to the project.

Project Discovery in Nested Projects

This following example explores a nested project structure, where we have a top-level main package with a multi-package.yaml file, and a libs subdirectory. The libs subdirectory has its own libs/multi-package.yaml file and contains two packages, libs/my-lib and libs/my-lib-helper.

The main package depends on my-lib, which itself depends on my-lib-helper.

> tree
.
├── multi-package.yaml
├── libs
│   ├── multi-package.yaml
│   ├── my-lib
│   │   ├── daml
│   │   │   └── MyLib.daml
│   │   └── daml.yaml
│   └── my-lib-helper
│       ├── daml
│       │   └── MyLibHelper.daml
│       └── daml.yaml
└── main
    ├── daml
    │   └── Main.daml
    └── daml.yaml

Looking inside the most relevant files:

  • libs/multi-package.yaml
    packages:
    - ./my-lib
    - ./my-lib-helper
    
  • multi-package.yaml
    packages:
    - ./main
    projects:
    - ./libs
    
  • main/daml.yaml
    version: 1.0.0
    ...
    data-dependencies:
    - ../libs/my-lib/.daml/dist/my-lib-1.0.0.dar # main depends on my-lib
    

As before, we can run daml build --all from the root of the project to build all libraries and main.

> # From the root of the project:
> daml build --all
Building "my-lib-helper"...
Building "my-lib"...
Building "main"...

However, if we run daml build --all from the libs directory, it'll traverse the directory tree and first find the libs/multi-package.yaml. Because libs/multi-package.yaml only refers to my-lib and my-lib-helper, only those will be built.

> cd libs/
> daml build --all
Building "my-lib-helper"...
Building "my-lib"...
# Main is *not* built, because libs/multi-package.yaml was used

You can explicitly specify the use of the outer multi-package.yaml from within libs using --multi-package-path ../multi-package.yaml.

> cd libs/
> daml build --all --multi-package-path ../multi-package.yaml
Building "my-lib-helper"...
Building "my-lib"...
Building "main"... # Main *is* built, because the root multi-package.yaml was used

The --output flag

The --output flag can be used to specify the location of the DAR generated by daml build. This is not new, however it is important to note that the multi-package feature extends the use cases of this flag.

  1. Multi-package build behaviour correctly handles the use of the --output flag, but only if it is specified in the relevant package's daml.yaml build-options, for example:

    build-options:
    - --output=./my-dar.dar
    
  2. The --output flag can be used to abstract away the package structure of sub-projects. This is useful when the maintainers of the sub-project are a different team, allowing them to make structural changes to without breaking dependent projects.

    For example, take the scenario in Nested Projects with the following structure.

    > tree
    .
    ├── multi-package.yaml
    ├── libs
    │   ├── multi-package.yaml
    │   ├── my-lib
    │   │   ├── daml
    │   │   │   └── MyLib.daml
    │   │   └── daml.yaml
    │   └── my-lib-helper
    │       ├── daml
    │       │   └── MyLibHelper.daml
    │       └── daml.yaml
    └── main
        ├── daml
        │   └── Main.daml
        └── daml.yaml
    

    If the my-lib package had specified --output ../dars in its daml.yaml file, that would put the dar in a libs/dars directory at the root of the libs sub-project. Now, the main package could use the DAR as a data dependency without needing to know the source location of the my-lib package within the libs sub-folder.

    > tree
    .
    ├── multi-package.yaml
    ├── libs
    │   ├── multi-package.yaml
    │   ├── dars
    │   │   └── my-lib.dar <--
    │   ├── my-lib
    │   │   └── ...
    │   └── my-lib-helper
    │       └── ...
    └── main
        └── ...
    

Caching

Depending on the use case, the multi-package feature may automatically rebuild dependencies or even the whole project.

Because of this, daml build will try to cache package results and avoid rebuilds wherever possible. daml build checks 2 main properties of generated artifacts to ensure they are up-to-date:

  • Compare the contents of all Daml source files against those compiled into the DAR.
  • Compare the package IDs of all dependencies against the the package IDs of dependencies compiled into the DAR.

Disabling Caching

  • --no-cache disables the caching behaviour for a build, forcing the rebuild of every package needed

Cleaning the Cache with daml clean

To clean out any build artifacts in the project, use daml clean.

  • daml clean will clear any artifacts for the current package and its dependencies.
  • daml-clean --all will clear any artifacts for the entire project.

When using daml clean with multi-package enabled and a multi-package.yaml provided, you can now also provide a --all flag, which will clean the build artifacts of all packages in your project.