Merge pull request #9 from DavHau/dev

Contributors Guide + More Docs
DavHau 2021-09-30 03:33:33 +01:00 committed by GitHub
commit e1565969b5
No known key found for this signature in database
GPG Key ID: 4AEE18F83AFDEB23
6 changed files with 369 additions and 199 deletions

215
README.md

@ -1,203 +1,32 @@
## [WIP] dream2nix - A generic framework for 2nix tools
dream2nix is a generic framework for 2nix tools (tools translating from other build systems to nix).
dream2nix is a generic framework for 2nix converters (converting from other build systems to nix).
It focuses on the following aspects:
- Modularity
- Customizability
- Maintainability
- Nixpkgs Compatibility (not enforcing IFD)
- Code de-duplication across 2nix tools
- Code de-duplication in nixpkgs
- Risk free opt-in FOD fetching (no reproducibility issues)
- Common UI across 2nix tools
- Reduce effort to develop new 2nix solutions
- Exploration and adoption of new nix features
- Simplified updating of packages
- Modularity
- Customizability
- Maintainability
- Nixpkgs Compatibility (not enforcing IFD)
- Code de-duplication across 2nix converters
- Code de-duplication in nixpkgs
- Risk free opt-in FOD fetching (no reproducibility issues)
- Common UI across 2nix converters
- Reduce effort to develop new 2nix solutions
- Exploration and adoption of new nix features
- Simplified updating of packages
### Motivation
2nix tools, or in other words, tools converting instructions of other build systems to nix build instructions, are an important part of the nix/nixos ecosystem. These tools make packaging workflows easier and often allow to manage complexity that would be hard or impossible to manage without.
Yet the current landscape of 2nix tools has certain weaknesses. Existing 2nix tools are very monolithic. Authors of these tools are often motivated by some specific use case and therefore the individual approaches are strongly biased and not flexible. All existing tools have quite different user interfaces, use different strategies of parsing, resolving, fetching, building with significantly different options for customizability. As a user of these tools it often feels like there is some part of the tool that suits the needs well, but at the same time it has undesirable hard coded behaviour. Often one would like to use some aspect of one tool combined with some aspect of another tool. One tool, for example, might do a good job in reading a specific lock file format, but lacks customizability for building. Another tool might come with a good customization interface, but is unable to parse the lock file format. Some tools are restricted to use IFD or FOD, while others enforce code generation.
2nix converters, or in other words, tools converting instructions of other build systems to nix build instructions, are an important part of the nix/nixos ecosystem. These converters make packaging workflows easier and often make it possible to manage complexity that would otherwise be hard or impossible to handle.
The idea of this project is therefore to create a standardized, generic, modular framework for 2nix solutions, aiming for better flexibility, maintainability and usability.
Yet the current landscape of 2nix converters has certain weaknesses. Existing 2nix converters are very monolithic. Authors of these converters are often motivated by some specific use case and therefore the individual approaches are strongly biased and not flexible. All existing converters have quite different user interfaces, use different strategies of parsing, resolving, fetching and building while providing significantly different options for customizability. As a user of such a converter it often feels like some part of it suits the needs well, but at the same time it has undesirable hard-coded behaviour. Often one would like to use some aspect of one converter combined with some aspect of another converter. One converter might do a good job of parsing a specific lock file format, but lacks customizability for building, while another one provides a good customization interface, but is unable to parse the lock file format. Some converters are restricted to using IFD or FOD, while others enforce code generation.
Ideally this repository will become a hub for re-usable nix code delivered with a nice UI to allow for simpler, more automated packaging.
### Modularity:
The following phases which are generic to basically all existing 2nix solutions:
- parsing project metadata
- resolving/locking dependencies (not always required)
- fetching sources
- building/installing packages
... should be separated from each other with well defined interfaces.
This will allow for free composition of different approaches for these phases.
The user should be able to freely choose between:
- input metadata formats (eg. lock file formats)
- metadata fetching/translation strategies: IFD vs. in-tree
- source fetching strategies: granular fetching vs fetching via single large FOD to minimize expression file size
- installation strategies: build dependencies individually vs inside a single derivation.
### Customizability
Every phase mentioned in the previous section should be customizable to a high degree via override functions. Practical examples:
- Inject extra requirements/dependencies
- fetch sources from alternative locations
- replace or modify sources
- customize the build/installation procedure
### Maintainability
Due to the modular architecture with strict interfaces, contributors can add support for new lock-file formats or new strategies for fetching, building, installing more easily.
### Compatibility
Depending on where the nix code is used, different approaches are desired or discouraged. While IFD might be desired for some out of tree projects to achieve simplified UX, it is strictly prohibited in nixpkgs due to nix/hydra limitations.
All solutions which follow the dream2nix specification will be compatible with both approaches without having to re-invent the tool.
### Code de-duplication
Common problems that apply to many 2nix solutions can be solved once by the framework. Examples:
- handling cyclic dependencies
- handling sources from various origins (http, git, local, ...)
- generate nixpkgs/hydra friendly output (no IFD)
- good user interface
### Code de-duplication in nixpkgs
Essential components like package update scripts or fetching and override logic are provided by the dream2nix framework and are stored only once in the source tree instead of several times.
### Risk free opt-in FOD fetching
Optionally, to save more storage space, individual hashes for sources can be omitted and a single large FOD used instead.
Due to a unified minimalistic fetching layer the risk of FOD hash breakages should be very low.
### Common UI across many 2nix solutions
2nix solutions which follow the dream2nix framework will have a unified UI for workflows like project initialization or code generation. This will allow quicker onboarding of new users by providing familiar workflows across different build systems.
### Reduced effort to develop new 2nix solutions
Since the framework already solves common problems and provides an interface for integrating new build systems, developers will have an easier time creating their next 2nix solution.
### Architecture
The general architecture should consist of these components:
`Input -> Translation -> Generic Lock -> Fetching -> Building`
```
┌───────┐
│ Input │◄── Arbitrary
└────┬──┘ URLs + Metadata containing Build instructions
│ ┌──────────┐ in standardized minimalistic form (json)
└──►│Translator│ │
└───────┬──┘ ▼
▲ │ ┌────────────┐
│ └──►│Generic Lock│
└─────────┬──┘
impure/pure │ ┌────────┐
online/offline ├──►│Fetcher │◄── Same across all
pure-nix/IFD/external │ └────────┘ languages/frameworks
│ ▼
│ ┌────────┐
└──►│Builder │◄── Reads extra metadata
└────────┘ from generic lock
```
Input:
- can consist of:
- requirement constraints
- requirement files
- lock-files
- project's source tree
Translator:
- read input and generate generic lock format containing:
- URLs + hashes of sources
- metadata for building
- different strategies can be used:
- `pure-nix`: translate input by using the nix language only
- `IFD/recursive`: translate using a nix build
- `external`: translate using an external tool which resolves against an online package index
- for more information about translators and how nixpkgs compatibility is guaranteed, check [docs/translators.md](/docs/translators.md)
Generic Lock (standardized format):
- Produced by `Translator`. Contains URLs + hashes for sources and metadata relevant for building.
- The contained format for sources and dependency relations is independent of the build system. Fetching works always the same.
- The metadata also contains build system specific attributes as individual approaches are required here. A specific builder for the individual build system will later read this metadata and transform it into nix derivations.
- It is not relevant which steps/strategies have been taken to create this lock. From this point on, there are no impurities. This format will contain everything necessary for a fully reproducible build.
- This format can always be put into nixpkgs, not requiring any IFD (given the nix code for the builder exists within nixpkgs).
- In case of a pure-nix translator, the generic lock data can be generated on the fly and passed directly to the builder, preventing unnecessary usage of IFD.
Fetcher:
- Since a generic lock was produced in the previous step, the fetching layer can be the same across all build systems.
Builder:
- Receives sources from fetcher and metadata produced by the translator.
- The builder transforms the metadata into nix derivation(s).
- Strictly separating the builder from previous phases allows:
- switching between different build strategies or upgrading the builder without having to re-run the translator each time.
- reducing code duplication if a project contains multiple packages built via dream2nix.
### Example (walk through the phases)
#### python project with poetry.lock
As an example we package a python project that uses poetry for dependency management.
Poetry uses `pyproject.toml` and `poetry.lock` to lock dependencies.
- Input: pyproject.toml, poetry.lock (toml)
- Translator: written in pure nix, reading the toml input and generating the generic lock format
- Generic Lock (for explanatory purposes dumped to json and commented):
```json5
{
// generic lock format version
"version": 1,
// format for sources is always the same (not specific to python)
"sources": {
"requests": {
"type": "tarball",
"url": "https://download.pypi.org/requests/2.28.0",
"hash": "deadbeefdeadbeefdeadbeefdeadbeefdeadbeef",
},
"certifi": {
"type": "github",
"owner": "certifi",
"repo": "python-certifi",
"hash": "deadbeefdeadbeefdeadbeefdeadbeefdeadbeef"
}
},
// generic metadata (not specific to python)
"generic": {
// this indicates which builder must be used
"buildSystem": "python",
// translator which generated this file
// (not relevant for building)
"producedBy": "translator-poetry-1",
// dependency graph of the packages
"dependencyGraph": {
"requests": [
"certifi"
]
}
},
// all fields inside 'buildSystem' are specific to
// the selected buildSystem (python)
"buildSystem": {
// tell the python builder how the inputs must be handled
"sourceFormats": {
"requests": "sdist", // triggers build instructions for sdist
"certifi": "wheel" // triggers build instructions for wheel
}
}
}
```
- This lock data can now either:
- be dumped to a .json file and committed to a repo
- passed directly to the fetching/building layer
- the fetcher will only read the sources section and translate it to standard fetcher calls.
- the building layer will read the "buildSystem" attribute and select the python builder for building.
- the python builder will read all information from "buildSystem" and translate the data to a final derivation.
Notes on IFD, FOD and code generation:
- No matter which type of translator is used, it is always possible to export the generic lock to a file, which can later be evaluated without using IFD or FOD, similar to current nix code generators, just with a standardized format.
- If the translator supports IFD or is written in pure nix, it is optional to the user to skip exporting the generic lock and instead evaluate everything on the fly.
The idea of this project is to create a standardized, generic, modular framework for 2nix solutions, aiming for better flexibility, maintainability and usability.
The intention is to integrate many existing 2nix converters into this framework, thereby improving many of the previously named aspects and providing a unified UI for all 2nix solutions.
### Further Reading
- [Summary of the core concepts and benefits](/docs/concepts-and-benefits.md)
- [How would this improve the packaging situation in nixpkgs](/docs/nixpkgs-improvements.md)
- [Contributors Guide](/docs/contributors-guide.md)


@ -0,0 +1,176 @@
### Modularity:
The following phases which are generic to basically all existing 2nix solutions:
- parsing project metadata
- resolving/locking dependencies (not always required)
- fetching sources
- building/installing packages
... should be separated from each other with well defined interfaces.
This will allow for free composition of different approaches for these phases.
The user should be able to freely choose between:
- input metadata formats (eg. lock file formats)
- metadata fetching/translation strategies: IFD vs. in-tree
- source fetching strategies: granular fetching vs fetching via single large FOD to minimize expression file size
- installation strategies: build dependencies individually vs inside a single derivation.
### Customizability
Every phase mentioned in the previous section should be customizable to a high degree via override functions. Practical examples:
- Inject extra requirements/dependencies
- fetch sources from alternative locations
- replace or modify sources
- customize the build/installation procedure
### Maintainability
Due to the modular architecture with strict interfaces, contributors can add support for new lock-file formats or new strategies for fetching, building, installing more easily.
### Compatibility
Depending on where the nix code is used, different approaches are desired or discouraged. While IFD might be desired for some out of tree projects to achieve simplified UX, it is strictly prohibited in nixpkgs due to nix/hydra limitations.
All solutions which follow the dream2nix specification will be compatible with both approaches without having to re-invent the tool.
### Code de-duplication
Common problems that apply to many 2nix solutions can be solved once by the framework. Examples:
- handling cyclic dependencies
- handling sources from various origins (http, git, local, ...)
- generate nixpkgs/hydra friendly output (no IFD)
- good user interface
### Code de-duplication in nixpkgs
Essential components like package update scripts or fetching and override logic are provided by the dream2nix framework and are stored only once in the source tree instead of several times.
### Risk free opt-in FOD fetching
Optionally, to save more storage space, individual hashes for sources can be omitted and a single large FOD used instead.
Due to a unified minimalistic fetching layer the risk of FOD hash breakages should be very low.
### Common UI across many 2nix solutions
2nix solutions which follow the dream2nix framework will have a unified UI for workflows like project initialization or code generation. This will allow quicker onboarding of new users by providing familiar workflows across different build systems.
### Reduced effort to develop new 2nix solutions
Since the framework already solves common problems and provides an interface for integrating new build systems, developers will have an easier time creating their next 2nix solution.
### Architecture
The general architecture should consist of these components:
`Input -> Translation -> Generic Lock -> Fetching -> Building`
```
┌───────┐
│ Input │◄── Arbitrary
└────┬──┘ URLs + Metadata containing Build instructions
│ ┌──────────┐ in standardized minimalistic form (json)
└──►│Translator│ │
└───────┬──┘ ▼
▲ │ ┌────────────┐
│ └──►│Generic Lock│
└─────────┬──┘
impure/pure │ ┌────────┐
online/offline ├──►│Fetcher │◄── Same across all
pure-nix/IFD/external │ └────────┘ languages/frameworks
│ ▼
│ ┌────────┐
└──►│Builder │◄── Reads extra metadata
└────────┘ from generic lock
```
Input:
- can consist of:
- requirement constraints
- requirement files
- lock-files
- project's source tree
Translator:
- read input and generate generic lock format containing:
- URLs + hashes of sources
- metadata for building
- different strategies can be used:
- `pure-nix`: translate input by using the nix language only
- `IFD/recursive`: translate using a nix build
- `external`: translate using an external tool which resolves against an online package index
- for more information about translators and how nixpkgs compatibility is guaranteed, check [./translators.md](/docs/translators.md)
Generic Lock (standardized format):
- Produced by `Translator`. Contains URLs + hashes for sources and metadata relevant for building.
- The contained format for sources and dependency relations is independent of the build system. Fetching works always the same.
- The metadata also contains build system specific attributes as individual approaches are required here. A specific builder for the individual build system will later read this metadata and transform it into nix derivations.
- It is not relevant which steps/strategies have been taken to create this lock. From this point on, there are no impurities. This format will contain everything necessary for a fully reproducible build.
- This format can always be put into nixpkgs, not requiring any IFD (given the nix code for the builder exists within nixpkgs).
- In case of a pure-nix translator, the generic lock data can be generated on the fly and passed directly to the builder, preventing unnecessary usage of IFD.
Fetcher:
- Since a generic lock was produced in the previous step, the fetching layer can be the same across all build systems.
Builder:
- Receives sources from fetcher and metadata produced by the translator.
- The builder transforms the metadata into nix derivation(s).
- Strictly separating the builder from previous phases allows:
- switching between different build strategies or upgrading the builder without having to re-run the translator each time.
- reducing code duplication if a project contains multiple packages built via dream2nix.
### Example (walk through the phases)
#### python project with poetry.lock
As an example we package a python project that uses poetry for dependency management.
Poetry uses `pyproject.toml` and `poetry.lock` to lock dependencies.
- Input: pyproject.toml, poetry.lock (toml)
- Translator: written in pure nix, reading the toml input and generating the generic lock format
- Generic Lock (for explanatory purposes dumped to json and commented):
```json5
{
// generic lock format version
"version": 1,
// format for sources is always the same (not specific to python)
"sources": {
"requests": {
"type": "tarball",
"url": "https://download.pypi.org/requests/2.28.0",
"hash": "deadbeefdeadbeefdeadbeefdeadbeefdeadbeef",
},
"certifi": {
"type": "github",
"owner": "certifi",
"repo": "python-certifi",
"hash": "deadbeefdeadbeefdeadbeefdeadbeefdeadbeef"
}
},
// generic metadata (not specific to python)
"generic": {
// this indicates which builder must be used
"buildSystem": "python",
// translator which generated this file
// (not relevant for building)
"producedBy": "translator-poetry-1",
// dependency graph of the packages
"dependencyGraph": {
"requests": [
"certifi"
]
}
},
// all fields inside 'buildSystem' are specific to
// the selected buildSystem (python)
"buildSystem": {
// tell the python builder how the inputs must be handled
"sourceFormats": {
"requests": "sdist", // triggers build instructions for sdist
"certifi": "wheel" // triggers build instructions for wheel
}
}
}
```
- This lock data can now either:
- be dumped to a .json file and committed to a repo
- passed directly to the fetching/building layer
- the fetcher will only read the sources section and translate it to standard fetcher calls.
- the building layer will read the "buildSystem" attribute and select the python builder for building.
- the python builder will read all information from "buildSystem" and translate the data to a final derivation.
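As a rough illustration of the three bullets above, the following Python sketch reads a lock shaped like the example and separates what the fetching layer and the building layer each look at. It is only a sketch under assumptions: the `FETCHERS` table and the function names `plan_fetches` and `select_builder` are hypothetical, and dream2nix does not necessarily work this way internally.

```python
import json

# Hypothetical mapping from a source "type" to a fetcher name.
# The mapping is illustrative only and not taken from dream2nix.
FETCHERS = {"tarball": "fetchurl", "github": "fetchFromGitHub"}


def plan_fetches(lock):
    """Return (package, fetcher) pairs using only the 'sources' section."""
    return [
        (name, FETCHERS[src["type"]])
        for name, src in sorted(lock["sources"].items())
    ]


def select_builder(lock):
    """Pick the builder named in generic.buildSystem and hand it the
    build-system-specific section unchanged."""
    return lock["generic"]["buildSystem"], lock["buildSystem"]


# Same data as the commented example lock, as plain JSON.
lock = json.loads("""
{
  "version": 1,
  "sources": {
    "requests": {
      "type": "tarball",
      "url": "https://download.pypi.org/requests/2.28.0",
      "hash": "deadbeefdeadbeefdeadbeefdeadbeefdeadbeef"
    },
    "certifi": {
      "type": "github",
      "owner": "certifi",
      "repo": "python-certifi",
      "hash": "deadbeefdeadbeefdeadbeefdeadbeefdeadbeef"
    }
  },
  "generic": {
    "buildSystem": "python",
    "producedBy": "translator-poetry-1",
    "dependencyGraph": {"requests": ["certifi"]}
  },
  "buildSystem": {
    "sourceFormats": {"requests": "sdist", "certifi": "wheel"}
  }
}
""")

print(plan_fetches(lock))
print(select_builder(lock)[0])
```

The point of the split is that `plan_fetches` never touches anything python-specific, while `select_builder` passes the `buildSystem` section through without interpreting it.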
Notes on IFD, FOD and code generation:
- No matter which type of translator is used, it is always possible to export the generic lock to a file, which can later be evaluated without using IFD or FOD, similar to current nix code generators, just with a standardized format.
- If the translator supports IFD or is written in pure nix, it is optional to the user to skip exporting the generic lock and instead evaluate everything on the fly.


@ -0,0 +1,76 @@
# dream2nix contributors guide
## Contribute Translator
In general there are 3 different types of translators
1. pure translator
- translation logic is implemented in nix lang only
- does not invoke build or read from any build output
2. IFD translator
- part of the logic is integrated as a nix build
- nix code is used to invoke a nix build and parse its results
- same interface as pure translator
3. impure
- translator can be any executable program running outside of a nix build
- not constrained in any way (can do arbitrary network access etc.)
### Add a new translator
To add a new translator, execute the flakes app `contribute` which will generate a template for you. Then open the new `default.nix` file in an editor.
The nix file must declare the following attributes:
In case of a `pure` or `IFD` translator:
```nix
{
# function which receives source files and returns an attribute set
# which follows the dream lock format
translate = ...;
# function which receives source files and returns either true or false
# indicating if the current translator is capable of translating these files
compatiblePaths = ...;
# optionally specify additional arguments that the user can provide to the
# translator to customize its behavior
specialArgs = ...;
}
```
In case of an `impure` translator:
```nix
{
# A derivation which outputs an executable at `/bin/translate`.
# The executable will be called by dream2nix for translation
#
# The first arg `$1` will be a json file containing the input parameters
# like defined in /specifications/translator-call-example.json and the
# additional arguments required according to specialArgs
#
# The program is expected to create a file at the location specified
# by the input parameter `outFile`.
# The output file must contain the dream lock data encoded as json.
translateBin = ...;
# A function which receives source files and returns either true or false
# indicating if the current translator is capable of translating these files
compatiblePaths = ...;
# optionally specify additional arguments that the user can provide to the
# translator to customize its behavior
specialArgs = ...;
}
```
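The executable behind `translateBin` can be written in any language. The following Python sketch shows the contract described in the comments above: read the json file passed as `$1`, then write the dream lock data as json to the path given by `outFile`. Everything else is an assumption for illustration: the `translate` function is a placeholder, the name `example-impure-translator` is hypothetical, and the lock fields follow the example lock from the concepts document rather than a confirmed specification.

```python
#!/usr/bin/env python3
import json
import sys


def translate(params):
    """Build dream lock data from the input parameters.

    Placeholder: a real translator would inspect the project here and
    resolve its dependencies (arbitrary network access is allowed).
    """
    return {
        "version": 1,
        "sources": {},
        "generic": {
            "buildSystem": "python",
            "producedBy": "example-impure-translator",  # hypothetical name
            "dependencyGraph": {},
        },
        "buildSystem": {},
    }


def main(argv):
    # argv[1] corresponds to `$1`: a json file with the input parameters
    with open(argv[1]) as f:
        params = json.load(f)
    lock = translate(params)
    # the result must be written to the location given by `outFile`
    with open(params["outFile"], "w") as f:
        json.dump(lock, f, indent=2)
    return 0


if __name__ == "__main__" and len(sys.argv) == 2:
    sys.exit(main(sys.argv))
```

Only `outFile` is read from the input here, since that is the one parameter named in the comments above; a real translator would also read the remaining parameters and any `specialArgs`.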
Ways of debugging your translator:
- run the dream2nix flake app and use the new translator
- temporarily expose internal functions of your translator, then use nix repl `nix repl ./.` and invoke a function via `translators.translators.{subsystem}.{type}.{translator-name}.some_function`


@ -0,0 +1,92 @@
## List of problems which currently exist in nixpkgs
### Generated Code Size/Duplication
#### Problem
- large .nix files containing auto generated code for fetching sources (example: nodejs)
- many duplicated .nix files containing build logic
#### Solution
- dream2nix minimizes the amount of generated nix code, as most of the logic required to build a package resides in the framework and therefore is not duplicated across individual packages.
- If the upstream lock file format can be interpreted with pure nix and is present at evaluation time, then generating any intermediary code can be omitted.
- Once any kind of recursive nix (IFD, recursive-nix, RFC-92) is enabled in nixpkgs, dream2nix will utilize it and eliminate the requirement of generating nix code or storing upstream lock files.
### Update Scripts Duplication/Complexity
#### Problem
- update scripts are largely duplicated
- update scripts are complex
#### Solution
- storing `update.sh` scripts alongside packages will not be necessary anymore. dream2nix can generate update procedures on the fly by reading the package declaration.
- The UI for updating packages is the same across all languages/frameworks
### Fetching / Caching issues (large FODs)
#### Problem
- non-reproducible large FOD fetchers (example: rust)
- updating FODs is not risk free (forget to update hash)
- bad caching properties due to large FODs
#### Solution
- the translators of dream2nix always produce a clear list of URLs to fetch
- large-FOD fetching is not necessary and never enforced
- large-FOD fetching can be used optionally to reduce amount of hashes to be stored
- even if large-FOD fetching is used, it won't have any of the known reproducibility issues, since dream2nix never makes use of the upstream toolchain for fetching, and potentially impure operations like dependency resolution are never done inside an FOD.
- updating hashes of FODs is done via dream2nix CLI, which ensures that the correct hashes are in place
- As the use of large-FOD fetching is not necessary and therefore minimized, dependencies are cached on an individual basis and shared between packages.
### Update Workflows
#### Problem
- package update workflows can be complicated
- package update workflows vary significantly depending on the language/framework
#### Solution
- the workflow for updating packages will be unified and largely independent of the underlying language/framework.
### Merge Conflicts for shared dependencies
#### Problem
- Due to how shared dependencies are managed, merge conflicts are likely (example: global node-packages.nix)
#### Solution
- Having a central set of shared dependencies can make sense to reduce the code size of nixpkgs, load on hydra+cache.
- To eliminate merge conflicts, the global package set can be maintained via a two stage process. Individual package maintainers can manage their dependencies independently. Once every staging cycle, common dependencies can be found via graph analysis and moved into a global package set.
- The total amount of dependency versions used can also be minimized by re-running the resolver, prioritizing dependencies from the global set of common packages.
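The second stage of that process can be pictured with a small Python sketch. It is a simplification under stated assumptions: instead of a full graph analysis it only counts how many packages use each dependency directly, the function name `common_dependencies` and the `min_users` threshold are hypothetical, and the package names and versions are made-up sample data.

```python
from collections import Counter


def common_dependencies(per_package, min_users=2):
    """Return dependencies used by at least `min_users` packages.

    `per_package` maps a package name to the set of (name, version)
    dependencies that its maintainer manages independently.
    """
    usage = Counter(dep for deps in per_package.values() for dep in set(deps))
    return sorted(dep for dep, count in usage.items() if count >= min_users)


# Made-up sample data: three packages with partly overlapping dependencies.
packages = {
    "app-a": {("lodash", "4.17.21"), ("left-pad", "1.3.0")},
    "app-b": {("lodash", "4.17.21"), ("chalk", "4.1.2")},
    "app-c": {("lodash", "4.17.21"), ("chalk", "4.1.2")},
}

# Candidates for the global package set.
print(common_dependencies(packages))
```

Dependencies below the threshold stay with the individual package, so maintainers never edit the shared set directly and the merge conflicts described above do not arise.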
### Customizability / Overriding
#### Problem
- Capabilities vary depending on the underlying generator/translator.
- UI is different depending on the underlying generator/translator.
#### Solution
- dream2nix provides good interfaces for customizability which are unified as much as possible independently from the underlying buildsystems.
### Inefficient/Slow Innovation
#### Problem
- Design issues (FOD-impurity, Maintainability, etc.) cannot be fixed easily and lead to long term suffering of maintainers.
- Innovation often happens in individual tools and is not adopted ecosystem wide.
- New nix features will not be easily adopted as this will require updating many individual tools.
#### Solution
- Since dream2nix centrally handles many core elements of packaging like different strategies for fetching and building, it is much easier to fix problems at large scale and apply new innovations to all underlying buildsystems at once.
- Experimenting with and adding support for new nix features will be easier as the framework offers better abstractions than existing 2nix converters and allows adding/modifying strategies more easily.


@ -262,7 +262,7 @@ def parse_args():
list_parser.set_defaults(func=list_translators)
# PARSER FOR TRNASLATOR
# PARSER FOR TRANSLATOR
translate_parser = sub.add_parser(
"translate",
@ -280,7 +280,7 @@ def parse_args():
translate_parser.add_argument(
"-o", "--output",
help="output file/directory for the generic lock",
help="output file/directory for the dream.lock",
default="./dream.lock"
)


@ -10,7 +10,7 @@ dream2nix_src = "./src"
class ContributeCommand(Command):
description = (
"Creates a basic <comment>pyproject.toml</> file in the current directory."
"Add a new module to dream2nix by initializing a template"
)
name = "contribute"
@ -32,9 +32,6 @@ class ContributeCommand(Command):
def handle(self):
module = self.option('module')
print(f"module: {module}")
if self.io.is_interactive():
self.line("")
self.line(