Make sure the all derivations referenced by the test script are
available on the nodes. Accessing these derivations works just fine
without this change when using 9p to mount the host's store, but when
an image is built (virtualisation.buildRootImage), the dependencies
need to be copied to the image. We don't want to copy the script
itself, though, since that would trigger unnecessary image rebuilds.
Add a copyChannel argument which controls whether the current source
tree will be made available as a nix channel in the image or
not. Previously, it always was. Making it available is useful for
interactive use of nix utils, but changes the hash of the image when
the sources are updated.
The current implementation just forks off a thread to read
QEMU's stdout and lets it exist forever. This, however,
makes the interpreter shutdown racy, as the thread could
still be running and writing out buffered stdout when the
main thread exits (and since it's using the low level API,
the worker thread does not get cleaned up by the atexit hooks
installed by `threading`, either). So, instead of doing that,
let's create a real `threading.Thread` object, and also
explicitly `join` it along with the other stuff when cleaning up.
we need the file itself as a dependency for the docbook build, but we don't need
it to be properly sorted at the nix level. push the sort out to a python script
instead to save eval time. on the machine used to write this `nix-instantiate
<nixos/nixos> -A system` went down from 7.1s to 5.4s and GC heap size decreased
by 50MB (or 70MB max RSS).
We have no usecase for manually/selectively starting or stopping VLANs
in integration tests.
By starting and stopping the VLANs with the constructor and destructor
of VLAN objects, we remove the obligation and complexity to maintain
network lifetime separately.
This commit encapsulates the involved domain into classes and
defines explicit and typed arguments where untyped dicts where used.
It preserves backwards compatibility through legacy wrappers.
The current name is misleading: it doesn't contain cli arguments,
but several constants and utility functions related to qemu.
This commit also removes the use of `with import ...` for clarity.
This is a private interface for internal NixOS use. It is similar
to `make-disk-image` except it is much more opinionated about what
kind of disk image it'll make.
Specifically, it will always create *two* disks:
1. a `boot` disk formatted with FAT in a hybrid GPT mode.
2. a `root` disk which is completely owned by a single zpool.
The partitioning and FAT decisions should make the resulting images
bootable under EFI or BIOS, with systemd-boot or grub.
The root disk's zpools options are highly customizable, including
fully customizable datasets and their options.
Because the boot disk and partition are highly opinionated, it is
expected that the `boot` disk will be mounted at `/boot`. It is
always labeled ESP even on BIOS boot systems.
In order for the datasets to be mounted properly, the `datasets`
passed in to `make-zfs-image` are turned in to NixOS configuration
stored at /etc/nixos/configuration.nix inside the VM.
NOTE: The function accepts a system configuration in the `config`
argument. The *caller* must manually configure the system
in `config` to have each specified `dataset` be represented
by a corresponding `fileSystems` entry.
One way to test the resulting images is with qemu:
```sh
boot=$(find ./result/ -name '*.boot.*');
root=$(find ./result/ -name '*.root.*');
echo '`Ctrl-a h` to get help on the monitor';
echo '`Ctrl-a x` to exit';
qemu-kvm \
-nographic \
-cpu max \
-m 16G \
-drive file=$boot,snapshot=on,index=0,media=disk \
-drive file=$root,snapshot=on,index=1,media=disk \
-boot c \
-net user \
-net nic \
-msg timestamp=on
```
Previous to this commit, the entire test driver environment was shared
with the actual python test environment.
This is a hefty api surface. This commit selectively exposes only those
symbols to the test environment that are actually meant to be used by
tests.
This is the case when the test-script is empty. `nixos-build-vms(8)` is
primarily supposed to be used as tool to test changes or to reproduce
bugs (IMHO) where "just spinning up a few VMs" is the primary use-case.
In the ongoing discussion about these changes[1] it was suggested to
only expose it when needed (i.e. in the case I described above) to keep
the API surface as slim as possible.
[1] https://github.com/NixOS/nixpkgs/pull/133675#discussion_r688112485
This is relevant for `nixos-build-vms(8)` which doesn't have a
test-script. In that case it's more intuitive to directly go into the
interactive mode which is IMHO more intuitive.
Previously the driver was configured exclusively through convoluted
environment variables.
Now the driver's defaults are configured through env variables.
Some additional concerns are in the github comments of this PR.
the use of python further restricts possible RFC1035 host labels since
dash is not allowed for use in python identifiers.
The previous implementation of this check was flawed, since it did not
check the `hostName` value that is actually used to construe the
identifier, but the node name, which can be anything, e.g. just `machine`.
The previous implementation, by further restricting RFC1035 labels, only
for the sake of testing seems to be an unacceptable restriction and should
be addressed separately.
The disk image calculator was using find + exec forking du for every
file in the disk image, making it very slow. Change du to accept files,
nul delimeted on stdin to speed it back up.
Before change:
nix-build nixos/tests/image-contents.nix 9.71s user 1.06s system 8% cpu 2:13.11 total
After change:
nix-build nixos/tests/image-contents.nix 9.93s user 1.23s system 21% cpu 51.601 total
nixos tests are blended with other system configurations, hence
their settings must be either enforced or defaulted.
This particular setting is set via lib.nixosSystem as
`system.nixos.revision = final.mkIf (self ? rev) self.rev;` which would
mean that without this change no flake generated nixos could be blended
with nixos testing.
This setting was made previously constant in
169c6b4b14 in order to avoid pointless
rebuilds of the testing VMs, but was set without enforcing it.
Apparently this looks like it was forgotten when doing commit
3884ff70ba, which refactored the test
runner and driver a bit.
The passthru argument actually was correctly reintroduced in
setupDriverForTest, but the actual makeTest function didn't use it.
This fixes the nixpkgs tarball job, which previously failed with:
attribute 'elkPackages' missing, at /build/source/pkgs/tools/misc/logstash/6.x.nix:58:30
Signed-off-by: aszlig <aszlig@nix.build>
Acked-by: David Arnold <dar@xoe.solutions>
Fixes: https://github.com/NixOS/nixpkgs/issues/127274
Merges: https://github.com/NixOS/nixpkgs/pull/127346
Less nesting, where that improves readability. More nesteing, where
that improves readability, but most importantly:
Expose individual functions separately so that they can be more easily
built directly, eg.:
`nix build --impure --expr '(import ./testing-python.nix {system = builtins.currentSystem;}).mkTestDriver'`
At nixpkgs root:
`rg redirectSerial ./` does not result in any other match
nor does
`rg USE_SERIAL ./` except for an unrelated match in:
pkgs/tools/graphics/argyllcms/default.nix
Bash's standard behavior of not propagating non-zero exit codes
through a pipeline is unexpected and almost universally
unwanted. Default to setting `pipefail` for the command being run;
it can still be turned off by prefixing the pipeline with
`set +o pipefail` if needed.
Also, set `errexit` and `nonunset` options to make the first command
of consecutive commands separated by `;` fail, and disallow
dereferencing unset variables respectively.
When importing Nixpkgs within Nixpkgs, we should not consider aliases
to ensure we don't rely on them internally.
There are probably more places that need to be converted.
The root filesystem resizing step, `resize2fs -M', does not provide any
control over the amount of slack left in the result. It can produce an
arbitrarily tight fit, depending on how well the payload aligns with
ext4 data structures.
This is problematic, as NixOS must create a few files and directories
during its first boot, before the root is enlarged to match the size of
the containing SD card.
An overly tight fit can cause failures in the first stage:
mkdir: can't create directory '/mnt-root/proc': No space left on device
or in the second stage:
install: cannot create directory '/var': No space left on device
A previous version of `make-ext4-fs' (before PR #79368) was explicitly
"reserving" 16 MiB of free space in the final filesystem. Manually
calculating the size of an ext4 filesystem is a perilous endeavor,
however, and the method it employed was apparently unreliable.
Reverting is consequently not a good option.
A solution would be to create some sort of "balloon" occupying inodes
and blocks in the image prior to invoking `resize2fs -M', and to remove
these temporary files/directories before the compression step.
This changeset takes the simpler approach of simply dropping the
resizing step.
Note that this does *not* result in a larger image in general, as the
current procedure does not truncate the `.img' file anyway. In fact, it
has been observed to yield *smaller* compressed images---probably
because of some "noise" left after resizing. E.g., before-vs-after:
-r--r--r-- 2 root root 607M 1. Jan 1970 nixos-sd-image-21.11pre-git-x86_64-linux.img.zst
-r--r--r-- 2 root root 606M 1. Jan 1970 nixos-sd-image-21.11pre-git-x86_64-linux.img.zst
For now you had to know that the actions are retried for 900s when
seeing an error like
> Traceback (most recent call last):
> File "/nix/store/dbvmxk60sv87xsxm7kwzzjm7a4fhgy6y-nixos-test-driver/bin/.nixos-test-driver-wrapped", line 927, in run_tests
> exec(tests, globals())
> File "<string>", line 1, in <module>
> File "<string>", line 31, in <module>
> File "/nix/store/dbvmxk60sv87xsxm7kwzzjm7a4fhgy6y-nixos-test-driver/bin/.nixos-test-driver-wrapped", line 565, in wait_for_file
> retry(check_file)
> File "/nix/store/dbvmxk60sv87xsxm7kwzzjm7a4fhgy6y-nixos-test-driver/bin/.nixos-test-driver-wrapped", line 142, in retry
> raise Exception("action timed out")
> Exception: action timed out
in your (hydra) build failure. Due to the absence of timestamps you were
left guessing if the machine was just slow, someone passed a low timeout
value (which they couldn't until now) or whatever might have happened.
By making this error a bit more descriptive (by including the elapsed
time) these hopefully become more useful.
Our test driver exposes a bunch of variables and functions, which
pyflakes doesn't recognise by default because it assumes that the test
script is executed standalone. In reality however the test driver script
is using exec() on the testScript.
Fortunately pyflakes has $PYFLAKES_BUILTINS, which are the attributes
that are globally available on all modules to be checked. Since we only
have one module, using this environment variable is fine as opposed to
my first approach to this, which tried to use the unstable internal API
of pyflakes.
The attributes are gathered by the main derivation of the test driver,
because we don't want to end up defining a new attribute in the test
driver module just to being confused why using it in a test will result
in an error.
Another way we could have gathered these attributes would be in
mkDriver, which is where the linting takes place. However, we do have a
different set of Python dependencies in scope and duplicating these will
again just cause confusion over having it at one location only.
Signed-off-by: aszlig <aszlig@nix.build>
Co-Authored-By: aszlig <aszlig@nix.build>
So far, we have used "black" for formatting the test code, which is
rather strict and opinionated and when used inline in Nix expressions it
creates all sorts of trouble.
One of the main annoyances is that when using strings coming from Nix
expressions (eg. store paths or option definitions from NixOS modules),
completely unrelated changes could cause tests to fail, since eg. black
wants lines to be broken.
Another downside of enforcing a certain kind of formatting is that it
makes the Nix expression code inconsistent because we're mixing two
spaces of indentation (common in nixpkgs) with four spaces of
indentation as defined in PEP-8. While this is perfectly fine for
standalone Python files, it really looks ugly and inconsistent IMO when
used within Nix strings.
What we actually want though is a linter that catches problems early on
before actually running the test, because this is *actually* helping in
development because running the actual VM test takes much longer.
This is the reason why I switched from black to pyflakes, because the
latter actually has useful checks, eg. usage of undefined variables,
invalid format arguments, duplicate arguments, shadowed loop vars and
more.
Signed-off-by: aszlig <aszlig@nix.build>
Closes: https://github.com/NixOS/nixpkgs/issues/72964
On my system I have XWayland disabled and therefore only WAYLAND_DISPLAY
is set. This ensures that the graphical output will still be enabled on
such setups (both Wayland and X11 are supported by the viewer).
Work around missing /dev files inside runInLinuxVM by creating a
symlink before calling nixos-enter.
This fixes https://github.com/NixOS/nixpkgs/issues/93381.
I ran into this issue when trying to create a VMware image that boots from EFI.
Thanks @colemickens for reporting this and @danielfullmer for fixing the same thing in in qemu-vm.nix (37676e77cb) and explaining what the issue was.
This ensures the following gptfdisk warning won't happen:
```
Warning: File size is not a multiple of 512 bytes! Misbehavior is likely!
```
Additionally, helps towards aligning the partition to be more optimal
for the underlying storage.
It is actually impossible to align for the actual underlying storage
optimally because we don't know what the block device will be!
But aligning on 1MiB should help.
This is a bit of a thorny issue. See, the actual `diskSize` variable is
for the *total* disk size, not for the filesystem!
The automatic numbers are meant to compute the *filesystem* required
space. So we have to add any other reserved space!
We have different requirements for reserved space. E.g. there could be
none (when it's actually a filesystem image). There could also be 1MiB
for alignment for an MBR image, legacy+gpt needs 2MiB, then GPT with an
ESP ("bootSize") needs to take the boot partition and GPT size into
account too!
Though luckily(?) for this latter situation we can cheat! As noted in the
change, `bootSize` is NOT the boot partition size. It is actually the
offset where the target filesystem starts.
Reserved space includes:
- inodes space in use (2 blocks per)
- about 5.2% of the space
The 5.2% reserved space was computed empirically when working on a
previous EXT4 image builder. It seems to stabilize around 5% even for
much larger filesystems.
On some filesystems, `du` without `--apparent-size` will not give the
actual size for a file. Using `--apparent-size` will give us the actual
file size.
Though, this is not actually correct still. 1000 × 1 bytes is not 1000
bytes. It is 1000 × ceil(filesize/blockSize)*blockSize.
So instead of adding up the actual file sizes. We are adding up the
block sizes.
Note that this also changes the builder to work with *bytes*, rather
than with any other units. Doing maths on bytes is less likely to go
awry than doing it on other units.
When performing OCR, some of the Tesseract settings perform better than
others on a variety of different workloads, but they mostly take
~negligible incremental time to run compared to the overhead of running
the ImageMagick filters.
After this commit, we try using all three of the current Tesseract
models (classic, LSTM, and classic+LSTM) to generate output text. This
fixes chromium-90's tests at release-20.09, and should make cases where
you're looking for *specific* text better, with the tradeoff of running
Tesseract multiple times.
To make it sensible to cherrypick this into release-20.09, this doesn't
change the existing API surface for the test driver. In particular,
get_screen_text continues to have the existing behaviour.
Make the revision metadata constant, in order to avoid needless retesting.
The human version (e.g. 21.05-pre) is left as is, because it is useful
for external modules that test with e.g. nixosTest and rely on that
version number.
the nix store may contain hardlinks: derivations may output them
directly, or users may be using store optimization which automatically
hardlinks identical files in the nix store.
The presence of these links are intended to be a 'transparent'
optimization. However, when creating a squashfs image, the image
will be different depending on whether hard links were present
on the filesystem, leading to reproducibility problems.
By passing '-no-hardlinks' to mksquashfs the files are stored
as duplicates in the squashfs image. Since squashfs has support
for duplicate files this does not lead to a larger image.
For more details see
https://github.com/NixOS/nixpkgs/issues/114331
/var/lib/nixos is used by update-users-groups.pl in the activation
script for storing uid/gid mappings. If this has its own mountpoint
(as is the case in some setups with fine-grained bind mounts pointing
into persistent storage), the mappings are written to /var/lib, /var,
or /. These may be backed by a tmpfs or (otherwise ephemeral storage),
resulting in the mappings not persisting between reboots.
The hydra tarball step would fail due to the nodes attribute not being
properly inherited. Since we can't execute all the tests and release
steps locally anymore (thanks to the JSONification and faster hydra
eval) these errors will probably keep in appearing.
This is hopefully the last of those introduced by me test runner
refactoring.
Error was seen on hydra (https://hydra.nixos.org/build/129282411):
> unpacking sources
> unpacking source archive /nix/store/bp95x52h6nv3j8apxrryyj2rviw682k1-source
> source root is source
> patching sources
> autoconfPhase
> No bootstrap, bootstrap.sh, configure.in or configure.ac. Assuming this is not an GNU Autotools package.
> configuring
> release name is nixpkgs-21.03pre249116.1088f059401
> git-revision is 1088f05940
> building
> no Makefile, doing nothing
> running tests
> warning: you did not specify '--add-root'; the result might be removed by the garbage collector
> warning: you did not specify '--add-root'; the result might be removed by the garbage collector
> checking Nixpkgs on i686-linux
> checking Nixpkgs on x86_64-linux
> checking Nixpkgs on x86_64-darwin
> checking eval-release.nix
> trace: `mkStrict' is obsolete; use `mkOverride 0' instead.
> trace: `lib.nixpkgsVersion` is deprecated, use `lib.version` instead!
> trace: warning: lib.readPathsFromFile is deprecated, use a list instead
> trace: Warning: `showVal` is deprecated and will be removed in the next release, please use `traceSeqN`
> trace: lib.zip is deprecated, use lib.zipAttrsWith instead
> checking find-tarballs.nix
> trace: `mkStrict' is obsolete; use `mkOverride 0' instead.
> trace: `lib.nixpkgsVersion` is deprecated, use `lib.version` instead!
> trace: warning: lib.readPathsFromFile is deprecated, use a list instead
> trace: Warning: `showVal` is deprecated and will be removed in the next release, please use `traceSeqN`
> trace: lib.zip is deprecated, use lib.zipAttrsWith instead
> error: while evaluating anonymous function at /build/source/maintainers/scripts/find-tarballs.nix:6:1, called from undefined position:
> while evaluating 'operator' at /build/source/maintainers/scripts/find-tarballs.nix:27:16, called from undefined position:
> while evaluating 'immediateDependenciesOf' at /build/source/maintainers/scripts/find-tarballs.nix:39:29, called from /build/source/maintainers/scripts/find-tarballs.nix:27:44:
> while evaluating anonymous function at /build/source/lib/attrsets.nix:234:10, called from undefined position:
> while evaluating anonymous function at /build/source/maintainers/scripts/find-tarballs.nix:40:37, called from /build/source/lib/attrsets.nix:234:16:
> while evaluating 'derivationsIn' at /build/source/maintainers/scripts/find-tarballs.nix:42:19, called from /build/source/maintainers/scripts/find-tarballs.nix:40:40:
> while evaluating 'canEval' at /build/source/maintainers/scripts/find-tarballs.nix:48:13, called from /build/source/maintainers/scripts/find-tarballs.nix:43:9:
> while evaluating the attribute 'nodes' at /build/source/nixos/lib/testing-python.nix:195:23:
> attribute 'nodes' missing, at /build/source/nixos/lib/testing-python.nix:193:16
> build time elapsed: 0m0.122s 0m0.043s 17m51.526s 0m56.668s
> builder for '/nix/store/96rk3c74vrk6m3snm7n6jhis3j640pn4-nixpkgs-tarball-21.03pre249116.1088f059401.drv' failed with exit code 1
In 5500dc8 we introduced the --keep-vm-state flag and defaulted to that
flag not being set. This lead to the `runInMachine` tests not longer
working and that going unnoticed for quite some time now.
Previously you would be able to override only the QEMU package to be
used in the test runner. Frankly that doesn't help a lot if you are
trying to get a graphical session. The graphical session requires the
option in the NixOS module system to bet set to the correct QEMU
package.
In this commit I moved most of the test node configuration and
transformations into the `mkDriver` function (previously called
`driver`). The motivation was to be able to create a `driver` instance
with a given QEMU package that will be used consistently througout the
test expression.
According to Python documentation [0], `bufsize=1` is only meaningful in
text mode. As we don't pass in an argument called `universal_newlines`,
`encoding`, `errors` or `text` the file objects aren't opened in text
mode, which means the argument is ignored with a warning in Python 3.8.
line buffering (buffering=1) isn't supported in binary mode,
the default buffer size will be used
This commit removes this warning that appared when using
interactive test driver built with `-A driver`. This is done by
removing `bufsize=1` from Popen calls.
The default parameter when unspecified for `bufsize` is `-1` which
according to the documentation will be interpreted as
`io.DEFAULT_BUFFER_SIZE`. As mentioned by a warning, Python already
uses default buffer size when providing `buffering=1` parameter for
file objects not opened in text mode.
[0]: https://docs.python.org/3/library/subprocess.html#subprocess.Popen
For a lot of the work the non-interactive drivers are enough and it is
probably a good idea to keep it accessible for debugging without
touching the Nix expression.
Since we previously stripped down the features of `qemu_test` some of
the features users are used to while running tests through the (impure)
driver didn't work anymore. Most notably we lost support for graphical
output and audio. With this change the `driver` attribute uses are more
feature complete version of QEmu compared to the one used in the pure
Nix builds.
This gives us the best of both worlds. Users are able to see the
graphical windows of VMs while CI and regular nix builds do not have to
download all the (unnecessary) dependencies.
Since using flakes disallows the usage of <unstable> (which I use in
some tests), this adds an alternative. By setting specialArgs, all VMs
can get the `unstable` flake input as an arg. This is not possible with
extraConfigurations, as that would lead to infinite recursions.
ecb73fd555 introduced a new keepVmState
CLI flag for test-driver.py. This CLI flags gets forwarded to the
Machine class through create_machine.
It created a regression for the boot tests where __main__ end up not
being evaluated. See
https://github.com/NixOS/nixpkgs/pull/97346#issuecomment-690951837 for
bug report.
Defaulting keepVmState to false when __main__ ends up not being
evaluated.
The previous version of the code would only kick in if the state
directory path pointed at a *file*, which never occurs. Making that
codepath actually work reveals an ordering bug, which this patch fixes
as well.
It also replaces the confusing, imperative case log message "delete VM
state directory" with "deleting VM state directory".
Finally, we hint the user about how to prevent this deletion. IE. by
passing the --keep-vm-state flag.
Bug report:
https://github.com/NixOS/nixpkgs/pull/91046#issuecomment-685568750
Credit goes to Edef for the rebase on top of a recent nixpkgs commit
and for writing most of this commit message.
Co-authored-by: edef <edef@edef.eu>
This reverts commit 1bff6fe17c, reversing
changes made to 2995fa48cb.
There’s presumably nothing wrong with this PR, except that it
conflicts with reverting #96254 which broke several tests (#96699).
Signed-off-by: Anders Kaseorg <andersk@mit.edu>
With the Perl driver, machine.sleep(N) was doing a sleep on the guest
machine instead of the host machine. The new Python test driver however
uses time.sleep(), which instead sleeps on the host.
While this shouldn't make a difference most of the time, it *does*
however make a huge difference if the test machine is loaded and you're
sleeping for a minimum duration of eg. an animation.
I stumbled on this while porting most of all my tests to the new Python
test driver and particularily my video game tests failed on a fairly
loaded machine, whereas they don't with the Perl test driver.
Switching the sleep() method to sleep on the guest instead of the host
fixes this.
Signed-off-by: aszlig <aszlig@nix.build>
This appears to avoid requiring KVM when it’s not available. This is
what I originally though -cpu host did. Unfortunately not much
documentation available from the QEMU side on this, but this appears
to square with help:
$ qemu-system-x86 -cpu help
...
x86 host KVM processor with all supported host features
x86 max Enables all features supported by the accelerator in the current host
...
Whether we actually want to support this not clear, since this only
happens when your CPU doesn’t have full KVM support. Some Nix builders
are lying about kvm support though. Things aren’t too slow without it
though.
Fixes https://github.com/NixOS/nixpkgs/issues/85394
Alternative to https://github.com/NixOS/nixpkgs/pull/83920
- Give a more accurate description of how fileSystems.<name/>.neededForBoot
works
- Give a more detailed description of how fileSystems.<name/>.encrypted.keyFile
works
In 9ac1ab10c9 this library function was
refactored to use mkfs.ext4 instead of cptofs. There are two problems:
If populateImageCommands would create no files (which is fine), a cp
invocation would fail due to missing source arguments.
Another problem is that mkfs.ext4 relies on fakeroot to have sane
uid/gids in the generated filesystem image. This currently doesn't
work for cross compiling.
This modifies the `router` to not give out a range of IP addresses but
only give out a fixed address based on the MAC address using the
`services.dhcpd4.machines` option.
To get access to the MAC address the `qemuNicMac` function is defined
and exported from `qemu-flags.nix`.
By generating a version-5 GUID based on $out (which contains
the derivation hash) and preventing isohybrid from overwriting
the GPT table (which already is populated correctly by xorriso).
Tested by:
* booting from USB disk on a UEFI system
* booting from USB disk on a non-UEFI system
* booting from CD on a UEFI system
* booting from CD on a non-UEFI system
* booting from CD on an OSX system
Also tested that "nix-build ./nixos/release-combined.nix -A
nixos.iso_minimal.x86_64-linux -I nixpkgs=~/nixpkgs-r13y --check"
now succeeds.
Fixes#74047
This was broken in 460c0d6 (PR #90431); now the nixos-unstable channel
should get unblocked.
vcunat modified this commit to use env-var instead of hardcoding /build
Keeping the VM state test across several run sometimes lead to subtle
and hard to spot errors in practice. We delete the VM state which
contains (among other things) the qcow volume.
We also introduce a -K (--keep-vm-state) flag making VM state to
persist after the test run. This flag makes test-driver.py to match
its previous behaviour.
Turns out, on smaller images (~800MiB uncompressed sdcard image size),
the current fudge factor is way too small to even get the system to the
phase where it can resize itself.
I first tried with 1.05, but it wasn't enough.
xchg is advertised as a bidirectional exchange dir, but file content
transfer from host to VM fails due to caching:
If a file is read in the VM and then modified on the host, subsequent
re-reads in the VM can yield old, cached data.
This is caused by the use of 9p's cache=loose mode that is explicitly
meant for read-only mounts.
9p doesn't provide any suitable cache modes, so fix this by disabling
caching.
Also, remove a now unnecessary sync in the test driver.
These syncs have the goal to transfer host filesystem changes to the VM,
but they have no effect because 1) syncing in the VM can't possibly pull
in host data and 2) 9p is accessing the host filesystem on the cached
layer anyways, so even syncing on the host would have no effect in the
VM.
The test harness provides the commands it wishes to run in Bourne
syntax. This fails if the user uses a different shell. For example,
with fish:
machine.wait_for_unit("graphical-session.target", "alice")
machine # fish: Unsupported use of '='. To run '-u`' with a modified environment, please use 'env XDG_RUNTIME_DIR=/run/user/`id -u`…'
machine # XDG_RUNTIME_DIR=/run/user/`id -u` systemctl --user --no-pager show "graphical-session.target"
machine # ^
machine # [ 16.329957] su[1077]: pam_unix(su:session): session closed for user alice
error: retrieving systemctl info for unit "graphical-session.target" under user "alice" failed with exit code 127
This completes the removal of the nested log feature, which previously
got removed from Nix, Hydra, stdenv and GNU Make. In particular, this
means that the output of VM builds no longer contains a copy of
jQuery.
If a program (e.g. nixos-install) writes more than 1000 lines to
stderr during execute(), then process_serial_output() deadlocks
waiting for the queue to be processed. So use an unbounded queue
instead.
We should probably get rid of the structured log output (log.xml),
since then we don't need the log queue anymore.
This avoids a possible surprise if the user is using `nixpkgs.system`
and `nesting.children`. `nesting.children` is expected to ignore all
parent configuration so we shouldn't propagate the user-facing option
`nixpkgs.system`. To avoid doing so, we introduce a new internal
option for holding the value passed to eval-config.nix, and use that
when recursing for nesting.
Most VM tests have been migrated to use the python test driver
(introduced in #71684), the migration is tracked in #72828 (which also
thankfully uncovered and fixed many currently broken tests)
While increasing the acceptance and adoption of NixOS integration tests
by using a more popular language, there was also nobody willing to do
larger refactors in the currently very convoluted test infrastructure.
We plan to remove the perl infrastructure between the 20.03 and 20.09
release, to be able to do these refactorings.
Some people might be using Perl tests in their internal CI, so print a
warning for 20.03, and give users time to move to the python testing
infrastructure.
According to https://repology.org/repository/nix_unstable/problems, we have a
lot of packages that have http links that redirect to https as their homepage.
This commit updates all these packages to use the https links as their
homepage.
The following script was used to make these updates:
```
curl https://repology.org/api/v1/repository/nix_unstable/problems \
| jq '.[] | .problem' -r \
| rg 'Homepage link "(.+)" is a permanent redirect to "(.+)" and should be updated' --replace 's@$1@$2@' \
| sort | uniq > script.sed
find -name '*.nix' | xargs -P4 -- sed -f script.sed -i
```
The docstring says it uses a directory shared among all vms, although
that doesn't seem necessary for the functionality. However, it does need
to be consistent between the guest and host.
The codec format 'unicode_escape' was introduced in 52ee102 to handle
undecodable bytes in boot menus.
This made the problem worse as unicode chars outside of iso-8859-1
produce garbled output and valid utf-8 strings (such as "\x" ) trigger
decoding errors.
Fix this by using the default 'utf-8' codec and by explicitly ignoring
decoding errors.
This changes the python test driver to match the behavior of the perl
test driver. I.e. the directory mounted into /tmp/shared should be the
same for all machines.
This probably fixes many tests, but I found this while investigating
failures in nixos/tests/ceph-multi-node.nix.
While it's a good idea to automate the linting of the python code used
for our tests, I think that it can be quite distracting when hacking on
a NixOS test.
I figured that it might be more convenient to add an option as a
shortcut for this to avoid that everyone needs to dig into the test
driver again.
The upstream session files display managers use have no concept of sessions being composed from
desktop manager and window manager. To be able to set upstream session files as default
session, we need a single option. Having two different ways to set default session would be confusing,
though, so we decided to deprecate the old method.
We also created separate script for each session, just like we already had a separate desktop
file for each one, and started using displayManager.sessionPackages mechanism to make the
session handling more uniform.
When using `documentation.nixos.includeAllModules = true;` with external
modules, the string context might contain dependencies to derivations
and so `toFile` refuses to evaluate;
```
error: in 'toFile': the file 'options.xml' cannot refer to derivation outputs, at
[...]/nixpkgs/nixos/lib/make-options-doc/default.nix:89:16
```
This is not an issue when using `writeText` (instead of manually
stripping the context).
The SLIM project is abandoned and their last release was in 2013.
Because of this it poses a security risk to systems, no one is working
on it or picked up maintenance. It also lacks compatibility with systemd
and logind sessions. For users, there liikely isn't anything like slim
that's as lightweight in terms of dependencies.
we previously immediately returned the first commands output, and didn't
execute any of the other commands.
Now, return the last commands output.
This should be documented in the method docstring.
Condition seems to be inverted. Crash and shutdown only make sense, when
the machine is booted; i.e. we return immediately otherwise.
In the Perl test driver this is:
return unless $self->{booted};
This reverts commit e9bf955fd6. We use
nixos-install to ensure that make-disk-image produces the same result
as a regular installation (9802da517f)
and to reduce code duplication. If there is something broken in
nixos-install, it should be fixed there.
Introduce new functions which allows modules to define options where,
if the input is an attrset and the output is JSON, the user can define
arbitrary secrets.
Because the copy process inside the VM does not reliably
give "No space" error message leaving the user wondering what
went wrong:
unable to create directory /mnt/0000fe01///nix/store/yknzxx7w2ck9p30k81gpi5yfjlrq41lr-libsecret-0.18.7/share/locale/ro: Success
[ 5.462365] reboot: Restarting system
error processing entry /build/root/nix/store/yknzxx7w2ck9p30k81gpi5yfjlrq41lr-libsecret-0.18.7/share/locale/ro, aborting
error processing entry /build/root/nix/store/yknzxx7w2ck9p30k81gpi5yfjlrq41lr-libsecret-0.18.7/share/locale, aborting
error processing entry /build/root/nix/store/yknzxx7w2ck9p30k81gpi5yfjlrq41lr-libsecret-0.18.7/share, aborting
error processing entry /build/root/nix/store/yknzxx7w2ck9p30k81gpi5yfjlrq41lr-libsecret-0.18.7, aborting
error processing entry /build/root/nix/store, aborting
error processing entry /build/root/nix, aborting
builder for '/nix/store/fsdvqxq92iai7f3w8wcsncgfwag7cj2l-libvirtd-ssh-image.drv' failed with exit code 228
Motivation is to support other repositories containing nixos
modules that would like to generate options documentation:
- nix-darwin
- private repos
- arion
- ??
When IPXE tests were added, an option was added for configuring only
the frontend, and the backend configuration was dropped entirely. This
caused most installer tests to fail.
We differentiate between modules and baseModules in the
VM builder for NixOS tests. This way, nesting.children, eventhough
it doesn't inherit from parent, still has enough config to
actually complete the test. Otherwise, the qemu modules
would not be loaded, for example, and a nesting.children
statement would not evaluate.
Before this change `man 5 configuration.nix` would only show options of modules in
the `baseModules` set, which consists only of the list of modules in
`nixos/modules/module-list.nix`
With this change applied and `documentation.nixos.includeAllModules` option enabled
all modules included in `configuration.nix` file will be used instead.
This makes configurations with custom modules self-documenting. It also means
that importing non-`baseModules` modules like `gce.nix` or `azure.nix`
will make their documentation available in `man 5 configuration.nix`.
`documentation.nixos.includeAllModules` is currently set to `false` by
default as enabling it usually uncovers bugs and prevents evaluation.
It should be set to `true` in a release or two.
This was originally implemented in #47177, edited for more configurability,
documented and rebased onto master by @oxij.
Fixes#5185856e12aae54 ends up passing config to pkgs. Unfortunately this might be null and pkgs/top-level/default.nix assumes it is an attrset. To fix this, we just make the default for config = {}. Thanks to @kristoff3r for tracking this down.
/cc @domenkozar
See #49441 for an earlier attempt, which was subsequently reverted. I am
assuming that doubling the time will be sufficient if the machine is
overloaded since so many of the tests already pass at 5 minutes, while
still not holding back failures for needlessly long.
cleanSource does not appear to work correctly in this case. The path
does not get coerced to a string, resulting in a dangling symlink
produced in channel.nix. Not sure why, but this
seems to fix it.
Fixes#51025.
/cc @elvishjericco
Since 113a6b9325 the test driver
explicitly ensures if the node names won't break the resulting Perl
script at runtime. This slightly improves the correctness of the error
message.
These names are referenced by Perl variables inside the testing
frameworks which don't allow chars like `-` as character inside. An exemplary
expression may look like this:
```
{
x11-vm = {
services.xserver.enable = true;
};
}
```
This expression evaluates, e.g. when running `nixos-build-vms`, but when
trying to run `./result/bin/nixos-run-vms`, an error like this occurs:
```
starting VDE switch for network 1
running the VM test script
error: Can't modify subtraction (-) in scalar assignment at (eval 17) line 1, at EOF
Bareword "test" not allowed while "strict subs" in use at (eval 17) line 1.
Can't modify subtraction (-) in scalar assignment at (eval 17) line 1, at EOF
Bareword "test" not allowed while "strict subs" in use at (eval 17) line 1.
vde_switch: EOF on stdin, cleaning up and exiting
cleaning up
```
This can be very confusing for beginners, this change breaks evaluation
if such names are used for machines.
This is to try and squeeze more lost space from the image, so that hydra
starts building it again.
The fsck previous to the resize2fs is required so resize2fs works.
The one afterwards is a sanity check.
Using `-M` from resize2fs will not give much saved space due to a known
(in the manual) issue.
```
[samueldr@aarch64:~/nixpkgs]$ ls -lh result-*/*/*.img
-r--r--r-- 1 root root 2.2G Jan 1 1970 result-original/sd-image/nixos-sd-image-18.09.git.a7fd431-aarch64-linux.img
-r--r--r-- 1 root root 2.1G Jan 1 1970 result-M/sd-image/nixos-sd-image-18.09.git.a7fd431-aarch64-linux.img
-r--r--r-- 1 root root 1.9G Jan 1 1970 result-slimmed/sd-image/nixos-sd-image-18.09.git.a7fd431-aarch64-linux.img
```
```
[samueldr@aarch64:~/nixpkgs]$ nix path-info -S ./result-original
/nix/store/c8k9n78gylx293rjh762fr05a069kxp2-nixos-sd-image-18.09.git.a7fd431-aarch64-linux.img 3844125000
[samueldr@aarch64:~/nixpkgs]$ nix path-info -S ./result-slimmed
/nix/store/962238skj5mnzhrsmjy23dyzmxk77sp4-nixos-sd-image-18.09.git.a7fd431-aarch64-linux.img 3447473208
```
Rationale
---------
Currently, tests are hard to discover. For instance, someone updating
`dovecot` might not notice that the interaction of `dovecot` with
`opensmtpd` is handled in the `opensmtpd.nix` test.
And even for someone updating `opensmtpd`, it requires manual work to go
check in `nixos/tests` whether there is actually a test, especially
given not so many packages in `nixpkgs` have tests and this is thus most
of the time useless.
Finally, for the reviewer, it is much easier to check that the “Tested
via one or more NixOS test(s)” has been checked if the file modified
already includes the list of relevant tests.
Implementation
--------------
Currently, this commit only adds the metadata in the package. Each
element of the `meta.tests` attribute is a derivation that, when it
builds successfully, means the test has passed (ie. following the same
convention as NixOS tests).
Future Work
-----------
In the future, the tools could be made aware of this `meta.tests`
attribute, and for instance a `--with-tests` could be added to
`nix-build` so that it also builds all the tests. Or a `--without-tests`
to build without all the tests. @Profpatsch described in his NixCon talk
such systems.
Another thing that would help in the future would be the possibility to
reasonably easily have cross-derivation nix tests without the whole
NixOS VM stack. @7c6f434c already proposed such a system.
This RFC currently handles none of these concerns. Only the addition of
`meta.tests` as metadata to be used by maintainers to remember to run
relevant tests.