10 Commits

Author SHA1 Message Date
aszlig
7286be7e81 nixos/systemd-confinement: Allow shipped unit file
In issue #157787 @martined wrote:

  Trying to use confinement on packages providing their systemd units
  with systemd.packages, for example mpd, fails with the following
  error:

  system-units> ln: failed to create symbolic link
  '/nix/store/...-system-units/mpd.service': File exists

  This is because systemd-confinement and mpd both provide a mpd.service
  file through systemd.packages. (mpd got updated that way recently to
  use upstream's service file)

To address this, we now place the unit file containing the bind-mounted
paths of the Nix closure into a drop-in directory instead of using the
name of a unit file directly.

This does come with the implication that the options set in the drop-in
directory won't apply if the main unit file is missing. In practice
however this should not happen for two reasons:

  * The systemd-confinement module already sets additional options via
    systemd.services and thus we should get a main unit file
  * In the unlikely event that we don't get a main unit file regardless
    of the previous point, the unit would be a no-op even if the options
    of the drop-in directory did apply

Another thing to consider is the order in which those options are
merged, since systemd loads the files from the drop-in directory in
alphabetical order. So given that we have confinement.conf and
overrides.conf, the confinement options are loaded before the NixOS
overrides.

Since we're only setting the BindReadOnlyPaths option, the order isn't
that important: all those paths are merged anyway, and we don't lose
the ability to reset the option because overrides.conf comes
afterwards.
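
As a minimal sketch of the configuration this change is meant to allow,
assuming mpd ships its own mpd.service as described in the issue (the
snippet is illustrative and not taken from the commit itself):

```nix
{ pkgs, ... }:

{
  # mpd provides its own mpd.service, picked up via systemd.packages.
  systemd.packages = [ pkgs.mpd ];

  # Confining the same unit should no longer try to create a second
  # mpd.service; the bind-mounted store paths go into a drop-in
  # (confinement.conf) for that unit instead.
  systemd.services.mpd.confinement.enable = true;
}
```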

Fixes: https://github.com/NixOS/nixpkgs/issues/157787
Signed-off-by: aszlig <aszlig@nix.build>
2022-03-02 11:42:44 -08:00
Andreas Rammhold
64556974b6
systemd: 247.6 -> 249.4
This updates systemd to version v249.4 from version v247.6.

Besides the many new features that can be found in the upstream
repository they also introduced a bunch of cleanup which ended up
requiring a few more patches on our side.

a) 0022-core-Handle-lookup-paths-being-symlinks.patch:
  The way symlinked units were handled was changed such that the last
  name of a unit file within one of the unit directories
  (/run/systemd/system, /etc/systemd/system, ...) is used as the name
  for the unit. Unfortunately that code didn't take into account that
  the unit directories themselves could already be symlinks and thus
  caused all our units to be recognized slightly differently.

  There is an upstream PR for this new patch:
    https://github.com/systemd/systemd/pull/20479

b) The way the APIVFS is set up has been changed in such a way that we
   now always have /run. This required a few changes to the
   confinement tests, which asserted that it didn't exist. Instead of
   adding another patch we can just adopt the upstream behavior. An
   empty /run doesn't seem harmful.

   As part of this work I refactored the confinement test just a little
   bit to allow better debugging of test failures. Previously it would
   just fail at some point and it wasn't obvious which of the many
   commands failed or what the unexpected string was. This should now be
   more obvious.

c) Again related to the confinement tests, the way a file was tested
   for being accessible was optimized. Previously systemd would in some
   situations open a file twice during that check. This was reduced to
   one operation but required the procfs to be mounted in a unit's
   namespace.

   An upstream bug was filed and fixed. We are now carrying the
   essential patch to fix that issue until it is backported to a new
   release (likely only version 250). The good part about this story is
   that upstream systemd now has a test case that looks very similar to
   one of our confinement tests. Hopefully that will lead to less
   friction in the long run.

   https://github.com/systemd/systemd/issues/20514
   https://github.com/systemd/systemd/pull/20515

d) Previously we could grep for dlopen( somewhat reliably but now
   upstream started using a wrapper around dlopen whose calls are
   usually split across several lines. This makes grepping for the
   call sites impractical.

   With this bump we are grepping for anything that looks like a
   dynamic library name (in contrast to a dlopen(3) call) and replace
   those instead. That seems more robust. Time will tell if this holds.

   I tried using coccinelle to patch all those call sites using its
   tooling but unfortunately it stumbles over the _cleanup_
   annotations that are very common in the systemd code.

e) We now have some machinery for libbpf support in our systemd build.
   That being said, it doesn't actually work yet because generating
   some of the BPF skeletons fails with the error message below. It is
   disabled by default (in both the minimal and the regular build).

   > FAILED: src/core/bpf/socket_bind/socket-bind.skel.h
   > /build/source/tools/build-bpf-skel.py --clang_exec /nix/store/x1bi2mkapk1m0zq2g02nr018qyjkdn7a-clang-wrapper-12.0.1/bin/clang --llvm_strip_exec /nix/store/zm0kqan9qc77x219yihmmisi9g3sg8ns-llvm-12.0.1/bin/llvm-strip --bpftool_exec /nix/store/l6dg8jlbh8qnqa58mshh3d8r6999dk0p-bpftools-5.13.11/bin/bpftool --arch x86_64 ../src/core/bpf/socket_bind/socket-bind.bpf.c src/core/bpf/socket_bind/socket-bind.skel.h
   > libbpf: elf: socket_bind_bpf is not a valid eBPF object file
   > Error: failed to open BPF object file: BPF object format invalid
   > Traceback (most recent call last):
   >   File "/build/source/tools/build-bpf-skel.py", line 128, in <module>
   >     bpf_build(args)
   >   File "/build/source/tools/build-bpf-skel.py", line 92, in bpf_build
   >     gen_bpf_skeleton(bpftool_exec=args.bpftool_exec,
   >   File "/build/source/tools/build-bpf-skel.py", line 63, in gen_bpf_skeleton
   >     skel = subprocess.check_output(bpftool_args, universal_newlines=True)
   >   File "/nix/store/81lwy2hfqj4c1943b1x8a0qsivjhdhw9-python3-3.9.6/lib/python3.9/subprocess.py", line 424, in check_output
   >     return run(*popenargs, stdout=PIPE, timeout=timeout, check=True,
   >   File "/nix/store/81lwy2hfqj4c1943b1x8a0qsivjhdhw9-python3-3.9.6/lib/python3.9/subprocess.py", line 528, in run
   >     raise CalledProcessError(retcode, process.args,
   > subprocess.CalledProcessError: Command '['/nix/store/l6dg8jlbh8qnqa58mshh3d8r6999dk0p-bpftools-5.13.11/bin/bpftool', 'g', 's', '../src/core/bpf/socket_bind/socket-bind.bpf.o']' returned non-zero exit status 255.
   > [102/1457] Compiling C object src/journal/libjournal-core.a.p/journald-server.c.o
   > ninja: build stopped: subcommand failed.

f) We do now have support for TPM2 based disk encryption in our
   systemd build. The actual bits and pieces to make use of that are
   missing but there are various ongoing efforts in that direction.
   There is also the story about systemd in our initrd to enable this
   being used for root volumes. None of this will yet work out of the
   box but we can start improving on that front.

g) FIDO2 support was added to systemd and consequently we can now use
   that. Just as with TPM2, there hasn't been any integration work with
   NixOS yet; this only adds the capability so that such work can start.

Co-Authored-By: Jörg Thalheim <joerg@thalheim.io>
2021-09-12 23:45:49 +02:00
divanorama
b7dea9e494 nixosTests.systemd-confinement: fix script format
https://hydra.nixos.org/build/142591177/nixlog/30

ZHF: #122042
2021-05-08 10:05:15 -07:00
Symphorien Gibol
7a87973b4c nixos/users: require one of users.users.name.{isSystemUser,isNormalUser}
As the only consequence of isSystemUser is that if the uid is null then
it's allocated below 500, if a user has uid = something below 500 then
we don't require isSystemUser to be set.
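
A minimal sketch of the two cases described above. The user and group
names are hypothetical, and the explicit group attributes are an
assumption that only matters on NixOS releases which require them:

```nix
{
  # No explicit uid: one of isSystemUser / isNormalUser has to be set.
  users.users.exampled = {
    isSystemUser = true;
    group = "exampled";
  };
  users.groups.exampled = { };

  # Explicit uid below 500: per this commit, isSystemUser may be omitted.
  users.users.legacy = {
    uid = 400;
    group = "legacy";
  };
  users.groups.legacy = { };
}
```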

Motivation: https://github.com/NixOS/nixpkgs/issues/112647
2021-04-14 20:40:00 +02:00
Florian Klink
32516e4fee
Merge pull request #80103 from tfc/port-systemd-confinement-test
nixosTests.systemd-confinement: Port to Python
2020-04-23 01:00:51 +02:00
Jörg Thalheim
cf3328e7e3
treewide: use runtimeShell in nixos/
This is needed for cross-compilation.
2020-04-07 07:26:47 +01:00
Jacek Galowicz
1320f23a6b nixosTests.systemd-confinement: Port to Python
2020-02-27 16:58:59 +01:00
aszlig
9e9af4f9c0
nixos/confinement: Allow to include the full unit
From @edolstra at [1]:

  BTW we probably should take the closure of the whole unit rather than
  just the exec commands, to handle things like Environment variables.

With this commit, there is now a "fullUnit" option, which can be enabled
to include the full closure of the service unit into the chroot.

However, I did not enable this by default, because I do disagree here
and *especially* things like environment variables or environment files
shouldn't be in the closure of the chroot.

For example if you have something like:

  { pkgs, ... }:

  {
    systemd.services.foobar = {
      serviceConfig.EnvironmentFile = pkgs.writeText "secrets" ''
        user=admin
        password=abcdefg
      '';
    };
  }

We really do not want the *file* to end up in the chroot, but rather
just the environment variables to be exported.

Another thing is that this makes it less predictable what actually will
end up in the chroot, because we have a "globalEnvironment" option that
will get merged in as well, so users adding stuff to that option will
also make it available in confined units.

I also added a big fat warning about that in the description of the
fullUnit option.
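
For illustration, opting in could look roughly like this. The service
name and its ExecStart are placeholders, not part of this commit:

```nix
{ pkgs, ... }:

{
  systemd.services.foobar = {
    # Placeholder command, only here to make the unit complete.
    serviceConfig.ExecStart = "${pkgs.hello}/bin/hello";

    confinement.enable = true;
    # Include the closure of the whole unit, not just the exec commands.
    # Anything referenced by the unit (e.g. environment files) then ends
    # up in the chroot, so this stays off unless explicitly enabled.
    confinement.fullUnit = true;
  };
}
```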

[1]: https://github.com/NixOS/nixpkgs/pull/57519#issuecomment-472855704

Signed-off-by: aszlig <aszlig@nix.build>
2019-03-14 20:04:33 +01:00
aszlig
46f7dd436f
nixos/confinement: Allow to configure /bin/sh
Another thing requested by @edolstra in [1]:

  We should not provide a different /bin/sh in the chroot, that's just
  asking for confusion and random shell script breakage. It should be
  the same shell (i.e. bash) as in a regular environment.

While I personally would go as far as to have a very restricted shell
that is not even a shell and basically *only* allows "/bin/sh -c" with
only *very* minimal parsing of shell syntax, I do agree that people
expect /bin/sh to be bash (or the one configured by environment.binsh)
on NixOS.

So this should make both others and me happy in that I could just use
confinement.binSh = "${pkgs.dash}/bin/dash" for the services I confine.
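
A sketch of such a per-service override, with a placeholder service
name and command:

```nix
{ pkgs, ... }:

{
  systemd.services.foobar = {
    # Placeholder command, only here to make the unit complete.
    serviceConfig.ExecStart = "${pkgs.hello}/bin/hello";

    confinement.enable = true;
    # Use dash instead of the default shell as /bin/sh inside the chroot.
    confinement.binSh = "${pkgs.dash}/bin/dash";
  };
}
```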

[1]: https://github.com/NixOS/nixpkgs/pull/57519#issuecomment-472855704

Signed-off-by: aszlig <aszlig@nix.build>
2019-03-14 19:14:05 +01:00
aszlig
0ba48f46da
nixos/systemd-chroot: Rename chroot to confinement
Quoting @edolstra from [1]:

  I don't really like the name "chroot", something like "confine[ment]"
  or "restrict" seems better. Conceptually we're not providing a
  completely different filesystem tree but a restricted view of the same
  tree.

I already used "confinement" as a sub-option and I do agree that
"chroot" sounds a bit too specific (especially because not *only* chroot
is involved).

So this changes the module name and its option to use "confinement"
instead of "chroot" and also renames the "chroot.confinement" to
"confinement.mode".
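
After the rename, a service definition would look roughly like the
following sketch. The service name is a placeholder, and both the mode
value and the old chroot.enable name are given from memory of the
module's options, so treat them as assumptions rather than as part of
this commit:

```nix
{
  systemd.services.foobar = {
    # Presumably chroot.enable before the rename.
    confinement.enable = true;
    # Formerly chroot.confinement.
    confinement.mode = "chroot-only";
  };
}
```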

[1]: https://github.com/NixOS/nixpkgs/pull/57519#issuecomment-472855704

Signed-off-by: aszlig <aszlig@nix.build>
2019-03-14 19:14:03 +01:00