Currently if you want to properly chroot a systemd service, you could do
it using BindReadOnlyPaths=/nix/store or use a separate derivation which
gathers the runtime closure of the service you want to chroot. The
former is the easier method and there is also a method directly offered
by systemd, called ProtectSystem, which still leaves the whole store
accessible. The latter however is a bit more involved, because you need
to bind-mount each store path of the runtime closure of the service you
want to chroot.
This can be achieved using pkgs.closureInfo and a small derivation that
packs everything into a systemd unit, which later can be added to
systemd.packages.
However, this process is a bit tedious, so the changes here implement
this in a more generic way.
Now if you want to chroot a systemd service, all you need to do is:
{
systemd.services.myservice = {
description = "My Shiny Service";
wantedBy = [ "multi-user.target" ];
confinement.enable = true;
serviceConfig.ExecStart = "${pkgs.myservice}/bin/myservice";
};
}
If more than the dependencies for the ExecStart* and ExecStop* (which
btw. also includes script and {pre,post}Start) need to be in the chroot,
it can be specified using the confinement.packages option. By default
(which uses the full-apivfs confinement mode), a user namespace is set
up as well and /proc, /sys and /dev are mounted appropriately.
In addition - and by default - a /bin/sh executable is provided, which
is useful for most programs that use the system() C library call to
execute commands via shell.
Unfortunately, there are a few limitations at the moment. The first
being that DynamicUser doesn't work in conjunction with tmpfs, because
systemd seems to ignore the TemporaryFileSystem option if DynamicUser is
enabled. I started implementing a workaround to do this, but I decided
to not include it as part of this pull request, because it needs a lot
more testing to ensure it's consistent with the behaviour without
DynamicUser.
The second limitation/issue is that RootDirectoryStartOnly doesn't work
right now, because it only affects the RootDirectory option and doesn't
include/exclude the individual bind mounts or the tmpfs.
A quirk we do have right now is that systemd tries to create a /usr
directory within the chroot, which subsequently fails. Fortunately, this
is just an ugly error and not a hard failure.
The changes also come with a changelog entry for NixOS 19.03, which is
why I asked for a vote of the NixOS 19.03 stable maintainers whether to
include it (I admit it's a bit late a few days before official release,
sorry for that):
@samueldr:
Via pull request comment[1]:
+1 for backporting as this only enhances the feature set of nixos,
and does not (at a glance) change existing behaviours.
Via IRC:
new feature: -1, tests +1, we're at zero, self-contained, with no
global effects without actively using it, +1, I think it's good
@lheckemann:
Via pull request comment[2]:
I'm neutral on backporting. On the one hand, as @samueldr says,
this doesn't change any existing functionality. On the other hand,
it's a new feature and we're well past the feature freeze, which
AFAIU is intended so that new, potentially buggy features aren't
introduced in the "stabilisation period". It is a cool feature
though? :)
A few other people on IRC didn't have opposition either against late
inclusion into NixOS 19.03:
@edolstra: "I'm not against it"
@Infinisil: "+1 from me as well"
@grahamc: "IMO its up to the RMs"
So that makes +1 from @samueldr, 0 from @lheckemann, 0 from @edolstra
and +1 from @Infinisil (even though he's not a release manager) and no
opposition from anyone, which is the reason why I'm merging this right
now.
I also would like to thank @Infinisil, @edolstra and @danbst for their
reviews.
[1]: https://github.com/NixOS/nixpkgs/pull/57519#issuecomment-477322127
[2]: https://github.com/NixOS/nixpkgs/pull/57519#issuecomment-477548395
eb90d97009 broke nslcd, as /run/nslcd was
created/chowned as root user, while nslcd wants to do parts as nslcd
user.
This commit changes the nslcd to run with the proper uid/gid from the
start (through User= and Group=), so the RuntimeDirectory has proper
permissions, too.
In some cases, secrets are baked into nslcd's config file during startup
(so we don't want to provide it from the store).
This config file is normally hard-wired to /etc/nslcd.conf, but we don't
want to use PermissionsStartOnly anymore (#56265), and activation
scripts are ugly, so redirect /etc/nslcd.conf to /run/nslcd/nslcd.conf,
which now gets provisioned inside ExecStartPre=.
This change requires the files referenced to in
users.ldap.bind.passwordFile and users.ldap.daemon.rootpwmodpwFile to be
readable by the nslcd user (in the non-nslcd case, this was already the
case for users.ldap.bind.passwordFile)
fixes#57783
First of all, the reason I added this to the "highlights" section is
that we want users to be aware of these options, because in the end we
really want to decrease the attack surface of NixOS services and this is
a step towards improving that situation.
The reason why I'm adding this to the changelog of the NixOS 19.03
release instead of 19.09 is that it makes backporting services that use
these options easier. Doing the backport of the confinement module after
the official release would mean that it's not part of the release
announcement and potentially could fall under the radar of most users.
These options and the whole module also do not change anything in
existing services or affect other modules, so they're purely optional.
Adding this "last minute" to the 19.03 release doesn't hurt and is
probably a good preparation for the next months where we hopefully
confine as much services as we can :-)
I also have asked @samueldr and @lheckemann, whether they're okay with
the inclusion in 19.03. While so far only @samueldr has accepted the
change, we can still move the changelog entry to the NixOS 19.09 release
notes in case @lheckemann rejects it.
Signed-off-by: aszlig <aszlig@nix.build>
So far we had MountFlags = "private", but as @Infinisil has correctly
noticed, there is a dedicated PrivateMounts option, which does exactly
that and is better integrated than providing raw mount flags.
When checking for the reason why I used MountFlags instead of
PrivateMounts, I found that at the time I wrote the initial version of
this module (Mar 12 06:15:58 2018 +0100) the PrivateMounts option didn't
exist yet and has been added to systemd in Jun 13 08:20:18 2018 +0200.
Signed-off-by: aszlig <aszlig@nix.build>
Noted by @Infinisil on IRC:
infinisil: Question regarding the confinement PR
infinisil: On line 136 you do different things depending on
RootDirectoryStartOnly
infinisil: But on line 157 you have an assertion that disallows that
option being true
infinisil: Is there a reason behind this or am I missing something
I originally left this in so that once systemd supports that, we can
just flip a switch and remove the assertion and thus support
RootDirectoryStartOnly for our confinement module.
However, this doesn't seem to be on the roadmap for systemd in the
foreseeable future, so I'll just remove this, especially because it's
very easy to add it again, once it is supported.
Signed-off-by: aszlig <aszlig@nix.build>
These are all `mkRenamedOptionModule` ones from 2015 (there are none
from 2014). `mkAliasOptionModule` from 2015 were left in because those
don't give any warning at all.
users.ldap.daemon.rootpwmodpw -> users.ldap.daemon.rootpwmodpwFile
users.ldap.bind.password -> users.ldap.bind.passwordFile
as users.ldap.daemon.rootpwmodpw never was part of a release, no
mkRenamedOptionModule is introduced.
Regression introduced by c94005358c.
The commit introduced declarative docker containers and subsequently
enables docker whenever any declarative docker containers are defined.
This is done via an option with type "attrsOf somesubmodule" and a check
on whether the attribute set is empty.
Unfortunately, the check was whether a *list* is empty rather than
wether an attribute set is empty, so "mkIf (cfg != [])" *always*
evaluates to true and thus subsequently enables docker by default:
$ nix-instantiate --eval nixos --arg configuration {} \
-A config.virtualisation.docker.enable
true
Fixing this is simply done by changing the check to "mkIf (cfg != {})".
Tested this by running the "docker-containers" NixOS test and it still
passes.
Signed-off-by: aszlig <aszlig@nix.build>
Cc: @benley, @danbst, @Infinisil, @nlewo
This otherwise does not eval `:tested` any more, which means no nixos
channel updates.
Regression comes from 0eb6d0735f (#57751)
which added an assertion stopping the use of `autoResize` when the
filesystem cannot be resized automatically.
* WIP: Run Docker containers as declarative systemd services
* PR feedback round 1
* docker-containers: add environment, ports, user, workdir options
* docker-containers: log-driver, string->str, line wrapping
* ExecStart instead of script wrapper, %n for container name
* PR feedback: better description and example formatting
* Fix docbook formatting (oops)
* Use a list of strings for ports, expand documentation
* docker-continers: add a simple nixos test
* waitUntilSucceeds to avoid potential weird async issues
* Don't enable docker daemon unless we actually need it
* PR feedback: leave ExecReload undefined
IPv6 container support broke a while ago and we didn't notice it. Making
them part of the (small) release test set should fix that. At this point
in time they should be granted the same amount of importance as the
legacy IP tests.
Prior to this commit an installation over serial via syslinux would
involve:
1. setting bitrate to BIOS's bitrate (typically 115200)
2. setting bitrate to syslinux's bitrate (38400)
3. setting bitrate to stty's bitrate (115200)
By changing syslinux's bitrate to 115200, an installation over serial
is a smoother experience, and consistent with the GRUB2 installation
which is also 115200 bps.
[root@nixos:~]# stty
speed 115200 baud; line = 0;
-brkint ixoff iutf8
-iexten
In a future commit I will add default serial terminals to the syslinux
kernel lines.
Previously this module precluded use of storage backends other than
`filesystem`. It is now possible to configure another storage backend
manually by setting `services.dockerRegistry.storagePath` to `null` and
configuring the other backend via `extraConfig`.