This patch implements basic support for OpenBSD-style pledge().
pledge() allows programs to incrementally reduce their set of allowed
syscalls, which are divided into categories that each make up a subset
of POSIX functionality.
If a process violates one of its pledged promises by attempting to call
a syscall that it previously said it wouldn't call, the process is
immediately terminated with an uncatchable SIGABRT.
This is by no means complete, and we'll need to add more checks in
various places to ensure that promises are being kept.
But it is pretty cool! :^)
We now support these mount flags:
* MS_NODEV: disallow opening any devices from this file system
* MS_NOEXEC: disallow executing any executables from this file system
* MS_NOSUID: ignore set-user-id bits on executables from this file system
The fourth flag, MS_BIND, is defined, but currently ignored.
O_EXEC is mentioned by POSIX, so let's have it. Currently, it is only used
inside the kernel to ensure the process has the right permissions when opening
an executable.
At the moment, the actual flags are ignored, but we correctly propagate them all
the way from the original mount() syscall to each custody that resides on the
mounted FS.
While I was updating syscalls to stop passing null-terminated strings,
I added some helpful struct types:
- StringArgument { const char*; size_t; }
- ImmutableBuffer<Data, Size> { const Data*; Size; }
- MutableBuffer<Data, Size> { Data*; Size; }
The Process class has some convenience functions for validating and
optionally extracting the contents from these structs:
- get_syscall_path_argument(StringArgument)
- validate_and_copy_string_from_user(StringArgument)
- validate(ImmutableBuffer)
- validate(MutableBuffer)
There's still so much code around this and I'm wondering if we should
generate most of it instead. Possible nice little project.
The chroot() syscall now allows the superuser to isolate a process into
a specific subtree of the filesystem. This is not strictly permanent,
as it is also possible for a superuser to break *out* of a chroot, but
it is a useful mechanism for isolating unprivileged processes.
The VFS now uses the current process's root_directory() as the root for
path resolution purposes. The root directory is stored as an uncached
Custody in the Process object.
Note that I'm developing some helper types in the Syscall namespace as
I go here. Once I settle on some nice types, I will convert all the
other syscalls to use them as well.
The userspace execve() wrapper now measures all the strings and puts
them in a neat and tidy structure on the stack.
This way we know exactly how much to copy in the kernel, and we don't
have to use the SMAP-violating validate_read_str(). :^)
If we pass a null path to these syscall wrappers, just return EFAULT
directly from the wrapper instead of segfaulting by calling strlen().
This is a compromise, since we now have to pass the path length to the
kernel, so we can't rely on the kernel to tell us that the path is at
a bad memory address.
Separate some responsibilities:
ELFDynamicLoader is responsible for loading elf binaries from disk and
performing relocations, calling init functions, and eventually calling
finalizer functions.
ELFDynamicObject is a helper class to parse the .dynamic section of an
elf binary, or the table of Elf32_Dyn entries at the _DYNAMIC symbol.
ELFDynamicObject now owns the helper classes for Relocations, Symbols,
Sections and the like that ELFDynamicLoader will use to perform
relocations and symbol lookup.
Because these new helpers are constructed from offsets into the .dynamic
section within the loaded .data section of the binary, we don't need the
ELFImage for nearly as much of the loading processes as we did before.
Therefore we can remove most of the extra DynamicXXX classes and just
keep the one that lets us find the location of _DYNAMIC in the new ELF.
And finally, since we changed the name of the class that dlopen/dlsym
care about, we need to compile/link and use the new ELFDynamicLoader
class in LibC.
This code never worked, as was never used for anything. We can build
a much better SHM implementation on top of TmpFS or similar when we
get to the point when we need one.
ELFDynamicObject::load looks a lot better with all the steps
re-organized into helpers.
Add plt_trampoline.S to handle PLT fixups for lazy loading.
Add the needed trampoline-trampolines in ELFDynamicObject to get to
the proper relocations and to return the symbol back to the assembly
method to call into from the PLT once we return back to user code.
This patch introduces a syscall:
int set_thread_boost(int tid, int amount)
You can use this to add a permanent boost value to the effective thread
priority of any thread with your UID (or any thread in the system if
you are the superuser.)
This is quite crude, but opens up some interesting opportunities. :^)
Threads now have numeric priorities with a base priority in the 1-99
range.
Whenever a runnable thread is *not* scheduled, its effective priority
is incremented by 1. This is tracked in Thread::m_extra_priority.
The effective priority of a thread is m_priority + m_extra_priority.
When a runnable thread *is* scheduled, its m_extra_priority is reset to
zero and the effective priority returns to base.
This means that lower-priority threads will always eventually get
scheduled to run, once its effective priority becomes high enough to
exceed the base priority of threads "above" it.
The previous values for ThreadPriority (Low, Normal and High) are now
replaced as follows:
Low -> 10
Normal -> 30
High -> 50
In other words, it will take 20 ticks for a "Low" priority thread to
get to "Normal" effective priority, and another 20 to reach "High".
This is not perfect, and I've used some quite naive data structures,
but I think the mechanism will allow us to build various new and
interesting optimizations, and we can figure out better data structures
later on. :^)
Lock each directory before entering it so when using -j, the same
dependency isn't built more than once at a time.
This doesn't get full -j parallelism though, since one make child
will be sitting idle waiting for flock to receive its lock and
continue making (which should then do nothing since it will have
been built already). Unfortunately there's not much that can be
done to fix that since it can't proceed until its dependency is
built by another make process.