ladybird

mirror of https://github.com/LadybirdBrowser/ladybird.git synced 2024-11-10 13:00:29 +03:00

Author	SHA1	Message	Date
Sergey Bugaev	d395b93b15	Kernel: Misc tweaks	2020-05-29 07:53:30 +02:00
Sergey Bugaev	fdb71cdf8f	Kernel: Support read-only filesystem mounts This adds support for MS_RDONLY, a mount flag that tells the kernel to disallow any attempts to write to the newly mounted filesystem. As this flag is per-mount, and different mounts of the same filesystems (such as in case of bind mounts) can have different mutability settings, you have to go through a custody to find out if the filesystem is mounted read-only, instead of just asking the filesystem itself whether it's inherently read-only. This also adds a lot of checks we were previously missing; and moves some of them to happen after more specific checks (such as regular permission checks). One outstanding hole in this system is sys$mprotect(PROT_WRITE), as there's no way we can know if the original file description this region has been mounted from had been opened through a readonly mount point. Currently, we always allow such sys$mprotect() calls to succeed, which effectively allows anyone to circumvent the effect of MS_RDONLY. We should solve this one way or another.	2020-05-29 07:53:30 +02:00
Sergey Bugaev	b6845de3f6	Kernel: Fix error case in Process::create_user_process() If we fail to exec() the target executable, don't leak the thread (this actually triggers an assertion when destructing the process), and print an error message.	2020-05-29 07:53:30 +02:00
Sergey Bugaev	6627c3ea3a	Kernel: Fix some failing assertions When mounting Ext2FS, we don't care if the file has a custody (it doesn't if it's a device, which is a common case). When doing a bind-mount, we do need a custody; if none is provided, let's return an error instead of crashing.	2020-05-29 07:53:30 +02:00
Sergey Bugaev	f945d7c358	Kernel: Always require read access when mmaping a file POSIX says, "The file descriptor fildes shall have been opened with read permission, regardless of the protection options specified."	2020-05-29 07:53:30 +02:00
Sergey Bugaev	602c3fdb3a	AK: Rename FileSystemPath -> LexicalPath And move canonicalized_path() to a static method on LexicalPath. This is to make it clear that FileSystemPath/canonicalized_path() only perform lexical canonicalization.	2020-05-26 14:35:10 +02:00
Sergey Bugaev	cddaeb43d3	Kernel: Introduce "sigaction" pledge You now have to pledge "sigaction" to change signal handlers/dispositions. This is to prevent malicious code from messing with assertions (and segmentation faults), which are normally expected to instantly terminate the process but can do other things if you change signal disposition for them.	2020-05-26 14:35:10 +02:00
Angel	6137475c39	Kernel: fix assertion on readlink() syscall The is_error() check on the KResultOr returned when reading the link target had a stray ! operator which causes link resolution to crash the kernel with an assertion error.	2020-05-26 12:45:01 +02:00
Brian Gianforcaro	6a74af8063	Kernel: Plumb KResult through FileDescription::read_entire_file() implementation. Allow file system implementation to return meaningful error codes to callers of the FileDescription::read_entire_file(). This allows both Process::sys$readlink() and Process::sys$module_load() to return more detailed errors to the user.	2020-05-26 10:15:40 +02:00
Andreas Kling	dd924b730a	Kernel+LibC: Fix various build issues introduced by ssize_t Now that ssize_t is derived from size_t, we have to	2020-05-23 15:27:33 +02:00
Andreas Kling	b3736c1b1e	Kernel: Use a FlatPtr for the "argument" to ioctl() Since it's often used to pass pointers, it should really be a FlatPtr.	2020-05-23 15:25:43 +02:00
Sergey Bugaev	7541122206	Kernel+LibC: Switch isatty() to use a fcntl() We would want it to work with only stdio pledged.	2020-05-20 08:31:31 +02:00
AnotherTest	8582a06899	Kernel + LibC: Handle running processes in do_waitid()	2020-05-17 11:58:08 +02:00
AnotherTest	9d54f21859	Kernel: wait() should not block if WNOHANG is specified	2020-05-17 11:58:08 +02:00
Andreas Kling	f7a75598bb	Kernel: Remove Process::any_thread() This was a holdover from the old times when each Process had a special main thread with TID 0. Using it was a total crapshoot since it would just return whichever thread was first on the process's thread list. Now that I've removed all uses of it, we don't need it anymore. :^)	2020-05-16 12:40:15 +02:00
Andreas Kling	0e7f85c24a	Kernel: Sending a signal to a process now goes to the main thread Instead of falling back to the suspicious "any_thread()" mechanism, just fail with ESRCH if you try to kill() a PID that doesn't have a corresponding TID.	2020-05-16 12:33:48 +02:00
Andreas Kling	21d5f4ada1	Kernel: Absorb LibBareMetal back into the kernel This was supposed to be the foundation for some kind of pre-kernel environment, but nobody is working on it right now, so let's move everything back into the kernel and remove all the confusion.	2020-05-16 12:00:04 +02:00
Andreas Kling	204fb27333	Kernel: Remove now-unused KernelInfoPage.h	2020-05-16 11:34:54 +02:00
Andreas Kling	2dc051c866	Kernel: Remove sys$getdtablesize() I'm not sure why this was a syscall. If we need this we can add it in LibC as a wrapper around sysconf(_SC_OPEN_MAX).	2020-05-16 11:34:01 +02:00
Andreas Kling	426c4e387d	Kernel: Use copy_to_user() in sys$gettimeofday()	2020-05-16 11:34:01 +02:00
Andreas Kling	3a92d0828d	Kernel: Remove the "kernel info page" used for fast gettimeofday() We stopped using gettimeofday() in Core::EventLoop a while back, in favor of clock_gettime() for monotonic time. Maintaining an optimization for a syscall we're not using doesn't make a lot of sense, so let's go back to the old-style sys$gettimeofday().	2020-05-16 11:33:59 +02:00
Sergey Bugaev	752617cbb2	Kernel: Disallow opening socket files You can still open files that have sockets attached to them from inside the kernel via VFS::open() (and in fact, that is what LocalSocket itself uses), but trying to do that from userspace using open() will now fail with ENXIO.	2020-05-15 11:43:58 +02:00
Andreas Kling	5bfd893292	Kernel+Userland: Add "settime" pledge promise for setting system time We now require the "settime" promise from pledged processes who want to change the system time.	2020-05-08 22:54:17 +02:00
Andreas Kling	1cddb1055f	Kernel: Only allow superuser to call sys$clock_settime()	2020-05-08 22:47:21 +02:00
Andreas Kling	652b22ee9c	Kernel: Remove SmapDisabler in sys$clock_settime()	2020-05-08 22:47:03 +02:00
Andreas Kling	55f61c0004	Kernel: Add for_each_vmobject_of_type<T> This makes iterating over a specific type of VMObjects a bit nicer.	2020-05-08 22:10:47 +02:00
Andreas Kling	042b1f6814	Kernel: Propagate failure to commit VM regions in more places Ultimately we should not panic just because we can't fully commit a VM region (by populating it with physical pages.) This patch handles some of the situations where commit() can fail.	2020-05-08 21:47:08 +02:00
Andreas Kling	6fe83b0ac4	Kernel: Crash the current process on OOM (instead of panicking kernel) This patch adds PageFaultResponse::OutOfMemory which informs the fault handler that we were unable to allocate a necessary physical page and cannot continue. In response to this, the kernel will crash the current process. Because we are OOM, we can't symbolicate the crash like we normally would (since the ELF symbolication code needs to allocate), so we also communicate to Process::crash() that we're out of memory. Now we can survive "allocate 300 MB" (only the allocate process dies.) This is definitely not perfect and can easily end up killing a random innocent other process who happened to allocate one page at the wrong time, but it's a lot better than panicking on OOM. :^)	2020-05-06 22:28:23 +02:00
Ben Wiederhake	dce3faff08	Kernel: Don't crash on invalid fcntl	2020-05-03 22:46:28 +02:00
Michael Lelli	58a34fbe09	Kernel: Fix pledge syscall applying new pledges when it fails (#2076) If the exec promises fail to apply, then the normal promises should not apply either. Add a test for this fixed functionality.	2020-05-03 00:41:18 +02:00
Brian Gianforcaro	25a620a573	Kernel: Enable timeout support for sys$futex(FUTEX_WAIT) Utilize the new Thread::wait_on timeout parameter to implement timeout support for FUTEX_WAIT. As we compute the relative time from the user specified absolute time, we try to delay that computation as long as possible before we call into Thread::wait_on(..). To enable this, a small bit of refactoring was done to pull futex_queue fetching out and to separate timeout fetching from its calculation.	2020-04-26 21:31:52 +02:00
Andreas Kling	fb826aa59a	Kernel: Make sys$sethostname() superuser-only Also take the hostname string lock exclusively.	2020-04-26 15:51:57 +02:00
Luke Payne	f191b84b50	Kernel: Added the ability to set the hostname via new syscall Userland/hostname: Now takes parameter to set the hostname LibC/unistd: Added sethostname function	2020-04-26 12:59:09 +02:00
Brian Gianforcaro	0f3990cfa3	Kernel: Support signaling all processes with pid == -1 This is a special case that was previously not implemented. The idea is that you can dispatch a signal to all other processes the calling process has access to. There was some minor refactoring to make the self signal logic into a function so it could easily be re-used from do_killall.	2020-04-26 12:54:10 +02:00
Brian Gianforcaro	1f64e3eb16	Kernel: Implement FUTEX_WAKE of arbitrary count. Previously we just woke all waiters no matter how many were requested. Fix this by implementing WaitQueue::wake_n(..).	2020-04-26 12:35:35 +02:00
Drew Stratford	4a37362249	LibPthread: implicitly call pthread_exit on return from start routine. Previously, when returning from a pthread's start_routine, we would segfault. Now we instead implicitly call pthread_exit as specified in the standard. pthread_create now creates a thread running the new pthread_create_helper, which properly manages the calling and exiting of the start_routine supplied to pthread_create. To accomplish this, the thread's stack initialization has been moved out of sys$create_thread and into the userspace function create_thread.	2020-04-25 16:51:35 +02:00
Itamar	edaa9c06d9	LibELF: Make ELF::Loader RefCounted	2020-04-20 17:25:50 +02:00
Sergey Bugaev	54550365eb	Kernel: Use shared locking mode in some places The notable piece of code that remains to be converted is Ext2FS.	2020-04-18 13:58:29 +02:00
Sergey Bugaev	f18d6610d3	Kernel: Don't include null terminator in sys$readlink() result POSIX says, "Conforming applications should not assume that the returned contents of the symbolic link are null-terminated." If we do include the null terminator into the returning string, Python believes it to actually be a part of the returned name, and gets unhappy about that later. This suggests other systems Python runs in don't include it, so let's do that too. Also, make our userspace support non-null-terminated realpath().	2020-04-14 18:40:24 +02:00
Andreas Kling	815b73bdcc	Kernel: Simplify sys$setgroups(0, ...) If we're dropping all groups, just clear the extra_gids and return.	2020-04-14 15:30:25 +02:00
Andreas Kling	9962db5bf8	Kernel: Remove SmapDisablers in {peek,poke}_user_data()	2020-04-14 09:52:49 +02:00
Itamar	3e9a7175d1	Debugger: Add DebugSession The DebugSession class wraps the usage of Ptrace. It is intended to be used by cli & gui debugger programs. Also, call objdump for disassembly	2020-04-13 00:53:22 +02:00
Itamar	aae3f7b914	Process: Fix siginfo for code CLD_STOPPED si_code, si_status were swapped	2020-04-13 00:53:22 +02:00
Itamar	9e51e295cf	ptrace: Add PT_SETREGS PT_SETREGS sets the registers of the traced thread. It can only be used when the tracee is stopped. Also, refactor ptrace. The implementation was getting long and cluttered the already large Process.cpp file. This commit moves the bulk of the implementation to Kernel/Ptrace.cpp, and factors out peek & poke to separate methods of the Process class.	2020-04-13 00:53:22 +02:00
Itamar	0431712660	ptrace: Stop a traced thread when it exits from execve This was a missing feature in the PT_TRACEME command. This feature allows the tracer to interact with the tracee before the tracee has started executing its program. It will be useful for automatically inserting a breakpoint at a debugged program's entry point.	2020-04-13 00:53:22 +02:00
Itamar	b306ac9b2b	ptrace: Add PT_POKE PT_POKE writes a single word to the tracee's address space. Some caveats: - If the user requests to write to an address in a read-only region, we temporarily change the page's protections to allow it. - If the user requests to write to a region that's backed by a SharedInodeVMObject, we replace the vmobject with a PrivateInodeVMObject.	2020-04-13 00:53:22 +02:00
Itamar	984ff93406	ptrace: Add PT_PEEK PT_PEEK reads a single word from the tracee's address space and returns it to the tracer.	2020-04-13 00:53:22 +02:00
Andreas Kling	c19b56dc99	Kernel+LibC: Add minherit() and MAP_INHERIT_ZERO This patch adds the minherit() syscall originally invented by OpenBSD. Only the MAP_INHERIT_ZERO mode is supported for now. If set on an mmap region, that region will be zeroed out on fork().	2020-04-12 20:22:26 +02:00
Andrew Kaster	61acca223f	LibELF: Move validation methods to their own file These validate_elf_* methods really had no business being static methods of ELF::Image. Now that the ELF namespace exists, it makes sense to just move them to be free functions in the namespace.	2020-04-11 22:41:05 +02:00
Andrew Kaster	21b5909dc6	LibELF: Move ELF classes into namespace ELF This is for consistency with other namespace changes that were made a while back to the other libraries :)	2020-04-11 22:41:05 +02:00
Andreas Kling	dec352dacd	Kernel: Ignore zero-length PROGBITS sections in sys$module_load()	2020-04-10 16:36:01 +02:00
Andreas Kling	c06d5ef114	Kernel+LibC: Remove ESUCCESS There's no official ESUCCESS==0 errno code, and it keeps breaking the Lagom build when we use it, so let's just say 0 instead.	2020-04-10 13:09:35 +02:00
Andreas Kling	871d450b93	Kernel: Remove redundant "ACPI" from filenames in ACPI/	2020-04-09 18:17:27 +02:00
Andreas Kling	4644217094	Kernel: Remove "non-operational" ACPI parser state If we don't support ACPI, just don't instantiate an ACPI parser. This is way less confusing than having a special parser class whose only purpose is to do nothing. We now search for the RSDP in ACPI::initialize() instead of letting the parser constructor do it. This allows us to defer the decision to create a parser until we're sure we can make a useful one.	2020-04-09 17:19:11 +02:00
Andreas Kling	dc7340332d	Kernel: Update cryptically-named functions related to symbolication	2020-04-08 17:19:46 +02:00
Liav A	23fb985f02	Kernel & Userland: Allow to mount image files formatted with Ext2FS	2020-04-06 15:36:36 +02:00
Andreas Kling	9ae3cced76	Revert "Kernel & Userland: Allow to mount image files formatted with Ext2FS" This reverts commit `a60ea79a41`. Reverting these changes since they broke things. Fixes #1608.	2020-04-03 21:28:57 +02:00
Liav A	a60ea79a41	Kernel & Userland: Allow to mount image files formatted with Ext2FS	2020-04-02 12:03:08 +02:00
Itamar	6b74d38aab	Kernel: Add 'ptrace' syscall This commit adds a basic implementation of the ptrace syscall, which allows one process (the tracer) to control another process (the tracee). While a process is being traced, it is stopped whenever a signal is received (other than SIGCONT). The tracer can start tracing another thread with PT_ATTACH, which causes the tracee to stop. From there, the tracer can use PT_CONTINUE to continue the execution of the tracee, or use other request codes (which haven't been implemented yet) to modify the state of the tracee. Additional request codes are PT_SYSCALL, which causes the tracee to continue execution but stop at the next entry or exit from a syscall, and PT_GETREGS which fetches the last saved register set of the tracee (can be used to inspect syscall arguments and return value). A special request code is PT_TRACE_ME, which is issued by the tracee and causes it to stop when it calls execve and wait for the tracer to attach.	2020-03-28 18:27:18 +01:00
Shannon Booth	757c14650f	Kernel: Simplify process assertion checking if region is in range Let's use the helper function for this :)	2020-03-22 08:51:40 +01:00
Liav A	b536547c52	Process: Use monotonic time for timeouts	2020-03-19 15:48:00 +01:00
Liav A	4484513b45	Kernel: Add new syscall to allow changing the system date	2020-03-19 15:48:00 +01:00
Liav A	9db291d885	Kernel: Introduce the new Time management subsystem This new subsystem includes better abstractions of how time will be handled in the OS. We take advantage of the existing RTC timer to aid in keeping time synchronized. This stands in contrast to how we handled time-keeping in the kernel, where the PIT was responsible for that function in addition to updating the scheduler about ticks. With that new advantage, we can easily change the ticking dynamically and still keep the time synchronized. In the process context, we no longer use a fixed declaration of TICKS_PER_SECOND, but we call the TimeManagement singleton class to provide us the right value. This allows us to use dynamic ticking in the future, a feature known as tickless kernel. The scheduler no longer calculates real time (Unix time) by itself, and just calls the TimeManagement singleton class to provide the value. Also, we can use 2 new boot arguments: - the "time" boot argument accepts either the value "modern", or "legacy". If "modern" is specified, the time management subsystem will try to set up HPET. Otherwise, for "legacy" value, the time subsystem will revert to use the PIT & RTC, leaving HPET disabled. If this boot argument is not specified, the default pattern is to try to set up HPET. - the "hpet" boot argument accepts either the value "periodic" or "nonperiodic". If "periodic" is specified, the HPET will scan for periodic timers, and will assert if none are found. If only one is found, that timer will be assigned for the time-keeping task. If more than one is found, both time-keeping task & scheduler-ticking task will be assigned to periodic timers. If this boot argument is not specified, the default pattern is to try to scan for HPET periodic timers. This boot argument has no effect if HPET is disabled. 
In hardware context, PIT & RealTimeClock classes merely inherit from the HardwareTimer class, and they allow using the old i8254 (PIT) and RTC devices, managing them via IO ports. By default, the RTC will be programmed to a frequency of 1024Hz. The PIT will be programmed to a frequency close to 1000Hz. About HPET, depending on whether we need to scan for periodic timers or not, we try to set a frequency close to 1000Hz for the time-keeping timer and scheduler-ticking timer. Also, if possible, we try to enable the Legacy replacement feature of the HPET. This feature, if it exists, instructs the chipset to disconnect both i8254 (PIT) and RTC. This behavior is observable on QEMU, and was verified against the source code: `ce967e2f33` The HPETComparator class inherits from the HardwareTimer class, and is responsible for an individual HPET comparator, which is essentially a timer. Therefore, it needs to call the singleton HPET class to perform HPET-related operations. The new abstraction of Hardware timers brings an opportunity for more new features in the foreseeable future. For example, we can change the callback function of each hardware timer, which makes it possible to swap missions between hardware timers, or to use a hardware timer for other temporary missions (e.g. calibrating the LAPIC timer, measuring the CPU frequency, etc).	2020-03-19 15:48:00 +01:00
Alex Muscar	d013753f83	Kernel: Resolve relative paths when there is a veil (#1474)	2020-03-19 09:57:34 +01:00
Andreas Kling	ad92a1e4bc	Kernel: Add sys$get_stack_bounds() for finding the stack base & size This will be useful when implementing conservative garbage collection.	2020-03-16 19:06:33 +01:00
Andreas Kling	3803196edb	Kernel: Get rid of SmapDisabler in sys$fstat()	2020-03-10 13:34:24 +01:00
Liav A	0f45a1b5e7	Kernel: Allow to reboot in ACPI via PCI or MMIO access Also, we determine if ACPI reboot is supported by checking the FADT flags' field.	2020-03-09 10:53:13 +01:00
Ben Wiederhake	b066586355	Kernel: Fix race in waitid This is similar to `28e1da344d` and `4dd4dd2f3c`. The crux is that wait verifies that the outvalue (siginfo* infop) is writable before waiting, and writes to it after waiting. In the meantime, a concurrent thread can make the output region unwritable, e.g. by deallocating it.	2020-03-08 14:12:12 +01:00
Ben Wiederhake	d8cd4e4902	Kernel: Fix race in select This is similar to `28e1da344d` and `4dd4dd2f3c`. The crux is that select verifies that the filedescriptor sets are writable before blocking, and writes to them after blocking. In the meantime, a concurrent thread can make the output buffer unwritable, e.g. by deallocating it.	2020-03-08 14:12:12 +01:00
Andreas Kling	b1058b33fb	AK: Add global FlatPtr typedef. It's u32 or u64, based on sizeof(void*) Use this instead of uintptr_t throughout the codebase. This makes it possible to pass a FlatPtr to something that has u32 and u64 overloads.	2020-03-08 13:06:51 +01:00
Andreas Kling	c6693f9b3a	Kernel: Simplify a bunch of dbg() and klog() calls LogStream can handle VirtualAddress and PhysicalAddress directly.	2020-03-06 15:00:44 +01:00
Liav A	85eb1d26d5	Kernel: Run clang-format on Process.cpp & ACPIDynamicParser.h	2020-03-05 19:04:04 +01:00
Liav A	1b8cd6db7b	Kernel: Call ACPI reboot method first if possible Now we call ACPI reboot method first if possible, and if ACPI reboot is not available, we attempt to reboot via the keyboard controller.	2020-03-05 19:04:04 +01:00
Ben Wiederhake	4dd4dd2f3c	Kernel: Fix race in clock_nanosleep This is a complete fix of clock_nanosleep, because the thread holds the process lock again when returning from sleep()/sleep_until(). Therefore, no further concurrent invalidation can occur.	2020-03-03 20:13:32 +01:00
Liav A	0fc60e41dd	Kernel: Use klog() instead of kprintf() Also, duplicate data in dbg() and klog() calls were removed. In addition, leakage of virtual address to kernel log is prevented. This is done by replacing kprintf() calls to dbg() calls with the leaked data instead. Also, other kprintf() calls were replaced with klog().	2020-03-02 22:23:39 +01:00
Andreas Kling	47beab926d	Kernel: Remove ability to create kernel-only regions at user addresses This was only used by the mechanism for mapping executables into each process's own address space. Now that we remap executables on demand when needed for symbolication, this can go away.	2020-03-02 11:20:34 +01:00
Andreas Kling	e56f8706ce	Kernel: Map executables at a kernel address during ELF load This is both simpler and more robust than mapping them in the process address space.	2020-03-02 11:20:34 +01:00
Andreas Kling	678c87087d	Kernel: Load executables on demand when symbolicating Previously we would map the entire executable of a program in its own address space (but make it unavailable to userspace code.) This patch removes that and changes the symbolication code to remap the executable on demand (and into the kernel's own address space instead of the process address space.) This opens up a couple of further simplifications that will follow.	2020-03-02 11:20:34 +01:00
Andreas Kling	0acac186fb	Kernel: Make the "entire executable" region shared This makes Region::clone() do the right thing with it on fork().	2020-03-02 06:13:29 +01:00
Andreas Kling	5c2a296a49	Kernel: Mark read-only PT_LOAD mappings as shared regions This makes Region::clone() do the right thing for these now that we differentiate based on Region::is_shared().	2020-03-01 21:26:36 +01:00
Andreas Kling	ecfde5997b	Kernel: Use SharedInodeVMObject for executables after all I had the wrong idea about this. Thanks to Sergey for pointing it out! Here's what he says (reproduced for posterity): > Private mappings protect the underlying file from the changes made by > you, not the other way around. To quote POSIX, "If MAP_PRIVATE is > specified, modifications to the mapped data by the calling process > shall be visible only to the calling process and shall not change the > underlying object. It is unspecified whether modifications to the > underlying object done after the MAP_PRIVATE mapping is established > are visible through the MAP_PRIVATE mapping." In practice that means > that the pages that were already paged in don't get updated when the > underlying file changes, and the pages that weren't paged in yet will > load the latest data at that moment. > The only thing MAP_FILE | MAP_PRIVATE is really useful for is mapping > a library and performing relocations; it's definitely useless (and > actively harmful for the system memory usage) if you only read from > the file. This effectively reverts `e2697c2ddd`.	2020-03-01 21:16:27 +01:00
Andreas Kling	bb7dd63f74	Kernel: Run clang-format on Process.cpp	2020-03-01 21:16:27 +01:00
Andreas Kling	687b52ceb5	Kernel: Name perfcore files "perfcore.PID" This way we can trace many things and we get one perfcore file per process instead of everyone trying to write to "perfcore"	2020-03-01 20:59:02 +01:00
Andreas Kling	fee20bd8de	Kernel: Remove some more harmless InodeVMObject miscasts	2020-03-01 12:27:03 +01:00
Andreas Kling	95e3aec719	Kernel: Fix harmless type miscast in Process::amount_clean_inode()	2020-03-01 11:23:23 +01:00
Andreas Kling	e2697c2ddd	Kernel: Use PrivateInodeVMObject for loading program executables This will be a memory usage pessimization until we actually implement CoW sharing of the memory pages with SharedInodeVMObject. However, it's a huge architectural improvement, so let's take it and improve on this incrementally. fork() should still be neutral, since all private mappings are CoW'ed.	2020-03-01 11:23:10 +01:00
Andreas Kling	88b334135b	Kernel: Remove some Region construction helpers It's now up to the caller to provide a VMObject when constructing a new Region object. This will make it easier to handle things going wrong, like allocation failures, etc.	2020-03-01 11:23:10 +01:00
Andreas Kling	4badef8137	Kernel: Return bytes written if sys$write() fails after writing some If we wrote anything we should just inform userspace that we did, and not worry about the error code. Userspace can call us again if it wants, and we'll give them the error then.	2020-02-29 18:42:35 +01:00
Andreas Kling	7cd1bdfd81	Kernel: Simplify some dbg() logging We don't have to log the process name/PID/TID, dbg() automatically adds that as a prefix to every line. Also we don't have to do .characters() on Strings passed to dbg() :^)	2020-02-29 13:39:06 +01:00
Andreas Kling	8fbdda5a2d	Kernel: Implement basic support for sys$mmap() with MAP_PRIVATE You can now mmap a file as private and writable, and the changes you make will only be visible to you. This works because internally a MAP_PRIVATE region is backed by a unique PrivateInodeVMObject instead of using the globally shared SharedInodeVMObject like we always did before. :^) Fixes #1045.	2020-02-28 23:25:00 +01:00
Andreas Kling	aa1e209845	Kernel: Remove some unnecessary indirection in InodeFile::mmap() InodeFile now directly calls Process::allocate_region_with_vmobject() instead of taking an awkward detour via a special Region constructor.	2020-02-28 20:29:14 +01:00
Andreas Kling	651417a085	Kernel: Split InodeVMObject into two subclasses We now have PrivateInodeVMObject and SharedInodeVMObject, corresponding to MAP_PRIVATE and MAP_SHARED respectively. Note that PrivateInodeVMObject is not used yet.	2020-02-28 20:20:35 +01:00
Andreas Kling	07a26aece3	Kernel: Rename InodeVMObject => SharedInodeVMObject	2020-02-28 20:07:51 +01:00
Andreas Kling	5af95139fa	Kernel: Make Process::m_master_tls_region a WeakPtr Let's not keep raw Region* variables around like that when it's so easy to avoid it.	2020-02-28 14:05:30 +01:00
Andreas Kling	b0623a0c58	Kernel: Remove SmapDisabler in sys$connect()	2020-02-28 13:20:26 +01:00
Andreas Kling	dcd619bd46	Kernel: Merge the shbuf_get_size() syscall into shbuf_get() Add an extra out-parameter to shbuf_get() that receives the size of the shared buffer. That way we don't need to make a separate syscall to get the size, which we always did immediately after.	2020-02-28 12:55:58 +01:00
Andreas Kling	f72e5bbb17	Kernel+LibC: Rename shared buffer syscalls to use a prefix This feels a lot more consistent and Unixy: create_shared_buffer() => shbuf_create() share_buffer_with() => shbuf_allow_pid() share_buffer_globally() => shbuf_allow_all() get_shared_buffer() => shbuf_get() release_shared_buffer() => shbuf_release() seal_shared_buffer() => shbuf_seal() get_shared_buffer_size() => shbuf_get_size() Also, "shared_buffer_id" is shortened to "shbuf_id" all around.	2020-02-28 12:55:58 +01:00
Liav A	db23703570	Process: Use dbg() instead of dbgprintf() Also, fix a bad dereference in sys$create_shared_buffer() method.	2020-02-27 13:05:12 +01:00
Andreas Kling	4997dcde06	Kernel: Always disable interrupts in do_killpg() Will caught an assertion when running "kill 9999999999999" :^)	2020-02-27 11:05:16 +01:00
Andreas Kling	4a293e8a21	Kernel: Ignore signals sent to threadless (zombie) processes If a process doesn't have any threads left, it's in a zombie state and we can't meaningfully send signals to it. So just ignore them. Fixes #1313.	2020-02-27 11:04:15 +01:00
Andreas Kling	0c1497846e	Kernel: Don't allow profiling a dead process Work towards #1313.	2020-02-27 10:42:31 +01:00
Cristian-Bogdan SIRB	05ce8586ea	Kernel: Fix ASSERTION failed in join_thread syscall set_interrupted_by_death was never called whenever a thread that had a joiner died, so the joiner remained with the joinee pointer there, resulting in an assertion fail in JoinBlocker: m_joinee pointed to a freed task, filled with garbage. Thread::current->m_joinee may not be valid after the unblock Properly return the joinee exit value to the joiner thread.	2020-02-27 10:09:44 +01:00
Andreas Kling	d28fa89346	Kernel: Don't assert on sys$kill() with pid=INT32_MIN On 32-bit platforms, INT32_MIN == -INT32_MIN, so we can't expect this to always work: if (pid < 0) positive_pid = -pid; // may still be negative! This happens because the -INT32_MIN expression becomes a long and is then truncated back to an int. Fixes #1312.	2020-02-27 10:02:04 +01:00
Cristian-Bogdan SIRB	717cd5015e	Kernel: Allow process with multiple threads to call exec and exit This allows a process which has more than 1 thread to call exec, even from a thread. This kills all the other threads, but it won't wait for them to finish, just makes sure that they are not in a running/runnable state. In the case where a thread does exec, the new program PID will be the thread TID, to keep the PID == TID in the new process. This introduces a new function inside the Process class, kill_threads_except_self which is called on exit() too (exit with multiple threads wasn't properly working either). Inside the Lock class, there is the need for a new function, clear_waiters, which removes all the waiters from the Process::big_lock. This is needed since after a exit/exec, there should be no other threads waiting for this lock, the threads should be simply killed. Only queued threads should wait for this lock at this point, since blocked threads are handled in set_should_die.	2020-02-26 13:06:40 +01:00
Andreas Kling	ceec1a7d38	AK: Make Vector use size_t for its size and capacity	2020-02-25 14:52:35 +01:00
Andreas Kling	d0f5b43c2e	Kernel: Use Vector::unstable_remove() when deallocating a region Process::m_regions is not sorted, so we can use unstable_remove() to avoid shifting the vector contents. :^)	2020-02-24 18:34:49 +01:00
Andreas Kling	30a8991dbf	Kernel: Make Region weakable and use WeakPtr<Region> instead of Region* This turns use-after-free bugs into null pointer dereferences instead.	2020-02-24 13:32:45 +01:00
Andreas Kling	79576f9280	Kernel: Clear the region lookup cache on exec() Each process has a 1-level lookup cache for fast repeated lookups of the same VM region (which tends to be the majority of lookups.) The cache is used by the following syscalls: munmap, madvise, mprotect and set_mmap_name. After a successful exec(), there could be a stale Region* in the lookup cache, and the new executable was able to manipulate it using a number of use-after-free code paths.	2020-02-24 12:37:27 +01:00
Liav A	895e874eb4	Kernel: Include the new PIT class in system components	2020-02-24 11:27:03 +01:00
Andreas Kling	fc5ebe2a50	Kernel: Disown shared buffers on sys$execve() When committing to a new executable, disown any shared buffers that the process was previously co-owning. Otherwise accessing the same shared buffer ID from the new program would cause the kernel to find a cached (and stale!) reference to the previous program's VM region corresponding to that shared buffer, leading to a Region* use-after-free. Fixes #1270.	2020-02-22 12:29:38 +01:00
Andreas Kling	ece2971112	Kernel: Disable profiling during the critical section of sys$execve() Since we're gonna throw away these stacks at the end of exec anyway, we might as well disable profiling before starting to mess with the process page tables. One less weird situation to worry about in the sampling code.	2020-02-22 11:09:03 +01:00
Andreas Kling	d7a13dbaa7	Kernel: Reset profiling state on exec() (but keep it going) We now log the new executable on exec() and throw away all the samples we've accumulated so far. But profiling keeps going.	2020-02-22 10:54:50 +01:00
Andreas Kling	2a679f228e	Kernel: Fix bitrotted DEBUG_IO logging	2020-02-21 15:49:30 +01:00
Andreas Kling	bead20c40f	Kernel: Remove SmapDisabler in sys$create_shared_buffer()	2020-02-18 14:12:39 +01:00
Andreas Kling	9aa234cc47	Kernel: Reset FPU state on exec()	2020-02-18 13:44:27 +01:00
Andreas Kling	a7dbb3cf96	Kernel: Use a FixedArray for a process's extra GIDs There's not really enough of these to justify using a HashTable.	2020-02-18 11:35:47 +01:00
Andreas Kling	48f7c28a5c	Kernel: Replace "current" with Thread::current and Process::current Suggested by Sergey. The currently running Thread and Process are now Thread::current and Process::current respectively. :^)	2020-02-17 15:04:27 +01:00
Andreas Kling	4f4af24b9d	Kernel: Tear down process address space during finalization Process teardown is divided into two main stages: finalize and reap. Finalization happens in the "Finalizer" kernel and runs with interrupts enabled, allowing destructors to take locks, etc. Reaping happens either in sys$waitid() or in the scheduler for orphans. The more work we can do in finalization, the better, since it's fully pre-emptible and reduces the amount of time the system runs without interrupts enabled.	2020-02-17 14:33:06 +01:00
Andreas Kling	31e1af732f	Kernel+LibC: Allow sys$mmap() callers to specify address alignment This is exposed via the non-standard serenity_mmap() call in userspace.	2020-02-16 12:55:56 +01:00
Andreas Kling	7a8be7f777	Kernel: Remove SmapDisabler in sys$accept()	2020-02-16 08:20:54 +01:00
Andreas Kling	7717084ac7	Kernel: Remove SmapDisabler in sys$clock_gettime()	2020-02-16 08:13:11 +01:00
Andreas Kling	16818322c5	Kernel: Reduce header dependencies of Process and Thread	2020-02-16 02:01:42 +01:00
Andreas Kling	e28809a996	Kernel: Add forward declaration header	2020-02-16 01:50:32 +01:00
Andreas Kling	1d611e4a11	Kernel: Reduce header dependencies of MemoryManager and Region	2020-02-16 01:33:41 +01:00
Andreas Kling	a356e48150	Kernel: Move all code into the Kernel namespace	2020-02-16 01:27:42 +01:00
Andreas Kling	1f55079488	Kernel: Remove SmapDisabler in sys$getgroups()	2020-02-16 00:30:00 +01:00
Andreas Kling	eb7b0c76a8	Kernel: Remove SmapDisabler in sys$setgroups()	2020-02-16 00:27:10 +01:00
Andreas Kling	0341ddc5eb	Kernel: Rename RegisterDump => RegisterState	2020-02-16 00:15:37 +01:00
Andreas Kling	580a94bc44	Kernel+LibC: Merge sys$stat() and sys$lstat() There is now only one sys$stat() instead of two separate syscalls.	2020-02-10 19:49:49 +01:00
Liav A	e559af2008	Kernel: Apply changes to use LibBareMetal definitions	2020-02-09 19:38:17 +01:00
Andreas Kling	7291370478	Kernel: Make File::truncate() take a u64 No point in taking a signed type here. We validate at the syscall layer and then pass around a u64 from then on.	2020-02-08 12:07:04 +01:00
Andreas Kling	88ea152b24	Kernel: Merge unnecessary DiskDevice class into BlockDevice	2020-02-08 02:20:03 +01:00
Andreas Kling	2b0b7cc5a4	Net: Add a basic sys$shutdown() implementation Calling shutdown prevents further reads and/or writes on a socket. We should do a few more things based on the type of socket, but this initial implementation just puts the basic mechanism in place. Work towards #428.	2020-02-08 00:54:43 +01:00
Andreas Kling	f3a5985bb2	Kernel: Remove two bad FIXME's We should absolutely not create a new thread in sys$exec(). There's also no sys$spawn() anymore.	2020-02-08 00:06:15 +01:00
Andreas Kling	d04fcccc90	Kernel: Truncate addresses stored by getsockname() and getpeername() If there's not enough space in the output buffer for the whole sockaddr we now simply truncate the address instead of returning EINVAL. This patch also makes getpeername() actually return the peer address rather than the local address.. :^)	2020-02-07 23:43:32 +01:00
Andreas Kling	dc18859695	Kernel: memset() all siginfo_t structs after creating them	2020-02-06 14:12:20 +01:00
Sergey Bugaev	1b866bbf42	Kernel: Fix sys$waitid(P_ALL, WNOHANG) return value According to POSIX, waitid() should fill si_signo and si_pid members with zeroes if there are no children that have already changed their state by the time of the call. Let's just fill the whole structure with zeroes to avoid leaking kernel memory.	2020-02-06 16:06:30 +03:00
Andreas Kling	75cb125e56	Kernel: Put sys$waitid() debug logging behind PROCESS_DEBUG	2020-02-05 19:14:56 +01:00
Sergey Bugaev	b3a24d732d	Kernel+LibC: Add sys$waitid(), and make sys$waitpid() wrap it sys$waitid() takes an explicit description of whether it's waiting for a single process with the given PID, all of the children, a group, etc., and returns its info as a siginfo_t. It also doesn't automatically imply WEXITED, which clears up the confusion in the kernel.	2020-02-05 18:14:37 +01:00
Andreas Kling	3879e5b9d4	Kernel: Start working on a syscall for logging performance events This patch introduces sys$perf_event() with two event types: - PERF_EVENT_MALLOC - PERF_EVENT_FREE After the first call to sys$perf_event(), a process will begin keeping these events in a buffer. When the process dies, that buffer will be written out to "perfcore" in the current directory unless that filename is already taken. This is probably not the best way to do this, but it's a start and will make it possible to start doing memory allocation profiling. :^)	2020-02-02 20:26:27 +01:00
Andreas Kling	934b1d8a9b	Kernel: Finalizer should not go back to sleep if there's more to do Before putting itself back on the wait queue, the finalizer task will now check if there's more work to do, and if so, do it first. :^) This patch also puts a bunch of process/thread debug logging behind PROCESS_DEBUG and THREAD_DEBUG since it was unbearable to debug this stuff with all the spam.	2020-02-01 10:56:17 +01:00
Andreas Kling	6634da31d9	Kernel: Disallow empty ranges in munmap/mprotect/madvise	2020-01-30 21:55:49 +01:00
Andreas Kling	31d1c82621	Kernel: Reject non-user address ranges in mmap/munmap/mprotect/madvise There's no valid reason to allow non-userspace address ranges in these system calls.	2020-01-30 21:51:27 +01:00
Andreas Kling	afd2b5a53e	Kernel: Copy "stack" and "mmap" bits when splitting a Region	2020-01-30 21:51:27 +01:00
Andreas Kling	c9e877a294	Kernel: Address validation helpers should take size_t, not ssize_t	2020-01-30 21:51:27 +01:00
Andreas Kling	c64904a483	Kernel: sys$readlink() should return the number of bytes written out	2020-01-27 21:50:51 +01:00
Andreas Kling	8b49804895	Kernel: sys$waitpid() only needs the waitee thread in the stopped case If the waitee process is dead, we don't need to inspect the thread. This fixes an issue with sys$waitpid() failing before reap() since dead processes will have no remaining threads alive.	2020-01-27 21:21:48 +01:00
Andreas Kling	f4302b58fb	Kernel: Remove SmapDisablers in sys$getsockname() and sys$getpeername() Instead use the user/kernel copy helpers to only copy the minimum stuff needed to/from userspace. Based on work started by Brian Gianforcaro.	2020-01-27 21:11:36 +01:00
Andreas Kling	5163c5cc63	Kernel: Expose the signal that stopped a thread via sys$waitpid()	2020-01-27 20:47:10 +01:00
Andreas Kling	638fe6f84a	Kernel: Disable interrupts while looking into the thread table There was a race window in a bunch of syscalls between calling Thread::from_tid() and checking if the found thread was in the same process as the calling thread. If the found thread object was destroyed at that point, there was a use-after-free that could be exploited by filling the kernel heap with something that looked like a thread object.	2020-01-27 14:04:57 +01:00
Andreas Kling	c1f74bf327	Kernel: Never validate access to the kmalloc memory range Memory validation is used to verify that user syscalls are allowed to access a given memory range. Ring 0 threads never make syscalls, and so will never end up in validation anyway. The reason we were allowing kmalloc memory accesses is because kernel thread stacks used to be allocated in kmalloc memory. Since that's no longer the case, we can stop making exceptions for kmalloc in the validation code.	2020-01-27 12:43:21 +01:00
Andreas Kling	137a45dff2	Kernel: read()/write() should respect timeouts when used on sockets Move timeout management to the ReadBlocker and WriteBlocker classes. Also get rid of the specialized ReceiveBlocker since it no longer does anything that ReadBlocker can't do.	2020-01-26 17:54:23 +01:00
Andreas Kling	b011857e4f	Kernel: Make writev() work again Vector::ensure_capacity() makes sure the underlying vector buffer can contain all the data, but it doesn't update the Vector::size(). As a result, writev() would simply collect all the buffers to write, and then do nothing.	2020-01-26 10:10:15 +01:00
Andreas Kling	b93f6b07c2	Kernel: Make sched_setparam() and sched_getparam() operate on threads Instead of operating on "some random thread in PID", these now operate on the thread with a specific TID. This matches other systems better.	2020-01-26 09:58:58 +01:00
Andreas Kling	f4e7aecec2	Kernel: Preserve CoW bits when splitting VM regions	2020-01-25 17:57:10 +01:00
Andreas Kling	7cc0b18f65	Kernel: Only open a single description for stdio in non-fork processes	2020-01-25 17:05:02 +01:00
Andreas Kling	81ddd2dae0	Kernel: Make sys$setsid() clear the calling process's controlling TTY	2020-01-25 14:53:48 +01:00
Andreas Kling	2bf11b8348	Kernel: Allow empty strings in validate_and_copy_string_from_user() Sergey pointed out that we should just allow empty strings everywhere.	2020-01-25 14:14:11 +01:00
Andreas Kling	69de90a625	Kernel: Simplify Process constructor Move all the fork-specific inheritance logic to sys$fork(), and all the stuff for setting up stdio for non-fork ring 3 processes moves to Process::create_user_process(). Also: we were setting up the PGID, SID and umask twice. Also the code for copying the open file descriptors was overly complicated. Now it's just a simple Vector copy assignment. :^)	2020-01-25 14:13:47 +01:00
Andreas Kling	0f5221568b	Kernel: sys$execve() should not EFAULT for empty argument strings It's okay to exec { "/bin/echo", "" } and it should not EFAULT.	2020-01-25 12:21:30 +01:00
Andreas Kling	30ad7953ca	Kernel: Rename UnveilState to VeilState	2020-01-21 19:28:59 +01:00
Andreas Kling	f38cfb3562	Kernel: Tidy up debug logging a little bit When using dbg() in the kernel, the output is automatically prefixed with [Process(PID:TID)]. This makes it a lot easier to understand which thread is generating the output. This patch also cleans up some common logging messages and removes the now-unnecessary "dbg() << *current << ..." pattern.	2020-01-21 16:16:20 +01:00
Andreas Kling	6081c76515	Kernel: Make O_RDONLY non-zero Sergey suggested that having a non-zero O_RDONLY would make some things less confusing, and it seems like he's right about that. We can now easily check read/write permissions separately instead of dancing around with the bits. This patch also fixes unveil() validation for O_RDWR which previously forgot to check for "r" permission.	2020-01-21 13:27:08 +01:00
Andreas Kling	1b3cac2f42	Kernel: Don't forget about unveiled paths with zero permissions We need to keep these around, otherwise the calling process can remove and re-add a path to increase its permissions.	2020-01-21 11:42:28 +01:00
Andreas Kling	22cfb1f3bd	Kernel: Clear unveiled state on exec()	2020-01-21 10:46:31 +01:00
Andreas Kling	cf48c20170	Kernel: Forked children should inherit unveil()'ed paths	2020-01-21 09:44:32 +01:00
Andreas Kling	0569123ad7	Kernel: Add a basic implementation of unveil() This syscall is a complement to pledge() and adds the same sort of incremental relinquishing of capabilities for filesystem access. The first call to unveil() will "drop a veil" on the process, and from now on, only unveiled parts of the filesystem are visible to it. Each call to unveil() specifies a path to either a directory or a file along with permissions for that path. The permissions are a combination of the following: - r: Read access (like the "rpath" promise) - w: Write access (like the "wpath" promise) - x: Execute access - c: Create/remove access (like the "cpath" promise) Attempts to open a path that has not been unveiled will fail with ENOENT. If the unveiled path lacks sufficient permissions, it will fail with EACCES. Like pledge(), subsequent calls to unveil() with the same path can only remove permissions, not add them. Once you call unveil(nullptr, nullptr), the veil is locked, and it's no longer possible to unveil any more paths for the process, ever. This concept comes from OpenBSD, and their implementation does various things differently, I'm sure. This is just a first implementation for SerenityOS, and we'll keep improving on it as we go. :^)	2020-01-20 22:12:04 +01:00
Andreas Kling	e901a3695a	Kernel: Use the templated copy_to/from_user() in more places These ensure that the "to" and "from" pointers have the same type, and also that we copy the correct number of bytes.	2020-01-20 13:41:21 +01:00
Sergey Bugaev	d5426fcc88	Kernel: Misc tweaks	2020-01-20 13:26:06 +01:00
Sergey Bugaev	9bc6157998	Kernel: Return new fd from sys$fcntl(F_DUPFD) This fixes GNU Bash getting confused after performing a redirection.	2020-01-20 13:26:06 +01:00
Andreas Kling	4b7a89911c	Kernel: Remove some unnecessary casts to uintptr_t VirtualAddress is constructible from uintptr_t and const void*. PhysicalAddress is constructible from uintptr_t but not const void*.	2020-01-20 13:13:03 +01:00
Andreas Kling	a246e9cd7e	Use uintptr_t instead of u32 when storing pointers as integers uintptr_t is 32-bit or 64-bit depending on the target platform. This will help us write pointer size agnostic code so that when the day comes that we want to do a 64-bit port, we'll be in better shape.	2020-01-20 13:13:03 +01:00
Andreas Kling	8d9dd1b04b	Kernel: Add a 1-deep cache to Process::region_from_range() This simple cache gets hit over 70% of the time on "g++ Process.cpp" and shaves ~3% off the runtime.	2020-01-19 16:44:37 +01:00
Andreas Kling	ae0c435e68	Kernel: Add a Process::add_region() helper This is a private helper for adding a Region to Process::m_regions. It's just for convenience since it's a bit cumbersome to do this.	2020-01-19 16:26:42 +01:00
Andreas Kling	1dc9fa9506	Kernel: Simplify PageDirectory swapping in sys$execve() Swap out both the PageDirectory and the Region list at the same time, instead of doing the Region list slightly later.	2020-01-19 16:05:42 +01:00
Andreas Kling	6eab7b398d	Kernel: Make ProcessPagingScope restore CR3 properly Instead of restoring CR3 to the current process's paging scope when a ProcessPagingScope goes out of scope, we now restore exactly whatever the CR3 value was when we created the ProcessPagingScope. This fixes breakage in situations where a process ends up with nested ProcessPagingScopes. This was making profiling very fragile, and with this change it's now possible to profile g++! :^)	2020-01-19 13:44:53 +01:00
Andreas Kling	f7b394e9a1	Kernel: Assert that copy_to/from_user() are called with user addresses This will panic the kernel immediately if these functions are misused so we can catch it and fix the misuse. This patch fixes a couple of misuses: - create_signal_trampolines() writes to a user-accessible page above the 3GB address mark. We should really get rid of this page but that's a whole other thing. - CoW faults need to use copy_from_user rather than copy_to_user since it's the source pointer that points to user memory. - Inode faults need to use memcpy rather than copy_to_user since we're copying a kernel stack buffer into a quickmapped page. This should make the copy_to/from_user() functions slightly less useful for exploitation. Before this, they were essentially just glorified memcpy() with SMAP disabled. :^)	2020-01-19 09:18:55 +01:00
Andreas Kling	5ce9382e98	Kernel: Only require "stdio" pledge for sending signals to self This should match what OpenBSD does. Sending a signal to yourself seems basically harmless.	2020-01-19 08:50:55 +01:00
Sergey Bugaev	3e1ed38d4b	Kernel: Do not return ENOENT for unresolved symbols ENOENT means "no such file or directory", not "no such symbol". Return EINVAL instead, as we already do in other cases.	2020-01-18 23:51:22 +01:00
Sergey Bugaev	d0d13e2bf5	Kernel: Move setting file flags and r/w mode to VFS::open() Previously, VFS::open() would only use the passed flags for permission checking purposes, and Process::sys$open() would set them on the created FileDescription explicitly. Now, they should be set by VFS::open() on any files being opened, including files that the kernel opens internally. This also lets us get rid of the explicit check for whether or not the returned FileDescription was a preopen fd, and in fact, fixes a bug where a read-only preopen fd without any other flags would be considered freshly opened (due to O_RDONLY being indistinguishable from 0) and granted a new set of flags.	2020-01-18 23:51:22 +01:00
Sergey Bugaev	544b8286da	Kernel: Do not open stdio fds for kernel processes Kernel processes just do not need them. This also avoids touching the file (sub)system early in the boot process when initializing the colonel process.	2020-01-18 23:51:22 +01:00
Sergey Bugaev	6466c3d750	Kernel: Pass correct permission flags when opening files Right now, permission flags passed to VFS::open() are effectively ignored, but that is going to change. * O_RDONLY is 0, but it's still nicer to pass it explicitly * POSIX says that binding a Unix socket to a symlink shall fail with EADDRINUSE	2020-01-18 23:51:22 +01:00
Andreas Kling	862b3ccb4e	Kernel: Enforce W^X between sys$mmap() and sys$execve() It's now an error to sys$mmap() a file as writable if it's currently mapped executable by anyone else. It's also an error to sys$execve() a file that's currently mapped writable by anyone else. This fixes a race condition vulnerability where one program could make modifications to an executable while another process was in the kernel, in the middle of exec'ing the same executable. Test: Kernel/elf-execve-mmap-race.cpp	2020-01-18 23:40:12 +01:00
Andreas Kling	4e6fe3c14b	Kernel: Symbolicate kernel EIP on process crash Process::crash() was assuming that EIP was always inside the ELF binary of the program, but it could also be in the kernel.	2020-01-18 14:38:39 +01:00
Andreas Kling	9c9fe62a4b	Kernel: Validate the requested range in allocate_region_with_vmobject()	2020-01-18 14:37:22 +01:00
Andreas Kling	aa63de53bd	Kernel: Use get_syscall_path_argument() in sys$execve() Paths passed to sys$execve() should certainly be subject to all the usual path validation checks.	2020-01-18 11:43:28 +01:00
Andreas Kling	b65572b3fe	Kernel: Disallow mmap names longer than PATH_MAX	2020-01-18 11:34:53 +01:00
Andreas Kling	94ca55cefd	Meta: Add license header to source files As suggested by Joshua, this commit adds the 2-clause BSD license as a comment block to the top of every source file. For the first pass, I've just added myself for simplicity. I encourage everyone to add themselves as copyright holders of any file they've added or modified in some significant way. If I've added myself in error somewhere, feel free to replace it with the appropriate copyright holder instead. Going forward, all new source files should include a license header.	2020-01-18 09:45:54 +01:00
Andreas Kling	19c31d1617	Kernel: Always dump kernel regions when dumping process regions	2020-01-18 08:57:18 +01:00
Sergey Bugaev	064cd2278c	Kernel: Remove the use of FileSystemPath in sys$realpath() Now that VFS::resolve_path() canonicalizes paths automatically, we don't need to do that here anymore.	2020-01-17 21:49:58 +01:00
Sergey Bugaev	8642a7046c	Kernel: Let inodes provide pre-open file descriptions Some magical inodes, such as /proc/pid/fd/fileno, are going to want to open() to a custom FileDescription, so add a hook for that.	2020-01-17 21:49:58 +01:00
Sergey Bugaev	e0013a6b4c	Kernel+LibC: Unify sys$open() and sys$openat() The syscall is now called sys$open(), but it behaves like the old sys$openat(). In userspace, open_with_path_length() is made a wrapper over openat_with_path_length().	2020-01-17 21:49:58 +01:00
Andreas Kling	4d4d5e1c07	Kernel: Drop futex queues/state on exec() This state is not meaningful to the new process image so just drop it.	2020-01-17 16:08:00 +01:00
Andreas Kling	26a31c7efb	Kernel: Add "accept" pledge promise for accepting incoming connections This patch adds a new "accept" promise that allows you to call accept() on an already listening socket. This lets programs set up a socket for listening and then drop "inet" and/or "unix" so that only incoming (and existing) connections are allowed from that point on. No new outgoing connections or listening server sockets can be created. In addition to accept() it also allows getsockopt() with SOL_SOCKET and SO_PEERCRED, which is used to find the PID/UID/GID of the socket peer. This is used by our IPC library when creating shared buffers that should only be accessible to a specific peer process. This allows us to drop "unix" in WindowServer and LookupServer. :^) It also makes the debugging/introspection RPC sockets in CEventLoop based programs work again.	2020-01-17 11:19:06 +01:00
Andreas Kling	c6e552ac8f	Kernel+LibELF: Don't blindly trust ELF symbol offsets in symbolication It was possible to craft a custom ELF executable that when symbolicated would cause the kernel to read from user-controlled addresses anywhere in memory. You could then fetch this memory via /proc/PID/stack We fix this by making ELFImage hand out StringView rather than raw const char* for symbol names. In case a symbol offset is outside the ELF image, you get a null StringView. :^) Test: Kernel/elf-symbolication-kernel-read-exploit.cpp	2020-01-16 22:11:31 +01:00
Andreas Kling	d79de38bd2	Kernel: Don't allow userspace to sys$open() literal symlinks The O_NOFOLLOW_NOERROR is an internal kernel mechanism used for the implementation of sys$readlink() and sys$lstat(). There is no reason to allow userspace to open symlinks directly.	2020-01-15 21:19:26 +01:00
Andreas Kling	e23536d682	Kernel: Use Vector::unstable_remove() in a couple of places	2020-01-15 19:26:41 +01:00
Liav A	d2b41010c5	Kernel: Change Region allocation helpers We now can create a cacheable Region, so when map() is called, if a Region is cacheable then all the virtual memory space being allocated to it will be marked as not cache disabled. In addition to that, OS components can create a Region that will be mapped to a specific physical address by using the appropriate helper method.	2020-01-14 15:38:58 +01:00
Andreas Kling	65cb406327	Kernel: Allow unlocking a held Lock with interrupts disabled This is needed to eliminate a race in Thread::wait_on() where we'd otherwise have to wait until after unlocking the process lock before we can disable interrupts.	2020-01-13 18:56:46 +01:00
Andrew Kaster	7a7e7c82b5	Kernel: Tighten up exec/do_exec and allow for PT_INTERP interpreters This patch changes how exec() figures out which program image to actually load. Previously, we opened the path to our main executable in find_shebang_interpreter_for_executable, read the first page (or less, if the file was smaller) and then decided whether to recurse with the interpreter instead. We then re-opened the main executable in do_exec. However, since we now want to parse the ELF header and Program Headers of an ELF image before even doing any memory region work, we can change the way this whole process works. We open the file and read (up to) the first page in exec() itself, then pass just the page and the amount read to find_shebang_interpreter_for_executable. Since we now have that page and the FileDescription for the main executable handy, we can do a few things. First, validate the ELF header and ELF program headers for any shenanigans. ELF32 Little Endian i386 only, please. Second, we can grab the PT_INTERP interpreter from any ET_DYN files, and open that guy right away if it exists. Finally, we can pass the main executable's and optionally the PT_INTERP interpreter's file descriptions down to do_exec and not have to feel guilty about opening the file twice. In do_exec, we now have a choice. Are we going to load the main executable, or the interpreter? We could load both, but it'll be way easier for the initial pass on the RTLD if we only load the interpreter. Then it can load the main executable itself like any old shared object, just, the one with main in it :). Later on we can load both of them into memory and the RTLD can relocate itself before trying to do anything. The way it's written now the RTLD will get dibs on its requested virtual addresses being the actual virtual addresses.	2020-01-13 13:03:30 +01:00
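
Note on d28fa89346 (sys$kill() with pid=INT32_MIN): the entry above points out that negating a negative pid does not always yield a positive value, because the magnitude of INT32_MIN is not representable in a 32-bit int. A minimal sketch of one way to guard against that case follows. The helper name `positive_pid_from` is hypothetical and is not taken from the kernel source; how the actual commit handles the case is not shown in this log.

```cpp
#include <cstdint>
#include <optional>

// Hypothetical helper (an assumption for illustration, not kernel code):
// returns the magnitude of a negative pid, or an empty optional when the
// input is not negative or its magnitude cannot be represented.
std::optional<int32_t> positive_pid_from(int32_t pid)
{
    if (pid >= 0)
        return std::nullopt; // only negative pids are handled here
    if (pid == INT32_MIN)
        return std::nullopt; // the magnitude of INT32_MIN does not fit in int32_t
    return -pid;             // safe: pid is in [-INT32_MAX, -1]
}
```

A caller would treat the empty result as an invalid argument instead of asserting that the negated value is positive.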


998 Commits