ladybird

mirror of https://github.com/LadybirdBrowser/ladybird.git synced 2024-09-20 01:37:39 +03:00

Author	SHA1	Message	Date
Brian Gianforcaro	25a620a573	Kernel: Enable timeout support for sys$futex(FUTEX_WAIT) Utilize the new Thread::wait_on timeout parameter to implement timeout support for FUTEX_WAIT. As we compute the relative time from the user specified absolute time, we try to delay that computation as long as possible before we call into Thread::wait_on(..). To enable this, a small bit of refactoring was done: futex_queue fetching was pulled out, and the timeout fetch was separated from its calculation.	2020-04-26 21:31:52 +02:00
Andreas Kling	fb826aa59a	Kernel: Make sys$sethostname() superuser-only Also take the hostname string lock exclusively.	2020-04-26 15:51:57 +02:00
Luke Payne	f191b84b50	Kernel: Added the ability to set the hostname via new syscall Userland/hostname: Now takes parameter to set the hostname LibC/unistd: Added sethostname function	2020-04-26 12:59:09 +02:00
Brian Gianforcaro	0f3990cfa3	Kernel: Support signaling all processes with pid == -1 This is a special case that was previously not implemented. The idea is that you can dispatch a signal to all other processes the calling process has access to. There was some minor refactoring to make the self signal logic into a function so it could easily be re-used from do_killall.	2020-04-26 12:54:10 +02:00
Brian Gianforcaro	1f64e3eb16	Kernel: Implement FUTEX_WAKE of arbitrary count. Previously we just woke all waiters no matter how many were requested. Fix this by implementing WaitQueue::wake_n(..).	2020-04-26 12:35:35 +02:00
Drew Stratford	4a37362249	LibPthread: implicitly call pthread_exit on return from start routine. Previously, when returning from a pthread's start_routine, we would segfault. Now we instead implicitly call pthread_exit as specified in the standard. pthread_create now creates a thread running the new pthread_create_helper, which properly manages the calling and exiting of the start_routine supplied to pthread_create. To accomplish this, the thread's stack initialization has been moved out of sys$create_thread and into the userspace function create_thread.	2020-04-25 16:51:35 +02:00
Itamar	edaa9c06d9	LibELF: Make ELF::Loader RefCounted	2020-04-20 17:25:50 +02:00
Sergey Bugaev	54550365eb	Kernel: Use shared locking mode in some places The notable piece of code that remains to be converted is Ext2FS.	2020-04-18 13:58:29 +02:00
Sergey Bugaev	f18d6610d3	Kernel: Don't include null terminator in sys$readlink() result POSIX says, "Conforming applications should not assume that the returned contents of the symbolic link are null-terminated." If we do include the null terminator into the returning string, Python believes it to actually be a part of the returned name, and gets unhappy about that later. This suggests other systems Python runs in don't include it, so let's do that too. Also, make our userspace support non-null-terminated realpath().	2020-04-14 18:40:24 +02:00
Andreas Kling	815b73bdcc	Kernel: Simplify sys$setgroups(0, ...) If we're dropping all groups, just clear the extra_gids and return.	2020-04-14 15:30:25 +02:00
Andreas Kling	9962db5bf8	Kernel: Remove SmapDisablers in {peek,poke}_user_data()	2020-04-14 09:52:49 +02:00
Itamar	3e9a7175d1	Debugger: Add DebugSession The DebugSession class wraps the usage of Ptrace. It is intended to be used by cli & gui debugger programs. Also, call objdump for disassembly	2020-04-13 00:53:22 +02:00
Itamar	aae3f7b914	Process: Fix siginfo for code CLD_STOPPED si_code, si_status were swapped	2020-04-13 00:53:22 +02:00
Itamar	9e51e295cf	ptrace: Add PT_SETREGS PT_SETREGS sets the registers of the traced thread. It can only be used when the tracee is stopped. Also, refactor ptrace. The implementation was getting long and cluttered the already large Process.cpp file. This commit moves the bulk of the implementation to Kernel/Ptrace.cpp, and factors out peek & poke to separate methods of the Process class.	2020-04-13 00:53:22 +02:00
Itamar	0431712660	ptrace: Stop a traced thread when it exits from execve This was a missing feature in the PT_TRACEME command. This feature allows the tracer to interact with the tracee before the tracee has started executing its program. It will be useful for automatically inserting a breakpoint at a debugged program's entry point.	2020-04-13 00:53:22 +02:00
Itamar	b306ac9b2b	ptrace: Add PT_POKE PT_POKE writes a single word to the tracee's address space. Some caveats: - If the user requests to write to an address in a read-only region, we temporarily change the page's protections to allow it. - If the user requests to write to a region that's backed by a SharedInodeVMObject, we replace the vmobject with a PrivateInodeVMObject.	2020-04-13 00:53:22 +02:00
Itamar	984ff93406	ptrace: Add PT_PEEK PT_PEEK reads a single word from the tracee's address space and returns it to the tracer.	2020-04-13 00:53:22 +02:00
Andreas Kling	c19b56dc99	Kernel+LibC: Add minherit() and MAP_INHERIT_ZERO This patch adds the minherit() syscall originally invented by OpenBSD. Only the MAP_INHERIT_ZERO mode is supported for now. If set on an mmap region, that region will be zeroed out on fork().	2020-04-12 20:22:26 +02:00
Andrew Kaster	61acca223f	LibELF: Move validation methods to their own file These validate_elf_* methods really had no business being static methods of ELF::Image. Now that the ELF namespace exists, it makes sense to just move them to be free functions in the namespace.	2020-04-11 22:41:05 +02:00
Andrew Kaster	21b5909dc6	LibELF: Move ELF classes into namespace ELF This is for consistency with other namespace changes that were made a while back to the other libraries :)	2020-04-11 22:41:05 +02:00
Andreas Kling	dec352dacd	Kernel: Ignore zero-length PROGBITS sections in sys$module_load()	2020-04-10 16:36:01 +02:00
Andreas Kling	c06d5ef114	Kernel+LibC: Remove ESUCCESS There's no official ESUCCESS==0 errno code, and it keeps breaking the Lagom build when we use it, so let's just say 0 instead.	2020-04-10 13:09:35 +02:00
Andreas Kling	871d450b93	Kernel: Remove redundant "ACPI" from filenames in ACPI/	2020-04-09 18:17:27 +02:00
Andreas Kling	4644217094	Kernel: Remove "non-operational" ACPI parser state If we don't support ACPI, just don't instantiate an ACPI parser. This is way less confusing than having a special parser class whose only purpose is to do nothing. We now search for the RSDP in ACPI::initialize() instead of letting the parser constructor do it. This allows us to defer the decision to create a parser until we're sure we can make a useful one.	2020-04-09 17:19:11 +02:00
Andreas Kling	dc7340332d	Kernel: Update cryptically-named functions related to symbolication	2020-04-08 17:19:46 +02:00
Liav A	23fb985f02	Kernel & Userland: Allow to mount image files formatted with Ext2FS	2020-04-06 15:36:36 +02:00
Andreas Kling	9ae3cced76	Revert "Kernel & Userland: Allow to mount image files formatted with Ext2FS" This reverts commit `a60ea79a41`. Reverting these changes since they broke things. Fixes #1608.	2020-04-03 21:28:57 +02:00
Liav A	a60ea79a41	Kernel & Userland: Allow to mount image files formatted with Ext2FS	2020-04-02 12:03:08 +02:00
Itamar	6b74d38aab	Kernel: Add 'ptrace' syscall This commit adds a basic implementation of the ptrace syscall, which allows one process (the tracer) to control another process (the tracee). While a process is being traced, it is stopped whenever a signal is received (other than SIGCONT). The tracer can start tracing another thread with PT_ATTACH, which causes the tracee to stop. From there, the tracer can use PT_CONTINUE to continue the execution of the tracee, or use other request codes (which haven't been implemented yet) to modify the state of the tracee. Additional request codes are PT_SYSCALL, which causes the tracee to continue execution but stop at the next entry or exit from a syscall, and PT_GETREGS which fetches the last saved register set of the tracee (can be used to inspect syscall arguments and return value). A special request code is PT_TRACE_ME, which is issued by the tracee and causes it to stop when it calls execve and wait for the tracer to attach.	2020-03-28 18:27:18 +01:00
Shannon Booth	757c14650f	Kernel: Simplify process assertion checking if region is in range Let's use the helper function for this :)	2020-03-22 08:51:40 +01:00
Liav A	b536547c52	Process: Use monotonic time for timeouts	2020-03-19 15:48:00 +01:00
Liav A	4484513b45	Kernel: Add new syscall to allow changing the system date	2020-03-19 15:48:00 +01:00
Liav A	9db291d885	Kernel: Introduce the new Time management subsystem This new subsystem includes better abstractions of how time will be handled in the OS. We take advantage of the existing RTC timer to aid in keeping time synchronized. This stands in contrast to how we previously handled time-keeping in the kernel, where the PIT was responsible for that function in addition to updating the scheduler about ticks. With that new advantage, we can easily change the ticking dynamically and still keep the time synchronized. In the process context, we no longer use a fixed declaration of TICKS_PER_SECOND, but we call the TimeManagement singleton class to provide us the right value. This allows us to use dynamic ticking in the future, a feature known as tickless kernel. The scheduler no longer calculates real time (Unix time) by itself, and just calls the TimeManagement singleton class to provide the value. Also, we can use 2 new boot arguments: - the "time" boot argument accepts either the value "modern", or "legacy". If "modern" is specified, the time management subsystem will try to set up HPET. Otherwise, for the "legacy" value, the time subsystem will revert to using the PIT & RTC, leaving HPET disabled. If this boot argument is not specified, the default pattern is to try to set up HPET. - the "hpet" boot argument accepts either the value "periodic" or "nonperiodic". If "periodic" is specified, the HPET will scan for periodic timers, and will assert if none are found. If only one is found, that timer will be assigned for the time-keeping task. If more than one is found, both time-keeping task & scheduler-ticking task will be assigned to periodic timers. If this boot argument is not specified, the default pattern is to try to scan for HPET periodic timers. This boot argument has no effect if HPET is disabled. 
In hardware context, PIT & RealTimeClock classes are merely inheriting from the HardwareTimer class, and they allow the use of the old i8254 (PIT) and RTC devices, managing them via IO ports. By default, the RTC will be programmed to a frequency of 1024Hz. The PIT will be programmed to a frequency close to 1000Hz. About HPET, depending on whether we need to scan for periodic timers or not, we try to set a frequency close to 1000Hz for the time-keeping timer and scheduler-ticking timer. Also, if possible, we try to enable the Legacy replacement feature of the HPET. This feature, if it exists, instructs the chipset to disconnect both i8254 (PIT) and RTC. This behavior is observable on QEMU, and was verified against the source code: `ce967e2f33` The HPETComparator class is inheriting from HardwareTimer class, and is responsible for an individual HPET comparator, which is essentially a timer. Therefore, it needs to call the singleton HPET class to perform HPET-related operations. The new abstraction of Hardware timers brings an opportunity of more new features in the foreseeable future. For example, we can change the callback function of each hardware timer, thus it makes it possible to swap missions between hardware timers, or to allow the use of a hardware timer for other temporary missions (e.g. calibrating the LAPIC timer, measuring the CPU frequency, etc).	2020-03-19 15:48:00 +01:00
Alex Muscar	d013753f83	Kernel: Resolve relative paths when there is a veil (#1474)	2020-03-19 09:57:34 +01:00
Andreas Kling	ad92a1e4bc	Kernel: Add sys$get_stack_bounds() for finding the stack base & size This will be useful when implementing conservative garbage collection.	2020-03-16 19:06:33 +01:00
Andreas Kling	3803196edb	Kernel: Get rid of SmapDisabler in sys$fstat()	2020-03-10 13:34:24 +01:00
Liav A	0f45a1b5e7	Kernel: Allow to reboot in ACPI via PCI or MMIO access Also, we determine if ACPI reboot is supported by checking the FADT flags' field.	2020-03-09 10:53:13 +01:00
Ben Wiederhake	b066586355	Kernel: Fix race in waitid This is similar to `28e1da344d` and `4dd4dd2f3c`. The crux is that wait verifies that the outvalue (siginfo* infop) is writable before waiting, and writes to it after waiting. In the meantime, a concurrent thread can make the output region unwritable, e.g. by deallocating it.	2020-03-08 14:12:12 +01:00
Ben Wiederhake	d8cd4e4902	Kernel: Fix race in select This is similar to `28e1da344d` and `4dd4dd2f3c`. The crux is that select verifies that the filedescriptor sets are writable before blocking, and writes to them after blocking. In the meantime, a concurrent thread can make the output buffer unwritable, e.g. by deallocating it.	2020-03-08 14:12:12 +01:00
Andreas Kling	b1058b33fb	AK: Add global FlatPtr typedef. It's u32 or u64, based on sizeof(void*) Use this instead of uintptr_t throughout the codebase. This makes it possible to pass a FlatPtr to something that has u32 and u64 overloads.	2020-03-08 13:06:51 +01:00
Andreas Kling	c6693f9b3a	Kernel: Simplify a bunch of dbg() and klog() calls LogStream can handle VirtualAddress and PhysicalAddress directly.	2020-03-06 15:00:44 +01:00
Liav A	85eb1d26d5	Kernel: Run clang-format on Process.cpp & ACPIDynamicParser.h	2020-03-05 19:04:04 +01:00
Liav A	1b8cd6db7b	Kernel: Call ACPI reboot method first if possible Now we call ACPI reboot method first if possible, and if ACPI reboot is not available, we attempt to reboot via the keyboard controller.	2020-03-05 19:04:04 +01:00
Ben Wiederhake	4dd4dd2f3c	Kernel: Fix race in clock_nanosleep This is a complete fix of clock_nanosleep, because the thread holds the process lock again when returning from sleep()/sleep_until(). Therefore, no further concurrent invalidation can occur.	2020-03-03 20:13:32 +01:00
Liav A	0fc60e41dd	Kernel: Use klog() instead of kprintf() Also, duplicate data in dbg() and klog() calls was removed. In addition, leakage of virtual addresses to the kernel log is prevented. This is done by replacing the kprintf() calls that leaked the data with dbg() calls instead. Also, other kprintf() calls were replaced with klog().	2020-03-02 22:23:39 +01:00
Andreas Kling	47beab926d	Kernel: Remove ability to create kernel-only regions at user addresses This was only used by the mechanism for mapping executables into each process's own address space. Now that we remap executables on demand when needed for symbolication, this can go away.	2020-03-02 11:20:34 +01:00
Andreas Kling	e56f8706ce	Kernel: Map executables at a kernel address during ELF load This is both simpler and more robust than mapping them in the process address space.	2020-03-02 11:20:34 +01:00
Andreas Kling	678c87087d	Kernel: Load executables on demand when symbolicating Previously we would map the entire executable of a program in its own address space (but make it unavailable to userspace code.) This patch removes that and changes the symbolication code to remap the executable on demand (and into the kernel's own address space instead of the process address space.) This opens up a couple of further simplifications that will follow.	2020-03-02 11:20:34 +01:00
Andreas Kling	0acac186fb	Kernel: Make the "entire executable" region shared This makes Region::clone() do the right thing with it on fork().	2020-03-02 06:13:29 +01:00
Andreas Kling	5c2a296a49	Kernel: Mark read-only PT_LOAD mappings as shared regions This makes Region::clone() do the right thing for these now that we differentiate based on Region::is_shared().	2020-03-01 21:26:36 +01:00
Andreas Kling	ecfde5997b	Kernel: Use SharedInodeVMObject for executables after all I had the wrong idea about this. Thanks to Sergey for pointing it out! Here's what he says (reproduced for posterity): > Private mappings protect the underlying file from the changes made by > you, not the other way around. To quote POSIX, "If MAP_PRIVATE is > specified, modifications to the mapped data by the calling process > shall be visible only to the calling process and shall not change the > underlying object. It is unspecified whether modifications to the > underlying object done after the MAP_PRIVATE mapping is established > are visible through the MAP_PRIVATE mapping." In practice that means > that the pages that were already paged in don't get updated when the > underlying file changes, and the pages that weren't paged in yet will > load the latest data at that moment. > The only thing MAP_FILE \| MAP_PRIVATE is really useful for is mapping > a library and performing relocations; it's definitely useless (and > actively harmful for the system memory usage) if you only read from > the file. This effectively reverts `e2697c2ddd`.	2020-03-01 21:16:27 +01:00
Andreas Kling	bb7dd63f74	Kernel: Run clang-format on Process.cpp	2020-03-01 21:16:27 +01:00
Andreas Kling	687b52ceb5	Kernel: Name perfcore files "perfcore.PID" This way we can trace many things and we get one perfcore file per process instead of everyone trying to write to "perfcore"	2020-03-01 20:59:02 +01:00
Andreas Kling	fee20bd8de	Kernel: Remove some more harmless InodeVMObject miscasts	2020-03-01 12:27:03 +01:00
Andreas Kling	95e3aec719	Kernel: Fix harmless type miscast in Process::amount_clean_inode()	2020-03-01 11:23:23 +01:00
Andreas Kling	e2697c2ddd	Kernel: Use PrivateInodeVMObject for loading program executables This will be a memory usage pessimization until we actually implement CoW sharing of the memory pages with SharedInodeVMObject. However, it's a huge architectural improvement, so let's take it and improve on this incrementally. fork() should still be neutral, since all private mappings are CoW'ed.	2020-03-01 11:23:10 +01:00
Andreas Kling	88b334135b	Kernel: Remove some Region construction helpers It's now up to the caller to provide a VMObject when constructing a new Region object. This will make it easier to handle things going wrong, like allocation failures, etc.	2020-03-01 11:23:10 +01:00
Andreas Kling	4badef8137	Kernel: Return bytes written if sys$write() fails after writing some If we wrote anything we should just inform userspace that we did, and not worry about the error code. Userspace can call us again if it wants, and we'll give them the error then.	2020-02-29 18:42:35 +01:00
Andreas Kling	7cd1bdfd81	Kernel: Simplify some dbg() logging We don't have to log the process name/PID/TID, dbg() automatically adds that as a prefix to every line. Also we don't have to do .characters() on Strings passed to dbg() :^)	2020-02-29 13:39:06 +01:00
Andreas Kling	8fbdda5a2d	Kernel: Implement basic support for sys$mmap() with MAP_PRIVATE You can now mmap a file as private and writable, and the changes you make will only be visible to you. This works because internally a MAP_PRIVATE region is backed by a unique PrivateInodeVMObject instead of using the globally shared SharedInodeVMObject like we always did before. :^) Fixes #1045.	2020-02-28 23:25:00 +01:00
Andreas Kling	aa1e209845	Kernel: Remove some unnecessary indirection in InodeFile::mmap() InodeFile now directly calls Process::allocate_region_with_vmobject() instead of taking an awkward detour via a special Region constructor.	2020-02-28 20:29:14 +01:00
Andreas Kling	651417a085	Kernel: Split InodeVMObject into two subclasses We now have PrivateInodeVMObject and SharedInodeVMObject, corresponding to MAP_PRIVATE and MAP_SHARED respectively. Note that PrivateInodeVMObject is not used yet.	2020-02-28 20:20:35 +01:00
Andreas Kling	07a26aece3	Kernel: Rename InodeVMObject => SharedInodeVMObject	2020-02-28 20:07:51 +01:00
Andreas Kling	5af95139fa	Kernel: Make Process::m_master_tls_region a WeakPtr Let's not keep raw Region* variables around like that when it's so easy to avoid it.	2020-02-28 14:05:30 +01:00
Andreas Kling	b0623a0c58	Kernel: Remove SmapDisabler in sys$connect()	2020-02-28 13:20:26 +01:00
Andreas Kling	dcd619bd46	Kernel: Merge the shbuf_get_size() syscall into shbuf_get() Add an extra out-parameter to shbuf_get() that receives the size of the shared buffer. That way we don't need to make a separate syscall to get the size, which we always did immediately after.	2020-02-28 12:55:58 +01:00
Andreas Kling	f72e5bbb17	Kernel+LibC: Rename shared buffer syscalls to use a prefix This feels a lot more consistent and Unixy: create_shared_buffer() => shbuf_create() share_buffer_with() => shbuf_allow_pid() share_buffer_globally() => shbuf_allow_all() get_shared_buffer() => shbuf_get() release_shared_buffer() => shbuf_release() seal_shared_buffer() => shbuf_seal() get_shared_buffer_size() => shbuf_get_size() Also, "shared_buffer_id" is shortened to "shbuf_id" all around.	2020-02-28 12:55:58 +01:00
Liav A	db23703570	Process: Use dbg() instead of dbgprintf() Also, fix a bad dereference in sys$create_shared_buffer() method.	2020-02-27 13:05:12 +01:00
Andreas Kling	4997dcde06	Kernel: Always disable interrupts in do_killpg() Will caught an assertion when running "kill 9999999999999" :^)	2020-02-27 11:05:16 +01:00
Andreas Kling	4a293e8a21	Kernel: Ignore signals sent to threadless (zombie) processes If a process doesn't have any threads left, it's in a zombie state and we can't meaningfully send signals to it. So just ignore them. Fixes #1313.	2020-02-27 11:04:15 +01:00
Andreas Kling	0c1497846e	Kernel: Don't allow profiling a dead process Work towards #1313.	2020-02-27 10:42:31 +01:00
Cristian-Bogdan SIRB	05ce8586ea	Kernel: Fix ASSERTION failed in join_thread syscall set_interrupted_by_death was never called whenever a thread that had a joiner died, so the joiner remained with the joinee pointer there, resulting in an assertion fail in JoinBlocker: m_joinee pointed to a freed task, filled with garbage. Thread::current->m_joinee may not be valid after the unblock Properly return the joinee exit value to the joiner thread.	2020-02-27 10:09:44 +01:00
Andreas Kling	d28fa89346	Kernel: Don't assert on sys$kill() with pid=INT32_MIN On 32-bit platforms, INT32_MIN == -INT32_MIN, so we can't expect this to always work: if (pid < 0) positive_pid = -pid; // may still be negative! This happens because the -INT32_MIN expression becomes a long and is then truncated back to an int. Fixes #1312.	2020-02-27 10:02:04 +01:00
Cristian-Bogdan SIRB	717cd5015e	Kernel: Allow process with multiple threads to call exec and exit This allows a process which has more than 1 thread to call exec, even from a thread. This kills all the other threads, but it won't wait for them to finish, just makes sure that they are not in a running/runnable state. In the case where a thread does exec, the new program PID will be the thread TID, to keep the PID == TID in the new process. This introduces a new function inside the Process class, kill_threads_except_self, which is called on exit() too (exit with multiple threads wasn't properly working either). Inside the Lock class, there is the need for a new function, clear_waiters, which removes all the waiters from the Process::big_lock. This is needed since after an exit/exec, there should be no other threads waiting for this lock; the threads should be simply killed. Only queued threads should wait for this lock at this point, since blocked threads are handled in set_should_die.	2020-02-26 13:06:40 +01:00
Andreas Kling	ceec1a7d38	AK: Make Vector use size_t for its size and capacity	2020-02-25 14:52:35 +01:00
Andreas Kling	d0f5b43c2e	Kernel: Use Vector::unstable_remove() when deallocating a region Process::m_regions is not sorted, so we can use unstable_remove() to avoid shifting the vector contents. :^)	2020-02-24 18:34:49 +01:00
Andreas Kling	30a8991dbf	Kernel: Make Region weakable and use WeakPtr<Region> instead of Region* This turns use-after-free bugs into null pointer dereferences instead.	2020-02-24 13:32:45 +01:00
Andreas Kling	79576f9280	Kernel: Clear the region lookup cache on exec() Each process has a 1-level lookup cache for fast repeated lookups of the same VM region (which tends to be the majority of lookups.) The cache is used by the following syscalls: munmap, madvise, mprotect and set_mmap_name. After a successful exec(), there could be a stale Region* in the lookup cache, and the new executable was able to manipulate it using a number of use-after-free code paths.	2020-02-24 12:37:27 +01:00
Liav A	895e874eb4	Kernel: Include the new PIT class in system components	2020-02-24 11:27:03 +01:00
Andreas Kling	fc5ebe2a50	Kernel: Disown shared buffers on sys$execve() When committing to a new executable, disown any shared buffers that the process was previously co-owning. Otherwise accessing the same shared buffer ID from the new program would cause the kernel to find a cached (and stale!) reference to the previous program's VM region corresponding to that shared buffer, leading to a Region* use-after-free. Fixes #1270.	2020-02-22 12:29:38 +01:00
Andreas Kling	ece2971112	Kernel: Disable profiling during the critical section of sys$execve() Since we're gonna throw away these stacks at the end of exec anyway, we might as well disable profiling before starting to mess with the process page tables. One less weird situation to worry about in the sampling code.	2020-02-22 11:09:03 +01:00
Andreas Kling	d7a13dbaa7	Kernel: Reset profiling state on exec() (but keep it going) We now log the new executable on exec() and throw away all the samples we've accumulated so far. But profiling keeps going.	2020-02-22 10:54:50 +01:00
Andreas Kling	2a679f228e	Kernel: Fix bitrotted DEBUG_IO logging	2020-02-21 15:49:30 +01:00
Andreas Kling	bead20c40f	Kernel: Remove SmapDisabler in sys$create_shared_buffer()	2020-02-18 14:12:39 +01:00
Andreas Kling	9aa234cc47	Kernel: Reset FPU state on exec()	2020-02-18 13:44:27 +01:00
Andreas Kling	a7dbb3cf96	Kernel: Use a FixedArray for a process's extra GIDs There's not really enough of these to justify using a HashTable.	2020-02-18 11:35:47 +01:00
Andreas Kling	48f7c28a5c	Kernel: Replace "current" with Thread::current and Process::current Suggested by Sergey. The currently running Thread and Process are now Thread::current and Process::current respectively. :^)	2020-02-17 15:04:27 +01:00
Andreas Kling	4f4af24b9d	Kernel: Tear down process address space during finalization Process teardown is divided into two main stages: finalize and reap. Finalization happens in the "Finalizer" kernel task and runs with interrupts enabled, allowing destructors to take locks, etc. Reaping happens either in sys$waitid() or in the scheduler for orphans. The more work we can do in finalization, the better, since it's fully pre-emptible and reduces the amount of time the system runs without interrupts enabled.	2020-02-17 14:33:06 +01:00
Andreas Kling	31e1af732f	Kernel+LibC: Allow sys$mmap() callers to specify address alignment This is exposed via the non-standard serenity_mmap() call in userspace.	2020-02-16 12:55:56 +01:00
Andreas Kling	7a8be7f777	Kernel: Remove SmapDisabler in sys$accept()	2020-02-16 08:20:54 +01:00
Andreas Kling	7717084ac7	Kernel: Remove SmapDisabler in sys$clock_gettime()	2020-02-16 08:13:11 +01:00
Andreas Kling	16818322c5	Kernel: Reduce header dependencies of Process and Thread	2020-02-16 02:01:42 +01:00
Andreas Kling	e28809a996	Kernel: Add forward declaration header	2020-02-16 01:50:32 +01:00
Andreas Kling	1d611e4a11	Kernel: Reduce header dependencies of MemoryManager and Region	2020-02-16 01:33:41 +01:00
Andreas Kling	a356e48150	Kernel: Move all code into the Kernel namespace	2020-02-16 01:27:42 +01:00
Andreas Kling	1f55079488	Kernel: Remove SmapDisabler in sys$getgroups()	2020-02-16 00:30:00 +01:00
Andreas Kling	eb7b0c76a8	Kernel: Remove SmapDisabler in sys$setgroups()	2020-02-16 00:27:10 +01:00
Andreas Kling	0341ddc5eb	Kernel: Rename RegisterDump => RegisterState	2020-02-16 00:15:37 +01:00
Andreas Kling	580a94bc44	Kernel+LibC: Merge sys$stat() and sys$lstat() There is now only one sys$stat() instead of two separate syscalls.	2020-02-10 19:49:49 +01:00
Liav A	e559af2008	Kernel: Apply changes to use LibBareMetal definitions	2020-02-09 19:38:17 +01:00
Andreas Kling	7291370478	Kernel: Make File::truncate() take a u64 No point in taking a signed type here. We validate at the syscall layer and then pass around a u64 from then on.	2020-02-08 12:07:04 +01:00
Andreas Kling	88ea152b24	Kernel: Merge unnecessary DiskDevice class into BlockDevice	2020-02-08 02:20:03 +01:00
Andreas Kling	2b0b7cc5a4	Net: Add a basic sys$shutdown() implementation Calling shutdown prevents further reads and/or writes on a socket. We should do a few more things based on the type of socket, but this initial implementation just puts the basic mechanism in place. Work towards #428.	2020-02-08 00:54:43 +01:00
Andreas Kling	f3a5985bb2	Kernel: Remove two bad FIXME's We should absolutely not create a new thread in sys$exec(). There's also no sys$spawn() anymore.	2020-02-08 00:06:15 +01:00
Andreas Kling	d04fcccc90	Kernel: Truncate addresses stored by getsockname() and getpeername() If there's not enough space in the output buffer for the whole sockaddr we now simply truncate the address instead of returning EINVAL. This patch also makes getpeername() actually return the peer address rather than the local address. :^)	2020-02-07 23:43:32 +01:00
Andreas Kling	dc18859695	Kernel: memset() all siginfo_t structs after creating them	2020-02-06 14:12:20 +01:00
Sergey Bugaev	1b866bbf42	Kernel: Fix sys$waitid(P_ALL, WNOHANG) return value According to POSIX, waitid() should fill si_signo and si_pid members with zeroes if there are no children that have already changed their state by the time of the call. Let's just fill the whole structure with zeroes to avoid leaking kernel memory.	2020-02-06 16:06:30 +03:00
Andreas Kling	75cb125e56	Kernel: Put sys$waitid() debug logging behind PROCESS_DEBUG	2020-02-05 19:14:56 +01:00
Sergey Bugaev	b3a24d732d	Kernel+LibC: Add sys$waitid(), and make sys$waitpid() wrap it sys$waitid() takes an explicit description of whether it's waiting for a single process with the given PID, all of the children, a group, etc., and returns its info as a siginfo_t. It also doesn't automatically imply WEXITED, which clears up the confusion in the kernel.	2020-02-05 18:14:37 +01:00
Andreas Kling	3879e5b9d4	Kernel: Start working on a syscall for logging performance events This patch introduces sys$perf_event() with two event types: - PERF_EVENT_MALLOC - PERF_EVENT_FREE After the first call to sys$perf_event(), a process will begin keeping these events in a buffer. When the process dies, that buffer will be written out to "perfcore" in the current directory unless that filename is already taken. This is probably not the best way to do this, but it's a start and will make it possible to start doing memory allocation profiling. :^)	2020-02-02 20:26:27 +01:00
Andreas Kling	934b1d8a9b	Kernel: Finalizer should not go back to sleep if there's more to do Before putting itself back on the wait queue, the finalizer task will now check if there's more work to do, and if so, do it first. :^) This patch also puts a bunch of process/thread debug logging behind PROCESS_DEBUG and THREAD_DEBUG since it was unbearable to debug this stuff with all the spam.	2020-02-01 10:56:17 +01:00
Andreas Kling	6634da31d9	Kernel: Disallow empty ranges in munmap/mprotect/madvise	2020-01-30 21:55:49 +01:00
Andreas Kling	31d1c82621	Kernel: Reject non-user address ranges in mmap/munmap/mprotect/madvise There's no valid reason to allow non-userspace address ranges in these system calls.	2020-01-30 21:51:27 +01:00
Andreas Kling	afd2b5a53e	Kernel: Copy "stack" and "mmap" bits when splitting a Region	2020-01-30 21:51:27 +01:00
Andreas Kling	c9e877a294	Kernel: Address validation helpers should take size_t, not ssize_t	2020-01-30 21:51:27 +01:00
Andreas Kling	c64904a483	Kernel: sys$readlink() should return the number of bytes written out	2020-01-27 21:50:51 +01:00
Andreas Kling	8b49804895	Kernel: sys$waitpid() only needs the waitee thread in the stopped case If the waitee process is dead, we don't need to inspect the thread. This fixes an issue with sys$waitpid() failing before reap() since dead processes will have no remaining threads alive.	2020-01-27 21:21:48 +01:00
Andreas Kling	f4302b58fb	Kernel: Remove SmapDisablers in sys$getsockname() and sys$getpeername() Instead use the user/kernel copy helpers to only copy the minimum stuff needed to/from userspace. Based on work started by Brian Gianforcaro.	2020-01-27 21:11:36 +01:00
Andreas Kling	5163c5cc63	Kernel: Expose the signal that stopped a thread via sys$waitpid()	2020-01-27 20:47:10 +01:00
Andreas Kling	638fe6f84a	Kernel: Disable interrupts while looking into the thread table There was a race window in a bunch of syscalls between calling Thread::from_tid() and checking if the found thread was in the same process as the calling thread. If the found thread object was destroyed at that point, there was a use-after-free that could be exploited by filling the kernel heap with something that looked like a thread object.	2020-01-27 14:04:57 +01:00
Andreas Kling	c1f74bf327	Kernel: Never validate access to the kmalloc memory range Memory validation is used to verify that user syscalls are allowed to access a given memory range. Ring 0 threads never make syscalls, and so will never end up in validation anyway. The reason we were allowing kmalloc memory accesses is because kernel thread stacks used to be allocated in kmalloc memory. Since that's no longer the case, we can stop making exceptions for kmalloc in the validation code.	2020-01-27 12:43:21 +01:00
Andreas Kling	137a45dff2	Kernel: read()/write() should respect timeouts when used on sockets Move timeout management to the ReadBlocker and WriteBlocker classes. Also get rid of the specialized ReceiveBlocker since it no longer does anything that ReadBlocker can't do.	2020-01-26 17:54:23 +01:00
Andreas Kling	b011857e4f	Kernel: Make writev() work again Vector::ensure_capacity() makes sure the underlying vector buffer can contain all the data, but it doesn't update the Vector::size(). As a result, writev() would simply collect all the buffers to write, and then do nothing.	2020-01-26 10:10:15 +01:00
Andreas Kling	b93f6b07c2	Kernel: Make sched_setparam() and sched_getparam() operate on threads Instead of operating on "some random thread in PID", these now operate on the thread with a specific TID. This matches other systems better.	2020-01-26 09:58:58 +01:00
Andreas Kling	f4e7aecec2	Kernel: Preserve CoW bits when splitting VM regions	2020-01-25 17:57:10 +01:00
Andreas Kling	7cc0b18f65	Kernel: Only open a single description for stdio in non-fork processes	2020-01-25 17:05:02 +01:00
Andreas Kling	81ddd2dae0	Kernel: Make sys$setsid() clear the calling process's controlling TTY	2020-01-25 14:53:48 +01:00
Andreas Kling	2bf11b8348	Kernel: Allow empty strings in validate_and_copy_string_from_user() Sergey pointed out that we should just allow empty strings everywhere.	2020-01-25 14:14:11 +01:00
Andreas Kling	69de90a625	Kernel: Simplify Process constructor Move all the fork-specific inheritance logic to sys$fork(), and all the stuff for setting up stdio for non-fork ring 3 processes moves to Process::create_user_process(). Also: we were setting up the PGID, SID and umask twice. Also the code for copying the open file descriptors was overly complicated. Now it's just a simple Vector copy assignment. :^)	2020-01-25 14:13:47 +01:00
Andreas Kling	0f5221568b	Kernel: sys$execve() should not EFAULT for empty argument strings It's okay to exec { "/bin/echo", "" } and it should not EFAULT.	2020-01-25 12:21:30 +01:00
Andreas Kling	30ad7953ca	Kernel: Rename UnveilState to VeilState	2020-01-21 19:28:59 +01:00
Andreas Kling	f38cfb3562	Kernel: Tidy up debug logging a little bit When using dbg() in the kernel, the output is automatically prefixed with [Process(PID:TID)]. This makes it a lot easier to understand which thread is generating the output. This patch also cleans up some common logging messages and removes the now-unnecessary "dbg() << *current << ..." pattern.	2020-01-21 16:16:20 +01:00
Andreas Kling	6081c76515	Kernel: Make O_RDONLY non-zero Sergey suggested that having a non-zero O_RDONLY would make some things less confusing, and it seems like he's right about that. We can now easily check read/write permissions separately instead of dancing around with the bits. This patch also fixes unveil() validation for O_RDWR which previously forgot to check for "r" permission.	2020-01-21 13:27:08 +01:00
Andreas Kling	1b3cac2f42	Kernel: Don't forget about unveiled paths with zero permissions We need to keep these around, otherwise the calling process can remove and re-add a path to increase its permissions.	2020-01-21 11:42:28 +01:00
Andreas Kling	22cfb1f3bd	Kernel: Clear unveiled state on exec()	2020-01-21 10:46:31 +01:00
Andreas Kling	cf48c20170	Kernel: Forked children should inherit unveil()'ed paths	2020-01-21 09:44:32 +01:00
Andreas Kling	0569123ad7	Kernel: Add a basic implementation of unveil() This syscall is a complement to pledge() and adds the same sort of incremental relinquishing of capabilities for filesystem access. The first call to unveil() will "drop a veil" on the process, and from now on, only unveiled parts of the filesystem are visible to it. Each call to unveil() specifies a path to either a directory or a file along with permissions for that path. The permissions are a combination of the following: - r: Read access (like the "rpath" promise) - w: Write access (like the "wpath" promise) - x: Execute access - c: Create/remove access (like the "cpath" promise) Attempts to open a path that has not been unveiled will fail with ENOENT. If the unveiled path lacks sufficient permissions, it will fail with EACCES. Like pledge(), subsequent calls to unveil() with the same path can only remove permissions, not add them. Once you call unveil(nullptr, nullptr), the veil is locked, and it's no longer possible to unveil any more paths for the process, ever. This concept comes from OpenBSD, and their implementation does various things differently, I'm sure. This is just a first implementation for SerenityOS, and we'll keep improving on it as we go. :^)	2020-01-20 22:12:04 +01:00
Andreas Kling	e901a3695a	Kernel: Use the templated copy_to/from_user() in more places These ensure that the "to" and "from" pointers have the same type, and also that we copy the correct number of bytes.	2020-01-20 13:41:21 +01:00
Sergey Bugaev	d5426fcc88	Kernel: Misc tweaks	2020-01-20 13:26:06 +01:00
Sergey Bugaev	9bc6157998	Kernel: Return new fd from sys$fcntl(F_DUPFD) This fixes GNU Bash getting confused after performing a redirection.	2020-01-20 13:26:06 +01:00
Andreas Kling	4b7a89911c	Kernel: Remove some unnecessary casts to uintptr_t VirtualAddress is constructible from uintptr_t and const void*. PhysicalAddress is constructible from uintptr_t but not const void*.	2020-01-20 13:13:03 +01:00
Andreas Kling	a246e9cd7e	Use uintptr_t instead of u32 when storing pointers as integers uintptr_t is 32-bit or 64-bit depending on the target platform. This will help us write pointer size agnostic code so that when the day comes that we want to do a 64-bit port, we'll be in better shape.	2020-01-20 13:13:03 +01:00
Andreas Kling	8d9dd1b04b	Kernel: Add a 1-deep cache to Process::region_from_range() This simple cache gets hit over 70% of the time on "g++ Process.cpp" and shaves ~3% off the runtime.	2020-01-19 16:44:37 +01:00
Andreas Kling	ae0c435e68	Kernel: Add a Process::add_region() helper This is a private helper for adding a Region to Process::m_regions. It's just for convenience since it's a bit cumbersome to do this.	2020-01-19 16:26:42 +01:00
Andreas Kling	1dc9fa9506	Kernel: Simplify PageDirectory swapping in sys$execve() Swap out both the PageDirectory and the Region list at the same time, instead of doing the Region list slightly later.	2020-01-19 16:05:42 +01:00
Andreas Kling	6eab7b398d	Kernel: Make ProcessPagingScope restore CR3 properly Instead of restoring CR3 to the current process's paging scope when a ProcessPagingScope goes out of scope, we now restore exactly whatever the CR3 value was when we created the ProcessPagingScope. This fixes breakage in situations where a process ends up with nested ProcessPagingScopes. This was making profiling very fragile, and with this change it's now possible to profile g++! :^)	2020-01-19 13:44:53 +01:00
Andreas Kling	f7b394e9a1	Kernel: Assert that copy_to/from_user() are called with user addresses This will panic the kernel immediately if these functions are misused so we can catch it and fix the misuse. This patch fixes a couple of misuses: - create_signal_trampolines() writes to a user-accessible page above the 3GB address mark. We should really get rid of this page but that's a whole other thing. - CoW faults need to use copy_from_user rather than copy_to_user since it's the source pointer that points to user memory. - Inode faults need to use memcpy rather than copy_to_user since we're copying a kernel stack buffer into a quickmapped page. This should make the copy_to/from_user() functions slightly less useful for exploitation. Before this, they were essentially just glorified memcpy() with SMAP disabled. :^)	2020-01-19 09:18:55 +01:00
Andreas Kling	5ce9382e98	Kernel: Only require "stdio" pledge for sending signals to self This should match what OpenBSD does. Sending a signal to yourself seems basically harmless.	2020-01-19 08:50:55 +01:00
Sergey Bugaev	3e1ed38d4b	Kernel: Do not return ENOENT for unresolved symbols ENOENT means "no such file or directory", not "no such symbol". Return EINVAL instead, as we already do in other cases.	2020-01-18 23:51:22 +01:00
Sergey Bugaev	d0d13e2bf5	Kernel: Move setting file flags and r/w mode to VFS::open() Previously, VFS::open() would only use the passed flags for permission checking purposes, and Process::sys$open() would set them on the created FileDescription explicitly. Now, they should be set by VFS::open() on any files being opened, including files that the kernel opens internally. This also lets us get rid of the explicit check for whether or not the returned FileDescription was a preopen fd, and in fact, fixes a bug where a read-only preopen fd without any other flags would be considered freshly opened (due to O_RDONLY being indistinguishable from 0) and granted a new set of flags.	2020-01-18 23:51:22 +01:00
Sergey Bugaev	544b8286da	Kernel: Do not open stdio fds for kernel processes Kernel processes just do not need them. This also avoids touching the file (sub)system early in the boot process when initializing the colonel process.	2020-01-18 23:51:22 +01:00
Sergey Bugaev	6466c3d750	Kernel: Pass correct permission flags when opening files Right now, permission flags passed to VFS::open() are effectively ignored, but that is going to change. * O_RDONLY is 0, but it's still nicer to pass it explicitly * POSIX says that binding a Unix socket to a symlink shall fail with EADDRINUSE	2020-01-18 23:51:22 +01:00
Andreas Kling	862b3ccb4e	Kernel: Enforce W^X between sys$mmap() and sys$execve() It's now an error to sys$mmap() a file as writable if it's currently mapped executable by anyone else. It's also an error to sys$execve() a file that's currently mapped writable by anyone else. This fixes a race condition vulnerability where one program could make modifications to an executable while another process was in the kernel, in the middle of exec'ing the same executable. Test: Kernel/elf-execve-mmap-race.cpp	2020-01-18 23:40:12 +01:00
Andreas Kling	4e6fe3c14b	Kernel: Symbolicate kernel EIP on process crash Process::crash() was assuming that EIP was always inside the ELF binary of the program, but it could also be in the kernel.	2020-01-18 14:38:39 +01:00
Andreas Kling	9c9fe62a4b	Kernel: Validate the requested range in allocate_region_with_vmobject()	2020-01-18 14:37:22 +01:00
Andreas Kling	aa63de53bd	Kernel: Use get_syscall_path_argument() in sys$execve() Paths passed to sys$execve() should certainly be subject to all the usual path validation checks.	2020-01-18 11:43:28 +01:00
Andreas Kling	b65572b3fe	Kernel: Disallow mmap names longer than PATH_MAX	2020-01-18 11:34:53 +01:00
Andreas Kling	94ca55cefd	Meta: Add license header to source files As suggested by Joshua, this commit adds the 2-clause BSD license as a comment block to the top of every source file. For the first pass, I've just added myself for simplicity. I encourage everyone to add themselves as copyright holders of any file they've added or modified in some significant way. If I've added myself in error somewhere, feel free to replace it with the appropriate copyright holder instead. Going forward, all new source files should include a license header.	2020-01-18 09:45:54 +01:00
Andreas Kling	19c31d1617	Kernel: Always dump kernel regions when dumping process regions	2020-01-18 08:57:18 +01:00
Sergey Bugaev	064cd2278c	Kernel: Remove the use of FileSystemPath in sys$realpath() Now that VFS::resolve_path() canonicalizes paths automatically, we don't need to do that here anymore.	2020-01-17 21:49:58 +01:00
Sergey Bugaev	8642a7046c	Kernel: Let inodes provide pre-open file descriptions Some magical inodes, such as /proc/pid/fd/fileno, are going to want to open() to a custom FileDescription, so add a hook for that.	2020-01-17 21:49:58 +01:00
Sergey Bugaev	e0013a6b4c	Kernel+LibC: Unify sys$open() and sys$openat() The syscall is now called sys$open(), but it behaves like the old sys$openat(). In userspace, open_with_path_length() is made a wrapper over openat_with_path_length().	2020-01-17 21:49:58 +01:00
Andreas Kling	4d4d5e1c07	Kernel: Drop futex queues/state on exec() This state is not meaningful to the new process image so just drop it.	2020-01-17 16:08:00 +01:00
Andreas Kling	26a31c7efb	Kernel: Add "accept" pledge promise for accepting incoming connections This patch adds a new "accept" promise that allows you to call accept() on an already listening socket. This lets programs set up a socket for listening and then drop "inet" and/or "unix" so that only incoming (and existing) connections are allowed from that point on. No new outgoing connections or listening server sockets can be created. In addition to accept() it also allows getsockopt() with SOL_SOCKET and SO_PEERCRED, which is used to find the PID/UID/GID of the socket peer. This is used by our IPC library when creating shared buffers that should only be accessible to a specific peer process. This allows us to drop "unix" in WindowServer and LookupServer. :^) It also makes the debugging/introspection RPC sockets in CEventLoop based programs work again.	2020-01-17 11:19:06 +01:00
Andreas Kling	c6e552ac8f	Kernel+LibELF: Don't blindly trust ELF symbol offsets in symbolication It was possible to craft a custom ELF executable that when symbolicated would cause the kernel to read from user-controlled addresses anywhere in memory. You could then fetch this memory via /proc/PID/stack. We fix this by making ELFImage hand out StringView rather than raw const char* for symbol names. In case a symbol offset is outside the ELF image, you get a null StringView. :^) Test: Kernel/elf-symbolication-kernel-read-exploit.cpp	2020-01-16 22:11:31 +01:00
Andreas Kling	d79de38bd2	Kernel: Don't allow userspace to sys$open() literal symlinks The O_NOFOLLOW_NOERROR is an internal kernel mechanism used for the implementation of sys$readlink() and sys$lstat(). There is no reason to allow userspace to open symlinks directly.	2020-01-15 21:19:26 +01:00
Andreas Kling	e23536d682	Kernel: Use Vector::unstable_remove() in a couple of places	2020-01-15 19:26:41 +01:00
Liav A	d2b41010c5	Kernel: Change Region allocation helpers We now can create a cacheable Region, so when map() is called, if a Region is cacheable then all the virtual memory space being allocated to it will be marked as not cache disabled. In addition to that, OS components can create a Region that will be mapped to a specific physical address by using the appropriate helper method.	2020-01-14 15:38:58 +01:00
Andreas Kling	65cb406327	Kernel: Allow unlocking a held Lock with interrupts disabled This is needed to eliminate a race in Thread::wait_on() where we'd otherwise have to wait until after unlocking the process lock before we can disable interrupts.	2020-01-13 18:56:46 +01:00
Andrew Kaster	7a7e7c82b5	Kernel: Tighten up exec/do_exec and allow for PT_INTERP interpreters This patch changes how exec() figures out which program image to actually load. Previously, we opened the path to our main executable in find_shebang_interpreter_for_executable, read the first page (or less, if the file was smaller) and then decided whether to recurse with the interpreter instead. We then re-opened the main executable in do_exec. However, since we now want to parse the ELF header and Program Headers of an ELF image before even doing any memory region work, we can change the way this whole process works. We open the file and read (up to) the first page in exec() itself, then pass just the page and the amount read to find_shebang_interpreter_for_executable. Since we now have that page and the FileDescription for the main executable handy, we can do a few things. First, validate the ELF header and ELF program headers for any shenanigans. ELF32 Little Endian i386 only, please. Second, we can grab the PT_INTERP interpreter from any ET_DYN files, and open that guy right away if it exists. Finally, we can pass the main executable's and optionally the PT_INTERP interpreter's file descriptions down to do_exec and not have to feel guilty about opening the file twice. In do_exec, we now have a choice. Are we going to load the main executable, or the interpreter? We could load both, but it'll be way easier for the initial pass on the RTLD if we only load the interpreter. Then it can load the main executable itself like any old shared object, just, the one with main in it :). Later on we can load both of them into memory and the RTLD can relocate itself before trying to do anything. The way it's written now the RTLD will get dibs on its requested virtual addresses being the actual virtual addresses.	2020-01-13 13:03:30 +01:00
Brian Gianforcaro	4cee441279	Kernel: Combine validate and copy of user mode pointers (#1069) Right now there is a significant amount of boilerplate code required to validate user mode parameters in syscalls. In an attempt to reduce this a bit, introduce validate_read_and_copy_typed which combines the usermode address check and does the copy internally if the validation passes. This cleans up a little bit of code from a significant number of syscalls.	2020-01-13 11:19:17 +01:00
Brian Gianforcaro	9cac205d67	Kernel: Fix SMAP in setkeymap syscall It looks like setkeymap was missed when the SMAP functionality was introduced. Disable SMAP only in the scope where we actually read the usermode addresses.	2020-01-13 11:17:10 +01:00
Brian Gianforcaro	02704a73e9	Kernel: Use the templated copy_from_user where possible Now that the templated version of copy_from_user exists there is normally no reason to use the version which takes the number of bytes to copy. Move to the templated version where possible.	2020-01-13 11:07:39 +01:00
Andreas Kling	20b2bfcafd	Kernel: Fix SMAP violation in sys$getrandom()	2020-01-12 20:10:53 +01:00
Sergey Bugaev	33c0dc08a7	Kernel: Don't forget to copy & destroy root_directory_for_procfs Also, rename it to root_directory_relative_to_global_root.	2020-01-12 20:02:11 +01:00
Sergey Bugaev	dd54d13d8d	Kernel+LibC: Allow passing mount flags to chroot() Since a chroot is in many ways similar to a separate root mount, we can also apply mount flags to it as if it was an actual mount. These flags will apply whenever the chrooted process accesses its root directory, but not when other processes access this same directory from the outside. Since it's common to chdir("/") immediately after chrooting (so that files accessed through the current directory inherit the same mount flags), this effectively allows one to apply additional limitations to a process confined inside a chroot. To this effect, sys$chroot() gains a mount_flags argument (exposed as chroot_with_mount_flags() in userspace) which can be set to all the same values as the flags argument for sys$mount(), and additionally to -1 to keep the flags set for that file system. Note that passing 0 as mount_flags will unset any flags that may have been set for the file system, not keep them.	2020-01-12 20:02:11 +01:00
Sergey Bugaev	93ff911473	Kernel: Properly propagate bind mount flags Previously, when performing a bind mount, flags other than MS_BIND were ignored. Now, they're properly propagated the same way as for any other mount.	2020-01-12 20:02:11 +01:00
Sergey Bugaev	b620ed25ab	Kernel: Simplify Ext2FS mount code path Instead of looking up device metadata and then looking up a device by that metadata explicitly, just use VFS::open(). This also means that attempting to mount a device residing on a MS_NODEV file system will properly fail.	2020-01-12 20:02:11 +01:00
Sergey Bugaev	35b0f10f20	Kernel: Don't dump backtrace on successful exits This was getting really annoying.	2020-01-12 20:02:11 +01:00
Andreas Kling	d1839ae0c9	Kernel: Clearing promises with pledge("") should fail Thanks Sergey for catching this brain-fart. :^)	2020-01-12 12:16:17 +01:00
Andreas Kling	114a770c6f	Kernel: Reduce pledge requirement for recvfrom()+sendto() to "stdio" Since these only operate on already-open sockets, we should treat them the same as we do read() and write() by putting them into "stdio".	2020-01-12 11:52:37 +01:00
Andreas Kling	955034e86e	Kernel: Remove manual STAC/CLAC in create_thread()	2020-01-12 11:51:31 +01:00
Andreas Kling	a6cef2408c	Kernel: Add sigreturn() to "stdio" with all the other signal syscalls	2020-01-12 10:32:56 +01:00
Andreas Kling	7b53699e6f	Kernel: Require the "thread" pledge promise for futex()	2020-01-12 10:31:21 +01:00
Andreas Kling	c32d65ae9f	Kernel: Put some more syscalls in the "stdio" bucket yield() and get_kernel_info_page() seem like decent fits for "stdio".	2020-01-12 10:31:21 +01:00
Andreas Kling	ca609ce5a3	Kernel: Put fcntl() debug spam behind DEBUG_IO	2020-01-12 10:01:22 +01:00
Andreas Kling	017b34e1ad	Kernel: Add "video" pledge for accessing framebuffer devices WindowServer becomes the only user.	2020-01-12 02:18:30 +01:00
Andreas Kling	f187374c1b	Kernel: fork()ed children should inherit pledge promises :^) Update various places that now need wider promises as they are not reset by fork() anymore.	2020-01-11 23:28:41 +01:00
Andreas Kling	409a4f7756	ping: Use pledge()	2020-01-11 20:48:43 +01:00
Sergey Bugaev	0cb0f54783	Kernel: Implement bind mounts You can now bind-mount files and directories. This essentially exposes an existing part of the file system in another place, and can be used as an alternative to symlinks or hardlinks. Here's an example of doing this: # mkdir /tmp/foo # mount /home/anon/myfile.txt /tmp/foo -o bind # cat /tmp/foo This is anon's file.	2020-01-11 18:57:53 +01:00
Sergey Bugaev	61c1106d9f	Kernel+LibC: Implement a few mount flags We now support these mount flags: * MS_NODEV: disallow opening any devices from this file system * MS_NOEXEC: disallow executing any executables from this file system * MS_NOSUID: ignore set-user-id bits on executables from this file system The fourth flag, MS_BIND, is defined, but currently ignored.	2020-01-11 18:57:53 +01:00
Sergey Bugaev	2fcbb846fb	Kernel+LibC: Add O_EXEC, move exec permission checking to VFS::open() O_EXEC is mentioned by POSIX, so let's have it. Currently, it is only used inside the kernel to ensure the process has the right permissions when opening an executable.	2020-01-11 18:57:53 +01:00
Sergey Bugaev	4566c2d811	Kernel+LibC: Add support for mount flags At the moment, the actual flags are ignored, but we correctly propagate them all the way from the original mount() syscall to each custody that resides on the mounted FS.	2020-01-11 18:57:53 +01:00
Andreas Kling	83f59419cd	Kernel: Oops, recvfrom() is not quite ready for SMAP protections yet	2020-01-11 13:03:44 +01:00
Andreas Kling	24c736b0e7	Kernel: Use the Syscall string and buffer types more While I was updating syscalls to stop passing null-terminated strings, I added some helpful struct types: - StringArgument { const char*; size_t; } - ImmutableBuffer<Data, Size> { const Data*; Size; } - MutableBuffer<Data, Size> { Data*; Size; } The Process class has some convenience functions for validating and optionally extracting the contents from these structs: - get_syscall_path_argument(StringArgument) - validate_and_copy_string_from_user(StringArgument) - validate(ImmutableBuffer) - validate(MutableBuffer) There's still so much code around this and I'm wondering if we should generate most of it instead. Possible nice little project.	2020-01-11 12:47:47 +01:00
Andreas Kling	1434f30f92	Kernel: Remove SmapDisabler in bind()	2020-01-11 12:07:45 +01:00
Andreas Kling	2d7ae42f75	Kernel: Remove SmapDisabler in clock_nanosleep()	2020-01-11 11:51:03 +01:00
Andreas Kling	0ca6d6c8d2	Kernel: Remove validate_read_str() as nothing uses it anymore :^)	2020-01-11 10:57:50 +01:00
Andreas Kling	f5092b1c7e	Kernel: Pass a parameter struct to mount() This was the last remaining syscall that took a null-terminated string and figured out how long it was by walking it in kernelspace *shudder*.	2020-01-11 10:56:02 +01:00
Andreas Kling	e380142853	Kernel: Pass a parameter struct to rename()	2020-01-11 10:36:54 +01:00
Andreas Kling	46830a0c32	Kernel: Pass a parameter struct to symlink()	2020-01-11 10:31:33 +01:00
Andreas Kling	c97bfbd609	Kernel: Pass a parameter struct to mknod()	2020-01-11 10:27:37 +01:00
Andreas Kling	6536a80aa9	Kernel: Pass a parameter struct to chown()	2020-01-11 10:17:44 +01:00
Andreas Kling	29b3d95004	Kernel: Expose a process's filesystem root as a /proc/PID/root symlink In order to preserve the absolute path of the process root, we save the custody used by chroot() before stripping it to become the new "/". There's probably a better way to do this.	2020-01-10 23:48:44 +01:00
Andreas Kling	ddd0b19281	Kernel: Add a basic chroot() syscall :^) The chroot() syscall now allows the superuser to isolate a process into a specific subtree of the filesystem. This is not strictly permanent, as it is also possible for a superuser to break out of a chroot, but it is a useful mechanism for isolating unprivileged processes. The VFS now uses the current process's root_directory() as the root for path resolution purposes. The root directory is stored as an uncached Custody in the Process object.	2020-01-10 23:14:04 +01:00
Andreas Kling	485443bfca	Kernel: Pass characters+length to link()	2020-01-10 21:26:47 +01:00
Andreas Kling	416c7ac2b5	Kernel: Rename Syscall::SyscallString => Syscall::StringArgument	2020-01-10 20:16:18 +01:00
Andreas Kling	0695ff8282	Kernel: Pass characters+length to readlink() Note that I'm developing some helper types in the Syscall namespace as I go here. Once I settle on some nice types, I will convert all the other syscalls to use them as well.	2020-01-10 20:13:23 +01:00
Andreas Kling	8c5cd97b45	Kernel: Fix kernel null deref on process crash during join_thread() The join_thread() syscall is not supposed to be interruptible by signals, but it was. And since the process death mechanism piggybacked on signal interrupts, it was possible to interrupt a pthread_join() by killing the process that was doing it, leading to confusion due to some assumptions being made by Thread::finalize() for threads that have a pending joiner. This patch fixes the issue by making "interrupted by death" a distinct block result separate from "interrupted by signal". Then we handle that state in join_thread() and tidy things up so that thread finalization doesn't get confused by the pending joiner being gone. Test: Tests/Kernel/null-deref-crash-during-pthread_join.cpp	2020-01-10 19:23:45 +01:00
Andreas Kling	de69f84868	Kernel: Remove SmapDisablers in fchmod() and fchown()	2020-01-10 14:20:14 +01:00
Andreas Kling	952bb95baa	Kernel: Enable SMAP protection during the execve() syscall The userspace execve() wrapper now measures all the strings and puts them in a neat and tidy structure on the stack. This way we know exactly how much to copy in the kernel, and we don't have to use the SMAP-violating validate_read_str(). :^)	2020-01-10 12:20:36 +01:00
Andreas Kling	197e73ee31	Kernel+LibELF: Enable SMAP protection during non-syscall exec() When loading a new executable, we now map the ELF image in kernel-only memory and parse it there. Then we use copy_to_user() when initializing writable regions with data from the executable. Note that the exec() syscall still disables SMAP protection and will require additional work. This patch only affects kernel-originated process spawns.	2020-01-10 10:57:06 +01:00
Andreas Kling	ff16298b44	Kernel: Removed an unused global variable	2020-01-09 18:02:37 +01:00
Andreas Kling	17ef5bc0ac	Kernel: Rename {ss,esp}_if_crossRing to userspace_{ss,esp} These were always so awkwardly named.	2020-01-09 18:02:01 +01:00
Andreas Kling	4b4d369c5d	Kernel: Take path+length in the unlink() and umount() syscalls	2020-01-09 16:23:41 +01:00
Andrew Kaster	e594724b01	Kernel: mmap(..., MAP_PRIVATE, fd, offset) is not supported Make mmap return -ENOTSUP in this case to make sure users don't get confused and think they're using a private mapping when it's actually shared. It's currently not possible to open a file and mmap it MAP_PRIVATE, and change the perms of the private mapping to ones that don't match the permissions of the underlying file.	2020-01-09 09:29:36 +01:00
Andreas Kling	e1d4b19461	Kernel: open() and openat() should ignore non-permission bits in mode	2020-01-08 15:21:06 +01:00
Andreas Kling	532f240f24	Kernel: Remove unused syscall for setting the signal mask	2020-01-08 15:21:06 +01:00
Andreas Kling	200459d644	Kernel: Fix SMAP violation in join_thread()	2020-01-08 15:21:05 +01:00
Andreas Kling	50056d1d84	Kernel: mmap() should fail with ENODEV for directories	2020-01-08 12:47:37 +01:00
Andreas Kling	fe9680f0a4	Kernel: Validate PROT_READ and PROT_WRITE against underlying file This patch fixes some issues with the mmap() and mprotect() syscalls, neither of which was checking the permission bits of the underlying files when mapping an inode MAP_SHARED. This made it possible to subvert execution of any running program by simply memory-mapping its executable and replacing some of the code. Test: Kernel/mmap-write-into-running-programs-executable-file.cpp	2020-01-07 19:32:32 +01:00
Andreas Kling	5387a19268	Kernel: Make Process::file_description() vend a RefPtr<FileDescription> This encourages callers to strongly reference file descriptions while working with them. This fixes a use-after-free issue where one thread would close() an open fd while another thread was blocked on it becoming readable. Test: Kernel/uaf-close-while-blocked-in-read.cpp	2020-01-07 15:53:42 +01:00
Andreas Kling	6a4b376021	Kernel: Validate ftruncate(fd, length) syscall inputs - EINVAL if 'length' is negative - EBADF if 'fd' is not open for writing	2020-01-07 14:48:43 +01:00
Andreas Kling	78a63930cc	Kernel+LibELF: Validate PT_LOAD and PT_TLS offsets before memcpy()'ing Before this, you could make the kernel copy memory from anywhere by setting up an ELF executable with a program header specifying file offsets outside the file. Since ELFImage didn't even know how large it was, we had no clue that we were copying things from outside the ELF. Fix this by adding a size field to ELFImage and validating program header ranges before memcpy()'ing to them. The ELF code is definitely going to need more validation and checking.	2020-01-06 21:04:57 +01:00
Andreas Kling	8088fa0556	Kernel: Process::send_signal() should prefer main thread The main/first thread in a process always has the same TID as the PID.	2020-01-06 14:37:26 +01:00
Andreas Kling	a803312eb4	Kernel: Send SIGCHLD to the thread with same PID as my PPID Instead of delivering SIGCHLD to "any thread" in the process with PPID, send it to the thread with the same TID as my PPID.	2020-01-06 14:35:42 +01:00
Andreas Kling	cd42ccd686	Kernel: The waitpid() syscall was not storing to "wstatus" in all cases	2020-01-06 14:34:04 +01:00
Andreas Kling	47cc3e68c6	Kernel: Remove bogus kernel image access validation checks This code had been misinterpreting the Multiboot ELF section headers since the beginning. Furthermore QEMU wasn't even passing us any headers at all, so this wasn't checking anything.	2020-01-06 13:27:14 +01:00
Andreas Kling	53bda09d15	Kernel: Make utime() take path+length, remove SmapDisabler	2020-01-06 12:23:30 +01:00
Andreas Kling	1226fec19e	Kernel: Remove SmapDisablers in stat() and lstat()	2020-01-06 12:13:48 +01:00
Andreas Kling	a47f0c93de	Kernel: Pass name+length to mmap() and remove SmapDisabler	2020-01-06 12:04:55 +01:00
Andreas Kling	33025a8049	Kernel: Pass name+length to set_mmap_name() and remove SmapDisabler	2020-01-06 11:56:59 +01:00
Andreas Kling	6af8392cf8	Kernel: Remove SmapDisabler in futex()	2020-01-06 11:44:15 +01:00
Andreas Kling	a30fb5c5c1	Kernel: SMAP fixes for module_load() and module_unload() Remove SmapDisabler in module_load() + use get_syscall_path_argument(). Also fix a SMAP violation in module_unload().	2020-01-06 11:36:16 +01:00
Andreas Kling	7c916b9fe9	Kernel: Make realpath() take path+length, get rid of SmapDisabler	2020-01-06 11:32:25 +01:00
Andreas Kling	d6b06fd5a3	Kernel: Make watch_file() syscall take path length as a size_t We don't care to handle negative path lengths anyway.	2020-01-06 11:15:49 +01:00
Andreas Kling	cf7df95ffe	Kernel: Use get_syscall_path_argument() for syscalls that take paths	2020-01-06 11:15:49 +01:00
Andreas Kling	0df72d4712	Kernel: Pass path+length to mkdir(), rmdir() and chmod()	2020-01-06 11:15:49 +01:00
Andreas Kling	642137f014	Kernel: Make access() take path+length Also, let's return EFAULT for nullptr at the LibC layer. We can't do all bad addresses this way, but we can at least do null. :^)	2020-01-06 11:15:48 +01:00
Andreas Kling	2c3a6c37ac	Kernel: Paper over SMAP violations in clock_{gettime,nanosleep}() Just put some SmapDisablers here to unbreak the nesalizer port.	2020-01-05 23:20:33 +01:00
Andreas Kling	c5890afc8b	Kernel: Make chdir() take path+length	2020-01-05 22:06:25 +01:00
Andreas Kling	f231e9ea76	Kernel: Pass path+length to the stat() and lstat() syscalls It's not pleasant having to deal with null-terminated strings as input to syscalls, so let's get rid of them one by one.	2020-01-05 22:02:54 +01:00
Andreas Kling	152a83fac5	Kernel: Remove SmapDisabler in watch_file()	2020-01-05 21:55:20 +01:00
Andreas Kling	80cbb72f2f	Kernel: Remove SmapDisablers in open(), openat() and set_thread_name() This patch introduces a helpful copy_string_from_user() function that takes a bounded null-terminated string from userspace memory and copies it into a String object.	2020-01-05 21:51:06 +01:00
Andreas Kling	c4a1ea34c2	Kernel: Fix SMAP violation in writev() syscall	2020-01-05 19:20:08 +01:00
Andreas Kling	9eef39d68a	Kernel: Start implementing x86 SMAP support Supervisor Mode Access Prevention (SMAP) is an x86 CPU feature that prevents the kernel from accessing userspace memory. With SMAP enabled, trying to read/write a userspace memory address while in the kernel will now generate a page fault. Since it's sometimes necessary to read/write userspace memory, there are two new instructions that quickly switch the protection on/off: STAC (disables protection) and CLAC (enables protection.) These are exposed in kernel code via the stac() and clac() helpers. There's also a SmapDisabler RAII object that can be used to ensure that you don't forget to re-enable protection before returning to userspace code. This patch also adds copy_to_user(), copy_from_user() and memset_user() which are the "correct" way of doing things. These functions allow us to briefly disable protection for a specific purpose, and then turn it back on immediately after it's done. Going forward all kernel code should be moved to using these and all uses of SmapDisabler are to be considered FIXME's. Note that we're not realizing the full potential of this feature since I've used SmapDisabler quite liberally in this initial bring-up patch.	2020-01-05 18:14:51 +01:00
Andreas Kling	1525c11928	Kernel: Add missing iovec base validation for writev() syscall We were forgetting to validate the base pointers of iovecs passed into the writev() syscall. Thanks to braindead for finding this bug! :^)	2020-01-05 10:38:02 +01:00
Andreas Kling	c89fe8a6a3	Kernel: Fix bad TOCTOU pattern in syscalls that take a parameter struct Our syscall calling convention only allows passing up to 3 arguments in registers. For syscalls that take more arguments, we bake them into a struct and pass a pointer to that struct instead. When doing pointer validation, this is what we would do: 1) Validate the "params" struct 2) Validate "params->some_pointer" 3) ... other stuff ... 4) Use "params->some_pointer" Since the parameter struct is stored in userspace, it can be modified by userspace after validation has completed. This was a recurring pattern in many syscalls that was further hidden by me using structured binding declarations to give convenient local names to things in the parameter struct: auto& [some_pointer, ...] = *params; memcpy(some_pointer, ...); This devilishly makes "some_pointer" look like a local variable but it's actually more like an alias for "params->some_pointer" and will expand to a dereference when accessed! This patch fixes the issues by explicitly copying out each member from the parameter structs before validating them, and then never using the "param" pointers beyond that. Thanks to braindead for finding this bug! :^)	2020-01-05 10:37:57 +01:00
Andreas Kling	3a27790fa7	Kernel: Use Thread::from_tid() in more places	2020-01-04 18:56:04 +01:00
Andreas Kling	95ba0d5a02	Kernel: Remove unused "putch" syscall	2020-01-04 16:00:25 +01:00
Andreas Kling	5abc30e057	Kernel: Allow setgroups() to drop all groups with nullptr Previously we'd EFAULT for setgroups(0, nullptr), but we can just as well tolerate it if someone wants to drop groups without a pointer.	2020-01-04 13:47:54 +01:00
Andreas Kling	d84299c7be	Kernel: Allow fchmod() and fchown() on pre-bind() local sockets In order to ensure a specific owner and mode when the local socket filesystem endpoint is instantiated, we need to be able to call fchmod() and fchown() on a socket fd between socket() and bind(). This is because until we call bind(), there is no filesystem inode for the socket yet.	2020-01-03 20:14:56 +01:00
Andreas Kling	1dc64ec064	Kernel: Remove unnecessary logic in kill() and killpg() syscalls As Sergey pointed out, do_killpg() already interprets PID 0 as the PGID of the calling process.	2020-01-03 12:58:59 +01:00
Andreas Kling	9026598999	Kernel: Add a more expressive API for getting random bytes We now have these API's in <Kernel/Random.h>: - get_fast_random_bytes(u8* buffer, size_t buffer_size) - get_good_random_bytes(u8* buffer, size_t buffer_size) - get_fast_random<T>() - get_good_random<T>() Internally they both use x86 RDRAND if available, otherwise they fall back to the same LCG we had in RandomDevice all along. The main purpose of this patch is to give kernel code a way to better express its needs for random data. Randomness is something that will require a lot more work, but this is hopefully a step in the right direction.	2020-01-03 12:43:07 +01:00
Andreas Kling	24cc67d199	Kernel: Remove read_tsc() syscall Since nothing is using this, let's just remove it. That's one less thing to worry about.	2020-01-03 09:27:09 +01:00
Andreas Kling	8cc5fa5598	Kernel: Unbreak module loading (broke with NX bit changes) Modules are now mapped fully RWX. This can definitely be improved, but at least it unbreaks the feature for now.	2020-01-03 03:44:55 +01:00
Andreas Kling	0a1865ebc6	Kernel: read() and write() should fail with EBADF for wrong mode fd's It was previously possible to write to read-only file descriptors, and read from write-only file descriptors. All FileDescription objects now start out non-readable + non-writable, and whoever is creating them has to "manually" enable reading/writing by calling set_readable() and/or set_writable() on them.	2020-01-03 03:29:59 +01:00
Andreas Kling	15f3abc849	Kernel: Handle O_DIRECTORY in VFS::open() instead of in each syscall Just taking care of some FIXMEs.	2020-01-03 03:16:29 +01:00
Andreas Kling	05653a9189	Kernel: killpg() with pgrp=0 should signal every process in the group In the same group as the calling process, that is.	2020-01-03 03:16:29 +01:00
Andreas Kling	005313df82	Kernel: kill() with signal 0 should not actually send anything Also kill() with pid 0 should send to everyone in the same process group as the calling process.	2020-01-03 03:16:29 +01:00
Andreas Kling	8345f51a24	Kernel: Remove unnecessary wraparound check in Process::validate_read() This will be checked moments later by MM.validate_user_read().	2020-01-03 03:16:29 +01:00
Andreas Kling	fdde5cdf26	Kernel: Don't include the process GID in the "extra GIDs" table Process::m_extra_gids is for supplementary GIDs only.	2020-01-02 23:45:52 +01:00
Andreas Kling	9fe316c2d8	Kernel: Add some missing error checks to the setpgid() syscall	2020-01-02 19:40:04 +01:00
Andreas Kling	285130cc55	Kernel: Remove debug spam about marking threads for death	2020-01-02 13:45:22 +01:00
Andreas Kling	7f843ef3b2	Kernel: Make the purge() syscall superuser-only I don't think we need to give unprivileged users access to what is essentially a kernel testing mechanism.	2020-01-02 13:39:49 +01:00
Andreas Kling	c01f766fb2	Kernel: writev() should fail with EINVAL if total length > INT32_MAX	2020-01-02 13:01:41 +01:00
Andreas Kling	7f04334664	Kernel: Remove broken implementation of Unix SHM This code never worked, and was never used for anything. We can build a much better SHM implementation on top of TmpFS or similar when we get to the point when we need one.	2020-01-02 12:44:21 +01:00
Andrew Kaster	bc50a10cc9	Kernel: sys$mprotect protects sub-regions as well as whole ones Split a region into two/three if the desired mprotect range is a strict subset of an existing region. We can then set the access bits on a new region that is just our desired range and add both the new desired subregion and the leftovers back to our page tables.	2020-01-02 12:27:13 +01:00
Andreas Kling	3f7de2713e	Kernel: Make mknod() respect the process umask Otherwise the /bin/mknod command would create world-writable inodes by default (when run by superuser) which you probably don't want.	2020-01-02 02:40:43 +01:00
Andreas Kling	c7eb3ff1b3	Kernel: mknod() should not allow unprivileged users to create devices In fact, unless you are superuser, you may only create a regular file, a named pipe, or a local domain socket. Anything else should EPERM.	2020-01-02 02:36:12 +01:00
Andreas Kling	3dcec260ed	Kernel: Validate the full range of user memory passed to syscalls We now validate the full range of userspace memory passed into syscalls instead of just checking that the first and last byte of the memory are in process-owned regions. This fixes an issue where it was possible to avoid rejection of invalid addresses that sat between two valid ones, simply by passing a valid address and a size large enough to put the end of the range at another valid address. I added a little test utility that tries to provoke EFAULT in various ways to help verify this. I'm sure we can think of more ways to test this but it's at least a start. :^) Thanks to mozjag for pointing out that this code was still lacking! Incidentally this also makes backtraces work again. Fixes #989.	2020-01-02 02:17:12 +01:00
Andreas Kling	38f93ef13b	Kernel: Disable x86 RDTSC instruction in userspace It's still possible to read the TSC via the read_tsc() syscall, but we will now clear some of the bottom bits for unprivileged users.	2020-01-01 18:22:20 +01:00
Andreas Kling	f598bbbb1d	Kernel: Prevent executing I/O instructions in userspace All threads were running with iomapbase=0 in their TSS, which the CPU interprets as "there's an I/O permission bitmap starting at offset 0 into my TSS". Because of that, any bits that were 1 inside the TSS would allow the thread to execute I/O instructions on the port with that bit index. Fix this by always setting the iomapbase to sizeof(TSS32), and also setting the TSS descriptor's limit to sizeof(TSS32), effectively making the I/O permissions bitmap zero-length. This should make it no longer possible to do I/O from userspace. :^)	2020-01-01 17:31:41 +01:00
Andreas Kling	14cdd3fdc1	Kernel: Make module_load() and module_unload() be superuser-only These should just fail with EPERM if you're not the superuser.	2020-01-01 00:46:08 +01:00
Tibor Nagy	624116a8b1	Kernel: Implement AltGr key support	2019-12-31 19:31:42 +01:00
Andreas Kling	36f1de3c89	Kernel: Pointer range validation should fail on wraparound Let's reject address ranges that wrap around the 2^32 mark.	2019-12-31 18:23:17 +01:00
Andreas Kling	903b159856	Kernel: Write address validation was only checking end of write range Thanks to yyyyyyy for finding the bug! :^)	2019-12-31 18:18:54 +01:00
Andreas Kling	3f254bfbc8	Kernel+ping: Only allow superuser to create SOCK_RAW sockets /bin/ping is now setuid-root, and will drop privileges immediately after opening a raw socket.	2019-12-31 01:42:34 +01:00
Andreas Kling	a69734bf2e	Kernel: Also add a process boosting mechanism Let's also have set_process_boost() for giving all threads in a process the same boost.	2019-12-30 20:10:00 +01:00
Andreas Kling	610f3ad12f	Kernel: Add a basic thread boosting mechanism This patch introduces a syscall: int set_thread_boost(int tid, int amount) You can use this to add a permanent boost value to the effective thread priority of any thread with your UID (or any thread in the system if you are the superuser.) This is quite crude, but opens up some interesting opportunities. :^)	2019-12-30 19:23:13 +01:00
Andreas Kling	50677bf806	Kernel: Refactor scheduler to use dynamic thread priorities Threads now have numeric priorities with a base priority in the 1-99 range. Whenever a runnable thread is not scheduled, its effective priority is incremented by 1. This is tracked in Thread::m_extra_priority. The effective priority of a thread is m_priority + m_extra_priority. When a runnable thread is scheduled, its m_extra_priority is reset to zero and the effective priority returns to base. This means that lower-priority threads will always eventually get scheduled to run, once their effective priority becomes high enough to exceed the base priority of threads "above" them. The previous values for ThreadPriority (Low, Normal and High) are now replaced as follows: Low -> 10 Normal -> 30 High -> 50 In other words, it will take 20 ticks for a "Low" priority thread to get to "Normal" effective priority, and another 20 to reach "High". This is not perfect, and I've used some quite naive data structures, but I think the mechanism will allow us to build various new and interesting optimizations, and we can figure out better data structures later on. :^)	2019-12-30 18:46:17 +01:00
Andrew Kaster	cdcab7e5f4	Kernel: Retry mmap if MAP_FIXED is not in flags and addr is not 0 If an mmap fails to allocate a region, but the addr passed in was non-zero, non-fixed mmaps should attempt to allocate at any available virtual address.	2019-12-29 23:01:27 +01:00
Andreas Kling	fed3416bd2	Kernel: Embrace the SerenityOS name	2019-12-29 19:08:02 +01:00
Andreas Kling	1f31156173	Kernel: Add a mode flag to sys$purge and allow purging clean inodes	2019-12-29 13:16:53 +01:00
Andreas Kling	c74cde918a	Kernel+SystemMonitor: Expose amount of per-process clean inode memory This is memory that's loaded from an inode (file) but not modified in memory, so still identical to what's on disk. This kind of memory can be freed and reloaded transparently from disk if needed.	2019-12-29 12:45:58 +01:00
Andreas Kling	0d5e0e4cad	Kernel+SystemMonitor: Expose amount of per-process dirty private memory Dirty private memory is all memory in non-inode-backed mappings that's process-private, meaning it's not shared with any other process. This patch exposes that number via SystemMonitor, giving us an idea of how much memory each process is responsible for all on its own.	2019-12-29 12:28:32 +01:00
Andreas Kling	95034fdfbd	Kernel: Move PC speaker beep timing logic from scheduler to the syscall I don't know why I put this in the scheduler to begin with.. the caller can just block until the beeping is finished.	2019-12-26 22:31:26 +01:00
Andreas Kling	4a8683ea68	Kernel+LibPthread+LibC: Add a naive futex and use it for pthread_cond_t This patch implements a simple version of the futex (fast userspace mutex) API in the kernel and uses it to make the pthread_cond_t API's block instead of busily sched_yield(). An arbitrary userspace address is passed to the kernel as a "token" that identifies the futex and you can then FUTEX_WAIT and FUTEX_WAKE that specific userspace address. FUTEX_WAIT corresponds to pthread_cond_wait() and FUTEX_WAKE is used for pthread_cond_signal() and pthread_cond_broadcast(). I'm pretty sure I'm missing something in this implementation, but it's hopefully okay for a start. :^)	2019-12-25 23:54:06 +01:00
Andreas Kling	9e55bcb7da	Kernel: Make kernel memory regions be non-executable by default From now on, you'll have to request executable memory specifically if you want some.	2019-12-25 22:41:34 +01:00
Andreas Kling	56a28890eb	Kernel: Clarify the various input validity checks in mmap() Also share some validation logic between mmap() and mprotect().	2019-12-25 21:50:13 +01:00
Andreas Kling	419e0ced27	Kernel: Don't allow mmap()/mprotect() to set up PROT_WRITE|PROT_EXEC ..but also allow mprotect() to set PROT_EXEC on a region, something we were just ignoring before.	2019-12-25 13:35:57 +01:00
Conrad Pankoff	efa7141d14	Kernel: Fail module loading if any symbols can not be resolved	2019-12-24 11:52:01 +01:00
Conrad Pankoff	9a8032b479	Kernel: Disallow loading a module twice without explicitly unloading it This ensures that a module has the chance to run its cleanup functions before it's taken out of service.	2019-12-24 02:20:37 +01:00
Conrad Pankoff	3aaeff483b	Kernel: Add a size argument to validate_read_from_kernel	2019-12-24 01:28:38 +01:00
Andreas Kling	4b8851bd01	Kernel: Make TID's be unique PID's This is a little strange, but it's how I understand things should work. The first thread in a new process now has TID == PID. Additional threads subsequently spawned in that process all have unique TID's generated by the PID allocator. TIDs are now globally unique.	2019-12-22 12:38:01 +01:00
Andreas Kling	16812f0f98	Kernel: Get rid of "main thread" concept The idea of all processes reliably having a main thread was nice in some ways, but cumbersome in others. More importantly, it didn't match up with POSIX thread semantics, so let's move away from it. This patch gets rid of Process::main_thread() and now we just have a bunch of Thread objects floating around each Process. When the finalizer nukes the last Thread in a Process, it will also tear down the Process. There's a bunch more things to fix around this, but this is where we get started :^)	2019-12-22 12:37:58 +01:00
Andreas Kling	b6ee8a2c8d	Kernel: Rename vmo => vmobject everywhere	2019-12-19 19:15:27 +01:00
Andreas Kling	8ea4217c01	Kernel: Merge Process::fork() into sys$fork() There was no good reason for this to be a separate function.	2019-12-19 19:07:41 +01:00
Andreas Kling	3012b224f0	Kernel: Fix intermittent assertion failure in sys$exec() While setting up the main thread stack for a new process, we'd incur some zero-fill page faults. This was to be expected, since we allocate a huge stack but lazily populate it with physical pages. The problem is that page fault handlers may enable interrupts in order to grab a VMObject lock (or to page in from an inode.) During exec(), a process is reorganizing itself and will be in a very unrunnable state if the scheduler should interrupt it and then later ask it to run again. Which is exactly what happens if the process gets pre-empted while the new stack's zero-fill page fault grabs the lock. This patch fixes the issue by creating new main thread stacks before disabling interrupts and going into the critical part of exec().	2019-12-18 23:03:23 +01:00
Andreas Kling	72ec2fae6e	Kernel: Ignore MADV_SET_NONVOLATILE if already non-volatile Just return 0 right away without changing any region flags.	2019-12-18 20:48:58 +01:00
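
The dynamic priority scheme described in commit 50677bf806 above can be illustrated with a small standalone sketch. This is not the kernel's code: the Thread struct here is a hypothetical stand-in that models only the two fields the commit message names (m_priority and m_extra_priority), schedule_one_tick() is an invented helper, and the rule that ties go to the first thread in the list is an assumption the commit message does not specify.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for the kernel's Thread class; only the two fields
// named in the commit message are modelled.
struct Thread {
    int priority;       // base priority in the 1-99 range (m_priority)
    int extra_priority; // aging bonus (m_extra_priority)

    int effective_priority() const { return priority + extra_priority; }
};

// Runs one scheduling tick over a set of runnable threads and returns the
// index of the thread that was picked.
// Assumption: on a tie, the thread that appears first in the list wins.
std::size_t schedule_one_tick(std::vector<Thread>& threads)
{
    assert(!threads.empty());

    // Pick the thread with the highest effective priority.
    std::size_t best = 0;
    for (std::size_t i = 1; i < threads.size(); ++i) {
        if (threads[i].effective_priority() > threads[best].effective_priority())
            best = i;
    }

    // The scheduled thread drops back to its base priority; every runnable
    // thread that was passed over gets its effective priority bumped by 1.
    for (std::size_t i = 0; i < threads.size(); ++i) {
        if (i == best)
            threads[i].extra_priority = 0;
        else
            threads[i].extra_priority += 1;
    }
    return best;
}
```

Starting from one "Normal" thread (30) and one "Low" thread (10), the Low thread reaches an effective priority of 30 after being passed over for 20 ticks, which matches the figure in the commit message. Under the tie-breaking rule assumed here, it is first scheduled two ticks after that, once its effective priority exceeds 30.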


1068 Commits