Now the filesystem is generated on-the-fly instead of manually adding and
removing inodes as processes spawn and die.
The code is convoluted and bloated as I wrote it while sleepless. However,
it's still vastly better than the old ProcFS, so I'm committing it.
I also added /proc/PID/fd/N symlinks for each of a process's open fd's.
GObjects can now register a timer with the GEventLoop. This will eventually
cause GTimerEvents to be dispatched to the GObject.
This needed a few supporting changes in the kernel:
- The PIT now ticks 1000 times/sec.
- select() now supports an arbitrary timeout.
- gettimeofday() now returns something in the tv_usec field.
With these changes, the clock window in guitest2 finally ticks on its own.
FileDescriptor will now keep a pointer to the original inode even after
opening it resolves to a character device.
Fixed up /bin/ls to display major and minor device numbers instead of size
for device files.
This required a fair bit of plumbing. The CharacterDevice::close() virtual
will now be closed by ~FileDescriptor(), allowing device implementations to
do custom cleanup at that point.
One big problem remains: if the master PTY is closed before the slave PTY,
we go into crashy land.
Only raw octal modes are supported right now.
This patch also changes mode_t from 32-bit to 16-bit to match the on-disk
type used by Ext2FS.
I also ran into EPERM being errno=0 which was confusing, so I inserted an
ESUCCESS in its place.
It's really only supported in Ext2FS since SynthFS doesn't really want you
mucking around with its files. This is pretty neat though :^)
I ran into some trouble with HashMap while working on this but opted to work
around it and leave that for a separate investigation.
This means we only have to do one fill_rect() per line and the whole process
ends up being ~10% faster than before.
Also added a read_tsc() syscall to give userspace access to the TSC.
This patch adds most of the plumbing for working file deletion in Ext2FS.
Directory entries are removed and inode link counts updated.
We don't yet update the inode or block bitmaps, I will do that separately.
Make PageDirectory retainable and have each Region co-own the PageDirectory
they're mapped into. When unmapped, Region has no associated PageDirectory.
This allows Region to automatically unmap itself when destroyed.
It's a bit confusing that the "current" process is not actually running
while we're inside the scheduler. Perhaps the scheduler should redirect
"current" to its own dummy Process. I'm not sure.
Regardless, this patch improves responsiveness by allowing the scheduler
to unblock a process right after it calls select() in case it already has
a pending wakeup request.
The system can finally idle without burning CPU. :^)
There are some issues with scheduling making the mouse cursor sloppy
and unresponsive that need to be dealt with.
When you open /dev/ptmx, you get a file descriptor pointing to one of the
available MasterPTY's. If none are available, you get an EBUSY.
This makes it possible to open multiple (up to 4) Terminals. :^)
To support this, I also added a CharacterDevice::open() that gets control
when VFS is opening a CharacterDevice. This is useful when we want to return
a custom FileDescriptor like we do here.
Userspace programs can now open /dev/gui_events and read a stream of GUI_Event
structs one at a time.
I was stuck on a stupid problem where we'd reenter Scheduler::yield() due to
having one of the has_data_available_for_reading() implementations using locks.
This is a lot better than having them in kmalloc memory. I'm gonna need
a way to keep track of which process owns which bitmap eventually,
maybe through some sort of resource keying system. We'll see.
The old approach only worked because of an overpermissive accident.
There's now a concept of supervisor physical pages that can be allocated.
They all sit in the low 4 MB of physical memory and are identity mapped,
shared between all processes, and only ring 0 can access them.
Also use a simple array of { dword, const char* } for the KSyms and put the
whole shebang in kmalloc_eternal() memory. This was a fugly source of
kmalloc perma-frag.
It walks all the live Inode objects and flushes pending metadata changes
wherever needed.
This could be optimized by keeping a separate list of dirty Inodes,
but let's not get ahead of ourselves.
Use a little template magic to have Retainable::release() call out to
T::will_be_destroyed() if such a function exists before actually calling
the destructor. This gives us full access to virtual functions in the
pre-destruction code.
This synchronous approach to inodes is silly, obviously. I need to rework
it so that the in-memory CoreInode object is the canonical inode, and then
we just need a sync() that flushes pending changes to disk.
The kernel now bills processes for time spent in kernelspace and userspace
separately. The accounting is forwarded to the parent process in reap().
This makes the "time" builtin in bash work.
This way the scheduler doesn't need to plumb the exit status into the waiter.
We still plumb the waitee pid though, I don't love it but it can be fixed.
mmap() will now map uncommitted pages that get allocated and zeroed upon the
first access. I also made /proc/PID/vm show number of "committed" bytes in
each region. This is so cool! :^)
I was surprised to find that dup()'ed fds don't share the close-on-exec flag.
That means it has to be stored separately from the FileDescriptor object.
Instead of memcpy'ing the entire screen every time we press enter at the
bottom, use the VGA start address register to make a "view" onto the
underlying memory that moves downward as we scroll.
Eventually we run out of memory and have to reset to the start of the
buffer. That's when we memcpy everything. It would be cool if there was
some way to get the hardware to act like a ring buffer with automatic
wrapping here but I don't know how to do that.
I didn't even put the { } properly around everything that would leak.
Let's make sure this works correctly by splitting out the work into a
helper called do_exec().
- Process::exec() needs to restore the original paging scope when called
on a non-current process.
- Add missing InterruptDisabler guards around g_processes access.
- Only flush the TLB when modifying the active page tables.
This is really sweet! :^) The four instances of /bin/sh spawned at
startup now share their read-only text pages.
There are problems and limitations here, and plenty of room for
improvement. But it kinda works.
All right, we can now mmap() a file and it gets magically paged in from fs
in response to an NP page fault. This is really cool :^)
I need to refactor this to support sharing of read-only file-backed pages,
but it's cool to just have something working.
First of all, change sys$mmap to take a struct SC_mmap_params since our
sycsall calling convention can't handle more than 3 arguments.
This exposed a bug in Syscall::invoke() needing to use clobber lists.
It was a bit confusing to debug. :^)
This is dirty but pretty cool! If we have a pending, unmasked signal for
a process that's blocked inside the kernel, we set up alternate stacks
for that process and unblock it to execute the signal handler.
A slightly different return trampoline is used here: since we need to
get back into the kernel, a dedicated syscall is used (sys$sigreturn.)
This restores the TSS contents of the process to the state it was in
while we were originally blocking in the kernel.
NOTE: There's currently only one "kernel resume TSS" so signal nesting
definitely won't work.
Processes are either alive (with many substates), dead or forgiven.
A dead process is forgiven when the parent waitpid()s on it.
Dead orphans are also forgiven.
There's a lot of work to be done around this.
It only works for sending a signal to a process that's in userspace code.
We implement reception by synthesizing a PUSHA+PUSHF in the receiving process
(operating on values in the TSS.)
The TSS CS:EIP is then rerouted to the signal handler and a tiny return
trampoline is constructed in a dedicated region in the receiving process.
Also hacked up /bin/kill to be able to send arbitrary signals (kill -N PID)
Implemented some syscalls: dup(), dup2(), getdtablesize().
FileHandle is now a retainable, since that's needed for dup()'ed fd's.
I didn't really test any of this beyond a basic smoke check.
sys$fork() now clones all writable regions with per-page COW bits.
The pages are then mapped read-only and we handle a PF by COWing the pages.
This is quite delightful. Obviously there's lots of work to do still,
and it needs better data structures, but the general concept works.
This turned out way better than the old code. ELF loading is now quite
straightforward, and we don't need the weird concept of subregions anymore.
Next step is to respect the is_writable flag.
This is quite cool! The syscall entry point plumbs the register dump
down to sys$fork(), which uses it to set up the child process's TSS
in order to resume execution right after the int 0x80 fork() call. :^)
This works pretty well, although there is some problem with the kernel
alias mappings used to clone the parent process's regions. If I disable
the MM::release_page_directory() code, there's no problem. Probably there's
a premature freeing of a physical page somehow.
We no longer disable interrupts around the whole affair.
Since MM manages per-process data structures, this works quite smoothly now.
Only procfs had to be tweaked with an InterruptDisabler.