This adds some new buffers to the `FPUState` struct, which contains
enough space for the `xsave` instruction to run. This instruction writes
the upper part of the x86 SIMD registers (YMM0-15) to a seperate
256-byte area, as well as an "xsave header" describing the region.
If the underlying processor supports AVX, the `fxsave` instruction is no
longer used, as `xsave` itself implictly saves all of the SSE and x87
registers.
Co-authored-by: Leon Albrecht <leon.a@serenityos.org>
This will make it possible to add many, many more CPU features - more
than the current limit 32 and later limit of 64 if we stick with an enum
class to be specific :^)
This allows us to enable Write-Combine on e.g. framebuffers,
significantly improving performance on bare metal.
To keep things simple we right now only use one of up to three bits
(bit 7 in the PTE), which maps to the PA4 entry in the PAT MSR, which
we set to the Write-Combine mode on each CPU at boot time.
Since the inline capacity of the Vector return type was not specified
explicitly, the vector was automatically copied to a 0-length inline
capacity one, essentially eliminating the optimization.
... instead of returning the maximum number of Processor objects that we
can allocate.
Some ports (e.g. gdb) rely on this information to determine the number
of worker threads to spawn. When gdb spawned 64 threads, the kernel
could not cope with generating backtraces for it, which prevented us
from debugging it properly.
This commit also removes the confusingly named
`Processor::processor_count` function so that this mistake can't happen
again.
... In files included from Kernel/Thread.cpp or Kernel/Process.cpp
Some places the warning is suppressed, because we do not want a const
object do have non-const access to the returned sub-object.
The platform independent Processor.h file includes the shared processor
code and includes the specific platform header file.
All references to the Arch/x86/Processor.h file have been replaced with
a reference to Arch/Processor.h.
This fixes a triple fault that occurs when compiling serenity with
the i686 clang toolchain. (The underlying issue is that the old inline
assembly did not specify that it clobbered the eax/ecx/edx registers
and as such the compiler assumed they were not changed and used their
values across it)
Co-authored-by: Brian Gianforcaro <bgianf@serenityos.org>
Initializing the variable this way fixes a kernel panic in Clang where
the object was zero-initialized, so the `m_in_scheduler` contained the
wrong value. GCC got it right, but we're better off making this change,
as leaving uninitialized fields in constant-initialized objects can
cause other weird situations like this. Also, initializing only a single
field to a non-zero value isn't worth the cost of no longer fitting in
`.bss`.
Another two variables suffer from the same problem, even though their
values are supposed to be zero. Removing these causes the
`_GLOBAL_sub_I_` function to no longer be generated and the (not
handled) `.init_array` section to be omitted.
This avoids a race between getting the processor-specific SchedulerData
and accessing it. (Switching to a different CPU in that window means
that we're operating on the wrong SchedulerData.)
Co-authored-by: Tom <tomut@yahoo.com>
Leave interrupts enabled so that we can still process IRQs. Critical
sections should only prevent preemption by another thread.
Co-authored-by: Tom <tomut@yahoo.com>
By making these functions static we close a window where we could get
preempted after calling Processor::current() and move to another
processor.
Co-authored-by: Tom <tomut@yahoo.com>
Processing SMP messages outside of non-SMP mode is a waste of time,
and now that we don't rely on the side effects of calling the message
processing function, let's stop calling it entirely. :^)
To add a new per-CPU data structure, add an ID for it to the
ProcessorSpecificDataID enum.
Then call ProcessorSpecific<T>::initialize() when you are ready to
construct the per-CPU data structure on the current CPU. It can then
be accessed via ProcessorSpecific<T>::get().
This patch replaces the existing hard-coded mechanisms for Scheduler
and MemoryManager per-CPU data structure.
Right now we're using the FS segment for our per-CPU struct. On x86_64
there's an instruction to switch between a kernel and usermode GS
segment (swapgs) which we could use.
This patch doesn't update the rest of the code to use swapgs but it
prepares for that by using the GS segment instead of the FS segment.