Previously, `Heap` would store serialized data in blocks of 1024 bytes
regardless of the actual length. Data longer than 1024 bytes was
silently truncated causing database corruption.
This changes the heap storage to prefix every block with two new fields:
the total data size in bytes, and the next block to retrieve if the data
is longer than what can be stored inside a single block. By chaining
blocks together, we can store arbitrary amounts of data without needing
to change anything of the logic in the rest of LibSQL.
As part of these changes, the "free list" is also removed from the heap
awaiting an actual implementation: it was never used.
Note that this bumps the database version from 3 to 4, and as such
invalidates (deletes) any database opened with LibSQL that is not
version 4.
We are performing a lot of checks on pointers that are performed again
immediately afterwards because of a dereference. This removes the
redundant `VERIFY`s and simplifies a couple others.
These instances were detected by searching for files that include
AK/Format.h, but don't match the regex:
\\b(CheckedFormatString|critical_dmesgln|dbgln|dbgln_if|dmesgln|FormatBu
ilder|__FormatIfSupported|FormatIfSupported|FormatParser|FormatString|Fo
rmattable|Formatter|__format_value|HasFormatter|max_format_arguments|out
|outln|set_debug_enabled|StandardFormatter|TypeErasedFormatParams|TypeEr
asedParameter|VariadicFormatParams|v_critical_dmesgln|vdbgln|vdmesgln|vf
ormat|vout|warn|warnln|warnln_if)\\b
(Without the linebreaks.)
This regex is pessimistic, so there might be more files that don't
actually use any formatting functions.
Observe that this revealed that Userland/Libraries/LibC/signal.cpp is
missing an include.
In theory, one might use LibCPP to detect things like this
automatically, but let's do this one step after another.
After splitting a node, the new node was written to the same pointer as
the current node - probably a copy / paste error. This new code requires
a `.pointer() -> u32` to exist on the object to be serialized,
preventing this issue from happening again.
Fixes#15844.
Classes reading and writing to the data heap would communicate directly
with the Heap object, and transfer ByteBuffers back and forth with it.
This makes things like caching and locking hard. Therefore all data
persistence activity will be funneled through a Serializer object which
in turn submits it to the Heap.
Introducing this unfortunately resulted in a huge amount of churn, in
which a number of smaller refactorings got caught up as well.
Unfortunately this patch is quite large.
The main functionality included are a BTree index implementation and
the Heap class which manages persistent storage.
Also included are a Key subclass of the Tuple class, which is a
specialization for index key tuples. This "dragged in" the Meta layer,
which has classes defining SQL objects like tables and indexes.