Pointer Compression to enable 8G Loom (#164)

### Description

Resolves #163

In fulfillment of https://urbit.org/grants/loom-pointer-compression

### Benchmark

#### Basic brass pill fakezod boot benchmark - x86_64 linux

Pay primary attention to `Elapsed (wall clock) time`, `Maximum resident set
size (kbytes)`, and `Major (requiring I/O) page faults`.

##### Takeaway

We expected increased memory usage, since this is the natural tradeoff of
alignment. Note that runs (2) and (3) included changes to align the _stack_
as well as the _heap_. In the run without stack alignment (4) you can see
that stack alignment has no effect on max RSS -- at least when booting from a
pill. From some basic evaluation in gdb I've done in the past, I expect stack
usage when DWORD-aligned to increase by ~50% (rather than the theoretical
100%). Stack usage is quite small compared to heap usage, however, so you
shouldn't expect to see it reflected in maximum RSS. Overall, maximum
resident memory increased by roughly 33%.
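
For context, the reason the heap must be DWORD (8-byte) aligned at all: with
every box aligned, the low bit of a word offset into the loom is always zero,
so it can be dropped from the stored reference and restored on use, doubling
the addressable loom (from 4 GB to the 8 GB this PR targets). Below is a
minimal sketch of that encode/decode, mirroring `u3a_to_pug`/`u3a_to_off` in
the diff further down; `ref_of_off`/`off_of_ref` are illustrative names, not
vere functions.

```c
#include <stdint.h>

/* stand-in for u3C.vits_w: virtual bits gained via compression (0 or 1) */
static const uint32_t vits_w = 1;

/* loom word offset -> compressed reference payload (tag bits 30/31 are
   added by the caller). DWORD alignment guarantees the dropped bit is 0. */
static inline uint32_t ref_of_off(uint32_t off_w) {
  return off_w >> vits_w;
}

/* compressed reference payload -> loom word offset */
static inline uint32_t off_of_ref(uint32_t ref_w) {
  return ref_w << vits_w;
}
```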

The number of major page faults encountered during a brass boot is roughly
the same as before.

The elapsed (wall clock) time difference between (2) and (3) is essentially
zero: no measurable performance is gained by making the virtual bit size a
compile-time constant.

There is a small latency cost in the current DWORD-aligned heap allocation
implementation compared to a runtime that doesn't require allocations to be
aligned. Compare the elapsed times of (1) -- 2:21.09, i.e. 141.09 s -- and
(2) -- 2:23.50, i.e. 143.50 s: the result is ~1.7% increased latency. Run
(4), however, which excluded the stack alignment changes -- 2:22.55, i.e.
142.55 s -- splits the difference at ~1.0% increased latency. Note that this
_is_ repeatable and the 1% difference isn't random: running the same program
again on the same system exhibits tiny variance.
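
(For reference, the percentages are computed as (143.50 - 141.09) / 141.09 ≈
1.7% and (142.55 - 141.09) / 141.09 ≈ 1.0%.)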

##### 1) -O3 no pointer compression vere/develop

A run from the HEAD of vere/develop

commit 7c890c3350

```
Command being timed: "./urbit -t -q -F zod -B brass.pill -c zod"
User time (seconds): 1.25
System time (seconds): 0.03
Percent of CPU this job got: 0%
Elapsed (wall clock) time (h:mm:ss or m:ss): 2:21.09
Average shared text size (kbytes): 0
Average unshared data size (kbytes): 0
Average stack size (kbytes): 0
Average total size (kbytes): 0
Maximum resident set size (kbytes): 148036
Average resident set size (kbytes): 0
Major (requiring I/O) page faults: 3492
Minor (reclaiming a frame) page faults: 6188
Voluntary context switches: 68
Involuntary context switches: 5
Swaps: 0
File system inputs: 14866
File system outputs: 21544
Socket messages sent: 0
Socket messages received: 0
Signals delivered: 0
Page size (bytes): 4096
Exit status: 0
```

##### 2) -O3 compiletime determined virtual bit size
i/163/pointer-compression

State of the pointer compression work prior to the migration (the point at
which concessions to a runtime-determined virtual bit size were made)

commit 4083f1c660

```
Command being timed: "./urbit -t -q -F zod -B brass.pill -c zod"
User time (seconds): 1.39
System time (seconds): 0.04
Percent of CPU this job got: 1%
Elapsed (wall clock) time (h:mm:ss or m:ss): 2:23.50
Average shared text size (kbytes): 0
Average unshared data size (kbytes): 0
Average stack size (kbytes): 0
Average total size (kbytes): 0
Maximum resident set size (kbytes): 197176
Average resident set size (kbytes): 0
Major (requiring I/O) page faults: 3487
Minor (reclaiming a frame) page faults: 6219
Voluntary context switches: 68
Involuntary context switches: 2
Swaps: 0
File system inputs: 14866
File system outputs: 21544
Socket messages sent: 0
Socket messages received: 0
Signals delivered: 0
Page size (bytes): 4096
Exit status: 0
```

##### 3) -O3 runtime determined virtual bit size
i/163/pointer-compression

Current state of the pointer compression work -- after implementation of the
migration and the runtime-determined virtual bit size concession

commit 8dffe067e1:

```
Command being timed: "./urbit -t -q -F zod -B brass.pill -c zod"
User time (seconds): 1.40
System time (seconds): 0.06
Percent of CPU this job got: 1%
Elapsed (wall clock) time (h:mm:ss or m:ss): 2:23.52
Average shared text size (kbytes): 0
Average unshared data size (kbytes): 0
Average stack size (kbytes): 0
Average total size (kbytes): 0
Maximum resident set size (kbytes): 197200
Average resident set size (kbytes): 0
Major (requiring I/O) page faults: 3489
Minor (reclaiming a frame) page faults: 7242
Voluntary context switches: 69
Involuntary context switches: 4
Swaps: 0
File system inputs: 14866
File system outputs: 21544
Socket messages sent: 0
Socket messages received: 0
Signals delivered: 0
Page size (bytes): 4096
Exit status: 0
```

##### 4) -O3 runtime determined virtual bit size _WITHOUT_ stack
alignment barter-simsum/pointer-compression-no-align-stack

commit 8b0438ab3b

```
Command being timed: "./urbit -t -q -F zod -B brass.pill -c zod"
User time (seconds): 1.42
System time (seconds): 0.06
Percent of CPU this job got: 1%
Elapsed (wall clock) time (h:mm:ss or m:ss): 2:22.55
Average shared text size (kbytes): 0
Average unshared data size (kbytes): 0
Average stack size (kbytes): 0
Average total size (kbytes): 0
Maximum resident set size (kbytes): 197204
Average resident set size (kbytes): 0
Major (requiring I/O) page faults: 3492
Minor (reclaiming a frame) page faults: 6221
Voluntary context switches: 68
Involuntary context switches: 3
Swaps: 0
File system inputs: 14866
File system outputs: 21544
Socket messages sent: 0
Socket messages received: 0
Signals delivered: 0
Page size (bytes): 4096
Exit status: 0
```

##### FINAL BENCHMARK BEFORE MERGE:

This was run after some fairly significant changes to minimize malloc
padding,
fix memory corruption when run with `U3_MEMORY_DEBUG`, and more.

It was agreed to keep stack alignment out of this PR as it currently
isn't used
and costs us a bit of latency.

Runtime- vs. compile-time-determined pointer compression still shows no
latency difference on the x86_64 Linux machine tested (DDR4 memory). On an
M2 MacBook Air, there _was_ a 5% latency increase from compile-time to
runtime pointer compression. This may be fixed later and would not
necessitate another migration.

The additional free list sanity checking done in `u3a_loom_sane`
introduces
negligible latency in `u3e_save`. On a relatively fragmented heap, it
only takes
60ms to complete. This will be kept in order to detect _some_ memory
corruption
if it occurs and prevent that corruption from propagating to disk.
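
For reference, below is a simplified sketch of the invariants `u3a_loom_sane`
walks (see its implementation in the diff further down); the struct and
helpers here are hypothetical stand-ins for `u3a_fbox`, `_box_slot`, and the
per-size free-list heads.

```c
#include <assert.h>
#include <stddef.h>

#define FBOX_NO 27                      /* one free list per size class */

typedef struct fnode {                  /* stand-in for u3a_fbox */
  struct fnode *pre_u, *nex_u;
  size_t        siz_w;
} fnode;

extern fnode* fre_u[FBOX_NO];           /* per-size free-list heads */
extern size_t box_slot(size_t siz_w);   /* size -> free-list index */

/* Walk every free list: a broken pre/nex link, or a list head sitting in
   the wrong size slot, indicates loom corruption. */
static void loom_sane_sketch(void) {
  for (size_t i = 0; i < FBOX_NO; i++) {
    for (fnode* t = fre_u[i]; t; t = t->nex_u) {
      if (t->nex_u) assert(t->nex_u->pre_u == t);
      if (t->pre_u) assert(t->pre_u->nex_u == t);
      if (!t->pre_u) assert(fre_u[box_slot(t->siz_w)] == t);
    }
  }
}
```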

A brass pill boot was performed off of
13e0b43d8da4bdd318fcd4e3d3610caa3af4608a. Observe that there is no regression
in the Elapsed (wall clock) time statistic. Further, the maximum resident set
size has been reduced by 25%, back to its pre-pointer-compression size
(150M). This is likely due to a decrease in the average allocation's padding.

Lastly, total sweep size was compared between a freshly booted pier without
pointer compression and one with pointer compression post-migration. There is
no noticeable increase in the overall size of allocations.

```
Command being timed: "./burbit -t -q -F zod -B brass.pill -c brasspillbench"
User time (seconds): 1.27
System time (seconds): 0.04
Percent of CPU this job got: 0%
Elapsed (wall clock) time (h:mm:ss or m:ss): 2:21.49
Average shared text size (kbytes): 0
Average unshared data size (kbytes): 0
Average stack size (kbytes): 0
Average total size (kbytes): 0
Maximum resident set size (kbytes): 150088
Average resident set size (kbytes): 0
Major (requiring I/O) page faults: 3490
Minor (reclaiming a frame) page faults: 6188
Voluntary context switches: 64
Involuntary context switches: 4
Swaps: 0
File system inputs: 14866
File system outputs: 21544
Socket messages sent: 0
Socket messages received: 0
Signals delivered: 0
Page size (bytes): 4096
Exit status: 0
```
commit c8be121455 -- Ted Blackman, 2023-02-28 15:56:25 -05:00 (committed via GitHub)
22 changed files with 885 additions and 348 deletions

View File

@ -35,12 +35,14 @@ build --host_copt='-O3'
# fake ship tests.
build --host_copt='-DU3_CPU_DEBUG'
build --host_copt='-DU3_MEMORY_DEBUG'
build --host_copt='-DC3DBG'
# Enable maximum debug info and disable optimizations for debug config. It's
# important that these lines come after setting the default debug and
# optimization level flags above.
build:dbg --copt='-O0'
build:dbg --copt='-g3'
build:dbg --copt='-DC3DBG'
# Any personal configuration should go in .user.bazelrc.
try-import %workspace%/.user.bazelrc

9
.gitignore vendored
View File

@ -7,6 +7,15 @@
*.swo
*.swp
# Tags
.tags
.etags
TAGS
TAGS*
GPATH
GRTAGS
GTAGS
# Fake ships.
/zod
/nec

View File

@ -5,7 +5,10 @@
cc_library(
name = "c3",
srcs = glob(
["*.h"],
[
"*.h",
"*.c",
],
exclude = [
"c3.h",
"*_tests.c",

5
pkg/c3/defs.c Normal file
View File

@ -0,0 +1,5 @@
#include "defs.h"
c3_w c3_align_w(c3_w x, c3_w al, align_dir hilo);
c3_d c3_align_d(c3_d x, c3_d al, align_dir hilo);
void *c3_align_p(void const * p, size_t al, align_dir hilo);

View File

@ -22,6 +22,11 @@
/** Random useful C macros.
**/
/* Assert. Good to capture.
TODO: determine which c3_assert calls can rather call c3_dessert, i.e. in
public releases, which calls to c3_assert should abort and which should
no-op? If the latter, is the assert useful inter-development to validate
conditions we might accidentally break or not useful at all?
*/
# if defined(ASAN_ENABLED) && defined(__clang__)
@ -44,6 +49,26 @@
} while(0)
#endif
/* Dessert. Debug assert. If a debugger is attached, it will break in and
execution can be allowed to proceed without aborting the process.
Otherwise, the unhandled SIGTRAP will dump core.
*/
#ifdef C3DBG
#if defined(__i386__) || defined(__x86_64__)
#define c3_dessert(x) do { if(!(x)) __asm__ volatile("int $3"); } while (0)
#elif defined(__thumb__)
#define c3_dessert(x) do { if(!(x)) __asm__ volatile(".inst 0xde01"); } while (0)
#elif defined(__aarch64__)
#define c3_dessert(x) do { if(!(x)) __asm__ volatile(".inst 0xd4200000"); } while (0)
#elif defined(__arm__)
#define c3_dessert(x) do { if(!(x)) __asm__ volatile(".inst 0xe7f001f0"); } while (0)
#else
STATIC_ASSERT(0, "debugger break instruction unimplemented");
#endif
#else
#define c3_dessert(x) ((void)(0))
#endif
/* Stub.
*/
# define c3_stub c3_assert(!"stub")
@ -161,4 +186,44 @@
# define c3_rename(a, b) ({ \
rename(a, b);})
/* c3_align(
x - the address/quantity to align,
al - the alignment,
hilo - [C3_ALGHI, C3_ALGLO] high or low align
)
hi or lo align x to al
unless effective type of x is c3_w or c3_d, assumes x is a pointer.
*/
#define c3_align(x, al, hilo) \
_Generic((x), \
c3_w : c3_align_w, \
c3_d : c3_align_d, \
default : c3_align_p) \
(x, al, hilo)
typedef enum { C3_ALGHI=1, C3_ALGLO=0 } align_dir;
inline c3_w
c3_align_w(c3_w x, c3_w al, align_dir hilo) {
c3_dessert(hilo <= C3_ALGHI && hilo >= C3_ALGLO);
x += hilo * (al - 1);
x &= ~(al - 1);
return x;
}
inline c3_d
c3_align_d(c3_d x, c3_d al, align_dir hilo) {
c3_dessert(hilo <= C3_ALGHI && hilo >= C3_ALGLO);
x += hilo * (al - 1);
x &= ~(al - 1);
return x;
}
inline void*
c3_align_p(void const * p, size_t al, align_dir hilo) {
uintptr_t x = (uintptr_t)p;
c3_dessert(hilo <= C3_ALGHI && hilo >= C3_ALGLO);
x += hilo * (al - 1);
x &= ~(al - 1);
return (void*)x;
}
#endif /* ifndef C3_DEFS_H */

View File

@ -114,7 +114,7 @@
/** Address space layout.
***
*** NB: 2^29 words == 2GB
*** NB: 2^30 words == 4G
**/
# if defined(U3_OS_linux)
# ifdef __LP64__
@ -225,6 +225,9 @@
# endif
/* Static assertion.
TODO: we could just use static_assert (c23)/_Static_assert (c11) in
<assert.h>
*/
# define ASSERT_CONCAT_(a, b) a##b
# define ASSERT_CONCAT(a, b) ASSERT_CONCAT_(a, b)

View File

@ -9,6 +9,8 @@
**/
/* Canonical integers.
*/
typedef size_t c3_z;
typedef ssize_t c3_zs;
typedef uint64_t c3_d;
typedef int64_t c3_ds;
typedef uint32_t c3_w;
@ -32,4 +34,42 @@
typedef uintptr_t c3_p; // pointer-length uint - really really bad
typedef intptr_t c3_ps; // pointer-length int - really really bad
/* Print specifiers
*/
/* c3_z */
#define PRIc3_z "zu" /* unsigned dec */
#define PRIc3_zs "zd" /* signed dec */
#define PRIxc3_z "zx" /* unsigned hex */
#define PRIXc3_z "zX" /* unsigned HEX */
/* c3_d */
#define PRIc3_d PRIu64
#define PRIc3_ds PRIi64
#define PRIxc3_d PRIx64
#define PRIXc3_d PRIX64
/* c3_w */
#define PRIc3_w PRIu32
#define PRIc3_ws PRIi32
#define PRIxc3_w PRIx32
#define PRIXc3_w PRIX32
/* c3_s */
#define PRIc3_s PRIu16
#define PRIc3_ss PRIi16
#define PRIxc3_s PRIx16
#define PRIXc3_s PRIX16
/* c3_y */
#define PRIc3_y PRIu8
#define PRIc3_ys PRIi8
#define PRIxc3_y PRIx8
#define PRIXc3_y PRIX8
/* c3_b */
#define PRIc3_b PRIu8
#define PRIxc3_b PRIx8
#define PRIXc3_b PRIX8
#endif /* ifndef C3_TYPES_H */

View File

@ -18,6 +18,16 @@ c3_w u3_Code;
// declarations of inline functions
//
void u3a_config_loom(c3_w ver_w);
void *u3a_into(c3_w x);
c3_w u3a_outa(void *p);
c3_w u3a_to_off(c3_w som);
void *u3a_to_ptr(c3_w som);
c3_w *u3a_to_wtr(c3_w som);
c3_w u3a_to_pug(c3_w off);
c3_w u3a_to_pom(c3_w off);
void
u3a_drop(const u3a_pile* pil_u);
void*
@ -51,7 +61,30 @@ static void
_box_count(c3_ws siz_ws) { }
#endif
/* _box_vaal(): validate box alignment. no-op without C3DBG
TODO: I think validation code that might be compiled out like this,
_box_count, (others?) should have perhaps its own header and certainly its
own prefix. having to remind yourself that _box_count doesn't actually do
anything unless U3_CPU_DEBUG is defined is annoying. */
#define _box_vaal(box_u) \
do { \
c3_dessert(((uintptr_t)u3a_boxto(box_u) \
& u3C.balign_d-1) == 0); \
c3_dessert((((u3a_box*)(box_u))->siz_w \
& u3C.walign_w-1) == 0); \
} while(0)
/* _box_slot(): select the right free list to search for a block.
TODO: do we really need a loop to do this?
so our free list logic looks like this:
siz_w < 6 words then [0]
siz_w < 16 then [1]
siz_w < 32 then [2]
siz_w < 64 then [3]
...
siz_w > 4G then [26]
*/
static c3_w
_box_slot(c3_w siz_w)
@ -59,23 +92,18 @@ _box_slot(c3_w siz_w)
if ( siz_w < u3a_minimum ) {
return 0;
}
else {
c3_w i_w = 1;
while ( 1 ) {
if ( i_w == u3a_fbox_no ) {
return (i_w - 1);
}
if ( siz_w < 16 ) {
return i_w;
}
siz_w = (siz_w + 1) >> 1;
i_w += 1;
}
for (c3_w i_w = 1; i_w < u3a_fbox_no; i_w++) {
if ( siz_w < 16 ) return i_w;
siz_w = siz_w + 1 >> 1;
}
return u3a_fbox_no - 1;
}
/* _box_make(): construct a box.
box_v - start addr of box
siz_w - size of allocated space adjacent to block
use_w - box's refcount
*/
static u3a_box*
_box_make(void* box_v, c3_w siz_w, c3_w use_w)
@ -85,10 +113,12 @@ _box_make(void* box_v, c3_w siz_w, c3_w use_w)
c3_assert(siz_w >= u3a_minimum);
box_w[0] = siz_w;
box_w[siz_w - 1] = siz_w;
box_u->siz_w = siz_w;
box_w[siz_w - 1] = siz_w; /* store the size at the end of the allocation as well */
box_u->use_w = use_w;
_box_vaal(box_u);
# ifdef U3_MEMORY_DEBUG
box_u->cod_w = u3_Code;
box_u->eus_w = 0;
@ -179,6 +209,8 @@ _box_free(u3a_box* box_u)
return;
}
_box_vaal(box_u);
#if 0
/* Clear the contents of the block, for debugging.
*/
@ -191,12 +223,12 @@ _box_free(u3a_box* box_u)
}
#endif
if ( c3y == u3a_is_north(u3R) ) {
if ( c3y == u3a_is_north(u3R) ) { /* north */
/* Try to coalesce with the block below.
*/
if ( box_w != u3a_into(u3R->rut_p) ) {
c3_w laz_w = *(box_w - 1);
u3a_box* pox_u = (u3a_box*)(void *)(box_w - laz_w);
c3_w laz_w = *(box_w - 1); /* the size of a box stored at the end of its allocation */
u3a_box* pox_u = (u3a_box*)(void *)(box_w - laz_w); /* the head of the adjacent box below */
if ( 0 == pox_u->use_w ) {
_box_detach(pox_u);
@ -221,8 +253,8 @@ _box_free(u3a_box* box_u)
}
_box_attach(box_u);
}
}
else {
} /* end north */
else { /* south */
/* Try to coalesce with the block above.
*/
if ( (box_w + box_u->siz_w) != u3a_into(u3R->rut_p) ) {
@ -240,7 +272,7 @@ _box_free(u3a_box* box_u)
u3R->hat_p = u3a_outa(box_w + box_u->siz_w);
}
else {
c3_w laz_w = *(box_w - 1);
c3_w laz_w = box_w[-1];
u3a_box* pox_u = (u3a_box*)(void *)(box_w - laz_w);
if ( 0 == pox_u->use_w ) {
@ -250,71 +282,53 @@ _box_free(u3a_box* box_u)
}
_box_attach(box_u);
}
}
}
/* _me_align_pad(): pad to first point after pos_p aligned at (ald_w, alp_w).
*/
static __inline__ c3_w
_me_align_pad(u3_post pos_p, c3_w ald_w, c3_w alp_w)
{
c3_w adj_w = (ald_w - (alp_w + 1));
c3_p off_p = (pos_p + adj_w);
c3_p orp_p = off_p &~ (ald_w - 1);
c3_p fin_p = orp_p + alp_w;
c3_w pad_w = (fin_p - pos_p);
return pad_w;
}
/* _me_align_dap(): pad to last point before pos_p aligned at (ald_w, alp_w).
*/
static __inline__ c3_w
_me_align_dap(u3_post pos_p, c3_w ald_w, c3_w alp_w)
{
c3_w adj_w = alp_w;
c3_p off_p = (pos_p - adj_w);
c3_p orp_p = (off_p &~ (ald_w - 1));
c3_p fin_p = orp_p + alp_w;
c3_w pad_w = (pos_p - fin_p);
return pad_w;
} /* end south */
}
/* _ca_box_make_hat(): in u3R, allocate directly on the hat.
*/
static u3a_box*
_ca_box_make_hat(c3_w len_w, c3_w ald_w, c3_w alp_w, c3_w use_w)
_ca_box_make_hat(c3_w len_w, c3_w ald_w, c3_w off_w, c3_w use_w)
{
c3_w pad_w, siz_w;
u3_post all_p;
c3_w
pad_w, /* padding between returned pointer and box */
siz_w; /* total size of allocation */
u3_post
box_p, /* start of box */
all_p; /* start of returned pointer */
if ( c3y == u3a_is_north(u3R) ) {
all_p = u3R->hat_p;
pad_w = _me_align_pad(all_p, ald_w, alp_w);
siz_w = (len_w + pad_w);
box_p = all_p = u3R->hat_p;
all_p += c3_wiseof(u3a_box) + off_w;
pad_w = c3_align(all_p, ald_w, C3_ALGHI)
- all_p;
siz_w = c3_align(len_w + pad_w, u3C.walign_w, C3_ALGHI);
// hand-inlined: siz_w >= u3a_open(u3R)
//
if ( (siz_w >= (u3R->cap_p - u3R->hat_p)) ) {
return 0;
}
u3R->hat_p = (all_p + siz_w);
u3R->hat_p += siz_w;
}
else {
all_p = (u3R->hat_p - len_w);
pad_w = _me_align_dap(all_p, ald_w, alp_w);
siz_w = (len_w + pad_w);
all_p -= pad_w;
box_p = all_p = u3R->hat_p - len_w;
all_p += c3_wiseof(u3a_box) + off_w;
pad_w = all_p
- c3_align(all_p, ald_w, C3_ALGLO);
siz_w = c3_align(len_w + pad_w, u3C.walign_w, C3_ALGHI);
// hand-inlined: siz_w >= u3a_open(u3R)
//
if ( siz_w >= (u3R->hat_p - u3R->cap_p) ) {
return 0;
}
u3R->hat_p = all_p;
box_p = u3R->hat_p -= siz_w;
}
return _box_make(u3a_into(all_p), siz_w, use_w);
c3_dessert(!(ald_w <= 2 && off_w == 0) || (0 == pad_w));
c3_dessert(pad_w <= 4);
return _box_make(u3a_into(box_p), siz_w, use_w);
}
#if 0
@ -431,15 +445,23 @@ _ca_reclaim_half(void)
/* _ca_willoc(): u3a_walloc() internals.
*/
static void*
_ca_willoc(c3_w len_w, c3_w ald_w, c3_w alp_w)
_ca_willoc(c3_w len_w, c3_w ald_w, c3_w off_w)
{
c3_w siz_w = c3_max(u3a_minimum, u3a_boxed(len_w));
c3_w sel_w = _box_slot(siz_w);
alp_w = (alp_w + c3_wiseof(u3a_box)) % ald_w;
// XX: this logic is totally bizarre, but preserve it.
//
/* XX: this logic is totally bizarre, but preserve it.
**
** This means we use the next size bigger instead of the "correct"
** size. For example, a 20 word allocation will be freed into free
** list 2 but will be allocated from free list 3.
**
** This is important to preserve because the sequential search may be
** very slow. On a real-world task involving many compilations,
** removing this line made this function appear in ~80% of samples.
**
** For reference, this was added in cgyarvin/urbit ffed9e748d8f6c.
*/
if ( (sel_w != 0) && (sel_w != u3a_fbox_no - 1) ) {
sel_w += 1;
}
@ -449,6 +471,7 @@ _ca_willoc(c3_w len_w, c3_w ald_w, c3_w alp_w)
u3p(u3a_fbox) *pfr_p = &u3R->all.fre_p[sel_w];
while ( 1 ) {
/* increment until we get a non-null freelist */
if ( 0 == *pfr_p ) {
if ( sel_w < (u3a_fbox_no - 1) ) {
sel_w += 1;
@ -462,7 +485,7 @@ _ca_willoc(c3_w len_w, c3_w ald_w, c3_w alp_w)
// memory nearly empty; reclaim; should not be needed
//
// if ( (u3a_open(u3R) + u3R->all.fre_w) < 65536 ) { _ca_reclaim_half(); }
box_u = _ca_box_make_hat(siz_w, ald_w, alp_w, 1);
box_u = _ca_box_make_hat(siz_w, ald_w, off_w, 1);
/* Flush a bunch of cell cache, then try again.
*/
@ -470,68 +493,86 @@ _ca_willoc(c3_w len_w, c3_w ald_w, c3_w alp_w)
if ( u3R->all.cel_p ) {
u3a_reflux();
return _ca_willoc(len_w, ald_w, alp_w);
return _ca_willoc(len_w, ald_w, off_w);
}
else {
_ca_reclaim_half();
return _ca_willoc(len_w, ald_w, alp_w);
return _ca_willoc(len_w, ald_w, off_w);
}
}
else return u3a_boxto(box_u);
}
}
else {
c3_w pad_w = _me_align_pad(*pfr_p, ald_w, alp_w);
else { /* we got a non-null freelist */
u3_post box_p, all_p;
box_p = all_p = *pfr_p;
all_p += c3_wiseof(u3a_box) + off_w;
c3_w pad_w = c3_align(all_p, ald_w, C3_ALGHI) - all_p;
c3_w des_w = c3_align(siz_w + pad_w, u3C.walign_w, C3_ALGHI);
if ( 1 == ald_w ) c3_assert(0 == pad_w);
/* calls maximally requesting DWORD alignment of returned pointer
shouldn't require padding. */
c3_dessert(!(ald_w <= 2 && off_w == 0) || (0 == pad_w));
c3_dessert(pad_w <= 4);
if ( (siz_w + pad_w) > u3to(u3a_fbox, *pfr_p)->box_u.siz_w ) {
if ( (des_w) > u3to(u3a_fbox, *pfr_p)->box_u.siz_w ) {
/* This free block is too small. Continue searching.
*/
pfr_p = &(u3to(u3a_fbox, *pfr_p)->nex_p);
continue;
}
else {
else { /* free block fits desired alloc size */
u3a_box* box_u = &(u3to(u3a_fbox, *pfr_p)->box_u);
/* We have found a free block of adequate size. Remove it
** from the free list.
*/
siz_w += pad_w;
_box_count(-(box_u->siz_w));
/* misc free list consistency checks.
TODO: in the future should probably only run for C3DBG builds */
{
if ( (0 != u3to(u3a_fbox, *pfr_p)->pre_p) &&
(u3to(u3a_fbox, u3to(u3a_fbox, *pfr_p)->pre_p)->nex_p
!= (*pfr_p)) )
{
{ /* this->pre->nex isn't this */
c3_assert(!"loom: corrupt");
}
if( (0 != u3to(u3a_fbox, *pfr_p)->nex_p) &&
(u3to(u3a_fbox, u3to(u3a_fbox, *pfr_p)->nex_p)->pre_p
!= (*pfr_p)) )
{
{ /* this->nex->pre isn't this */
c3_assert(!"loom: corrupt");
}
/* pop the block */
/* this->nex->pre = this->pre */
if ( 0 != u3to(u3a_fbox, *pfr_p)->nex_p ) {
u3to(u3a_fbox, u3to(u3a_fbox, *pfr_p)->nex_p)->pre_p =
u3to(u3a_fbox, *pfr_p)->pre_p;
}
/* this = this->nex */
*pfr_p = u3to(u3a_fbox, *pfr_p)->nex_p;
}
/* If we can chop off another block, do it.
*/
if ( (siz_w + u3a_minimum) <= box_u->siz_w ) {
if ( (des_w + u3a_minimum) <= box_u->siz_w ) {
/* Split the block.
*/
/* XXX: Despite the fact that we're making a box here, we don't
actually have to ensure it's aligned, since des_w and all boxes
already on the loom /are/ aligned. A debug break here implies
that you broke those conditions, not that this needs to handle
alignment. abandon hope. */
c3_w* box_w = ((c3_w *)(void *)box_u);
c3_w* end_w = box_w + siz_w;
c3_w lef_w = (box_u->siz_w - siz_w);
c3_w* end_w = box_w + des_w;
c3_w lef_w = (box_u->siz_w - des_w);
_box_attach(_box_make(end_w, lef_w, 0));
return u3a_boxto(_box_make(box_w, siz_w, 1));
return u3a_boxto(_box_make(box_w, des_w, 1));
}
else {
c3_assert(0 == box_u->use_w);
@ -549,19 +590,28 @@ _ca_willoc(c3_w len_w, c3_w ald_w, c3_w alp_w)
}
/* _ca_walloc(): u3a_walloc() internals.
- len_w: allocation length in words
- ald_w: desired alignment. N.B. the void * returned is not guaranteed to be
aligned on this value. But the allocation will be sized such that the
caller can independently align the value.
- off_w: alignment offset to use when sizing request.
void * returned guaranteed to be DWORD (8-byte) aligned.
*/
static void*
_ca_walloc(c3_w len_w, c3_w ald_w, c3_w alp_w)
_ca_walloc(c3_w len_w, c3_w ald_w, c3_w off_w)
{
void* ptr_v;
while ( 1 ) {
ptr_v = _ca_willoc(len_w, ald_w, alp_w);
for (;;) {
ptr_v = _ca_willoc(len_w, ald_w, off_w);
if ( 0 != ptr_v ) {
break;
}
_ca_reclaim_half();
}
_box_vaal(u3a_botox(ptr_v));
return ptr_v;
}
@ -588,6 +638,7 @@ u3a_walloc(c3_w len_w)
xuc_i++;
}
#endif
_box_vaal(u3a_botox(ptr_v));
return ptr_v;
}
@ -597,7 +648,7 @@ void*
u3a_wealloc(void* lag_v, c3_w len_w)
{
if ( !lag_v ) {
return u3a_malloc(len_w);
return u3a_walloc(len_w);
}
else {
u3a_box* box_u = u3a_botox(lag_v);
@ -644,23 +695,37 @@ u3a_wfree(void* tox_v)
}
/* u3a_wtrim(): trim storage.
old_w - old length
len_w - new length
*/
void
u3a_wtrim(void* tox_v, c3_w old_w, c3_w len_w)
{
c3_w* nov_w = tox_v;
if ( (old_w > len_w)
&& ((old_w - len_w) >= u3a_minimum) )
{
c3_w* box_w = (void *)u3a_botox(nov_w);
c3_w* end_w = (nov_w + len_w + 1);
c3_w asz_w = (end_w - box_w);
c3_w bsz_w = box_w[0] - asz_w;
if ( (old_w > len_w)
&& ((old_w - len_w) >= u3a_minimum) )
{
u3a_box* box_u = u3a_botox(nov_w);
c3_w* box_w = (void*)u3a_botox(nov_w);
_box_attach(_box_make(end_w, bsz_w, 0));
c3_w* end_w = c3_align(nov_w + len_w + 1, /* +1 for trailing allocation size */
u3C.balign_d,
C3_ALGHI);
box_w[0] = asz_w;
c3_w asz_w = (end_w - box_w); /* total size in words of new allocation */
if (box_u->siz_w <= asz_w) return;
c3_w bsz_w = box_u->siz_w - asz_w; /* size diff in words between old and new */
c3_dessert(asz_w && ((asz_w & u3C.walign_w-1) == 0)); /* new allocation size must be non-zero and DWORD multiple */
c3_dessert(end_w < (box_w + box_u->siz_w)); /* desired alloc end must not exceed existing boundaries */
c3_dessert(((uintptr_t)end_w & u3C.balign_d-1) == 0); /* address of box getting freed must be DWORD aligned */
c3_dessert((bsz_w & u3C.walign_w-1) == 0); /* size of box getting freed must be DWORD multiple */
_box_attach(_box_make(end_w, bsz_w, 0)); /* free the unneeded space */
box_u->siz_w = asz_w;
box_w[asz_w - 1] = asz_w;
}
}
@ -681,31 +746,33 @@ u3a_calloc(size_t num_i, size_t len_i)
}
/* u3a_malloc(): aligned storage measured in bytes.
Internally pads allocations to 16-byte alignment independent of DWORD
alignment ensured for word sized allocations.
*/
void*
u3a_malloc(size_t len_i)
{
c3_w len_w = (c3_w)((len_i + 3) >> 2);
c3_w* ptr_w = _ca_walloc(len_w + 1, 4, 3);
u3_post ptr_p = u3a_outa(ptr_w);
c3_w pad_w = _me_align_pad(ptr_p, 4, 3);
c3_w* out_w = u3a_into(ptr_p + pad_w + 1);
c3_w len_w = (c3_w)((len_i + 3) >> 2);
c3_w *ptr_w = _ca_walloc(len_w +1, 4, 1); /* +1 for word storing pad size */
c3_w *out_w = c3_align(ptr_w + 1, 16, C3_ALGHI);
c3_w pad_w = u3a_outa(out_w) - u3a_outa(ptr_w);
#if 0
if ( u3a_botox(out_w) == (u3a_box*)(void *)0x3bdd1c80) {
static int xuc_i = 0;
out_w[-1] = pad_w - 1; /* the size of the pad doesn't include the word storing the size (-1) */
u3l_log("xuc_i %d", xuc_i);
// if ( 1 == xuc_i ) { abort(); }
xuc_i++;
}
#endif
out_w[-1] = pad_w;
c3_dessert(&out_w[len_w] /* alloced space after alignment is sufficient */
<= &((c3_w*)u3a_botox(ptr_w))[u3a_botox(ptr_w)->siz_w]);
c3_dessert(pad_w <= 4 && pad_w > 0);
c3_dessert(&out_w[-1] > ptr_w);
return out_w;
}
/* u3a_cellblock(): allocate a block of cells on the hat.
XXX beware when we stop boxing cells and QWORD align references. Alignment
not guaranteed to be preserved after a call.
*/
static c3_o
u3a_cellblock(c3_w num_w)
@ -730,7 +797,7 @@ u3a_cellblock(c3_w num_w)
// hand inline of _box_make(u3a_into(all_p), u3a_minimum, 1)
{
box_w[0] = u3a_minimum;
box_u->siz_w = u3a_minimum;
box_w[u3a_minimum - 1] = u3a_minimum;
box_u->use_w = 1;
#ifdef U3_MEMORY_DEBUG
@ -765,7 +832,7 @@ u3a_cellblock(c3_w num_w)
// hand inline of _box_make(u3a_into(all_p), u3a_minimum, 1);
{
box_w[0] = u3a_minimum;
box_u->siz_w = u3a_minimum;
box_w[u3a_minimum - 1] = u3a_minimum;
box_u->use_w = 1;
# ifdef U3_MEMORY_DEBUG
@ -786,6 +853,7 @@ u3a_cellblock(c3_w num_w)
}
/* u3a_celloc(): allocate a cell.
XXX beware when we stop boxing cells and QWORD align references
*/
c3_w*
u3a_celloc(void)
@ -881,9 +949,6 @@ u3a_realloc(void* lag_v, size_t len_i)
return new_w;
}
}
c3_w len_w = (c3_w)len_i;
return u3a_wealloc(lag_v, (len_w + 3) >> 2);
}
/* u3a_free(): free for aligned malloc.
@ -1682,12 +1747,24 @@ u3a_rewritten_noun(u3_noun som)
return som;
}
u3_post som_p = u3a_rewritten(u3a_to_off(som));
/* If this is being called during a migration, one-bit pointer compression
needs to be temporarily enabled so the rewritten reference is compressed */
if (u3C.migration_state == MIG_REWRITE_COMPRESSED)
u3C.vits_w = 1;
if ( c3y == u3a_is_pug(som) ) {
return u3a_to_pug(som_p);
som_p = u3a_to_pug(som_p);
}
else {
return u3a_to_pom(som_p);
som_p = u3a_to_pom(som_p);
}
/* likewise, pointer compression is disabled until migration is complete */
if (u3C.migration_state == MIG_REWRITE_COMPRESSED)
u3C.vits_w = 0;
return som_p;
}
/* u3a_mark_mptr(): mark a malloc-allocated ptr for gc.
@ -1910,25 +1987,28 @@ u3a_print_memory(FILE* fil_u, c3_c* cap_c, c3_w wor_w)
{
c3_assert( 0 != fil_u );
c3_w byt_w = (wor_w * 4);
c3_w gib_w = (byt_w / 1000000000);
c3_w mib_w = (byt_w % 1000000000) / 1000000;
c3_w kib_w = (byt_w % 1000000) / 1000;
c3_w bib_w = (byt_w % 1000);
c3_z byt_z = ((c3_z)wor_w * 4);
c3_z gib_z = (byt_z / 1000000000);
c3_z mib_z = (byt_z % 1000000000) / 1000000;
c3_z kib_z = (byt_z % 1000000) / 1000;
c3_z bib_z = (byt_z % 1000);
if ( byt_w ) {
if ( gib_w ) {
fprintf(fil_u, "%s: GB/%d.%03d.%03d.%03d\r\n",
cap_c, gib_w, mib_w, kib_w, bib_w);
if ( byt_z ) {
if ( gib_z ) {
fprintf(fil_u, "%s: GB/%" PRIc3_z ".%03" PRIc3_z ".%03" PRIc3_z ".%03" PRIc3_z "\r\n",
cap_c, gib_z, mib_z, kib_z, bib_z);
}
else if ( mib_w ) {
fprintf(fil_u, "%s: MB/%d.%03d.%03d\r\n", cap_c, mib_w, kib_w, bib_w);
else if ( mib_z ) {
fprintf(fil_u, "%s: MB/%" PRIc3_z ".%03" PRIc3_z ".%03" PRIc3_z "\r\n",
cap_c, mib_z, kib_z, bib_z);
}
else if ( kib_w ) {
fprintf(fil_u, "%s: KB/%d.%03d\r\n", cap_c, kib_w, bib_w);
else if ( kib_z ) {
fprintf(fil_u, "%s: KB/%" PRIc3_z ".%03" PRIc3_z "\r\n",
cap_c, kib_z, bib_z);
}
else if ( bib_w ) {
fprintf(fil_u, "%s: B/%d\r\n", cap_c, bib_w);
else if ( bib_z ) {
fprintf(fil_u, "%s: B/%" PRIc3_z "\r\n",
cap_c, bib_z);
}
}
}
@ -2631,3 +2711,34 @@ u3a_string(u3_atom a)
str_c[met_w] = 0;
return str_c;
}
/* u3a_loom_sane(): sanity checks the state of the loom for obvious corruption
*/
void
u3a_loom_sane()
{
/*
Only checking validity of freelists for now. Other checks could be added,
e.g. noun HAMT traversal, boxwise traversal of loom validating `siz_w`s,
`use_w`s, no empty space, etc. If added, some of that may need to be guarded
behind C3DBG flags. Freelist traversal is probably fine to always do though.
*/
for (c3_w i_w = 0; i_w < u3a_fbox_no; i_w++) {
u3p(u3a_fbox) this_p = u3R->all.fre_p[i_w];
u3a_fbox *this_u = u3to(u3a_fbox, this_p);
for (; this_p
; this_p = this_u->nex_p
, this_u = u3to(u3a_fbox, this_p)) {
u3p(u3a_fbox) pre_p = this_u->pre_p
, nex_p = this_u->nex_p;
u3a_fbox *pre_u = u3to(u3a_fbox, this_u->pre_p)
, *nex_u = u3to(u3a_fbox, this_u->nex_p);
if (nex_p && nex_u->pre_p != this_p) c3_assert(!"loom: wack");
if (pre_p && pre_u->nex_p != this_p) c3_assert(!"loom: wack");
if (!pre_p /* this must be the head of a freelist */
&& u3R->all.fre_p[_box_slot(this_u->box_u.siz_w)] != this_p)
c3_assert(!"loom: wack");
}
}
}

View File

@ -2,46 +2,54 @@
#define U3_ALLOCATE_H
#include "manage.h"
#include "options.h"
/** Constants.
**/
/* u3a_bits: number of bits in word-addressed pointer. 29 == 2GB.
*/
# define u3a_bits U3_OS_LoomBits
# define u3a_bits U3_OS_LoomBits /* 30 */
/* u3a_page: number of bits in word-addressed page. 12 == 16Kbyte page.
/* u3a_vits_max: number of virtual bits in a reference gained via pointer
compression
*/
# define u3a_page 12
# define u3a_vits_max 1
/* u3a_bits_max: max loom bex
*/
# define u3a_bits_max (8 * sizeof(c3_w) + u3a_vits_max)
/* u3a_page: number of bits in word-addressed page. 12 == 16K page
*/
# define u3a_page 12ULL
/* u3a_pages: maximum number of pages in memory.
*/
# define u3a_pages (1 << (u3a_bits - u3a_page))
# define u3a_pages (1ULL << (u3a_bits + u3a_vits_max - u3a_page) )
/* u3a_words: maximum number of words in memory.
*/
# define u3a_words (1 << u3a_bits)
# define u3a_words ( 1ULL << (u3a_bits + u3a_vits_max ))
/* u3a_bytes: maximum number of bytes in memory.
*/
# define u3a_bytes (sizeof(c3_w) * u3a_words)
# define u3a_bytes ((sizeof(c3_w) * u3a_words))
/* u3a_cells: number of representable cells.
*/
# define u3a_cells (c3_w)(u3a_words / u3a_minimum)
# define u3a_cells (( u3a_words / u3a_minimum ))
/* u3a_maximum: maximum loom object size (largest possible atom).
*/
# define u3a_maximum \
(c3_w)(u3a_words - (c3_wiseof(u3a_box) + c3_wiseof(u3a_atom)))
# define u3a_maximum ( u3a_words - (c3_wiseof(u3a_box) + c3_wiseof(u3a_atom) + 1))
/* u3a_minimum: minimum loom object size (actual size of a cell).
*/
# define u3a_minimum (c3_w)(1 + c3_wiseof(u3a_box) + c3_wiseof(u3a_cell))
# define u3a_minimum ((c3_w)( 1 + c3_wiseof(u3a_box) + c3_wiseof(u3a_cell) ))
/* u3a_fbox_no: number of free lists per size.
*/
# define u3a_fbox_no 27
# define u3a_fbox_no 27
/** Structures.
**/
@ -186,15 +194,15 @@
/** Macros. Should be better commented.
**/
/* In and out of the box.
u3a_boxed -> sizeof u3a_box + allocation size (len_w) + 1 (for storing the redundant size)
u3a_boxto -> the region of memory adjacent to the box.
u3a_botox -> the box adjacent to the region of memory
*/
# define u3a_boxed(len_w) (len_w + c3_wiseof(u3a_box) + 1)
# define u3a_boxto(box_v) ( (void *) \
( ((c3_w *)(void*)(box_v)) + \
c3_wiseof(u3a_box) ) )
# define u3a_botox(tox_v) ( (struct _u3a_box *) \
(void *) \
( ((c3_w *)(void*)(tox_v)) - \
c3_wiseof(u3a_box) ) )
( (u3a_box *)(void *)(box_v) + 1 ) )
# define u3a_botox(tox_v) ( (u3a_box *)(void *)(tox_v) - 1 )
/* Inside a noun.
*/
@ -214,26 +222,6 @@
*/
# define u3a_is_pom(som) ((0b11 == ((som) >> 30)) ? c3y : c3n)
/* u3a_to_off(): mask off bits 30 and 31 from noun [som].
*/
# define u3a_to_off(som) ((som) & 0x3fffffff)
/* u3a_to_ptr(): convert noun [som] into generic pointer into loom.
*/
# define u3a_to_ptr(som) (u3a_into(u3a_to_off(som)))
/* u3a_to_wtr(): convert noun [som] into word pointer into loom.
*/
# define u3a_to_wtr(som) ((c3_w *)u3a_to_ptr(som))
/* u3a_to_pug(): set bit 31 of [off].
*/
# define u3a_to_pug(off) (off | 0x80000000)
/* u3a_to_pom(): set bits 30 and 31 of [off].
*/
# define u3a_to_pom(off) (off | 0xc0000000)
/* u3a_is_atom(): yes if noun [som] is direct atom or indirect atom.
*/
# define u3a_is_atom(som) c3o(u3a_is_cat(som), \
@ -261,69 +249,63 @@
: u3m_bail(c3__exit) )
# define u3t(som) u3a_t(som)
/* u3a_into(): convert loom offset [x] into generic pointer.
*/
# define u3a_into(x) ((void *)(u3_Loom + (x)))
# define u3to(type, x) ((type *)u3a_into(x))
# define u3tn(type, x) (x) ? (type*)u3a_into(x) : (void*)NULL
/* u3a_outa(): convert pointer [p] into word offset into loom.
*/
# define u3a_outa(p) (((c3_w*)(void*)(p)) - u3_Loom)
# define u3of(type, x) (u3a_outa((type*)x))
/* u3a_is_north(): yes if road [r] is north road.
*/
# define u3a_is_north(r) __(r->cap_p > r->hat_p)
# define u3a_is_north(r) __((r)->cap_p > (r)->hat_p)
/* u3a_is_south(): yes if road [r] is south road.
*/
# define u3a_is_south(r) !u3a_is_north(r)
# define u3a_is_south(r) !u3a_is_north((r))
/* u3a_open(): words of contiguous free space in road [r]
*/
# define u3a_open(r) ( (c3y == u3a_is_north(r)) \
? (c3_w)(r->cap_p - r->hat_p) \
: (c3_w)(r->hat_p - r->cap_p) )
? (c3_w)((r)->cap_p - (r)->hat_p) \
: (c3_w)((r)->hat_p - (r)->cap_p) )
/* u3a_full(): total words in road [r];
** u3a_full(r) == u3a_heap(r) + u3a_temp(r) + u3a_open(r)
*/
# define u3a_full(r) ( (c3y == u3a_is_north(r)) \
? (c3_w)(r->mat_p - r->rut_p) \
: (c3_w)(r->rut_p - r->mat_p) )
? (c3_w)((r)->mat_p - (r)->rut_p) \
: (c3_w)((r)->rut_p - (r)->mat_p) )
/* u3a_heap(): words of heap in road [r]
*/
# define u3a_heap(r) ( (c3y == u3a_is_north(r)) \
? (c3_w)(r->hat_p - r->rut_p) \
: (c3_w)(r->rut_p - r->hat_p) )
? (c3_w)((r)->hat_p - (r)->rut_p) \
: (c3_w)((r)->rut_p - (r)->hat_p) )
/* u3a_temp(): words of stack in road [r]
*/
# define u3a_temp(r) ( (c3y == u3a_is_north(r)) \
? (c3_w)(r->mat_p - r->cap_p) \
: (c3_w)(r->cap_p - r->mat_p) )
? (c3_w)((r)->mat_p - (r)->cap_p) \
: (c3_w)((r)->cap_p - (r)->mat_p) )
# define u3a_north_is_senior(r, dog) \
__((u3a_to_off(dog) < r->rut_p) || \
(u3a_to_off(dog) >= r->mat_p))
__((u3a_to_off(dog) < (r)->rut_p) || \
(u3a_to_off(dog) >= (r)->mat_p))
# define u3a_north_is_junior(r, dog) \
__((u3a_to_off(dog) >= r->cap_p) && \
(u3a_to_off(dog) < r->mat_p))
__((u3a_to_off(dog) >= (r)->cap_p) && \
(u3a_to_off(dog) < (r)->mat_p))
# define u3a_north_is_normal(r, dog) \
c3a(!(u3a_north_is_senior(r, dog)), \
!(u3a_north_is_junior(r, dog)))
# define u3a_south_is_senior(r, dog) \
__((u3a_to_off(dog) < r->mat_p) || \
(u3a_to_off(dog) >= r->rut_p))
__((u3a_to_off(dog) < (r)->mat_p) || \
(u3a_to_off(dog) >= (r)->rut_p))
# define u3a_south_is_junior(r, dog) \
__((u3a_to_off(dog) < r->cap_p) && \
(u3a_to_off(dog) >= r->mat_p))
__((u3a_to_off(dog) < (r)->cap_p) && \
(u3a_to_off(dog) >= (r)->mat_p))
# define u3a_south_is_normal(r, dog) \
c3a(!(u3a_south_is_senior(r, dog)), \
@ -353,6 +335,30 @@
: (u3a_botox(u3a_to_ptr(som))->use_w == 1) \
? c3y : c3n )
/* like _box_vaal but for rods. Again, probably want to prefix validation
functions at the very least. Maybe they can be defined in their own header.
ps. while arguably cooler to have this compile to
do {(void(0));(void(0));} while(0)
It may be nicer to just wrap an inline function in #ifdef C3DBG guards. You
could even return the then validated road like
u3a_road f() {
u3a_road rod_u;
...
return _rod_vaal(rod_u);
}
*/
# define _rod_vaal(rod_u) \
do { \
c3_dessert(((uintptr_t)((u3a_road*)(rod_u))->hat_p \
& u3C.walign_w-1) == 0); \
} while(0)
/** Globals.
**/
/// Current road (thread-local).
@ -369,6 +375,68 @@
/** inline functions.
**/
/* u3a_config_loom(): configure loom information by u3v version
*/
inline void u3a_config_loom(c3_w ver_w) {
switch (ver_w) {
case U3V_VER1:
u3C.vits_w = 0;
break;
case U3V_VER2:
u3C.vits_w = 1;
break;
default:
c3_assert(0);
}
u3C.walign_w = 1 << u3C.vits_w;
u3C.balign_d = sizeof(c3_w) * u3C.walign_w;
}
/* u3a_into(): convert loom offset [x] into generic pointer.
*/
inline void *u3a_into(c3_w x) {
return u3_Loom + x;
}
/* u3a_outa(): convert pointer [p] into word offset into loom.
*/
inline c3_w u3a_outa(void *p) {
return ((c3_w *)p) - u3_Loom;
}
/* u3a_to_off(): mask off bits 30 and 31 from noun [som].
*/
inline c3_w u3a_to_off(c3_w som) {
return (som & 0x3fffffff) << u3C.vits_w;
}
/* u3a_to_ptr(): convert noun [som] into generic pointer into loom.
*/
inline void *u3a_to_ptr(c3_w som) {
return u3a_into(u3a_to_off(som));
}
/* u3a_to_wtr(): convert noun [som] into word pointer into loom.
*/
inline c3_w *u3a_to_wtr(c3_w som) {
return (c3_w *)u3a_to_ptr(som);
}
/* u3a_to_pug(): set bit 31 of [off].
*/
inline c3_w u3a_to_pug(c3_w off) {
c3_dessert((off & u3C.walign_w-1) == 0);
return (off >> u3C.vits_w) | 0x80000000;
}
/* u3a_to_pom(): set bits 30 and 31 of [off].
*/
inline c3_w u3a_to_pom(c3_w off) {
c3_dessert((off & u3C.walign_w-1) == 0);
return (off >> u3C.vits_w) | 0xc0000000;
}
/** road stack.
**/
/* u3a_drop(): drop a road stack frame per [pil_u].
@ -680,4 +748,9 @@
c3_c*
u3a_string(u3_atom a);
/* u3a_loom_sane(): sanity checks the state of the loom for obvious corruption
*/
void
u3a_loom_sane();
#endif /* ifndef U3_ALLOCATE_H */

View File

@ -442,7 +442,7 @@ _ce_patch_delete(void)
{
c3_c ful_c[8193];
snprintf(ful_c, 8192, "%s/.urb/chk/control.bin", u3P.dir_c);
snprintf(ful_c, 8192, "%s/.urb/chk/control.bin", u3P.dir_c);
if ( unlink(ful_c) ) {
fprintf(stderr, "loom: failed to delete control.bin: %s\r\n",
strerror(errno));
@ -460,31 +460,32 @@ _ce_patch_delete(void)
static c3_o
_ce_patch_verify(u3_ce_patch* pat_u)
{
ssize_t ret_i;
c3_w i_w;
c3_w pag_w, mug_w;
c3_w mem_w[pag_wiz_i];
c3_zs ret_zs;
if ( u3e_version != pat_u->con_u->ver_y ) {
fprintf(stderr, "loom: patch version mismatch: have %u, need %u\r\n",
pat_u->con_u->ver_y,
u3e_version);
if ( U3E_VERLAT != pat_u->con_u->ver_w ) {
fprintf(stderr, "loom: patch version mismatch: have %"PRIc3_w", need %u\r\n",
pat_u->con_u->ver_w,
U3E_VERLAT);
return c3n;
}
for ( i_w = 0; i_w < pat_u->con_u->pgs_w; i_w++ ) {
c3_w pag_w = pat_u->con_u->mem_u[i_w].pag_w;
c3_w mug_w = pat_u->con_u->mem_u[i_w].mug_w;
c3_w mem_w[1 << u3a_page];
for ( c3_z i_z = 0; i_z < pat_u->con_u->pgs_w; i_z++ ) {
pag_w = pat_u->con_u->mem_u[i_z].pag_w;
mug_w = pat_u->con_u->mem_u[i_z].mug_w;
if ( -1 == lseek(pat_u->mem_i, (i_w << (u3a_page + 2)), SEEK_SET) ) {
if ( -1 == lseek(pat_u->mem_i, (i_z << (u3a_page + 2)), SEEK_SET) ) {
fprintf(stderr, "loom: patch seek: %s\r\n", strerror(errno));
return c3n;
}
if ( pag_siz_i != (ret_i = read(pat_u->mem_i, mem_w, pag_siz_i)) ) {
if ( 0 < ret_i ) {
fprintf(stderr, "loom: patch partial read: %zu\r\n", (size_t)ret_i);
if ( pag_siz_i != (ret_zs = read(pat_u->mem_i, mem_w, pag_siz_i)) ) {
if ( 0 < ret_zs ) {
fprintf(stderr, "loom: patch partial read: %"PRIc3_zs"\r\n", ret_zs);
}
else {
fprintf(stderr, "loom: patch read fail: %s\r\n", strerror(errno));
fprintf(stderr, "loom: patch read: fail %"PRIc3_zs" of %"PRIc3_z" bytes\r\n",
ret_zs, pag_siz_i);
}
return c3n;
}
@ -492,13 +493,13 @@ _ce_patch_verify(u3_ce_patch* pat_u)
c3_w nug_w = u3r_mug_words(mem_w, pag_wiz_i);
if ( mug_w != nug_w ) {
fprintf(stderr, "loom: patch mug mismatch %d/%d; (%x, %x)\r\n",
pag_w, i_w, mug_w, nug_w);
fprintf(stderr, "loom: patch mug mismatch %"PRIc3_w"/%"PRIc3_z"; (%"PRIxc3_w", %"PRIxc3_w")\r\n",
pag_w, i_z, mug_w, nug_w);
return c3n;
}
#if 0
else {
u3l_log("verify: patch %d/%d, %x", pag_w, i_w, mug_w);
u3l_log("verify: patch %"PRIc3_w"/%"PRIc3_z", %"PRIxc3_w"\r\n", pag_w, i_z, mug_w);
}
#endif
}
@ -691,7 +692,7 @@ _ce_patch_compose(void)
_ce_patch_create(pat_u);
pat_u->con_u = c3_malloc(sizeof(u3e_control) + (pgs_w * sizeof(u3e_line)));
pat_u->con_u->ver_y = u3e_version;
pat_u->con_u->ver_w = U3E_VERLAT;
pgc_w = 0;
for ( i_w = 0; i_w < nor_w; i_w++ ) {
@ -787,7 +788,7 @@ _ce_patch_apply(u3_ce_patch* pat_u)
c3_w pag_w = pat_u->con_u->mem_u[i_w].pag_w;
c3_w mem_w[pag_wiz_i];
c3_i fid_i;
c3_w off_w;
c3_z off_w;
if ( pag_w < pat_u->con_u->nor_w ) {
fid_i = u3P.nor_u.fid_i;
@ -1077,6 +1078,9 @@ u3e_save(void)
return;
}
/* attempt to avoid propagating anything insane to disk */
u3a_loom_sane();
// u3a_print_memory(stderr, "sync: save", 4096 * pat_u->con_u->pgs_w);
_ce_patch_sync(pat_u);

View File

@ -5,6 +5,7 @@
#include "c3.h"
#include "allocate.h"
#include "version.h"
/** Data structures.
**/
@ -18,11 +19,11 @@
/* u3e_control: memory change, control file.
*/
typedef struct _u3e_control {
c3_w ver_y; // version number
c3_w nor_w; // new page count north
c3_w sou_w; // new page count south
c3_w pgs_w; // number of changed pages
u3e_line mem_u[0]; // per page
u3e_version ver_w; // version number
c3_w nor_w; // new page count north
c3_w sou_w; // new page count south
c3_w pgs_w; // number of changed pages
u3e_line mem_u[0]; // per page
} u3e_control;
/* u3_cs_patch: memory change, top level.
@ -60,7 +61,6 @@
/** Constants.
**/
# define u3e_version 1
/** Functions.
**/

View File

@ -1017,8 +1017,15 @@ _ch_rewrite_node(u3h_node* han_u, c3_w lef_w)
else {
void* hav_v = u3h_slot_to_node(sot_w);
u3h_node* nod_u = u3to(u3h_node,u3a_rewritten(u3of(u3h_node,hav_v)));
if (u3C.migration_state == MIG_REWRITE_COMPRESSED)
u3C.vits_w = 1;
han_u->sot_w[i_w] = u3h_node_to_slot(nod_u);
if (u3C.migration_state == MIG_REWRITE_COMPRESSED)
u3C.vits_w = 0;
if ( 0 == lef_w ) {
_ch_rewrite_buck(hav_v);
} else {
@ -1050,8 +1057,15 @@ u3h_rewrite(u3p(u3h_root) har_p)
else if ( _(u3h_slot_is_node(sot_w)) ) {
u3h_node* han_u = u3h_slot_to_node(sot_w);
u3h_node* nod_u = u3to(u3h_node,u3a_rewritten(u3of(u3h_node,han_u)));
if (u3C.migration_state == MIG_REWRITE_COMPRESSED)
u3C.vits_w = 1;
har_u->sot_w[i_w] = u3h_node_to_slot(nod_u);
if (u3C.migration_state == MIG_REWRITE_COMPRESSED)
u3C.vits_w = 0;
_ch_rewrite_node(han_u, 25);
}
}

View File

@ -27,7 +27,7 @@
** format - coordinate with allocate.h. The top two bits are:
**
** 00 - empty (in the root table only)
** 01 - table
** 01 - table (node or buck)
** 02 - entry, stale
** 03 - entry, fresh
*/
@ -79,8 +79,8 @@
# define u3h_slot_is_node(sot) ((1 == ((sot) >> 30)) ? c3y : c3n)
# define u3h_slot_is_noun(sot) ((1 == ((sot) >> 31)) ? c3y : c3n)
# define u3h_slot_is_warm(sot) (((sot) & 0x40000000) ? c3y : c3n)
# define u3h_slot_to_node(sot) (u3a_into((sot) & 0x3fffffff))
# define u3h_node_to_slot(ptr) (u3a_outa(ptr) | 0x40000000)
# define u3h_slot_to_node(sot) (u3a_into(((sot) & 0x3fffffff) << u3C.vits_w))
# define u3h_node_to_slot(ptr) ((u3a_outa((ptr)) >> u3C.vits_w) | 0x40000000)
# define u3h_noun_be_warm(sot) ((sot) | 0x40000000)
# define u3h_noun_be_cold(sot) ((sot) & ~0x40000000)
# define u3h_slot_to_noun(sot) (0x40000000 | (sot))

View File

@ -815,7 +815,7 @@ u3j_boot(c3_o nuu_o)
{
c3_assert(u3R == &(u3H->rod_u));
u3D.len_l =_cj_count(0, u3D.dev_u);
u3D.len_l = _cj_count(0, u3D.dev_u);
u3D.all_l = (2 * u3D.len_l) + 1024; // horrid heuristic
u3D.ray_u = c3_malloc(u3D.all_l * sizeof(u3j_core));

View File

@ -483,17 +483,18 @@ _pave_parts(void)
u3R->byc.har_p = u3h_new();
}
/* _pave_road(): initialize road boundaries
/* _pave_road(): writes road boundaries to loom mem (stored at mat_w)
*/
static u3_road*
_pave_road(c3_w* rut_w, c3_w* mat_w, c3_w* cap_w, c3_w siz_w)
{
c3_dessert(((uintptr_t)rut_w & u3C.balign_d-1) == 0);
u3_road* rod_u = (void*) mat_w;
// enable in case of corruption
//
// memset(mem_w, 0, 4 * len_w);
memset(rod_u, 0, 4 * siz_w);
memset(rod_u, 0, sizeof(c3_w) * siz_w);
// the top and bottom of the heap are initially the same
//
@ -504,10 +505,17 @@ _pave_road(c3_w* rut_w, c3_w* mat_w, c3_w* cap_w, c3_w siz_w)
rod_u->mat_p = u3of(c3_w, mat_w); // stack bottom
rod_u->cap_p = u3of(c3_w, cap_w); // stack top
_rod_vaal(rod_u);
return rod_u;
}
/* _pave_north(): calculate boundaries and initialize north road.
mem_w - the "beginning" of your loom (its lowest address). Corresponds to rut
in a north road.
siz_w - the size in words of your road record (or home record in the case of
paving home).
len_w - size of your loom in words
*/
static u3_road*
_pave_north(c3_w* mem_w, c3_w siz_w, c3_w len_w)
@ -518,14 +526,23 @@ _pave_north(c3_w* mem_w, c3_w siz_w, c3_w len_w)
// the stack starts at the end of the memory segment,
// minus space for the road structure [siz_w]
//
c3_w* rut_w = mem_w;
c3_w* mat_w = ((mem_w + len_w) - siz_w);
// 00~~~|R|---|H|######|C|+++|M|~~~FF
// ^--u3R which _pave_road returns (u3H for home road)
//
c3_w* mat_w = c3_align(mem_w + len_w - siz_w, u3C.balign_d, C3_ALGLO);
c3_w* rut_w = c3_align(mem_w, u3C.balign_d, C3_ALGHI);
c3_w* cap_w = mat_w;
return _pave_road(rut_w, mat_w, cap_w, siz_w);
}
/* _pave_south(): calculate boundaries and initialize south road.
mem_w - the "beginning" of your loom (its lowest address). Corresponds to mat
in a south road.
siz_w - the size in words of your road record (or home record in the case of
paving home).
len_w - size of your loom in words
*/
static u3_road*
_pave_south(c3_w* mem_w, c3_w siz_w, c3_w len_w)
@ -536,8 +553,10 @@ _pave_south(c3_w* mem_w, c3_w siz_w, c3_w len_w)
// the stack starts at the base memory pointer [mem_w],
// and ends after the space for the road structure [siz_w]
//
c3_w* rut_w = (mem_w + len_w);
c3_w* mat_w = mem_w;
// 00~~~|M|+++|C|######|H|---|R|~~~FFF
// ^---u3R which _pave_road returns
c3_w* mat_w = c3_align(mem_w, u3C.balign_d, C3_ALGHI);
c3_w* rut_w = c3_align(mem_w + len_w, u3C.balign_d, C3_ALGLO);
c3_w* cap_w = mat_w + siz_w;
return _pave_road(rut_w, mat_w, cap_w, siz_w);
@ -548,12 +567,15 @@ _pave_south(c3_w* mem_w, c3_w siz_w, c3_w len_w)
static void
_pave_home(void)
{
c3_w* mem_w = u3_Loom + 1;
/* a pristine home road will always have compressed references */
u3a_config_loom(U3V_VERLAT);
c3_w* mem_w = u3_Loom + u3C.walign_w;
c3_w siz_w = c3_wiseof(u3v_home);
c3_w len_w = u3C.wor_i - 1;
c3_w len_w = u3C.wor_i - u3C.walign_w;
u3H = (void *)_pave_north(mem_w, siz_w, len_w);
u3H->ver_w = u3v_version;
u3H->ver_w = U3V_VERLAT;
u3R = &u3H->rod_u;
_pave_parts();
@ -567,31 +589,38 @@ STATIC_ASSERT( ((c3_wiseof(u3v_home) * 4) == sizeof(u3v_home)),
static void
_find_home(void)
{
c3_w ver_w = *(u3_Loom + u3C.wor_i - 1);
u3a_config_loom(ver_w);
// NB: the home road is always north
//
c3_w* mem_w = u3_Loom + 1;
c3_w* mem_w = u3_Loom + u3C.walign_w;
c3_w siz_w = c3_wiseof(u3v_home);
c3_w len_w = u3C.wor_i - 1;
c3_w len_w = u3C.wor_i - u3C.walign_w;
c3_w* mat_w = c3_align(mem_w + len_w - siz_w, u3C.balign_d, C3_ALGLO);
{
c3_w ver_w = *((mem_w + len_w) - 1);
if ( u3v_version != ver_w ) {
fprintf(stderr, "loom: checkpoint version mismatch: "
"have %u, need %u\r\n",
ver_w,
u3v_version);
abort();
}
}
u3H = (void *)((mem_w + len_w) - siz_w);
u3H = (void *)mat_w;
u3R = &u3H->rod_u;
// this looks risky, but there are no legitimate scenarios
// where it's wrong
//
// this looks risky, but there are no legitimate scenarios where it's wrong
u3R->cap_p = u3R->mat_p = u3C.wor_i - c3_wiseof(*u3H);
/* As a further guard against any sneaky loom corruption */
u3a_loom_sane();
if (U3V_VERLAT > ver_w) {
u3m_migrate(U3V_VERLAT);
u3a_config_loom(U3V_VERLAT);
}
else if ( U3V_VERLAT < ver_w ) {
fprintf(stderr, "loom: checkpoint version mismatch: "
"have %u, need %u\r\n",
ver_w,
U3V_VERLAT);
abort();
}
_rod_vaal(u3R);
}
/* u3m_pave(): instantiate or activate image.
@ -780,9 +809,11 @@ u3m_error(c3_c* str_c)
void
u3m_leap(c3_w pad_w)
{
c3_w len_w;
c3_w len_w; /* the length of the new road (avail - (pad + wiseof(u3a_road))) */
u3_road* rod_u;
_rod_vaal(u3R);
/* Measure the pad - we'll need it.
*/
{
@ -795,40 +826,40 @@ u3m_leap(c3_w pad_w)
}
#endif
if ( (pad_w + c3_wiseof(u3a_road)) >= u3a_open(u3R) ) {
/* not enough storage to leap */
u3m_bail(c3__meme);
}
len_w = u3a_open(u3R) - (pad_w + c3_wiseof(u3a_road));
pad_w += c3_wiseof(u3a_road);
len_w = u3a_open(u3R) - pad_w;
c3_align(len_w, u3C.walign_w, C3_ALGHI);
}
/* Allocate a region on the cap.
*/
{
u3p(c3_w) bot_p;
u3p(c3_w) bot_p; /* S: bot_p = new mat. N: bot_p = new rut */
if ( c3y == u3a_is_north(u3R) ) {
bot_p = (u3R->cap_p - len_w);
u3R->cap_p -= len_w;
bot_p = u3R->hat_p + pad_w;
rod_u = _pave_south(u3a_into(bot_p), c3_wiseof(u3a_road), len_w);
u3e_ward(rod_u->cap_p, rod_u->hat_p);
#if 0
fprintf(stderr, "leap: from north %p (cap 0x%x), to south %p\r\n",
u3R,
u3R->cap_p + len_w,
rod_u);
fprintf(stderr, "NPAR.hat_p: 0x%x %p, SKID.hat_p: 0x%x %p\r\n",
u3R->hat_p, u3a_into(u3R->hat_p),
rod_u->hat_p, u3a_into(rod_u->hat_p));
#endif
}
else {
bot_p = u3R->cap_p;
u3R->cap_p += len_w;
rod_u = _pave_north(u3a_into(bot_p), c3_wiseof(u3a_road), len_w);
u3e_ward(rod_u->hat_p, rod_u->cap_p);
#if 0
fprintf(stderr, "leap: from south %p (cap 0x%x), to north %p\r\n",
u3R,
u3R->cap_p - len_w,
rod_u);
fprintf(stderr, "SPAR.hat_p: 0x%x %p, NKID.hat_p: 0x%x %p\r\n",
u3R->hat_p, u3a_into(u3R->hat_p),
rod_u->hat_p, u3a_into(rod_u->hat_p));
#endif
}
}
@ -850,6 +881,8 @@ u3m_leap(c3_w pad_w)
#ifdef U3_MEMORY_DEBUG
rod_u->all.fre_w = 0;
#endif
_rod_vaal(u3R);
}
void
@ -1304,7 +1337,7 @@ u3m_soft(c3_w mil_w,
{
u3_noun why;
why = u3m_soft_top(mil_w, (1 << 20), fun_f, arg); // 2MB pad
why = u3m_soft_top(mil_w, (1 << 20), fun_f, arg); // 4M pad
if ( 0 == u3h(why) ) {
return why;
@ -1981,3 +2014,158 @@ u3m_pack(void)
return (u3a_open(u3R) - pre_w);
}
static void
_migrate_reclaim()
{
fprintf(stderr, "loom: migration reclaim\r\n");
u3m_reclaim();
}
static void
_migrate_seek(const u3a_road *rod_u)
{
/*
very much like u3a_pack_seek with the following changes:
- there is no need to account for free space as |pack is performed before
the migration
- odd sized boxes will be padded by one word to achieve an even size
- rut will be moved from one word ahead of u3_Loom to two words ahead
*/
c3_w * box_w = u3a_into(rod_u->rut_p);
c3_w * end_w = u3a_into(rod_u->hat_p);
u3_post new_p = (rod_u->rut_p + 1 + c3_wiseof(u3a_box));
u3a_box * box_u = (void *)box_w;
fprintf(stderr, "loom: migration seek\r\n");
for (; box_w < end_w
; box_w += box_u->siz_w
, box_u = (void*)box_w)
{
if (!box_u->use_w)
continue;
c3_assert(box_u->siz_w);
c3_assert(box_u->use_w);
box_w[box_u->siz_w - 1] = new_p;
new_p = c3_align(new_p + box_u->siz_w, 2, C3_ALGHI);
}
}
static void
_migrate_rewrite()
{
fprintf(stderr, "loom: migration rewrite\r\n");
/* So that rewritten pointers are compressed, this flag is set */
u3C.migration_state = MIG_REWRITE_COMPRESSED;
_cm_pack_rewrite();
u3C.migration_state = MIG_NONE;
}
static void
_migrate_move(u3a_road *rod_u)
{
fprintf(stderr, "loom: migration move\r\n");
c3_z hiz_z = u3a_heap(rod_u) * sizeof(c3_w);
/* calculate required shift distance to prevent write head overlapping read head */
c3_w off_w = 1; /* at least 1 word because u3R->rut_p migrates from 1 to 2 */
for (u3a_box *box_u = u3a_into(rod_u->rut_p)
; (void *)box_u < u3a_into(rod_u->hat_p)
; box_u = (void *)((c3_w *)box_u + box_u->siz_w))
off_w += box_u->siz_w & 1; /* odd-sized boxes are padded by one word */
/* shift */
memmove(u3a_into(u3H->rod_u.rut_p + off_w),
u3a_into(u3H->rod_u.rut_p),
hiz_z);
/* manually zero the former rut */
*(c3_w *)u3a_into(rod_u->rut_p) = 0;
/* relocate boxes to DWORD-aligned addresses stored in trailing size word */
c3_w *box_w = u3a_into(rod_u->rut_p + off_w);
c3_w *end_w = u3a_into(rod_u->hat_p + off_w);
u3a_box *old_u = (void *)box_w;
c3_w siz_w = old_u->siz_w;
u3p(c3_w) new_p = rod_u->rut_p + 1 + c3_wiseof(u3a_box);
c3_w *new_w;
for (; box_w < end_w
; box_w += siz_w
, old_u = (void *)box_w
, siz_w = old_u->siz_w) {
old_u->use_w &= 0x7fffffff;
if (!old_u->use_w)
continue;
new_w = (void *)u3a_botox(u3a_into(new_p));
c3_assert(box_w[siz_w - 1] == new_p);
c3_assert(new_w <= box_w);
c3_w i_w;
for (i_w = 0; i_w < siz_w - 1; i_w++)
new_w[i_w] = box_w[i_w];
if (siz_w & 1) {
new_w[i_w++] = 0; /* pad odd sized boxes */
new_w[i_w++] = siz_w + 1; /* restore trailing size word */
new_w[0] = siz_w + 1; /* and the leading size word */
}
else {
new_w[i_w++] = siz_w;
}
new_p += i_w;
}
/* restore proper heap state */
rod_u->rut_p = 2;
rod_u->hat_p = new_p - c3_wiseof(u3a_box);
/* like |pack, clear the free lists and cell allocator */
for (c3_w i_w = 0; i_w < u3a_fbox_no; i_w++)
u3R->all.fre_p[i_w] = 0;
u3R->all.fre_w = 0;
u3R->all.cel_p = 0;
}
/* u3m_migrate: perform loom migration if necessary.
ver_w - target version
*/
void
u3m_migrate(u3v_version ver_w)
{
if (u3H->ver_w == ver_w)
return;
/* 1 -> 2 is all that is currently supported */
c3_dessert(u3H->ver_w == U3V_VER1 &&
ver_w == U3V_VER2);
/* only home road migration is supported */
c3_dessert((uintptr_t)u3H == (uintptr_t)u3R);
fprintf(stderr, "loom: migration running. This may take several minutes to perform.\r\n");
fprintf(stderr, "loom: have version: %"PRIc3_w" migrating to version: %"PRIc3_w"\r\n",
u3H->ver_w, ver_w);
/* packing first simplifies migration logic and minimizes required buffer space */
u3m_pack();
/* perform the migration in a pattern similar to |pack */
_migrate_reclaim();
_migrate_seek(&u3H->rod_u);
_migrate_rewrite();
_migrate_move(&u3H->rod_u);
/* finally update the version and commit to disk */
u3H->ver_w = ver_w;
/* extra assurance we haven't corrupted the loom before writing to disk */
u3a_loom_sane();
u3e_save();
}

View File

@ -5,6 +5,7 @@
#include "c3.h"
#include "types.h"
#include "version.h"
/** System management.
**/
@ -159,4 +160,10 @@
c3_w
u3m_pack(void);
/* u3m_migrate: perform loom migration if necessary.
ver_w - target version
*/
void
u3m_migrate(u3v_version ver_w);
#endif /* ifndef U3_MANAGE_H */

View File

@ -14,6 +14,14 @@
u3_noun who; // single identity
c3_c* dir_c; // execution directory (pier)
c3_w wag_w; // flags (both ways)
c3_w vits_w; // number of virtual bits in reference
c3_w walign_w; // word alignment
c3_d balign_d; // byte alignment
enum {
MIG_NONE,
MIG_REWRITE_COMPRESSED,
} migration_state;
size_t wor_i; // loom word-length (<= u3a_words)
void (*stderr_log_f)(c3_c*); // errors from c code
void (*slog_f)(u3_noun); // function pointer for slog
@ -45,5 +53,4 @@
extern u3o_config u3o_Config;
# define u3C u3o_Config
#endif /* ifndef U3_OPTIONS_H */

View File

@ -1033,63 +1033,46 @@ c3_w
u3r_met(c3_y a_y,
u3_atom b)
{
c3_assert(u3_none != b);
c3_assert(_(u3a_is_atom(b)));
c3_dessert(u3_none != b);
c3_dessert(_(u3a_is_atom(b)));
if ( b == 0 ) {
return 0;
}
else {
/* gal_w: number of words besides (daz_w) in (b).
** daz_w: top word in (b).
*/
c3_w gal_w;
c3_w daz_w;
/* gal_w: number of words besides (daz_w) in (b).
** daz_w: top word in (b).
*/
c3_w gal_w;
c3_w daz_w;
if ( _(u3a_is_cat(b)) ) {
gal_w = 0;
daz_w = b;
}
else {
u3a_atom* b_u = u3a_to_ptr(b);
gal_w = (b_u->len_w) - 1;
daz_w = b_u->buf_w[gal_w];
}
switch ( a_y ) {
case 0:
case 1:
case 2: {
/* col_w: number of bits in (daz_w)
** bif_w: number of bits in (b)
*/
c3_w bif_w, col_w;
if ( gal_w > ((UINT32_MAX - 35) >> 5) ) {
return u3m_bail(c3__fail);
}
col_w = c3_bits_word(daz_w);
bif_w = col_w + (gal_w << 5);
return (bif_w + ((1 << a_y) - 1)) >> a_y;
}
STATIC_ASSERT((UINT32_MAX > ((c3_d)u3a_maximum << 2)),
"met overflow");
case 3: return (gal_w << 2) + ((c3_bits_word(daz_w) + 7) >> 3);
case 4: return (gal_w << 1) + ((c3_bits_word(daz_w) + 15) >> 4);
default: {
c3_y gow_y = (a_y - 5);
return ((gal_w + 1) + ((1 << gow_y) - 1)) >> gow_y;
}
}
if ( _(u3a_is_cat(b)) ) {
gal_w = 0;
daz_w = b;
}
else {
u3a_atom* b_u = u3a_to_ptr(b);
gal_w = (b_u->len_w) - 1;
daz_w = b_u->buf_w[gal_w];
}
/* 5 because 1<<2 bytes in c3_w, 1<<3 bits in byte.
aka log2(CHAR_BIT * sizeof gal_w)
a_y < 5 informs whether we shift return left or right
*/
if (a_y < 5) {
c3_y max_y = (1 << a_y) - 1;
c3_y gow_y = 5 - a_y;
if (gal_w > ((UINT32_MAX - (32 + max_y)) >> gow_y))
return u3m_bail(c3__fail);
return (gal_w << gow_y)
+ ((c3_bits_word(daz_w) + max_y)
>> a_y);
}
c3_y gow_y = (a_y - 5);
return ((gal_w + 1) + ((1 << gow_y) - 1)) >> gow_y;
}
/* u3r_bit():

21
pkg/noun/version.h Normal file
View File

@ -0,0 +1,21 @@
#ifndef U3_VERSION_H
#define U3_VERSION_H
/* VORTEX
*/
typedef c3_w u3v_version;
#define U3V_VER1 1
#define U3V_VER2 2
#define U3V_VERLAT U3V_VER2
/* EVENTS
*/
typedef c3_w u3e_version;
#define U3E_VER1 1
#define U3E_VERLAT U3E_VER1
#endif /* ifndef U3_VERSION_H */

View File

@ -3,8 +3,10 @@
#ifndef U3_VORTEX_H
#define U3_VORTEX_H
#include "allocate.h"
#include "c3.h"
#include "imprison.h"
#include "version.h"
/** Data structures.
**/
@ -22,9 +24,9 @@
** NB: version must be last for discriminability in north road
*/
typedef struct _u3v_home {
u3a_road rod_u; // storage state
u3v_arvo arv_u; // arvo state
c3_w ver_w; // version number
u3a_road rod_u; // storage state
u3v_arvo arv_u; // arvo state
u3v_version ver_w; // version number
} u3v_home;
@ -37,7 +39,6 @@
/** Constants.
**/
# define u3v_version 1
/** Functions.
**/

View File

@ -1220,6 +1220,7 @@ u3_lord_init(c3_c* pax_c, c3_w wag_w, c3_d key_d[4], u3_lord_cb cb_u)
god_u->ops_u.file = arg_c[0];
god_u->ops_u.args = arg_c;
/* spawns worker thread */
if ( (err_i = uv_spawn(u3L, &god_u->cub_u, &god_u->ops_u)) ) {
fprintf(stderr, "spawn: %s: %s\r\n", arg_c[0], uv_strerror(err_i));

View File

@ -166,8 +166,8 @@ _main_init(void)
u3_Host.ops_u.hap_w = 50000;
u3_Host.ops_u.kno_w = DefaultKernel;
u3_Host.ops_u.lut_y = u3a_bits + 1;
u3_Host.ops_u.lom_y = u3a_bits + 1;
u3_Host.ops_u.lut_y = 31; /* aka 2G */
u3_Host.ops_u.lom_y = 31;
}
/* _main_pier_run(): get pier from binary path (argv[0]), if appropriate
@ -257,9 +257,9 @@ _main_getopt(c3_i argc, c3_c** argv)
switch ( ch_i ) {
case 5: { // urth-loom
c3_w lut_w;
c3_o res_o = _main_readw(optarg, u3a_bits + 3, &lut_w);
c3_o res_o = _main_readw(optarg, u3a_bits_max+1, &lut_w);
if ( (c3n == res_o) || (lut_w < 20) ) {
fprintf(stderr, "error: --urth-loom must be >= 20 and <= %u\r\n", u3a_bits + 2);
fprintf(stderr, "error: --urth-loom must be >= 20 and <= %zu\r\n", u3a_bits_max);
return c3n;
}
@ -368,9 +368,9 @@ _main_getopt(c3_i argc, c3_c** argv)
}
case c3__loom: {
c3_w lom_w;
c3_o res_o = _main_readw(optarg, u3a_bits + 3, &lom_w);
c3_o res_o = _main_readw(optarg, u3a_bits_max+1, &lom_w);
if ( (c3n == res_o) || (lom_w < 20) ) {
fprintf(stderr, "error: --loom must be >= 20 and <= %u\r\n", u3a_bits + 2);
fprintf(stderr, "error: --loom must be >= 20 and <= %zu\r\n", u3a_bits_max);
return c3n;
}
u3_Host.ops_u.lom_y = lom_w;
@ -1202,9 +1202,9 @@ _cw_eval(c3_i argc, c3_c* argv[])
switch ( ch_i ) {
case c3__loom: {
c3_w lom_w;
c3_o res_o = _main_readw(optarg, u3a_bits + 3, &lom_w);
c3_o res_o = _main_readw(optarg, u3a_bits_max+1, &lom_w);
if ( (c3n == res_o) || (lom_w < 24) ) {
fprintf(stderr, "error: --loom must be >= 24 and <= %u\r\n", u3a_bits + 2);
fprintf(stderr, "error: --loom must be >= 24 and <= %zu\r\n", u3a_bits_max);
exit(1);
}
u3_Host.ops_u.lom_y = lom_w;
@ -1399,9 +1399,9 @@ _cw_cram(c3_i argc, c3_c* argv[])
switch ( ch_i ) {
case c3__loom: {
c3_w lom_w;
c3_o res_o = _main_readw(optarg, u3a_bits + 3, &lom_w);
c3_o res_o = _main_readw(optarg, u3a_bits_max+1, &lom_w);
if ( (c3n == res_o) || (lom_w < 20) ) {
fprintf(stderr, "error: --loom must be >= 20 and <= %u\r\n", u3a_bits + 2);
fprintf(stderr, "error: --loom must be >= 20 and <= %zu\r\n", u3a_bits_max);
exit(1);
}
u3_Host.ops_u.lom_y = lom_w;
@ -1478,9 +1478,9 @@ _cw_queu(c3_i argc, c3_c* argv[])
switch ( ch_i ) {
case c3__loom: {
c3_w lom_w;
c3_o res_o = _main_readw(optarg, u3a_bits + 3, &lom_w);
c3_o res_o = _main_readw(optarg, u3a_bits_max+1, &lom_w);
if ( (c3n == res_o) || (lom_w < 20) ) {
fprintf(stderr, "error: --loom must be >= 20 and <= %u\r\n", u3a_bits + 2);
fprintf(stderr, "error: --loom must be >= 20 and <= %zu\r\n", u3a_bits_max);
exit(1);
}
u3_Host.ops_u.lom_y = lom_w;
@ -1563,9 +1563,9 @@ _cw_meld(c3_i argc, c3_c* argv[])
switch ( ch_i ) {
case c3__loom: {
c3_w lom_w;
c3_o res_o = _main_readw(optarg, u3a_bits + 3, &lom_w);
c3_o res_o = _main_readw(optarg, u3a_bits_max+1, &lom_w);
if ( (c3n == res_o) || (lom_w < 20) ) {
fprintf(stderr, "error: --loom must be >= 20 and <= %u\r\n", u3a_bits + 2);
fprintf(stderr, "error: --loom must be >= 20 and <= %zu\r\n", u3a_bits_max);
exit(1);
}
u3_Host.ops_u.lom_y = lom_w;
@ -1637,9 +1637,9 @@ _cw_next(c3_i argc, c3_c* argv[])
case c3__loom: {
c3_w lom_w;
c3_o res_o = _main_readw(optarg, u3a_bits + 3, &lom_w);
c3_o res_o = _main_readw(optarg, u3a_bits_max+1, &lom_w);
if ( (c3n == res_o) || (lom_w < 20) ) {
fprintf(stderr, "error: --loom must be >= 20 and <= %u\r\n", u3a_bits + 2);
fprintf(stderr, "error: --loom must be >= 20 and <= %zu\r\n", u3a_bits_max);
exit(1);
}
u3_Host.ops_u.lom_y = lom_w;
@ -1696,9 +1696,9 @@ _cw_pack(c3_i argc, c3_c* argv[])
switch ( ch_i ) {
case c3__loom: {
c3_w lom_w;
c3_o res_o = _main_readw(optarg, u3a_bits + 3, &lom_w);
c3_o res_o = _main_readw(optarg, u3a_bits_max+1, &lom_w);
if ( (c3n == res_o) || (lom_w < 20) ) {
fprintf(stderr, "error: --loom must be >= 20 and <= %u\r\n", u3a_bits + 2);
fprintf(stderr, "error: --loom must be >= 20 and <= %zu\r\n", u3a_bits_max);
exit(1);
}
u3_Host.ops_u.lom_y = lom_w;
@ -1811,9 +1811,9 @@ _cw_prep(c3_i argc, c3_c* argv[])
switch ( ch_i ) {
case c3__loom: {
c3_w lom_w;
c3_o res_o = _main_readw(optarg, u3a_bits + 3, &lom_w);
c3_o res_o = _main_readw(optarg, u3a_bits_max+1, &lom_w);
if ( (c3n == res_o) || (lom_w < 20) ) {
fprintf(stderr, "error: --loom must be >= 20 and <= %u\r\n", u3a_bits + 2);
fprintf(stderr, "error: --loom must be >= 20 and <= %zu\r\n", u3a_bits_max);
exit(1);
}
u3_Host.ops_u.lom_y = lom_w;
@ -1869,9 +1869,9 @@ _cw_chop(c3_i argc, c3_c* argv[])
switch ( ch_i ) {
case c3__loom: {
c3_w lom_w;
c3_o res_o = _main_readw(optarg, u3a_bits + 3, &lom_w);
c3_o res_o = _main_readw(optarg, u3a_bits_max+1, &lom_w);
if ( (c3n == res_o) || (lom_w < 20) ) {
fprintf(stderr, "error: --loom must be >= 20 and <= %u\r\n", u3a_bits + 2);
fprintf(stderr, "error: --loom must be >= 20 and <= %zu\r\n", u3a_bits_max);
exit(1);
}
u3_Host.ops_u.lom_y = lom_w;
@ -2156,9 +2156,9 @@ _cw_vile(c3_i argc, c3_c* argv[])
switch ( ch_i ) {
case c3__loom: {
c3_w lom_w;
c3_o res_o = _main_readw(optarg, u3a_bits + 3, &lom_w);
c3_o res_o = _main_readw(optarg, u3a_bits_max+1, &lom_w);
if ( (c3n == res_o) || (lom_w < 20) ) {
fprintf(stderr, "error: --loom must be >= 20 and <= %u\r\n", u3a_bits + 2);
fprintf(stderr, "error: --loom must be >= 20 and <= %zu\r\n", u3a_bits_max);
exit(1);
}
u3_Host.ops_u.lom_y = lom_w;