Switch to custom cksum script from old crc32. Use constants to make it faster.

Brendan Hansknecht 2021-09-30 22:08:14 -07:00
parent 157a04d1ac
commit e7cba9fb88
3 changed files with 229 additions and 15 deletions

@@ -0,0 +1,228 @@
{
This should do exactly the same thing as the Linux cksum utility.
It computes a crc32 of the input data.
One core difference: this reads from stdin, while cksum reads from a file.
Under the interpreter it currently runs about 350x slower, though, and requires an extended stack size.
}
{
Load 256 constants from https://github.com/wertarbyte/coreutils/blob/f70c7b785b93dd436788d34827b209453157a6f2/src/cksum.c#L117
To support the original false interpreter, numbers must be less than 32000.
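To work around this, every constant is split into two chunks.
The first chunk is the lower 16 bits; the second chunk is the upper 16 bits shifted right by 16.
When loading, the second chunk is shifted back (65536*) and merged with the first (|).
For example, 0x04C11DB7 is written as 7607 1217 65536*| because 1217 * 65536 + 7607 = 79764919.
The constants are pushed in reverse order, so entry 0 ends up on top of the stack.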
}
16564 45559 65536*| 23811 46390 65536*| 31706 47221 65536*| 26221 48308 65536*|
13928 41715 65536*| 11231 42546 65536*| 3334 43889 65536*| 4273 44976 65536*|
44300 38911 65536*| 45243 37694 65536*| 38498 40573 65536*| 35797 39612 65536*|
56272 34043 65536*| 50791 32826 65536*| 57534 36217 65536*| 64777 35256 65536*|
39876 64998 65536*| 34419 63783 65536*| 41130 62564 65536*| 48413 61605 65536*|
60696 61154 65536*| 61615 59939 65536*| 54902 59232 65536*| 52161 58273 65536*|
30332 56302 65536*| 27595 57135 65536*| 19730 53868 65536*| 20645 54957 65536*|
160 51434 65536*| 7447 52267 65536*| 15310 49512 65536*| 9849 50601 65536*|
63060 10708 65536*| 60387 11541 65536*| 52538 8278 65536*| 53389 9367 65536*|
32904 15056 65536*| 40255 15889 65536*| 48102 13138 65536*| 42577 14227 65536*|
7148 4060 65536*| 1627 2845 65536*| 8322 1630 65536*| 15669 671 65536*|
27952 7384 65536*| 28807 6169 65536*| 22110 5466 65536*| 19433 4507 65536*|
11556 26053 65536*| 12435 24836 65536*| 5706 27719 65536*| 3069 26758 65536*|
23544 30401 65536*| 17999 29184 65536*| 24726 32579 65536*| 32033 31618 65536*|
49308 17357 65536*| 56619 18188 65536*| 64498 19023 65536*| 58949 20110 65536*|
46656 20681 65536*| 44023 21512 65536*| 36142 22859 65536*| 37017 23946 65536*|
12483 34161 65536*| 11636 33200 65536*| 2989 36083 65536*| 5658 34866 65536*|
17951 38517 65536*| 23464 37556 65536*| 32113 40951 65536*| 24774 39734 65536*|
56699 41849 65536*| 49356 42936 65536*| 58901 43771 65536*| 64418 44602 65536*|
43943 45181 65536*| 46608 46268 65536*| 37065 47615 65536*| 36222 48446 65536*|
60339 51552 65536*| 62980 52641 65536*| 53469 49378 65536*| 52586 50211 65536*|
40303 55908 65536*| 32984 56997 65536*| 42497 54246 65536*| 48054 55079 65536*|
1547 61288 65536*| 7100 60329 65536*| 15717 59114 65536*| 8402 57899 65536*|
28887 64620 65536*| 28000 63661 65536*| 19385 62958 65536*| 22030 61743 65536*|
34339 7506 65536*| 39828 6547 65536*| 48461 5328 65536*| 41210 4113 65536*|
61695 3670 65536*| 60744 2711 65536*| 52113 2004 65536*| 54822 789 65536*|
27547 15194 65536*| 30252 16283 65536*| 20725 13016 65536*| 19778 13849 65536*|
7495 10334 65536*| 240 11423 65536*| 9769 8668 65536*| 15262 9501 65536*|
23891 20803 65536*| 16612 21890 65536*| 26173 22721 65536*| 31626 23552 65536*|
11151 16967 65536*| 13880 18054 65536*| 4321 19397 65536*| 3414 20228 65536*|
45291 30539 65536*| 44380 29578 65536*| 35717 32457 65536*| 38450 31240 65536*|
50743 25679 65536*| 56192 24718 65536*| 64857 28109 65536*| 57582 26892 65536*|
41050 55547 65536*| 48621 56378 65536*| 39732 53625 65536*| 34435 54712 65536*|
54918 52223 65536*| 52017 53054 65536*| 60904 49789 65536*| 61535 50876 65536*|
19938 65267 65536*| 20565 64050 65536*| 30348 63345 65536*| 27451 62384 65536*|
15166 60919 65536*| 9865 59702 65536*| 80 58485 65536*| 7655 57524 65536*|
31530 38122 65536*| 26269 36907 65536*| 16452 40296 65536*| 24051 39337 65536*|
3574 34798 65536*| 4161 33583 65536*| 13976 36460 65536*| 11055 35501 65536*|
38546 45794 65536*| 35621 46627 65536*| 44540 47968 65536*| 45131 49057 65536*|
57422 41446 65536*| 65017 42279 65536*| 56096 43108 65536*| 50839 44197 65536*|
5818 16600 65536*| 2829 17433 65536*| 11732 18778 65536*| 12387 19867 65536*|
24678 21468 65536*| 32209 22301 65536*| 23304 23134 65536*| 18111 24223 65536*|
64258 26320 65536*| 59061 25105 65536*| 49260 28498 65536*| 56795 27539 65536*|
36318 30164 65536*| 36969 28949 65536*| 46768 31830 65536*| 43783 30871 65536*|
52682 3273 65536*| 53373 2056 65536*| 63140 1355 65536*| 60179 394 65536*|
47894 8141 65536*| 42657 6924 65536*| 32888 5711 65536*| 40399 4750 65536*|
8306 10945 65536*| 15813 11776 65536*| 6940 9027 65536*| 1707 10114 65536*|
22190 14789 65536*| 19225 15620 65536*| 28096 12359 65536*| 28791 13446 65536*|
53293 60541 65536*| 52634 59580 65536*| 60227 58879 65536*| 63220 57662 65536*|
42737 65401 65536*| 47942 64440 65536*| 40351 63227 65536*| 32808 62010 65536*|
15765 51829 65536*| 8226 52916 65536*| 1787 50167 65536*| 6988 50998 65536*|
19273 55665 65536*| 22270 56752 65536*| 28711 53491 65536*| 28048 54322 65536*|
2909 41068 65536*| 5866 42157 65536*| 12339 43502 65536*| 11652 44335 65536*|
32129 45928 65536*| 24630 47017 65536*| 18159 47850 65536*| 23384 48683 65536*|
59109 34404 65536*| 64338 33445 65536*| 56715 36838 65536*| 49212 35623 65536*|
36921 38240 65536*| 36238 37281 65536*| 43863 40162 65536*| 46816 38947 65536*|
26317 29790 65536*| 31610 28831 65536*| 23971 32220 65536*| 16404 31005 65536*|
4113 26458 65536*| 3494 25499 65536*| 11135 28376 65536*| 14024 27161 65536*|
35701 21078 65536*| 38594 22167 65536*| 45083 23508 65536*| 44460 24341 65536*|
64937 16722 65536*| 57374 17811 65536*| 50887 18640 65536*| 56176 19473 65536*|
48573 14415 65536*| 40970 15502 65536*| 34515 12749 65536*| 39780 13580 65536*|
52065 11083 65536*| 54998 12170 65536*| 61455 8905 65536*| 60856 9736 65536*|
20485 7751 65536*| 19890 6790 65536*| 27499 6085 65536*| 30428 4868 65536*|
9945 3395 65536*| 15214 2434 65536*| 7607 1217 65536*| 0
{load crc32 base 0}
0
{load the xor function into x for use later}
[
{duplicate both inputs}
1ø1ø
{nand inputs}
&~
{bring original inputs to top of stack with rotation}
@@
{or inputs}
|
{and the nand result with the or result to get xor}
&
]x:
{load right shift function to r for later use}
[
{will right shift the second from top value by the top value}
{while top value > 0}
[$0>][
{subtract one from the top value}
1-
{swap values}
\
{divide the bottom value by 2}
2/
{swap back}
\
]#
{drop the top value}
%
]r:
{i will be used to count the length. Set it to zero to start}
0i:
{load the first character}
^
{while data != -1, where -1 is EOF}
[$1_=~][
{increment i}
i;1+i:
{duplicate crc32 which is on the stack under the current character}
1ø
{Shift crc32 right by 24}
24r;!
{xor the data and crc32}
x;!
{and with 255 to ensure it is in range}
255&
{
The index goes into the constant array, but currently the crc32 is loaded in front of the constant array.
Add 1 to the index to skip it and then load the value from the stack.
}
1+ø
{swap the crc32 to the top of the stack}
\
{left shift it by 8 (multiply by 0x100)}
256*
{xor with the loaded constant}
x;!
{truncate back to 32 bits (and with 0xFFFF_FFFF)}
65535 65535 65536*|&
{load the next character}
^
]#
{drop the -1 left on top of the stack}
%
{to match cksum, feed the length into the crc32 as well}
{load i}
i;
{Note, this will break if i is negative from overflow}
{while i != 0}
[$0=~][
{duplicate and get last byte of i by and with 0xFF}
$255&
{duplicate crc32 which is on the stack under i and the current byte}
2ø
{Shift crc32 right by 24}
24r;!
{xor the data and crc32}
x;!
{and with 255 to ensure it is in range}
255&
{
The index goes into the constant array, but currently i and the crc32 are loaded in front of the constant array.
Add 2 to the index to skip them and then load the value from the stack.
}
2+ø
{rotate the crc32 to the top of the stack}
@
{left shift it by 8 (multiply by 0x100)}
256*
{xor with the loaded constant}
x;!
{truncate back to 32 bits (and with 0xFFFF_FFFF)}
65535 65535 65536*|&
{swap i back on top of the stack and right shift it by 8}
\8r;!
]#
{drop i}
%
{ bitwise negate the crc32 }
~
{truncate back to 32 bits (and with 0xFFFF_FFFF)}
65535 65535 65536*|&
{print the crc32}
.
{print a space}
" "
{print the length}
i;.
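{
For comparison, a minimal Python sketch of the same steps. It is not part of this commit.
Assumptions: the table is generated from the polynomial 0x04C11DB7 instead of being typed in,
and the names make_table, split16, merge16 and cksum are made up for illustration.
It mirrors the script: merge the 16 bit halves, run the table loop over the data,
feed in the length one byte at a time, then negate and truncate.

```python
POLY = 0x04C11DB7          # generator polynomial used by cksum (79764919)
MASK32 = 0xFFFFFFFF        # same value as 65535 65535 65536*| in the script


def make_table():
    """Build the 256-entry table for an MSB-first CRC-32."""
    table = []
    for index in range(256):
        value = index << 24
        for _ in range(8):
            if value & 0x80000000:
                value = ((value << 1) & MASK32) ^ POLY
            else:
                value = (value << 1) & MASK32
        table.append(value)
    return table


def split16(value):
    """Split a 32-bit constant into (low 16 bits, high 16 bits)."""
    return value & 0xFFFF, value >> 16


def merge16(low, high):
    """Rebuild a constant the way the script does: high 65536*|."""
    return low | (high * 65536)


def cksum(data, table):
    """Return (checksum, length) in the order the script prints them."""
    crc = 0
    for byte in data:
        index = ((crc >> 24) ^ byte) & 255
        crc = ((crc * 256) ^ table[index]) & MASK32
    length = len(data)
    while length != 0:
        # the length is fed in least significant byte first
        index = ((crc >> 24) ^ (length & 255)) & 255
        crc = ((crc * 256) ^ table[index]) & MASK32
        length >>= 8
    return (~crc) & MASK32, len(data)


# Round-trip every entry through the split/merge used for the constants.
TABLE = [merge16(*split16(entry)) for entry in make_table()]
checksum, size = cksum(b"abc", TABLE)
print(checksum, size)
```

Empty input skips both loops, so the result is just the negated zero: 4294967295 0.
}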

@@ -1,14 +0,0 @@
{
This currently doesn't run because the interpreter will overflow the stack, even in an optimized build.
I tried to run this with unlimited stack size, but I gave up after it used 24+ GB of my memory on stack space.
This is just to do the CRC32 of the letter a, which runs essentially instantly in one of the JS interpreters.
Maybe I really need to model things differently to help Roc figure out how to optimize or otherwise avoid essentially infinite recursion.
}
{ unix cksum, CRC32. -- Jonathan Neuschäfer <j.neuschaefer@gmx.net> }
[[$0>][\2*\1-]#%]l:[0\128[$0>][$2O&0>[$@\64/x;!@@$g;*@x;!@@]?2/]#%%]h:
[[$0>][\2/\1-]#%]r:[1O$8 28l;!1-&$@=~\24r;!\[128|]?x;!h;!\8l;!x;!]s:79764919g:
[q1_0[\1+\^$1_>]s;#%@%\$@[1O0>][1O255&s;!\8r;!\]#~n;!32,%.10,]m:[$2O&@@|~|~]x:
[$0\>\1O[$u;!]?\~[$.]?%]n:[h;y:[3+O]h:255[$0>][$y;!\1-]#m;!256[$0>][\%1-]#%]o:
[1000$$**0@[$$0\>\4O\>~|][2O-\1+\]#\.\[10/$0>][\$2O/$.2O*-\]#%%]u: {width: 78}
{ usage: run m for "main" or o for "optimized" (builds a lookup table) } m;!
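{
The helper words l, r and x above are hard to read in their packed form.
A rough Python sketch of what they appear to do, under the assumption that all values are non-negative.
The names shift_left, shift_right and xor are made up for illustration.
FALSE has no shift or xor operators, so shifts are loops of doubling or halving
and xor is built from and, or and not.

```python
def xor(a, b):
    """xor from and, or and not only: (a | b) & ~(a & b)."""
    return (a | b) & ~(a & b)


def shift_right(value, count):
    """Right shift by halving count times; assumes value >= 0."""
    while count > 0:
        value = value // 2
        count -= 1
    return value


def shift_left(value, count):
    """Left shift by doubling count times."""
    while count > 0:
        value = value * 2
        count -= 1
    return value


print(xor(12, 10), shift_right(256, 4), shift_left(1, 8))  # prints: 6 16 256
```

Each shift costs one loop iteration per bit, which is part of why a shift by 24 is slow under an interpreter.
}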

@@ -1 +1 @@
a
abc