urbit/pkg/arvo/sys
Paul Driver 3aaf82de5e arvo: accept embedded nulls in to-wain:format
Prior to this commit, there was a jet mismatch in to-wain (formerly
called lore, and still jetted under that name). 0 bytes in the middle of
a cord caused the jet to crash, whereas the hoon simply treated them as
the end of the cord and truncated the output. The history of this behavior
is fraught with controversy. This commit rectifies the current mess with
the following rationale: Null bytes are valid ASCII/UTF-8, and \n\n in
the input will cause null list items in the output, so nulls are (for
the purposes of to-wain) allowed in cords. Trailing nulls cannot be
represented because of the nature of atoms, but that is outside the
scope of to-wain's concern. Therefore to-wain should simply measure the
cord and split on newlines, and do nothing fancy at all with nulls.
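
The rule above can be modelled outside of hoon. The following is a minimal
Python sketch, not the hoon or the jet from this commit: it assumes a cord is
an atom holding its bytes in little-endian order, that an empty cord yields an
empty list, and that each output line is itself an atom. The names `cord` and
`to_wain` are illustrative helpers only.

```python
from typing import List


def cord(data: bytes) -> int:
    """Pack bytes into an atom (assumed little-endian, first byte lowest)."""
    return int.from_bytes(data, "little")


def to_wain(txt: int) -> List[int]:
    """Split a cord atom on newline bytes, with no special handling of nulls."""
    if txt == 0:
        # Assumption: the empty cord produces an empty list of lines.
        return []
    # Measure once: the number of bytes needed to hold the atom.
    length = (txt.bit_length() + 7) // 8
    data = txt.to_bytes(length, "little")
    # Split on byte 10 (newline); embedded 0 bytes stay inside their line.
    return [int.from_bytes(line, "little") for line in data.split(b"\n")]


if __name__ == "__main__":
    # An embedded null neither truncates the output nor raises an error.
    print(to_wain(cord(b"a\x00b\nc")) == [cord(b"a\x00b"), cord(b"c")])
    # Two consecutive newlines give a null (zero) item in the output.
    print(to_wain(cord(b"a\n\nb")) == [cord(b"a"), 0, cord(b"b")])
    # Trailing nulls vanish when the atom is built, before any splitting.
    print(cord(b"a\x00") == cord(b"a"))
```

In this model the trailing-null limitation comes from the integer
representation itself, which matches the point that it is outside the scope
of to-wain's concern.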

In addition, the hoon for to-wain was written in an inefficient style
that produced a lot of intermediate garbage atoms via rsh and cat. This
commit's implementation measures once and cuts once, so to speak, and so
avoids the intermediate garbage. Quick benchmarks suggest it is about
20x faster than the old hoon, but still orders of magnitude slower than
the jetted code. to-wain is the workhorse for the txt mark, so we should
still prefer to have a jet.

The old jet is left wired up under %lore, and should be removed when
support for the old, unupgraded zuse is no longer necessary. A new jet
with matching null handling has been wired up under the name %leer.
2020-09-18 16:15:10 -07:00
vane arvo: adds |meld, triggering memory unification 2020-09-09 22:50:43 -07:00
arvo.hoon Merge branch 'release/next-sys' into jb/m/behn-scry 2020-07-22 02:02:05 +02:00
hoon.hoon Merge pull request #3304 from ohAitch/patch-3 2020-08-13 10:44:36 +02:00
zuse.hoon arvo: accept embedded nulls in to-wain:format 2020-09-18 16:15:10 -07:00