Commit Graph

482 Commits

Author SHA1 Message Date
Philip Monk
0e876b3cd4
ames: better printfs 2019-12-18 11:31:17 -03:30
Philip Monk
16d98e5eda
jael: stop ship-to-ship 2019-12-18 11:19:41 -03:30
Philip Monk
18c3e7253b
jael: add "eager" mode to avoid hitting nodes as much 2019-12-18 10:58:00 -03:30
Philip Monk
15bd35301e
jael: properly store ship sources 2019-12-18 10:42:57 -03:30
Ted Blackman
9fb37543ec ford: clear build results on +load 2019-12-18 00:25:27 -05:00
Philip Monk
7ca3d9624e
ames: handle misordered crashing boons
Two bugs fixed here: first, if the %done reentrancy triggered another
%boon, that wasn't getting translated to a %lost, even though it could
have been the reason the event crashed in the first place.

Second, the %done reentrancy needs to happen after we emit our move, so
that we don't invert the order of the %boon's we produce.
2019-12-17 20:58:30 -08:00
Philip Monk
e5ac690fd3
jael: re-enable ship-to-ship communication
Also fix bug in eth-watcher that didn't cancel outstanding threads when
config changes.

And set default rift for ourselves to 0.
2019-12-17 16:14:07 -08:00
Philip Monk
769a1c96af
eyre: turn sigpam into flog
This error is mostly harmless, but it does indicate we aren't cleaning
up our subscriptions properly.  This lets you silence with |knob.

fixes #2088
2019-12-14 00:49:23 -08:00
Philip Monk
b14606660a
goad: recompile apps after changes to /sys
OTAs commonly end up in an inconsistent state if apps depend on changes
to /sys.  For example, the %sift changes break on OTA because %spider
needs to be reloaded so that it's aware of the new thread type.  This
adds a %goad app, which reloads all apps after every change to /sys.

Getting this to start OTA is nontrivial, but this pattern should work
for apps in the future.  The changes to clock shouldn't generally be
necessary; they are only necessary here because we can't rely on hood to
start goad, since hood fails to compile if it's run before zuse is
reloaded.  Once goad is active, this will cease to be a problem.
2019-12-13 17:14:51 -08:00
Jared Tobin
9ba4505086
Merge branch 'ames-sift' (#2081)
* ames-sift:
  ames: refactor +load
  ames: +send-blob better ship printing
  hood: |ames-sift generator to trace by ship
  ames: add %sift  to trace by ship

Signed-off-by: Jared Tobin <jared@tlon.io>
2019-12-12 16:06:32 +08:00
Ted Blackman
35596ca7de
ames: refactor +load 2019-12-12 15:55:37 +08:00
Ted Blackman
d4574b5da4
ames: +send-blob better ship printing 2019-12-12 15:55:36 +08:00
Ted Blackman
d77fb0f685
ames: add %sift to trace by ship 2019-12-12 15:55:32 +08:00
Jared Tobin
85d447f173
Merge branch 'philip/gall-noop' (#2073)
* origin/philip/gall-noop:
  gall: no-op on duplicate watch-ack

Signed-off-by: Jared Tobin <jared@tlon.io>
2019-12-12 15:50:19 +08:00
Jared Tobin
2aa86e3121
Merge branch 'philip/stuck-flow' (#2071)
* origin/philip/stuck-flow:
  ames: recover from mismatched message nums

Signed-off-by: Jared Tobin <jared@tlon.io>
2019-12-12 15:49:53 +08:00
Jared Tobin
e4a7dae888
Merge branch 'philip/login-instructions' (#2039)
* origin/philip/login-instructions:
  eyre: add instructions to login page

Signed-off-by: Jared Tobin <jared@tlon.io>
2019-12-12 15:46:36 +08:00
Philip Monk
3b41a8be15
gall: no-op on duplicate watch-ack
fixes #2070
2019-12-10 18:49:50 -08:00
Philip Monk
29f078bb14
ames: don't forward up the sponsorship chain
This is *actually* why the galaxies are under so much load.  They're in
a forwarding loop with their stars, and this breaks the loop.
2019-12-10 16:20:12 -08:00
Philip Monk
68279d91e4
gall: remove message type from wire
%leave over the network didn't work because we included the message type
in the wire from gall, so the duct for the initial %watch and the %leave
were different.  We need to know the message type so we can route the
acknowledgment as %poke-ack, %watch-ack, or no-op.

This moves this piece of information to a piece of state, where we queue
up the message types per [duct wire].  Ames guarantees that
acknowledgments will come in order.

This also includes an easy state adapter.  The more interesting part of
the upgrade is that we likely have outstanding subscriptions with the
old wire format.  The disadvantage of storing information in wires is
that it can't be upgraded in +load.  So, here we listen for updates on
the old wire format, and when we get them we kill the old subscription,
so that it will be recreated with the new wire format.

As an aside, this is a good example of what we mean when we say
subscriptions may be killed at any time, so apps must handle this case.

Finally, this fixes the "attributing" ship to ~zod for agent requests.
This information was ignored for agent requests, but including it causes
spurious duct mismatches.
2019-12-10 19:32:26 +08:00
Philip Monk
e7c8a44e11
ames: recover from mismatched message nums
We've seen issues where the message-num of the head of live.state is
less than current.state.  When this happens, we continually try to
resend message n-1, but we throw away any acknowledgment for n-1 because
current.state is already n.  This halts progress on that flow.

We don't know what causes us to get in this bad state, so this adds an
assert to the packet pump that we're in a good state, run every time
the packet pump is run.  When this crashes, we can turn on |ames-verb
and hopefully identify the cause.

This also adds logic to +on-wake in the packet pump to not try to resend
any messages that have already been acknowledged.  This is just to
rescue ships that currently have these stuck flows.

(Incidentally, I'd love to have a rr-style debugger for stuff like this.
Just run a command that says "replay my event log watching for this
specific condition and then stop and let me poke around".)
2019-12-09 23:31:18 -08:00
Philip Monk
abde1d8aa9
ames: reduce load by increasing timer delays 2019-12-06 12:11:06 -08:00
Philip Monk
956a3c7420
eyre: add instructions to login page 2019-12-05 12:31:42 -08:00
Ted Blackman
bee0b5803a
ames: don't crash on missing queued larval event 2019-12-05 17:04:24 +08:00
Jared Tobin
41b64feb16
Merge branch 'philip/p2p' (#2025)
* philip/p2p:
  ames: don't overwrite lane if already direct

Signed-off-by: Jared Tobin <jared@tlon.io>
2019-12-05 16:08:01 +08:00
Philip Monk
5406f06092
ames: don't overwrite lane if already direct
This is why basically all packets are going through the galaxies right
now.  Most of the time, the flow right now is:

* talking to ~dopzod but don't know where it is, so ask ~zod to forward,
  which it does

* ~dopzod responds both directly (on the origin lane) and through ~zod

* (if NAT, the direct response doesn't get back, but the one through
  ~zod does. Then you respond directly to ~dopzod because their lane
  piggybacked on the response. ~dopzod responds both directly and
  through ~zod, and the story picks up the same as if you weren't behind a
  NAT)

* now you have a direct lane to ~dopzod, so all is well.

* now the duplicate response from ~dopzod through ~zod comes in (takes a
  little longer because it's bouncing off ~zod), resetting your lane to
  "provisional"

* since your lane is provisional, you send your next packet both
  directly and through ~zod

* GOTO 2

This change says "if I already have a direct lane, don't overwrite it
with a provisional one". This way, the only way the direct lane can be
overwritten is if they stop responding on it (cleared on "not
responding; still trying").

I also added |- to +send-blob to make |ames-verb %rot less confusing.
2019-12-05 16:05:06 +08:00
Jared Tobin
75ca54ca24
Merge branch 'ames-sponsor-scry-2' (#2021)
* ames-sponsor-scry-2:
  ames: scry for sponsor and don't crash on jael response

Signed-off-by: Jared Tobin <jared@tlon.io>
2019-12-05 15:43:00 +08:00
Ted Blackman
a7e638ebab ames: scry for sponsor and don't crash on jael response 2019-12-04 17:18:39 -05:00
Ted Blackman
b3f757d88b
ames: send larval crashes to dill 2019-12-05 02:23:13 +08:00
Ted Blackman
4c9cc1542a
ames: dequeue failed larval timer 2019-12-05 02:23:13 +08:00
Ted Blackman
c20f2391e1
ames: print and retry larval crashes 2019-12-05 02:22:27 +08:00
Philip Monk
ebec1eb54f
ping: delay kick until after ames processes breach 2019-12-04 02:27:35 -08:00
Philip Monk
9bc6ccb7fc
ames: don't say not responding if we haven't been talking 2019-12-03 20:21:43 -08:00
Philip Monk
38197fc79d
gen: add comments on new generators 2019-12-03 16:41:29 -08:00
Philip Monk
98b886acf2
Merge remote-tracking branch 'jfranklin9000/master' into rc 2019-12-03 15:10:32 -08:00
Philip Monk
702dd2c07a
verb: add +verb %bowl to print bowl on every event 2019-12-03 15:05:42 -08:00
Philip Monk
8c2c52c01c
ames: make life printf helpful 2019-12-03 13:06:04 -08:00
Philip Monk
2954ed0b55
gall: correctly construct wire for ap-specific-take 2019-12-02 23:46:15 -08:00
Philip Monk
5f1c4805fe
ames: printfs 2019-12-02 23:13:48 -08:00
Philip Monk
c90107659b
Merge remote-tracking branch 'origin/rc-ames-verb' into rc 2019-12-02 20:22:04 -08:00
Philip Monk
93d3edbf73
pill 2019-12-02 20:09:36 -08:00
Philip Monk
9186f232f1
gall: kick after sending leave 2019-12-02 20:09:36 -08:00
Ted Blackman
d0d45ed8f2
ames: fix message pump to complete queueing fix 2019-12-02 20:09:35 -08:00
Ted Blackman
6dcb6622fa
ames: fix ack queueing 2019-12-02 20:09:35 -08:00
Ted Blackman
0cb6464e9d ames: %spew to set verbosity 2019-12-02 18:46:40 -05:00
Philip Monk
096273cf4a
gall: add state upgrade for %pack 2019-12-02 03:20:34 -08:00
Philip Monk
0431c3c073
Merge remote-tracking branch 'origin/jam-cue-rock' into rc 2019-12-02 02:08:37 -08:00
Philip Monk
e3005eaffa
ames: clear out-of-order messages from packet queue 2019-12-02 00:41:50 -08:00
Philip Monk
aaf7b3b42e
ames: don't crash in +on-take-wake 2019-12-01 16:00:32 -08:00
Ted Blackman
0a8b12c882 ames: state adapter 2019-12-01 02:49:46 -05:00
Ted Blackman
900d923ccc ames: fix aggressive lane timeout (still needs migration) 2019-12-01 02:49:46 -05:00
John Franklin
c36edc838e jael: add %lyfe and %ryft scrys (unitized %life and %rift) and use in +trouble 2019-11-30 19:04:18 -06:00
Philip Monk
d6c1ff4e20
ames: add routing diagnostics 2019-11-30 14:44:57 -08:00
Ted Blackman
071b1a4bbe ames: ~s30 max timeout instead of ~m2 2019-11-28 01:17:34 -05:00
Ted Blackman
93604c2f29 Merge branch 'rc' of github.com:urbit/urbit into rc 2019-11-27 23:06:53 -05:00
Ted Blackman
3779cca5a9 ames: try sponsors above .our 2019-11-27 23:06:39 -05:00
Philip Monk
23cc21c383
ames: remove printf 2019-11-27 19:39:12 -08:00
Ted Blackman
e9ba500ee4 Merge branch 'rc' of github.com:urbit/urbit into rc 2019-11-27 22:31:18 -05:00
Philip Monk
26c5be2948
ames: remove printf 2019-11-27 18:40:33 -08:00
Ted Blackman
9af7b3954a ames: ignore encrypted packets from alien comets 2019-11-27 20:58:18 -05:00
Philip Monk
138cbb5d2e
ames: clean up printfs 2019-11-27 16:58:26 -08:00
Philip Monk
fdb1069b33
ames: printfs 2019-11-27 16:43:09 -08:00
Philip Monk
fc74ab2dbd
ames: count unsent messages for backpressure 2019-11-27 15:58:38 -08:00
Philip Monk
74b0f66850
ames: continue processing memos after %done 2019-11-27 15:13:17 -08:00
Philip Monk
f035955a36
ames: rename alef -> ames 2019-11-27 00:46:02 -08:00
Philip Monk
96be4b65a5
Merge remote-tracking branch 'origin/mall-testnet' into philip/mall-real 2019-11-27 00:14:08 -08:00
Ted Blackman
55c19a7d47 ames: use +crub more in comet logic 2019-11-27 02:13:15 -05:00
Ted Blackman
e3dba2ea75 incorporate @pcmonk's fix for .channel and .peer-state 2019-11-27 02:11:43 -05:00
Philip Monk
d0d66b99f4
ames: save acknowledgement when negative %done 2019-11-26 22:48:22 -08:00
Ted Blackman
6c18a4ef76 ames: use +crub for comet attestation 2019-11-27 01:15:05 -05:00
Ted Blackman
7fdb940b5c ames: fix another comet bug 2019-11-26 23:52:43 -05:00
Ted Blackman
f91a1f4873 ames: fix egregious comet bug 2019-11-26 22:54:00 -05:00
Philip Monk
9a94c35d4d
gall: acutally delete outgoing on watch-ack 2019-11-26 18:18:44 -08:00
Ted Blackman
3f550d669e Merge remote-tracking branch 'origin/philip/mall-real' into mall-testnet 2019-11-26 20:20:23 -05:00
Philip Monk
bf55197baf
ames: backpressure fixes 2019-11-26 14:56:20 -08:00
Philip Monk
7cfe1542e5
ames: too big of messages 2019-11-26 12:00:27 -08:00
Philip Monk
4d1457bbaa
Merge remote-tracking branch 'origin/master' into philip/mall-real 2019-11-24 00:01:04 -08:00
Joe Bryan
3741e734cb dill: adds |pack and and friends 2019-11-22 17:24:42 -08:00
Philip Monk
f8b612d053
ph: add ph-all to run multiple tests 2019-11-22 12:46:30 -08:00
Ted Blackman
5110b3459b ames: only %clog on 5 unsent messages 2019-11-22 08:43:08 -05:00
Ted Blackman
4b4e5ba80c ames: fix typo 2019-11-22 08:42:19 -05:00
Ted Blackman
93ab01e274 ames: minor printing improvements 2019-11-21 23:10:49 -05:00
Philip Monk
9d47222139
Merge remote-tracking branch 'origin/mall-testnet' into philip/mall-real 2019-11-21 19:07:28 -08:00
Ted Blackman
4446c96814 yet another oops 2019-11-21 21:24:34 -05:00
Ted Blackman
8bbdc45a01 another oops 2019-11-21 21:23:34 -05:00
Ted Blackman
a4bad50e71 oops 2019-11-21 21:22:05 -05:00
Ted Blackman
ae76a33157 ames: kill timers on breach 2019-11-21 21:20:39 -05:00
Ted Blackman
1d88275032 ames: abandon ill-fated decryption reordering 2019-11-21 21:17:43 -05:00
Ted Blackman
29c816c65c ames: fix typo 2019-11-21 21:15:05 -05:00
Ted Blackman
cf543134ee ames: don't repeat broken timers 2019-11-21 21:13:09 -05:00
Ted Blackman
e7be81e219 ames: check life before decrypt 2019-11-21 21:12:42 -05:00
Ted Blackman
b944024893 ames: fix nest-fail; update solid pill 2019-11-21 20:41:17 -05:00
Ted Blackman
4988875ea8 ames: don't brick on timer error 2019-11-21 19:52:57 -05:00
Philip Monk
bdbd35fcf3
gall: virtualize scry
Compare +mute and +mule.  Those pass through scry, which doesn't allow us to
catch crashes due to blocking scry.  If you intercept scry, you can't preserve
the type polymorphically.  By monomorphizing, we are able to do so safely.
2019-11-21 14:47:06 -08:00
Philip Monk
b8903e9a6f
gall: fix ap-kill-down
This broke when %kick was handled by resubscribing on your own ship
because it processed the %kick before the %leave.  For example, `@t`404
at the dojo would put the dojo in an unworkable state.

You want the %leave to be processed first because you can't do a
"resubscribe" in response to that.
2019-11-20 13:24:19 -08:00
Philip Monk
1d1119c1f3
drum: unsubscribe on poke-ack failure 2019-11-20 12:12:33 -08:00
Philip Monk
a5412f01de
Merge branch 'alef-testnet-merge' into philip/mall-real 2019-11-19 13:03:07 -08:00
Philip Monk
6a406e6b29
gall: mall -> gall 2019-11-18 20:36:21 -08:00
Philip Monk
9862dccc0e
mall: age -> app 2019-11-18 19:28:59 -08:00
Philip Monk
fea3bd60e4
dojo: add syntax for imps
This adds syntax for running imps.  For example:

-time ~s1

Runs the "time" imp with the argument ~s1.  This blocks the terminal
until the imp has completed (backspace kills it, of course).  You could
avoid blocking the terminal if you sacrifice the ability to use imps as
sources in more complex commands.

In keeping with this one-and-done view of imps, this also changes spider
to not use a live build of imps.  This significantly reduces the amount
of uncertainty around imps -- spider will try exactly once to run your
imp, and if it fails it'll tell you.  If you want to retry, that's up to
you.
2019-11-15 17:20:56 -08:00
Ted Blackman
49d81265c3 alef: clean up printing 2019-11-14 19:10:48 -05:00