mirror of
https://github.com/facebook/sapling.git
synced 2024-10-09 16:31:02 +03:00
f4a078e257
Summary: High-level goal of this diff: We have a problem in long_running_request_queue: if a tw job dies in the middle of processing a request, that request will never be picked up by any other job and will never be completed. The idea of the fix is fairly simple: while a job is executing a request, it needs to constantly update the inprogress_last_updated_at field with the current timestamp. If a job dies, other jobs will notice that the timestamp hasn't been updated for a while and mark the request as "new" again, so that somebody else can pick it up. Note that this obviously doesn't prevent all possible race conditions: the worker might just be too slow and fail to update the inprogress timestamp in time. We'd handle that race condition on other layers, i.e. our worker guarantees that every request will be executed at least once, but it doesn't guarantee that it will be executed exactly once. Now a few notes about implementation: 1) I intentionally separated the methods for finding abandoned requests and for marking them new again. I did so to make it easier to log which requests were abandoned (logging will come in the next diffs). 2) My original idea (D29821091) had an additional field called execution_uuid, which would be changed each time a new worker claims a request. In the end I decided it's not worth it: while execution_uuid can reduce the likelihood of two workers running at the same time, it doesn't eliminate it completely, so it doesn't really give us much. 3) It's possible that two workers will be executing the same request and updating the same inprogress_last_updated_at field. As I mentioned above, this is expected, and the request implementation needs to handle it gracefully. Reviewed By: krallin Differential Revision: D29845826 fbshipit-source-id: 9285805c163b57d22a1936f85783154f6f41df2f |
||
---|---|---|
.. | ||
fs | ||
hg-server | ||
integration | ||
locale | ||
mononoke | ||
scm | ||
test_support | ||
test-data | ||
.gitignore | ||
Eden.project.toml |
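
The heartbeat scheme described in the commit summary above can be illustrated with a small in-memory model. This is only a sketch under stated assumptions, not the code from the diff: the class and method names (`InMemoryQueue`, `claim`, `heartbeat`, `find_abandoned`, `mark_new`), the status strings, and the 60-second threshold are all hypothetical. Only the `inprogress_last_updated_at` field name and the split between finding abandoned requests and marking them new come from the summary. The staleness re-check inside `mark_new` is also an assumption.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Request:
    id: int
    status: str = "new"  # hypothetical states: "new" or "inprogress"
    inprogress_last_updated_at: Optional[float] = None


class InMemoryQueue:
    """Toy stand-in for a request queue table (hypothetical API)."""

    def __init__(self, abandoned_after_secs: float) -> None:
        self.abandoned_after_secs = abandoned_after_secs
        self.requests: Dict[int, Request] = {}

    def add(self, request_id: int) -> None:
        self.requests[request_id] = Request(id=request_id)

    def claim(self, now: float) -> Optional[Request]:
        """Pick the first "new" request and mark it in progress."""
        for req in self.requests.values():
            if req.status == "new":
                req.status = "inprogress"
                req.inprogress_last_updated_at = now
                return req
        return None

    def heartbeat(self, request_id: int, now: float) -> bool:
        """Refresh the timestamp; returns False if the request is not in progress."""
        req = self.requests[request_id]
        if req.status != "inprogress":
            return False
        req.inprogress_last_updated_at = now
        return True

    def _is_stale(self, req: Request, now: float) -> bool:
        return (
            req.status == "inprogress"
            and req.inprogress_last_updated_at is not None
            and req.inprogress_last_updated_at < now - self.abandoned_after_secs
        )

    def find_abandoned(self, now: float) -> List[int]:
        """Step 1: only report stale requests, so a caller can log them."""
        return [r.id for r in self.requests.values() if self._is_stale(r, now)]

    def mark_new(self, request_ids: List[int], now: float) -> int:
        """Step 2: reset requests that are still stale (assumed re-check)."""
        count = 0
        for request_id in request_ids:
            req = self.requests[request_id]
            if self._is_stale(req, now):
                req.status = "new"
                req.inprogress_last_updated_at = None
                count += 1
        return count


if __name__ == "__main__":
    queue = InMemoryQueue(abandoned_after_secs=60)
    queue.add(1)
    queue.claim(now=0)                   # worker A claims request 1
    queue.heartbeat(1, now=30)           # worker A is alive at t=30, then dies
    stale = queue.find_abandoned(now=100)
    queue.mark_new(stale, now=100)       # request 1 becomes claimable again
    print(stale, queue.requests[1].status)
```

As the summary notes, a slow but living worker can miss the deadline and have its request handed to a second worker. A scheme like this gives at-least-once execution, not exactly-once, so request handlers need to tolerate duplicate runs.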