Summary:
Right now, not being able to find the mmap file can be seen as data corruption.
The only case where NotFound needs special handling is at open time.
This fixes some cases covered by an upcoming test about `repair`.
Reviewed By: xavierd
Differential Revision: D17741999
fbshipit-source-id: 1bd7c65c5a6381892723b31e2e749b22081e96d2
Summary: Now all `indexedlog` APIs use the new Error type.
Reviewed By: xavierd
Differential Revision: D17732136
fbshipit-source-id: 8d306a08d8e8052d1c5e68fc5f05a9eed5c7d21f
Summary: This provides more details, and makes callsites simpler.
Reviewed By: xavierd
Differential Revision: D17732127
fbshipit-source-id: 0fe6dedee4ebb8874ea95505c86d8b107e3367ff
Summary:
Similar to the previous change, add context for Log APIs.
This shows error context that might replace backtrace. For example, run:
cargo test --test low_fileno_limit -- --nocapture
An example error looks like:
"/tmp/.tmpjrsfQt/rotatelog/1/index-key1": cannot duplicate file descriptor
in ChecksumTable::try_clone
in Index::try_clone
Index.path = "/tmp/.tmpjrsfQt/rotatelog/1/index-key1"
in Log::sync
Log.dir = Some("/tmp/.tmpjrsfQt/rotatelog/1")
Caused by 1 errors:
- Os { code: 24, kind: Other, message: "Too many open files" }
(Ignoring whitespace will make this diff much easier to review)
Reviewed By: xavierd
Differential Revision: D17732124
fbshipit-source-id: b0d500652d80b4a4755453c69bc05d467ecbdf90
Summary:
Since we lost backtrace by opting out of `failure`, it'd be nice to restore some
"backtrace" information like what Index function is being called.
This diff adds it. It also includes more context like what key is being looked
up so it might actually be more useful than backtrace.
(Ignoring whitespace will make this diff much easier to review)
Reviewed By: xavierd
Differential Revision: D17732126
fbshipit-source-id: 8e5a2c714bee8a943076818f0cff3a21498a954e
Summary: This basically involves adding context to io::Error and other error types.
Reviewed By: xavierd
Differential Revision: D17732130
fbshipit-source-id: 79fb3b93d57562f1922f3990a8bda0018d2675e8
Summary: The new utility function makes it easier to deal with mmap errors.
Reviewed By: xavierd
Differential Revision: D17732139
fbshipit-source-id: 93c8209b983d51198ebb367db983a2e9bc498d63
Summary: This makes it easier to lock a directory and makes error handling easier.
Reviewed By: xavierd
Differential Revision: D17732133
fbshipit-source-id: a404d41c0aaee7aad43271433f1352a8aa06bccb
Summary:
Migrate the remaining part of Index functions to use the new Error type. This
gives us an accurate view about whether an error indicates data corruption or
not, and makes the code more friendly - it works with `std::error::Error` now.
Reviewed By: xavierd
Differential Revision: D17705168
fbshipit-source-id: 8ae518602e7379d121e718a08127f0873f2e2423
Summary:
Migrate some return types from Fallible to the new Result. The main changes are
the way `io::Result` gets handled. The new API enforces attaching a `path` and
a message to them.
Reviewed By: xavierd
Differential Revision: D17705163
fbshipit-source-id: d060bdb2846a75c588b99201fd07ca3872f3a358
Summary:
Migrate more free-form error handling like `data_error`, `parameter_error`
to the new Error type.
Reviewed By: xavierd
Differential Revision: D17705164
fbshipit-source-id: 45560a96e36fb5e83a9e365506e27c201f9448a6
Summary:
Migrate `range_error` and `verify_checksum` to the `IndexBuf` trait so they
all get path information on error. Remove the free-form `range_error` and
`verify_checksum` functions.
Reviewed By: xavierd
Differential Revision: D17705165
fbshipit-source-id: 556fda8081c69b6beccc8c666902810a90635231
Summary:
A lot of functions take a (buf, checksum) tuple, instead of `Index`, as input.
That is to avoid issues where borrowing the entire `Index` forbids modifying
other fields in `Index`.
However, not taking `Index` means it cannot figure out the file path on error.
To solve both problems, this diff defines a trait that is a subset of Index
including (on-disk buf, checksum, path). Then migrate functions from using
(buf, checksum) to the new trait (if it only needs to read from the on-disk
buffer), or &Index (if it also needs to work with in-memory dirty/mutable
data).
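
A minimal sketch of the idea, not the actual indexedlog code: the trait exposes
only the read-only on-disk parts, so helpers can report the path on error while
`Index` stays free to mutate its in-memory fields. The method names, the
`OnDisk` struct, `copy_to_dirty`, and the `String` error are hypothetical, and
the checksum part is omitted for brevity.

```rust
use std::path::{Path, PathBuf};

/// Read-only view of the on-disk part of an index (hypothetical shape).
pub trait IndexBuf {
    fn buf(&self) -> &[u8];
    fn path(&self) -> &Path;

    /// Build an error message that always includes the file path.
    fn range_error(&self, start: usize, len: usize) -> String {
        format!(
            "{:?}: byte range {}..{} is unavailable",
            self.path(),
            start,
            start + len
        )
    }

    /// Read one byte, reporting the path if the offset is out of range.
    fn read_u8(&self, offset: usize) -> Result<u8, String> {
        self.buf()
            .get(offset)
            .copied()
            .ok_or_else(|| self.range_error(offset, 1))
    }
}

/// The on-disk subset: buffer plus path.
pub struct OnDisk {
    pub buf: Vec<u8>,
    pub path: PathBuf,
}

impl IndexBuf for OnDisk {
    fn buf(&self) -> &[u8] {
        &self.buf
    }
    fn path(&self) -> &Path {
        &self.path
    }
}

/// The full index: on-disk subset plus in-memory dirty data.
pub struct Index {
    pub on_disk: OnDisk,
    pub dirty_entries: Vec<u8>,
}

impl Index {
    /// Reads through the trait (borrowing only `on_disk`), then mutates
    /// another field of `Index`.
    pub fn copy_to_dirty(&mut self, offset: usize) -> Result<(), String> {
        let byte = self.on_disk.read_u8(offset)?;
        self.dirty_entries.push(byte);
        Ok(())
    }
}
```

Functions that only read the on-disk buffer can take `&impl IndexBuf`; only
those that touch dirty data need `&Index`.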
Reviewed By: xavierd
Differential Revision: D17705166
fbshipit-source-id: 90bde88142ea3718a2093beb02b8030d725a0e15
Summary:
Change some `range_error` to `Index::range_error`.
The new error is better because it includes path information.
Reviewed By: xavierd
Differential Revision: D17705162
fbshipit-source-id: 1de1c7cdd730fcf7c6c39e9e5840939fa561bc33
Summary:
Change `read_bitmap_unchecked` and `read_raw_int_unchecked` to use the new
Error type. Change their function signature from taking `&[u8]` to taking
`&Index` so we can get the file path in the error message.
Reviewed By: xavierd
Differential Revision: D17705167
fbshipit-source-id: 82bcbe21061cdf993d5c7f9867941c1f936166e5
Summary:
Migrate to the new Error type so we can know whether an error is considered
as a data corruption. The new Error should also provide more explicit error
messages.
(This diff is easier to review if whitespace changes are all ignored)
Reviewed By: xavierd
Differential Revision: D17696536
fbshipit-source-id: bfceffbf75a75940a90c914da7914a601d75a747
Summary:
`io::Result` is widely used inside indexedlog, and those values need to be
converted to `Result`.
This diff defines the conversion function. It enforces 2 context parameters:
- File path.
- What operation was being performed? This is needed since we will lose the backtrace.
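
A rough sketch of such a conversion, assuming an extension trait on
`io::Result`. The names `IoResultExt`, `context`, `read_file`, and the fields of
this `Error` are made up for illustration and are not the real indexedlog API.

```rust
use std::fmt;
use std::io;
use std::path::{Path, PathBuf};

/// Hypothetical error carrying the two mandatory pieces of context.
#[derive(Debug)]
pub struct Error {
    pub path: PathBuf,
    pub operation: String,
    pub cause: io::Error,
}

impl fmt::Display for Error {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "{:?}: {} ({})", self.path, self.operation, self.cause)
    }
}

impl std::error::Error for Error {
    fn source(&self) -> Option<&(dyn std::error::Error + 'static)> {
        Some(&self.cause)
    }
}

pub type Result<T> = std::result::Result<T, Error>;

/// Conversion from `io::Result`; both context parameters are required,
/// so a call site cannot forget them.
pub trait IoResultExt<T> {
    fn context(self, path: &Path, operation: &str) -> Result<T>;
}

impl<T> IoResultExt<T> for io::Result<T> {
    fn context(self, path: &Path, operation: &str) -> Result<T> {
        self.map_err(|cause| Error {
            path: path.to_path_buf(),
            operation: operation.to_string(),
            cause,
        })
    }
}

/// Example call site.
pub fn read_file(path: &Path) -> Result<Vec<u8>> {
    std::fs::read(path).context(path, "cannot read file")
}
```

Because there is no blanket `From<io::Error>` in this sketch, a bare `?` on an
`io::Result` does not compile, which is what forces the context to be supplied.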
Reviewed By: xavierd
Differential Revision: D17696533
fbshipit-source-id: d9417a6b65cbfbb5d6d7d1c6449ddd13e3035b5c
Summary:
I need to make RotateLog understand whether errors that occurred in Log/std::io/Index
are data corruption or not. To be explicit, I defined an `is_data_corruption`
method. Downcasting an error chain does not look like a reliable solution (e.g. it
is hard to be confident that it covers all possible cases).
There are other motivations for this change:
- `failure`: it is unfriendly in a low level library; it requires callsites to
use failure, too. `failure` is less maintained - it still provides the nice
backtrace feature but it's more friendly if libraries just use std Error (we
lose backtrace inside the library, but hopefully the errors are in a high
quality so backtrace in the application is enough for debugging).
- Error with multi-sources. Both std and failure Error provide one slot for
"cause". Sometimes it's desirable to use multiple slots. For example,
RotateLog::open fails to read existing logs, and also fails to auto recover
by creating a new log. In that case, ideally we keep both errors in the
returned type.
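
For illustration only, an error type along these lines could look like the
sketch below. Apart from `is_data_corruption`, every name here (`corruption`,
`other`, `with_source`, `source_count`, the field layout) is an assumption, not
the actual indexedlog definition.

```rust
use std::error::Error as StdError;
use std::fmt;

/// Hypothetical error: a message, an explicit corruption flag, and any
/// number of underlying causes.
#[derive(Debug)]
pub struct Error {
    message: String,
    corruption: bool,
    sources: Vec<Box<dyn StdError + Send + Sync + 'static>>,
}

impl Error {
    /// An error that indicates on-disk data corruption.
    pub fn corruption(message: impl Into<String>) -> Self {
        Error { message: message.into(), corruption: true, sources: Vec::new() }
    }

    /// Any other kind of error (bad parameter, I/O failure, ...).
    pub fn other(message: impl Into<String>) -> Self {
        Error { message: message.into(), corruption: false, sources: Vec::new() }
    }

    /// Attach one more cause; may be called more than once.
    pub fn with_source(mut self, err: impl StdError + Send + Sync + 'static) -> Self {
        self.sources.push(Box::new(err));
        self
    }

    /// Explicit check, so callers do not need to downcast an error chain.
    pub fn is_data_corruption(&self) -> bool {
        self.corruption
    }

    pub fn source_count(&self) -> usize {
        self.sources.len()
    }
}

impl fmt::Display for Error {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "{}", self.message)?;
        if !self.sources.is_empty() {
            write!(f, "\nCaused by {} errors:", self.sources.len())?;
            for source in &self.sources {
                write!(f, "\n- {}", source)?;
            }
        }
        Ok(())
    }
}

// Works with plain `std::error::Error`; no dependency on `failure`.
impl StdError for Error {}
```

In the RotateLog::open case above, both the read failure and the failed
recovery attempt could be attached with two `with_source` calls.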
Reviewed By: xavierd
Differential Revision: D17696532
fbshipit-source-id: 0387b3a3b71f097b1a3dc2dcc7671a43c465abb2
Summary:
This test checks that the RotateLog can still be opened and read if the fileno
limit crashes other writer processes "randomly".
Reviewed By: xavierd
Differential Revision: D17676318
fbshipit-source-id: e08528189adfa260047c357c723c87735592ec8f
Summary: This logs more context for errors, which might help debugging.
Reviewed By: xavierd
Differential Revision: D17670723
fbshipit-source-id: d22fb53689c0766b99aa344659a15148017212ad
Summary: This is no longer needed.
Reviewed By: quark-zju
Differential Revision: D17729378
fbshipit-source-id: 43d24df01dfae0449473d33fa851114e951197b0
Summary:
Instead of using KeyError to indicate that data isn't found, let's use an
Option. The Option type better encodes that data is missing without requiring
a potentially error-prone downcast. This may also enable us to set
RUST_BACKTRACE=1 everywhere, as we won't expect errors to happen often anymore;
previously, Mercurial would slow to a crawl due to the many KeyErrors being
thrown around.
I initially wanted to keep the change small to help reviews, but that didn't
really work out, as the dependencies on the `DataStore`/`HistoryStore` traits
are all over the place...
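
A simplified sketch of the resulting API shape. `MemStore` and `get_from_any`
are hypothetical stand-ins; the real `DataStore`/`HistoryStore` traits have
different signatures.

```rust
use std::collections::HashMap;
use std::io;

/// Hypothetical in-memory store used only to show the return type.
pub struct MemStore {
    map: HashMap<Vec<u8>, Vec<u8>>,
}

impl MemStore {
    pub fn new() -> Self {
        MemStore { map: HashMap::new() }
    }

    pub fn insert(&mut self, key: &[u8], value: &[u8]) {
        self.map.insert(key.to_vec(), value.to_vec());
    }

    /// `Ok(None)` means the key is absent; `Err` is reserved for real failures.
    pub fn get(&self, key: &[u8]) -> io::Result<Option<Vec<u8>>> {
        Ok(self.map.get(key).cloned())
    }
}

/// Look a key up in several stores; the first hit wins. A miss in one store
/// is a cheap `None`, not an error that must be constructed and downcast.
pub fn get_from_any(stores: &[MemStore], key: &[u8]) -> io::Result<Option<Vec<u8>>> {
    for store in stores {
        if let Some(value) = store.get(key)? {
            return Ok(Some(value));
        }
    }
    Ok(None)
}
```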
Reviewed By: quark-zju
Differential Revision: D17728486
fbshipit-source-id: de89c4fc441fd12ff37cc248e2230e4a1403ce44
Summary:
This is a short-term fix to help surface the real errors.
Instead of silently deleting or renaming data, surface the error so we
can get the crash/traceback logged, and can log in to investigate the broken
state.
Reviewed By: xavierd
Differential Revision: D17729511
fbshipit-source-id: b066ef12101aa742b4834bfd2e90bcb42fa15aff
Summary: Now that the `File` type is part of the crate's public API, it should be placed in the `files` module along with all the other exported file-related types (such as `FileMetadata`).
Reviewed By: xavierd
Differential Revision: D17726709
fbshipit-source-id: 4e3c0100ca765a7145f9eea49aa0b7ff11496c4b
Summary: When a downloaded manifest node fails to validate, investigating the issue generally requires the p1/p2 nodes (to manually compute the hash and compare it to the expected value). As such, let's print these out as part of the error message.
Reviewed By: xavierd
Differential Revision: D17724746
fbshipit-source-id: 0b1eb8d5344c0376a5895745dcdfb1092ad06321
Summary:
HgPython::run_hg was printing errors directly to stderr instead of to
the provided io.error. This caused unhandlable output in the -t.py tests. Let's
fix it to output to the provided pipe.
Reviewed By: quark-zju
Differential Revision: D17634721
fbshipit-source-id: f441e7be461193ef54db25e0939b2e67cdf06126
Summary:
Add a client-driven tree prefetching implementation to the Rust manifest code. Unlike the existing prefetch implementation in Python, this one does all computation of which nodes to fetch on the client side using the BFS logic from BfsDiff. The trees are then bulk fetched layer-by-layer using EdenAPI.
This initial version is fairly naive, and omits some obvious optimizations (such as performing fetches of multiple trees concurrently), but is sufficient to demonstrate HTTP tree prefetching in action.
Reviewed By: xavierd
Differential Revision: D17379178
fbshipit-source-id: f17fe99834ad4fec07b4a4ab196928cc4fe91142
Summary:
Change the `Files` iterator in the Rust manifest code to traverse the tree in BFS order, allowing for layer-by-layer prefetching similar to `Diff`. This can substantially speed up walks over the tree when the cache is cold.
As a side-effect, this changes the order in which paths are reported during a manifest walk. (In particular, they are now reported in breadth-first order rather than depth-first order.) This may break things that rely on the existing ordering; as such, we may need to add a sort somewhere if this turns out to be a problem.
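
To make the ordering change concrete, here is a small self-contained sketch of
a layer-by-layer walk. The `Node` type and `bfs_files` are hypothetical and
much simpler than the real manifest code; the start of each loop iteration is
where a bulk prefetch of the current layer would go.

```rust
use std::collections::BTreeMap;

/// Hypothetical tree: a file, or a directory of named children.
pub enum Node {
    File,
    Dir(BTreeMap<String, Node>),
}

/// Walk the tree breadth-first. Returns the file paths in the order they are
/// reported and the number of layers visited.
pub fn bfs_files(root: &Node) -> (Vec<String>, usize) {
    let mut files = Vec::new();
    let mut layers = 0;
    let mut current: Vec<(String, &Node)> = vec![(String::new(), root)];
    while !current.is_empty() {
        // Every node in `current` is at the same depth, so this is the point
        // where one bulk fetch per layer could be issued.
        layers += 1;
        let mut next = Vec::new();
        for (path, node) in current {
            match node {
                Node::File => files.push(path),
                Node::Dir(children) => {
                    for (name, child) in children {
                        let child_path = if path.is_empty() {
                            name.clone()
                        } else {
                            format!("{}/{}", path, name)
                        };
                        next.push((child_path, child));
                    }
                }
            }
        }
        current = next;
    }
    (files, layers)
}
```

For a tree containing `a/x` and `b`, this reports `b` before `a/x`, whereas a
depth-first walk would report `a/x` first.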
Reviewed By: xavierd
Differential Revision: D17645389
fbshipit-source-id: 624e426094a93e206bde4523ea8bd034fe5aeb90
Summary:
This test checks that a directory without modifications between tree
and parent is not going to be materialized.
Reviewed By: quark-zju
Differential Revision: D17540173
fbshipit-source-id: 465f1e0410c42a55665bcd6903d75266c61d5e80
Summary: Remove the DFS diff implementation and replace it with the BFS implementation.
Reviewed By: xavierd
Differential Revision: D17618818
fbshipit-source-id: d486642caae924f866a200d3c82fa5a4cb7d5286
Summary: To ensure feature parity between BFS diff and DFS diff, copy the DFS tests into the BFS module and ensure they pass.
Reviewed By: xavierd
Differential Revision: D17618820
fbshipit-source-id: b516abbfa4e231fdc383293d94d8965333f2ab99
Summary:
This will make using xdiff much easier.
In the next diff I'm planning to also add a function that converts the output of this function to a textual diff (the one with `+`'s and `-`'s).
Reviewed By: quark-zju
Differential Revision: D17551184
fbshipit-source-id: cda332e817f733d7aa32aeeb7b2d312d971826dd
Summary:
These are limited (not all features are exposed) bindings for xdiff - the diff library used by git and our version of hg. We need them to be able to generate diffs in Mononoke.
In the next diff I'll add a more Rust-friendly wrapper library.
Reviewed By: quark-zju
Differential Revision: D17548528
fbshipit-source-id: f23c8a65d11d2c5de8f0456d32883f16b19a98e2
Summary:
Both Index and Log require their on-disk files to be append-only.
Detect non-append-only changes and return errors. This might help
us get better error messages if the case actually happens.
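
The message does not say how the check is done. One simple possibility,
sketched here purely as an assumption, is to compare the file length seen
earlier with the current one. `NotAppendOnly` and `check_append_only` are
made-up names.

```rust
use std::fmt;

/// Hypothetical error for a file that shrank between two observations.
#[derive(Debug, PartialEq)]
pub struct NotAppendOnly {
    pub old_len: u64,
    pub new_len: u64,
}

impl fmt::Display for NotAppendOnly {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(
            f,
            "file changed in a non-append-only way: length went from {} to {}",
            self.old_len, self.new_len
        )
    }
}

impl std::error::Error for NotAppendOnly {}

/// Returns how many bytes were appended, or an error if the file shrank.
pub fn check_append_only(old_len: u64, new_len: u64) -> Result<u64, NotAppendOnly> {
    if new_len < old_len {
        Err(NotAppendOnly { old_len, new_len })
    } else {
        Ok(new_len - old_len)
    }
}
```

A length comparison only catches truncation; detecting in-place rewrites would
need something more, such as a checksum over the previously seen bytes.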
Reviewed By: markbt
Differential Revision: D17592914
fbshipit-source-id: b12791177ceb04f2373e93a679101e8b96e2bc98
Summary:
Similar to Log, this stress tests RotateLog behavior in a multi-threaded
environment to some extent.
Reviewed By: xavierd
Differential Revision: D17542324
fbshipit-source-id: 35ea358157cf141bec3802b959c9f921eca3143a
Summary:
Add a simple stress test that calls sync() in multiple threads. This should
give us some confidence that `sync()` has the expected behavior when called in
a multi-threaded environment.
Reviewed By: xavierd
Differential Revision: D17538980
fbshipit-source-id: 1793a3f871f0377c452807efa466d65d0da4b1f6
Summary:
D17429691 made blackbox reuse session_id unconditionally. That has the
undesirable side effect that chg processes are all logged under the same session id.
Fix that by detecting a pid change and not reusing session_id in that case.
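
The rule can be sketched as follows. `Session` and `reuse_or_new` are
hypothetical names, and the real blackbox code may store and generate ids
differently.

```rust
/// Hypothetical record of which process a session id was created for.
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct Session {
    pub pid: u32,
    pub id: u64,
}

/// Reuse the previous session only if it belongs to the current process;
/// otherwise (first use, or pid changed after a fork) create a fresh one.
pub fn reuse_or_new(
    previous: Option<Session>,
    current_pid: u32,
    new_id: impl FnOnce() -> u64,
) -> Session {
    match previous {
        Some(session) if session.pid == current_pid => session,
        _ => Session { pid: current_pid, id: new_id() },
    }
}
```

In a real caller, `current_pid` would come from `std::process::id()`.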
Reviewed By: singhsrb
Differential Revision: D17532555
fbshipit-source-id: cf11bb66f7d7242429b90ab5e5ea85ca307f92c3
Summary:
Add some context around "invalid read offset" to make errors slightly more
useful.
Reviewed By: xavierd
Differential Revision: D17577202
fbshipit-source-id: d51ba30abf6c462102be8bec1b60668ee66e07f2
Summary:
Instead of removing them unconditionally, keep one copy that failed to `open`
so we can have a look later.
Reviewed By: xavierd
Differential Revision: D17576432
fbshipit-source-id: 4f967d61aa602e6d3cac90d411e1971893c162bd
Summary:
Revise some error handling details so it covers corner cases more accurately and
provides more detailed error messages.
Since D16554090, RotateLog::open unconditionally attempts to create an empty
log at 0/ and reset latest to 0 if read_latest_and_log fails. That could be
undesirable if latest can be read but logs cannot, since it can silently reset
latest to 0 and might cause trouble in the future (for example, failing to
create an empty log at 1/ because it already exists).
This diff splits read_latest_and_log to read_latest and read_logs and handles
their errors individually.
The table summarizes the changes:
| latest | logs | old behavior | new behavior |
| okay | okay | return both | return both |
| okay | error | create log 0 | create log latest+1 |
| missing | whatever | create log 0 | create log 0 |
| error | whatever | create log 0 | error |
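
The "new behavior" column can be read as a single match. This is only a sketch
of the decision table: the `Latest` and `Action` enums, the `decide` function,
and the `u64` index type are assumptions, not the actual RotateLog code.

```rust
/// Outcome of reading `latest` (hypothetical representation).
#[derive(Debug, PartialEq)]
pub enum Latest {
    Okay(u64),
    Missing,
    Error,
}

/// What open should do next (hypothetical representation).
#[derive(Debug, PartialEq)]
pub enum Action {
    /// Return both `latest` and the logs that were read.
    ReturnBoth(u64),
    /// Create an empty log with the given index.
    CreateLog(u64),
    /// Surface the error to the caller.
    Fail,
}

/// New behavior: each failure is handled on its own instead of always
/// falling back to "create log 0".
pub fn decide(latest: Latest, logs_okay: bool) -> Action {
    match (latest, logs_okay) {
        (Latest::Okay(n), true) => Action::ReturnBoth(n),
        (Latest::Okay(n), false) => Action::CreateLog(n + 1),
        (Latest::Missing, _) => Action::CreateLog(0),
        (Latest::Error, _) => Action::Fail,
    }
}
```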
Reviewed By: xavierd
Differential Revision: D17576431
fbshipit-source-id: c9ab1fca5fb60eecf9e326baf90dfa98560a2b32
Summary:
Walker should own the matcher instead of storing a reference, so the walker can be
stored as a member of a struct by itself.
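
A small sketch of the ownership difference. `Matcher`, `PrefixMatcher`,
`Walker`, and `Status` below are simplified, hypothetical versions of the real
types.

```rust
pub trait Matcher {
    fn matches(&self, path: &str) -> bool;
}

/// Hypothetical matcher that accepts paths with a given prefix.
pub struct PrefixMatcher {
    pub prefix: String,
}

impl Matcher for PrefixMatcher {
    fn matches(&self, path: &str) -> bool {
        path.starts_with(&self.prefix)
    }
}

/// The walker owns its matcher (`M`, not `&'a M`), so it carries no
/// lifetime parameter.
pub struct Walker<M: Matcher> {
    matcher: M,
    pending: std::vec::IntoIter<String>,
}

impl<M: Matcher> Walker<M> {
    pub fn new(matcher: M, paths: Vec<String>) -> Self {
        Walker { matcher, pending: paths.into_iter() }
    }
}

impl<M: Matcher> Iterator for Walker<M> {
    type Item = String;

    fn next(&mut self) -> Option<String> {
        // Borrow the two fields separately: the matcher immutably, the
        // pending list mutably.
        let matcher = &self.matcher;
        self.pending.find(|path| matcher.matches(path))
    }
}

/// With an owned matcher the walker can be a plain struct member; with a
/// borrowed one this struct would need a lifetime and an external owner.
pub struct Status {
    pub walker: Walker<PrefixMatcher>,
}
```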
Reviewed By: xavierd
Differential Revision: D17511588
fbshipit-source-id: 039c6c3cced7feec4e9141c31e5333c43879484a
Summary:
The current way of opening the lock file and creating it on demand is racy.
Fix it by making it one operation.
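
With the standard library, "one operation" can be expressed through
`OpenOptions`, as in this sketch. The `open_lock_file` name and the `lock` file
name are assumptions, and the actual locking call is left out.

```rust
use std::fs::{File, OpenOptions};
use std::io;
use std::path::Path;

/// Open the lock file, creating it if it does not exist, in a single call.
/// This avoids the window between a separate "does it exist?" check and a
/// later create or open, during which another process could interfere.
pub fn open_lock_file(dir: &Path) -> io::Result<File> {
    OpenOptions::new()
        .read(true)
        .write(true)
        .create(true)
        .open(dir.join("lock"))
}
```

Calling it twice is harmless: the second call simply opens the file created by
the first.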
Reviewed By: singhsrb
Differential Revision: D17552687
fbshipit-source-id: 5469862902ccab2d317f2c0ac61867c365e22aba
Summary:
Finalize is asking the cursor to traverse into directories that haven't changed.
This is a bug introduced when updating finalize to support being called on
"Durable" nodes. Until then directories would always be traversed if they were
in the processing path. The path would only be chosen for "Ephemeral"
directories which we knew were different from a parent that is assumed to be
"Durable". I later learned that `finalize` is expected to return the manifests
that are directly fetched from storage. The update meant that we would skip
the directory that is processed if the "Node" (hash) is present and matches a
parent. The problem is that we didn't update the point at which the parent cursor
is advanced.
Reviewed By: xavierd
Differential Revision: D17537448
fbshipit-source-id: 9c71a8f8f5a70c600031bc9d32535e59f2f32700
Summary: This will be used to store "invisible heads".
Reviewed By: sfilipco
Differential Revision: D17264837
fbshipit-source-id: a450b5c10cc961d43ec8eb852cb2fb22849a8c00
Summary:
The SpanSet can include a large amount of revs. Iterating through it by putting
everything in a PyList is suboptimal. Therefore add a dedicated native iterator
for it. This speeds up iteration greatly, which can be verified via debugshell:
Before:
In [1]: s=m.smartset.spansset(b.dag.spans(xrange(5000000)))
In [2]: %time s.slice(0,10)
CPU times: user 135 ms, sys: 42.9 ms, total: 178 ms
Wall time: 180 ms
After:
In [1]: s=m.smartset.spansset(b.dag.spans(xrange(5000000)))
In [2]: %time s.slice(0,10)
CPU times: user 49 µs, sys: 6 µs, total: 55 µs
Wall time: 58.2 µs
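
The gain comes from not materializing every rev up front. The idea can be
sketched in plain Rust as below, with a hypothetical `SpanSet` of inclusive
`(low, high)` ranges. The real type, its iteration order, and its Python
bindings differ.

```rust
/// Hypothetical span set: a list of inclusive `(low, high)` ranges.
pub struct SpanSet {
    pub spans: Vec<(u64, u64)>,
}

impl SpanSet {
    /// Eager version: builds the whole list even if the caller only wants
    /// the first few items. This mirrors filling a PyList.
    pub fn to_vec(&self) -> Vec<u64> {
        self.iter().collect()
    }

    /// Lazy version: yields ids one at a time, so taking a short prefix
    /// costs time proportional to the prefix, not to the whole set.
    pub fn iter(&self) -> impl Iterator<Item = u64> + '_ {
        self.spans.iter().flat_map(|&(low, high)| low..=high)
    }
}
```

Taking the first ten items with `set.iter().take(10)` touches only ten ids,
which corresponds to the `s.slice(0,10)` case measured above.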
Reviewed By: sfilipco
Differential Revision: D17305350
fbshipit-source-id: 0db00aa57fb6bf2141ccea94b2536da78f103cef