Summary: `|&` apparently fails on mactest, so I've replaced it with `2>/dev/stdout | ` which works on my devserver and macbook.
Reviewed By: andll
Differential Revision: D29337621
fbshipit-source-id: eaac2592f4c7bfda6696c2500f3b08441b596c39
Summary:
Introduce basic contentstore fallback tracking to help monitor the scmstore shim rollout.
This will be expanded to a general fetch metrics system for scmstore in a future change.
Reviewed By: kulshrax
Differential Revision: D29305839
fbshipit-source-id: c6cc3ea15a3bb7b90f4ec298febc911ec4e2af91
Summary: This is roughly similar to the algorithm in NameDag
Reviewed By: quark-zju
Differential Revision: D29318721
fbshipit-source-id: 51a9123daa2b4cf0fbe2346a8a0c7e75172d9afb
Summary: The naming is used in other parts of the dag crate - this introduces Mononoke-side bindings for the corresponding functions on the dag side
Reviewed By: quark-zju
Differential Revision: D29318722
fbshipit-source-id: e9eea5536b041b6ab2ce578914817bca43a10d48
Summary:
In D29079762 (ea2e2f8bbd), globbing was fixed to not match the recursive glob (**) against
the entire path, as this would lead to some paths being matched when they
shouldn't be. It however introduced another bug: in some cases, recursive globs
would no longer match paths that should be matched.
To fix both, a partial revert of the original diff is done with a small tweak:
the path that is matched against no longer starts at the root of the
repository. This prevents `a/b/**/b/c.txt` from matching `a/b/c.txt`, as
`**/b/c.txt` will only be matched against `c.txt`, and not against `a/b/c.txt`
as it was previously.
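The intended semantics can be illustrated with a small Python toy matcher (a sketch of the behavior described above, not the actual implementation):

```python
from fnmatch import fnmatch

def glob_match(pattern_parts, path_parts):
    """Toy component-wise glob matcher: '**' matches zero or more path
    components, but only within the part of the path that remains after
    the literal prefix has been consumed."""
    if not pattern_parts:
        return not path_parts
    head, rest = pattern_parts[0], pattern_parts[1:]
    if head == "**":
        # Try the rest of the pattern at every remaining depth, including
        # zero components deep.
        return any(glob_match(rest, path_parts[i:])
                   for i in range(len(path_parts) + 1))
    if not path_parts:
        return False
    return fnmatch(path_parts[0], head) and glob_match(rest, path_parts[1:])

# `a/b/**/b/c.txt` must not match `a/b/c.txt`: after consuming `a/b/`,
# `**/b/c.txt` is only matched against `c.txt`.
assert not glob_match("a/b/**/b/c.txt".split("/"), "a/b/c.txt".split("/"))
# Recursive globs still match at any depth, including zero.
assert glob_match("a/b/**/c.txt".split("/"), "a/b/c.txt".split("/"))
assert glob_match("a/b/**/c.txt".split("/"), "a/b/x/y/c.txt".split("/"))
```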
Reviewed By: fanzeyi
Differential Revision: D29175333
fbshipit-source-id: 1a4137d6f64f6cb77c4be09bd143f72630aa58d5
Summary:
Bring Watchman closer to the EdenFS file name convention by moving
source files and includes into watchman/.
Reviewed By: fanzeyi
Differential Revision: D29242789
fbshipit-source-id: 6e29a4a50e7202dbf6b603ccc7e4c8184afeb115
Summary:
When buck kill fails, eden rm will also fail. This has caused some checkouts
to not be removed when they could be. Stopping aux processes is a nice thing
to do before we unmount: it ensures these processes close file handles in the
mount. But we force unmount anyway, so open file handles should not be able to
block the umount call.
Reviewed By: xavierd
Differential Revision: D29205962
fbshipit-source-id: a899940efa5cc1d960cd14a775b7053c34f5d6f2
Summary:
Path should be relative to the symlink path, not to the repo root. This diff
fixes that.
Reviewed By: farnz
Differential Revision: D29327682
fbshipit-source-id: a51161a8039a88263fe941562f2c2134aa5d4fef
Summary: This is more confusing than really helpful, thus let's remove the log.
Reviewed By: rkjfb
Differential Revision: D29317007
fbshipit-source-id: 3aba1ab8de7906e193946938aa69b32a09b8e5de
Summary: Update the remaining tests for scmstore. In each of these cases we're just disabling scmstore for various reasons. I think `test-lfs-bundle.t` and `test-lfs.t`'s failures represent a legitimate issue with scmstore's contentstore fallback, but I don't think it should block the rollout.
Reviewed By: kulshrax
Differential Revision: D29289515
fbshipit-source-id: 10d055bf679db8efdeb16ac96b7ed597d7b6d82c
Summary:
Prevent `FileStore` from deadlocking when a write falls back to contentstore and attempts to write to the same indexedlog_local whose lock is held for the batch.
Note: this shouldn't need to block the release; we currently expect writing raw LFS pointers to only happen with non-remotefilelog LFS.
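As a rough Python analogy (toy code with hypothetical names, not scmstore's actual structure): a non-reentrant lock held across a batch deadlocks if the fallback path tries to take it again, so the fallback write has to happen outside the lock.

```python
import threading

class ToyStore:
    """Toy model of the deadlock: a fallback path re-acquires a
    non-reentrant lock that the batch write is already holding."""
    def __init__(self):
        self.lock = threading.Lock()  # non-reentrant, like the batch lock
        self.local = {}

    def fallback_write(self, key, value):
        # The fallback path takes the same lock as the batch write.
        with self.lock:
            self.local[key] = value

    def write_batch_deadlocks(self, items):
        # Buggy shape (do not call): the inner fallback_write blocks
        # forever because this thread already holds self.lock.
        with self.lock:
            for key, value in items:
                self.fallback_write(key, value)

    def write_batch_fixed(self, items):
        # Fix sketch: collect fallback writes under the lock, then
        # perform them only after the batch lock has been released.
        pending = []
        with self.lock:
            for key, value in items:
                pending.append((key, value))
        for key, value in pending:
            self.fallback_write(key, value)

store = ToyStore()
store.write_batch_fixed([("a", 1), ("b", 2)])
```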
Reviewed By: kulshrax
Differential Revision: D29299050
fbshipit-source-id: bf39f87b9956165a558f3a19960d3d055685db9a
Summary:
This is a followup to D28903515 (9a3fbfe311). In D28903515 (9a3fbfe311) we added support for reusing
hg filenodes if a parent has the same filenode. However, we weren't reusing
manifests even if a parent had an identical manifest, and this diff adds
support for doing so.
There's one caveat - we try to reuse parent manifests only if there is more
than one parent manifest. See the explanation in the comments.
Reviewed By: farnz
Differential Revision: D29098908
fbshipit-source-id: 5ecfdc4b022ffc7620501cc024e7a659fb82f768
Summary:
When a user runs hg diff between a revision and the working copy, and the diff contains a file that is not present in the working copy because of its sparse profile, hg diff fails.
The failure happens because hg diff tries to compare file sizes, and calculating the file size on a workingfilectx fails because the file does not exist.
This diff overrides the size function for workingfilectx in the sparse extension to protect against that, similar to how it is already done for the data function.
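One plausible shape of such an override, sketched with hypothetical names (the actual change wraps workingfilectx inside the sparse extension; the predicate and the returned size of 0 are assumptions):

```python
def wrapped_size(orig_size, fctx, excluded_by_sparse):
    """Sketch only: if the sparse profile excludes the file, report a
    placeholder size instead of letting the underlying stat call raise.
    `excluded_by_sparse` is a hypothetical predicate; returning 0 for
    excluded files is an assumption, not the confirmed behavior."""
    if excluded_by_sparse(fctx):
        return 0
    return orig_size(fctx)

# Hypothetical usage: a file absent from disk due to the sparse profile
# no longer raises when its size is queried.
assert wrapped_size(lambda f: 42, "gone.txt", lambda f: True) == 0
assert wrapped_size(lambda f: 42, "here.txt", lambda f: False) == 42
```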
Differential Revision: D29279691
fbshipit-source-id: 55d7843d23370c31693a32a0e1df8b882db0d89d
Summary:
Buck v2 builds from the root of the repo, not the current cell. This means that
the inferred logger name ends up being different.
We're going to need to fix this generally because otherwise it'll change logger
names for everyone (I'm tracking this in T93776519), but in the interest of not
having one Eden test arbitrarily failing on Buck v2 let's update this
with a workaround for now.
Reviewed By: genevievehelsel
Differential Revision: D29270388
fbshipit-source-id: 6968d9b6195a5eed7bd4018b161e12d88f78a421
Summary:
Like in many of the other cases, this needs to be told where the Eden binary
is instead of assuming it's right next to the edenfsctl binary, because on Buck
2 it's not.
Reviewed By: xavierd
Differential Revision: D29265845
fbshipit-source-id: 756bd863dc7d18eaf25a9ee209a9fd59345e6b5d
Summary:
In the walker, an Option<NodeData> value of None is used to indicate that no data could be found for a node, and that for derived data mappings we should try again to load it later, when it may have been derived.
When a node is outside the chunk boundary this isn't appropriate; we should just mark it as visited and move on, which is what this change does.
Reviewed By: farnz
Differential Revision: D29230223
fbshipit-source-id: c2afdee9b914af89c7954c8e6a7d17a174df7ed1
Summary:
I suspect it is causing some broken pipe errors that aren't
resolving themselves.
Reviewed By: kmancini
Differential Revision: D29279304
fbshipit-source-id: cfbf2261f2ac7dd7ec8b3311d1e27a0b9e160de4
Summary: Only four tests remaining after this.
Reviewed By: kulshrax
Differential Revision: D29229656
fbshipit-source-id: 56c0a17f6585263e983ce8bc3c345b1f266422e0
Summary: Update more tests to avoid relying on pack files and legacy LFS, and override configs in `test-inconsistent-hash.t` to continue using pack files even after the scmstore rollout to test Mononoke's response to corruption, which is not currently as easy with indexedlog.
Reviewed By: quark-zju
Differential Revision: D29229650
fbshipit-source-id: 11fe677fcecbb19acbefc9182b17062b8e1644d8
Summary:
Pull in a patch which fixes writing out an incorrect entsize for the
`SHT_GNU_versym` section:
ddbae72082
Reviewed By: igorsugak
Differential Revision: D29248208
fbshipit-source-id: 90bbaa179df79e817e3eaa846ecfef5c1236073a
Summary:
For context and high level goal, see: https://fb.quip.com/8zOkAQRiXGQ3
On RedactedBlobs, let's return an `Arc<HashMap>` instead of `&HashMap`.
This is not needed now, but when reloading information from configerator, we won't be able to return a reference, only a pointer.
Reviewed By: StanislavGlebik
Differential Revision: D28962040
fbshipit-source-id: 0848acc1a81a87c0b51d968efe31f61dacd57c47
Summary:
For context and high level goal, see: https://fb.quip.com/8zOkAQRiXGQ3
Instead of using `HashMap<String, RedactedMetadata>` everywhere, let's use an `Arc<RedactedBlobs>` object from which we can instead borrow a map. The borrow function is async because it will need to be when we're fetching from configerator, as it may need to rebuild the redaction data.
Wrapping it in `Arc` also makes it possible to re-use the same data across repos; I believe right now it's cloned everywhere.
In later diffs I'll use this enum to add a new way to fetch configs.
Reviewed By: markbt
Differential Revision: D28935506
fbshipit-source-id: befa96810ee7ebb9487f99f9e769a945981b58ed
Summary:
We're doing imports for AOSP megarepo work, and want a tool to quickly check that our imports are what we expect.
Use libgit2 and a simple LFS parser to read the git SHA-256 entries, and FSNodes to get the Mononoke entries to match against.
Reviewed By: StanislavGlebik
Differential Revision: D29169743
fbshipit-source-id: 1ef1e2c780b8742c7fa5f15f9ee01bc0481a6543
Summary: This is a minimal fix so that it builds, not enough to test the new bit, but enough to unbreak contbuild
Reviewed By: yancouto, HarveyHunt
Differential Revision: D29263246
fbshipit-source-id: c5430ff4bc885103664c33caca90af5819d97ddd
Summary:
Looks like D29145340 (d0e16f1a25) introduced a regression - "hg pull" fails with
"TypeError: Population must be a sequence or set. For dicts, use list(d).", stack trace - P424458181.
This diff fixes it by converting the dag.nameset to a list first.
Reviewed By: mzr
Differential Revision: D29258771
fbshipit-source-id: 9ffcc756f9931d6d24b69dadf1cd2d08faccb443
Summary: Spotted this in passing. Save a DashMap lookup in the OldestFirst case by checking the enum first
Reviewed By: farnz
Differential Revision: D29232280
fbshipit-source-id: 72e93ee704767a42c36ffeec505fd79a22c4d88e
Summary:
At the moment we have a few ways of deriving data:
1) "normal", which is used by most of the Mononoke code. In this case we insert
the derived data mapping after all the data for a given derived data type has
been safely saved.
2) "backfill", which is used when we are backfilling a lot of commits. In this
case we write all the data to an in-memory blobstore first, and only later save
the data to the real blobstore and then write the derived data mapping.
3) "batch", when we derive data for a few commits at once. It can be combined
with "backfill" mode.
We also have a special scuba table for derived data derivation, however there
are a few problems with it.
Only "normal" mode has good and predictable logging, i.e. it logs once before we
attempt to derive a commit, and once after the commit was derived or failed.
"backfill" logs right after data for a given commit was "derived", however this
is an in-memory derivation, and at this point no data has been saved to the
blobstore. So if the backfill process crashes a bit later, the commit might not
be derived after all, and it's impossible to tell just by looking at the scuba
table. With "batch" mode it's even worse - we don't get any logs at all.
A bigger refactoring is needed here, because currently the process of
derivation is very hard to grok. But for now I suggest slightly improving the
scuba logging by also logging when a derived data mapping was actually written
(or failed to be written). After this diff we'll get the following:
1) "normal" mode will get three entries in the scuba table, in this order:
derivation start, mapping written, derivation end,
2) "backfill" mode will also get three entries in the scuba table, but in a
different order: derivation start, derivation end, mapping written,
3) "batch" mode will get one entry, for writing the mapping. Not great, but
better than nothing!
Reviewed By: farnz
Differential Revision: D29231404
fbshipit-source-id: 2c601e7dc58c00e22fda1ddd542833a818d1d023
Summary: Just moving some code around to make the derive_impl file a bit smaller
Reviewed By: farnz
Differential Revision: D29231405
fbshipit-source-id: c923f42710f4be98147bc58d5b828d5d6c7bf1a6
Summary:
Invalidation on Windows is tricky, and I got it wrong in subtle ways
previously. The main obvious issue is that when the on-disk invalidation fails,
the refcount shouldn't be decremented as the placeholder/file is still present
on disk. This could cause weird issues in later checkout. The second one is how
invalidating a directory doesn't remove a placeholder (it actually adds one),
and thus we shouldn't decrement the FS refcount. And lastly, the refcount
should be decremented regardless of whether the inode is loaded or unloaded. As
long as it is known by the InodeMap it needs invalidation.
Reviewed By: fanzeyi
Differential Revision: D28970899
fbshipit-source-id: 0d64cadae01fcd4e028c53de9357ece7d648cdd4
Summary:
A user may have some undesirable environment variables when calling `edenfsctl start`,
which we do not want to propagate to edenfs as this may affect EdenFS's
ability to run properly. Having PYTHONPATH set to inside a repository may for
instance lead to a deadlock when EdenFS is trying to setup redirections.
To avoid this, we need to sanitize the environment before calling edenfs. This
functionality already existed but was bypassed on Windows.
Reviewed By: chadaustin
Differential Revision: D29244358
fbshipit-source-id: bc96698732e71412296ed5e28842b59b2c758699
Summary:
Introduce `LegacyStore` trait, which contains ContentStore methods not covered by other datastore traits.
Implement this trait for both contentstore and scmstore, and modify rust code which consumes `contentstore` directly to use `PyObject` and `LegacyStore` to abstract over both contentstore and scmstore instead.
Reviewed By: DurhamG
Differential Revision: D29043162
fbshipit-source-id: 26e10b23efc423265d47a8a13b25f223dbaef25c
Summary: Introduce a new config option, `scmstore.enableshim`, which replaces instances of contentstore in Python with scmstore objects instead. Currently, this config is not safe to enable. Additional fixes are incoming.
Reviewed By: DurhamG
Differential Revision: D29213190
fbshipit-source-id: 7fd4db77d55cd25cc08c40bee28798d6a6d2555c
Summary: Previously, we just fetched "best effort", and logged any encountered errors using `tracing`, leaving it up to the client to inspect errors if necessary. Python relies on catching these fetch errors as exceptions, though, so this change introduces some utility methods to help propagate them correctly.
Reviewed By: DurhamG
Differential Revision: D29211683
fbshipit-source-id: 5e9dee942c2b60e0f77a051624d7f393a811fc4e
Summary:
Remove packfile-specific parts of tests and modify them to test without depending on packfiles where possible.
Currently debugindexedlogdatastore and debugindexedloghistorystore appear to be broken, and debugdumpindexedlog just dumps the raw indexedlog contents, without any semantic information, so for the time being I've simply removed most packfile inspection.
Reviewed By: DurhamG
Differential Revision: D29099241
fbshipit-source-id: 86c4f9c83520374560587b8bec5c569d9c5c6510
Summary: My previous fix was actually incorrect, we now log actual remote requests, but join that with the logs from the contentstore fallback.
Reviewed By: DurhamG
Differential Revision: D29206878
fbshipit-source-id: d22e58792bf380c274e8086ce08aebe20dd9b848
Summary: The original commit broke globbing more than it fixed it. D29175333 will fully fix it, but in the meantime, let's revert the change to get a release out.
Reviewed By: singhsrb
Differential Revision: D29231954
fbshipit-source-id: 7a42e980c6fc4de09bee713a3a4141d52272b6d1
Summary:
Previously, when fetching data using several concurrent requests, the EdenAPI client would wait for the headers for every request to finish coming in before starting to deserialize and yield entries from the bodies of any of the requests.
Normally, this isn't a huge deal since the response headers on all of the requests are usually roughly the same size, so they all finish downloading at roughly the same time when the requests are run concurrently. However, this does become an issue when `edenapi.maxrequests` is set. This option makes EdenAPI configure libcurl to queue outgoing connections once the configured limit is hit.
This means that although from EdenAPI's perspective all of the requests are running concurrently, they are not actually running in parallel. The result is that the EdenAPI client ends up waiting for all of the queued requests to be sent before yielding any data to the caller, which forces it to buffer all of the received data, resulting in massive memory consumption.
This diff fixes the problem by rearranging the structure of the Futures/Streams involved such that the client immediately begins yielding entries when they are received from any of the underlying transfers.
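The restructuring can be modeled in Python asyncio (a toy merge of async streams, assuming nothing about the actual Rust implementation): entries are yielded as soon as any underlying transfer produces one, instead of draining the transfers one at a time.

```python
import asyncio

async def merged(streams):
    """Yield items from several async generators as soon as any one of
    them produces an item, rather than consuming them sequentially."""
    queue = asyncio.Queue()
    done = object()  # sentinel marking the end of one stream

    async def pump(stream):
        async for item in stream:
            await queue.put(item)
        await queue.put(done)

    tasks = [asyncio.create_task(pump(s)) for s in streams]
    finished = 0
    while finished < len(tasks):
        item = await queue.get()
        if item is done:
            finished += 1
        else:
            yield item

async def numbers(items, delay):
    # Stand-in for one response body arriving over time.
    for x in items:
        await asyncio.sleep(delay)
        yield x

async def main():
    got = [x async for x in merged([numbers([1, 2], 0.01),
                                    numbers([3, 4], 0.005)])]
    return sorted(got)

assert asyncio.run(main()) == [1, 2, 3, 4]
```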
Reviewed By: quark-zju
Differential Revision: D29204196
fbshipit-source-id: b6b56bb7d60457de3c4046a07a5965749e9dd371
Summary:
When the `send_async` method is used to dispatch multiple concurrent requests, the method needs to return an `AsyncResponse` for each request. Since `AsyncResponse`'s constructor is itself `async` (it waits for all of the headers to be received), internally the method ends up with a collection of `AsyncResponse` futures.
Previously, in an attempt to simplify the API, the method would insert all of these futures into a `FuturesUnordered`, thereby conceptually returning a `Stream` of `AsyncResponses`. Unfortunately, this API ends up making it harder to consume the resulting `AsyncResponses` concurrently, as one might want to do when streaming lots of data over several concurrent requests.
This diff changes the API to just insert the `AsyncResponse` futures into a `Vec` to allow the caller to use them as desired. To maintain compatibility with the old behavior for the sake of this diff, the one current callsite has been updated to just dump the returned `Vec` into a `FuturesUnordered`. This will be changed later in the stack.
Reviewed By: quark-zju
Differential Revision: D29204195
fbshipit-source-id: ecee8cff430badd8213c2efef62fc68fbd91fde9
Summary: Nothing was using this metadata, and removing it simplifies the subsequent diffs in this stack.
Reviewed By: quark-zju
Differential Revision: D29147228
fbshipit-source-id: aa4828b710c3ef719f4d66adec5f66cd5b7d05d1