Summary:
When scrubbing to collect commit times for path info logging, it's much easier to get correct commit times for manifests by walking from the oldest changeset first. That way, when any manifest/tree is discovered, it's from the closest changeset chunk to its creation.
The alternative would have been to use the path data from linknode-associated changesets to prune out which sub-manifests to walk when walking forward, which is more complicated and would require holding more state (or reloading changesets continually).
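A minimal sketch of why the oldest-first order is simpler, assuming a heavily simplified, hypothetical `Changeset` type (these are not the walker's real types): the first time a manifest id is seen is already its earliest commit time, so no extra state is needed.

```rust
use std::collections::HashMap;

/// Hypothetical, simplified changeset: a commit time plus the ids of the
/// manifests reachable from it. Not the walker's real types.
struct Changeset {
    commit_time: i64,
    manifest_ids: Vec<u64>,
}

/// Walk changesets oldest first and record, for each manifest, the commit
/// time of the first changeset in which it is seen.
fn first_seen_times(oldest_first: &[Changeset]) -> HashMap<u64, i64> {
    let mut times = HashMap::new();
    for cs in oldest_first {
        for &id in &cs.manifest_ids {
            // Only the first (oldest) sighting is kept; later ones are ignored.
            times.entry(id).or_insert(cs.commit_time);
        }
    }
    times
}
```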
Differential Revision: D28092314
fbshipit-source-id: 871dc80dd88b63959501dd1018b6466afae5c6c7
Summary: This annotation got lost during the refactor of repo factory.
Reviewed By: quark-zju
Differential Revision: D28135734
fbshipit-source-id: b91d359422ac2456d7c670ae7094f20e3d6e5d7c
Summary:
The implementation of `to_cbor_bytes` does not make use of the ownership. It
works the same if a reference is given. However, the method is a lot more flexible
if a reference is used for the argument.
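To illustrate the flexibility argument, here is a small sketch. `to_debug_bytes` is a hypothetical stand-in that uses `Debug` formatting rather than CBOR; the point is only that a function which merely reads its argument can take `&T`, so the caller keeps ownership.

```rust
use std::fmt::Debug;

/// Hypothetical stand-in for a serializer (Debug formatting, not CBOR).
/// It only reads `value`, so a shared reference is sufficient.
fn to_debug_bytes<T: Debug>(value: &T) -> Vec<u8> {
    format!("{:?}", value).into_bytes()
}
```

With this signature the caller can serialize the same value repeatedly, or a value it only borrows, without cloning it first.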
Reviewed By: kulshrax
Differential Revision: D28132732
fbshipit-source-id: 2eecd44ce9e4ff5bc42ff01fd358b0d30dde91ef
Summary:
There have been a bunch of problems with the previous approach to scmstore, so I'm going to try to start simple, make it feature complete, and then add async integration and factor out generic functionality as appropriate.
This change contains a `TreeStore` implementation with a single, synchronous, batch read method (supporting local storage, memcache, and legacy fallback, with writing missing to cache).
Add `TreeStoreBuilder`, which duplicates the existing `TreeScmStoreBuilder`, with some changes that make it easier to use for this case. I intend to unify these in the future.
Add an inherent impl for `EdenApiTreeStore` that provides a subset of the `BlockingEdenApi` trait, which eliminates the need to unpack this type into a different adapter as the old `scmstore` code does. This might not be the right approach here: in reality we only need a `(client: Arc<dyn EdenApi>, repo: String)` for trees, and that + `ExtStoredPolicy` for files, so we could take the `EdenApiAdapter` approach here too. The only reason we have to do any of this is that when `pyrevisionstore` is called to construct `scmstore` / `contentstore`, all we have is `Arc<EdenApiTreeStore>`. We could also just make the `EdenApiRemoteStore` fields public, and access them through the `Arc`.
Add `add_mcdata` method to `MemcacheStore`, `impl TryFrom<Entry> for McData`, and `impl From<McData> for Entry` for convenience when working with `MemcacheStore` (so we don't need to manually unpack the type and build `Entry`, or manually build a fake `Delta` from `Entry` to write).
Reviewed By: DurhamG
Differential Revision: D28076900
fbshipit-source-id: 7fdb5e8a42d052879eff449f60d40a83cfa7145d
Summary:
Both `get_local_path` and `get_cache_path` take suffix as a `PathBuf`, even though they only ever use it as a reference. `get_local_path` also takes a reference to a `PathBuf`, even though it always clones it internally, and takes an `Option`, even though it just maps across the contents of the option.
I modified `get_local_path` to accept a `PathBuf` by move, which it uses directly, and to not take an `Option` (instead just calling `map` externally, removing some unnecessary unwraps), and for both functions to accept `impl AsRef<Path>` for suffix.
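A rough sketch of the resulting shape, under the assumption that the function simply joins the two paths (the real function may do more, such as creating directories or returning a `Result`):

```rust
use std::path::{Path, PathBuf};

/// Simplified sketch: the base path is taken by value because it is consumed
/// to build the result, and the suffix is any path-like value, used by
/// reference only.
fn get_local_path(mut local_path: PathBuf, suffix: impl AsRef<Path>) -> PathBuf {
    local_path.push(suffix);
    local_path
}
```

A caller holding an `Option<PathBuf>` can then write `base.map(|p| get_local_path(p, "suffix"))`, and a `&str`, `&Path`, or `PathBuf` all work as the suffix.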
Reviewed By: DurhamG
Differential Revision: D28100527
fbshipit-source-id: df28b51c8005f3d95acc8e082b40adaab18e31c9
Summary: Add a Read/Write Guard API to IndexedLogHgIdDataStore which allows client code outside the module to perform a series of reads and writes without locking for each individually.
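The general shape of such a guard API might look like the sketch below. `Store`, `StoreReadGuard`, and `StoreWriteGuard` are hypothetical names over a plain `HashMap`; the real `IndexedLogHgIdDataStore` is more involved.

```rust
use std::collections::HashMap;
use std::sync::{RwLock, RwLockReadGuard, RwLockWriteGuard};

/// Hypothetical store wrapping a map behind a lock.
struct Store {
    inner: RwLock<HashMap<String, Vec<u8>>>,
}

/// Guards hold the lock for as long as they live, so a caller can do many
/// operations while locking only once.
struct StoreReadGuard<'a>(RwLockReadGuard<'a, HashMap<String, Vec<u8>>>);
struct StoreWriteGuard<'a>(RwLockWriteGuard<'a, HashMap<String, Vec<u8>>>);

impl Store {
    fn new() -> Self {
        Store { inner: RwLock::new(HashMap::new()) }
    }

    fn read_lock(&self) -> StoreReadGuard<'_> {
        StoreReadGuard(self.inner.read().unwrap())
    }

    fn write_lock(&self) -> StoreWriteGuard<'_> {
        StoreWriteGuard(self.inner.write().unwrap())
    }
}

impl StoreReadGuard<'_> {
    fn get(&self, key: &str) -> Option<&Vec<u8>> {
        self.0.get(key)
    }
}

impl StoreWriteGuard<'_> {
    fn put(&mut self, key: &str, value: Vec<u8>) {
        self.0.insert(key.to_string(), value);
    }
}
```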
Reviewed By: kulshrax
Differential Revision: D28075788
fbshipit-source-id: 2a65a426f443e1a421198ad8b4c610e4822574f7
Summary:
Add get_entry, put_entry, and flush_log inherent methods to IndexedLogHgIdDataStore. Refactor callers to use them in cases where they don't lock across multiple reads / writes (to avoid performance regressions).
This should allow `ReadStore` and `WriteStore` to be moved out of the module.
Reviewed By: DurhamG
Differential Revision: D27979828
fbshipit-source-id: c9fb8c4ac68f67b285c72396509aa17928aa54ed
Summary: It has been wrong since 2014 (tweakdefault).
Reviewed By: kulshrax
Differential Revision: D28122703
fbshipit-source-id: c83ddbac2c6162e36672649c60c2e7916dc7cbd2
Summary: This is a step towards unifying native merge/rebase structs with native checkout: we now construct the native checkout plan from the action map, instead of making it directly from the diff.
Reviewed By: quark-zju
Differential Revision: D28078156
fbshipit-source-id: 318d7e419ca9fef15a4aebf7494451f69a3bbbe5
Summary:
This diff makes the concurrency of native checkout configurable.
This config can be used to reduce concurrency on platforms that are known to cause issues with watchman due to too many checkout operations.
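As an illustration of bounding concurrency by a configured value, here is a sketch using only the standard library. `run_with_concurrency` and its `concurrency` parameter are hypothetical names; the actual checkout code is likely structured differently (for example, around async tasks).

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::thread;

/// Run `work` over `items` using at most `concurrency` worker threads.
/// `concurrency` stands in for the configured value.
fn run_with_concurrency<T: Sync, F: Fn(&T) + Sync>(items: &[T], concurrency: usize, work: F) {
    // Shared cursor: each worker claims the next unprocessed index.
    let next = AtomicUsize::new(0);
    let workers = concurrency.max(1).min(items.len().max(1));
    thread::scope(|scope| {
        for _ in 0..workers {
            scope.spawn(|| loop {
                let i = next.fetch_add(1, Ordering::Relaxed);
                match items.get(i) {
                    Some(item) => work(item),
                    None => break,
                }
            });
        }
    });
}
```

Lowering `concurrency` limits how many operations are in flight at once without changing the amount of work done.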
Reviewed By: quark-zju
Differential Revision: D28074993
fbshipit-source-id: 0a09fcf3ae48d08cead36da56c06b546aecd16b4
Summary: This diff refactors the `Checkout` component out of the checkout plan and allows configuring parallelism in checkout.
Reviewed By: quark-zju
Differential Revision: D28074994
fbshipit-source-id: 72933c757d6e27615d1ef2bb4652bc67c9c3253d
Summary:
Vertex is old. It no longer makes sense with the current structure. The main
issue is that the dag crate now has VertexName which may confuse readers at
first glance.
When Vertex was added DagId would have been confusing because we had structs
that were named Dag that did not use DagId directly. Those structures are now
renamed and DagId is consistently used for dag crate structures.
The IdMap database would still use the vertex name until someone runs a
migration to rename the column.
I am not 100% sure that this is needed, but it's a change that's been on my mind.
Reviewed By: quark-zju
Differential Revision: D28110184
fbshipit-source-id: b996a7545a90acc25e2bb5326f2731b95c8740b4
Summary:
Previously there were two different paths to HgChangeset. This diff unifies them, so that when walker state.rs is checking for a previous visit it will find that it happened.
For existing walks of changesets in the NewestFirst direction this wasn't causing a problem, however the next diff in stack adds support for OldestFirst walks. In the OldestFirst case the mismatch in paths to HgChangeset was leaving a deferred edge to visit when everything should have been visited in previous chunks.
Differential Revision: D28095569
fbshipit-source-id: ccba4a679fc28bde042cfc222e5097c84fa968c0
Summary:
Right now we write straight to a logger with no filter, so no matter the log
level we print this stuff out. Let's fix it.
While we're at it, move this back to debug level.
I'd made this trace in my recent cmdlib refactoring (which resulted in us
properly initializing logging in all binaries), since I assumed we just had level
filtering working but with debug-logging enabled and I didn't want to have to
update every single test, but it turns out that the reason we didn't print it
out at trace is just because that's not enabled at all in our slog build:
D28097080.
Reviewed By: StanislavGlebik
Differential Revision: D28116053
fbshipit-source-id: f59d9a70ea3c3d834adea16f2686bfc244672b14
Summary: The precise compressed size of big blobs in zstd varies between runs. Glob out the exact size.
Reviewed By: StanislavGlebik
Differential Revision: D28116066
fbshipit-source-id: 990add820de6c8cb0029805bc1de304fdf83acba
Summary:
It wasn't in the warm bookmark cache, but that was an oversight: there's no
reason for it not to be here. Let's add it, since if the derived data tailer is
crashlooping (see attached task T89911396) there might be nothing to derive the
fastlog data structure, and we end up with a long queue to derive.
Reviewed By: krallin
Differential Revision: D28114533
fbshipit-source-id: feb29c07d90be6250c5385ae9f2fb13eb52eedba
Summary:
From what I can see, this was added when EdenFS had a Mononoke store, which is
now long gone, thus we should be able to remove the Curl dependency altogether.
Reviewed By: fanzeyi
Differential Revision: D28037816
fbshipit-source-id: 834f7db64bab5dda1748ad2f033c27a2854b0ba4
Summary: Looks like these aren't needed since these files are owned by a TARGETS file.
Reviewed By: genevievehelsel
Differential Revision: D28101197
fbshipit-source-id: d790530227641bf25e48bd96c8a95dd31f08a954
Summary:
Now that autodeps knows where to find cpptoml.h, we no longer need these
manual annotations.
Reviewed By: kmancini
Differential Revision: D28100956
fbshipit-source-id: 463b73834c500c1d16a4a769af3655938124d49d
Summary:
For no particular reason I was looking at this and saw a bunch of
unneeded `vec![]` temporaries which could be replaced with arrays or slices.
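For illustration, with a hypothetical helper `total_len` that only reads its input, the two call styles look like this:

```rust
/// Hypothetical helper that only needs to read its inputs.
fn total_len(parts: &[&str]) -> usize {
    parts.iter().map(|p| p.len()).sum()
}

/// Before: builds a temporary `Vec` on the heap just to borrow it as a slice.
fn with_vec() -> usize {
    total_len(&vec!["a", "bc", "def"])
}

/// After: borrows a fixed-size array directly, with no heap allocation.
fn with_array() -> usize {
    total_len(&["a", "bc", "def"])
}
```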
Reviewed By: krallin
Differential Revision: D28073693
fbshipit-source-id: 7fca3b4c7b40cc380b4b128e9809912b7b9ba1f7
Summary:
The original bug that resulted in empty revisions being pulled is long-fixed:
T28553115. I'm planning to make data1 nullable so I can reclaim space by removing older
revs.
Reviewed By: DurhamG
Differential Revision: D28096278
fbshipit-source-id: a57da458df115dcbdf544e2151aa327651190c1a
Summary:
This adds hgsql tests to the list of tests using revision numbers and
marks some racy lines as optional.
Reviewed By: quark-zju
Differential Revision: D28096282
fbshipit-source-id: eb8406cb74f3338d13d4109fce35f969ff9e3b79
Summary:
This is an hg-server backport of the fix from D27659634 (8e8aaa61d6).
Those are not used. Recently we saw build issues like:
lib/third-party/sha1dc/sha1.c:8:10: fatal error: string.h: No such file or directory
#include <string.h>
^~~~~~~~~~
Possibly by some compiler flags disabling stdlib. Since we don't need
the C code let's just remove them.
Reviewed By: StanislavGlebik
Differential Revision: D28096283
fbshipit-source-id: 6c5390d26264e1e39f99b29dec8608d92e5ae572
Summary: Like it says in the title.
Reviewed By: HarveyHunt
Differential Revision: D28092796
fbshipit-source-id: 01816f815148aca6c86078fb7dec616ecf53095c
Summary:
This updates hg to have a different number of retries for backoffs requested by
the server and for errors.
The rationale is that backoffs are fairly well understood and usually caused by
a surge in traffic where everybody wants the same data (in which case we should
be willing to wait to get it because there is literally no alternative),
whereas general errors aren't predictable in the same way.
We're now effectively at a point on the server side where _all_ our instances
have the exact same load, so if any server is telling you to backoff, that
pretty much guarantees that the whole tier has too much traffic to deal with.
This leaves us with two options:
- Tell clients to wait longer and smooth out the traffic surge.
- Add enough capacity that even our biggest surges don't result in _any_
throttling.
The latter is a bit unrealistic given we routinely get egress
variations in excess of 5x (here's an example: https://fburl.com/ods/pidsrqnl),
so this does the former.
This also updates the client to tell the server how many attempts it has left
in addition to how many it used up so far. How many are left is more meaningful
for alerting!
Finally, it adds a bit of logging so that in debug mode you can see this
happening.
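A minimal sketch of keeping separate retry budgets for server-requested backoffs and for other errors. All names here (`Outcome`, `RetryLimits`, `retry`) are hypothetical, and the sketch omits the sleeping and the attempts-left reporting that a real client would do.

```rust
/// Result of a single attempt, as seen by a hypothetical client.
enum Outcome<T> {
    Done(T),
    /// The server asked us to slow down.
    Backoff,
    /// Any other failure.
    Error,
}

/// Separate budgets: backoffs are expected during traffic surges, so they
/// can be given a larger allowance than general errors.
struct RetryLimits {
    max_backoffs: usize,
    max_errors: usize,
}

fn retry<T>(limits: &RetryLimits, mut attempt: impl FnMut() -> Outcome<T>) -> Option<T> {
    let (mut backoffs, mut errors) = (0, 0);
    loop {
        match attempt() {
            Outcome::Done(value) => return Some(value),
            Outcome::Backoff => {
                if backoffs == limits.max_backoffs {
                    return None;
                }
                backoffs += 1;
                // A real client would wait here before retrying.
            }
            Outcome::Error => {
                if errors == limits.max_errors {
                    return None;
                }
                errors += 1;
            }
        }
    }
}
```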
Reviewed By: quark-zju
Differential Revision: D28092797
fbshipit-source-id: f61410e39c4a3e3356371a3c7bd7892de4beacc8
Summary:
After D27144492 (48cd15ab14) we disabled revision number resolution. There is no need to
consider it when calculating shortest prefix.
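With revision numbers out of the picture, the calculation only has to disambiguate against other hashes. A simplified sketch over hex strings (the function name and the linear scan are assumptions for illustration, not how hg implements it):

```rust
/// Length of the shortest prefix of `target` that no other hash in `all`
/// shares. Entries equal to `target` itself are skipped.
fn shortest_unique_prefix_len(target: &str, all: &[&str]) -> usize {
    let longest_shared = all
        .iter()
        .filter(|h| **h != target)
        .map(|h| {
            h.bytes()
                .zip(target.bytes())
                .take_while(|(a, b)| a == b)
                .count()
        })
        .max()
        .unwrap_or(0);
    // One character past the longest shared prefix is enough to be unique.
    (longest_shared + 1).min(target.len())
}
```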
Reviewed By: DurhamG
Differential Revision: D28072997
fbshipit-source-id: 832017c7b626265eb8cd2dd78946a03c4e7228f6
Summary:
This diff defines symlink type in `DirType`.
Even though it is not directly used in the FSCK diff, this will allow us to support symlinks in EdenFS Windows in the future.
Reviewed By: genevievehelsel
Differential Revision: D28016305
fbshipit-source-id: 67c1aa22e39198f9c91845129695f27b8303a5f1
Summary: Add strum derivations to bulkops so we can use them in command line parsing later in stack.
Differential Revision: D28069912
fbshipit-source-id: 4d997e20e18f2011b51933ed4322c85bb7468980
Summary:
We were ignoring the return value of runWhileMaterialized, and thus we were
returning to FUSE before fallocate returned.
Reviewed By: fanzeyi
Differential Revision: D28081991
fbshipit-source-id: f398942ddb2432e48e80c148abc8edb7e5ada71d
Summary: Start logging mtime as a relatedness key in the walker scrub pack info output.
Differential Revision: D28055637
fbshipit-source-id: 4c24c5f2af0414ae7df17ade69bba9ff18861264
Summary:
We used to carry patches for Tokio 0.2 to add support for disabling Tokio coop
(which was necessary to make Mononoke work with it), but this was upstreamed
in Tokio 1.x (as a different implementation), so that's no longer needed. Nobody
else besides Mononoke was using this.
For Hyper we used to carry a patch with a bugfix. This was also fixed in Tokio
1.x-compatible versions of Hyper. There are still users of hyper-02 in fbcode.
However, this is only used for servers and only when accepting websocket
connections, and those users are just using Hyper as an HTTP client.
Reviewed By: farnz
Differential Revision: D28091331
fbshipit-source-id: de13b2452b654be6f3fa829404385e80a85c4420
Summary:
This used to be used by Mononoke, but we're now on Tokio 1.x and on
corresponding versions of Gotham so it's not needed anymore.
Reviewed By: farnz
Differential Revision: D28091091
fbshipit-source-id: a58bcb4ba52f3f5d2eeb77b68ee4055d80fbfce2
Summary:
Connect up the scrub stream types so they will be uniform for scrubs that log pack info and those that do not.
This is in preparation for the next diff, which connects up the pack info logging of path hashes to scrub. CI for this diff verifies it hasn't broken the non-path tracking case.
Differential Revision: D28031868
fbshipit-source-id: 7bf91eb1778f57487f6a2847f215cf7f5cd2dff7
Summary: This moves evolve_path up to WrappedPathLike so that we can use sample route evolution logic for routes that track paths (e.g. corpus sampling) and path hashes (e.g. scrub, where path hashes take less memory than full paths).
Differential Revision: D28031867
fbshipit-source-id: cdabdc466158a8db1c770536747c996dddb27e71
Summary: Name the fields rather than leave it as a tuple struct. This makes it a bit easier to work with in the rest of the stack
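As an illustration of the difference (the struct and field names below are hypothetical, not the ones touched by this diff):

```rust
/// Before: positional fields, whose meaning is not obvious at the use site.
struct SampleKeyTuple(u64, Option<i64>);

/// After: the same data with named fields.
struct SampleKey {
    path_hash: u64,
    mtime: Option<i64>,
}

fn describe_tuple(key: &SampleKeyTuple) -> String {
    format!("{} {:?}", key.0, key.1)
}

fn describe_named(key: &SampleKey) -> String {
    format!("{} {:?}", key.path_hash, key.mtime)
}
```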
Differential Revision: D28062254
fbshipit-source-id: 9e5202b4d6f1f29d44d98b86aa9b6ddb97d821eb
Summary: Makes more sense for this to be a method on NodeType
Differential Revision: D28031869
fbshipit-source-id: 1ddbafa0d7634ac67fd8d5112e6f57759ed91638
Summary: Name the fields rather than leave it as a tuple struct
Differential Revision: D28031866
fbshipit-source-id: 039f004e0b81294aa6d6b13e79cb45ee2b84567c
Summary: This new trait abstracts across WrappedPath and WrappedPathHash. Later in the stack I make path tracking use this to track either full paths (for corpus sampling) or path hashes (for logging from scrub).
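A rough sketch of the idea, with hypothetical names (`PathLike`, `FullPath`, `PathHash`, `track`) rather than the walker's actual trait and types: code that tracks paths is written once against the trait and works with either representation.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Hypothetical abstraction over "something that identifies a path".
trait PathLike: Sized {
    fn root() -> Self;
    fn child(&self, name: &str) -> Self;
}

/// Keeps the whole path: more memory, but human readable.
struct FullPath(String);

/// Keeps only a fixed-size hash of the path.
struct PathHash(u64);

impl PathLike for FullPath {
    fn root() -> Self {
        FullPath(String::new())
    }
    fn child(&self, name: &str) -> Self {
        FullPath(format!("{}/{}", self.0, name))
    }
}

impl PathLike for PathHash {
    fn root() -> Self {
        PathHash(0)
    }
    fn child(&self, name: &str) -> Self {
        // Combine the parent's hash with the new component.
        let mut hasher = DefaultHasher::new();
        self.0.hash(&mut hasher);
        name.hash(&mut hasher);
        PathHash(hasher.finish())
    }
}

/// Generic tracking code: works for full paths and for path hashes.
fn track<P: PathLike>(components: &[&str]) -> P {
    components.iter().fold(P::root(), |path, name| path.child(name))
}
```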
Differential Revision: D28031870
fbshipit-source-id: d1c57230f68fffff179929a3cb92c82d92e0588c
Summary:
Like it says in the title. This isn't giving us the same error consistently,
causing flaky failures.
Reviewed By: StanislavGlebik
Differential Revision: D28091747
fbshipit-source-id: dfc7a28b443c6577823c71cee7b006ed30fec18e
Summary: This is no longer needed, as all construction is performed by facet factories.
Reviewed By: StanislavGlebik
Differential Revision: D28001390
fbshipit-source-id: 237dd4f7b8b08bec5b85360edc3be7018d9161de
Summary:
Keeping the `Changesets` trait as well as its implementations in the same crate means that users of `Changesets` also transitively depend on everything that is needed to implement it.
Flatten the dependency graph a little by splitting it into two crates: most users of `Changesets` will only depend on the trait definition. Only the factories need depend on the implementations.
Reviewed By: krallin
Differential Revision: D27430612
fbshipit-source-id: 6b45fe4ae6b0fa1b95439be5ab491b1675c4b177
Summary:
The changesets object is only valid to access the changesets of a single repo
(other repos may have different metadata database config), so it is pointless
for all methods to require the caller to provide the correct one. Instead,
make the changesets object remember the repo id.
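In sketch form, with a hypothetical in-memory `Changesets` struct (the real one is a trait backed by a database, and its methods differ): the repo id is supplied once at construction, and the methods no longer take it.

```rust
use std::collections::HashSet;

#[derive(Clone, Copy, Debug, PartialEq, Eq)]
struct RepositoryId(i32);

/// Hypothetical in-memory stand-in for a per-repo changesets object.
struct Changesets {
    repo_id: RepositoryId,
    known: HashSet<u64>,
}

impl Changesets {
    /// The repo id is provided once, when the object is built.
    fn new(repo_id: RepositoryId) -> Self {
        Changesets { repo_id, known: HashSet::new() }
    }

    fn repo_id(&self) -> RepositoryId {
        self.repo_id
    }

    /// Methods no longer take a repo id, so callers cannot pass a wrong one.
    fn add(&mut self, cs_id: u64) -> bool {
        self.known.insert(cs_id)
    }

    fn exists(&self, cs_id: u64) -> bool {
        self.known.contains(&cs_id)
    }
}
```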
Reviewed By: krallin
Differential Revision: D27430611
fbshipit-source-id: bf2c398af2e5eb77c1c7c55a89752753020939ab
Summary:
The `get_sql_changesets` method on `Changesets` is an abstraction violation,
and prevents extraction of `SqlChangesets` to a separate crate as it would
introduce a circular dependency.
It is used to allow bulk queries to enumerate changesets by integer unique ID,
so promote this to a full feature of `changesets`, and remove the
`get_sql_changesets` method.
Reviewed By: krallin
Differential Revision: D27426921
fbshipit-source-id: 2839503029b262dd5e6a8be09bb35bb143b4c5ac
Summary:
folly::via is a Future API, and thus it creates one, which requires allocating
it and then attaching it to the Executor. Since the code to dispatch a request
isn't Future based, we don't need to use folly::via, and we can simply add the
lambda to the Executor directly. This removes expensive memory allocations from
the EventBase.
Reviewed By: kmancini
Differential Revision: D27976674
fbshipit-source-id: 8fa9724a94ba69b071ab894cdbbad0d33733c098
Summary:
Neither macOS nor Linux sends multi-fragment requests to the NFS server.
Since supporting these means calling into memmove, which can be expensive for
large requests, let's just remove support for them for now. If somehow macOS
and/or Linux start sending these, the XCHECK(isLast) will catch this and we can
fix the code then.
Reviewed By: kmancini
Differential Revision: D27976671
fbshipit-source-id: 77c758b2bb36517d22d5b637e6f0ebf84cc19e5b