Summary:
Similarly to how we could try invalidating a file that isn't cached, we could
also be trying to invalidate a file whose path isn't cached. Both are
legitimate, and thus we need to ignore both.
Reviewed By: chadaustin
Differential Revision: D24125225
fbshipit-source-id: e8abe5cde5aa3602bb48258abb64aa0cdf60241d
Summary:
Thrift represents the `binary` data type as `std::string` in C++. This method will
help us convert a `Hash` into a byte string.
Reviewed By: xavierd
Differential Revision: D24083621
fbshipit-source-id: ae50088db7727d98ca11a017f82b71e942217a17
Summary:
This diff adds a new constructor to `SqliteDatabase` to allow creation of
an in-memory SQLite database. This can come in handy in testing.
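For illustration only, the same idea can be sketched with Python's standard `sqlite3` module rather than the C++ `SqliteDatabase` class. The `open_in_memory_db` helper and the `kv` table are hypothetical names, not part of this diff.

```python
import sqlite3


def open_in_memory_db():
    # ":memory:" asks SQLite for a database that lives only in RAM,
    # so a test never touches the filesystem.
    conn = sqlite3.connect(":memory:")
    # Hypothetical schema, only here to have something to query.
    conn.execute("CREATE TABLE kv (key TEXT PRIMARY KEY, value BLOB)")
    return conn


conn = open_in_memory_db()
conn.execute("INSERT INTO kv VALUES (?, ?)", ("a", b"\x01"))
row = conn.execute("SELECT value FROM kv WHERE key = ?", ("a",)).fetchone()
```

The database disappears when the connection is closed, which is what makes it convenient for tests.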
Reviewed By: xavierd
Differential Revision: D24083579
fbshipit-source-id: ad6dd8b1c20392a882c1f164ef1f8af2f0ba11f8
Summary:
This allows `edenfsctl debug processfetch` to display which processes triggered
IO in EdenFS, which will be useful for debugging rogue processes walking the
entire repo.
Reviewed By: chadaustin
Differential Revision: D23997665
fbshipit-source-id: 7d92755d0068a4b1819eb0c84b30cbdaa24296f7
Summary:
This will enable us to gather a bit more debugging information about which
processes are fetching data. The one missing bit on Windows is collecting the
process name; for now, a "NOT IMPLEMENTED" placeholder is put in place.
Reviewed By: wez
Differential Revision: D23946258
fbshipit-source-id: 9f7642c7b9207c5b48ffff0f4eb0333af00bc7d5
Summary: Instead of returning an error upon receiving an empty request, just return a `Fetch` object that does nothing. This prevents Mercurial from crashing in situations where an empty request somehow makes it to the EdenAPI remote store.
Reviewed By: quark-zju
Differential Revision: D24119632
fbshipit-source-id: cf4ec707b4097656c76d7084a55b2d0b3150b679
Summary:
Previously, EdenAPI was using `remotefilelog.debug` to determine whether to print things like download stats. Let's give EdenAPI its own `debug` option that can be configured independently of remotefilelog.
One notable benefit of this change is that download stats will always be printed immediately after the HTTP request completes. This can help rule out network or server issues in situations where Mercurial appears to be hanging during data fetching. (e.g., if hg had downloaded all of the data but was taking a while to process it, the debug output would show this.)
Reviewed By: DurhamG
Differential Revision: D24097942
fbshipit-source-id: bf9b065e7b97fc7ffe50ab74b1b13e2fe364755c
Summary: HostInfoProperties is allocated for every HostInfo and is accessed on every request. There's no reason this should be a unique_ptr, and the pointer indirection is expensive.
Reviewed By: jmswen
Differential Revision: D24009296
fbshipit-source-id: 2034d1c6e61e0dec51ca6ac7bd14ab12e74966d4
Summary:
Previously phase calculation was done via a simple ancestor check. This
was very slow in cases that required going far back into the graph. Going a year
back could take a number of seconds.
To fix it, let's take the Rust phaseset logic and rework it to make only_both
produce an incremental public nodes set. In a later diff we can switch the
phaseset function to use this as well, but right now phaseset returns IdSet, and
that would need to be changed to Set, which may have consequences. So I'll do it
later.
Reviewed By: quark-zju
Differential Revision: D24096539
fbshipit-source-id: 5730ddd45b08cc985ecd9128c25021b6e7d7bc89
Summary:
This is one more fix to use the correct commit sync config version. In particular,
this diff fixes a case where a single parent commit was rewritten out, e.g.
if a large repo commit touches only files that do not remap in a small repo. In
that case we still want to record the correct mapping so that all descendants use
the correct mapping as well.
Reviewed By: ikostia
Differential Revision: D24109221
fbshipit-source-id: bcdbb01b964d70227dff8363e77964716a345261
Summary:
Let's move initialization into a separate function. I'm planning to use it in
the next diff for another test.
Reviewed By: ikostia
Differential Revision: D24109222
fbshipit-source-id: 73142dd46ef3de15ff381670ed6d5e31653c5dd4
Summary:
Previously fetch_bonsai_range returned all commits between `ancestor` and
`descendant`, but `ancestor` was included. This is usually not what we want and
it might be surprising and can lead to subtle bugs. As an example, the next commit
in the stack might have failed pushrebases when it shouldn't have.
This diff changes the semantics of the function to exclude the ancestor. This
function was used for 2 use cases:
1) Find changed files. find_rebased_set function was manually removing the
ancestor anyway, so there's no change in behaviour
2) To check that there are no case conflicts. Previously we were checking the
case conflicts with the ancestor included, but that wasn't necessary. To prove that,
let's go over the two possible situations:
i) This is the first iteration of the pushrebase
```
CB
SB |
| ...
... CA
SA
| /
root
```
In that case, files introduced by the root commit were used to check whether we have
case conflicts. But this is not necessary, because the pushrebase assumption
is that CA::CB should not introduce any new case conflicts. Besides, even if
they added a case conflict, checking just the files changed by the root commit is
not enough to detect it.
Similar logic applies to the SA::SB commits. Checking whether root has any conflicts with
SA::SB commits doesn't make sense.
ii) This is not the first iteration of the pushrebase
```
CB
SB |
| ...
... CA
SA
|
O <- latest pushrebase attempt
... <- we rebased over these commits on the previous attempts
| /
root
```
In this case it's even easier. Commit O was verified on the previous iteration,
so no need to add it here again.
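A minimal sketch of the new "exclude the ancestor" semantics, assuming a linear history stored as a child-to-parent map. `fetch_range_exclusive` and `parent_of` are hypothetical names used for illustration; this is not the actual `fetch_bonsai_range` implementation.

```python
def fetch_range_exclusive(parent_of, ancestor, descendant):
    """Return the commits after `ancestor` up to and including `descendant`.

    Assumes a linear history where `parent_of` maps each commit to its
    single parent (hypothetical representation).
    """
    commits = []
    node = descendant
    while node != ancestor:
        if node is None:
            raise ValueError("ancestor is not reachable from descendant")
        commits.append(node)
        node = parent_of.get(node)
    # Oldest first; `ancestor` itself is never appended.
    commits.reverse()
    return commits


parent_of = {"CA": "root", "CB": "CA"}
client_commits = fetch_range_exclusive(parent_of, "root", "CB")
```

With this shape, callers such as a changed-files computation no longer need to strip the ancestor by hand.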
Reviewed By: aslpavel
Differential Revision: D24110710
fbshipit-source-id: 90dff253cba0013e9d5e401474132a152d473cae
Summary:
The SpawnedProcess tests were failing on my macOS machine because pwd
and getcwd returned slightly different paths. Normalize them before
comparing.
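As an illustration of the normalization, here is a Python sketch; the actual tests are not written in Python. One common cause of such mismatches on macOS is that /tmp is a symlink to /private/tmp, though the summary does not say that this was the cause here. `same_directory` is a hypothetical helper.

```python
import os


def same_directory(a, b):
    # realpath resolves symlinks and "." / ".." components, so two
    # spellings of the same directory compare equal.
    return os.path.realpath(a) == os.path.realpath(b)
```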
Reviewed By: genevievehelsel
Differential Revision: D24094634
fbshipit-source-id: aacf802280b1dd1de19797604bfe359d7e60cbf8
Summary:
A couple of files were moved but test-check-code.t wasn't updated to reflect
this, causing it to fail.
Reviewed By: DurhamG
Differential Revision: D24113079
fbshipit-source-id: 9a0c0b6f07a6532715bf5ee401036ded0a05b16a
Summary:
Pull Request resolved: https://github.com/facebookexperimental/eden/pull/65
Using $LOCALIP will ensure more consistent behavior when setting up the server in ipv4 or ipv6.
The LOCALIP variable was also being abused to override the SSH client address, so a separate SSH_IP_OVERRIDE env variable was created here.
Lastly, the result of the `curl` call is now printed whenever the test fails to verify that Mononoke is running.
Reviewed By: farnz
Differential Revision: D24108186
fbshipit-source-id: e4b68dd2c5dd368851f0b00064088ffc442e31e0
Summary: D24070707: `[Thrift] Provide sorted fields to read_field_begin` made a change to the generated rust thrift files, so the eden/scm thrift files have to be regenerated to fix the build.
Reviewed By: farnz
Differential Revision: D24109655
fbshipit-source-id: e8575a76642673a11514fdce8e30f13ca28151f0
Summary:
Normally, sync logic infers the `CommitSyncConfigVersion` to use from parent commits (or from the current version for root commits). However, for test purposes it is convenient to force a version override. This logic does not change any of the production behaviors, and will be used in a later diff.
TODO: can it ever be needed beyond tests? I've thought about using this for "version boundary" commits, but those would probably just be constructed while completely bypassing the sync logic.
TBH, I am not certain this diff is a good change. But I've spent a very large amount of time crafting the repos used in the `sync_merge` tests later in this stack, so I am proposing to land this, then spend some time refactoring sync tests (and hopefully making it easier to craft test repos), then remove this logic. Obviously, this logic should only be landed if we land the tests in the first place.
Reviewed By: StanislavGlebik
Differential Revision: D24104101
fbshipit-source-id: 0825f04ed74532e89fd5f1fbebeee5f2001fedcd
Summary: It is sometimes very convenient to just inject new DAGs into existing repos.
Reviewed By: StanislavGlebik
Differential Revision: D24103164
fbshipit-source-id: abdfa18acb2f2fb1475b601a7eccb57e006982ec
Summary: No need to allocate a new vector if we just need to remove items from the current one.
Reviewed By: StanislavGlebik
Differential Revision: D24088319
fbshipit-source-id: 10804d925f20fe8dd1e2bb8500aa06d30bd367c1
Summary:
This just adds a single fn. I did not come up with a better place/name to put
it, suggestions are welcome. Seems generic enough to belong at the top-level
common location.
I've already needed this twice, so decided to extract. Second callsite will be further in the stack.
Reviewed By: StanislavGlebik
Differential Revision: D24080193
fbshipit-source-id: c3e0646f263562f3eed93f1fdbab9a076729f33c
Summary: `clippy` often complains about the use of `.len() != 0`, `.len() > 0` or `.len() == 0` and proposes to use `.is_empty()` instead. This diff does that across Mononoke.
Reviewed By: aslpavel
Differential Revision: D24099427
fbshipit-source-id: 1bba2f958485b7efb3f41bf3eae820879c92b0e5
Summary:
Formats a subset of opted-in Python files in fbsource.
Black formatting was applied first, which is guaranteed
safe as the AST will not have changed during formatting.
Pyfmt was then run, which also includes import sorting.
The changes from isort were manually reviewed, and
some potentially dangerous changes were reverted,
and the directive was added to those
files. A final run of pyfmt shows no more changes to
be applied.
Reviewed By: zertosh
Differential Revision: D24101830
fbshipit-source-id: 0f2616873117a821dbc6cfb6d8e4f64f4420312b
Summary: This can be used by dag::Vertex and minibytes::Bytes.
Reviewed By: kulshrax
Differential Revision: D23966985
fbshipit-source-id: 3b4b29648e038ef49f26ce2b500119e148544d9e
Summary:
The py_stream_class causes the code to be more verbose. It basically enforces
the bindings crate to define new types wrapping pure Rust types, and then
define py_stream_class.
In a future diff, I'm adding FromPyObject/ToPyObject support for types that
implement serde Deserialize/Serialize. py_stream_class gets in the way,
because the blanket type from cpython-ext cannot be used in the py_stream_class
macro. cpython-ext is not the proper place to define business-related stream
types.
Therefore, define a type-erased Python class, and implement
FromPyObject/ToPyObject automatically for TStream<anyhow::Result<T>> where
T implements FromPyObject or ToPyObject.
The FromPyObject now converts a Python iterator back to a stream. It's
no longer zero-cost. However, I'd imagine such usecases can be short-cut
using pure Rust code.
Background: Initially, I added some FromPyObject/ToPyObject impls to pure
Rust crates gated by a "pytypes" feature. While that works fine with cargo
build, buck does not support dynamic features and the fact that we support
both py2 and py3 makes it extremely hard to support cleanly in buck build.
For example, if minibytes::Bytes defines ToPyObject for Bytes, then any
crate using minibytes would have 2 different versions: a py2 version, a
py3 version, and they both depend on python. That seems to be a bad approach.
Reviewed By: sfilipco
Differential Revision: D23966984
fbshipit-source-id: eafb31ad458dcbdd8e970d8e419a10fbbe30595f
Summary:
Per the feedback on D23920367 (318f5683a5), let's make the human-readable download stats shorter. Example:
```
Downloaded 10.59 MiB in 12.35s over 5 requests (7.19 Mb/s, latency: 123ms)
```
The amount downloaded is now reported in binary-prefixed bytes (so that it can be directly compared to file sizes) whereas the transfer rate is reported in decimal-prefixed bits per second (so that it can be directly compared to a user's measured network speed).
Additionally, we now use the default formatting available from `std::time::Duration`, which will automatically choose the appropriate display units.
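The unit choices can be checked with a short sketch. This is Python for illustration, not the Rust code in this diff, and `format_download_stats` is a hypothetical name. It formats the elapsed time with a fixed `{:.2f}s` rather than mimicking `std::time::Duration`'s automatic unit selection.

```python
def format_download_stats(num_bytes, seconds, requests, latency_ms):
    # Size uses binary prefixes: 1 MiB = 1024 * 1024 bytes.
    mib = num_bytes / (1024 * 1024)
    # Rate uses decimal prefixes and bits: 1 Mb/s = 1,000,000 bits per second.
    mbps = num_bytes * 8 / seconds / 1_000_000
    return (
        f"Downloaded {mib:.2f} MiB in {seconds:.2f}s over {requests} requests "
        f"({mbps:.2f} Mb/s, latency: {latency_ms}ms)"
    )


line = format_download_stats(11_104_420, 12.35, 5, 123)
```

Feeding in roughly 10.59 MiB over 12.35 seconds reproduces the figures in the example line above.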
Reviewed By: quark-zju
Differential Revision: D24096525
fbshipit-source-id: 39c49f1b08135bbae7a7544b1ffe2bdbfe1533a1
Summary:
The bfs fetching path in Rust was broken because it directly called
getdesignatednodes. getdesignatednodes returns False if it wasn't able to
succeed, so we need to be able to fall back in that case. The _prefetchtrees
function is meant for that, so let's just call that.
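The fallback pattern can be sketched as follows. `prefetch_with_fallback`, `fast_path` and `fallback` are hypothetical stand-ins; the real signatures of `getdesignatednodes` and `_prefetchtrees` are not shown in this summary.

```python
def prefetch_with_fallback(keys, fast_path, fallback):
    """Try the fast path first; use the fallback if it reports failure."""
    if fast_path(keys):
        return "fast"
    # The fast path returned False: it could not serve the request,
    # so the slower path has to fetch the same keys.
    fallback(keys)
    return "fallback"
```

Calling the fast path directly and ignoring its return value is the bug described above: a False result would silently leave the data unfetched.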
Reviewed By: singhsrb
Differential Revision: D24090946
fbshipit-source-id: d16c2c8f80f690a22046385f0e95785996a62949
Summary:
Previously, only the first `__iter__` gets wrapped. With D23095468 (34df768136), the first
`__iter__` is used by the "simplify graph" feature, not the main iteration
loop rendering the graph log output, causing the prefetch feature to fail.
File "edenscm/mercurial/commands/__init__.py", line 4196, in log
return cmdutil.graphlog(ui, repo, pats, opts)
# pats = ()
File "edenscm/mercurial/cmdutil.py", line 3250, in graphlog
ui, repo, revdag, displayer, graphmod.asciiedges, getrenamed, filematcher
File "edenscm/mercurial/cmdutil.py", line 3106, in displaygraph
rustdisplaygraph(ui, repo, dag, displayer, getrenamed, filematcher, props)
File "edenscm/mercurial/cmdutil.py", line 3208, in rustdisplaygraph
for (rev, _type, ctx, parents) in dag:
File "edenscm/mercurial/graphmod.py", line 63, in dagwalker
rootnodes = cl.tonodes(revs)
# revs = <baseset- [7408158, 72057594037927936, ...]
File "edenscm/mercurial/changelog2.py", line 196, in tonodes
return self.inner.tonodes(revs)
# revs = <baseset- [7408158, 72057594037927936, ...]
File "edenscm/hgext/phabstatus.py", line 281, in next
return self.__next__()
Lift the "first time" limitation for wrapping `__iter__` to solve the problem.
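A minimal sketch of wrapping every `__iter__` call rather than only the first one. `PrefetchOnIter` and its `prefetch` callback are hypothetical and much simpler than the actual phabstatus wrapper.

```python
class PrefetchOnIter:
    """Wrap an iterable so that every __iter__ call triggers a prefetch."""

    def __init__(self, revs, prefetch):
        self._revs = revs
        self._prefetch = prefetch

    def __iter__(self):
        # No "first time only" flag: each new iteration is wrapped, so it
        # does not matter which consumer happens to iterate first.
        revs = list(self._revs)
        self._prefetch(revs)
        return iter(revs)


prefetched = []
wrapped = PrefetchOnIter([1, 2, 3], prefetched.append)
first_pass = list(wrapped)   # e.g. a graph-simplification pass
second_pass = list(wrapped)  # e.g. the main rendering loop
```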
Reviewed By: simpkins
Differential Revision: D24066918
fbshipit-source-id: 6bbd244e729724e5143147bde60bcb4c8ee4bc80
Summary: Since this is only used in the manifest target, fold it into it.
Reviewed By: DurhamG
Differential Revision: D24062629
fbshipit-source-id: c3241b53bde7abba8a80a2945661d1a24b7e3034
Summary: We now get progress bar output when fetching from memcache!
Reviewed By: kulshrax
Differential Revision: D24060663
fbshipit-source-id: ff5efa08bced2dac12f1e16c4a55fbc37fbc0837
Summary:
Less #ifdef is always better. Since normalizing a path also works on
non-Windows, there is no reason to special case Windows here.
Reviewed By: fanzeyi
Differential Revision: D24020603
fbshipit-source-id: 114dae3bd9a4743230f4c82d219ff74ffc9379c7
Summary:
In C++, empty braces are sufficient to zero-initialize a non-class type.
{0} explicitly initializes only the first field, which is not the intent here.
Reviewed By: wez
Differential Revision: D24068481
fbshipit-source-id: 2de87da983a05f25e0222bf5338533a7b96fb36a
Summary: These aren't included anywhere, we can remove them.
Reviewed By: DurhamG
Differential Revision: D24062627
fbshipit-source-id: 9ff101eb44965ac3502ada3265ffcc8acc09d2e5
Summary:
This is no longer about datapack, but only about parsing manifest entries, thus
renaming.
Reviewed By: DurhamG
Differential Revision: D24062634
fbshipit-source-id: 5c52b784d20437e87012dd4bc6cb13d879da9cb9
Summary: The code doesn't use anything from libmpatch, we don't need to depend on it.
Reviewed By: DurhamG
Differential Revision: D24055084
fbshipit-source-id: 0f7bac73f1b711da4395e25619577a0a2e0ca959
Summary: These are unused, no need to keep the code around.
Reviewed By: DurhamG
Differential Revision: D24055085
fbshipit-source-id: 6246d746983a575c051ddcb51ae02582a764a814
Summary: This is unused, no need to keep it around.
Reviewed By: DurhamG
Differential Revision: D24054164
fbshipit-source-id: 161b294eb952c6b4584aa0d49d8ff46cd63ee30f
Summary:
This code is effectively unused. The only bit still relevant is that EdenFS
still depends on the Manifest class to parse a manifest.
Reviewed By: DurhamG
Differential Revision: D24037723
fbshipit-source-id: 901ae2ffc8960a95ec655a2e14d79afb8d32dcab
Summary: This is unused, let's remove it.
Reviewed By: DurhamG
Differential Revision: D24037722
fbshipit-source-id: bc8a272809cb1f20f54d651a39ee42ff57169534
Summary:
It has been used only as a fallback path for a while now. Since Mercurial has
supported CMD_CAT_TREE for quite some time, let's get rid of it.
Reviewed By: fanzeyi
Differential Revision: D24037004
fbshipit-source-id: 69887e6d8508419a22d68d062c78676aacba3b24
Summary:
If the data isn't found in the Rust one, it can't be found in the non-Rust one.
Since the non-Rust one will issue a filesystem rescan, this is a fairly
expensive operation which shows up in strobelight when trying to walk the
entire repo with: `rg --files`.
There is one last place that still uses the non-Rust stores, and that's as a
fallback for when Mercurial doesn't support CMD_CAT_TREE. Since this has been
supported for a bit, I'll make a followup change to completely get rid of the
non-Rust stores.
Reviewed By: fanzeyi
Differential Revision: D24035451
fbshipit-source-id: acd9741a16f3786796d329a4cddfe4ee435bcad9
Summary:
We want to end up with two `put` behaviours - overwrite and do not overwrite.
Currently, SQLBlob only implements the latter, but some users assume that `put` always overwrites. Change `put` to overwrite, matching Manifold.
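The two behaviours can be sketched with an in-memory store. `MemoryBlobstore` and `put_if_absent` are hypothetical names for illustration, not the SQLBlob or Manifold API.

```python
class MemoryBlobstore:
    def __init__(self):
        self._blobs = {}

    def put(self, key, value):
        # Overwriting put: the latest value always wins.
        self._blobs[key] = value

    def put_if_absent(self, key, value):
        # Non-overwriting put: keep any existing value and report
        # whether this call actually wrote.
        if key in self._blobs:
            return False
        self._blobs[key] = value
        return True

    def get(self, key):
        return self._blobs.get(key)
```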
Reviewed By: aslpavel
Differential Revision: D24079501
fbshipit-source-id: f75cac81acf874337c38f82597aae645c41a319b
Summary:
On Windows, the du command line application doesn't exist, thus we cannot use
it. Instead, we can simply re-implement the du functionality in Python and use
that on all platforms.
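A rough sketch of what such a re-implementation could look like, not the code from this diff. `disk_usage` is a hypothetical name. Note that it sums apparent file sizes, whereas `du` by default reports allocated disk blocks, so the numbers can differ.

```python
import os


def disk_usage(path):
    """Sum the sizes in bytes of all files under `path`."""
    total = 0
    for dirpath, _dirnames, filenames in os.walk(path):
        for name in filenames:
            full = os.path.join(dirpath, name)
            try:
                # lstat does not follow symlinks, so a link is counted
                # by its own size rather than its target's.
                total += os.lstat(full).st_size
            except OSError:
                # The file vanished or is unreadable; skip it.
                continue
    return total
```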
Reviewed By: chadaustin
Differential Revision: D24030269
fbshipit-source-id: e86c1bcdeac7eeca70201f6fde48c20ef7e305a6
Summary: Now that there are no more use-cases of `get_one`, let's remove it completely.
Reviewed By: farnz
Differential Revision: D24027990
fbshipit-source-id: 47baa6b1e28eedd94d95808efca0a98007a1d388
Summary:
This is a bit of a cargo-cult diff: it replaces the uses of `get_one` with `get` in tests, just to make the same wrong decisions later - use the first item from the produced list of items. So the only thing it does is remove a call site for `get_one`.
The reason it is ok to do `.into_iter().next()` here is because these are tests and we control the situation precisely - we know that there will be one mapping. Same reason we use `.unwrap()` in tests.
Reviewed By: farnz
Differential Revision: D24027785
fbshipit-source-id: 1c11acadfc9f7c6c4af658b414589c32008a6cce
Summary:
`get_one` is a deprecated method, because it uses incorrect logic to resolve ambiguities of multi-mapped commits: it just selects the very first of the potentially many mappings.
Correct resolution is to either handle the ambiguity at the caller site, or rely on provided resolution logic in commit_sync_outcome.rs.
Therefore, I am removing the uses of this method in this and a few surrounding commits.
In this case, the simplest thing is to replace it with `.get` and deal with multi-mappings on the client side:
- for `crossrepo map` subcommand we just print all mappings
- for `update_large_repo_bookmarks` we just fail on multi-mapping, as it seems dangerous to proceed without human intervention
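The "fail on multi-mapping" choice can be sketched as below. `single_mapping` is a hypothetical helper, and the real code is Rust operating on the commit sync mapping; only the decision logic is shown here.

```python
def single_mapping(source, mappings):
    """Return the only mapping for `source`, None if unmapped.

    Raises instead of silently picking the first of several mappings.
    """
    if len(mappings) > 1:
        raise ValueError(
            f"{source} maps to {len(mappings)} commits; refusing to pick one"
        )
    return mappings[0] if mappings else None
```

A caller that can tolerate ambiguity, like a command that just prints mappings, would iterate over the full list instead of calling such a helper.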
Reviewed By: farnz
Differential Revision: D24030033
fbshipit-source-id: c84613579fbf8a5f6bac3c06da0cd4e0ad6c3fb0
Summary:
`get_one` is a deprecated method, because it uses incorrect logic to resolve ambiguities of multi-mapped commits: it just selects the very first of the potentially many mappings.
Correct resolution is to either handle the ambiguity at the caller site, or rely on provided resolution logic in `commit_sync_outcome.rs`.
Therefore, I am removing the uses of this method in this and a few surrounding commits.
In this case, I am changing `get_one` to `CommitSyncer::get_commit_sync_outcome`. There's no functional difference, as this is a large-to-small mapping, which is always 1:1. But it allows us to get rid of a `get_one` call site, so let's do that.
Reviewed By: farnz
Differential Revision: D24027130
fbshipit-source-id: e57cb32c37a68e6762da6e2096ba216d251524f4