Summary:
Introduce `FileMetadata` and `DirectoryMetadata` to `Treeentry`, along with corresponding request API.
Move `metadata.flags` to `file_metadata.revisionstore_flags`, as it is never populated for trees. Do not use `metadata.size` on the wire, as it is never currently populated.
Leaving `DirectoryMetadata` commented out temporarily because serde round trips fail for unit struct. Re-introduced with fields in the next change in this stack.
Reviewed By: DurhamG
Differential Revision: D23455274
fbshipit-source-id: 57f440d5167f0b09eef2ea925484c84f739781e2
Summary:
EdenAPI always checks the integrity of filenode hashes before returning file data to the application. In the case of LFS files, this resulted in errors because the filenode hash is computed using the full file content, but the blob from the server only contains an LFS pointer.
Fix the bug by exempting LFS blobs from filenode integrity checks. (If integrity checks for LFS blobs are desired, the LFS code should be able to do this on its own since LFS blobs are content-addressed.)
Reviewed By: quark-zju
Differential Revision: D24145027
fbshipit-source-id: d7d86e2b912f267eba4120d1f5186908c3f4e9e3
Summary:
`cpython_ext` provides utilities to implement From/ToPyObject directly for
serde types. Lets' use it to simplify the code and set up an example.
debugshell:
In [2]: s,f=api.commitdata(repo.name, list(repo.nodes('master')))
In [3]: list(s)
Out[3]:
[{'hgid': (7, 61, 22, ...), 'revlog_data': '...'}]
Note: `HgId` serialization should probably be changed to use `serde_bytes` somehow
so it does not translate to a Python tuple. That will be fixed later.
Reviewed By: kulshrax
Differential Revision: D23966987
fbshipit-source-id: 9278ccae6f543c387eafe401d4ef8d6ce96d370f
Summary:
This can be used to automate Python/Rust conversions for complex structures
like `CommitRevlogData`.
Reviewed By: kulshrax
Differential Revision: D23966988
fbshipit-source-id: 17a19d38270e6ef0952c13a1cd778487e84a94ff
Summary:
The goal is to implement `FromPyObject` and `ToPyObject` more easily.
Today crates have to dependent on `cpython` to implement `From/ToPyObject`,
which is somewhat unwanted for pure Rust crates.
The `ser` module used to ignore the `variant` field for non-unit enum variants.
They have been fixed so the serialized value can be deserialized correctly.
For example, `enum E { A, B(T) }` will be serialized to `"A"` for `E::A`, and
`{"B": T}` for `E::B`.
Reviewed By: kulshrax
Differential Revision: D23966994
fbshipit-source-id: c50d57bf313caeec65a604ed9b05a5729f3b3635
Summary:
Switch from the default tuple deserialization which only understands the tuple
format, to "bytes" deserialization, which understands not only the existing
"tuple" format (therefore compatible with old data), but also "bytes" and "hex"
formats (for CBOR).
This will unblock us from switching to bytes serialization in the future.
Note: This is a breaking change for mincode serialization. Mincode + HgId users
(zsotre, metalog) have switched to explicit tuple serialization so they don't use
the default deserializaiton and remain unaffected.
Reviewed By: kulshrax
Differential Revision: D23966995
fbshipit-source-id: 83dd53f57bd4e6098de054f46a1d47f8b48133d0
Summary: This will unblock us from switching HgId to bytes serialization by default.
Reviewed By: kulshrax
Differential Revision: D24009039
fbshipit-source-id: a277869ec24652af428cda581faffa62c25d32c4
Summary: Similar to D23966992 (2a2971a4c7), add support to serialize Key differently.
Reviewed By: DurhamG
Differential Revision: D24009041
fbshipit-source-id: 2ecf1610b989a04083196d180bc62307b5162c2f
Summary: Similar to D23966992 (2a2971a4c7), add support to serialize Sha256 differently.
Reviewed By: DurhamG
Differential Revision: D24009040
fbshipit-source-id: b77f6732802f95507e1540f0bbde4d5a92d13cac
Summary:
Add a way to specify different merge tools for interactive and non-interactive
mode.
This will be used for the default `editmerge` merge tool, which pops up the
`EDITOR` (vim) regardless of interactive mode, causing various user complains
and hangs, including `arc pull` running rebase triggering editor, or VS Code
running rebase triggering editor, and some other rebase hangs or vim errors.
Reviewed By: DurhamG
Differential Revision: D24069105
fbshipit-source-id: ec16fdc704cab6daeedb0c23d4028b4309d96d3f
Summary:
This diff makes it so that pushrebase fails if tries to rebase over a commit
with a specified extra "failpushrebase" set. If a client runs into this issue
then they need to do a manual rebase.
Differential Revision: D24110709
fbshipit-source-id: 82cd771c92b9fb45f4fa8794b2c736f08ac900b1
Summary:
This is the first part of allowing us to update mononoke blobstore put behaviour to optionally a) log when it is overwriting keys, and b) not overwrite existing keys.
Introduce BlobstorePutOps for blobstore implementations so we can track overwrite status of a put, and force an explicit PutBehaviour if required. Its intended that only blobstore implementation code and special admin tooling will need to access BlobstorePutOps methods.
Reviewed By: farnz
Differential Revision: D24021168
fbshipit-source-id: 56ae34f9995a93cf1e47fbcfa2565f236c28ae12
Summary:
This passes `--tmpdir` option to `~/fbcode/eden/scm/tests/run-tests.py`
so it's predictable where for example mononoke's logs will be.
Some time ago I was debugging hanging test. It was very annoying that I couldn't specify that tmpdir manually. It also wasn't printed out (it's only printed out with `--keep-tmpdir` **after** the test finishes).
Now it is possible to specify that.
Reviewed By: krallin
Differential Revision: D24137737
fbshipit-source-id: 6280832517b48ece9b65e443c236035e385efea6
Summary:
This diff adds two things:
- the ability to compute the reverse of a `CommitSyncDataProvider::Test`, useful when creating both small-to-large and large-to-small `CommitSyncer` structs in tests
- the ability to set a current `CommitSyncConfigVersion` in the provider, which can also be useful, when simulating current version changes.
NB: I ended up not needing the set version functionality in my tests (further in the stack) in the end, so I can remove it, but I do think it will prove useful eventually.
Reviewed By: StanislavGlebik
Differential Revision: D24103206
fbshipit-source-id: 389169b2984684d83b0f6fdeb3be597d84cc0f12
Summary: Remove unnecessary clone in packblob along with the Clone constraint on the inner blobstore.
Reviewed By: krallin
Differential Revision: D24109293
fbshipit-source-id: b47e68e63b6ffda95d28d974ed6883e4ae31b3a1
Summary:
The `hg unhide` command acquired the repo lock without acquiring the wlock.
This causes locking order problems, as it calls other parts of the code that
will acquire the `wlock` (such as autopull during revset resolution) while it
is already holding the `lock`.
This can cause `hg unhide` to deadlock with other `hg` commands that acquire
`wlock` before `lock`.
Reviewed By: kulshrax
Differential Revision: D24129559
fbshipit-source-id: cf31ec661123df329f1773d2b67deb474d6476f8
Summary:
Similarly to how we could try invalidating a file that isn't cached, we could
also be trying to invalidate a file whose path isn't cached. Both are
legitimate, and thus we need to ignore both.
Reviewed By: chadaustin
Differential Revision: D24125225
fbshipit-source-id: e8abe5cde5aa3602bb48258abb64aa0cdf60241d
Summary:
Thrift represents `binary` data type as `std::string` in C++. This method will
help us to convert `Hash` into a byte string.
Reviewed By: xavierd
Differential Revision: D24083621
fbshipit-source-id: ae50088db7727d98ca11a017f82b71e942217a17
Summary:
This diff adds a new constructor to `SqliteDatabase` to allow creation of
in-memory SQLite database. This can come in handy in testing.
Reviewed By: xavierd
Differential Revision: D24083579
fbshipit-source-id: ad6dd8b1c20392a882c1f164ef1f8af2f0ba11f8
Summary:
This allows `edenfsctl debug processfetch` to display what processes triggered
some IO in EdenFS which will be useful to debug rogue processes walking the
entire repo.
Reviewed By: chadaustin
Differential Revision: D23997665
fbshipit-source-id: 7d92755d0068a4b1819eb0c84b30cbdaa24296f7
Summary:
This will enable to gather a bit more debugging regarding what processes are
fetching data. The one missing bit on Windows is to collect the process name,
for now, a "NOT IMPLEMENTED" placeholder is put in place.
Reviewed By: wez
Differential Revision: D23946258
fbshipit-source-id: 9f7642c7b9207c5b48ffff0f4eb0333af00bc7d5
Summary: Instead of returning an error upon receiving an empty request, just return a `Fetch` object that does nothing. This prevents Mercurial from crashing in situations where an empty request somehow makes it to the EdenAPI remote store.
Reviewed By: quark-zju
Differential Revision: D24119632
fbshipit-source-id: cf4ec707b4097656c76d7084a55b2d0b3150b679
Summary:
Previously, EdenAPI was using `remotefilelog.debug` to determine whether to print things like download stats. Let's give EdenAPI its own `debug` option that can be configured independently of remotefilelog.
One notable benefit of this change is that download stats will always be printed immediately after the HTTP request completes. This can help rule out network or server issues in situations where Mercurial appears to be hanging during data fetching. (e.g, if hg had downloaded all of the data but was taking a while to process it, the debug output would show this.)
Reviewed By: DurhamG
Differential Revision: D24097942
fbshipit-source-id: bf9b065e7b97fc7ffe50ab74b1b13e2fe364755c
Summary: HostInfoProperties is allocated for every HostInfo and is accessed on every request. There's no reason this should be a unique_ptr, and the pointer indirection is expensive.
Reviewed By: jmswen
Differential Revision: D24009296
fbshipit-source-id: 2034d1c6e61e0dec51ca6ac7bd14ab12e74966d4
Summary:
Previously phase calculation was done via a simple ancestor check. This
was very slow in cases that required going far back into the graph. Going a year
back could take a number of seconds.
To fix it, let's take the Rust phaseset logic and rework it to make only_both
produce an incremental public nodes set. In a later diff we can switch the
phaseset function to use this as well, but right now phaseset returns IdSet, and
that would need to be changed to Set, which may have consequences. So I'll do it
later.
Reviewed By: quark-zju
Differential Revision: D24096539
fbshipit-source-id: 5730ddd45b08cc985ecd9128c25021b6e7d7bc89
Summary:
This is one more fix to use correct commit sync config version. In particular,
this diff fixes a case where a single parent commit was rewritten out. E.g.
if a large repo commit touches only files that do not remap in a small repo. In
that case we still want to record correct mapping so that all descendants used
the correct mapping as well.
Reviewed By: ikostia
Differential Revision: D24109221
fbshipit-source-id: bcdbb01b964d70227dff8363e77964716a345261
Summary:
Let's move initialization into a separate function. I'm planning to use it in
the next diff for another test
Reviewed By: ikostia
Differential Revision: D24109222
fbshipit-source-id: 73142dd46ef3de15ff381670ed6d5e31653c5dd4
Summary:
Previously fetch_bonsai_range returned all commits between `ancestor` and
`descendant`, but `ancestor` was included. This is usually not what we want and
it might be surprising and can lead to subtle bugs. As an example, next commit
in the stack might have failed pushrebases when it shouldn't do that.
This diff changes the semantic of the function to exclude an ancestor. This
function was used for 2 use cases:
1) Find changed files. find_rebased_set function was manually removing the
ancestor anyway, so there's no change in behaviour
2) To check that there are no case conflicts. Previously we were checking the
case conflicts with ancestor included, but that wasn't necessary. To prove that
let's go over the two possible situation:
i) This is a first iteration of the pushrebase
```
CB
SB |
| ...
... CA
SA
| /
root
```
in that case files introduced by root commit will be used to check if we have
case conflicts or not. But this is not necessary, because pushrebase assumption
is that CA::CB should not introduce any new case conflicts. Besides, even if
they added a case conflict then checking with just the files that were changed by root commit is
not enough to verify that.
Similar logic goes to SA::SB commits. Checking if root has any conflicts with
SA::SB commits doesn't make sense.
ii) This is not the first iteration of the pushrebase
```
CB
SB |
| ...
... CA
SA
|
O <- latest pushrebase attempt
... <- we rebased over these commits on the previous attempts
| /
root
```
In this case it's even easier. Commit O was verified on the previous iteration,
so no need to add it here again.
Reviewed By: aslpavel
Differential Revision: D24110710
fbshipit-source-id: 90dff253cba0013e9d5e401474132a152d473cae
Summary:
The SpawnedProcess tests were failing on my macOS machine because pwd
and getcwd returned slightly different paths. Normalize them before
comparing.
Reviewed By: genevievehelsel
Differential Revision: D24094634
fbshipit-source-id: aacf802280b1dd1de19797604bfe359d7e60cbf8
Summary:
A couple of files were moved but test-check-code.t wasn't updated to reflect
this, causing it to fail.
Reviewed By: DurhamG
Differential Revision: D24113079
fbshipit-source-id: 9a0c0b6f07a6532715bf5ee401036ded0a05b16a
Summary:
Pull Request resolved: https://github.com/facebookexperimental/eden/pull/65
Using $LOCALIP will ensure more consistent behavior when setting up the server in ipv4 or ipv6.
The LOCALIP variable was also abused when it was used to override ssh client address, so SSH_IP_OVEERIDE env was created here.
Lastly the result of `curl` call is now printed whenever the test failed to verify that Mononoke is running.
Reviewed By: farnz
Differential Revision: D24108186
fbshipit-source-id: e4b68dd2c5dd368851f0b00064088ffc442e31e0
Summary: D24070707: `[Thrift] Provide sorted fields to read_field_begin` made a change to the generated rust thrift files, so the eden/scm thrift files have to be regenerated to fix the build.
Reviewed By: farnz
Differential Revision: D24109655
fbshipit-source-id: e8575a76642673a11514fdce8e30f13ca28151f0
Summary:
Normally, sync logic infers `CommitSyncConfigVersion` to use from parent commits (or from current version for root commits). However, for test purposes it is convenient to force a version override This logic does not change any of the production behaviors, and will be used in a later diff.
TODO: can it ever be needed beyond tests? I've thought about using this for "version boundary" commits, but those would probably just be constructed while completely bypassing the sync logic.
TBH, I am not certain this diff is a good change. But I've spend a very large amount of time crafting the repos used in the `sync_merge` tests later in this stack, so I am proposing to land this, then spend some time refactoring sync tests (and hopefully making it easier to craft test repos), then removing this logic. Obviously, this logic should only be landed if we land the tests in the first place.
Reviewed By: StanislavGlebik
Differential Revision: D24104101
fbshipit-source-id: 0825f04ed74532e89fd5f1fbebeee5f2001fedcd
Summary: It is sometimes very convenient to just inject new DAGs into existing repos.
Reviewed By: StanislavGlebik
Differential Revision: D24103164
fbshipit-source-id: abdfa18acb2f2fb1475b601a7eccb57e006982ec
Summary: No need to allocate a new vector if we just need to remove items from the current one.
Reviewed By: StanislavGlebik
Differential Revision: D24088319
fbshipit-source-id: 10804d925f20fe8dd1e2bb8500aa06d30bd367c1
Summary:
This just adds a single fn. I did not come up with a better place/name to put
it, suggestions are welcome. Seems generic enough to belong at the top-level
common location.
I've already needed this twice, so decided to extract. Second callsite will be further in the stack.
Reviewed By: StanislavGlebik
Differential Revision: D24080193
fbshipit-source-id: c3e0646f263562f3eed93f1fdbab9a076729f33c
Summary: `clippy` often complains about the use of `.len() != 0`, `.len() > 0` or `.len() == 0`and proposes to use `.is_empty()` instead. This diff does that across Mononoke.
Reviewed By: aslpavel
Differential Revision: D24099427
fbshipit-source-id: 1bba2f958485b7efb3f41bf3eae820879c92b0e5
Summary:
Formats a subset of opted-in Python files in fbsource.
Black formatting was applied first, which is guaranteed
safe as the AST will not have changed during formatting.
Pyfmt was then run, which also includes import sorting.
The changes from isort were manually reviewed, and
some potentially dangerous changes were reverted,
and the directive was added to those
files. A final run of pyfmt shows no more changes to
be applied.
Reviewed By: zertosh
Differential Revision: D24101830
fbshipit-source-id: 0f2616873117a821dbc6cfb6d8e4f64f4420312b
Summary: This can be used by dag::Vertex and minibytes::Bytes.
Reviewed By: kulshrax
Differential Revision: D23966985
fbshipit-source-id: 3b4b29648e038ef49f26ce2b500119e148544d9e
Summary:
The py_stream_class causes the code to be more verbose. It basically enforces
the bindings crate to define new types wrapping pure Rust types, and then
define py_stream_class.
In a future diff, I'm adding FromPyObject/ToPyObject support for types that
implements serde Deserialize/Serialize. py_stream_class gets in the way,
because the blanket type from cpython-ext cannot be used in the py_stream_class
macro. cpython-ext is not the proper place to define business-related stream
types.
Therefore, define a type-erased Python class, and implement
FromPyObject/ToPyObject automatically for TStream<anyhow::Result<T>> where
T implements FromPyObject or ToPyObject.
The FromPyObject now converts a Python iterator back to a stream. It's
no longer zero-cost. However, I'd imagine such usecases can be short-cut
using pure Rust code.
Background: Initially, I added some FromPyObject/ToPyObject impls to pure
Rust crates gated by a "pytypes" feature. While that works fine with cargo
build, buck does not support dynamic features and the fact that we support
both py2 and py3 makes it extremely hard to support cleanly in buck build.
For example, if minibytes::Bytes defines ToPyObject for Bytes, then any
crate using minibytes would have 2 different versions: a py2 version, a
py3 version, and they both depend on python. That seems to be a bad approach.
Reviewed By: sfilipco
Differential Revision: D23966984
fbshipit-source-id: eafb31ad458dcbdd8e970d8e419a10fbbe30595f
Summary:
Per the feedback on D23920367 (318f5683a5), let's make the human-readable download stats shorter. Example:
```
Downloaded 10.59 MiB in 12.35s over 5 requests (7.19 Mb/s, latency: 123ms)
```
The amount downloaded is now reported in binary-prefixed bytes (so that it can be directly compared to file sizes) whereas the transfer rate is reported in decimal-prefixed bits per second (so that it can be directly compared to a user's measured network speed).
Additionally, we now use the default formatting available from `std::time::Duration`, which will automatically choose the appropriate display units.
Reviewed By: quark-zju
Differential Revision: D24096525
fbshipit-source-id: 39c49f1b08135bbae7a7544b1ffe2bdbfe1533a1
Summary:
The bfs fetching path in Rust was broken because it directly called
getdesignatednodes. getdesignatednodes returns False if it wasn't able to
succeed, so we need to be able to fall back in that case. The _prefetchtrees
function is meant for that, so let's just call that.
Reviewed By: singhsrb
Differential Revision: D24090946
fbshipit-source-id: d16c2c8f80f690a22046385f0e95785996a62949
Summary:
Previously, only the first `__iter__` gets wrapped. With D23095468 (34df768136), the first
`__iter__` is used by the "simplify graph" feature, not the main iteration
loop rendering the graph log output, causing the prefetch feature to fail.
File "edenscm/mercurial/commands/__init__.py", line 4196, in log
return cmdutil.graphlog(ui, repo, pats, opts)
# pats = ()
File "edenscm/mercurial/cmdutil.py", line 3250, in graphlog
ui, repo, revdag, displayer, graphmod.asciiedges, getrenamed, filematcher
File "edenscm/mercurial/cmdutil.py", line 3106, in displaygraph
rustdisplaygraph(ui, repo, dag, displayer, getrenamed, filematcher, props)
File "edenscm/mercurial/cmdutil.py", line 3208, in rustdisplaygraph
for (rev, _type, ctx, parents) in dag:
File "edenscm/mercurial/graphmod.py", line 63, in dagwalker
rootnodes = cl.tonodes(revs)
# revs = <baseset- [7408158, 72057594037927936, ...]
File "edenscm/mercurial/changelog2.py", line 196, in tonodes
return self.inner.tonodes(revs)
# revs = <baseset- [7408158, 72057594037927936, ...]
File "edenscm/hgext/phabstatus.py", line 281, in next
return self.__next__()
Lift the "first time" limitation for wrapping `__iter__` to solve the problem.
Reviewed By: simpkins
Differential Revision: D24066918
fbshipit-source-id: 6bbd244e729724e5143147bde60bcb4c8ee4bc80
Summary: Since this is only used in the manifest target, fold it into it.
Reviewed By: DurhamG
Differential Revision: D24062629
fbshipit-source-id: c3241b53bde7abba8a80a2945661d1a24b7e3034
Summary: We now get progress bar output when fetching from memcache!
Reviewed By: kulshrax
Differential Revision: D24060663
fbshipit-source-id: ff5efa08bced2dac12f1e16c4a55fbc37fbc0837
Summary:
Less #ifdef is always better, since normalizing a path also works on
non-Windows, there is no reason to special case Windows here.
Reviewed By: fanzeyi
Differential Revision: D24020603
fbshipit-source-id: 114dae3bd9a4743230f4c82d219ff74ffc9379c7
Summary:
In c++, to zero-initialize, empty braces is sufficient for a non-class type.
{0} will only zero-initialize the first field, which is not the intent here.
Reviewed By: wez
Differential Revision: D24068481
fbshipit-source-id: 2de87da983a05f25e0222bf5338533a7b96fb36a