Summary: Not that useful and does not align with the direction we are headed.
Reviewed By: quark-zju
Differential Revision: D22213796
fbshipit-source-id: ffd86fc1a9207c134448836d0e54e48510a11135
Summary:
updated `eden top` to:
- obtain PID-fetchCounts data from the updated `getAccessCounts` thrift call in the previous diff
- display that data in a new column `FUSE FETCH`
Reviewed By: kmancini
Differential Revision: D22101430
fbshipit-source-id: 6584e71ce3a4629c73469607ca0a4c6ffd63e46f
Summary:
This diff does three things:
- moves existing `CommitSyncConfig` validation from `config.rs` into
`convert/commit_sync.rs`, so that any user of `impl Convert for
RawCommitSyncConfig` gets it for free
- adds another thing to validate `CommitSyncConfig` against (large repo is one
of the small repos)
- adds `RawCommitSyncConfig` validation for something that can be lost when
looking at `CommitSyncConfig` (no duplicate small repos).
Reviewed By: markbt
Differential Revision: D22211897
fbshipit-source-id: a9820cc8baf427da66ce7dfc943e25eb67e1fd6e
Summary:
This diff updates `getAccessCounts` to
- obtain the PID-fetchCount map data added in the previous diff
- put that data into its `result`
Reviewed By: kmancini
Differential Revision: D22101232
fbshipit-source-id: 1d41715339d418b03c17f6c93a7a497b432973ae
Summary:
Both `optional` and `pid_t` weren't found, and the right includes needed to be
provided. On Windows, the ProcessNameCache isn't compiled (yet), and since it
looks like the process name is optional in the BackingStoreLogger, let's not
provide it for now.
Reviewed By: fanzeyi
Differential Revision: D22215581
fbshipit-source-id: 31a7e7be62cd3d14108dc437d3dfabfb9e62f8d5
Summary:
The sshaskpass extension forks the current process and attempts to do
some tty magic. If there was an exception though, the exception could go up the
stack and trigger the resumption of the pull logic, resulting in pull executing
twice.
The fix is to move the `_silentexception(terminate=True)` context to wrap the
entire child process, so we're guaranteed to exit in all cases.
I also fixed the str-vs-byte issue that caused the original exception.
Reviewed By: xavierd
Differential Revision: D22211476
fbshipit-source-id: 5f5ca6b33b425e517650f9a83cab605a4c9783de
Summary: The extension was deprecated by the eol extension.
Reviewed By: DurhamG
Differential Revision: D22129826
fbshipit-source-id: 293a57b4039f424154955454e0a7a74dc7d23069
Summary:
The revision is passed down as a 40-byte ASCII string, and therefore it needs
decoding before being usable.
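A minimal sketch of the issue (the helper name is illustrative): the revision arrives as a 40-byte ASCII hex node, i.e. bytes in Python 3, and must be decoded before it can be compared with or formatted as str.

```python
def normalize_rev(rev):
    # A hex node arriving from the wire is bytes in Python 3; decode it
    # to str so the rest of the (text-mode) code can use it directly.
    if isinstance(rev, bytes):
        return rev.decode("ascii")
    return rev


node = b"a" * 40
assert normalize_rev(node) == "a" * 40
assert normalize_rev("deadbeef") == "deadbeef"  # already str: unchanged
```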
Reviewed By: DurhamG
Differential Revision: D22210203
fbshipit-source-id: b84bca5f89cbe4f267de1281c1a9ed55409174d2
Summary:
In python3, sys.std{in,out} are opened in text mode, but when actual fds are
passed in, we fdopen them in bytes mode. Since the code expects these to be
bytes, let's fdopen them in bytes mode too.
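The fix can be sketched as follows (illustrative, not the actual call site): fdopen the inherited fds in binary mode so reads and writes deal in bytes, matching what the rest of the code expects.

```python
import os


def open_binary(infd, outfd):
    # "rb"/"wb" yield bytes streams, unlike the text-mode sys.stdin/stdout
    # objects Python 3 sets up by default.
    return os.fdopen(infd, "rb"), os.fdopen(outfd, "wb")


r, w = os.pipe()
fin, fout = open_binary(r, w)
fout.write(b"hello")
fout.close()
assert fin.read() == b"hello"  # bytes, not str
fin.close()
```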
Reviewed By: DurhamG
Differential Revision: D22196541
fbshipit-source-id: b913267918af1c3bdf819e243a384312a2a27df0
Summary: When running integration tests we should make the paths absolute, but they were kept relative so far, which broke the tests.
Reviewed By: krallin
Differential Revision: D22209498
fbshipit-source-id: 54ca3def84abf313db32aecfac503c7a42ed6576
Summary:
So automation can detect abandoned transactions and run `hg recover`,
define an exit code. None of the codes in sysexits.h seemed to fit this case, so
start with 90.
Reviewed By: quark-zju
Differential Revision: D22198980
fbshipit-source-id: 5f267a2671c843f350668daaa14de34752244d4b
Summary: 6.5.19 is now available, switch OSS to pick that instead of old 6.5.17.
Reviewed By: rsunkad
Differential Revision: D22199286
fbshipit-source-id: 231346df8d2f918d2226cfe17b01bde12c18a5a7
Summary:
If we don't read the body for a response, then Hyper cannot return the
connection to the pool. So, let's do it automatically upon dropping. This will
typically happen when we send a request to upstream then don't read the
response.
I seem to remember this used to work fine at some point, but looking at the
code I think it's actually broken now and we don't reuse upstream connections
if we skip waiting for upstream in a batch request. So, let's fix it once and
for all with a more robust abstraction.
Reviewed By: HarveyHunt
Differential Revision: D22206742
fbshipit-source-id: 2da1c008556e1d964c1cc337d58f06f8d691a916
Summary:
This was old Tokio 0.1 code that needed channels for spawns, but in 0.2 that
is actually built into `tokio::spawn`, so let's use that.
Reviewed By: HarveyHunt
Differential Revision: D22206738
fbshipit-source-id: 8f89ca4f7afc8dd888fe289e8d597148976cc54c
Summary:
This fixes a bit of a tech debt item in the LFS Server. We've had this
discard_stream function for a while, which was necessary because if you just
drop the data stream, you get an error on the sending end.
This makes the code more complex than it needs to be, since you need to always
explicitly discard data streams you don't want instead of just dropping them.
This fixes that by letting us support a sender that tolerates the receiver
being closed, and just ignores those errors.
Reviewed By: HarveyHunt
Differential Revision: D22206739
fbshipit-source-id: d209679b20a3724bcd2e082ebd0d2ce10e9ac481
Summary:
We have a lot of integration tests for LFS, but a handful of unit tests don't
hurt for some simpler changes. Let's make it easier to write those.
Reviewed By: HarveyHunt
Differential Revision: D22206741
fbshipit-source-id: abcb73b35c01f28dd54cc543cd0a746327d3787b
Summary:
This diff is probably going to sound weird ... but xavierd and I both think
this is the best approach for where we are right now. Here is why this is
necessary.
Consider the following scenario:
- A client creates a LFS object. They upload it to Mononoke LFS, but not
upstream.
- The client shares this (e.g. with Sandcastle), and includes a LFS pointer.
- The client tries to push this commit
When this happens, the client might not actually have the object locally.
Indeed, the only data the client is guaranteed to have is locally-authored
data.
Even if the client does have the blob, that's going to be in the hgcache, and
uploading from the hgcache is a bit sketchy (because, well, it's a cache, so
it's not like it's normally guaranteed to just hold data there for us to push
it to the server).
The problem boils down to a mismatch of assumptions between client and server:
- The client assumes that if the data wasn't locally authored, then the server
must have it, and will never request this piece of data again.
- The server assumes that if the client offers a blob for upload, it can
request this blob from the client (and the client will send it).
Those assumptions are obviously not compatible, since we can serve
not-locally-authored data from LFS and yet want the client to upload it, either
because it is missing in upstream or locally.
This leaves us with a few options:
- Upload from the hg cache. As noted above, this isn't desirable, because the
data might not be there to begin with! Populating the cache on demand (from
the server) just to push data back to the server would be quite messy.
- Skip the upload entirely, either by having the server not request the upload
if the data is missing, by having the server report that the upload is
optional, or by having the client not offer LFS blobs it doesn't have to the
server, or finally by having the client simply disobey the server if it
doesn't have the data the server is asking for.
So, why can we not just skip the upload? The answer is: for the same reason we
upload to upstream to begin with. Consider the following scenario:
- Misconfigured client produces a commit, and uploads it to upstream.
- Misconfigured client shares the commit with Sandcastle, and includes a LFS
pointer.
- Sandcastle wants to push to master, so it goes to check if the blob is
present in LFS. It isn't (Mononoke LFS checks both upstream and internal, and
only finds the blob in upstream, so it requests that the client submit the
blob), but it's also not locally authored, so we skip the push.
- The client tries to push to Mononoke
This push will fail, because it'll reference LFS data that is not present in
Mononoke (it's only in upstream).
As for how we fix this: the key guarantee made by our proxying mechanism is
that if you write to either LFS server, your data is readable in both (the way
we do this is that if you write to Mononoke LFS, we write it to upstream too,
and if you write to upstream, we can read it from Mononoke LFS too).
What does not matter there is where the data came from. So, when the client
uploads, we simply let it submit a zero-length blob, and if so, we take that to
mean that the client doesn't think it authored data (and thinks we have it), so
we try to figure out where the blob is on the server side.
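In rough pseudo-Python (the store names, lookup order, and function signature are illustrative, not the actual Mononoke code), the server-side handling described above looks something like:

```python
def handle_upload(oid, body, internal, upstream):
    # A zero-length body means "I didn't author this and don't have it":
    # resolve the blob server-side instead of reading it from the client.
    if len(body) == 0:
        blob = internal.get(oid) or upstream.get(oid)
        if blob is None:
            raise LookupError("blob not found on either server")
    else:
        blob = body
    # Preserve the key invariant: data written to either LFS server is
    # readable from both.
    internal[oid] = blob
    upstream[oid] = blob
    return blob


internal, upstream = {}, {"oid1": b"payload"}
handle_upload("oid1", b"", internal, upstream)
assert internal["oid1"] == b"payload"  # now present on both sides
```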
Reviewed By: xavierd
Differential Revision: D22192005
fbshipit-source-id: bf67e33e2b7114dfa26d356f373b407f2d00dc70
Summary:
In Python3, array indexing into a byte string returns a int, not a string.
Let's instead use the struct module to extract a byte string out of it that we
can then decode afterwards.
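The difference can be shown in a couple of lines; both the slicing and the `struct` spelling return bytes on both Python versions:

```python
import struct

data = b"\x01rest-of-record"

# Python 2: data[0] is "\x01" (a one-byte str).
# Python 3: data[0] is 1 (an int), so calling .decode() on it fails.
# struct.unpack_from with the "c" format yields a bytes object on both:
(first,) = struct.unpack_from("c", data, 0)
assert first == b"\x01"
assert data[0:1] == b"\x01"  # equivalent slicing spelling
```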
Reviewed By: DurhamG
Differential Revision: D22097226
fbshipit-source-id: e6b306b4d3bcf2ba08422296603b56fcadbb636e
Summary:
These were broken mostly due a test issue (using bytes instead of str), and a
small difference when printing "units_per_sec", which was an int in python2,
and was computed as a float in python3.
Reviewed By: DurhamG
Differential Revision: D22095813
fbshipit-source-id: 8af8332dad5366d2c168485b120a984ff1ba558a
Summary:
Due to the design of Thrift `include` statements in fbcode, the Thrift structures have to live in folders that mirror the folder layout inside fbcode.
This diff changes the folder layout in Cargo.toml files and in fbcode_builder; a follow-up diff will change this for ShipIt as well.
Reviewed By: ikostia
Differential Revision: D22208707
fbshipit-source-id: 65f9cafed2f0fcc8398a3887dfa622de9e139f68
Summary:
We've had a case where a client sent a bundle that contained an LFS pointer to
Sandcastle. On that original repo, the LFS blob and pointer were in the
hgcache, which signified that the server has it. When that bundle gets applied
on Sandcastle, the pointer then makes its way onto the local lfs pointer store
and the blob will be fetched from the LFS server itself properly.
Where it gets tricky is that Sandcastle then tried to re-bundle the change and
thus tried to re-upload it. The expectation from the client perspective is that
since the blob was fetched from the server in the first place, the server will
just not ask for the blob to be re-uploaded. That assumption proved to be
semi-inaccurate in the case where the LFS server mirrors all the writes to a
secondary LFS server. When that secondary LFS server is standalone and not
Mononoke, it may not have the blob in question, and thus the LFS server may
request the blob to be re-uploaded so it can write it to the secondary LFS
server.
Unfortunately, things start breaking down at this point. Since the blob isn't
present in the local store, the client can't rely on being able to read it, and
fetching it from the server to then upload it is silly and a bit too complex.
Instead, what we can do is teach the server of this specific scenario by
manually setting the Content-Length to 0 in the case of an upload where the
blob wasn't found in the local store. The server recognizes this and will
manually copy the blob to the secondary LFS server.
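The client-side half of this scenario can be sketched roughly as follows (the helper name and store shape are illustrative, not the actual extension code):

```python
def prepare_upload(oid, local_store):
    # If the blob isn't in the local store (it was fetched, not authored),
    # send an empty body; the server takes Content-Length: 0 to mean it
    # should copy the blob to the secondary LFS server itself.
    blob = local_store.get(oid)
    if blob is None:
        return {"Content-Length": "0"}, b""
    return {"Content-Length": str(len(blob))}, blob


headers, body = prepare_upload("missing-oid", {})
assert headers["Content-Length"] == "0" and body == b""
```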
Reviewed By: krallin
Differential Revision: D22192718
fbshipit-source-id: 67c5ba1a751cc07d5d5d51e07703282d8175b010
Summary:
If a commit changes the mode (i.e. executable, symlink or regular) of a lot of files but
doesn't change their content then we don't need to put these filenodes to the
generated bundle. Mercurial stores mode in manifest, so changing the mode
doesn't change the filenode.
Reviewed By: ikostia
Differential Revision: D22206736
fbshipit-source-id: f64ee8a34281cd207c92653b927bf9109ccbe1b4
Summary:
The `amend` command can take a list of files (or file patterns) to limit the
amend to a subset of files. Make this clear in the help text.
Reviewed By: ikostia, krallin
Differential Revision: D22206906
fbshipit-source-id: 9f26b95a628bb5bbf5eae92c16f8852f44207d25
Summary:
I landed D22118926 (e288354caf) yesterday expecting those messages at about the same time
xavierd landed D21987918 (4d13ce1bcc), which removed them. This removes them from the
tests.
Reviewed By: StanislavGlebik
Differential Revision: D22204980
fbshipit-source-id: 6b1d696c93a07e942f86cd8df9a8e43037688728
Summary:
If the nodemap file is seriously out of sync, it's harder to recover on Windows,
because nodemap mmaps held by other processes prevent `atomic_write`.
Use `util::path::remove_file` to remove the old mmap files and improve the
success rate of nodemap rewrites.
Reviewed By: xavierd
Differential Revision: D22168134
fbshipit-source-id: 577df978b3175eac3257794f5e5ff7522cf21871
Summary:
Make it possible to delete files that are mmap-ed. This avoids the extra
complexity of cleaning leftover files.
Reviewed By: xavierd
Differential Revision: D22168135
fbshipit-source-id: edb33a75638c668945feae49608c3fff25e742a4
Summary:
Update snapshot command for python3 compatibility.
This one was a bit weird, and there are some future changes we could make if we care. First, our CBOR implementation doesn't support CBOR strings (utf8 strings), only CBOR bytestrings (just byte sequences). This would be easy to add, but would be binary-incompatible with previous snapshots, if we made no other changes.
We also don't take full advantage of the supported types we already have; i.e., we encoded version (an integer in Python) as a string, then wrote that string as a CBOR bytestring, which leads to the very awkward `encodeutf8(str(version))` you see in this change, instead of just plain `version`, which would round-trip perfectly fine as far as I can tell, but would break compatibility with existing snapshots.
As is, I made only the most conservative change: Python 2 snapshots should be binary-identical to Python 3 snapshots, and both should be compatible with snapshots made before this change. We could get rid of the vast majority of the conversions by simply adding CBOR string support, but that would be a breaking change without special handling.
Reviewed By: quark-zju
Differential Revision: D22195085
fbshipit-source-id: cead42db0c5f84b99e305e5a7f1d839a35becac6
Summary:
Move utility function _getoptbwrapper out of Python2-gated block.
Convert undo.py to use integer division in both python2 and python3.
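The division change boils down to this (a generic illustration, not a line from undo.py):

```python
# "/" floors for ints in Python 2 but yields a float in Python 3;
# "//" floors in both, so it is the portable spelling for index math.
total, step = 7, 2
mid = total // step
assert mid == 3
assert isinstance(mid, int)  # 7 / 2 would be 3.5 (a float) in Python 3
```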
Reviewed By: xavierd
Differential Revision: D22125861
fbshipit-source-id: 9a283065c98eb122c4488285870e026748257b3c
Summary:
With the Rust stores being enabled by default in the code, time to actually
remove the code that wasn't using it. A lot of code still remains as the
treemanifest code still uses it. Migrating the treemanifest code to use the
Rust store would allow for even more code to be removed!
Reviewed By: DurhamG
Differential Revision: D21987918
fbshipit-source-id: 8e0becd4a1fb2ac81e26b8517d13e6d06ab06f65
Summary:
Now that the workers are in Rust, we no longer need the forker version in
Python. For now, the Python LFS extension still uses the threaded worker so
keep this one for now, when that extension will be removed we can remove the
rest of the worker code.
In theory, not all repositories would benefit from the Rust workers, but those
are not supported at FB since they aren't remotefilelog based.
Reviewed By: DurhamG
Differential Revision: D21987295
fbshipit-source-id: d17b9730651671608cf13f7abe6a9bb32251e140
Summary:
The Rust store code has been enabled everywhere for a few weeks now, let's
enable it by default in the code. Future changes will remove the config as well
as all the code associated with the non Rust store code.
The various test changes are due to small differences between the Rust code and
the Python one, the biggest being its handling of corrupted packfiles. The
old code silently ignored them, while the new one errors out for local
packfiles. The test-lfs-bundle.t difference is just due to an ordering
difference between Python and Rust.
Reviewed By: kulshrax
Differential Revision: D21985744
fbshipit-source-id: 10410560193476bc303a72e7583f84924a6de820
Summary:
Whenever data is redacted on the server, a specific tombstone is returned when
fetching it. Make sure that whenever we update the file on disk, we write a
nice message to the user instead of the tombstone itself.
While this code could have been moved into the Rust store code itself, I prefer
to leave it to its users to decide what to do with redacted data. EdenFS for
instance may want to prevent access to it instead of showing the redacted
message.
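The substitution itself is simple; in sketch form (the tombstone value and message text here are made up for the example, and the real logic sits in the store's consumers, not in this shape):

```python
# Assumed marker and replacement text, purely for illustration.
TOMBSTONE = b"<<redacted-by-server>>"
FRIENDLY = b"This file's contents have been redacted.\n"


def materialize(data):
    # Swap the tombstone for a readable message before writing to disk;
    # any other content passes through untouched.
    return FRIENDLY if data == TOMBSTONE else data


assert materialize(TOMBSTONE) == FRIENDLY
assert materialize(b"normal contents") == b"normal contents"
```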
Reviewed By: kulshrax
Differential Revision: D21999345
fbshipit-source-id: 39a83cdf5ea4567628a13fbd59520b9677aba749
Summary:
On Windows, `pathlib.Path.resolve` has a use-of-uninitialized-memory bug in Python 3.6.2. Lego Windows has this version of Python 3 installed.
When `eden prefetch` runs, it will first attempt to read the config file at the root of the repository (`$REPOROOT/.eden/config`). Because of this bug, when we normalize the path in edenfsctl, it will get some random bytes as a result. This subsequently causes `toml.read` to fail as it is unable to read a path containing a NUL byte.
As you can see, this is the same exception as we see on Windows:
```
In [1]: open("test\x00")
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
<ipython-input-1-57764b20d660> in <module>
----> 1 open("test\x00")
ValueError: embedded null byte
```
Reviewed By: xavierd
Differential Revision: D22177997
fbshipit-source-id: ab2565c8946d809bc15bc1597b749fb5e9440ca0
Summary:
The dynamicconfig logic had an assert checking that a config was not
added by dynamicconfig twice. This isn't actually a valid assert, since the
caller could load the config as many times as it wanted.
This happened in hg doctor, where it loaded the ui once manually (without a repo
object) then it loaded it again during the creation of the repo object. I went
ahead and cleaned up hg doctor to not do this, but let's get rid of the assert
anyway.
Reviewed By: quark-zju
Differential Revision: D22194273
fbshipit-source-id: 9491d35fe14523ad3d9cb69b4ca0d615149dd0f0