Summary:
If we don't read the body of a response, then Hyper cannot return the
connection to the pool. So, let's do it automatically upon drop. This will
typically happen when we send a request to upstream and then don't read the
response.
I seem to remember this used to work fine at some point, but looking at the
code I think it's actually broken now: we don't reuse upstream connections
if we skip waiting for upstream in a batch request. So, let's fix it once and
for all with a more robust abstraction.
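The idea can be sketched in Python (Hyper's real implementation is Rust, and all names here are illustrative): a response wrapper drains any unread body before handing the connection back, so the pool gets the connection whether or not the caller read the response.

```python
class PooledResponse:
    """Sketch of "drain on drop": if the caller never reads the body,
    consume it anyway so the connection can go back to the pool."""

    def __init__(self, body_iter, return_conn):
        self._body = body_iter          # iterator over body chunks
        self._return_conn = return_conn # callback returning the conn to the pool
        self._returned = False

    def read_all(self):
        data = b"".join(self._body)
        self._finish()
        return data

    def _finish(self):
        if not self._returned:
            for _ in self._body:  # drain whatever wasn't read
                pass
            self._returned = True
            self._return_conn()

    def __del__(self):
        # caller skipped reading the body: drain automatically
        self._finish()
```

The `_returned` flag ensures the connection is returned exactly once, whether the body was read explicitly or drained on drop.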
Reviewed By: HarveyHunt
Differential Revision: D22206742
fbshipit-source-id: 2da1c008556e1d964c1cc337d58f06f8d691a916
Summary:
This was old Tokio 0.1 code that needed channels for spawns, but in 0.2 that
functionality is built into tokio::spawn, so let's use it.
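By analogy (shown here in Python's asyncio rather than Rust): the handle returned by spawn is itself awaitable, so no side channel is needed to get the result back, which is what Tokio 0.2's JoinHandle provides over 0.1.

```python
import asyncio

async def work():
    return 42

async def main():
    # the spawn handle doubles as the result channel, much like
    # tokio::spawn's JoinHandle in Tokio 0.2
    task = asyncio.create_task(work())
    return await task

result = asyncio.run(main())  # → 42
```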
Reviewed By: HarveyHunt
Differential Revision: D22206738
fbshipit-source-id: 8f89ca4f7afc8dd888fe289e8d597148976cc54c
Summary:
This fixes a bit of a tech debt item in the LFS Server. We've had this
discard_stream function for a while, which was necessary because if you just
drop the data stream, you get an error on the sending end.
This makes the code more complex than it needs to be, since you always need to
explicitly discard data streams you don't want instead of just dropping them.
This fixes that by introducing a sender that tolerates the receiver being
closed, and just ignores those errors.
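The behavior can be sketched like this (illustrative Python, not the actual Rust channel API; the type names are made up): sending to a closed receiver becomes a no-op instead of an error, so dropping the receiving end is always safe.

```python
class Receiver:
    def __init__(self):
        self.items = []
        self.closed = False

    def close(self):
        self.closed = True

class TolerantSender:
    """Sender that tolerates the receiver being dropped/closed."""

    def __init__(self, receiver):
        self._receiver = receiver

    def send(self, item):
        if self._receiver.closed:
            return  # receiver is gone: silently discard instead of erroring
        self._receiver.items.append(item)
```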
Reviewed By: HarveyHunt
Differential Revision: D22206739
fbshipit-source-id: d209679b20a3724bcd2e082ebd0d2ce10e9ac481
Summary:
We have a lot of integration tests for LFS, but a handful of unit tests don't
hurt for some simpler changes. Let's make it easier to write those.
Reviewed By: HarveyHunt
Differential Revision: D22206741
fbshipit-source-id: abcb73b35c01f28dd54cc543cd0a746327d3787b
Summary:
This diff is probably going to sound weird ... but xavierd and I both think
this is the best approach for where we are right now. Here is why this is
necessary.
Consider the following scenario
- A client creates an LFS object. They upload it to Mononoke LFS, but not
upstream.
- The client shares this (e.g. with Sandcastle), and includes an LFS pointer.
- The client tries to push this commit
When this happens, the client might not actually have the object locally.
Indeed, the only data the client is guaranteed to have is locally-authored
data.
Even if the client does have the blob, it will be in the hgcache, and
uploading from the hgcache is a bit sketchy (because, well, it's a cache, so
it's not normally guaranteed to hold data there for us to push it to the
server).
The problem boils down to a mismatch of assumptions between client and server:
- The client assumes that if the data wasn't locally authored, then the server
must have it, and will never request this piece of data again.
- The server assumes that if the client offers a blob for upload, it can
request this blob from the client (and the client will send it).
Those assumptions are obviously not compatible, since we can serve
not-locally-authored data from LFS and yet want the client to upload it, either
because it is missing in upstream or locally.
This leaves us with a few options:
- Upload from the hg cache. As noted above, this isn't desirable, because the
data might not be there to begin with! Populating the cache on demand (from
the server) just to push data back to the server would be quite messy.
- Skip the upload entirely, either by having the server not request the upload
if the data is missing, by having the server report that the upload is
optional, by having the client not offer LFS blobs it doesn't have to the
server, or finally by having the client simply disobey the server if it
doesn't have the data the server is asking for.
So, why can we not just skip the upload? The answer is: for the same reason we
upload to upstream to begin with. Consider the following scenario:
- A misconfigured client produces a commit, and uploads it to upstream.
- The misconfigured client shares the commit with Sandcastle, and includes an
LFS pointer.
- Sandcastle wants to push to master, so it goes to check if the blob is
present in LFS. It isn't (Mononoke LFS checks both upstream and internal, and
only finds the blob in upstream, so it requests that the client submit the
blob), but it's also not locally authored, so we skip the push.
- The client tries to push to Mononoke
This push will fail, because it'll reference LFS data that is not present in
Mononoke (it's only in upstream).
As for how we fix this: the key guarantee made by our proxying mechanism is
that if you write to either LFS server, your data is readable in both (the way
we do this is that if you write to Mononoke LFS, we write it to upstream too,
and if you write to upstream, we can read it from Mononoke LFS too).
What does not matter there is where the data came from. So, when the client
uploads, we simply let it submit a zero-length blob; if it does, we take that
to mean that the client doesn't think it authored the data (and thinks we have
it), so we try to figure out where the blob is on the server side.
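The server-side decision can be sketched like this (hypothetical Python with dict-backed stores; the function and store names are made up, the real implementation is the Rust LFS server): a zero-length upload means "I didn't author this, find it yourself", and the server restores the invariant that both stores hold the blob.

```python
def handle_upload(oid, body, internal, upstream):
    """internal/upstream are dict-like blob stores keyed by oid (sketch)."""
    if len(body) == 0:
        # client didn't author the blob: locate it server-side and
        # make sure both stores end up with a copy
        if oid in internal:
            upstream.setdefault(oid, internal[oid])
        elif oid in upstream:
            internal[oid] = upstream[oid]
        else:
            raise LookupError("blob %s not found on either side" % oid)
    else:
        # normal upload: write to both, preserving the invariant that
        # a write to either server is readable from both
        internal[oid] = body
        upstream[oid] = body
```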
Reviewed By: xavierd
Differential Revision: D22192005
fbshipit-source-id: bf67e33e2b7114dfa26d356f373b407f2d00dc70
Summary:
In Python 3, indexing into a byte string returns an int, not a string.
Let's instead use the struct module to extract a byte string out of it that we
can then decode afterwards.
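For example, the "c" format character always yields a one-byte bytes object, on both Python 2 and Python 3, unlike plain indexing:

```python
import struct

data = b"\x9cabc"

# Python 3: indexing a bytes object yields an int
assert data[0] == 0x9C

# struct's "c" format always yields a length-1 bytes object on both
# Python 2 and Python 3, which can then be decoded or compared uniformly
(first,) = struct.unpack_from(">c", data, 0)
assert first == b"\x9c"
```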
Reviewed By: DurhamG
Differential Revision: D22097226
fbshipit-source-id: e6b306b4d3bcf2ba08422296603b56fcadbb636e
Summary:
These were broken mostly due to a test issue (using bytes instead of str), and
a small difference when printing "units_per_sec", which was an int in Python 2
but is computed as a float in Python 3.
Reviewed By: DurhamG
Differential Revision: D22095813
fbshipit-source-id: 8af8332dad5366d2c168485b120a984ff1ba558a
Summary:
Due to the Thrift design of "include" statements in fbcode, the Thrift structures have to be contained in folders that mirror the folder layout inside fbcode.
This diff changes the folder layout in Cargo.toml files and in fbcode_builder; a follow-up diff will change this for ShipIt as well.
Reviewed By: ikostia
Differential Revision: D22208707
fbshipit-source-id: 65f9cafed2f0fcc8398a3887dfa622de9e139f68
Summary:
We've had a case where a client sent a bundle that contained an LFS pointer to
Sandcastle. On the original repo, the LFS blob and pointer were in the
hgcache, which signified that the server has it. When that bundle gets applied
on Sandcastle, the pointer makes its way into the local LFS pointer store
and the blob is fetched from the LFS server itself properly.
Where it gets tricky is that Sandcastle then tries to re-bundle the change and
thus tries to re-upload it. The expectation from the client's perspective is
that since the blob was fetched from the server in the first place, the server
will simply not ask for the blob to be re-uploaded. That assumption proved to
be semi-inaccurate in the case where the LFS server mirrors all writes to a
secondary LFS server. When that secondary LFS server is standalone and not
Mononoke, it may not have the blob in question, and thus the LFS server may
request that the blob be re-uploaded so it can write it to the secondary LFS
server.
Unfortunately, things start breaking down at this point. Since the blob isn't
present in the local store, the client can't rely on being able to read it,
and fetching it from the server just to upload it back is silly and a bit too
complex.
Instead, what we can do is teach the server of this specific scenario by
manually setting the Content-Length to 0 in the case of an upload where the
blob wasn't found in the local store. The server recognizes this and will
manually copy the blob to the secondary LFS server.
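Sketched from the client side (hypothetical Python; the function and store names are made up, the real code is in the hg LFS client), the rule this diff teaches is:

```python
def upload_blob(oid, local_store, send):
    """send(oid, payload) performs the HTTP upload; a zero-length
    payload produces Content-Length: 0 (sketch)."""
    blob = local_store.get(oid)
    if blob is None:
        # we never authored this blob: upload zero bytes so the server
        # knows to copy it to the secondary LFS server itself
        send(oid, b"")
    else:
        send(oid, blob)
```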
Reviewed By: krallin
Differential Revision: D22192718
fbshipit-source-id: 67c5ba1a751cc07d5d5d51e07703282d8175b010
Summary:
If a commit changes the modes (i.e. executable, symlink, or regular) of a lot
of files but doesn't change their content, then we don't need to put these
filenodes into the generated bundle. Mercurial stores the mode in the
manifest, so changing the mode doesn't change the filenode.
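A simplified sketch of why: Mercurial's filenode is a SHA-1 over the (sorted) parent nodes plus the file text (this sketch omits the copy-metadata handling the real hash has); the mode never enters the hash, so a mode-only change keeps the filenode stable.

```python
import hashlib

NULLID = b"\0" * 20

def filenode(text, p1=NULLID, p2=NULLID):
    # simplified: hash sorted parents + content; file mode is absent,
    # since Mercurial keeps the mode in the manifest, not the filelog
    pa, pb = sorted((p1, p2))
    return hashlib.sha1(pa + pb + text).hexdigest()
```

Same content, same parents: same filenode, regardless of any chmod.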
Reviewed By: ikostia
Differential Revision: D22206736
fbshipit-source-id: f64ee8a34281cd207c92653b927bf9109ccbe1b4
Summary:
The `amend` command can take a list of files (or file patterns) to limit the
amend to a subset of files. Make this clear in the help text.
Reviewed By: ikostia, krallin
Differential Revision: D22206906
fbshipit-source-id: 9f26b95a628bb5bbf5eae92c16f8852f44207d25
Summary:
I landed D22118926 (e288354caf) yesterday expecting those messages at about the same time
xavierd landed D21987918 (4d13ce1bcc), which removed them. This removes them from the
tests.
Reviewed By: StanislavGlebik
Differential Revision: D22204980
fbshipit-source-id: 6b1d696c93a07e942f86cd8df9a8e43037688728
Summary:
If the nodemap file is seriously out of sync, it's harder to recover on Windows,
because nodemap mmap in other processes prevents `atomic_write`.
Use `util::path::remove_file` to remove the old mmap files and improve the
success rate of nodemap rewrites.
Reviewed By: xavierd
Differential Revision: D22168134
fbshipit-source-id: 577df978b3175eac3257794f5e5ff7522cf21871
Summary:
Make it possible to delete files that are mmap-ed. This avoids the extra
complexity of cleaning leftover files.
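One common way to achieve this on Windows (a hedged Python sketch of the idea; the actual implementation is `util::path::remove_file` in Rust) is to rename before unlinking, so the original name is freed immediately even if deleting the still-mapped file has to be deferred:

```python
import os
import uuid

def remove_file(path):
    """Best-effort removal that tolerates the file being mmap-ed.

    Renaming succeeds on Windows even while a mapping holds the file
    open, which frees the original name right away; the unlink itself
    may fail and the leftover gets cleaned up later.
    """
    tmp = "%s.%s.tmp" % (path, uuid.uuid4().hex)
    os.rename(path, tmp)
    try:
        os.unlink(tmp)
    except OSError:
        pass  # still mapped somewhere; leftover is cleaned up later
```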
Reviewed By: xavierd
Differential Revision: D22168135
fbshipit-source-id: edb33a75638c668945feae49608c3fff25e742a4
Summary:
Update snapshot command for python3 compatibility.
This one was a bit weird, and there are some future changes we could make if we care. First, our CBOR implementation doesn't support CBOR strings (UTF-8 strings), only CBOR bytestrings (raw byte sequences). This would be easy to add, but would be binary-incompatible with previous snapshots if we made no other changes.
We also don't take full advantage of the supported types we already have. For example, we encoded version (an integer in Python) as a string, then wrote that string as a CBOR bytestring, which leads to the very awkward encodeutf8(str(version)) you see in this change, instead of just plain version, which as far as I can tell would round-trip perfectly fine, but would break compatibility with existing snapshots.
As is, I made only the most conservative change: Python 2 snapshots should be binary-identical to Python 3 snapshots, and both should be compatible with snapshots made before this change. We could get rid of the vast majority of the conversions by simply adding CBOR string support, but that would be a breaking change without special handling.
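The awkward round-trip in isolation (assuming encodeutf8 boils down to UTF-8 encoding, as the compatibility helpers in this codebase typically do): an integer is stringified, encoded, written as a CBOR bytestring, then reversed step by step on read.

```python
def encodeutf8(s):
    # assumption: the compat helper is plain UTF-8 encoding
    return s.encode("utf-8")

version = 2

# written today: integer -> str -> bytes, stored as a CBOR bytestring
wire = encodeutf8(str(version))

# read back: bytes -> str -> integer
assert int(wire.decode("utf-8")) == version
```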
Reviewed By: quark-zju
Differential Revision: D22195085
fbshipit-source-id: cead42db0c5f84b99e305e5a7f1d839a35becac6
Summary:
Move the utility function _getoptbwrapper out of the Python 2-gated block.
Convert undo.py to use integer division in both Python 2 and Python 3.
Reviewed By: xavierd
Differential Revision: D22125861
fbshipit-source-id: 9a283065c98eb122c4488285870e026748257b3c
Summary:
With the Rust stores now enabled by default in the code, it's time to actually
remove the code that wasn't using them. A lot of code still remains, as the
treemanifest code still uses it. Migrating the treemanifest code to use the
Rust store would allow even more code to be removed!
Reviewed By: DurhamG
Differential Revision: D21987918
fbshipit-source-id: 8e0becd4a1fb2ac81e26b8517d13e6d06ab06f65
Summary:
Now that the workers are in Rust, we no longer need the forking version in
Python. For now, the Python LFS extension still uses the threaded worker, so
keep that one; when that extension is removed we can remove the rest of the
worker code.
In theory, not all repositories would benefit from the Rust workers, but those
that wouldn't are not supported at FB, as they aren't remotefilelog-based.
Reviewed By: DurhamG
Differential Revision: D21987295
fbshipit-source-id: d17b9730651671608cf13f7abe6a9bb32251e140
Summary:
The Rust store code has been enabled everywhere for a few weeks now; let's
enable it by default in the code. Future changes will remove the config as
well as all the non-Rust store code.
The various test changes are due to small differences between the Rust code
and the Python one, the biggest being its handling of corrupted packfiles: the
old code silently ignored them, while the new one errors out for local
packfiles. The test-lfs-bundle.t difference is just due to an ordering
difference between Python and Rust.
Reviewed By: kulshrax
Differential Revision: D21985744
fbshipit-source-id: 10410560193476bc303a72e7583f84924a6de820
Summary:
Whenever data is redacted on the server, a specific tombstone is returned when
fetching it. Make sure that whenever we update the file on disk, we write a
nice message to the user instead of the tombstone itself.
While this code could have been moved into the Rust store code itself, I prefer
to leave it to its users to decide what to do with redacted data. EdenFS for
instance may want to prevent access to it instead of showing the redacted
message.
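The shape of the change, as a hedged Python sketch (the real code is in the Rust store; the tombstone value, message text, and function name here are placeholders, not the real ones):

```python
TOMBSTONE = b"<redacted-tombstone>"  # placeholder, not the real marker

def materialize(data):
    """Called by a store user when writing file contents to disk (sketch)."""
    if data == TOMBSTONE:
        # show a human-readable notice instead of the raw tombstone;
        # other users (e.g. EdenFS) may prefer to deny access instead
        return b"This file's content has been redacted.\n"
    return data
```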
Reviewed By: kulshrax
Differential Revision: D21999345
fbshipit-source-id: 39a83cdf5ea4567628a13fbd59520b9677aba749
Summary:
On Windows, `pathlib.Path.resolve` has a use-of-uninitialized-memory bug in Python 3.6.2, which is the version of Python 3 installed on Lego Windows.
When `eden prefetch` runs, it first attempts to read the config file at the root of the repository (`$REPOROOT/.eden/config`). Because of this bug, when we normalize the path in edenfsctl, we get some random bytes as a result. This subsequently causes `toml.read` to fail, as it is unable to read a path containing a NUL byte.
As you can see, this is the same exception as we see on Windows:
```
In [1]: open("test\x00")
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
<ipython-input-1-57764b20d660> in <module>
----> 1 open("test\x00")
ValueError: embedded null byte
```
Reviewed By: xavierd
Differential Revision: D22177997
fbshipit-source-id: ab2565c8946d809bc15bc1597b749fb5e9440ca0
Summary:
The dynamicconfig logic had an assert checking that a config was not
added by dynamicconfig twice. This isn't actually a valid assert, since the
caller could load the config as many times as it wants.
This happened in hg doctor, where it loaded the ui once manually (without a
repo object) and then loaded it again during the creation of the repo object.
I went ahead and cleaned up hg doctor to not do this, but let's get rid of the
assert anyway.
Reviewed By: quark-zju
Differential Revision: D22194273
fbshipit-source-id: 9491d35fe14523ad3d9cb69b4ca0d615149dd0f0
Summary:
D22009699 recently introduced some logic that accesses repo.localvfs
during repo setup. This doesn't exist for peer repos, so let's exit early if
this is a peer.
Reviewed By: xavierd
Differential Revision: D22197586
fbshipit-source-id: 8f717aa21a0b45f8f85a1ff800ee4a1f86b94bdf
Summary: This makes it easier to run tests with non-buck build.
Reviewed By: sfilipco
Differential Revision: D22174992
fbshipit-source-id: 11100d608cb78973ae7a63056667f0be4a86ae00
Summary: This might make it easier for us to see what processes are hanging.
Reviewed By: sfilipco
Differential Revision: D22009696
fbshipit-source-id: a1c1f97976084160a1b4e4a3ff7d2f9781240f6f
Summary:
There are lots of "hanging" questions in the support group. While not all of
them are our fault (e.g. mergedriver), we don't have an easy reply to tell
users what's actually going on.
This diff adds a way to write sigtrace periodically so we can include it in
rage output.
Reviewed By: sfilipco
Differential Revision: D22009699
fbshipit-source-id: 5349f613b08507ed02cc0e3fa00963fd7d4c3d21
Summary:
A lot of extensions wrap runcommand just to get the command name.
Let's define a proper API for it - ui.cmdname provides the resolved
command name.
Reviewed By: sfilipco
Differential Revision: D22009694
fbshipit-source-id: 4b8744ee5314924ea09c9f325570ce53c849fef5
Summary: The tracing APIs and error context APIs can achieve similar effects.
Reviewed By: xavierd
Differential Revision: D22129585
fbshipit-source-id: 0626e3f4c1a552c69c046ff06ac36f5e98a6c3d8
Summary:
`eden top` often adds extra quotes to the end of the commands in the process
table; this removes those pesky quotes.
Reviewed By: genevievehelsel
Differential Revision: D21440316
fbshipit-source-id: f1200a28a5345691fcce517526d119f44e5993d0
Summary: Make test-test-commit-interactive.t Python 3 compatible. Note: I wasn't sure whether it was worthwhile to spend too much time on the translation portion of the test. If someone objects to the deletion, I can spend more time on it.
Reviewed By: DurhamG
Differential Revision: D22122960
fbshipit-source-id: d232971581fca0532ed05c8ca1fdb7b5e8bd99e0
Summary:
Eden can sometimes unexpectedly fetch files from the server, and we want
to know why this is happening. This adds logging for the source of
data fetching in Eden's backing store to help make it obvious why these
fetches are happening.
This temporarily adds the logging in the HgQueuedBackingStore to get a naive
version of logging rolled out sooner. Follow-up changes will move this logging
closer to the data fetching itself if possible (in HgDatapackStore and
HgImporter).
Reviewed By: chadaustin
Differential Revision: D22012572
fbshipit-source-id: b1b012ce4ee133fbacecd586b7365c3c5a5386df
Summary:
We have seen that Eden will unexpectedly fetch data, and we want to know why.
This adds the plumbing to interact with Eden's current logging so we can
log when Eden fetches data from the server and what caused Eden to do the
fetch. Later changes will use the classes created here to log the cause of
data fetches.
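A hypothetical Python sketch of the idea (Eden's real ObjectFetchContext is C++, and all names here are illustrative): a context object rides along with every fetch so the store layer can record what caused a server round-trip.

```python
class ObjectFetchContext:
    """Carries the cause of a fetch down into the store layer (sketch)."""

    def __init__(self, cause="unknown", pid=None):
        self.cause = cause  # e.g. "fuse", "thrift", "prefetch"
        self.pid = pid      # client process that triggered the fetch

fetch_log = []

def get_blob(oid, local, remote, context):
    if oid not in local:
        # we had to go to the server: remember why
        fetch_log.append((oid, context.cause, context.pid))
        local[oid] = remote[oid]
    return local[oid]
```

Cached reads never hit the log, so the log contains only genuine server fetches and their causes.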
Reviewed By: chadaustin
Differential Revision: D22051013
fbshipit-source-id: 27d377d7057e66f3e7a304cd7004f8aa44f8ba62
Summary:
Recently the server team added an unused directory to test that Eden would not
fetch its contents, ahead of the upcoming repo merge. They saw that these
files were fetched a non-trivial number of times, and we want to know why Eden
is causing these fetches.
This adds the pid and the interface through which the client is interacting
with Eden to ObjectFetchContext for this purpose. This information will be
logged in later changes.
Note: currently this only covers fetches through FUSE (the thrift interface
will follow).
Reviewed By: chadaustin
Differential Revision: D22050919
fbshipit-source-id: 49b93257a0e6d910f48b1e8ec6471527e056d22a
Summary:
This passes ObjectFetchContext into the backing store to prepare for adding
logging of the cause of server fetches.
In following changes I will add logging in the HgQueuedBackingStore.
Ultimately we will want to move this logging closer to the data fetching
(in HgDatapackStore and HgImporter), but I plan to temporarily add logging to
the HgQueuedBackingStore to simplify things so that we can roll out more
quickly.
Reviewed By: chadaustin
Differential Revision: D22022992
fbshipit-source-id: ccb428458cbf7a1e33aaf9be9d0d766c45acedb3