Summary:
Previously we were trying to send trees to all clients during an
infinitepull, even ones that didn't support treemanifest. This caused
infinitepulls that required rebundling to fail for non-tree clients.
The fix is to just not send them unless the client is advertising the
capability.
Reviewed By: phillco
Differential Revision: D7432374
fbshipit-source-id: 1fae14a158ef56fe39439a718b1b98928f4e07b0
Summary:
Previously we were storing the changelog on the manifestlog and using
it to resolve linkrevs before serializing them. It turns out the changelog can
be invalidated at a different rate than the manifestlog, so we could encounter
issues where the manifestlog held a reference to the old changelog.
To fix this, let's hold a reference to the repo and access the changelog from
there when we need it. This introduces a circular reference between the
manifestlog and the repo, but it's probably fine for now until we can get rid of
the need for changelog invalidation.
Reviewed By: singhsrb
Differential Revision: D7360321
fbshipit-source-id: 2317c7fcd6b307a50b64f0c5df97dda2955f3e21
Summary:
When remotefilelog downloads filelog information for a particular file, set the
progress bar item to that file. This means if we get stuck on a particular
file, there is feedback to the user as to which file that is.
Reviewed By: quark-zju
Differential Revision: D7329503
fbshipit-source-id: 94416962cdc4c97994f76e8ed9203823aeca3d64
Summary:
A prior diff optimized push/pull to not send public trees between
peers, since those trees can be downloaded from the main server. Let's be
careful when sending data to the main server and always send everything.
In the future we should add validation on the server that the received data is
complete, but Mercurial doesn't do that today.
Differential Revision: D7296253
fbshipit-source-id: 49513685d19991a70d66da1d734ddae23491ed0c
Summary:
This allows pushing a treeonly pack to a server without using
pushrebase.
Differential Revision: D7295686
fbshipit-source-id: b0bfe4fbb04bc765e57f1db82909fa1ae7b3063b
Summary:
It can sometimes be nice to be able to just repack loose data and avoid existing packs.
This came up during building the megarepo, which creates many loose files, but including the packs added too much repack time.
Differential Revision: D7209296
fbshipit-source-id: 10afed40e409733e0ee004f025013cf86f3f7bf6
Summary:
Previously we were just putting nullid as the linknode in client side
trees, because when the trees were added the changelog hadn't been written yet,
so we didn't know the linknode. This diff updates mutablehistorypack to allow
resolving the linknodes at serialization time instead of add time. This will be
used in a future diff to fix storing linknodes in trees.
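
A minimal sketch of the idea, assuming a hypothetical `MutableHistoryEntries`
class and a `linkrevtonode` callback (neither is the real mutablehistorypack
API): entries keep only a linkrev at add time, and the linknode is looked up
when the data is serialized.

```python
class MutableHistoryEntries(object):
    """Hypothetical in-memory history store that defers linknode lookup."""

    def __init__(self):
        self._entries = []

    def add(self, node, p1, p2, linkrev):
        # The changelog may not be written yet, so keep only the linkrev.
        self._entries.append((node, p1, p2, linkrev))

    def serialize(self, linkrevtonode):
        # Resolve each linkrev to a linknode now that the changelog exists.
        return [
            (node, p1, p2, linkrevtonode(linkrev))
            for node, p1, p2, linkrev in self._entries
        ]


# Usage: the dict stands in for a changelog lookup.
changelog = {0: "c0", 1: "c1"}
history = MutableHistoryEntries()
history.add("f1", "null", "null", 1)
serialized = history.serialize(changelog.__getitem__)
```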
Reviewed By: ryanmce
Differential Revision: D7280105
fbshipit-source-id: 70063e627d0fd7baeb017bac5ac55957a100d06c
Summary:
In a future diff we'll want to allow converting flat manifests to tree
manifests during infinitepush. To do so, let's allow writing to in memory packs
on bundlerepositories.
Reviewed By: StanislavGlebik
Differential Revision: D7256563
fbshipit-source-id: 10ec58d1171b7882d6db9a916c50a19bc11dbcb4
Summary: We need to call None on the progress bar so it knows it's done.
Reviewed By: ryanmce
Differential Revision: D7267691
fbshipit-source-id: 81b19d945a44e8f46a15abdaf501ef6b5dea4ffc
Summary:
Previously, if we weren't sending trees we would not attempt to process
what files needed to be sent. This was incorrect, since files may be sent
independent of what trees are decided to be sent. So let's update the code to
care about shouldaddfilegroup instead of cansendtrees.
This exposed an additional bug where we wouldn't look at the treemanifests
during the file computation, so when sending treeonly infinitepush bundles we
would get an error.
Reviewed By: singhsrb
Differential Revision: D7240340
fbshipit-source-id: a9be69597f4f1cbbecdd7cb1661f1023114bf621
Summary:
Sometimes infinitepush has to rebundle the data it's trying to send to
the client (like if you pull only part of a bundle). Previously, this was broken
for treeonly bundles because we did not correctly extend the bundlerepo's
data/history store on treemanifest servers. This patch fixes that and updates a
test.
Reviewed By: quark-zju
Differential Revision: D7240343
fbshipit-source-id: a9dd3ae884ace3fa9f4a748fe753fc394e69d6c9
Summary:
The cstore doesn't contain our ondemand generator, nor our mutable data
store, so it's really not usable yet. Let's disable it for now until we can fix
up the issues that prevent us from using it (like having it report metrics).
Reviewed By: ryanmce
Differential Revision: D7148823
fbshipit-source-id: 5cc46af33c049b751c1c04916aafe0768d80ce7a
Summary:
In a future patch we will start using the contents of the
manifestrevlogstore as the place we call add() to add new revisions. In order to
do that, those revlogs must be manifestrevlogs. This patch just changes them to
use the right type.
Reviewed By: quark-zju
Differential Revision: D7148824
fbshipit-source-id: 0d60cd22b041db83b61ccaec3b6172624ea97e42
Summary:
This diff adds a config option to tweak deltabase in changegroup. It has 3
options:
- Always null - always use "null" as delta base, effectively making
everything full text
- No external - delta bases cannot be a revision outside the changegroup
- Default - the current behavior: delta bases can be anything that the client
thinks the server should have.
This gives Mononoke more time to bake delta related logic, as we can
choose "always null" first, then incrementally increase the complexity.
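
The three options can be sketched roughly as follows. The function name
`choosedeltabase`, the policy strings, and the arguments are all hypothetical
and are not the actual config values or changegroup code.

```python
NULLREV = -1  # stand-in for Mercurial's null revision


def choosedeltabase(policy, candidate, revsinchangegroup):
    """Pick a delta base for one revision under the given policy.

    policy is one of the hypothetical strings "always-null",
    "no-external" or "default". candidate is the base the existing
    logic would have picked.
    """
    if policy == "always-null":
        # Every revision is sent as a full text.
        return NULLREV
    if policy == "no-external":
        # Only delta against revisions that are part of this changegroup.
        if candidate in revsinchangegroup:
            return candidate
        return NULLREV
    # Default: keep whatever base the existing logic chose.
    return candidate
```

With "no-external", a candidate outside the changegroup falls back to a full
text in this sketch; the real code may pick a different fallback.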
Reviewed By: phillco
Differential Revision: D7158585
fbshipit-source-id: 5f6d9a78d1108093e8d08b9f296568f4f7e7471b
Summary:
Currently if you push or pull a bunch of commits between peers we will
include all the trees as part of the push. If the source repo doesn't have all
the necessary trees, it will go to the server to get them. Since the other
machine can just as easily go to the server (and probably won't need most of
those trees anyways), let's just have the source client send all draft trees and
skip the public commits.
Reviewed By: phillco
Differential Revision: D7141623
fbshipit-source-id: 6d33ae9d4c9cc32bf6dfa76f733c87c06890d719
Summary:
As part of unifying all our pre-pull/push prefetches, let's move the
changegroup-building prefetch into the cansendtrees function. In a future diff
we'll change this logic to not send trees for public commits in a peer to peer
push/pull.
Reviewed By: mjpieters
Differential Revision: D7141625
fbshipit-source-id: 0253fa32993666f3e03c10c98163d8d60370a97c
Summary:
A future diff will make it so we can send only draft trees instead of
all trees. To prepare for this, let's move the cansendtrees logic to
shallowbundle (since it will be used by both shallowbundle and by treemanifest)
and change it to return an enum.
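
A rough sketch of what an enum-style return value could look like. The names
`TreeSendPolicy` and `treesendpolicy`, the constants, and the two input flags
are assumptions made for illustration; the real cansendtrees logic looks at
more than this.

```python
class TreeSendPolicy(object):
    """Hypothetical enum-style constants (plain ints, Python 2 friendly)."""

    NOTREES = 0
    DRAFTTREES = 1
    ALLTREES = 2


def treesendpolicy(clientsupportstrees, peertopeer):
    """Return which trees to send instead of a plain True/False."""
    if not clientsupportstrees:
        return TreeSendPolicy.NOTREES
    if peertopeer:
        # Assumed behavior for the future diff: peers only exchange drafts.
        return TreeSendPolicy.DRAFTTREES
    return TreeSendPolicy.ALLTREES
```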
Reviewed By: quark-zju
Differential Revision: D7141624
fbshipit-source-id: 34c78b0d1cdb6f8d86a99fb74665e80b2af12c5c
Summary:
During big `hg update` calls, the user will often see no progress
initially while we discover which blobs we need to download. Let's fix that.
Reviewed By: quark-zju
Differential Revision: D6903134
fbshipit-source-id: 35b174120b6dce412dd337b6b93c9f5b4233522d
Summary:
When background prefetch is enabled, let's use the base parameter to
limit how many files we download. This makes the operation O(files changed), and
on Windows results in a significant speedup.
Reviewed By: mjpieters
Differential Revision: D7108838
fbshipit-source-id: a46b8a7d897ee204b9a4c1f1c65d875dbd3e9bc7
Summary:
Historically treeonly clients were ignoring flat manifests sent in
bundles, but with a recent change they will now try to recreate those manifests
when they receive such a bundle (so we don't lose any data when a user does 'hg
unbundle' on a flat-only bundle). This means we need to stop sending flat
manifests from the server to treeonly clients.
Differential Revision: D7083028
fbshipit-source-id: e4580b00a8be96fbef0ee624529c58f41cfa2752
Summary:
Previously we were only wrapping the changegroup packers for the
clients. In a future diff we want to use the shallow changegroup packers to
govern when to send trees from the server, so we need them enabled for the
server.
It turns out they were mostly enabled for the server already. While we weren't
replacing the classes in changegroup._packermap (which is what they get
instantiated from), we were replacing them via the interposeclass decorator.
This meant that the server would construct a cg3packer, but it would have a
shallowcg1packer in the class hierarchy, so most of the code was running
already. So this should be a pretty low risk change. In theory.
Differential Revision: D7083042
fbshipit-source-id: 5ce44a9ceda4d7d4bd126f52a01e45e6e1e7de40
Summary:
This makes the mutable history pack implement the history store read
api so we can add it to the union store and read the contents of things that
have been written but not yet committed.
The mutablehistorypack fileentries variable has been changed to contain a dict
instead of a list so we can access it quickly during reads. The list is from a
legacy requirement where we used to maintain the order that the writer wrote in.
We no longer do that (instead we topologically sort what they've given us), so
switching from a list to a dict should be fine.
Differential Revision: D7083036
fbshipit-source-id: ae511db60ab6432059714a2271c175dc9683b8e1
Summary:
Adds addstore and removestore to the union history store, just like
we've already done for the union data store.
Reviewed By: ryanmce
Differential Revision: D7083055
fbshipit-source-id: 49f1a4156376d0cf5d6191c4d30ec923ddb2ec14
Summary:
In a future diff we'll need the ability to modify the union store on
the fly, so let's add addstore and removestore apis.
Reviewed By: ryanmce
Differential Revision: D7051102
fbshipit-source-id: 901a50720bfdf4e5c59714d092830e65edccdfce
Summary:
The union content store
- iterates through all the stores it has until the current store has the
content.
- Or, it fails eventually if none of the stores have the content.
It does so by relying on the current store throwing a KeyError if it doesn't
have the content.
`remotetreedatastore` was throwing the MissingNodesError which means any
remaining stores after it would not even get a chance to look for the content.
This commit fixes that.
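
A minimal sketch of the lookup loop, assuming hypothetical `DictStore` and
`UnionStore` classes (not the actual remotefilelog classes). Only a KeyError
moves the loop on to the next store, so a store that raises any other
exception type ends the search early.

```python
class DictStore(object):
    """Hypothetical store backed by a dict; raises KeyError on a miss."""

    def __init__(self, data):
        self._data = data

    def get(self, name, node):
        return self._data[(name, node)]


class UnionStore(object):
    """Hypothetical union store that asks each store in order."""

    def __init__(self, *stores):
        self.stores = list(stores)

    def get(self, name, node):
        for store in self.stores:
            try:
                return store.get(name, node)
            except KeyError:
                # Only KeyError means "not here, try the next store".
                continue
        # None of the stores had the content.
        raise KeyError((name, node))


first = DictStore({})
second = DictStore({("foo", "abc"): b"content"})
union = UnionStore(first, second)
result = union.get("foo", "abc")
```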
Reviewed By: ryanmce
Differential Revision: D6867854
fbshipit-source-id: 784df195efcbe16f2e716968f3d93159afff6206
Summary:
We don't really support files larger than 2G, and with files larger than
4G remotefilelog crashes badly.
Reviewed By: markbt
Differential Revision: D7066855
fbshipit-source-id: 40cdebbe703a7b3f13ce84174bf6f96565e8c3b7
Summary:
If a remotefilelog loose object has a size of 0, it is invalid. Since
we're already doing a syscall to check if the file exists, we might as well try
to read its size in the process and consider it missing if the size is 0.
Previously we would report the file as there, then later when we tried to use
the file we noticed it was broken and would download it on demand. If many files
were broken, this ad hoc one-by-one downloading would be extremely slow, so
catching it earlier so we can do batch re-downloading will speed things up.
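
A small sketch of the check, assuming a hypothetical helper named `isusable`
(the real remotefilelog code is structured differently). A single `os.stat`
call answers both "does it exist" and "is it empty".

```python
import os
import shutil
import tempfile


def isusable(path):
    """Return True only if path exists and has a non-zero size."""
    try:
        return os.stat(path).st_size > 0
    except OSError:
        # The file is missing or unreadable; treat it as not available.
        return False


# Usage: one good file, one empty (invalid) file, one missing file.
tmpdir = tempfile.mkdtemp()
try:
    good = os.path.join(tmpdir, "good")
    empty = os.path.join(tmpdir, "empty")
    with open(good, "wb") as f:
        f.write(b"data")
    open(empty, "wb").close()
    results = [
        isusable(good),
        isusable(empty),
        isusable(os.path.join(tmpdir, "missing")),
    ]
finally:
    shutil.rmtree(tmpdir)
```

Everything reported as unusable can then be collected and fetched in one
batch instead of one file at a time.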
Reviewed By: singhsrb
Differential Revision: D7046960
fbshipit-source-id: 0139f7c2cad3de617e9aae925a075bdb65f70ff5
Summary:
When hg addremove needs to remove a file, remotefilelog tries to create a filenode in order to prefetch the removed files from the server. If the file is not in the parent context manifest, this throws an exception.
To solve this we have to first check if the file exists or not in the parent manifest. If the file does not exist in the manifest then we don't need to prefetch it, and addremove will behave like forget rather than remove.
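
A minimal sketch of that check, using a plain dict in place of the parent
manifest. `splitremoved` is a hypothetical helper, not a function in
remotefilelog.

```python
def splitremoved(removed, parentmanifest):
    """Split removed paths into (prefetch, skip) lists.

    Only paths present in the parent manifest have a filenode that can
    be prefetched; the rest are skipped.
    """
    prefetch = [path for path in removed if path in parentmanifest]
    skip = [path for path in removed if path not in parentmanifest]
    return prefetch, skip


# Usage: "new.txt" was never committed, so there is nothing to prefetch.
parentmanifest = {"a.txt": "node-a", "b.txt": "node-b"}
prefetch, skip = splitremoved(["a.txt", "new.txt"], parentmanifest)
```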
Reviewed By: markbt
Differential Revision: D7009649
fbshipit-source-id: 0570bc00db546a455b9c2e4628740e24ca819dd6
Summary:
Previously we would only do ondemand downloading of tree blobs when
accessing data contents. When accessing just history we would just fail if the
data wasn't available locally. This adds the remote store to the history union
store so we can get history remotely.
Reviewed By: singhsrb
Differential Revision: D7003434
fbshipit-source-id: 839f8e84be35779ccb146d13ce3e1d6d1e7f46bd
Summary:
This can be convenient when the size of LFS downloads between r1 and r2 is needed.
NB: this does not pay attention to sparse profiles and acts as if the checkout was full. We probably need to know how to do both.
Reviewed By: quark-zju
Differential Revision: D6938259
fbshipit-source-id: 52ab88be83339472f2eccafc746a191ff26c16c7
Summary:
remotefilelog prefetch was broken in treeonly mode since it referred to
the manifest revlog to get the parents. Let's switch to the more modern
manifestctx way of accessing parent information.
Reviewed By: quark-zju
Differential Revision: D6995267
fbshipit-source-id: e0c11fca0f2156be3f936a6e437e7a4d3dffe75b
Summary:
As part of producing only one pack file per transaction, we need to
change the mutabledatapack store to allow reads of written-but-not-finalized
data. This will allow us to add the mutabledatapack to the union store, and then
keep it alive for the duration of the transaction.
Reviewed By: quark-zju
Differential Revision: D6944348
fbshipit-source-id: 1e721bd8e07335a9c1f9c6b7595a765ec018c007
Summary:
In an upcoming diff we'll make mutabledatapack readable, so we need the
entry parsing logic to be accessible. Let's move it out.
Reviewed By: singhsrb
Differential Revision: D6944345
fbshipit-source-id: c75162bb7c4fa47c6339b7ebbf96a4e386bd04b3
Summary:
This check is useful and detects real errors (ex. fbconduit). Unfortunately
`arc lint` will run it with both py2 and py3, so a lot of py2 builtins will
still trigger warnings.
I didn't find a clean way to disable the py3 check, so this diff tries to fix them.
For `xrange`, the change was done by a script:
```
import sys
import redbaron

headertypes = {'comment', 'endl', 'from_import', 'import', 'string',
               'assignment', 'atomtrailers'}

xrangefix = '''try:
    xrange(0)
except NameError:
    xrange = range
'''

def isxrange(x):
    try:
        return x[0].value == 'xrange'
    except Exception:
        return False

def main(argv):
    for i, path in enumerate(argv):
        print('(%d/%d) scanning %s' % (i + 1, len(argv), path))
        content = open(path).read()
        try:
            red = redbaron.RedBaron(content)
        except Exception:
            print(' warning: failed to parse')
            continue
        hasxrange = red.find('atomtrailersnode', value=isxrange)
        hasxrangefix = 'xrange = range' in content
        if hasxrangefix or not hasxrange:
            print(' no need to change')
            continue
        # find a place to insert the compatibility statement
        changed = False
        for node in red:
            if node.type in headertypes:
                continue
            # node.insert_before is an easier API, but it has bugs changing
            # other "finally" and "except" positions. So do the insert
            # manually.
            # # node.insert_before(xrangefix)
            line = node.absolute_bounding_box.top_left.line - 1
            lines = content.splitlines(1)
            content = ''.join(lines[:line]) + xrangefix + ''.join(lines[line:])
            changed = True
            break
        if changed:
            # "content" is faster than "red.dumps()"
            open(path, 'w').write(content)
            print(' updated')

if __name__ == "__main__":
    sys.exit(main(sys.argv[1:]))
```
For other py2 builtins that do not have a py3 equivalent, some `# noqa`
were added as a workaround for now.
Reviewed By: DurhamG
Differential Revision: D6934535
fbshipit-source-id: 546b62830af144bc8b46788d2e0fd00496838939
Summary:
On Windows, the intended access to the file handle must align with the
intended access to the memory mapping object, see [1].
When called without an `access` argument, Python's `mmap.mmap` on Windows assumes
`ACCESS_WRITE` mode[3], therefore we get failures like [2] in Lego Windows.
[1]
https://msdn.microsoft.com/en-us/library/windows/desktop/aa366537(v=vs.85).aspx
[2] P59034213
[3] https://docs.python.org/2/library/mmap.html
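
A minimal sketch of a read-only mapping that matches a read-only file handle.
`readviammap` is a hypothetical helper and the temporary file is only there
for the demo; the actual change presumably passes a similar `access` value at
the existing `mmap.mmap` call sites.

```python
import mmap
import os
import tempfile


def readviammap(path):
    """Return the contents of a non-empty file through a read-only mapping."""
    with open(path, "rb") as f:
        # The handle is read-only, so request a read-only mapping as well.
        m = mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ)
        try:
            return m[:]
        finally:
            m.close()


# Usage: write a small file, then read it back through the mapping.
fd, path = tempfile.mkstemp()
try:
    os.write(fd, b"packdata")
    os.close(fd)
    data = readviammap(path)
finally:
    os.unlink(path)
```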
Reviewed By: quark-zju
Differential Revision: D6952583
fbshipit-source-id: 93159f8282e27d3e62d859f4c220e7c3bdfbe958
Summary:
With certain setups, the file history could be incorrect. That makes
`adjustlinknode` much slower since it has to scan the full history.
It is extremely slow if there is a commit with massive renames.
This is undesirable since `hg log FILE` would give a wrong result.
But it does make the repo usable. Automation that does not care
about `hg log FILE` correctness can probably enable this relatively
safely.
This patch changes both shallow and full repos.
Reviewed By: DurhamG
Differential Revision: D6912051
fbshipit-source-id: 23d6f6c8dd91d4f72b43bc560cf26686bd6c4b47
Summary:
The remotefilelog cgunpacker logic could enter an infinite loop
if a copyfrom file node uses its copyto as delta base.
This repros without LFS in test-lfs-bundle.t.
Reviewed By: DurhamG
Differential Revision: D6910079
fbshipit-source-id: 99fea316e77218cd4bc9ea6f5506779a3e4ab9a6
Summary: Otherwise the copy data will be lost when applying LFS bundles.
Reviewed By: DurhamG
Differential Revision: D6906207
fbshipit-source-id: bc94c6614f9d4b2a2b4c7f44f57de49bd54d6b49
Summary: This is similar to the previous patch, but applies to remotefilelog.
Reviewed By: DurhamG
Differential Revision: D6906206
fbshipit-source-id: 2a9a56a57544b5e4d892f77438b2faaadece73ee
Summary:
Calling revdiff with non-zero flags is a sign of a hard-to-debug
error. Raise ProgrammingError in this case.
The change is straightforward. Apply it to both shallow and full
repos.
Reviewed By: DurhamG
Differential Revision: D6910080
fbshipit-source-id: cbcf1a444de90e104867cc9f1525629b7edda851
Summary: This is similar to the previous patch, but applies to remotefilelog.
Reviewed By: DurhamG
Differential Revision: D6906212
fbshipit-source-id: 30383632046f57b169dcb8a2ba1c0dd73113154a
Summary:
This is similar to `debugdata`, but instead of taking a file revision (or
file node in remotefilelog's case), it takes a revset.
This is more useful practically, since the user would know commit hashes
easily but file nodes are hidden from the UI.
This is intended to make it easier to investigate LFS contents.
Reviewed By: DurhamG, ryanmce
Differential Revision: D6891770
fbshipit-source-id: 415da9b773c30830a48c09eda9f1854c416e3222
Summary:
We have seen bad `hg gc` behavior on Lego-Windows, caused evidently by the fact
that we currently use plain Python `open` and `os.unlink` instead of
Mercurial's advanced `posixfile` and `unlink`.
Reviewed By: DurhamG
Differential Revision: D6900689
fbshipit-source-id: cd7ebbbb734a6163d062622d1d4606fad43c91ac
Summary: `fastverifier` was sometimes being overridden by `shallowverifier` when remotefilelog was enabled. Since the latter is a subset of the former, let's just fold both into the core verifier code backed by a config, `verify.skipmanifests`, that we can default to true.
Reviewed By: DurhamG
Differential Revision: D6882222
fbshipit-source-id: 9f337ca031a070425ccdc9ee02f6765e68436da9
Summary:
`lz4.block` needs to be imported explicitly before being able to
use `lz4.block.compress`.
We didn't notice this because we're using an old version of
`python-lz4`.
Reviewed By: DurhamG
Differential Revision: D6879877
fbshipit-source-id: 37e8fdc00386bef3733753f925ad308f42e5a740