Summary:
We may want to use an experimental store for remotefilelog data. This store may
return incorrect data, which would poison the cache. To prevent that from
happening, let's add an option that makes it possible to prevent cache
poisoning.
Reviewed By: farnz
Differential Revision: D7615453
fbshipit-source-id: 0ade7407b92a3a32c841f6f0214dc353c30bff59
Summary:
A subsequent diff will need access to the node's diff and meta during iteration time. It seems like a
natural part of the API so let's add it.
Note: It's possible to call `.getdelta(name, node)` to get this data if we don't read it here.
But I ran into some weird occasional OSErrors from the mmap API when I did that. So let's just
do this.
Reviewed By: DurhamG
Differential Revision: D7369225
fbshipit-source-id: 252839a549242909153c74287db8f36d6c63bd9c
Summary:
Previously the connectionpool was a remotefilelog specific concept. We
want to start sharing connections between pull and prefetches, so let's move it
to core Mercurial.
Reviewed By: ryanmce, phillco
Differential Revision: D7480670
fbshipit-source-id: 1b2eff3b0e61a815709ffaec35df802eeda0c24b
Summary:
`remotefilelog.fileserverclient.peersetup.remotefilepeer` overrides the
`_callstream` method, however it uses `command` rather than `cmd` for the first
parameter name. This doesn't match the method it's overriding, and clashes
with clienttelemetry's use of this parameter for the original command that the
user ran.
Make this method match all the others.
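A minimal sketch of why the parameter name matters. The classes and the `call_with_telemetry` helper below are hypothetical stand-ins, and the assumption that the caller passes the user's command as a `command` keyword argument is mine, not something this diff confirms:

```python
class BasePeer(object):
    def _callstream(self, cmd, **args):
        # Extra keyword arguments are treated as wire arguments.
        return (cmd, args)


class MismatchedPeer(BasePeer):
    # First parameter is named `command`, unlike the method it overrides.
    def _callstream(self, command, **args):
        return super(MismatchedPeer, self)._callstream(command, **args)


class MatchingPeer(BasePeer):
    # First parameter is named `cmd`, like the method it overrides.
    def _callstream(self, cmd, **args):
        return super(MatchingPeer, self)._callstream(cmd, **args)


def call_with_telemetry(peer):
    # Hypothetical caller that attaches the user's original command
    # as a keyword argument named `command`.
    return peer._callstream("getfiles", command="hg pull")
```

With `MatchingPeer` the extra keyword lands in `args`; with `MismatchedPeer` Python raises a TypeError because `command` receives two values.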
Reviewed By: quark-zju
Differential Revision: D7443726
fbshipit-source-id: 1170feb21056c3e044bffaf55d95f7c48ff972fb
Summary:
Subsequent commits will need the new path of a mutable{data, hist}pack -- this makes
that data accessible.
Reviewed By: DurhamG
Differential Revision: D7369226
fbshipit-source-id: f6849aaed747fbd9afee7191e6a0e5e1357ca618
Summary:
There was an issue where if the prefetch inside cansendtrees failed, it
wouldn't allow it to actually try the operation. This is undesirable, since
prefetch only talks to the server while the actual tree fetch will also attempt
to generate a tree from an old flat manifest.
Ideally we'd have a more unified flow here, where we could have the server let
us know what nodes it couldn't find, then the client could try other options for
the remaining nodes, but that requires significantly more refactoring.
Reviewed By: quark-zju
Differential Revision: D7450662
fbshipit-source-id: a023f27ee4b74786633e4dce7e62f3d9604c2b7f
Summary:
Data stores have already migrated to MissingNodesError instead of
KeyError, so let's move metadatastore as well. This provides better error
messages and more specific catching.
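As a rough illustration of the benefit, here is a sketch of a dedicated exception that carries the missing keys. The class bodies are hypothetical and are not the actual remotefilelog definitions; only the name MissingNodesError comes from this diff:

```python
class MissingNodesError(LookupError):
    """Hypothetical stand-in that records which (name, node) keys are missing."""

    def __init__(self, keys):
        self.keys = list(keys)
        names = ", ".join("%s:%s" % (name, node) for name, node in self.keys)
        super(MissingNodesError, self).__init__("missing nodes: " + names)


class DictMetadataStore(object):
    """Hypothetical metadata store backed by a plain dict."""

    def __init__(self, entries):
        self._entries = entries

    def getmeta(self, name, node):
        try:
            return self._entries[(name, node)]
        except KeyError:
            # Translate the generic dict miss into the specific error.
            raise MissingNodesError([(name, node)])
```

Callers can then catch MissingNodesError specifically and report exactly which nodes were absent.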
Reviewed By: phillco
Differential Revision: D7448103
fbshipit-source-id: 33d0f267545abd7d4063d2b344a93d26aff76d81
Summary:
Show the 'impact' of a profile, relative to a non-sparse working copy.
By default, this is the percentage of the total file count; adding --verbose will show the file sizes too.
The defined matchers have been refactored to reuse more of mercurial.match.basematcher, making it easier to reuse them in a wider Mercurial context and avoiding repetition of common methods.
Reviewed By: quark-zju
Differential Revision: D7415720
fbshipit-source-id: 4ac3492c61aa70ee71d4bdf8c201b905a345a9d1
Summary: On my system, the `outgoing()` revset could easily take several seconds. A spinner here helps dispel the notion that Mercurial has hung.
Reviewed By: quark-zju
Differential Revision: D7429086
fbshipit-source-id: 28908a34798d985dad3120647c3a5f474ca8a746
Summary:
Previously we were trying to send trees to all clients during an
infinitepull, even ones that didn't support treemanifest. This caused
infinitepulls that required rebundling to fail for non-tree clients.
The fix is to just not send them unless the client is advertising the
capability.
Reviewed By: phillco
Differential Revision: D7432374
fbshipit-source-id: 1fae14a158ef56fe39439a718b1b98928f4e07b0
Summary:
Previously we were storing the changelog on the manifestlog and using
it to resolve linkrevs before serializing them. It turns out the changelog can
be invalidated at a different rate than the manifestlog, so we could encounter
issues where the manifestlog held a reference to the old changelog.
To fix this, let's hold a reference to the repo and access the changelog from
there when we need it. This introduces a circular reference between the
manifestlog and the repo, but it's probably fine for now until we can get rid of
the need for changelog invalidation.
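A small sketch of the stale-reference problem, using hypothetical classes in which a dict stands in for the changelog (the real objects are considerably more involved):

```python
class Repo(object):
    def __init__(self, changelog):
        self.changelog = changelog

    def invalidate(self, newchangelog):
        # Invalidation swaps in a fresh changelog object.
        self.changelog = newchangelog


class CachingManifestLog(object):
    """Old approach: captures the changelog once, at construction."""

    def __init__(self, repo):
        self._changelog = repo.changelog

    def linkrev(self, node):
        return self._changelog[node]


class RepoBackedManifestLog(object):
    """New approach: asks the repo for the changelog on every access."""

    def __init__(self, repo):
        self._repo = repo

    def linkrev(self, node):
        return self._repo.changelog[node]
```

After `repo.invalidate(...)`, only the repo-backed variant sees entries added to the new changelog.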
Reviewed By: singhsrb
Differential Revision: D7360321
fbshipit-source-id: 2317c7fcd6b307a50b64f0c5df97dda2955f3e21
Summary:
When remotefilelog downloads filelog information for a particular file, set the
progress bar item to that file. This means if we get stuck on a particular
file, there is feedback to the user as to which file that is.
Reviewed By: quark-zju
Differential Revision: D7329503
fbshipit-source-id: 94416962cdc4c97994f76e8ed9203823aeca3d64
Summary:
A prior diff optimized push/pull to not send public trees between
peers, since those trees can be downloaded from the main server. Let's be
careful when sending data to the main server and always send everything.
In the future we should add validation on the server that the received data is
complete, but Mercurial doesn't do that today.
Differential Revision: D7296253
fbshipit-source-id: 49513685d19991a70d66da1d734ddae23491ed0c
Summary:
This allows pushing a treeonly pack to a server without using
pushrebase.
Differential Revision: D7295686
fbshipit-source-id: b0bfe4fbb04bc765e57f1db82909fa1ae7b3063b
Summary:
It can sometimes be nice to be able to just repack loose data and avoid existing packs.
This came up during building the megarepo, which creates many loose files, but including the packs added too much repack time.
Differential Revision: D7209296
fbshipit-source-id: 10afed40e409733e0ee004f025013cf86f3f7bf6
Summary:
Previously we were just putting nullid as the linknode in client side
trees, because when the trees were added the changelog hadn't been written yet,
so we didn't know the linknode. This diff updates mutablehistorypack to allow
resolving the linknodes at serialization time instead of add time. This will be
used in a future diff to fix storing linknodes in trees.
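The idea can be sketched as follows. This is not the mutablehistorypack API; `DeferredLinkPack`, `linkresolver` and `NULLID` are hypothetical names used only for illustration:

```python
NULLID = b"\0" * 20


class DeferredLinkPack(object):
    """Hypothetical pack that may resolve linknodes when serializing."""

    def __init__(self):
        self._entries = []

    def add(self, name, node, linknode=None, linkresolver=None):
        # Either a concrete linknode or a callable that produces one later.
        self._entries.append((name, node, linknode, linkresolver))

    def serialize(self):
        out = []
        for name, node, linknode, linkresolver in self._entries:
            if linkresolver is not None:
                # Resolved now, after the changelog has been written.
                linknode = linkresolver()
            if linknode is None:
                linknode = NULLID
            out.append((name, node, linknode))
        return out
```

An entry can be added before the commit exists, as long as the resolver can find the linknode by the time `serialize()` runs.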
Reviewed By: ryanmce
Differential Revision: D7280105
fbshipit-source-id: 70063e627d0fd7baeb017bac5ac55957a100d06c
Summary:
In a future diff we'll want to allow converting flat manifests to tree
manifests during infinitepush. To do so, let's allow writing to in memory packs
on bundlerepositories.
Reviewed By: StanislavGlebik
Differential Revision: D7256563
fbshipit-source-id: 10ec58d1171b7882d6db9a916c50a19bc11dbcb4
Summary: We need to pass None to the progress bar so it knows it's done.
Reviewed By: ryanmce
Differential Revision: D7267691
fbshipit-source-id: 81b19d945a44e8f46a15abdaf501ef6b5dea4ffc
Summary:
Previously, if we weren't sending trees we would not attempt to process
what files needed to be sent. This was incorrect, since files may be sent
independent of what trees are decided to be sent. So let's update the code to
care about shouldaddfilegroup instead of cansendtrees.
This exposed an additional bug where we wouldn't look at the treemanifests
during the file computation, so when sending treeonly infinitepush bundles we
would get an error.
Reviewed By: singhsrb
Differential Revision: D7240340
fbshipit-source-id: a9be69597f4f1cbbecdd7cb1661f1023114bf621
Summary:
Sometimes infinitepush has to rebundle the data it's trying to send to
the client (like if you pull only part of a bundle). Previously, this was broken
for treeonly bundles because we did not correctly extend the bundlerepo's
data/history store on treemanifest servers. This patch fixes that and updates a
test.
Reviewed By: quark-zju
Differential Revision: D7240343
fbshipit-source-id: a9dd3ae884ace3fa9f4a748fe753fc394e69d6c9
Summary:
The cstore doesn't contain our ondemand generator, nor our mutable data
store, so it's really not usable yet. Let's disable it for now until we can fix
up the issues that prevent us from using it (like having it report metrics).
Reviewed By: ryanmce
Differential Revision: D7148823
fbshipit-source-id: 5cc46af33c049b751c1c04916aafe0768d80ce7a
Summary:
In a future patch we will start using the contents of the
manifestrevlogstore as the place we call add() to add new revisions. In order to
do that, those revlogs must be manifestrevlogs. This patch just changes them to
use the right type.
Reviewed By: quark-zju
Differential Revision: D7148824
fbshipit-source-id: 0d60cd22b041db83b61ccaec3b6172624ea97e42
Summary:
This diff adds a config option to tweak deltabase in changegroup. It has 3
options:
- Always null - always use "null" as delta base, effectively making
  everything full text
- No external - delta bases cannot be a revision outside the changegroup
- Default - the current behavior: delta bases can be anything the client
  thinks the server should have.
This gives Mononoke more time to bake delta related logic, as we can
choose "always null" first, then incrementally increase the complexity.
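The three modes could be sketched like this. The mode strings, the function name and the fallback to the null revision in the "no external" case are all assumptions for illustration, not the actual config values or implementation:

```python
NULLREV = -1


def choose_delta_base(mode, candidate, changegroup_revs):
    """Pick a delta base for a revision under a hypothetical mode name."""
    if mode == "always-null":
        # Everything becomes a full text.
        return NULLREV
    if mode == "no-external":
        # Assumed fallback: use null when the candidate is outside the changegroup.
        return candidate if candidate in changegroup_revs else NULLREV
    if mode == "default":
        # Current behavior: trust the candidate as given.
        return candidate
    raise ValueError("unknown delta base mode: %r" % (mode,))
```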
Reviewed By: phillco
Differential Revision: D7158585
fbshipit-source-id: 5f6d9a78d1108093e8d08b9f296568f4f7e7471b
Summary:
Currently if you push or pull a bunch of commits between peers we will
include all the trees as part of the push. If the source repo doesn't have all
the necessary trees, it will go to the server to get them. Since the other
machine can just as easily go to the server (and probably won't need most of
those trees anyway), let's just have the source client send all draft trees and
skip the public commits.
Reviewed By: phillco
Differential Revision: D7141623
fbshipit-source-id: 6d33ae9d4c9cc32bf6dfa76f733c87c06890d719
Summary:
As part of unifying all our pre-pull/push prefetches, let's move the
changegroup-building prefetch into the cansendtrees function. In a future diff
we'll change this logic to not send trees for public commits in a peer to peer
push/pull.
Reviewed By: mjpieters
Differential Revision: D7141625
fbshipit-source-id: 0253fa32993666f3e03c10c98163d8d60370a97c
Summary:
A future diff will make it so we can send only draft trees instead of
all trees. To prepare for this, let's move the cansendtrees logic to
shallowbundle (since it will be used by both shallowbundle and by treemanifest)
and change it to return an enum.
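A sketch of what an enum-style return value makes possible. The constant names and the `select_trees` helper are hypothetical; the actual values returned by cansendtrees may differ:

```python
class SendTrees(object):
    """Hypothetical enum-like constants for what to send."""
    NONE = 0
    DRAFT = 1
    ALL = 2


def select_trees(mode, commits):
    """commits is an iterable of (tree_node, is_public) pairs."""
    if mode == SendTrees.NONE:
        return []
    if mode == SendTrees.DRAFT:
        # Skip trees that belong to public commits.
        return [tree for tree, is_public in commits if not is_public]
    if mode == SendTrees.ALL:
        return [tree for tree, _ in commits]
    raise ValueError("unknown mode: %r" % (mode,))
```

A boolean can only express the first and last cases; the middle one is what the enum adds.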
Reviewed By: quark-zju
Differential Revision: D7141624
fbshipit-source-id: 34c78b0d1cdb6f8d86a99fb74665e80b2af12c5c
Summary:
During big `hg update` calls, the user will often see no progress
initially while we discover which blobs we need to download. Let's fix that.
Reviewed By: quark-zju
Differential Revision: D6903134
fbshipit-source-id: 35b174120b6dce412dd337b6b93c9f5b4233522d
Summary:
When background prefetch is enabled, let's use the base parameter to
limit how many files we download. This makes the operation O(files changed), and
on windows results in a significant speed up.
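Roughly, the effect of passing a base looks like the sketch below. Manifests are modeled as plain dicts from path to file node, and `files_to_prefetch` is a hypothetical helper, not the remotefilelog function:

```python
def files_to_prefetch(target, base=None):
    """Return the paths whose contents would need downloading."""
    if base is None:
        # Without a base, every file in the target is a candidate.
        return sorted(target)
    # With a base, only files whose node differs need to be fetched.
    return sorted(path for path, node in target.items()
                  if base.get(path) != node)
```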
Reviewed By: mjpieters
Differential Revision: D7108838
fbshipit-source-id: a46b8a7d897ee204b9a4c1f1c65d875dbd3e9bc7
Summary:
Historically treeonly clients were ignoring flat manifests sent in
bundles, but with a recent change they will now try to recreate those manifests
when they receive such a bundle (so we don't lose any data when a user does 'hg
unbundle' on a flat-only bundle). This means we need to stop sending flat
manifests from the server to treeonly clients.
Differential Revision: D7083028
fbshipit-source-id: e4580b00a8be96fbef0ee624529c58f41cfa2752
Summary:
Previously we were only wrapping the changegroup packers for the
clients. In a future diff we want to use the shallow changegroup packers to
govern when to send trees from the server, so we need them enabled for the
server.
It turns out they were mostly enabled for the server already. While we weren't
replacing the classes in changegroup._packermap (which is what they get
instantiated from), we were replacing them via the interposeclass decorator.
This meant that the server would construct a cg3packer, but it would have a
shallowcg1packer in the class hierarchy, so most of the code was running
already. So this should be a pretty low risk change. In theory.
Differential Revision: D7083042
fbshipit-source-id: 5ce44a9ceda4d7d4bd126f52a01e45e6e1e7de40
Summary:
This makes the mutable history pack implement the history store read
api so we can add it to the union store and read the contents of things that
have been written but not yet committed.
The mutablehistorypack fileentries variable has been changed to contain a dict
instead of a list so we can access it quickly during reads. The list is from a
legacy requirement where we used to maintain the order that the writer wrote in.
We no longer do that (instead we topologically sort what they've given us), so
switching from a list to a dict should be fine.
Differential Revision: D7083036
fbshipit-source-id: ae511db60ab6432059714a2271c175dc9683b8e1
Summary:
Adds addstore and removestore to the union history store, just like
we've already done for the union data store.
Reviewed By: ryanmce
Differential Revision: D7083055
fbshipit-source-id: 49f1a4156376d0cf5d6191c4d30ec923ddb2ec14
Summary:
In a future diff we'll need the ability to modify the union store on
the fly, so let's add addstore and removestore apis.
Reviewed By: ryanmce
Differential Revision: D7051102
fbshipit-source-id: 901a50720bfdf4e5c59714d092830e65edccdfce
Summary:
The union content store
- iterates through all the stores it has until the current store has the
content.
- Or, it fails eventually if none of the stores have the content.
It does so by relying on the current store throwing a KeyError if it doesn't
have the content.
`remotetreedatastore` was throwing MissingNodesError instead, which means any
remaining stores after it would not even get a chance to look for the content.
This commit addresses that.
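The lookup loop described above can be sketched as follows. All three classes are simplified, hypothetical stand-ins, and a plain LookupError plays the role of any exception that is not a KeyError:

```python
class UnionContentStore(object):
    def __init__(self, *stores):
        self.stores = list(stores)

    def get(self, name, node):
        for store in self.stores:
            try:
                return store.get(name, node)
            except KeyError:
                # This store doesn't have it; try the next one.
                continue
        raise KeyError((name, node))


class DictStore(object):
    def __init__(self, data):
        self._data = data

    def get(self, name, node):
        return self._data[(name, node)]  # raises KeyError on a miss


class OtherErrorStore(object):
    """Signals a miss with something other than KeyError."""

    def get(self, name, node):
        raise LookupError("missing %s:%s" % (name, node))
```

If `OtherErrorStore` comes first, its exception escapes the loop and the later `DictStore` is never consulted, even when it has the content.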
Reviewed By: ryanmce
Differential Revision: D6867854
fbshipit-source-id: 784df195efcbe16f2e716968f3d93159afff6206
Summary:
We don't really support files larger than 2G, and with files larger than
4G remotefilelog crashes badly.
Reviewed By: markbt
Differential Revision: D7066855
fbshipit-source-id: 40cdebbe703a7b3f13ce84174bf6f96565e8c3b7
Summary:
If a remotefilelog loose object has a size of 0, it is invalid. Since
we're already doing a syscall to check if the file exists, we might as well
read its size in the process and consider it missing if the size is 0.
Previously we would report the file as present, then later when we tried to use
the file we would notice it was broken and download it on demand. If many files
were broken, this ad hoc one-by-one downloading would be extremely slow, so
catching it earlier lets us do a batch re-download and speeds things up.
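A minimal sketch of the check, assuming a single `os.stat` call stands in for the existing syscall. Both function names are hypothetical:

```python
import os


def is_usable_loose_file(path):
    """True only if the file exists and is non-empty."""
    try:
        return os.stat(path).st_size > 0
    except OSError:
        # A missing file is treated the same as an empty one.
        return False


def partition_loose_files(paths):
    """Split paths into (usable, missing) so missing ones can be batched."""
    usable, missing = [], []
    for path in paths:
        (usable if is_usable_loose_file(path) else missing).append(path)
    return usable, missing
```

The `missing` list can then be handed to one batch download instead of fetching each file on first use.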
Reviewed By: singhsrb
Differential Revision: D7046960
fbshipit-source-id: 0139f7c2cad3de617e9aae925a075bdb65f70ff5
Summary:
When hg addremove needs to remove a file, remotefilelog tries to create a filenode in order to prefetch the removed files from the server. If the file is not in the parent context manifest, this throws an exception.
To solve this we have to first check if the file exists or not in the parent manifest. If the file does not exist in the manifest then we don't need to prefetch it, and addremove will behave like forget rather than remove.
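In sketch form, with the parent manifest modeled as any container supporting `in` and a hypothetical helper name:

```python
def removed_files_to_prefetch(removed, parent_manifest):
    """Keep only removed files that the parent manifest actually tracks."""
    return [path for path in removed if path in parent_manifest]
```

Files filtered out here are never turned into file nodes, so the exception described above is avoided.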
Reviewed By: markbt
Differential Revision: D7009649
fbshipit-source-id: 0570bc00db546a455b9c2e4628740e24ca819dd6
Summary:
Previously we would only do ondemand downloading of tree blobs when
accessing data contents. When accessing just history we would just fail if the
data wasn't available locally. This adds the remote store to the history union
store so we can get history remotely.
Reviewed By: singhsrb
Differential Revision: D7003434
fbshipit-source-id: 839f8e84be35779ccb146d13ce3e1d6d1e7f46bd
Summary:
This can be convenient when the size of LFS downloads between r1 and r2 is needed.
NB: this does not pay attention to sparse profiles and acts as if the checkout was full. We probably need to know how to do both.
Reviewed By: quark-zju
Differential Revision: D6938259
fbshipit-source-id: 52ab88be83339472f2eccafc746a191ff26c16c7
Summary:
remotefilelog prefetch was broken in treeonly mode since it referred to
the manifest revlog to get the parents. Let's switch to the more modern
manifestctx way of accessing parent information.
Reviewed By: quark-zju
Differential Revision: D6995267
fbshipit-source-id: e0c11fca0f2156be3f936a6e437e7a4d3dffe75b
Summary:
As part of producing only one pack file per transaction, we need to
change the mutabledatapack store to allow reads of written-but-not-finalized
data. This will allow us to add the mutabledatapack to the union store, and then
keep it alive for the duration of the transaction.
Reviewed By: quark-zju
Differential Revision: D6944348
fbshipit-source-id: 1e721bd8e07335a9c1f9c6b7595a765ec018c007
Summary:
In an upcoming diff we'll make mutabledatapack readable, so we need the
entry parsing logic to be accessible. Let's move it out.
Reviewed By: singhsrb
Differential Revision: D6944345
fbshipit-source-id: c75162bb7c4fa47c6339b7ebbf96a4e386bd04b3
Summary:
This check is useful and detects real errors (e.g. fbconduit). Unfortunately
`arc lint` runs it with both py2 and py3, so a lot of py2 builtins still
trigger warnings.
I didn't find a clean way to disable the py3 check, so this diff tries to fix them.
For `xrange`, the change was done by a script:
```
import sys
import redbaron

headertypes = {'comment', 'endl', 'from_import', 'import', 'string',
               'assignment', 'atomtrailers'}

xrangefix = '''try:
    xrange(0)
except NameError:
    xrange = range
'''

def isxrange(x):
    try:
        return x[0].value == 'xrange'
    except Exception:
        return False

def main(argv):
    for i, path in enumerate(argv):
        print('(%d/%d) scanning %s' % (i + 1, len(argv), path))
        content = open(path).read()
        try:
            red = redbaron.RedBaron(content)
        except Exception:
            print(' warning: failed to parse')
            continue
        hasxrange = red.find('atomtrailersnode', value=isxrange)
        hasxrangefix = 'xrange = range' in content
        if hasxrangefix or not hasxrange:
            print(' no need to change')
            continue

        # find a place to insert the compatibility statement
        changed = False
        for node in red:
            if node.type in headertypes:
                continue
            # node.insert_before is an easier API, but it has bugs changing
            # other "finally" and "except" positions. So do the insert
            # manually.
            # # node.insert_before(xrangefix)
            line = node.absolute_bounding_box.top_left.line - 1
            lines = content.splitlines(1)
            content = ''.join(lines[:line]) + xrangefix + ''.join(lines[line:])
            changed = True
            break

        if changed:
            # "content" is faster than "red.dumps()"
            open(path, 'w').write(content)
            print(' updated')

if __name__ == "__main__":
    sys.exit(main(sys.argv[1:]))
```
For other py2 builtins that do not have a py3 equivalent, some `# noqa`
were added as a workaround for now.
Reviewed By: DurhamG
Differential Revision: D6934535
fbshipit-source-id: 546b62830af144bc8b46788d2e0fd00496838939
Summary:
On Windows, the intended access to the file handle must align with the
intended access to the memory mapping object, see [1].
When called without an `access` argument, Python's `mmap.mmap` on Windows assumes
`ACCESS_WRITE` mode [3], therefore we get failures like [2] in Lego Windows.
[1]
https://msdn.microsoft.com/en-us/library/windows/desktop/aa366537(v=vs.85).aspx
[2] P59034213
[3] https://docs.python.org/2/library/mmap.html
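A minimal sketch of opening a read-only mapping that matches a read-only file handle. `read_via_mmap` is a hypothetical helper, not the code changed by this diff:

```python
import mmap


def read_via_mmap(path):
    """Map a non-empty file read-only and return its bytes."""
    with open(path, "rb") as f:
        # The handle is read-only, so request a read-only mapping as well.
        m = mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ)
        try:
            return m[:]
        finally:
            m.close()
```

Passing `access=mmap.ACCESS_READ` explicitly keeps the mapping's access in line with how the file was opened.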
Reviewed By: quark-zju
Differential Revision: D6952583
fbshipit-source-id: 93159f8282e27d3e62d859f4c220e7c3bdfbe958
Summary:
With certain setups, the file history could be incorrect. That makes
`adjustlinknode` much slower since it has to scan the full history.
It is extremely slow if there is a commit with massive renames.
This is undesirable since `hg log FILE` would give a wrong result.
But it does make the repo usable. Automation that does not care
about `hg log FILE` correctness can probably enable this relatively
safely.
This patch changes both shallow and full repos.
Reviewed By: DurhamG
Differential Revision: D6912051
fbshipit-source-id: 23d6f6c8dd91d4f72b43bc560cf26686bd6c4b47
Summary:
The remotefilelog cgunpacker logic could enter an infinite loop
if a copyfrom file node uses its copyto as delta base.
This repros without LFS in test-lfs-bundle.t.
Reviewed By: DurhamG
Differential Revision: D6910079
fbshipit-source-id: 99fea316e77218cd4bc9ea6f5506779a3e4ab9a6