Commit Graph

39 Commits

Author SHA1 Message Date
Boris Feld
cc076020d2 setdiscover: allow to ignore part of the local graph
Currently, the push discovery first determines the full set of common nodes
before looking into what changesets are outgoing. When pushing a specific
subset, this can lead to pathological situations where we search for the status
of thousand of local heads that are unrelated to the requested pushes.

To fix this, we need to teach the discovery to ignores part of the graph. Most
of the necessary pieces were already in place. This changeset just makes them
available to higher level API and tests them.

Change actually impacting pushes are coming in a later changeset.
2017-12-06 22:44:51 +01:00
Augie Fackler
da28f5be4e merge with stable 2017-11-30 15:48:42 -05:00
Yuya Nishihara
52b06fe73d dispatch: verify result of early command parsing
Before, early options were stripped from args, and because of this, some
kind of parsing errors weren't reported. For example,

  $ hg ci -m -Ra file

would execute "hg ci -m file" in repository "a".

This patch fixes the issue by parsing early options again by real getopt-based
parser, and verifying the results. If the early parsing appears wrong, hg just
aborts. The current error message seems not nice, and should be improved, maybe
in V2 or follow-up.

Note that this isn't a security feature because we can still do anything by
using shell aliases.
2017-11-11 12:40:13 +09:00
Yuya Nishihara
7137d3d976 dispatch: do not drop unpaired argument at _earlygetopt()
Before, "hg log -R" just worked.
2017-11-11 12:09:19 +09:00
Boris Feld
5e36a9bc8b test-pattern: actually update tests using the patterns
We mass update the tests now. This will help the next soul touching the http
protocol.
2017-11-05 08:23:12 +01:00
Denis Laxalde
9efc7f05e3 transaction-summary: show the range of new revisions upon pull/unbundle (BC)
Upon pull or unbundle, we display a message with the range of new revisions
fetched. This revision range could readily be used after a pull to look out
what's new with 'hg log'. The algorithm takes care of filtering "obsolete"
revisions that might be present in transaction's "changes" but should not be
displayed to the end user.
2017-10-12 09:39:50 +02:00
Saurabh Singh
15210308cf test-setdiscovery: make test compatible with chg
The test checks the output of the blackbox extension which will
contain logs corresponding to chg in case chg is running. Therefore, this
commit modifies the test to take chg into consideration while working with the
blackbox extension.

Test Plan:
Ran the test 'test-setdiscovery.t' with and without the '--chg'
option.

Differential Revision: https://phab.mercurial-scm.org/D926
2017-10-03 13:30:36 -07:00
Boris Feld
bbf23f4d9a pull: use 'phase-heads' to retrieve phase information
A new bundle2 capability 'phases' has been added. If 'heads' is part of the
supported value for 'phases', the server supports reading and sending 'phase-
heads' bundle2 part.

Server is now able to process a 'phases' boolean parameter to 'getbundle'. If
'True', a 'phase-heads' bundle2 part will be included in the bundle with phase
information relevant to the whole pulled set. If this method is available the
phases listkey namespace will no longer be listed.

Beside the more efficient encoding of the data, this new method will greatly
improve the phase exchange efficiency for repositories with non-served
changesets (obsolete, secret) since we'll no longer send data about the
filtered heads.

Add a new 'devel.legacy.exchange' config item to allow fallback to the old
'listkey in bundle2' method.

Reminder: the pulled set is not just the changesets bundled by the pull. It
also contains changeset selected by the "pull specification" on the client
side (eg: everything for bare pull). One of the reason why the 'pulled set' is
important is to make sure we can move -common- nodes to public.
2017-09-24 21:27:18 +02:00
Augie Fackler
e5d7bd82c5 cleanup: use $PYTHON to run python in many more tests
Spotted one of these, then wrote a check-code rule that caught them
all. It will be the next change.
2017-06-20 09:45:02 -04:00
Pierre-Yves David
c82b13f1cf setdiscovery: improves logged message
The 'srvheads' list contains all server heads including the common ones. We
adjust 'ui.log' message to provide more useful information about server heads
locally unknown. The performance impact of turning the list to set is
negligible (about 1e-4s) compared to the rest of the discovery cost, so I'm
taking the easy path.
2017-06-10 18:47:09 +01:00
Matt Harbison
71414924c3 test-setdiscovery: stabilize for Windows
Windows wants double quotes here.
2017-06-10 00:11:54 -04:00
Pierre-Yves David
ef5b27290d discovery: log discovery result in non-trivial cases
We log the discovery summary, the number of roundtrips and the elapsed time.
This is useful to understand where slow push might come from when lloking at
the blackbox.
2017-06-07 10:44:11 +01:00
Pierre-Yves David
4db3d34a4b discovery: include timing in the debug output
Having such date easily available is useful. It also prepare the inclusion of
some discovery related data in blackbox.
2017-06-07 10:29:39 +01:00
Gregory Szorc
e1840d5435 httppeer: advertise and support application/mercurial-0.2
Now that servers expose a capability indicating they support
application/mercurial-0.2 and compression, clients can key off
this to say they support responses that are compressed with
various compression formats.

After this commit, the HTTP wire protocol client now sends an
"X-HgProto-<N>" request header indicating its support for
"application/mercurial-0.2" media type and various compression
formats.

This commit also implements support for handling
"application/mercurial-0.2" responses. It simply reads the header
compression engine identifier then routes the remainder of the
response to the appropriate decompressor.

There were some test changes, but only to logging. That points to
an obvious gap in our test coverage. This will be addressed in a
subsequent commit once server support is in place (it is hard to
test without server support).
2016-12-24 15:22:18 -07:00
Maciej Fijalkowski
2e11b650af pypy: fix setdiscovery test
This test relies on the exact details of random.sample given the
seed. Things work a bit differently under pypy, make the test less
specific.
2016-04-05 14:44:18 +03:00
Martin von Zweigbergk
63c15f247e changegroup3: introduce experimental.changegroup3 boolean config
In order to give us the freedom to change the changegroup3 format,
let's hide it behind an experimental config. Since it is required by
treemanifests, that will override the cg3 config.
2016-01-12 21:23:45 -08:00
Augie Fackler
d33d6a0cb5 changegroup: introduce cg3, which has support for exchanging treemanifests
I'm not entirely happy with using a trailing / on a "file" entry for
transferring a treemanifest. We've discussed putting some flags on
each file header[0], but I'm unconvinced that's actually any better:
if we were going to add another feature to the cg format we'd still be
doing a version bump anyway to cg4, so I'm inclined to not spend time
coming up with a more sophisticated format until we actually know what
the next feature we want to stuff in a changegroup will be.

Test changes outside test-treemanifest.t are only due to the new CG3
bundlecap showing up in the wire protocol.

Many thanks to adgar@google.com and martinvonz@google.com for helping
me with various odd corners of the changegroup and treemanifest API.

0: It's not hard refactoring, nor is it a lot of work. I'm just
disinclined to do speculative work when it's not clear what the
customer would actually be.
2015-12-11 11:23:49 -05:00
Pierre-Yves David
3ff87b1cf4 incoming: request a bundle2 when possible (BC)
Incoming was using bundle1 in all cases, as bundle1 is restricted to
changegroup1 and does not support general delta, this can lead to significant
CPU overhead if the server is using general delta storage. We now properly
request and store a bundle2 to disk.

If the server include any output or error in the bundle, they will be stored on
disk and replayed when the bundle is read. As 'hg incoming' is going to read the
bundle right away, we call that 'good' enough and go back to the bigger plan of
having general delta on by default.

This was tracked as 4864
2015-10-05 00:23:20 -07:00
Matt Mackall
b709208c37 tests: drop DAEMON_PIDS from killdaemons calls 2015-06-08 14:55:40 -05:00
Matt Mackall
3ad28905f6 tests: drop explicit $TESTDIR from executables
$TESTDIR is added to the path, so this is superfluous. Also,
inconsistent use of quotes means we might have broken on tests with
paths containing spaces.
2015-06-08 14:44:30 -05:00
Pierre-Yves David
281365197e progress: get the extremely verbose output out of default debug
When the progress extension is not enabled, each call to 'ui.progress' used to
issue a debug message. This results is a very verbose output and often redundant
in tests. Dropping it makes tests less volatile to factor they do not meant to
test.

We had to alter the sed trick in 'test-rename-merge2.t'. Sed is used to drop all
output from a certain point and hidding the progress output remove its anchor.
So we anchor on something else.
2015-05-09 23:40:40 -07:00
Matt Harbison
9741c43e48 tests: replace uses of 'seq' with portable 'seq.py' 2015-03-17 21:47:47 -04:00
Pierre-Yves David
6ff053fa11 setdiscovery: always add exponential sample to the heads
As explained in a previous changeset, prioritizing heads too much behaves
pathologically when there are more heads than the sample size. To counter this,
we always inject exponential samples before reducing to the sample size limit.

This already show some benefit in the test themselves, but on a real-world example
this moves my discovery for push to pathologically headed repo from 45 rounds to
17 of them.

We should maybe ensure that at least 25% of the result sample is heads, but I
think the random sampling will be fine in practice.
2015-01-07 17:28:51 -08:00
Pierre-Yves David
60a9cd0334 setdiscovery: directly run '_updatesample'
The heads and exponential sample are going to end up in the same set
before any extra processing happens. We simplify the code by directly
updating a set with heads.

Changes in the order the set is built lead to small changes in the random
sampling output. But after double checking, I can confirm the input data to
the random sampling is consistent.
2015-01-07 17:23:21 -08:00
Pierre-Yves David
e3605ecf1f setdiscovery: randomly pick between heads and sample when taking full sample
Before this changeset, the discovery protocol was too heads-centric. Heads of the
undiscovered set were always sent for discovery and any room remaining in the
sample were filled with exponential samples (and random ones if any room
remained).

This behaved extremely poorly when the number of heads exceeded the sample size,
because we keep just asking about the existence of heads, then their direct parent
and so on. As a result, the 'O(log(len(repo)))' discovery turns into a
'O(len(repo))' one. As a solution we take a random sample of the heads plus
exponential samples. This way we ensure some exponential sampling is achieved,
bringing back some logarithmic convergence of the discovery again.

This patch only applies this principle in one place. More places will be updated
in future patches.

One test is impacted because the random sample happen to be different. By
chance, it helps a bit in this case.
2015-01-07 12:09:51 -08:00
Eric Sumner
7be426bb53 incoming: handle phases the same as pull
Now that bundlerepo can move phases safely, 'hg incoming' can share its phase
handling code with pull to better reflect what would actually show up.
2014-12-18 12:33:17 -08:00
Mads Kiilerich
f7f618d52d discovery: test coverage for issue4438 / 475a22a41c55 / a720a37e15a3
The randomness in the discovery protocol made this problem hard to reproduce.
The test mocks random.sample to make sure we hit the problem every time. The
set iteration order also made the output unstable ... but with the issue fixed,
it is stable.
2014-11-06 01:48:29 +01:00
Pierre-Yves David
1b8f2c7e41 setdiscovery: limit the size of all sample (issue4411)
Further digging on this issue show that the limit on the sample size used in
discovery never works for heads. Here is a quote from the code itself:

  desiredlen = size - len(always)
  if desiredlen <= 0:
      # This could be bad if there are very many heads, all unknown to the
      # server. We're counting on long request support here.

The long request support never landed and evolution make the "very many heads,
all unknown to the server" case quite common.

We implement a simple and stupid hard limit of sample size for all query. This
should prevent HTTP 414 error with the current state of the code.
2014-11-01 23:52:53 +00:00
Pierre-Yves David
e107a615ed setdiscovery: limit the size of the initial sample (issue4411)
The set discovery start by sending a "known" command with all local heads. When
the number of local heads is massive (eg: using hidden changesets) such request
becomes too large. This lead to 414 error over http, aborting the whole
process.

We limit the size of the sample used by the first query to fix this.

The test are impacted because they do test massive number of heads. But they do
not test it over real world http setup.
2014-10-27 17:52:33 +01:00
Mads Kiilerich
7f9b497a4c incoming: don't request heads that already are common
Pull would send a getbundle command where common heads were sent both as common
and head, even though there is no reason to request a common head.
The request was thus twice as big as necessary and more likely to hit HTTP
header size limits.

Instead, don't request heads that already are common.

This is fixed in bundlerepo.getremotechanges . It could perhaps also have been
fixed in discovery.findcommonincoming but that would have a bigger impact.
2014-08-15 03:24:40 +02:00
Mads Kiilerich
c6b11d841f tests: improve test coverage for discovery and actual parameters for pulling 2014-08-15 03:24:40 +02:00
Mads Kiilerich
d5ef6b65bb debugdiscovery: report heads in sorted order 2012-12-12 02:38:14 +01:00
Mads Kiilerich
fa1c4e5ebe tests: add missing trailing 'cd ..'
Many tests didn't change back from subdirectories at the end of the tests ...
and they don't have to. The missing 'cd ..' could always be added when another
test case is added to the test file.

This change do that tests (99.5%) consistently end up in $TESTDIR where they
started, thus making it simpler to extend them or move them around.
2012-06-11 01:40:51 +02:00
Lee Cantey
c042f32203 test-setdiscovery: allow for leading space in output of wc 2011-09-12 09:20:31 -07:00
Peter Arrenbrecht
83352215f8 setdiscovery: fix hang when #heads>200 (issue2971)
When setting up the next sample, we always add all of the heads, regardless
of the desired max sample size. But if the number of heads exceeds this
size, then we don't add any more nodes from the still undecided set.
(This is debatable per se, and I'll investigate it, but it's how we designed
it at the moment.)

The bug was that we always added the overall heads, not the heads of the
remaining undecided set. Thus, if #heads>200 (desired sample size), we
did not make progress any longer.
2011-08-25 21:25:14 +02:00
Peter Arrenbrecht
bc696e8298 dagutil: fix off-by-one in inverserevlogdag buildup 2011-08-25 17:20:00 +02:00
Mads Kiilerich
2c9adcac85 tests: solaris [ doesn't know -e 2011-06-25 01:55:16 +02:00
Peter Arrenbrecht
9a2d2f747c setdiscovery: batch heads and known(ownheads)
This means that we now discover both subset conditions (local<remote and
remote<local) in a single roundtrip without ever constructing an actual
sample (which takes a bit of client CPU).
2011-06-14 22:58:00 +02:00
Peter Arrenbrecht
75fa0e5ea9 discovery: add new set-based discovery
Adds a new discovery method based on repeatedly sampling the still
undecided subset of the local node graph to determine the set of nodes
common to both the client and the server.

For small differences between client and server, it uses about the same
or slightly fewer roundtrips than the old tree-based discovery. For
larger differences, it typically reduces the number of roundtrips
drastically (from 150 to 4, for instance).

The old discovery code now lives in treediscovery.py, the new code is
in setdiscovery.py.

Still missing is a hook for extensions to contribute nodes to the
initial sample. For instance, Augie's remotebranches could contribute
the last known state of the server's heads.

Credits for the actual sampler and computing common heads instead of
bases go to Benoit Boissinot.
2011-05-02 19:21:30 +02:00