Commit Graph

272 Commits

Author SHA1 Message Date
Durham Goode
2dc959255a changegroup: remove dictionary creation from deltachunk
Previously delta chunk returned a dictionary. Now that we consume deltachunk
within changegroup (instead of outside in revlog) we can just return a tuple and
have it be returned directly by deltaiter.

Differential Revision: https://phab.mercurial-scm.org/D746
2017-09-20 09:35:45 -07:00
Durham Goode
c00411b064 revlog: add revmap back to revlog.addgroup
The recent e85296920485 patch removed the linkmapper argument from addgroup, as
part of trying to make addgroup more agnostic from the changegroup format. It
turns out that the changegroup can't resolve linkrevs while iterating over the
deltas, because applying the deltas might affect the linkrev resolution. For
example, when applying a series of changelog entries, the linkmapper just
returns len(cl). If we're iterating over the deltas without applying them to the
changelog, this results in incorrect linkrevs. This was caught by the hgsql
extension, which reads the revisions before applying them.

The fix is to return linknodes as part of the delta iterator, and let the
consumer choose what to do.

Differential Revision: https://phab.mercurial-scm.org/D730
2017-09-20 09:22:22 -07:00
Kevin Bullock
f0dbcd6ad0 merge with stable 2017-09-18 14:12:20 -05:00
Martin von Zweigbergk
01e8a04410 repair: preserve phase also when not using generaldelta (issue5678)
It seems like we used to pick the oldest possible version of the
changegroup to use for bundles created by the repair module (used
e.g. by "hg strip" and for temporary bundles by "hg rebase"). I tried
to preserve that behavior when I created the changegroup.safeversion()
method in 77f74106b264 (changegroup: introduce safeversion(),
2016-01-19).

However, we have recently chagned our minds and decided that these
commands are only used locally and downgrades are unlikely. That
decicion allowed us to start adding obsmarker and phase information to
these bundles. However, as the bug report shows, it means we get
different behavior e.g. when generaldelta is not enabled (because when
it was enabled, it forced us to use bundle2). The commit that actually
caused the reported bug was 26d535788092 (strip: include phases in
bundle (BC), 2017-06-15).

So, since we now depend on having more information in the bundles,
let's make sure we instead pick the newest possible changegroup
version.

Differential Revision: https://phab.mercurial-scm.org/D715
2017-09-14 11:16:57 -07:00
Durham Goode
b8aaeea8b3 changegroup: add source parameter to generatemanifests
Extensions, like remotefilelog, will want to look at the source of a pull when
determining what manifests to add to a changegroup. For instance, on push they
will include everything, while on pull they won't.

Differential Revision: https://phab.mercurial-scm.org/D686
2017-09-11 13:39:22 -07:00
Durham Goode
c94bd0e75c changegroup: remove changegroup dependency from revlog.addgroup
Previously revlog.addgroup would accept a changegroup and a linkmapper and use
it to iterate of the deltas. As part of untangling the revlog-changegroup
interdependency, let's move the changegroup delta iteration logic to it's own
function and pass the simple iterator to the revlog instead.

This will make it easier to introduce non-revlogs stores in the future, without
reinventing any changegroup specific logic.

Differential Revision: https://phab.mercurial-scm.org/D688
2017-09-13 10:43:44 -07:00
Durham Goode
9ff843ded9 changegroup: rename getsubsetraw to makestream
Now that nothing uses getsubsetraw except makestream, let's move the
functionality into the makestream. This removes the last remaining excess
changegroup creation function, getsubsetraw.

Differential Revision: https://phab.mercurial-scm.org/D671
2017-09-10 18:52:40 -07:00
Durham Goode
21faf618dc changegroup: replace getchangegroup with makechangegroup
As part of reducing the number of changegroup creation APIs, let's replace
getchangegroup with calls to makechangegroup. This is mostly a drop in
replacement, but it does change the version specifier to be required, so it's
more obvious which callers are creating old version 1 changegroups still.

Differential Revision: https://phab.mercurial-scm.org/D669
2017-09-10 18:50:12 -07:00
Durham Goode
738a32ad20 changegroup: replace changegroup with makechangegroup
As part of reducing the number of changegroup creation APIs, let's replace the
changegroup function with makechangegroup. This pushes the responsibility of
creating the outgoing set to the caller, but that seems like a simple and
reasonable concept for the caller to be aware of.

Differential Revision: https://phab.mercurial-scm.org/D668
2017-09-10 18:48:42 -07:00
Durham Goode
49825855fd changegroup: delete getlocalchangegroup
As part of reducing the number of changegroup creation APIs, let's go ahead and
delete this unused function. It appears to have been deprecated in the last
release anyway.

Differential Revision: https://phab.mercurial-scm.org/D667
2017-09-10 18:47:39 -07:00
Durham Goode
b470a6a2f1 changegroup: replace getlocalchangegroupraw with makestream
As part of reducing the number of changegroup creation apis, let's replace calls
to getlocalchangegroupraw with calls to makestream. Aside from one case of
checking if there are no outgoing commits and returning None, this is pretty
much a drop in replacement.

Differential Revision: https://phab.mercurial-scm.org/D666
2017-09-10 19:01:56 -07:00
Durham Goode
394bc74c72 changegroup: replace changegroupsubset with makechangegroup
As part of getting rid of all the permutations of changegroup creation, let's
remove changegroupsubset and call makechangegroup instead. This moves the
responsibility of creating the outgoing set to the caller, but that seems like a
relatively reasonable unit of functionality for the caller to have to care about
(i.e. what commits should be bundled).

Differential Revision: https://phab.mercurial-scm.org/D665
2017-09-10 18:43:59 -07:00
Durham Goode
00542dbd45 changegroup: replace getsubset with makechangegroup
The current changegroup APIs are a bit of a mess. Currently you can use
getsubsetraw, getsubset, changegroupsubset, getlocalchangegroupraw,
getchangegroup, and getlocalchangroup to produce changegroups. This patch is the
beginning of a refactor to boil all of that away to just makechangegroup and
makestream.

The first step adds the new functions and replaces getsubset function with them.

Differential Revision: https://phab.mercurial-scm.org/D664
2017-09-10 18:39:02 -07:00
Durham Goode
1910a51ac1 changegroup: fix to allow empty manifest parts
The current chunk reading algorithm relied on counting the number of empty
chunks and comparing it to the number of chunk lists it expected (1 list of
files for cg1 and cg2, and 1 list of files + 1 list of trees for cg3). This
implicitly assumed that both the changelog part and the manifestlog part were
never empty (since them being empty would cause it to count it as one list being
done, and screw up the count). In our treemanifest code, the manifest section
could be empty, so we need to handle that case.

This patches refactors that code to be more explicit about how it counts the
expected parts.

Differential Revision: https://phab.mercurial-scm.org/D646
2017-09-06 18:33:55 -07:00
Boris Feld
0949aa4c6f changegroup: stop returning and recording added nodes in 'cg.apply'
cg.apply used to returns the added nodes. Callers doesn't have a use for it
anymore, remove the added node and stops recording it in the current
operation.

This information was added in the current release cycle so no extensions
breakage should happens.
2017-07-13 21:08:06 +02:00
Boris Feld
6ee0c66cd0 phases: rework phase movement code in 'cg.apply' to use 'registernew'
We rework the code to call 'registernew' before any other phase advancement.
This make 'changegroup.apply' register correct phase movement for the added
and bundled nodes.
2017-07-11 01:17:36 +02:00
Boris Feld
1c00e3fb58 changegroup: stop treating strip as special when dealing with phases
Since 26d535788092, the strip bundle includes the phases of the stripping
node. Hence we don't need this special case anymore.

Dropping it will helps make the phase behavior more consistent across all
exchanges medium.
2017-07-11 04:52:56 +02:00
Augie Fackler
c5d369a79b changegroup: more **kwargs
Differential Revision: https://phab.mercurial-scm.org/D273
2017-07-24 11:28:59 -04:00
Augie Fackler
8984fe77a0 changegroup: wrap some ** expansions in strkwargs 2017-07-24 11:16:53 -04:00
Martin von Zweigbergk
86800baf0f changegroup: don't fail on empty changegroup (API)
I don't know why applying an empty changegroup should be an error. It
seems harmless. I suspect the check was there to find code that
creates empty changegroups just because that would be wasteful. Let's
use develwarn() for that instead, so we catch any such cases that run
with our test runner, but we still allow others to generate empty
changegroups if they want to.

We have run into this check at Google once or twice and had to work
around it, but I'm changing this not so much because of that, but
because it seems like it shouldn't be an error.

I also changed the message slightly to be more modern ("changelog
group" -> "changegroup") and more generic ("received" -> "applied").
2017-06-30 23:58:59 -07:00
Martin von Zweigbergk
ed2bd4f3f3 changegroup: remove option to allow empty changegroup (API)
No caller sets the "emptyok" option, so let's remove it.
2017-07-01 00:00:09 -07:00
Pierre-Yves David
8524b4275f configitems: register the 'server.validate' config 2017-06-30 03:44:15 +02:00
Pierre-Yves David
1084baf8c5 configitems: register the 'bundle.reorder' config 2017-06-30 03:31:35 +02:00
Martin von Zweigbergk
80bc6d2e40 bundle: move combineresults() from changegroup to bundle2
The results only need to be combined if they come from a bundle2. More
importantly, we'll change its argument to a bundleoperation soon, and
then it definitely will no longer belong in changegroup.py.
2017-06-22 13:58:20 -07:00
Martin von Zweigbergk
3015c0a9a8 bundle2: record changegroup data in 'op.records' (API)
When adding support for bundling and unbundling phases, it will be
useful to have the list of added changesets. To do that, we return the
list from changegroup.apply().
2017-06-16 16:56:16 -07:00
Martin von Zweigbergk
16b7a41d93 changegroup: delete "if True" and reflow 2017-06-15 23:23:47 -07:00
Martin von Zweigbergk
560e5ce4f1 changegroup: let callers pass in transaction to apply() (API)
I think passing in the transaction makes it a little clearer and more
consistent with bundle2.
2017-06-15 22:46:38 -07:00
Martin von Zweigbergk
ed4a2d83be changegroup: inline 'publishing' variable in apply() 2017-06-19 00:06:23 -07:00
Martin von Zweigbergk
02b210363d changegroup: rename "dh" to the clearer "deltaheads"
We have a lot of frequently used abbreviations, but this is not one of
them.
2017-06-15 13:47:54 -07:00
Martin von Zweigbergk
f2408d8030 changegroup: rename "srccontent" to "cgnodes"
It's the list of nodes in the incoming changegroup, so "cgnodes" made
more sense to me.
2017-06-15 13:42:41 -07:00
Durham Goode
a73dbb6c8d changegroup: add bundlecaps back
Commit 9233182ea547d0aa removed the unused bundlecaps argument from the
changegroup code. While it is unused in core Mercurial, it was an important
feature for the remotefilelog extension because it allowed the exchange layer to
communicate to the changegroup packer that this was a shallow repo and that
filelogs should not be included. Without bundlecaps, there is currently no other
way to pass that information along without a more extensive refactor of
exchange, bundle, and changegroup code.

This patch backs out the original removal, and merges it with some recent
changes to changegroup apis.
2017-05-15 09:35:27 -07:00
Pierre-Yves David
dc1e05ed68 caches: stop warming the cache after changegroup application
Now that we garantee that branchmap cache is updated at the end of the
transaction we can drop this update. This removes a problematic case with
nested transaction where the new cache could be written on disk before the
transaction is finished (and even roll-backed)

Such premature cache write was visible in the following test:

* tests/test-acl.t
* tests/test-rebase-conflicts.t

In addition, running the cache update later means having more date about the
state of the repository (in particular: phases). So we can generate caches with
more information. This creates harmless changes to the following tests:

* tests/test-hardlinks-whitelisted.t
* tests/test-hardlinks.t
* tests/test-phases.t
* tests/test-tags.t
* tests/test-inherit-mode.t
2017-05-02 18:57:52 +02:00
Pierre-Yves David
9c635f53f5 caches: move the 'updating the branch cache' message in 'updatecaches'
We are about to remove the branchmap cache update in changegroup application.
There is a debug message alongside this update that we do not want to loose. We
move the message beforehand to simplify the test update in the next changeset.
The message move is quite noisy and isolating that noise is useful.

Most tests update are just line reordering since the message is issued at a
later point during the transaction.

After this changes, the message is displayed in more case since local commit
creation also issue it.
2017-05-02 22:27:44 +02:00
Pierre-Yves David
5149c68aae changegroup: deprecate 'getlocalchangroup' (API)
We have 'getchangegroup' with a shorter name for the exactly same feature. Now
that all users are gone we can formally deprecate it.
2017-05-04 12:43:41 +02:00
Pierre-Yves David
cc5d603ef2 changegroup: deduplicate 'getlocalchangegroup'
The two functions 'getlocalchangegroup' and 'getchangegroup' have been strictly
identical for multiple years ('getchangegroup' had a deprecated docstring)

We'll drop one of them (getlocalchangegroup, since it has the longest name).
However, we needs to migrate all users of the dropped one to the new one before
we can deprecate it. In the mean time we drop one of the duplicated definition
and the outdated docstring.
2017-05-04 12:36:45 +02:00
Martin von Zweigbergk
feebdd58e5 changegroup: delete unused 'bundlecaps' argument (API) 2017-05-02 23:47:10 -07:00
Martin von Zweigbergk
2e8637ffda merge with stable 2017-03-24 08:37:26 -07:00
Gregory Szorc
6fb64a9cca changegroup: store old heads as a set
Previously, the "oldheads" variable was a list. On a repository at
Mozilla with 46,492 heads, profiling revealed that list membership
testing was dominating execution time of applying small changegroups.

This patch converts the list of old heads to a set. This makes
membership testing significantly faster. On the aforementioned
repository with 46,492 heads:

$ hg unbundle <file with 1 changeset>
before: 18.535s wall
after:   1.303s

Consumers of this variable only check for truthiness (`if oldheads`),
length (`len(oldheads)`), and (most importantly) item membership
(`h not in oldheads` - which occurs twice). So, the change to a set
should be safe and suitable for stable.

The practical effect of this change is that changegroup application
and related operations (like `hg push`) no longer exhibit an O(n^2)
CPU explosion as the number of heads grows.
2017-03-23 19:54:59 -07:00
Remi Chaintron
6d11b9177b revlog: add 'raw' argument to revision and _addrevision
This patch introduces a new 'raw' argument (defaults to False) to revlog's
revision() and _addrevision() methods.
When the 'raw' argument is set to True, it indicates the revision data should be
handled as raw data by the flagprocessor.

Note: Given revlog.addgroup() calls are restricted to changegroup generation, we
can always set raw to True when calling revlog._addrevision() from
revlog.addgroup().
2017-01-05 17:16:07 +00:00
Pierre-Yves David
b77eb86598 changegroup: simplify logic around enabling changegroup 03
There was multiple spot that took care of adding '03' as supported changegroup
version for different condition. We gather them all in one location for
simplicity.

The 'supportedincomingversions' function is now doing nothing, but I kept it
around because it looks like a great hooking point for extension.

(Note that we should probably just get changegroup3 out of experimental now, But
that would be a patch with a much wider scope).
2016-12-19 04:25:18 +01:00
Pierre-Yves David
281f6d3210 changegroup: pass 'repo' to allsupportedversions
In the next changesets, we will introduce more logic directly related to the
repository to decide what version have to be supported. So we now directly pass
the repo object instead of just ui.
2016-12-19 04:29:33 +01:00
Pierre-Yves David
c066ecb20b changegroup: simplify 'allsupportedversions' logic
Discarding '03' to add it back is a bit strange. Instead we only discard it when
needed.
2016-12-19 04:31:13 +01:00
Stanislau Hlebik
da605718f2 cg1packer: fix compressed method
`cg1packer.compressed()` returns True even if `self._type` is 'UN'. This patch
fixes it.
2016-12-14 09:53:56 -08:00
Gregory Szorc
ce177e2b4d changegroup: use compression engines API
The new API doesn't have the equivalence for None and 'UN' so we
introduce code to use 'UN' explicitly.
2016-11-07 18:38:13 -08:00
Durham Goode
1a774a75e3 changegroup: remove remaining uses of repo.manifest
The remaining uses of repo.manifest in the changegroup module are treating the
manifest exclusively as a revlog, so let's replace them with instances of the
revlog directly.

This is part of dropping all dependencies on repo.manifest in favor of
repo.manifestlog.
2016-11-08 08:03:43 -08:00
Durham Goode
f952eca1af manifest: get rid of manifest.readshallowfast
This removes manifest.readshallowfast and converts it's one user to use
manifestlog instead.
2016-11-02 17:10:47 -07:00
Gregory Szorc
f6f301e95d changegroup: use changelogrevision()
Using offsets for accessing changelog entries isn't very readable.
As a bonus, changelog.changelogrevision() also accepts a revision,
so we don't need to perform the inline node resolution either.
2016-11-01 18:29:09 -07:00
Gregory Szorc
881f361857 changegroup: cache changelog and manifestlog outside of loop
History has taught us that repo.changelog can add significant overhead
to loops. So cache the changelog instance outside of the loop to
avoid the lookup. While we're here, do the same for manifestlog,
since each loop would otherwise initialize a new manifestlog instance.
2016-11-01 18:28:03 -07:00
Pulkit Goyal
56031921a5 py3: convert the mode argument of os.fdopen to unicodes (2 of 2) 2017-02-13 22:15:28 +05:30
Gregory Szorc
722900ff91 changegroup: increase write buffer size to 128k
By default, Python defers to the operating system for choosing the
default buffer size on opened files. On my Linux machine, the default
is 4k, which is really small for 2016.

This patch bumps the write buffer size when writing
changegroups/bundles to 128k. This matches the 128k read buffer
we already use on revlogs.

It's worth noting that this only impacts when writing to an explicit
file (such as during `hg bundle`). Buffers when writing to bundle
files via the repo vfs or to a temporary file are not impacted.

When producing a none-v2 bundle file of the mozilla-unified repository,
this change caused the number of write() system calls to drop from
952,449 to 29,788. After this change, the most frequent system
calls are fstat(), read(), lseek(), and open(). There were
2,523,672 system calls after this patch (so a net decrease of
~950k is statistically significant).

This change shows no performance change on my system. But I have a
high-end system with a fast SSD. It is quite possible this change
will have a significant impact on network file systems, where
extra network round trips due to excessive I/O system calls could
introduce significant latency.
2016-10-16 13:35:23 -07:00