Commit Graph

113 Commits

Author SHA1 Message Date
Eric Sumner
dab488d66f changegroup.getsubset: support multiple versions
Allow a version parameter to specify which version of the packer should be
used
2015-01-15 15:55:13 -08:00
Eric Sumner
96fb8b0c04 changegroup.writebundle: HG2Y support
This diff adds support to writebundle to generate a bundle2 wrapper; upcoming
diffs will add an option to write a v2 changegroup part instead of v1 in these
bundles.
2015-01-15 15:39:16 -08:00
Eric Sumner
7cbcf9bdca changegroup.writebundle: provide ui
The next diff will add support for writing bundle2 files to writebundle, but
the bundle2 generator wants access to a ui object.  This changes the signature
and callsites to pass one in.
2015-01-15 14:39:41 -08:00
Eric Sumner
c5cdff3779 pullbundle2: extract addchangegroup result combining into its own function
This will also be used for 'hg unbundle'
2015-01-16 12:53:45 -08:00
Mads Kiilerich
af8710d713 bundle: when verbose, show what takes up the space in the generated bundle
This is kind of similar to the debugbundle command but gives summarized actual
uncompressed number of bytes when creating the bundle. The numbers are as
usable as the bundle format is efficient. Hopefully bundle2 will make it a
better indicator of actual entropy.

This is useful when accepting pull requests to assess whether the repo size
increase seems reasonable for the diff before pushing stuff upstream, It has
helped me catching large files that should have been committed as largefiles
but was committed as regular files in intermediate changesets.

This output doesn't combine well with debug output so we only enable it when
verbose without debug.
2014-08-15 19:43:32 +02:00
Matt Mackall
174e7f793d merge with stable 2014-11-22 17:09:04 -06:00
Durham Goode
0a7a4a1f33 changegroup: fix file linkrevs during reorders (issue4462)
Previously, if reorder was true during the creation of a changegroup bundle,
it was possible that the manifest and filelogs would be reordered such that the
resulting bundle filelog had a linkrev that pointed to a commit that was not
the earliest instance of the filelog revision. For example:

With commits:

0<-1<---3<-4
  \       /
   --2<---

if 2 and 3 added the same version of a file, if the manifests of 2 and 3 have
their order reversed, but the changelog did not, it could produce a filelog with
linkrevs 0<-3 instead of 0<-2, which meant if commit 3 was stripped, it would
delete that file data from the repository and commit 2 would be corrupt (as
would any future pulls that tried to build upon that version of the file).

The fix is to make the linkrev fixup smarter. Previously it considered the first
manifest that added a file to be the first commit that added that file, which is
not true. Now, for every file revision we add to the bundle we make sure we
attach it to the earliest applicable linkrev.
2014-11-20 16:30:57 -08:00
Gregory Szorc
04eeb85285 changegroup: sparsely populate fnodes
Previously, fnodes had a key and empty dict value for every element in
changedfiles. This is somewhat wasteful. Empty dicts in CPython consume
a lot more memory than you would expect - 280 bytes.

On mozilla-central, which has ~190,000 files/fnodes keys, the previous
loop populating fnodes allocated 91,924 KB of memory, most of that for
the empty dicts.

With this patch in place, our peak RSS during mozilla-central clone
drops:

before:  364,356 KB
after:   326,008 KB
delta:   -38,348 KB

When combined with the previous patch, total peak RSS decrease is now
190,116 KB.
2014-11-06 22:48:20 -08:00
Gregory Szorc
c6e3c6fb27 changegroup: don't store unused value on fnodes (issue4443)
The contents of fnodes are only accessed once per key. It is wasteful to
cache the value since nobody will use it.

Before this patch, the caching of unused data in fnodes was effectively
causing a memory leak during the file streaming part of bundle creation.

On mozilla-central (which has ~190,000 entries in fnodes), this patch
has a significant impact on RSS at the end of generate():

before:  516,124 KB
after:   364,356 KB
delta:  -151,768 KB

The origin of this code can be traced back to 1f567a607f1f and has been
with us since the 2.7 release.
2014-11-06 22:33:48 -08:00
Gregory Szorc
0bfb4de7ec changegroup: don't define lookupmf() until it is needed
lookupmf() is currently defined earlier than when it is needed. Future
patches further refactoring this code will be easier to read when
lookupmf() is in its new home.
2014-11-06 20:57:12 -08:00
Pierre-Yves David
8259127ccb transaction: pass the transaction to 'postclose' callback
This mirrors the API for 'pending' and 'finalize' callbacks. I do not have
immediate usage planned for it, but I'm sure some callback will be happy to
access transaction related data.
2014-11-08 16:35:15 +00:00
Matt Mackall
816fd34333 merge with stable 2014-11-10 17:29:15 -06:00
Siddharth Agarwal
3e8587d071 changegroup.cg2packer: lookup 'group' via inheritance chain
This lets extensions insert themselves in the class hierarchy.
2014-11-07 17:54:59 -08:00
Pierre-Yves David
130c63f6e2 changegroup: use the 'postclose' API on transaction
The post-transaction hooks run after the lock release (because hooks may want to
touch the repository), but they must only run if the transaction is successfully
closed.

We use the new 'addpostclose' method on transaction to register a callback
installing this post-lock-release call.
2014-10-28 15:44:23 +01:00
Pierre-Yves David
8803fc197d changelog: rely on transaction for finalization
Instead of calling 'cl.finalize()' by hand (possibly at a bogus time) we
register it in the transaction during 'delayupdate' and rely on 'tr.close()' to
call it at the right time.
2014-10-18 01:09:41 -07:00
Pierre-Yves David
d6b8860637 changelog: handle writepending in the transaction
The 'delayupdate' method now takes a transaction object and registers its
'_writepending' method for execution in 'transaction.writepending()'. The hook can then
use 'transaction.writepending()' directly.

At some point this will allow the addition of other file creation
during writepending.
2014-10-17 21:55:31 -07:00
Sune Foldager
7cb0f8602d changegroup: introduce cg2packer/unpacker
cg2 supports generaldelta in changegroups, to be used in bundle2.
Since generaldelta is handled directly in cg2, reordering is switched
off by default.
2014-10-17 14:41:11 +02:00
Sune Foldager
efbba1affa changegroup: allow use of different cg#packer in getchangegroupraw
This will allow the use of general delta aware changegroup formats.
2014-10-17 14:41:21 +02:00
Sune Foldager
e8de499479 changegroup: introduce "raw" versions of some commands
The commands getchangegroup, getlocalchangegroup and getsubset now each
have a version ending in -raw. The raw versions return the chunk generator
from the changegroup packer directly, without wrapping it in a chunkbuffer
and unpacker. This avoids extra chunkbuffers in the bundle2 code path.

Also, the raw versions can be extended to support alternative packers
in the future, to be used from bundle2.
2014-10-17 14:41:02 +02:00
Pierre-Yves David
7e87948427 changegroup: add a "packermap" dictionary to track different packer versions
We only have "01" right now, but we should get general delta in soon.
Bundle2 is expected to make use of this to advertise and select the right packer
to use on both sides.
2014-09-24 21:24:06 -07:00
Pierre-Yves David
03cb1a74e8 changegroup: store source and url in the hookargs dict
We store the source and url of the current data into `transaction.hookargs` this
let us inherit it from upper layers that may have created a much wider
transaction. We have to modify bundle2 at the same time to register the source
and url in the transaction. We have to do it in the same patch otherwise, the
`addchangegroup` call would fill these values and the hook calling will crash
because of the duplicated 'source' and 'url' arguments passed to the hook call.
2014-10-14 00:06:46 -07:00
Pierre-Yves David
ce86284532 prechangegroup: use hook argument from the transaction
There can be useful data in there (eg: bundle2 related one)
2014-10-14 00:43:20 -07:00
Pierre-Yves David
0e7fe9a947 addchangegroup: call prechangegroup hook after transaction retrieval
We want to reused some possible information stored in the transaction
`hookargs` dict that may be stored by something handling the transaction at an
upper level (eg: bundle2) So we move the running of the hooks after transaction
creation. This has no visible effects (but an empty transaction roolback if the
hook fails) because nothing had happened in the transaction yet.
2014-10-14 00:09:25 -07:00
Pierre-Yves David
9e19dbeaf9 addchangegroup: get the node argument of incoming hook from transaction
The transaction is now carrying hook-related informations. So we use it to
retrieve the `node` argument. This will also carry around all kinds of other useful
informations (like: "are we in a bundle2 processing")
2014-10-14 00:03:03 -07:00
Mike Hommey
14669879bf changegroup: use a copy of hookargs when invoking the changegroup hook
addchangegroup creates a runhook function that is used to invoke the
changegroup and incoming hooks, but at the time the function is called,
the contents of hookargs associated with the transaction may have been
modified externally. For instance, bundle2 code affects it with
obsolescence markers and bookmarks info.

It also creates problems when a single transaction is used with multiple
changegroups added (as per an upcoming change), whereby the contents
of hookargs are that of after adding a latter changegroup when invoking
the hook for the first changegroup.
2014-10-16 15:54:53 +09:00
Sune Foldager
eb415860f8 changegroup: rename bundle-related functions and classes
Functions like getbundle and classes like unbundle10 really manipulate
changegroups and not bundles. A HG10 bundle is the same as a changegroup
plus a small header, but this is no longer the case for a HG2X bundle,
so it's better to separate the names a bit.
2014-09-02 12:11:36 +02:00
Pierre-Yves David
5f2b50474c phase: add a transaction argument to retractboundary
We now pass a transaction option to this phase movement function. The
object is currently not used by the function, but it will be in the
future.

All call sites have been updated. Most call sites were already enclosed in a
transaction for a long time. The handful of others have been recently
updated in previous commit.
2014-08-05 23:52:21 -07:00
Pierre-Yves David
a9275323db phase: add a transaction argument to advanceboundary
We now pass a transaction option to this phase movement function. The object
is currently not used by the function, but it will be in the future.

All call sites have been updated. Most call sites were already enclosed in a
transaction for a long time. The handful of others have been recently
updated in previous commit.

The retractboundary function remains to be upgraded.
2014-08-06 01:54:19 -07:00
Pierre-Yves David
8102dcd0f7 changegroup: add a targetphase argument to addchangegroup
This argument controls the phase used for the added changesets. This can be
useful to unbundle in "secret" phase as required by shelve.

This change aims at helping high-level code get rid of manual phase
movement. An important milestone for having phases part of the transaction.
2014-08-05 13:49:38 -07:00
Durham Goode
f94c8279f1 changegroup: refactor outgoing logic into a function
Extensions that add to bundle2 will want to know which commits are outgoing so
they can bundle data that is appropriate to those commits. This moves the logic
for figuring that out to a separate function so extensions can do the same
computation.
2014-05-07 17:22:34 -07:00
Pierre-Yves David
d9e2c653ca changegroup: use tr.hookargs when calling changegroup hooks
So that other parties using the transaction can put information in our hook
calls.
2014-04-17 17:46:26 -04:00
Pierre-Yves David
140d78d745 changegroup: use tr.hookargs when calling pretxnchangegroup hooks
So that other parties using the transaction can put information in our hook
calls.
2014-04-17 17:15:02 -04:00
Pierre-Yves David
1e7dd6937e addchangegroup: register data in tr.hookargs
We are registering data related to the process into the transaction hook data.
This lets other parties using the same transaction get informed of the
addchangegroup result.
2014-04-17 17:09:20 -04:00
Pierre-Yves David
009e9f6c33 bundle2: move readbundle into the exchange module
The `readbundle` function is going to understand the bundle2 header. We move the
function to a more suitable place before making any other changes.
2014-04-14 15:33:50 -04:00
Pierre-Yves David
a8e05fc7b8 changegroup: move chunk extraction into a getchunks method of unbundle10
This code used to be in `writebundle` only. We needs to make it more broadly
available for bundle2. The "changegroup" bundle2 part has to retrieve the
binary content of changegroup stream. We moved the chunks retrieving code into
the `unbundle10` object directly and the `writebundle` code is now using that.

This split is useful for bundle2 purpose, we want to be able to easily stream
changegroup content in a part.

To keep thing simples, we kept compression out of the new methods. If it make
more sense in the future, compression may get included in this function too.
2014-04-10 13:19:00 -07:00
FUJIWARA Katsunori
535cdba175 changegroup: add "vfs" argument to "readbundle()" to pass relative filename 2014-03-09 01:03:28 +09:00
FUJIWARA Katsunori
01e064edbb changegroup: add "vfs" argument to "writebundle()" for relative access via vfs
Before this patch, filename specified to "changegroup.writebundle()"
should be absolute one.

In some cases, they should be relative to repository root, store and
so on (backup before strip, for example).

This patch adds "vfs" argument to "writebundle()", and makes
"writebundle()" open (and unlink) "filename" via vfs for relative
access, if both filename and vfs are specified.
2014-03-09 01:03:28 +09:00
Pierre-Yves David
fddf556ac8 phase: apply publishing enforcement for "serve" source
When a changegroup is added by a push on a publishing server, we ensure they
are added as public. This is used to enforce publishing on server when the
client is not aware of phases. It also prevents race conditions where a reader
could see the changesets as draft before they get turned public by the client.
Finally, this save rounds trip as the client does not need additional request to
turn them public.

However, this logic was only enforced when the changegroup was from a "push"
source. And "push" is used for local pushes only. Wireprotocol push uses "serve"
as source since Mercurial 1.9. We now enforce this logic for both "push" and
"serve" sources.

One could note that this logic was mainly useful during wireprotocol exchanges.
So this code is finally put into good use, 9 versions after its introduction.
2014-04-07 18:10:50 -07:00
Matt Mackall
e10cab8769 merge with stable 2014-04-04 14:01:25 -05:00
Sean Farley
90ccd3ca5f changegroup: remove unused variable caught by pyflakes 2014-04-03 20:29:03 -05:00
Pierre-Yves David
6b9778c026 localrepo: move the addchangegroup method in changegroup module
This is a gratuitous code move aimed at reducing the localrepo bloatness.

The method had few callers, not enough to be kept in local repo.
2014-04-01 15:27:53 -07:00
Pierre-Yves David
4d81b98c1e localrepo: move the addchangegroupfiles method in changegroup module
This is a gratuitous code move aimed at reducing the localrepo bloatness.

The method had a single caller, far too few for being kept in local repo.
2014-04-01 15:21:56 -07:00
Pierre-Yves David
8e0876686d localrepo: move the changegroup method in changegroup module
This is a gratuitous code move aimed at reducing the localrepo bloatness.

The method had few callers, not enough to be kept in local repo.

The peer API stay unchanged.
2014-04-01 15:08:27 -07:00
Pierre-Yves David
30f24fdb7a localrepo: move the getbundle method in changegroup module
This is a gratuitous code move aimed at reducing the localrepo bloatness.

The method had few callers, not enough to be kept in local repo.

The peer API remains unchanged.
2014-04-01 14:40:35 -07:00
Pierre-Yves David
47880ff5c9 localrepo: move the getlocalbundle method in changegroup module
This is a gratuitous code move aimed at reducing the localrepo bloatness.

The method had 3 callers total, far too few for being kept in local repo.
2014-04-01 14:33:23 -07:00
Pierre-Yves David
f14b79a23f localrepo: move the changegroupsubset method in changegroup module
This is a gratuitous code move aimed at reducing the localrepo bloatness.

The method had few callers, not enough to be kept in local repo.

The peer API remains unchanged.
2014-04-01 14:25:03 -07:00
Pierre-Yves David
eb9e40c9b6 localrepo: move the changegroupinfo method in changegroup module
This is a gratuitous code move aimed at reducing the localrepo bloatness.

The method had a single caller... already in this changegroup module.
2014-04-01 14:13:34 -07:00
Pierre-Yves David
691e520cd5 localrepo: move the _changegroupsubset method in changegroup module
This is a gratuitous code move aimed at reducing the localrepo bloatness.

The method had 3 callers total, far too few for being kept in local repo.
2014-04-01 13:59:55 -07:00
Augie Fackler
0c049f6f1f changegroup: move from dict() construction to {} literals
The latter are both faster and more consistent across Python 2 and 3.
2014-03-12 13:14:31 -04:00
Antoine Pitrou
ecc34bd7c0 bundle: fix performance regression when bundling file changes (issue4031)
Somewhere before 2.7, a change [82beb9b16505] was committed that
entailed a large performance regression when bundling (and therefore
remote cloning) repositories. For each file in the repository, it would
recompute the set of needed changesets even though it is the same for
all files. This computation would dominate bundle runtimes according to
profiler output (by 10x or more).
2013-09-07 21:20:00 +02:00