Commit Graph

253 Commits

Author SHA1 Message Date
Martin von Zweigbergk
58c3ff9aaf changegroup: fix treemanifests on merges
The current code for generating treemanifest revisions takes the list
of files in the changeset and finds the directories from them. This
does not work for merges, since a merge may pick file A from one side
and file B from another and neither of them would appear in the
changeset's "files" list, but the manifest would still change.

Fix this by instead walking the root manifest log for all needed
revisions, storing all needed file and subdirectory revisions, then
recursively visiting the subdirectories. This also turns out to be
faster: cloning a version of hg core converted to treemanifests went
from ~28s to ~19s (timing somewhat unfair: before this patch, timed
until crash; after this patch, timed until manifests complete).

The new algorithm is used only on treemanifest repos. Although it
works equally well on flat manifests, we leave the iteration over
files in the changeset for flat manifests for now.
2016-02-12 23:09:09 -08:00
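The recursive walk described in this message can be sketched roughly as follows. This is a simplified illustration, not Mercurial's code: manifests are modeled as plain dicts, and `collect_tree_nodes` and the `store` layout are hypothetical names chosen for the example.

```python
from collections import deque


def collect_tree_nodes(store, root_nodes):
    """Walk directory manifests starting from the root manifest revisions.

    `store` maps (directory, node) -> {entry name: child node}. Entry names
    ending in '/' are subdirectories; everything else is a file. This layout
    is an assumption made for the sketch.

    Returns (needed_dirs, needed_files), each mapping a path to a set of nodes.
    """
    needed_dirs = {}
    needed_files = {}
    # The root directory is the empty string, like a path prefix.
    queue = deque(('', node) for node in root_nodes)
    while queue:
        directory, node = queue.popleft()
        seen = needed_dirs.setdefault(directory, set())
        if node in seen:
            continue  # this directory revision was already visited
        seen.add(node)
        for name, child in store[(directory, node)].items():
            path = directory + name
            if name.endswith('/'):
                queue.append((path, child))  # recurse into the subdirectory
            else:
                needed_files.setdefault(path, set()).add(child)
    return needed_dirs, needed_files
```

Because the walk starts from the manifests themselves, it does not rely on the changeset's "files" list, which is the part that goes wrong for merges.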
Martin von Zweigbergk
c1d77f8a77 changegroup: write root manifests and subdir manifests in a single loop
This is another step towards making the manifest generation recurse
along the directory trees. The loop over 'tmfnodes' now takes the form
of a queue. At this point, we only add to the queue twice: we add the
root manifests, and, while visiting the root manifest revisions, we
add all subdirectory revisions (for treemanifest repos). Thus, any
iterations over 'tmfnodes' after the first will not add any items and
the "queue" will just keep shrinking.
2016-02-12 23:30:18 -08:00
Martin von Zweigbergk
58d674fbd1 changegroup: introduce makelookupmflinknode(dir)
This is another step towards making the manifest generation recurse
along the directory trees. It makes the two calls to _packmanifests()
more similar.
2016-02-12 23:26:15 -08:00
Martin von Zweigbergk
9381017496 changegroup: prune subdirectory dirlogs too
We already prune changesets, root manifests and files whose linkrev is
in the set of common revisions. We should do the same for dirlogs.
2016-02-12 21:21:28 -08:00
Martin von Zweigbergk
3761b3f9e7 changegroup: include subdirectory manifests in verbose size
When verbose logging is on, we report the size in bytes of the
manifest data in the changegroup. For files, we report the size per
file, but I'm not sure we need that level of detail (i.e. size per
directory manifest). Instead, report a single figure for the size of
root manifest plus submanifests.
2016-02-12 15:42:16 -08:00
Martin von Zweigbergk
cd0a0297ee changegroup: make _packmanifests() dumber
The next few patches will rewrite the manifest generation code to work
with merges. We will then walk dirlogs recursively. This prepares for
that by moving much of the treemanifest code out of _packmanifests()
and into generatemanifests(). For this to work, it also adds a
_manifestsdone() method that returns the "end of manifests" close
chunk for cg3 and an empty string for cg1 and cg2.
2016-02-12 15:18:56 -08:00
Martin von Zweigbergk
fb3a96fcf4 changegroup: extract generatemanifests()
The changegroup.generate() function is pretty long, so let's extract
the manifest generation part of it.
2016-02-11 20:19:48 -08:00
Martin von Zweigbergk
86ca76bafe changegroup: fix pulling to treemanifest repo from flat repo (issue5066)
In b89de5ee5b31 (changegroup: don't support versions 01 and 02 with
treemanifests, 2016-01-19), I stopped supporting use of cg1 and cg2
with treemanifest repos. What I had not considered was that it's
perfectly safe to pull *to* a treemanifest repo using any changegroup
version. As reported in issue5066, I therefore broke pull from old
repos into a treemanifest repo. It was not covered by the test case,
because that pulled from a local repo while enabling treemanifests,
which enabled treemanifests on the source repo as well. After
switching to pulling via HTTP, it breaks.

Fix by splitting up changegroup.supportedversions() into
supportedincomingversions() and supportedoutgoingversions().
2016-01-27 09:07:28 -08:00
Augie Fackler
db82034373 changegroup: fix treemanifest exchange code (issue5061)
There were two mistakes: one was accidental reuse of the fclnode
variable from the loop gathering file nodes, and the other (masked by
that bug) was not correctly handling deleted directories. Both cases
are now fixed and the test passes.
2016-01-27 10:24:25 -05:00
Martin von Zweigbergk
c28812c552 shelve: use cg3 for treemanifests
Similar to previous change, this teaches shelve to pick the right
changegroup version for repos that use treemanifests.
2016-01-19 15:37:07 -08:00
Martin von Zweigbergk
4208c8682a changegroup: introduce safeversion()
In a few places (at least repair.py and shelve.py), we want to find
the best changegroup version that we can assume users of the repo will
understand. For example, we choose version 01 by default, but if it's
a generaldelta repo, we expect clients to support version 02 anyway,
so we choose that for new bundles (for e.g. "hg strip"). Let's create
a helper for this functionality in changegroup, so we can reuse it
elsewhere later.
2016-01-19 15:32:32 -08:00
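The selection rule described here could look roughly like the sketch below. It is not the actual helper: `safe_version` is a hypothetical name, the repo is reduced to a set of requirement strings, and the treemanifest case is taken from the later shelve change that makes treemanifest repos use cg3.

```python
def safe_version(requirements):
    """Pick a changegroup version that users of the repo should understand.

    `requirements` is a set of requirement names such as 'generaldelta'
    or 'treemanifest' (an assumed, simplified stand-in for a repo object).
    """
    if 'treemanifest' in requirements:
        return '03'  # treemanifests cannot be exchanged with 01 or 02
    if 'generaldelta' in requirements:
        return '02'  # clients of generaldelta repos support 02 anyway
    return '01'      # conservative default
```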
Martin von Zweigbergk
fb1b7626e4 changegroup: don't support versions 01 and 02 with treemanifests
Since it would be terribly expensive to convert between flat manifests
and treemanifests, we have decided to simply not support changegroup
version 01 and 02 with treemanifests. Therefore, let's stop announcing
that we support these versions on treemanifest repos.

Note that this means that older clients that try to clone from a
treemanifest repo will fail. What happens is that the server, after
this patch, finds that there are no common versions and raises
"ValueError: no common changegroup version". This results in "abort:
HTTP Error 500: Internal Server Error" on the client.

Before this patch, it was no better: The server would instead find
that there were directory manifest nodes to put in the changegroup 01
or 02 and raise an AssertionError on changegroup.py#668 (assert not
tmfnodes), which would also appear as a 500 to the client.
2016-01-19 14:27:18 -08:00
Martin von Zweigbergk
2e9366a5ee changegroup: cg3 has two empty groups *after* manifests
changegroup.getchunks() determines the end of the stream by looking
for an empty chunk group (two consecutive empty chunks). It ignores
empty groups in the first two groups. Changegroup 3 introduced an
empty chunk between the manifests and the files, which confuses
getchunks(). Since it comes after the first two, getchunks() will stop
there.

Fix by rewriting getchunks so it first counts two groups (empty or
not) and then starts counting empty groups. With this counting,
changegroup 1 and 2 have exactly one empty group after the first two
groups, while changegroup 3 has two (one for directories and one for
files).

It's a little hard to test this at this point, but I have verified
that this patch fixes narrowhg (which was broken before this
patch). Also, future patches will fix "hg strip" with treemanifests,
and once that's done, getchunks() will be tested through tests of "hg
strip".
2016-01-19 17:44:25 -08:00
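The counting scheme can be illustrated with a small sketch. It is a toy model, not the real getchunks(): chunks are plain byte strings, an empty string closes a group, and `read_changegroup` and `empties_expected` are hypothetical names.

```python
def read_changegroup(chunks, empties_expected):
    """Consume chunks until the end of one changegroup.

    `chunks` is an iterable of byte strings; b'' closes a group.
    `empties_expected` is 1 for changegroup 1 and 2, and 2 for
    changegroup 3 (one for directories, one for files).
    """
    consumed = []
    groups = 0
    empties = 0
    group_is_empty = True
    for chunk in chunks:
        consumed.append(chunk)
        if chunk:
            group_is_empty = False
            continue
        # An empty chunk closes the current group.
        groups += 1
        # The first two groups (changesets, manifests) are counted
        # whether or not they are empty; only later ones can end the stream.
        if groups > 2 and group_is_empty:
            empties += 1
            if empties == empties_expected:
                break
        group_is_empty = True
    return consumed
```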
Bryan O'Sullivan
337c3199e2 with: use context manager for transaction in changegroup apply
(This needs some line wrapping due to the additional indent level. -mpm)
2016-01-15 13:14:50 -08:00
Martin von Zweigbergk
d9bf44d310 changegroup3: move treemanifest support into _unpackmanifests()
By putting the treemanifest code in _unpackmanifests(),
_addchangegroupfiles() will only be about files again, and we get a
nice symmetry between _packmanifests() and _unpackmanifests(). The
immediate benefit to me is that remotefilelog should not need to be
updated to work with treemanifests. It should also make
server.validate and progress output easier to get right. Probably
bundlerepo too.
2016-01-08 16:12:58 -08:00
Martin von Zweigbergk
87d65b1188 changegroup3: add empty chunk separating directories and files
Remotefilelog overrides changegroup._addchangegroupfiles(), assuming
it is about files, which seems like a natural assumption. However, in
changegroup3, directory manifests are sent in the files section of the
changegroup. These naturally make remotefilelog unhappy.

The fact that the directories are not separated from the files
(although they do come before the files) also makes server.validate
harder to implement. Since we read one chunk at a time from the stream,
once we have found a file (non-directory) entry in the stream, we
would have to push the read data back into the stream, or otherwise
refactor the code. It will be easier if we add an empty chunk after
all directory manifests.

This change adds that empty chunk, although we don't yet take
advantage of it on the reading side. We will soon move the tree
manifest stuff out of _addchangegroupfiles() and into
_unpackmanifests().
2016-01-11 15:10:31 -08:00
Martin von Zweigbergk
63c15f247e changegroup3: introduce experimental.changegroup3 boolean config
In order to give us the freedom to change the changegroup3 format,
let's hide it behind an experimental config. Since it is required by
treemanifests, that will override the cg3 config.
2016-01-12 21:23:45 -08:00
Martin von Zweigbergk
e5bd6473b3 changegroup: hide packermap behind methods
This is to prepare for hiding changegroup3 behind a config option.
2016-01-12 21:01:06 -08:00
Mateusz Kwapich
6688b1c845 hooks: add HG_NODE_LAST to txnclose and changegroup hook environments
Sometimes a txnclose or changegroup hook wants to iterate through all
the changesets in transaction: in that situation usually the revset
`$HG_NODE:` is used to select the revisions. Unfortunately this revset
sometimes may contain too many changesets: because we don't hold the
write lock while the hook runs, newer changes may be added to the
repository in the meantime.

That's why there is a need for an extra variable carrying the information about
the last change in the transaction.
2016-01-05 17:37:59 -08:00
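A hook written in Python could use the new variable to bound its revset, for example as sketched below. Only HG_NODE and HG_NODE_LAST come from the message above; `transaction_revset` is a hypothetical helper, and the fallback to the open-ended range is an assumption for older servers that do not set the new variable.

```python
import os


def transaction_revset(env=None):
    """Build a revset covering the changesets added by the transaction."""
    env = os.environ if env is None else env
    first = env['HG_NODE']
    last = env.get('HG_NODE_LAST')
    if last is None:
        # Older behavior: open-ended range, may include newer changesets.
        return first + ':'
    return first + ':' + last
```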
Martin von Zweigbergk
fafdf90374 changegroup: remove now-unused 'wasempty' variable and parameter 2016-01-08 21:14:08 -08:00
Martin von Zweigbergk
417363259e treemanifests: set bundle2 part parameter indicating treemanifest
By adding a mandatory 'treemanifest' parameter in the bundle2 part, we
make it possible for the recipient to set repo requirements before the
manifest revlog is accessed.
2016-01-08 21:13:06 -08:00
Martin von Zweigbergk
88327fd798 changegroup: don't add a second trailing '/' in dir name
The paths given from treemanifest.dir() already contain the trailing
slash.
2016-01-08 14:47:02 -08:00
Martin von Zweigbergk
ed1140692c changegroup: remove left-over debugging help 2016-01-08 14:33:13 -08:00
Mike Edgar
44af48ee4a changegroup: add flags field to cg3 delta header
This lets revlog flags be transmitted over the wire. Right now this is
useful for censored nodes and for narrowhg's ellipsis nodes.
2015-12-14 15:55:12 -05:00
Augie Fackler
d33d6a0cb5 changegroup: introduce cg3, which has support for exchanging treemanifests
I'm not entirely happy with using a trailing / on a "file" entry for
transferring a treemanifest. We've discussed putting some flags on
each file header[0], but I'm unconvinced that's actually any better:
if we were going to add another feature to the cg format we'd still be
doing a version bump anyway to cg4, so I'm inclined to not spend time
coming up with a more sophisticated format until we actually know what
the next feature we want to stuff in a changegroup will be.

Test changes outside test-treemanifest.t are only due to the new CG3
bundlecap showing up in the wire protocol.

Many thanks to adgar@google.com and martinvonz@google.com for helping
me with various odd corners of the changegroup and treemanifest API.

0: It's not hard refactoring, nor is it a lot of work. I'm just
disinclined to do speculative work when it's not clear what the
customer would actually be.
2015-12-11 11:23:49 -05:00
Augie Fackler
f675aea41b changegroup: restate file linknode callback using generator expressions
I think this is slightly clearer, and it nicely avoids an extra nested
function.
2015-12-04 11:39:03 -05:00
Augie Fackler
7ee8b9a4d3 changegroup: clean up file lookup function
One case is basically degenerate, so just extract it and make the
function clearer.
2015-12-04 11:38:02 -05:00
Augie Fackler
53ca8538c0 changegroup: remove one special case from lookupmflinknode
In the fastpathlinkrev case, lookupmflinknode was a very complicated
way of saying mfs.__getitem__, so let's just get that case out of our
way so it's easier to understand what's going on.
2015-12-04 10:55:46 -05:00
Augie Fackler
4e80790b8d changegroup: drop 'if True' that made the previous change clearer 2015-12-04 10:35:45 -05:00
Augie Fackler
c3a36c8116 changegroup: avoid iterating the whole manifest
The old code gathered the list of all files that changed anywhere in
history and then gathered changed file nodes by walking the entirety
of each manifest to be sent in order to gather changed file
nodes. That's going to be unfortunate for narrowhg, and it's already
inefficient for medium-to-large repositories.

Timings for bundle --all on my hg repo, tested with hgperf:
Before:
! wall 23.442445 comb 23.440000 user 23.250000 sys 0.190000 (best of 3)

After:
! wall 20.272187 comb 20.270000 user 20.190000 sys 0.080000 (best of 3)
2015-12-04 10:34:58 -05:00
Augie Fackler
514dae67c6 changegroup: document manifest linkrev callback some more
Martin and I just got super-confused reading some code here, so I
think it's time for some more documentation.
2015-12-03 10:56:05 -05:00
Augie Fackler
aa07f6f058 changegroup: note during bundle apply if the repo was empty
An upcoming change for exchanging treemanifest data will need to
update the repository capabilities, which we should only do if the
repository was empty before we started applying this changegroup. In
the future we will probably need a strategy for upgrading to
treemanifest in requires during a pull (I'm assuming at some point
we'll make it possible to have a flag day to enable treemanifests on
an existing history.)
2015-12-02 14:32:17 -05:00
Pierre-Yves David
f89772113f changegroup: back code change of b5988e1d3dcb out
The previous changeset is a simpler way of fixing issue4934 without changing the
spirit of the code. We can remove the dual call to 'delayupdate' but we keep the
tests to show that the issue is still fixed.
2015-11-06 13:01:15 -05:00
Pierre-Yves David
dfd6e44ebe changegroup: call 'prechangegroup' hook before setting up write delay
The 'prechangegroup' hook interferes with the 'delayupdate' logic because it triggers the
one-time call of 'changelog._writepending' (see issue4934). There is no reason
not to call that hook before setting up 'delayupdate', so we move the call a bit
earlier to avoid interference.
2015-11-06 12:59:09 -05:00
Pierre-Yves David
107254a73d changegroup: fix the scope of a try finally
The try finally is here to ensure we release the just-created transaction.
Therefore we should not do half a dozen operations before actually entering the try
scope.
2015-11-06 12:39:06 -05:00
Durham Goode
49c25cc444 hooks: fix hooks not firing if prechangegroup was set (issue4934)
We need to call delayupdate again after writing to the changelog.
Otherwise the prechangegroup hook consumes the delayupdate subscription and
future hooks don't see the pending changes (see issue 4934 for more details).

Adds a test that triggers the prechangegroup hook before the pretxnchangegroup
hook and verifies that the output of pretxnchangegroup doesn't change.
2015-11-03 17:13:27 -08:00
FUJIWARA Katsunori
8f93d72f88 hook: centralize passing HG_PENDING to external hook process
This patch centralizes passing HG_PENDING to external hook process
into '_exthook()'. To make in-memory changes visible to external hook
process, this patch does:

  - write (or schedule to write) in-memory dirstate changes, and
  - set HG_PENDING environment variable, if:
    - a transaction is running, and
    - there are in-memory changes to be visible

This patch tests some commands with some hooks, because the transaction
activity of the same hook differs between commands ("---": "not tested").

    ======== ========= ========= ============
    command  preupdate precommit pretxncommit
    ======== ========= ========= ============
    unshelve   o        ---       ---
    backout    x        ---       ---
    import     ---       o         o
    qrefresh   ---       x         o
    ======== ========= ========= ============

Each hook is examined separately to prevent in-memory changes from
being visible to external process accidentally by side effect of hooks
previously invoked.
2015-10-17 01:15:34 +09:00
Augie Fackler
96c9d2fb66 changegroup: move manifest unpacking into its own method
The upcoming cg3 will need different logic for unpacking manifests.
2015-10-14 15:11:53 -04:00
Augie Fackler
ebae33811b changegroup: move manifest packing into a separate function
A future change will introduce a new function on a cg3packer that can
pack treemanifests as well as flatmanifests.
2015-10-01 15:35:10 -04:00
Augie Fackler
f3c48144c3 changegroup: rename manifest linknode closure for clarity
Since I'm spending the time to understand this code, I may as well
leave it clearer than I found it.
2015-09-30 19:59:12 -04:00
Augie Fackler
af65966d0c changegroup: reformat packermap and add comment
I'm about to add a cg3, and it seems prudent to annotate what formats
support what features. It strikes me that we may want to consider
moving to a more feature-oriented model in the future, but we'll see
how that looks in a little while I guess.
2015-09-29 15:14:03 -04:00
Augie Fackler
63207d7efe changegroup: document the public surface area of cg?unpackers
This should help future readers at least a little.
2015-10-14 12:05:27 -04:00
Augie Fackler
f83ec87df0 changegroup: mark cg1unpacker.chunklength as private 2015-10-14 11:58:56 -04:00
Augie Fackler
c297597d4b changegroup: note why a few methods on cg1unpacker exist
I'm not sure what to do abstraction-wise here. It might be more
sensible to make a memoryrepo that could apply a bundle in-memory and
then we could make the changegroup data be strictly an applyable
stream, but that's an idea for Later.
2015-10-14 11:58:35 -04:00
Augie Fackler
4978a2ba88 changegroup: mark _addchangegroupfiles as module-private
I'm trying to reason about the public surface area of this module now,
so it's worth tagging private things as such.
2015-10-13 17:16:10 -04:00
Augie Fackler
4933ef34f9 changegroup: delete now-unused addchangegroup method 2015-10-13 17:14:37 -04:00
Augie Fackler
7acc1ffbe4 changegroup: migrate addchangegroup() to forward to cg?unpacker.apply()
I'll clean up callers in subsequent patches, then remove the forwarding.
2015-10-13 16:58:51 -04:00
Augie Fackler
24acce6876 changegroup: move source check to top of addchangegroup
This is preparation for some refactoring.
2015-10-13 15:54:05 -04:00
Pierre-Yves David
11cc1a1216 getsubset: get the unpacker version from the bundler
The current setup requires passing both a packer and, optionally, the version
of the unpacker. This is confusing and error prone as the two values must not
mismatch. Instead, we simply grab the version from the packer. This fixes a bug
where requesting a cg2 from 'hg bundle' was reported as changegroup 1.

I should have caught that in the initial changeset but I missed it somehow.
2015-10-09 14:59:37 -07:00
Pierre-Yves David
30913031d4 error: get Abort from 'error' instead of 'util'
The home of 'Abort' is 'error' not 'util' however, a lot of code seems to be
confused about that and gives all the credit to 'util' instead of the
hardworking 'error'. In a spirit of equity, we break the cycle of injustice and
give back to 'error' the respect it deserves. And screw that 'util' poser.

For great justice.
2015-10-08 12:55:45 -07:00
Pierre-Yves David
ad2caca390 changegroup: extract the file management part in its own function
The current writebundle function does two things:

- taking a changegroup-packer instance and storing it into a valid bundle with
  proper header.

- creating a temporary or requested file to store that bundle

We would like to make it easier to forward a bundle stream directly from a remote
peer to a file, so we split the two pieces of logic to be able to skip the one about
building a valid bundle (the remote is already sending one).
2015-10-05 00:14:47 -07:00
Pierre-Yves David
52f209c327 changegroup: add version argument to getchangegroup
For some obscure reasons (probably upsetting a Greek goddess),
getchangegroup did not have a 'version' argument to control the changegroup
version. We fix this to allow cg02 to be used with 'hg bundle' in the future.
2015-10-01 19:14:47 -07:00
Pierre-Yves David
905bd41a77 changegroup: add version argument to getlocalchangegroup
For some obscure reasons (probably upsetting a Greek goddess),
getlocalchangegroup did not have a 'version' argument to control the
changegroup version. We fix this to allow cg02 to be used with 'hg
bundle' in the future.
2015-10-01 19:14:33 -07:00
Pierre-Yves David
28ea49b6c2 writebundle: add a compression argument for the bundle2 case
Bundle2 compression is more complex than the bundle1 one. Therefore it
is handled by the bundler itself. Moreover, on-disk bundle2 will
probably have a large number of flavors so simply adding a new "format"
for it does not seem the way to go.

This will be used in the next changeset to compress bundle2 strip backup.
2015-09-29 14:41:40 -07:00
Pierre-Yves David
496ebe3ecb changegroup: use a different compression key for BZ in HG10
For "space saving", bundle1 "strips" the first two bytes of the BZ stream since
they are always 'BZ'. So the current code bootstraps the decompressor with 'BZ'.
This hack is impractical in the more generic case, so we move it into a dedicated
"decompression".
2015-09-23 11:33:30 -07:00
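The stripped-header trick can be reproduced with the standard library alone. The sketch below uses Python's `bz2` module; the function names are hypothetical and the code is only meant to show the bootstrapping idea, not how the bundle code is organized.

```python
import bz2


def strip_bz_magic(compressed):
    """Drop the leading 'BZ' bytes that every bzip2 stream starts with."""
    if compressed[:2] != b'BZ':
        raise ValueError('not a bzip2 stream')
    return compressed[2:]


def decompress_stripped(payload):
    """Decompress a bzip2 stream whose first two bytes were stripped."""
    decompressor = bz2.BZ2Decompressor()
    # Feed the missing magic first; this yields no output yet.
    decompressor.decompress(b'BZ')
    return decompressor.decompress(payload)
```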
Yuya Nishihara
b2d79071df readbundle: fix typo of None compression
The test simulates pre-53c48b25631a hgweb that sends "unbundle" capability
with no argument.
2015-09-18 21:32:43 +09:00
Pierre-Yves David
0b4b905886 readbundle: map 'HG10UN' to None compression
In line with the other previous changes
2015-09-11 17:06:56 -07:00
Pierre-Yves David
70cc7f5586 getsubset: use None to request uncompressed changegroup 2015-09-11 17:06:02 -07:00
Pierre-Yves David
6a841b0f13 writebundle: use 'None' instead of 'UN' for the bundle2 case
Let's be modern!
2015-09-15 17:43:54 -07:00
Pierre-Yves David
47af8d73d3 compression: use 'None' for no-compression
This seems more idiomatic and clearer. We still support both None and 'UN' for
now because no users have been migrated.
2015-09-15 17:53:28 -07:00
Pierre-Yves David
f052875fa0 changegroup: move all compressions utilities in util
We'll reuse the compression for other things (next target bundle2), so let's
make it more accessible and organised.
2015-09-15 17:35:32 -07:00
Gregory Szorc
acca109d80 changegroup: use absolute_import 2015-08-08 00:35:37 -07:00
Matt Mackall
e2fb109462 generaldelta: mark experimental reordering option 2015-06-25 17:43:52 -05:00
Gregory Szorc
2768272919 changegroup: compute seen files as changesets are added (issue4750)
Before this patch, addchangegroup() would walk the changelog and compute
the set of seen files between applying changesets and applying
manifests. When cloning large repositories such as mozilla-central,
this consumed a non-trivial amount of time. On my MBP, this walk takes
~10s. On a dainty EC2 instance, this was measured to take ~125s! On the
latter machine, this delay was enough for the Mercurial server to
disconnect the client, thinking it had timed out, thus causing a clone
to abort.

This patch enables the changelog to compute the set of changed files as
new revisions are added. By doing so, we:

* avoid a potentially heavy computation between changelog and manifest
  processing by spreading the computation across all changelog additions
* avoid extra reads from the changelog by operating on the data as it is
  added

The downside of this is that the add revision callback does result in
extra I/O. Before, we would perform a flush (and subsequent read to
construct the full revision) when new delta chains were created. For
changelogs, this is typically every 2-4 revisions. Using the callback
guarantees there will be a flush after every added revision *and* an
open + read of the changelog to obtain the full revision in order to
read the added files. So, this increases the frequency of these
operations by the average chain length. In the future, the revlog
should be smart enough to know how to read revisions that haven't been
flushed yet, thus eliminating this extra I/O.

On my MBP, the total CPU times for an `hg unbundle` with a local
mozilla-central gzip bundle containing 251,934 changesets and 211,065
files did not have a statistically significant change with this patch,
holding steady around 360s. So, the increased revlog flushing did not
have an effect.

With this patch, there is no longer a visible pause between applying
changeset and manifest data. Before, it sure felt like Mercurial was
lethargic making this transition. Now, the transition is nearly
instantaneous, giving the impression that Mercurial is faster. Of course,
eliminating this pause means that the potential for network disconnect due
to channel inactivity during the changelog walk is eliminated as well.
And that is the impetus behind this change.
2015-07-18 10:57:20 -07:00
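The idea of collecting changed files while changesets are being added, instead of walking the changelog afterwards, can be sketched as follows. `ToyChangelog` and `apply_changesets` are hypothetical stand-ins; the real changelog and its callback have a different shape.

```python
class ToyChangelog:
    """Stand-in for a changelog that notifies a callback on every addition."""

    def __init__(self, on_add=None):
        self.revisions = []
        self.on_add = on_add

    def add(self, node, files):
        self.revisions.append((node, tuple(files)))
        if self.on_add is not None:
            # Report the files touched by the revision just added.
            self.on_add(files)


def apply_changesets(incoming):
    """Add (node, files) pairs and collect the set of files seen."""
    seen_files = set()
    log = ToyChangelog(on_add=seen_files.update)
    for node, files in incoming:
        log.add(node, files)
    # No second pass over the changelog is needed at this point.
    return log, seen_files
```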
Matt Mackall
b050378961 merge with stable 2015-07-01 16:33:31 -05:00
Pierre-Yves David
b05f468b44 changegroup: properly compute common base in changegroupsubset (issue4736)
The computation of roots was buggy: any ancestor of a bundled merge which was
also a descendant of the parents of a bundled revision was included as part of
the bundle. We fix it and add a test for strip (which revealed the problem).

Check the test for a practical usecase.
2015-06-29 11:20:09 -07:00
Gregory Szorc
5380dea2a7 global: mass rewrite to use modern exception syntax
Python 2.6 introduced the "except type as instance" syntax, replacing
the "except type, instance" syntax that came before. Python 3 dropped
support for the latter syntax. Since we no longer support Python 2.4 or
2.5, we have no need to continue supporting the "except type, instance".

This patch mass rewrites the exception syntax to be Python 2.6+ and
Python 3 compatible.

This patch was produced by running `2to3 -f except -w -n .`.
2015-06-23 22:20:08 -07:00
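For reference, the modern form looks like the following minimal example, with the removed form shown only in a comment. `parse_port` is a made-up function used purely for illustration.

```python
def parse_port(text):
    """Return the port number in `text`, or None if it is not an integer."""
    try:
        return int(text)
    # Old form, rejected by Python 3:  except ValueError, err:
    except ValueError as err:
        # `err` is bound to the exception instance in both spellings.
        print('ignoring bad port: %s' % err)
        return None
```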
Matt Harbison
75f10ee474 changegroup: flush the ui stdio buffers after adding a changegroup
This eliminates the following test failure on Windows, as well as a similar one
in evolve's test-wireproto.t.  See the previous patch for details on the
problem.

  --- e:/Projects/hg/tests/test-init.t
  +++ e:/Projects/hg/tests/test-init.t.err
  @@ -216,10 +216,10 @@
      * test                      0:08b9e9f63b32
     $ hg clone -e "python \"$TESTDIR/dummyssh\"" local ssh://user@dummy/remote-bookmarks
     searching for changes
  +  exporting bookmark test
     remote: adding changesets
     remote: adding manifests
     remote: adding file changes
     remote: added 1 changesets with 1 changes to 1 files
  -  exporting bookmark test
     $ hg -R remote-bookmarks bookmarks
        test                      0:08b9e9f63b32
2015-04-10 23:34:06 -04:00
Pierre-Yves David
af7d20b000 bundle2: rename format, parts and config to final names
It is finally time to freeze the bundle2 format! To do so we:
- rename HG2Y to HG20,
- drop "b2x:" prefix from all part names,
- rename the capability from "bundle2-exp" to "bundle2",
- rename the hook flag from 'bundle2-exp' to 'bundle2'
2015-04-09 16:25:48 -04:00
Matt Mackall
e73d6537ab publishing: use new helper method 2015-06-18 15:34:22 -05:00
Martin von Zweigbergk
194a36de1c changegroup: simplify by not reusing 'prog(ress)' instance
Just create a new instance of the 'prog' class for each step instead
of replacing its fields and resetting the counter.
2015-06-12 11:00:50 -07:00
Martin von Zweigbergk
468a7b8172 changegroup: don't use 'repo' for non-repo 'self'
'repo' is a very confusing name to use for 'self', especially when
it's not a repo. Also drop repo.ui member (a.k.a. self.ui) now that
'self' doesn't shadow outer 'repo' variable.
2015-06-12 10:54:10 -07:00
Mike Edgar
b4a5dfbe4d changegroup: emit full-replacement deltas if either revision is censored
To ensure that exchanged deltas in the presence of censored revisions can
always be applied to the recipient repository, the deltas must replace the
entire base text. To make this restriction reasonably enforceable, the delta
must do so with a single patch operation.

For background and broader design of the censorship feature, see:
http://mercurial.selenic.com/wiki/CensorPlan
2015-01-21 22:09:32 -05:00
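A full-replacement delta of this kind can be sketched with `struct`. The sketch assumes the binary delta layout in which each patch operation starts with three big-endian 32-bit integers (start, end, length of the new data) followed by the data itself; the function names are hypothetical.

```python
import struct

# Assumed hunk header: start offset, end offset, length of replacement data.
HUNK_HEADER = struct.Struct('>lll')


def full_replacement_delta(base_len, new_text):
    """Build a delta that replaces the whole base text in one operation."""
    return HUNK_HEADER.pack(0, base_len, len(new_text)) + new_text


def is_full_replacement(delta, base_len):
    """Check that `delta` is exactly one hunk covering the entire base."""
    if len(delta) < HUNK_HEADER.size:
        return False
    start, end, length = HUNK_HEADER.unpack_from(delta)
    return (start == 0 and end == base_len
            and HUNK_HEADER.size + length == len(delta))
```

Requiring a single operation keeps the check cheap: the recipient only has to inspect the first header and compare lengths.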
Mads Kiilerich
b2b60414f6 spelling: fixes from proofreading of spell checker issues 2015-01-18 02:38:57 +01:00
Mike Edgar
9635f8c5b0 revlog: in addgroup, reject ill-formed deltas based on censored nodes
To ensure interoperability when clones disagree about which file nodes are
censored, a restriction is made on deltas based on censored nodes. Any such
delta must replace the full text of the base in a single patch.

If the recipient of a delta considers the base to be censored and the delta
is not in the expected form, the recipient must reject it, as it can't know
if the source has also censored the base.

For background and broader design of the censorship feature, see:
http://mercurial.selenic.com/wiki/CensorPlan
2015-02-06 00:55:29 +00:00
Eric Sumner
dab488d66f changegroup.getsubset: support multiple versions
Allow a version parameter to specify which version of the packer should be
used
2015-01-15 15:55:13 -08:00
Eric Sumner
96fb8b0c04 changegroup.writebundle: HG2Y support
This diff adds support to writebundle to generate a bundle2 wrapper; upcoming
diffs will add an option to write a v2 changegroup part instead of v1 in these
bundles.
2015-01-15 15:39:16 -08:00
Eric Sumner
7cbcf9bdca changegroup.writebundle: provide ui
The next diff will add support for writing bundle2 files to writebundle, but
the bundle2 generator wants access to a ui object.  This changes the signature
and callsites to pass one in.
2015-01-15 14:39:41 -08:00
Eric Sumner
c5cdff3779 pullbundle2: extract addchangegroup result combining into its own function
This will also be used for 'hg unbundle'
2015-01-16 12:53:45 -08:00
Mads Kiilerich
af8710d713 bundle: when verbose, show what takes up the space in the generated bundle
This is kind of similar to the debugbundle command but gives summarized actual
uncompressed number of bytes when creating the bundle. The numbers are as
usable as the bundle format is efficient. Hopefully bundle2 will make it a
better indicator of actual entropy.

This is useful when accepting pull requests to assess whether the repo size
increase seems reasonable for the diff before pushing stuff upstream. It has
helped me catch large files that should have been committed as largefiles
but were committed as regular files in intermediate changesets.

This output doesn't combine well with debug output so we only enable it when
verbose without debug.
2014-08-15 19:43:32 +02:00
Pierre-Yves David
15b83609e4 addchangegroup: accept an expected total number of changesets as argument
The caller can optionally indicate how many changesets are expected to be added. This
will be used for a more useful progress bar output.
2015-06-07 15:57:40 -07:00
Pierre-Yves David
060a368fc5 changegroup: remove 'getchangegroupraw' function
There is no remaining caller for this function.
2015-06-07 15:49:57 -07:00
Gregory Szorc
ac42db2bfb changegroup: rename _computeoutgoing to computeoutgoing
We're going to use this function from another module in an upcoming
patch. Drop the _ prefix to mark it as non-private.
2015-06-02 19:58:06 -07:00
Martin von Zweigbergk
0059d15d29 changegroup: drop _changelog and _manifest properties
We already have a _repo property on the packer, and we only access the
changelog and manifest revlog in one place, so it's just as easy to
get them from self._repo.
2015-04-30 16:45:03 -07:00
Martin von Zweigbergk
5a53aebb16 changegroup: document the cases where reordering complicates linkrevs 2015-04-29 13:25:07 -07:00
Martin von Zweigbergk
d2ebe6d492 changegroup: extract condition for linkrev fastpath
The condition for taking the fastpath (or not) is used in two
places. By extracting it, we also provide a place to document what
it's about.
2015-04-29 10:34:28 -07:00
Martin von Zweigbergk
c581722414 changegroup.group: drop 'reorder' parameter
Since we always pass self._reorder to self.group(), let's drop the
parameter and let group() read from self._reorder itself. There are no
other in-tree callers to group().
2015-04-29 10:30:58 -07:00
Martin von Zweigbergk
933b6a8964 cg2packer: set reorder=False in __init__ instead of in group()
The difference between reorder=None (bundle.reorder=auto) and
reorder=False is that the generaldelta revlogs get reordered with the
former. In cg2packer.group(), we check if the revlog uses generaldelta
and if reorder=None, and then convert that to reorder=False. We are
effectively saying that whether or not generaldelta is used, we want
reorder=None to mean reorder=False for changegroup 2. To make this
clearer, check if reorder=None in the constructor and change it to
False there and drop the overriding of group(). Also document the
reason for turning reordering off.
2015-04-29 10:38:45 -07:00
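A minimal sketch of the refactoring described in 933b6a8964, assuming simplified stand-in classes. `Packer`, `Packer2`, and `should_reorder` are hypothetical names for illustration, not Mercurial's actual API. The point is that the subclass normalizes reorder=None to False once, in its constructor, instead of overriding the method that consumes it.

```python
class Packer:
    """Hypothetical stand-in for a changegroup packer."""

    def __init__(self, reorder=None):
        # reorder is True (on), False (off), or None (auto).
        self._reorder = reorder

    def should_reorder(self, generaldelta):
        # With auto, generaldelta revlogs get reordered.
        if self._reorder is None:
            return generaldelta
        return self._reorder


class Packer2(Packer):
    """Hypothetical stand-in for the changegroup 2 packer."""

    def __init__(self, reorder=None):
        # For changegroup 2, auto means off, whether or not
        # generaldelta is used, so convert it here once.
        if reorder is None:
            reorder = False
        super().__init__(reorder)


print(Packer().should_reorder(generaldelta=True))
print(Packer2().should_reorder(generaldelta=True))
```

An explicit reorder=True passed to `Packer2` is left untouched; only the auto case changes.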
Martin von Zweigbergk
b0cea1d15c changegroup: use 'reorder is None' instead of 'reorder is not True/False'
The config option bundle.reorder can be {on,off,auto}, which gets read
into the 'reorder' variable as {True,False,None}. In two places, we
need to decide how to handle the None/auto case. I personally find it
easier to read those expressions when written to explicitly compare to
None.
2015-04-23 09:44:22 -07:00
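The tri-state mapping described in b0cea1d15c can be sketched as follows. `parse_reorder` and the two `is_auto_*` helpers are hypothetical and only illustrate why an explicit comparison to None reads more directly; they are not the expressions used in changegroup.py.

```python
# Hypothetical mapping of bundle.reorder values to the 'reorder' variable.
_REORDER_VALUES = {"on": True, "off": False, "auto": None}


def parse_reorder(value):
    """Return True, False, or None (auto) for a bundle.reorder setting."""
    try:
        return _REORDER_VALUES[value]
    except KeyError:
        raise ValueError("unrecognized reorder value: %r" % (value,))


def is_auto_implicit(reorder):
    # Detects auto by ruling out the other two states.
    return reorder is not True and reorder is not False


def is_auto_explicit(reorder):
    # Same result for the three valid states, but states the intent.
    return reorder is None


for setting in ("on", "off", "auto"):
    state = parse_reorder(setting)
    assert is_auto_implicit(state) == is_auto_explicit(state)
```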
Martin von Zweigbergk
3c6a3be528 changegroup: close progress in same function as it's started
changegroup.group() and changegroup.generatefiles() both currently
start progress (with topic "bundling"), but changegroup.generate()
closes the topic. Move the closing to the functions that start the
topic, so it's easier to see where the topic is started and closed.

This completes a move that seems to have been started in f8f5836242c6
(bundle-ng: move progress handling out of the linkrev callback,
2013-05-10).
2015-04-22 15:03:09 -07:00
Martin von Zweigbergk
cf9f806198 changegroup: don't reuse 'mfest' variable for different type
We have a variable 'mfest' that's first a manifest nodeid and then a
manifest. Let's make it clearer by using separate variables for the
two uses.
2015-04-28 10:21:04 -07:00
Martin von Zweigbergk
67aa378a4f changegroup: rename 'mf' to 'ml' to match 'cl', since it's a revlog
The 'mf' variable is a manifest revlog, not a manifest, so let's
rename it accordingly. We already call the changelog variable 'cl', so
'ml' seems appropriate.
2015-04-28 10:19:42 -07:00
Martin von Zweigbergk
a2a1e5c7db changegroup: rename 'needed' to 'clrevs' to match 'clnodes' 2015-04-20 14:11:20 -07:00
Martin von Zweigbergk
725e64cb42 changegroup: document that 'source' parameter exists for extensions
The 'source' parameter passed to generatefiles() is unused by the
method itself, but Durham says it is used by an extension.
2015-04-28 13:49:19 -07:00
Martin von Zweigbergk
e67f94169b changegroup: removed unused 'source' parameter from prune()
The parameter has been unused since it was introduced in 40209abd6471
(bundle: refactor changegroup prune to be its own function,
2013-05-30), and Durham says it is not used by his extension either.
2015-04-28 13:40:00 -07:00
Matt Mackall
174e7f793d merge with stable 2014-11-22 17:09:04 -06:00
Durham Goode
0a7a4a1f33 changegroup: fix file linkrevs during reorders (issue4462)
Previously, if reorder was true during the creation of a changegroup bundle,
it was possible that the manifest and filelogs would be reordered such that the
resulting bundle filelog had a linkrev that pointed to a commit that was not
the earliest instance of the filelog revision. For example:

With commits:

0<-1<---3<-4
  \       /
   --2<---

If 2 and 3 added the same version of a file, and the manifests of 2 and 3 had
their order reversed but the changelog did not, it could produce a filelog with
linkrevs 0<-3 instead of 0<-2. That meant that if commit 3 was stripped, it would
delete that file data from the repository and commit 2 would be corrupt (as
would any future pulls that tried to build upon that version of the file).

The fix is to make the linkrev fixup smarter. Previously it considered the first
manifest that added a file to be the first commit that added that file, which is
not true. Now, for every file revision we add to the bundle we make sure we
attach it to the earliest applicable linkrev.
2014-11-20 16:30:57 -08:00
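The difference between the old and new linkrev fixup in 0a7a4a1f33 can be sketched with a toy model. The function names and the data layout (changelog revision numbers mapped to sets of file node ids) are assumptions made for illustration, not Mercurial's data structures.

```python
def first_seen_linkrevs(manifest_order, fnodes_by_clrev):
    """Old behavior: link each file node to the first manifest that lists it."""
    linkrevs = {}
    for clrev in manifest_order:
        for fnode in fnodes_by_clrev[clrev]:
            linkrevs.setdefault(fnode, clrev)
    return linkrevs


def earliest_linkrevs(manifest_order, fnodes_by_clrev):
    """New behavior: link each file node to the earliest changelog revision."""
    linkrevs = {}
    for clrev in manifest_order:
        for fnode in fnodes_by_clrev[clrev]:
            if fnode not in linkrevs or clrev < linkrevs[fnode]:
                linkrevs[fnode] = clrev
    return linkrevs


# Commits 2 and 3 add the same file revision "f1"; the manifests are
# emitted with 3 before 2, while the changelog order is unchanged.
fnodes_by_clrev = {0: {"f0"}, 1: set(), 2: {"f1"}, 3: {"f1"}, 4: set()}
manifest_order = [0, 1, 3, 2, 4]

print(first_seen_linkrevs(manifest_order, fnodes_by_clrev)["f1"])
print(earliest_linkrevs(manifest_order, fnodes_by_clrev)["f1"])
```

In this model the first function links "f1" to commit 3, which is the situation the commit message describes as unsafe when commit 3 is stripped; the second links it to commit 2 regardless of manifest order.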
Gregory Szorc
04eeb85285 changegroup: sparsely populate fnodes
Previously, fnodes had a key and empty dict value for every element in
changedfiles. This is somewhat wasteful. Empty dicts in CPython consume
a lot more memory than you would expect - 280 bytes.

On mozilla-central, which has ~190,000 files/fnodes keys, the previous
loop populating fnodes allocated 91,924 KB of memory, most of that for
the empty dicts.

With this patch in place, our peak RSS during mozilla-central clone
drops:

before:  364,356 KB
after:   326,008 KB
delta:   -38,348 KB

When combined with the previous patch, total peak RSS decrease is now
190,116 KB.
2014-11-06 22:48:20 -08:00
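A rough sketch of the idea behind 04eeb85285, assuming fnodes maps a file name to a dict of file node to link node. `populate_eager` and `record_node` are hypothetical helpers; the actual change in changegroup.py may be structured differently.

```python
def populate_eager(changedfiles):
    """Previous approach: one (mostly empty) dict per changed file."""
    return {fname: {} for fname in changedfiles}


def record_node(fnodes, fname, node, linknode):
    """Sparse approach: create the per-file dict only when it is needed."""
    fnodes.setdefault(fname, {})[node] = linknode


changedfiles = ["a.txt", "b.txt", "c.txt"]

eager = populate_eager(changedfiles)

sparse = {}
record_node(sparse, "b.txt", "node1", "link1")

# The eager mapping holds a dict for every file; the sparse one only
# for files that actually received an entry.
print(len(eager), len(sparse))
```

Readers of a sparse mapping need a default for missing keys, for example `fnodes.get(fname, {})`.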
Gregory Szorc
c6e3c6fb27 changegroup: don't store unused value on fnodes (issue4443)
The contents of fnodes are only accessed once per key. It is wasteful to
cache the value since nobody will use it.

Before this patch, the caching of unused data in fnodes was effectively
causing a memory leak during the file streaming part of bundle creation.

On mozilla-central (which has ~190,000 entries in fnodes), this patch
has a significant impact on RSS at the end of generate():

before:  516,124 KB
after:   364,356 KB
delta:  -151,768 KB

The origin of this code can be traced back to 1f567a607f1f and has been
with us since the 2.7 release.
2014-11-06 22:33:48 -08:00
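One way to avoid retaining values that are read only once, as c6e3c6fb27 describes, is to remove each entry as it is consumed. This is a sketch under that assumption: `iter_file_nodes` is a hypothetical generator, and the actual patch may avoid the caching in a different way.

```python
def iter_file_nodes(fnodes, changedfiles):
    """Yield (fname, nodes) once per file, releasing each entry after use."""
    for fname in sorted(changedfiles):
        # pop() drops the mapping's reference, so the per-file dict can be
        # freed as soon as the consumer is done with it.
        nodes = fnodes.pop(fname, {})
        yield fname, nodes


fnodes = {"a.txt": {"n1": "c1"}, "b.txt": {"n2": "c2"}}
consumed = list(iter_file_nodes(fnodes, ["b.txt", "a.txt"]))

# Every entry has been handed out exactly once and nothing is left cached.
print(consumed)
print(fnodes)
```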
Gregory Szorc
0bfb4de7ec changegroup: don't define lookupmf() until it is needed
lookupmf() is currently defined earlier than when it is needed. Future
patches further refactoring this code will be easier to read when
lookupmf() is in its new home.
2014-11-06 20:57:12 -08:00