Commit Graph

321 Commits

Author SHA1 Message Date
Boris Feld
d449bcaec6 bundle2: support a 'records' mode for the 'bookmarks' part
In this mode, the bookmarks changes are record in the 'bundleoperation' records
instead of inflicted to the repository. This is necessary to use the part when
pulling.
2017-10-17 15:26:16 +02:00
Boris Feld
85a28bae65 bundle2: add a 'modes' dictionary to the bundle operation
This new attribute allows the codes requesting an unbundling to pass important
information to individual part handlers. The current target use case is to
allow for receiving 'bookmarks' part without directly updating local
repository, but just recording the received data instead. This is necessary
for pull where the remote bookmarks are processed locally. I expect the
concept to be beneficial to other parts in the future.

To clarify the bookmark behavior on pull, the remote bookmark value are not just
taken -as-is- into the local repository. There is an extra step to detect
bookmark divergence. The remote bookmarks data are stored until this processing
happens.
2017-10-17 15:39:34 +02:00
Boris Feld
fa6d2b1849 bookmark: add pushkey hook compatiblity to the bundle2 part
Currently, pushing a bookmark update triggers a pushkey hooks. It is likely
that users in the wild use such hooks to control bookmark movement. Using a non
push-key mechanism to exchange bookmark means these hooks are no longer called,
possibly breaking existing users setup. So we add explicit call to the pushkey
hooks in the handling of the bundle2 part. This behavior can be disabled with a
new config knob: 'server.bookmarks-pushkey-compat'.
2017-10-17 12:07:24 +02:00
Boris Feld
b1c5929fec bookmark: introduce a 'bookmarks' part
This part can carry and apply bookmarks information. We start with adding the
core behavior of the part. In its current form, the part is only suitable for
push since it plain update the bookmark without consideration for the local
state. Support of the behavior needed for pulling will be added in later
changesets.
2017-10-15 18:02:11 +02:00
Boris Feld
e7031fe580 push: include a 'check:bookmarks' part when possible
Before updating the actual bookmark update, we can start with updating the way
we check for push race. Checking bookmarks state earlier is useful even if we
still use pushkey. Aborting before the changegroup is added can save a lot of
time.
2017-11-13 04:22:45 +01:00
Boris Feld
02c4c0f0cf bookmark: add a 'check:bookmarks' bundle2 part
This part checks that bookmarks are still at the node they are expected to be.
This allows a pushing client to detect push race where the repository was
updated between the time it discovered the server state and the time it managed
to finish its push.

Such checking already exists when pushing bookmark through pushkey. This new
part can be inserted at the beginning of the bundle, triggering abort earlier.
In addition, we would like to move away from pushey to push bookmark. A step
useful to solve issue5165.
2017-10-15 15:01:03 +02:00
Gregory Szorc
a7f7143fb9 bundle2: avoid unbound read when seeking
Currently, seekableunbundlepart.seek() will perform a read() during
seek operations. This will allocate a buffer to hold the raw data
over the seek distance. This can lead to very large allocations
and cause performance to suffer.

We change the code to perform read(32768) in a loop to avoid
potentially large allocations.

`hg perfbundleread` on an uncompressed Firefox bundle reveals
a performance impact:

! bundle2 iterparts()
! wall 2.992605 comb 2.990000 user 2.260000 sys 0.730000 (best of 4)
! bundle2 iterparts() seekable
! wall 3.863810 comb 3.860000 user 3.000000 sys 0.860000 (best of 3)
! bundle2 part seek()
! wall 6.213387 comb 6.200000 user 3.350000 sys 2.850000 (best of 3)
! wall 3.820347 comb 3.810000 user 2.980000 sys 0.830000 (best of 3)

Since seekable bundle parts are (only) used by bundlerepo, this /may/
speed up initial loading of bundle-based repos. But any improvement
will likely only be noticed on very large bundles.

Differential Revision: https://phab.mercurial-scm.org/D1394
2017-11-13 22:20:12 -08:00
Gregory Szorc
d1e95ae39b bundle2: inline struct operations
Before, we were calling struct.unpack() (via an alias) on every
loop iteration. I'm not sure what Python does under the hood, but
it would have to look at the struct format and determine what to
do.

This commit establishes a struct.Struct instance and reuses it for
struct reading.

We can see the impact from running `hg perfbundleread` on a Firefox
bundle:

! read(8k)
! wall 0.679730 comb 0.680000 user 0.140000 sys 0.540000 (best of 15)
! read(16k)
! wall 0.577228 comb 0.570000 user 0.080000 sys 0.490000 (best of 17)
! read(32k)
! wall 0.516060 comb 0.520000 user 0.040000 sys 0.480000 (best of 20)
! read(128k)
! wall 0.496378 comb 0.490000 user 0.010000 sys 0.480000 (best of 20)
! bundle2 iterparts()
! wall 3.056811 comb 3.050000 user 2.340000 sys 0.710000 (best of 4)
! wall 2.992605 comb 2.990000 user 2.260000 sys 0.730000 (best of 4)
! bundle2 iterparts() seekable
! wall 4.007676 comb 4.000000 user 3.170000 sys 0.830000 (best of 3)
! wall 3.863810 comb 3.860000 user 3.000000 sys 0.860000 (best of 3)
! bundle2 part seek()
! wall 6.267110 comb 6.250000 user 3.480000 sys 2.770000 (best of 3)
! wall 6.213387 comb 6.200000 user 3.350000 sys 2.850000 (best of 3)
! bundle2 part read(8k)
! wall 3.404164 comb 3.400000 user 2.650000 sys 0.750000 (best of 3)
! wall 3.241099 comb 3.250000 user 2.560000 sys 0.690000 (best of 3)
! bundle2 part read(16k)
! wall 3.197972 comb 3.200000 user 2.490000 sys 0.710000 (best of 4)
! wall 3.003930 comb 3.000000 user 2.270000 sys 0.730000 (best of 4)
! bundle2 part read(32k)
! wall 3.060557 comb 3.060000 user 2.340000 sys 0.720000 (best of 4)
! wall 2.904695 comb 2.900000 user 2.160000 sys 0.740000 (best of 4)
! bundle2 part read(128k)
! wall 2.952209 comb 2.950000 user 2.230000 sys 0.720000 (best of 4)
! wall 2.776140 comb 2.780000 user 2.070000 sys 0.710000 (best of 4)

Profiling now says most remaining time is spent in util.chunkbuffer.
I already heavily optimized that data structure several releases ago.
So we'll likely get little more performance out of bundle2 reading
while still retaining util.chunkbuffer().

Differential Revision: https://phab.mercurial-scm.org/D1393
2017-11-13 21:54:46 -08:00
Gregory Szorc
e2444ba526 bundle2: inline changegroup.readexactly()
Profiling reveals this loop is pretty tight. Literally any
function call elimination can make a big difference.

This commit inlines the relatively trivial changegroup.readexactly()
method inside the loop.

The results with `hg perfbundleread` on a bundle of the Firefox repo
speak for themselves:

! read(8k)
! wall 0.679730 comb 0.680000 user 0.140000 sys 0.540000 (best of 15)
! read(16k)
! wall 0.577228 comb 0.570000 user 0.080000 sys 0.490000 (best of 17)
! read(32k)
! wall 0.516060 comb 0.520000 user 0.040000 sys 0.480000 (best of 20)
! read(128k)
! wall 0.496378 comb 0.490000 user 0.010000 sys 0.480000 (best of 20)
! bundle2 iterparts()
! wall 3.460903 comb 3.460000 user 2.760000 sys 0.700000 (best of 3)
! wall 3.056811 comb 3.050000 user 2.340000 sys 0.710000 (best of 4)
! bundle2 iterparts() seekable
! wall 4.312722 comb 4.310000 user 3.480000 sys 0.830000 (best of 3)
! wall 4.007676 comb 4.000000 user 3.170000 sys 0.830000 (best of 3)
! bundle2 part seek()
! wall 6.754764 comb 6.740000 user 3.970000 sys 2.770000 (best of 3)
! wall 6.267110 comb 6.250000 user 3.480000 sys 2.770000 (best of 3)
! bundle2 part read(8k)
! wall 3.668004 comb 3.660000 user 2.960000 sys 0.700000 (best of 3)
! wall 3.404164 comb 3.400000 user 2.650000 sys 0.750000 (best of 3)
! bundle2 part read(16k)
! wall 3.489196 comb 3.480000 user 2.750000 sys 0.730000 (best of 3)
! wall 3.197972 comb 3.200000 user 2.490000 sys 0.710000 (best of 4)
! bundle2 part read(32k)
! wall 3.388569 comb 3.380000 user 2.640000 sys 0.740000 (best of 3)
! wall 3.060557 comb 3.060000 user 2.340000 sys 0.720000 (best of 4)
! bundle2 part read(128k)
! wall 3.276415 comb 3.270000 user 2.560000 sys 0.710000 (best of 4)
! wall 2.952209 comb 2.950000 user 2.230000 sys 0.720000 (best of 4)

Differential Revision: https://phab.mercurial-scm.org/D1392
2017-11-13 21:48:35 -08:00
Gregory Szorc
a494c5cbce bundle2: inline debug logging
Profiling revealed that repeated calls to indebug() were
consuming a fair amount of CPU during bundle2 reading, with
most of the time spent in ui.configbool().

Inlining indebug() and avoiding extra attribute lookups speeds
things up substantially. Using `hg perfbundleread` with a Firefox
bundle:

! read(8k)
! wall 0.679730 comb 0.680000 user 0.140000 sys 0.540000 (best of 15)
! read(16k)
! wall 0.577228 comb 0.570000 user 0.080000 sys 0.490000 (best of 17)
! read(32k)
! wall 0.516060 comb 0.520000 user 0.040000 sys 0.480000 (best of 20)
! read(128k)
! wall 0.496378 comb 0.490000 user 0.010000 sys 0.480000 (best of 20)
! bundle2 iterparts()
! wall 6.983756 comb 6.980000 user 6.220000 sys 0.760000 (best of 3)
! wall 3.460903 comb 3.460000 user 2.760000 sys 0.700000 (best of 3)
! bundle2 iterparts() seekable
! wall 8.132131 comb 8.110000 user 7.160000 sys 0.950000 (best of 3)
! wall 4.312722 comb 4.310000 user 3.480000 sys 0.830000 (best of 3)
! bundle2 part seek()
! wall 10.860942 comb 10.840000 user 7.790000 sys 3.050000 (best of 3)
! wall 6.754764 comb 6.740000 user 3.970000 sys 2.770000 (best of 3)
! bundle2 part read(8k)
! wall 7.258035 comb 7.260000 user 6.470000 sys 0.790000 (best of 3)
! wall 3.668004 comb 3.660000 user 2.960000 sys 0.700000 (best of 3)
! bundle2 part read(16k)
! wall 7.099891 comb 7.080000 user 6.310000 sys 0.770000 (best of 3)
! wall 3.489196 comb 3.480000 user 2.750000 sys 0.730000 (best of 3)
! bundle2 part read(32k)
! wall 6.964685 comb 6.950000 user 6.130000 sys 0.820000 (best of 3)
! wall 3.388569 comb 3.380000 user 2.640000 sys 0.740000 (best of 3)
! bundle2 part read(128k)
! wall 6.852867 comb 6.850000 user 6.060000 sys 0.790000 (best of 3)
! wall 3.276415 comb 3.270000 user 2.560000 sys 0.710000 (best of 4)

Differential Revision: https://phab.mercurial-scm.org/D1391
2017-11-13 22:05:54 -08:00
Gregory Szorc
2f77487f6f bundle2: don't use seekable bundle2 parts by default (issue5691)
The last commit removed the last use of the bundle2 part seek() API
in the generic bundle2 part iteration code. This means we can now
switch to using unseekable bundle2 parts by default and have the
special consumers that actually need the behavior request it.

This commit changes unbundle20.iterparts() to expose non-seekable
unbundlepart instances by default. If seekable parts are needed,
callers can pass "seekable=True." The bundlerepo class needs
seekable parts, so it does this.

The interrupt handler is also changed to use a regular unbundlepart.
So, by default, all consumers except bundlerepo will see unseekable
parts.

Because the behavior of the iterparts() benchmark changed, we add
a variation to test seekable parts vs unseekable parts. And because
parts no longer have seek() unless "seekable=True," we update the
"part seek" benchmark.

Speaking of benchmarks, this change has the following impact to
`hg perfbundleread` on an uncompressed bundle of the Firefox repo
(6,070,036,163 bytes):

! read(8k)
! wall 0.722709 comb 0.720000 user 0.150000 sys 0.570000 (best of 14)
! read(16k)
! wall 0.602208 comb 0.590000 user 0.080000 sys 0.510000 (best of 17)
! read(32k)
! wall 0.554018 comb 0.560000 user 0.050000 sys 0.510000 (best of 18)
! read(128k)
! wall 0.520086 comb 0.530000 user 0.020000 sys 0.510000 (best of 20)
! bundle2 forwardchunks()
! wall 2.996329 comb 3.000000 user 2.300000 sys 0.700000 (best of 4)
! bundle2 iterparts()
! wall 8.070791 comb 8.060000 user 7.180000 sys 0.880000 (best of 3)
! wall 6.983756 comb 6.980000 user 6.220000 sys 0.760000 (best of 3)
! bundle2 iterparts() seekable
! wall 8.132131 comb 8.110000 user 7.160000 sys 0.950000 (best of 3)
! bundle2 part seek()
! wall 10.370142 comb 10.350000 user 7.430000 sys 2.920000 (best of 3)
! wall 10.860942 comb 10.840000 user 7.790000 sys 3.050000 (best of 3)
! bundle2 part read(8k)
! wall 8.599892 comb 8.580000 user 7.720000 sys 0.860000 (best of 3)
! wall 7.258035 comb 7.260000 user 6.470000 sys 0.790000 (best of 3)
! bundle2 part read(16k)
! wall 8.265361 comb 8.250000 user 7.360000 sys 0.890000 (best of 3)
! wall 7.099891 comb 7.080000 user 6.310000 sys 0.770000 (best of 3)
! bundle2 part read(32k)
! wall 8.290308 comb 8.280000 user 7.330000 sys 0.950000 (best of 3)
! wall 6.964685 comb 6.950000 user 6.130000 sys 0.820000 (best of 3)
! bundle2 part read(128k)
! wall 8.204900 comb 8.150000 user 7.210000 sys 0.940000 (best of 3)
! wall 6.852867 comb 6.850000 user 6.060000 sys 0.790000 (best of 3)

The significant speedup is due to not incurring the overhead to track
payload offset data. Of course, this overhead is proportional to
bundle2 part size. So a multiple gigabyte changegroup part is on the
extreme side of the spectrum for real-world impact.

In addition to the CPU efficiency wins, not tracking offset data
also means not using memory to hold that data. Using a bundle based on
the example BSD repository in issue 5691, this change has a drastic
impact to memory usage during `hg unbundle` (`hg clone` would behave
similarly). Before, memory usage incrementally increased for the
duration of bundle processing. In other words, as we advanced through
the changegroup and bundle2 part, we kept allocating more memory to
hold offset data. After this change, we still increase memory during
changegroup application. But the rate of increase is significantly
slower. (A bulk of the remaining gradual increase appears to be the
storing of revlog sizes in the transaction object to facilitate
rollback.)

The RSS at the end of filelog application is as follows:

Before: ~752 MB
After:  ~567 MB

So, we were storing ~185 MB of offset data that we never even used.
Talk about wasteful!

.. api::

   bundle2 parts are no longer seekable by default.

.. perf::

   bundle2 read I/O throughput significantly increased.

.. perf::

   Significant memory use reductions when reading from bundle2 bundles.

   On the BSD repository, peak RSS during changegroup application
   decreased by ~185 MB from ~752 MB to ~567 MB.

Differential Revision: https://phab.mercurial-scm.org/D1390
2017-11-13 21:10:37 -08:00
Gregory Szorc
d035774cab bundle2: only seek to beginning of part in bundlerepo
For reasons still not yet fully understood by me, bundlerepo
requires its changegroup bundle2 part to be seeked to beginning
after part iteration. As far as I can tell, it is the only
bundle2 part consumer that relies on this behavior.

This seeking was performed in the generic iterparts() API. Again,
I don't fully understand why it was here and not in bundlerepo.
Probably historical reasons.

What I do know is that all other bundle2 part consumers don't
need this special behavior (assuming the tests are comprehensive).
So, we move the code from bundle2's iterparts() to bundlerepo's
consumption of iterparts().

Differential Revision: https://phab.mercurial-scm.org/D1389
2017-11-13 20:12:00 -08:00
Gregory Szorc
14c677b632 bundle2: implement consume() API on unbundlepart
We want bundle parts to not be seekable by default. That means
eliminating the generic seek() method.

A common pattern in bundle2.py is to seek to the end of the part
data. This is mainly used by the part iteration code to ensure
the underlying stream is advanced to the next bundle part.

In this commit, we establish a dedicated API for consuming a
bundle2 part data. We switch users of seek() to it.

The old implementation of seek(0, os.SEEK_END) would effectively
call self.read(). The new implementation calls self.read(32768)
in a loop. The old implementation would therefore assemble a
buffer to hold all remaining data being seeked over. For seeking
over large bundle parts, this would involve a large allocation and
a lot of overhead to collect intermediate data! This overhead can
be seen in the results for `hg perfbundleread`:

! bundle2 iterparts()
! wall 10.891305 comb 10.820000 user 7.990000 sys 2.830000 (best of 3)
! wall 8.070791 comb 8.060000 user 7.180000 sys 0.880000 (best of 3)
! bundle2 part seek()
! wall 12.991478 comb 10.390000 user 7.720000 sys 2.670000 (best of 3)
! wall 10.370142 comb 10.350000 user 7.430000 sys 2.920000 (best of 3)

Of course, skipping over large payload data isn't likely very common.
So I doubt the performance wins will be observed in the wild.

Differential Revision: https://phab.mercurial-scm.org/D1388
2017-11-13 20:03:02 -08:00
Gregory Szorc
589a08705a bundle2: implement generic part payload decoder
The previous commit extracted _payloadchunks() to a new derived class.
There was still a reference to this method in unbundlepart, making
unbundlepart unusable on its own.

This commit implements a generic version of a bundle2 part payload
decoder, without offset tracking. seekableunbundlepart._payloadchunks()
has been refactored to consume it, adding offset tracking like before.
We also implement unbundlepart._payloadchunks(), which is a thin
wrapper for it. Since we never instantiate unbundlepart directly,
this new method is not used. This will be changed in subsequent
commits.

The new implementation also inlines some simple code from unpackermixin
and adds some local variable to prevent extra function calls and
attribute lookups. `hg perfbundleread` on an uncompressed Firefox
bundle seems to show a minor win:

! bundle2 iterparts()
! wall 12.593258 comb 12.250000 user 8.870000 sys 3.380000 (best of 3)
! wall 10.891305 comb 10.820000 user 7.990000 sys 2.830000 (best of 3)
! bundle2 part seek()
! wall 13.173163 comb 11.100000 user 8.390000 sys 2.710000 (best of 3)
! wall 12.991478 comb 10.390000 user 7.720000 sys 2.670000 (best of 3)
! bundle2 part read(8k)
! wall 9.483612 comb 9.480000 user 8.420000 sys 1.060000 (best of 3)
! wall 8.599892 comb 8.580000 user 7.720000 sys 0.860000 (best of 3)
! bundle2 part read(16k)
! wall 9.159815 comb 9.150000 user 8.220000 sys 0.930000 (best of 3)
! wall 8.265361 comb 8.250000 user 7.360000 sys 0.890000 (best of 3)
! bundle2 part read(32k)
! wall 9.141308 comb 9.130000 user 8.220000 sys 0.910000 (best of 3)
! wall 8.290308 comb 8.280000 user 7.330000 sys 0.950000 (best of 3)
! bundle2 part read(128k)
! wall 8.880587 comb 8.850000 user 7.960000 sys 0.890000 (best of 3)
! wall 8.204900 comb 8.150000 user 7.210000 sys 0.940000 (best of 3)

Function call overhead in Python strikes again!

Of course, bundle2 decoding CPU overhead is likely small compared to
decompression and changegroup application. But every little bit helps.

Differential Revision: https://phab.mercurial-scm.org/D1387
2017-11-12 19:46:15 -08:00
Gregory Szorc
d2101718bc bundle2: extract logic for seeking bundle2 part into own class
Currently, unbundlepart classes support bi-directional seeking.
Most consumers of unbundlepart only ever seek forward - typically
as part of moving to the end of the bundle part so they can move
on to the next one. But regardless of the actual usage of the
part, instances maintain an index mapping offsets within the
underlying raw payload to offsets within the decoded payload.

Maintaining the mapping of offset data can be expensive in terms of
memory use. Furthermore, many bundle2 consumers don't have access
to an underlying seekable stream. This includes all compressed
bundles. So maintaining offset data when the underlying stream
can't be seeked anyway is wasteful. And since many bundle2 streams
can't be seeked, it seems like a bad idea to expose a seek API
in bundle2 parts by default. If you provide them, people will
attempt to use them.

Seekable bundle2 parts should be the exception, not the rule. This
commit starts the process dividing unbundlepart into 2 classes: a
base class that supports linear, one-time reads and a child class
that supports bi-directional seeking. In this first commit, we
split various methods and attributes out into a new
"seekableunbundlepart" class. Previous instantiators of "unbundlepart"
now instantiate "seekableunbundlepart." This preserves backwards
compatibility. The coupling between the classes is still tight:
"unbundlepart" cannot be used on its own. This will be addressed
in subsequent commits.

Differential Revision: https://phab.mercurial-scm.org/D1386
2017-11-13 19:22:11 -08:00
Gregory Szorc
db823f8dff bundle2: use os.SEEK_* constants
Constants make code easier to read than magic numbers.

I also threw in an explicit argument in a caller to further
increase code comprehension.

Differential Revision: https://phab.mercurial-scm.org/D1370
2017-11-11 16:48:40 -08:00
Boris Feld
f74c12124e phase: introduce a new 'check:phases' part
This part checks if revisions are still in the same phase as when the
bundle was generated. This is similar to what 'check:heads' or
'check:updated-heads' bundle2 part achieves for changesets.

We needs seems before we can move away from pushkey usage from phase since
pushkey has it own built-in push-race detection.
2017-10-11 07:13:02 +02:00
Durham Goode
c48caf9e03 bundle2: immediate exit for ctrl+c (issue5692)
21c2df59a regressed bundle2 by catching all exceptions and trying to handle
them. The old behavior was to allow KeyboardInterrupts to throw and not have
graceful cleanup, which allowed it to exit immediately. Let's go back to that
behavior.

Differential Revision: https://phab.mercurial-scm.org/D960
2017-10-11 10:36:59 -07:00
Boris Feld
88a253014b pull: remove inadequate use of operations records to update stepdone
The 'stepdone' set is design to be a client side mechanism. If the client used
some advanced capabilities to request necessary information (changeset,
obsmarkers, phases, etc). It marks the steps as done to avoid having a less
advanced mechanism issue a duplicated request.

So, the "stepdone.add('phases')" should be the result of a client choice,
because only the client can know it has requested all it needed to request. In
4a08cf1a2cfe this principle was broken because any phase-heads part sent by
the server to the client would declare the phases retrieval complete.

Now that there is an official phases related capability and code associated to
it. We do not need the change in 4a08cf1a2cfe anymore and we can back it out.
This brings back 'stepdone' management for 'phases' in line with the rest of
the code (including other phases handing).

Here is an example of potential misbehavior that 4a08cf1a2cfe introduced:

Imagine a server that pre-computes bundles. The bundles contains a changegroup
part and an (advisory) 'phase-heads' part. When a pull occurs, precomputed
bundled are reused if available. As the phase part is advisory it can be sent
to all clients.  However they could be relevant changesets without phase
information.  Either because they are already common or because they had no
precomputed bundle for them yet.

If receiving any 'phase-heads' parts disable subsequent phases re-trivial
parts, the client will not request phase data for all relevant changesets. For
example common changesets will not turn public.
2017-09-26 15:55:01 +02:00
Boris Feld
bbf23f4d9a pull: use 'phase-heads' to retrieve phase information
A new bundle2 capability 'phases' has been added. If 'heads' is part of the
supported value for 'phases', the server supports reading and sending 'phase-
heads' bundle2 part.

Server is now able to process a 'phases' boolean parameter to 'getbundle'. If
'True', a 'phase-heads' bundle2 part will be included in the bundle with phase
information relevant to the whole pulled set. If this method is available the
phases listkey namespace will no longer be listed.

Beside the more efficient encoding of the data, this new method will greatly
improve the phase exchange efficiency for repositories with non-served
changesets (obsolete, secret) since we'll no longer send data about the
filtered heads.

Add a new 'devel.legacy.exchange' config item to allow fallback to the old
'listkey in bundle2' method.

Reminder: the pulled set is not just the changesets bundled by the pull. It
also contains changeset selected by the "pull specification" on the client
side (eg: everything for bare pull). One of the reason why the 'pulled set' is
important is to make sure we can move -common- nodes to public.
2017-09-24 21:27:18 +02:00
Boris Feld
f759f88d15 bundle2: only grab a transaction when 'phase-heads' affect the repository
The next patch will use the 'phase-heads' part to exchange phase data relevant to
the pulled set.

'handlephases' currently acquires a transaction even in case of no-op pull,
which would results in an empty transaction and messing with the existing
journal.

Pass the transaction fetcher to updatephases so it can fetch it if necessary.
2017-09-20 18:29:10 +02:00
Boris Feld
a0c1d592a7 phases: move the binary decoding function in the phases module
We move the decoding function near the encoding one in a place where they can
be reused in other place (current target, 'exchange.py').
2017-09-19 22:23:41 +02:00
Boris Feld
ac514cb58c phases: move binary encoding into a reusable function
We want to use binary phases for pushing and pulling. We extract the encoding
function out of the bundle2 module first.
2017-09-19 22:01:31 +02:00
Boris Feld
2d59c6c27b phases: use a Struct object for binary encoding and decoding
We will move the binary encoding and decoding code to 'phases.py' in order to
make it easier to reuse. First, let's cleanup it a bit.
2017-09-19 22:08:09 +02:00
Augie Fackler
6ceabd37bd bundle2: portably grab first byte of part name for letter check 2017-09-19 00:27:55 -04:00
Augie Fackler
a00e8f6b04 bundle2: make ValueError messages native strings 2017-09-18 14:03:21 -04:00
Augie Fackler
dcafebb06b bundle2: update check for a generator to work on Python 3 2017-09-18 13:36:05 -04:00
Augie Fackler
dc633b89a0 bundle2: stop using %r to quote part names
Valid part names are restricted to [a-zA-Z0-9_:-]+, so I'm not worried
about having quoting present in places where we should have
predominantly valid part names. This will significantly ease the
Python 3 transition, and simultaneously isn't a BC because this is
only in error messages that should never be shown.
2017-09-18 13:35:43 -04:00
Durham Goode
a64cc1e3c6 bundle2: move part processing to a separate function
Now that the part processing loop is tiny, let's move it to a separate function.
This will allow extensions to completely replace the part processing logic,
without having to replace the overall bundle processing logic or the stream
maintenance logic.

This will be useful for the infinitepush extension, so it can completely take
over receiving a bundle and rerouting it to a side store. This will also make it
easier to upstream the infinitepush functionality later.

Differential Revision: https://phab.mercurial-scm.org/D709
2017-09-14 10:20:05 -07:00
Durham Goode
32d42092c8 bundle2: remove unnecessary try finally
This is no longer needed.

Differential Revision: https://phab.mercurial-scm.org/D708
2017-09-14 10:20:05 -07:00
Durham Goode
d241ca079c bundle2: move handler validation out of processpart
As part of refactoring bundle part processing let's move handler validation to
its own function.

Differential Revision: https://phab.mercurial-scm.org/D707
2017-09-14 10:20:05 -07:00
Durham Goode
b29bb0eb76 bundle2: move processpart stream maintenance into part iterator
The processpart function also did some stream maintenance, so let's move it to
the part iterator as well, as part of moving all part iteration logic into the
class.

There is one place processpart is called outside of the normal loop, so we
manually handle the seek there.

The now-empty try/finally will be removed in a later patch, for ease of review.

Differential Revision: https://phab.mercurial-scm.org/D706
2017-09-14 10:20:05 -07:00
Augie Fackler
96bbe76280 bundle2: raise a more helpful error if building a bundle part header fails
I've tripped on this several times now, and am tired of debugging. Now
the header parts are part of the error message when the ''.join()
fails, which makes debugging obvious.
2017-09-15 18:37:29 -04:00
Augie Fackler
fdaf985a63 bundles: turn nbchanges int into a bytestr using pycompat.bytestr
Fixes some python 3 failures.
2017-09-15 18:38:36 -04:00
Durham Goode
f57a683d4b bundle2: move exception handling into part iterator
As part of separating the part iteration logic from the part handling logic,
let's move the exception handling to the part iterator class.

Differential Revision: https://phab.mercurial-scm.org/D705
2017-09-13 20:39:01 -07:00
Durham Goode
a46fac03a2 bundle2: move part counter to partiterator
As part of moving the part iterator logic to a separate class, let's move the
part counting logic and the output for it.

Differential Revision: https://phab.mercurial-scm.org/D704
2017-09-13 17:16:50 -07:00
Durham Goode
577e9d1779 bundle2: move part iterator a separate class
Currently, the part iterator logic is tightly coupled with the part handling
logic, which means it's hard to replace the part handling logic without
duplicating the part iterator bits.

In a future diff we'll want to be able to replace all part handling, so let's
begin refactoring the part iterator logic to it's own class.

Differential Revision: https://phab.mercurial-scm.org/D703
2017-09-13 17:16:45 -07:00
Durham Goode
21faf618dc changegroup: replace getchangegroup with makechangegroup
As part of reducing the number of changegroup creation APIs, let's replace
getchangegroup with calls to makechangegroup. This is mostly a drop in
replacement, but it does change the version specifier to be required, so it's
more obvious which callers are creating old version 1 changegroups still.

Differential Revision: https://phab.mercurial-scm.org/D669
2017-09-10 18:50:12 -07:00
Durham Goode
22fc2e18a8 bundle2: seek part back during iteration
Previously, iterparts would yield the part to users, then consume the part. This
changed the part after the user was given it and left it at the end, both of
which seem unexpected.  Let's seek back to the beginning after we've consumed
it. I tried not seeking to the end at all, but that seems important for the
overall bundle2 consumption.

This is used in a future patch to let us move the bundlerepo
bundle2-changegroup-part to be handled entirely within the for loop, instead of
having to do a seek back to 0 after the entire loop finishes.

Differential Revision: https://phab.mercurial-scm.org/D289
2017-08-23 12:35:03 -07:00
Martin von Zweigbergk
7603f48c32 exchange: don't attempt phase exchange if phase-heads was in bundle
The Mercurial core server doesn't yet include phase-heads parts in the
bundle, but our Google-internal server wants to do
that. Unfortunately, the usual exchange still happens even if
phase-heads part is included (including the short-circuited one for
old/publishing servers). That means that even if our server (again,
the Google-internal one, but also future Mercurial core servers)
includes a phase-heads part to indicate that some heads should be
drafts, that would still get overwritten by the phase updating that
happens after. So let's fix that by marking the phase step done if we
receive at least one phase-heads part in the bundle.

Differential Revision: https://phab.mercurial-scm.org/D440
2017-08-17 13:04:47 -07:00
Alex Gaynor
df2c1417e6 bundle2: fixed usage of an attribute that was removed in py3k
Differential Revision: https://phab.mercurial-scm.org/D482
2017-08-23 01:09:08 +00:00
Pulkit Goyal
7d16e8a210 pushvars: add a coreconfigitem for push.pushvars.server
Differential Revision: https://phab.mercurial-scm.org/D359
2017-08-12 04:47:40 +05:30
Yuya Nishihara
843b049128 bundle2: relax the condition to update transaction.hookargs
This is just a micro optimization. If hookargs is empty, nothing should be
necessary.
2017-08-13 11:10:35 +09:00
Yuya Nishihara
355a92a8ee bundle2: raise ProgrammingError for invalid call of addhookargs()
It should be hard error. Also fixed the error message as s/hooks/hookargs/.
2017-08-13 11:05:56 +09:00
Boris Feld
657776f4e3 bundle2: fix transaction availability detection
Changeset aa97e972460f introduce more complex logic around
'bundleoperation.gettransaction'. In that process it turns the old "attribute"
into a proper method which breaks the code that detects the "transaction
availability".

The change was visible in 'test-acl.t', fixing this reverts the test changes.

Differential Revision: https://phab.mercurial-scm.org/D303
2017-08-09 17:01:21 +02:00
Augie Fackler
6a68930f82 bundle2: convert ints to strings using pycompat.bytestring()
Fixes some Python 3 regressions.

We don't use %d here because the part id is actually an
Optional[int]. It should always be initialized to a non-None value by
the time this code executes, but we shouldn't blindly depend on that
being the case.

Differential Revision: https://phab.mercurial-scm.org/D272
2017-07-24 11:16:32 -04:00
Pulkit Goyal
89fd642a01 pushvars: move fb extension pushvars to core
pushvars extension in fbext adds a --pushvars flag to push command using which
one send strings to server which becomes environment variables there prepended
with HG_USERVAR_. These variables can then be used to run hooks on the server.
The extension is moved directly to core and unbundling of the strings and
converting them to environment variables at server is disabled by default for
security reasons. One can turn that on by following config:

[push]
pushvars.server = true

This patch also adds the test for the extension.

Differential Revision: https://phab.mercurial-scm.org/D210
2017-07-31 09:59:42 +05:30
Yuya Nishihara
fb236e4381 py3: convert arbitrary exception object to byte string more reliably
Our exception types implement __bytes__(), which should be tried first. Do
lossy encoding conversion as a last resort.
2017-08-03 23:02:32 +09:00
Augie Fackler
65bd64ce26 bundle2: obtain repr() of exception in a python3-safe way
This was exposed by other problems in bundle generation, but I'm not
sure how to test it for now.
2017-07-24 11:19:11 -04:00
Augie Fackler
3038df5570 bundle2: use bytestr() instead of str() to convert part id to bytes
This was exposed by trying to run previously-passing Python 3 tests.
2017-07-24 11:28:40 -04:00