Summary:
Now that all our repos are treemanifest, let's enable the extension by
default in tests. Once we're certain no one needs it in production we'll also
make it the default in core Mercurial.
This diff includes a minor fix in treemanifest to be aware of always-enabled
extensions. It won't matter until we actually add treemanifest to the list of
default enabled extensions, but I caught this while testing things.
Reviewed By: ikostia
Differential Revision: D15030253
fbshipit-source-id: d8361f915928b6ad90665e6ed330c1df5c8d8d86
Summary:
The postincoming check prints out advice of the following forms:
* `(run 'hg heads' to see heads)`
* `(run 'hg heads' to see heads, 'hg merge' to merge)`
* `(run 'hg heads .' to see heads, 'hg merge' to merge)`
* `(run 'hg update' to get a working copy)`
This advice is no longer useful, so remove it.
Reviewed By: DurhamG, farnz
Differential Revision: D15317185
fbshipit-source-id: 50ba576406c96715fa058399da53462be9b7a3bf
Summary:
Removes bundlev1 from the supported outgoing versions, but keeps a flag
around to force enable it if tests need it.
Reviewed By: quark-zju
Differential Revision: D7591176
fbshipit-source-id: 280cbbbe87848e3d6c9d448ce4f87c5eadeff720
Summary:
We want to deprecate the bundlev1 format, so let's start by adding a
develwarn. Later diffs will update the tests to not use v1, then remove v1 as a
supported outgoing bundle entirely.
Reviewed By: quark-zju
Differential Revision: D7591166
fbshipit-source-id: 143ad029bfe4d141f91d6d5077342dfa44ad2944
Summary:
When running with a Python runtime with a slightly different zlib module,
some `zlib.compress` outputs are different. Some tests are testing the
length, or the content of `zlib.compress` output, directly or indirectly.
That's causing issues.
This patch adds a `common-zlib` hghave test so it can be used to gate tests
checking zlib output. Some lengths are also changed to glob patterns to be
compatible.
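As a rough illustration of how such a gate could work, the sketch below compares the local `zlib.compress` output against a digest recorded on a reference runtime. The function name `has_common_zlib`, the sample input, and the use of SHA-1 are assumptions made for this example; the actual hghave check may be implemented differently.

```python
import hashlib
import zlib


def has_common_zlib(sample, expected_sha1):
    """Return True if zlib.compress(sample) hashes to expected_sha1.

    Hypothetical helper: both arguments are supplied by the caller, so the
    reference digest has to be recorded on a runtime considered "common".
    """
    actual = hashlib.sha1(zlib.compress(sample)).hexdigest()
    return actual == expected_sha1


# Record a digest on this runtime, then use it to decide whether tests
# that check exact zlib output should run.
sample = b"mercurial" * 100
reference = hashlib.sha1(zlib.compress(sample)).hexdigest()
print(has_common_zlib(sample, reference))
```

Tests that only care about approximate sizes can use glob patterns instead and do not need the gate.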
Reviewed By: ryanmce
Differential Revision: D6937735
fbshipit-source-id: 2328a39d7f2022f16d51f61b6178568b26dfe2fb
Upon pull or unbundle, we display a message with the range of new revisions
fetched. This revision range can readily be used after a pull to see
what's new with 'hg log'. The algorithm takes care of filtering "obsolete"
revisions that might be present in transaction's "changes" but should not be
displayed to the end user.
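A minimal sketch of the filtering idea, assuming the transaction exposes the added revision numbers and that obsolete revisions are available as a set. The function name and both parameters are hypothetical and do not mirror the real internals.

```python
def new_revision_range(added_revs, obsolete_revs):
    """Return (first, last) of the visible new revisions, or None.

    added_revs: iterable of revision numbers added by the transaction.
    obsolete_revs: set of revision numbers that should stay hidden.
    """
    visible = sorted(r for r in added_revs if r not in obsolete_revs)
    if not visible:
        # Everything that came in is obsolete: nothing to report.
        return None
    return visible[0], visible[-1]


print(new_revision_range([5, 6, 7, 8], {8}))
```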
We now have a dedicated help topic to describe bundle specification
strings. Let's update `hg bundle`'s documentation to reflect its
existence.
While I was here, I also tweaked some wording which I felt was out
of date. Specifically, `hg bundle` no longer
just deals with "changegroup" data: it can also generate files
that have non-changegroup data.
Previously, `hg bundle zstd` on a non-generaldelta repo would
attempt to use a v1 bundle. This would fail because zstd is not
supported on v1 bundles.
This patch changes the behavior to automatically use a v2 bundle
when the user explicitly requests a bundlespec that is a compression
engine not supported on v1. If the bundlespec is <engine>-v1, it is
still explicitly rejected because that request cannot be fulfilled.
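The selection rule described above can be sketched as follows. The engine and version sets, the exception name, and `resolve_bundlespec` itself are assumptions for illustration only; they are not the actual Mercurial code.

```python
# Assumed sets for the sketch; the real lists live in Mercurial itself.
V1_ENGINES = {"none", "gzip", "bzip2"}
KNOWN_ENGINES = V1_ENGINES | {"zstd"}
KNOWN_VERSIONS = {"v1", "v2"}


class UnsupportedBundleSpec(ValueError):
    """Hypothetical error for a spec that cannot be fulfilled."""


def resolve_bundlespec(spec, default_version="v1"):
    """Turn '<engine>' or '<engine>-<version>' into (engine, version)."""
    if "-" in spec:
        engine, version = spec.split("-", 1)
        explicit_version = True
    else:
        engine, version = spec, default_version
        explicit_version = False

    if engine not in KNOWN_ENGINES:
        raise UnsupportedBundleSpec("unknown compression: %s" % engine)
    if version not in KNOWN_VERSIONS:
        raise UnsupportedBundleSpec("unknown bundle version: %s" % version)

    if version == "v1" and engine not in V1_ENGINES:
        if explicit_version:
            # '<engine>-v1' was requested explicitly and cannot be honored.
            raise UnsupportedBundleSpec(
                "%s is not supported on v1 bundles" % engine)
        # Only the engine was named, so upgrade to v2 automatically.
        version = "v2"
    return engine, version


print(resolve_bundlespec("zstd"))
print(resolve_bundlespec("gzip"))
```

With these assumed sets, `zstd` on its own resolves to a v2 bundle, while `zstd-v1` raises.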
Version 1 bundles only support a fixed set of compression engines.
Before this change, we would accept any compression engine for v1
bundles, even those that may not work on v1. This could lead to
an error.
We define a fixed set of compression engines known to work with v1
bundles and we add checking to ensure a newer engine (like zstd)
is rejected for v1 bundles.
I also took the liberty of adding test coverage for unknown compression
names because I noticed we didn't have coverage of it before.
Currently, bundle compression uses the default compression level
for the active compression engine. The default compression level
is tuned as a compromise between speed and size.
Some scenarios may call for a different compression level. For
example, with clone bundles, bundles are generated once and used
several times. Since the cost to generate is paid infrequently,
server operators may wish to trade extra CPU time for better
compression ratios.
This patch introduces an experimental and undocumented config
option to control the bundle compression level. As the inline
comment says, this approach is a bit hacky. I'd prefer for
the compression level to be encoded in the bundle spec. e.g.
"zstd-v2;complevel=15." However, given that the 4.1 freeze is
imminent, I'm not comfortable implementing this user-facing
change without much time to test and consider the implications.
So, we're going with the quick and dirty solution for now.
Having this option in the 4.1 release will enable Mozilla to
easily produce and test zlib and zstd bundles with non-default
compression levels in production. This will help drive future
development of the feature and zstd integration with Mercurial.
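To make the idea concrete, here is a small sketch of streaming compression with an optional level, using Python's standard `zlib` module. The function `compress_chunks` and the way the level reaches it are assumptions for illustration; the name of the config option is deliberately not shown.

```python
import zlib


def compress_chunks(chunks, level=None):
    """Yield compressed data for an iterable of byte chunks.

    level=None keeps the engine default; an integer overrides it, which is
    what an operator-supplied setting would do in this sketch.
    """
    if level is None:
        level = zlib.Z_DEFAULT_COMPRESSION
    compressor = zlib.compressobj(level)
    for chunk in chunks:
        data = compressor.compress(chunk)
        if data:
            yield data
    yield compressor.flush()


payload = [b"changeset data " * 1000, b"manifest data " * 1000]
default = b"".join(compress_chunks(payload))
best = b"".join(compress_chunks(payload, level=9))
print(len(default), len(best))
```

Both outputs decompress to the same bytes; only the CPU cost and the resulting size differ.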
Now that zstd is vendored and being built (in some configurations), we
can implement a compression engine for zstd!
The zstd engine is a little different from existing engines. Because
it may not always be present, we have to defer loading the module in case
importing it fails. We facilitate this via a cached property that holds
a reference to the module or None. The "available" method is
implemented to reflect reality.
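The deferred-import pattern might look roughly like this. The class name `LazyModuleEngine` and the use of `functools.cached_property` are assumptions for the sketch; the real engine uses Mercurial's own helpers.

```python
import importlib
from functools import cached_property


class LazyModuleEngine:
    """Compression engine whose backing module may be missing."""

    def __init__(self, module_name):
        self._module_name = module_name

    @cached_property
    def _module(self):
        # Import on first access only; remember None if it is unavailable.
        try:
            return importlib.import_module(self._module_name)
        except ImportError:
            return None

    def available(self):
        return self._module is not None


# 'zlib' stands in for a module that exists; the second name is made up.
print(LazyModuleEngine("zlib").available())
print(LazyModuleEngine("no_such_compression_module_xyz").available())
```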
The zstd engine declares its ability to handle bundles using the
"zstd" human name and the "ZS" internal name. The latter was chosen
because internal names are 2 characters (by convention only, I think)
and "ZS" seems reasonable.
The engine, like others, supports specifying the compression level.
However, there are no consumers of this API that yet pass in that
argument. I have plans to change that, so stay tuned.
Since all we need to do to support bundle generation with a new
compression engine is implement and register the compression engine,
bundle generation with zstd "just works!" Tests demonstrating this
have been added.
How does performance of zstd for bundle generation compare? On the
mozilla-unified repo, `hg bundle --all -t <engine>-v2` yields the
following on my i7-6700K on Linux:
engine       CPU time     bundle size  vs orig size  throughput
none            97.0s   4,054,405,584        100.0%   41.8 MB/s
bzip2 (l=9)    393.6s     975,343,098         24.0%   10.3 MB/s
gzip (l=6)     184.0s   1,140,533,074         28.1%   22.0 MB/s
zstd (l=1)     108.2s   1,119,434,718         27.6%   37.5 MB/s
zstd (l=2)     111.3s   1,078,328,002         26.6%   36.4 MB/s
zstd (l=3)     113.7s   1,011,823,727         25.0%   35.7 MB/s
zstd (l=4)     116.0s   1,008,965,888         24.9%   35.0 MB/s
zstd (l=5)     121.0s     977,203,148         24.1%   33.5 MB/s
zstd (l=6)     131.7s     927,360,198         22.9%   30.8 MB/s
zstd (l=7)     139.0s     912,808,505         22.5%   29.2 MB/s
zstd (l=12)    198.1s     854,527,714         21.1%   20.5 MB/s
zstd (l=18)    681.6s     789,750,690         19.5%    5.9 MB/s
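For readers who want to check the derived columns, the sketch below recomputes them for three rows. It assumes that "vs orig size" is the bundle size divided by the uncompressed size, and that throughput is the uncompressed size divided by CPU time with 1 MB = 10^6 bytes; both readings are inferred from the figures rather than stated in the message.

```python
ORIG_SIZE = 4_054_405_584  # bytes, the "none" row

# (CPU seconds, bundle size in bytes), copied from the table.
rows = {
    "gzip (l=6)": (184.0, 1_140_533_074),
    "zstd (l=3)": (113.7, 1_011_823_727),
    "zstd (l=18)": (681.6, 789_750_690),
}


def derived(cpu_seconds, bundle_size):
    """Return (percent of original size, throughput in MB/s)."""
    percent = round(100.0 * bundle_size / ORIG_SIZE, 1)
    throughput = round(ORIG_SIZE / cpu_seconds / 1e6, 1)
    return percent, throughput


for name, (cpu, size) in rows.items():
    print(name, derived(cpu, size))

# Bytes saved by level 18 compared with level 3.
savings = rows["zstd (l=3)"][1] - rows["zstd (l=18)"][1]
print(savings)
```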
On compression, zstd for bundle generation delivers:
* better compression than gzip with significantly less CPU utilization
* better than bzip2 compression ratios while still being significantly
faster than gzip
* ability to aggressively tune compression level to achieve
significantly smaller bundles
That last point is important. With clone bundles, a server can
pre-generate a bundle file, upload it to a static file server, and
redirect clients to transparently download it during clone. The server
could choose to produce a zstd bundle with the highest compression
settings possible. This would take a very long time - an order of magnitude
longer than a typical zstd bundle generation - but the result would
be hundreds of megabytes smaller! For the clone volume we do at
Mozilla, this could translate to petabytes of bandwidth savings
per year and faster clones (due to smaller transfer size).
I don't have detailed numbers to report on decompression. However,
zstd decompression is fast: >1 GB/s output throughput on this machine,
even through the Python bindings. And it can do that regardless of the
compression level of the input. By the time you have enough data to
worry about overhead of decompression, you have plenty of other things
to worry about performance wise.
zstd wins all around. I can't wait to implement support for it
on the wire protocol and in revlogs.
The bundle2 changegroup part has an advisory param saying how many
changesets are in the part. Before this patch, we were setting
this param when generating bundle2 parts via the wire protocol but
not when generating local bundle2 files.
A side effect of not setting the changeset count param is that progress
bars don't work when applying changesets. As the tests show, this
impacted clone bundles, shelve, backup bundles, `hg unbundle`, and
anything touching bundle2 files.
This patch adds a backdoor to allow us to pass state from
changegroup generation into the unbundler. We store the number
of changesets in the changegroup in this state and use it to
populate the aforementioned advisory part parameter when generating
the bundle2 bundle.
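The state-passing approach can be pictured with the sketch below. Every name in it (`generate_changegroup`, `build_changegroup_part`, the `clcount` key, the `nbchanges` parameter) is an assumption chosen for illustration and is not taken from changegroup.py.

```python
def generate_changegroup(changesets, state=None):
    """Yield changegroup chunks, recording the changeset count in state."""
    if state is not None:
        state["clcount"] = len(changesets)
    for cs in changesets:
        yield cs.encode("ascii")


def build_changegroup_part(changesets):
    """Build a dict standing in for a bundle2 changegroup part."""
    state = {}
    # A generator body only runs once iteration starts, so consume it
    # before reading anything back out of state.
    payload = b"".join(generate_changegroup(changesets, state))
    advisory = {}
    if "clcount" in state:
        advisory["nbchanges"] = str(state["clcount"])
    return {"type": "changegroup", "advisory": advisory, "payload": payload}


part = build_changegroup_part(["a", "b", "c"])
print(part["advisory"])
```

A consumer can use the advisory value as the total for a progress bar and ignore it safely if it is absent.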
I concede that I'm not thrilled by how state is being passed in
changegroup.py (it feels a bit hacky). I would love to overhaul the
rather confusing set of functions in changegroup.py with something that
passes rich objects around instead of e.g. low-level generators.
However, given the code freeze for 3.9 is imminent, I'd rather not
undertake this endeavor right now. This feels like the easiest way
to get the parameter added to the changegroup part.
`hg debugbundle` is calling repr() on bundle2 part params, which are
now util.sortdict instances. Unfortunately, repr() doesn't appear
to be deterministic for util.sortdict. So, we implement one.
We include the type name because that's the common convention for
__repr__ implementations. Having the type name in `hg debugbundle`
is a bit ugly. But it's a debug command and I don't care enough to
fix it.
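A minimal sketch of a deterministic `__repr__` that includes the type name. It assumes an insertion-ordered mapping built on the built-in `dict` (Python 3.7+); the real util.sortdict and its exact output format may differ.

```python
class sortdict(dict):
    """Insertion-ordered dict with a stable, type-qualified repr."""

    def __repr__(self):
        items = ", ".join("%r: %r" % (k, v) for k, v in self.items())
        return "%s({%s})" % (type(self).__name__, items)


params = sortdict([("version", "02"), ("nbchanges", "3")])
print(repr(params))
```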
Generaldelta changes some of the default targets for 'hg bundle'. All cases are
already properly tested but some ambiguous specifications are affected.
The cut and head utilities on Solaris have weird differences from the GNU
versions. The f helper script does a dump more nicely than those tools,
anyway.
The old code was tailored to `hg bundle` usage and not appropriate for
use as a general API, which clone bundles will require. The code has
been rewritten to make it more generally suitable.
We introduce dedicated error types to represent invalid and unsupported
bundle specifications. The reason we need dedicated error types (rather
than error.Abort) is because clone bundles will want to catch these
exceptions as part of filtering entries. We don't want to swallow
error.Abort on principle.
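The sketch below shows why dedicated exception types help when filtering entries. The class names, `parse_spec`, and the accepted engines and versions are all assumptions for illustration, not the real API.

```python
class InvalidBundleSpec(Exception):
    """The string is not a well-formed bundle specification."""


class UnsupportedBundleSpec(Exception):
    """The string is well formed but names something unknown."""


def parse_spec(spec):
    """Parse '<engine>-<version>' (strict form only, for this sketch)."""
    engine, sep, version = spec.partition("-")
    if not sep or not engine or not version:
        raise InvalidBundleSpec(spec)
    if engine not in {"none", "gzip", "bzip2", "zstd"}:
        raise UnsupportedBundleSpec("compression: %s" % engine)
    if version not in {"v1", "v2"}:
        raise UnsupportedBundleSpec("version: %s" % version)
    return engine, version


def usable_entries(specs):
    """Keep only the entries this client can parse and support."""
    usable = []
    for spec in specs:
        try:
            usable.append(parse_spec(spec))
        except (InvalidBundleSpec, UnsupportedBundleSpec):
            # Skip just these two cases; any other error still propagates.
            continue
    return usable


print(usable_entries(["gzip-v2", "bogus", "lzma-v9", "zstd-v2"]))
```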
The current setup requires passing both a packer and, optionally, the version
of the unpacker. This is confusing and error prone, as the two values must not
mismatch. Instead, we simply grab the version from the packer. This fixes a bug
where requesting a cg2 from 'hg bundle' was reported as changegroup 1.
I should have caught that in the initial changeset but I missed it somehow.
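The shape of the fix, as a sketch: the version lives on the packer, so callers have nothing extra to pass and nothing to get out of sync. The class and function names here are hypothetical.

```python
class Cg1Packer:
    version = "01"


class Cg2Packer(Cg1Packer):
    version = "02"


def describe_changegroup(packer):
    """Report the changegroup version using the packer as the only source."""
    return "changegroup version %s" % packer.version


print(describe_changegroup(Cg2Packer()))
```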
We had some basic undocumented support for uncompressed bundle2. We now
have an official extensible syntax to specify both format type and compression
(eg: bzip2-v2).
In practice, this changeset introduces the 'v1' and 'v2' identifiers to make it
possible to combine format and compression. The default format is still 'v1'.
We'll care about picking 'v1' or 'v2' with regard to general delta in the next
changesets.
1. This is consistent with pushing.
2. This allows seeing the URL of the other repo in case accessing the repo
fails, e.g. wrong ssh path or issues with the https certificate, without
using --debug or showconfig paths.
Additionally, add a test for this in the context of ssh with a wrong path.
Checking the bundle type late in the command's execution can mean
that we do work for a long time before complaining about incorrect
user input and aborting. Guess how I discovered this.
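In sketch form, the change amounts to validating the cheap input before starting the expensive work. The function name, the set of types, and the `compute_outgoing` callback are assumptions for illustration.

```python
SUPPORTED_TYPES = {"none", "gzip", "bzip2"}  # illustrative set only


def write_bundle(bundle_type, compute_outgoing):
    """Validate bundle_type first, then do the potentially slow discovery."""
    if bundle_type not in SUPPORTED_TYPES:
        raise ValueError("unknown bundle type: %s" % bundle_type)
    changesets = compute_outgoing()  # may take a long time on big repos
    return bundle_type, changesets


calls = []


def slow_discovery():
    calls.append("ran")
    return ["rev1", "rev2"]


try:
    write_bundle("gzpi", slow_discovery)  # typo in the type name
except ValueError as exc:
    print(exc)
print(calls)  # the slow step never ran for the bad input
```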