Commit Graph

1822 Commits

Author SHA1 Message Date
Gregory Szorc
4f8c92c87a check-code: recommend util.urlreq when importing urlparse 2017-03-21 22:46:17 -07:00
Gregory Szorc
5ca0f908bf py3: add __bool__ to every class defining __nonzero__
__nonzero__ was renamed to __bool__ in Python 3. This patch simply
aliases __bool__ to __nonzero__ for every class implementing
__nonzero__.
2017-03-13 12:40:14 -07:00
Gregory Szorc
ddf0e60af7 perf: perform a garbage collection before each iteration
Currently, no explicit garbage collection is performed when running
the microbenchmarks in `hg perf`. I think this is wrong because
garbage collection can have a significant impact on execution times.
And, if gc is triggered via the default heuristics, it will
fire effectively randomly during subsequent benchmark iterations
due to variable amount of garbage left over from previous runs.

Running a gc before invoking the measured function will help ensure
state is more consistent across all iterations.
2017-03-13 18:16:42 -07:00
Augie Fackler
5ceb2295c6 wix: add censor docs to installer script
Spotted by Matt Harbison.
2017-03-06 18:42:36 -05:00
Pierre-Yves David
083299a68b vfs: use 'vfs' module directly in 'contrib/undumprevlog'
Now that the 'vfs' classes moved in their own module, lets use the new module
directly. We update code iteratively to help with possible bisect needs in the
future.
2017-03-02 13:32:49 +01:00
Jun Wu
dc918444b5 chg: forward user-defined signals
SIGUSR1 and SIGUSR2 are reserved for user-defined behaviors. They may be
redefined by an hg extension [1], but cannot be easily redefined for chg.
Since the default behavior (kill) is not that useful for chg, let's forward
them to hg, hoping it got redefined there and could be more useful.

[1] https://bitbucket.org/facebook/hg-experimental/commits/e7c883a465
2017-03-08 13:46:26 -08:00
Jun Wu
ddc5fd5cc8 chg: document why we send SIGHUP and SIGINT to process group
This makes the code more consistent - other signals are documented.
2017-03-08 13:34:25 -08:00
Pierre-Yves David
e5cb48ac36 vfs: replace 'scmutil.opener' usage with 'scmutil.vfs'
The 'vfs' class is the first class citizen for years. We remove all usages of
the older API. This will let us remove the old API eventually.
2017-03-02 03:52:36 +01:00
Pierre-Yves David
259635a73a config: update the Windows example config file
We move from the color extensions to the 'ui.color' config.
2017-02-28 20:23:10 +01:00
Pierre-Yves David
a29094b506 color: update main documentation
Now that the feature no longer lives in the extension, we document it in the
help of the core config. This include the new 'ui.color' option introduced in
the previous changesets.

As a result the color extensions can now be deprecated.

This is a documentation patch only; color is still disabled by default.
2017-02-21 20:04:55 +01:00
Matt Harbison
d4978c7d62 wix: include the help for pager
Similar (I assume) to 47bcbe06a48d.
2017-02-25 21:44:34 -05:00
Simon Farnsworth
01a98361c5 contrib: add a write microbenchmark to perf.py
I'm adding some performance logging to ui.write - this benchmark lets us
confirm that the cost of that logging is acceptably low.

At this point, the microbenchmark on Linux over SSH shows:

! wall 3.213560 comb 0.410000 user 0.350000 sys 0.060000 (best of 4)

while on the Mac locally, it shows:

! wall 0.342325 comb 0.180000 user 0.110000 sys 0.070000 (best of 20)
2017-02-15 13:07:26 -08:00
Simon Farnsworth
e0b70e4f7f mercurial: switch to util.timer for all interval timings
util.timer is now the best available interval timer, at the expense of not
having a known epoch. Let's use it whenever the epoch is irrelevant.
2017-02-15 13:17:39 -08:00
Martin von Zweigbergk
2b641873b5 merge with stable 2017-02-13 09:44:16 -08:00
FUJIWARA Katsunori
2afd920706 misc: update year in copyright lines
This patch also makes some expected output lines in tests glob-ed for
persistence of them.

BTW, files below aren't yet changed in 2017, but this patch also
updates copyright of them, because:

    - mercurial/help/hg.1.txt

      almost all of "man hg" output comes from online help of hg
      command, and is already changed in 2017

    - mercurial/help/hgignore.5.txt
    - mercurial/help/hgrc.5

      "copyright 2005-201X Matt Mackall" in them mentions about
      copyright of Mercurial itself
2017-02-12 02:23:33 +09:00
FUJIWARA Katsunori
f778068b44 misc: replace domain of mercurial-devel ML address by mercurial-scm.org
This patch also adds new check-code.py pattern to detect invalid usage
of "mercurial-devel@selenic.com".
2017-02-11 00:23:55 +09:00
FUJIWARA Katsunori
e6f91a13e7 misc: replace domain of mercurial ML address by mercurial-scm.org
This patch also adds new check-code.py pattern to detect invalid usage
of "mercurial@selenic.com".

Change for test-convert-tla.t is tested, but similar change for almost
same test-convert-baz.t isn't yet tested actually, because I couldn't
find out the way to get "GNU Arch baz client".

AFAIK, buildbot skips test-convert-baz.t, too. Does anybody have
appropriate environment for testing?
2017-02-11 00:23:53 +09:00
Gregory Szorc
3f02063f8e zstd: vendor python-zstandard 0.7.0
Commit 3054ae3a66112970a091d3939fee32c2d0c1a23e from
https://github.com/indygreg/python-zstandard is imported without
modifications (other than removing unwanted files).

The vendored zstd library within has been upgraded from 1.1.2 to
1.1.3. This version introduced new APIs for threads, thread
pools, multi-threaded compression, and a new dictionary
builder (COVER). These features are not yet used by
python-zstandard (or Mercurial for that matter). However,
that will likely change in the next python-zstandard release
(and I think there are opportunities for Mercurial to take
advantage of the multi-threaded APIs).

Relevant to Mercurial, the CFFI bindings are now fully
implemented. This means zstd should "just work" with PyPy
(although I haven't tried). The python-zstandard test suite also
runs all tests against both the C extension and CFFI bindings to
ensure feature parity.

There is also a "decompress_content_dict_chain()" API. This was
derived from discussions with Yann Collet on list about alternate
ways of encoding delta chains.

The change most relevant to Mercurial is a performance enhancement in
the simple decompression API to reuse a data structure across
operations. This makes decompression of multiple inputs significantly
faster. (This scenario occurs when reading revlog delta chains, for
example.)

Using python-zstandard's bench.py to measure the performance
difference...

On changelog chunks in the mozilla-unified repo:

decompress discrete decompress() reuse zctx
1.262243 wall; 1.260000 CPU; 1.260000 user; 0.000000 sys 170.43 MB/s (best of 3)
0.949106 wall; 0.950000 CPU; 0.950000 user; 0.000000 sys 226.66 MB/s (best of 4)

decompress discrete dict decompress() reuse zctx
0.692170 wall; 0.690000 CPU; 0.690000 user; 0.000000 sys 310.80 MB/s (best of 5)
0.437088 wall; 0.440000 CPU; 0.440000 user; 0.000000 sys 492.17 MB/s (best of 7)

On manifest chunks in the mozilla-unified repo:

decompress discrete decompress() reuse zctx
1.367284 wall; 1.370000 CPU; 1.370000 user; 0.000000 sys 274.01 MB/s (best of 3)
1.086831 wall; 1.080000 CPU; 1.080000 user; 0.000000 sys 344.72 MB/s (best of 3)

decompress discrete dict decompress() reuse zctx
0.993272 wall; 0.990000 CPU; 0.990000 user; 0.000000 sys 377.19 MB/s (best of 3)
0.678651 wall; 0.680000 CPU; 0.680000 user; 0.000000 sys 552.06 MB/s (best of 5)

That should make reads on zstd revlogs a bit faster ;)

# no-check-commit
2017-02-07 23:24:47 -08:00
Jun Wu
bb3aea397e chg: verify XDG_RUNTIME_DIR
According to the specification [1], $XDG_RUNTIME_DIR should be ignored
unless:

  The directory MUST be owned by the user, and he MUST be the only one
  having read and write access to it. Its Unix access mode MUST be 0700.

This patch adds a check and ignores it if it does not meet part of the
criteria.

[1]: https://standards.freedesktop.org/basedir-spec/basedir-spec-latest.html
2017-02-06 17:01:06 -08:00
Anton Shestakov
d3b5c285e9 debian: update copyright years 2017-02-04 20:29:34 +08:00
Anton Shestakov
dcf9dc5b3d debian: update mailing list address 2017-02-04 20:29:13 +08:00
Yedidya Feldblum
6a9ac1262a check-code: permit functools.reduce 2017-01-21 14:43:13 -08:00
Gregory Szorc
5c3d4574aa perf: split obtaining chunks from decompression
Previously, the code was similar to what revlog._chunks() was doing,
which took a raw data segment and delta chain, obtained buffers for
the raw revlog chunks within, and decompressed them.

This commit splits the "get raw chunks" action from "decompress." The
goal of this change is to more accurately measurely decompression
performance.

On a ~50k deltachain for a manifest in mozilla-central:

! full
! wall 0.430548 comb 0.440000 user 0.410000 sys 0.030000 (best of 24)
! deltachain
! wall 0.016053 comb 0.010000 user 0.010000 sys 0.000000 (best of 181)
! read
! wall 0.008078 comb 0.010000 user 0.000000 sys 0.010000 (best of 362)
! rawchunks
! wall 0.033785 comb 0.040000 user 0.040000 sys 0.000000 (best of 100)
! decompress
! wall 0.327126 comb 0.320000 user 0.320000 sys 0.000000 (best of 31)
! patch
! wall 0.032391 comb 0.030000 user 0.030000 sys 0.000000 (best of 100)
! hash
! wall 0.012587 comb 0.010000 user 0.010000 sys 0.000000 (best of 233)
2017-02-04 08:47:46 -08:00
Augie Fackler
cde5195cd5 contrib: fix check-commit to not reject commits from hg sign and hg tag
I'm tired of having a spurious red build every time we do a
release. Fix it once and for all.
2017-01-18 23:34:35 -05:00
Gregory Szorc
4225a4e399 zstd: prevent potential free() of uninitialized memory
This is a cherry pick of an upstream fix. The free() of uninitialed
memory could likely only occur if a malloc() inside zstd fails.

The patched functions aren't currently used by Mercurial. But I don't
like leaving footguns sitting around.
2017-01-17 10:17:13 -08:00
Gregory Szorc
c3cb00b3e9 zstd: vendor python-zstandard 0.6.0
Commit 63c68d6f5fc8de4afd9bde81b13b537beb4e47e8 from
https://github.com/indygreg/python-zstandard is imported without
modifications (other than removing unwanted files).

This includes minor performance and feature improvements. It also
changes the vendored zstd library from 1.1.1 to 1.1.2.

# no-check-commit
2017-01-14 19:41:43 -08:00
Pulkit Goyal
3c7388da12 py3: replace pycompat.getenv with encoding.environ.get
pycompat.getenv returns os.getenvb on py3 which is not available on Windows.
This patch replaces them with encoding.environ.get and checks to ensure no
new instances of os.getenv or os.setenv are introduced.
2017-01-15 13:17:05 +05:30
Martin von Zweigbergk
631c635657 check-code: reject module-level @cachefunc
Module-level @cachefunc usage is risky because it can easily create a
memory "leak". Let's reject it completely for now. If a valid usage
comes up in the future, we can always improve the check or reconsider.
2017-01-13 10:11:37 -08:00
Gregory Szorc
b2e18c67d3 perf: support multiple compression engines in perfrevlogchunks
Now that the revlog has a reference to a compressor, it is
possible to swap in other compression engines. So, teach
`hg perfrevlogchunks` to do that.

The default behavior of `hg perfrevlogchunks` is now to measure the
compression performance of all compression engines implementing the
revlog compressor API. This effectively adds the no-op "none"
compressor and zstd (when available) into the default set.

While we can't yet plug alternate compressors into revlogs, this
command gives us a preview of the performance. On the mozilla-unified
repository:

$ hg perfrevlogchunks -c
! compress w/ none
! wall 0.115159 comb 0.110000 user 0.110000 sys 0.000000 (best of 86)
! compress w/ zlib
! wall 5.681406 comb 5.680000 user 5.680000 sys 0.000000 (best of 3)
! compress w/ zstd
! wall 2.624781 comb 2.620000 user 2.620000 sys 0.000000 (best of 4)

$ hg perfrevlogchunks -m
! compress w/ none
! wall 0.124486 comb 0.120000 user 0.120000 sys 0.000000 (best of 79)
! compress w/ zlib
! wall 10.144701 comb 10.150000 user 10.150000 sys 0.000000 (best of 3)
! compress w/ zstd
! wall 4.383118 comb 4.390000 user 4.390000 sys 0.000000 (best of 3)

Those numbers for zstd look promising. But they aren't the full story.
For that, we'll need to look at decompression times and storage sizes.
Stay tuned...
2017-01-02 12:02:08 -08:00
Gregory Szorc
1a6670d670 revlog: move decompress() from module to revlog class (API)
Upcoming patches will convert revlogs to use the compression engine
APIs to perform all things compression. The yet-to-be-introduced
APIs support a persistent "compressor" object so the same object
can be reused for multiple compression operations, leading to
better performance. In addition, compression engines like zstd
may wish to tweak compression engine state based on the revlog
(e.g. per-revlog compression dictionaries).

A global and shared decompress() function will shortly no longer
make much sense. So, we move decompress() to be a method of the
revlog class. It joins compress() there.

On the mozilla-unified repo, we can measure the impact of this change
on reading performance:

$ hg perfrevlogchunks -c
! chunk
! wall 1.932573 comb 1.930000 user 1.900000 sys 0.030000 (best of 6)
! wall 1.955183 comb 1.960000 user 1.930000 sys 0.030000 (best of 6)
! chunk batch
! wall 1.787879 comb 1.780000 user 1.770000 sys 0.010000 (best of 6
! wall 1.774444 comb 1.770000 user 1.750000 sys 0.020000 (best of 6)

"chunk" appeared to become slower but "chunk batch" got faster. Upon
further examination by running both sets multiple times, the numbers
appear to converge across all runs. This tells me that there is no
perceived performance impact to this refactor.
2017-01-02 13:00:16 -08:00
Martin von Zweigbergk
1840263a8c help: merge revsets.txt into revisions.txt
Selecting single and multiple revisions is closely related, so let's
put it in one place, so users can easily find it. We actually did not
even point to "hg help revsets" from "hg help revisions", but now that
they're on a single page, that won't be necessary.
2017-01-11 11:37:38 -08:00
Jun Wu
402691b892 chg: check snprintf result strictly
This makes the program more robust when somebody changes hgclient's
maxdatasize in the future.
2017-01-11 23:39:24 +08:00
Jun Wu
bb277b3ff3 chg: change server's process title
This patch uses the newly introduced "setprocname" interface to update the
process title server-side, to make it easier to tell what a worker is actually
doing.

The new title is "chg[worker/$PID]", where PID is the process ID of the
connected client. It can be directly observed using "ps -AF" under Linux, or
"ps -A" under FreeBSD.
2017-01-11 07:40:52 +08:00
Jun Wu
b61b02a865 chg: remove getpager support
We have enough bits to switch to the new chg pager code path in runcommand.
So just remove the legacy getpager support.

This is a red-only patch, and will break chg's pager support temporarily.
2017-01-10 06:59:39 +08:00
Jun Wu
5ae59a4110 chg: handle pager request client-side
This patch implements the simple S-channel pager handling at chg
client-side.

Note: It does not deal with environ and cwd currently for simplicity, which
will be fixed later.
2017-01-10 06:59:03 +08:00
Jun Wu
923ee6957d chg: check type read from S channel
The previous patch added the check server-side. This patch added it
client-side.
2017-01-06 16:14:52 +00:00
Jun Wu
734e02b02d chg: send type information via S channel (BC)
Previously S channel is only used to send system commands. It will also be
used to send pager commands. So add a type parameter.

This breaks older chg clients. But chg and hg should always come from a
single commit and be packed into a single package. Supporting running
inconsistent versions of chg and hg seems to be unnecessarily complicated
with little benefit. So just make the change and assume people won't use
inconsistent chg with hg.
2017-01-06 16:11:03 +00:00
Jun Wu
dfb8c98044 chg: add procutil.h
This patch adds a formal header procutil.h for procutil.c, and changes
Makefile to build procutil.c independently.
2017-01-02 14:57:14 +00:00
Jun Wu
8e6abc8b48 chg: let procutil maintain its own pagerpid
Previously, chg.c maintains the pagerpid. Let's move it to procutil.c.

Note: chg.c still have a pagerpid to decide whether to call attachio or not.
In the future, attachio may be moved from hgc_open to hgc_runcommand, and
hgc_runcommand handles both pager and attachio so we don't need to run
attachio twice. And chg.c will be free of pagerpid.
2017-01-02 14:43:37 +00:00
Jun Wu
4321520c94 chg: decouple hgclient from setuppager
procutil should not depend on hgclient. This patch makes the pager handling
part independent from hgclient.
2017-01-02 14:10:32 +00:00
Jun Wu
efbbb12a58 chg: decouple hgclient from setupsignalhandler
procutil should not depend on hgclient. This patch makes the signal handling
part independent from hgclient.
2017-01-02 14:04:35 +00:00
Jun Wu
fc44f74541 chg: move signal and pager handling to a separate file
In the future hgclient will deal with pager directly inside runcommand, so
related signal handling stuff needs to be decoupled from chg.c.

The signal handling and pager logic are coupled because we need to forward
SIGPIPE when pager exits. So they are moved together, otherwise a global
variable (pagerpid) is inevitable.

This patch moves related functions from chg.c to procutil.c, which was
marked as copied to maintain annotate history.

The move is done without code modification for easy review, therefore
`#include "procutil.c"` was introduced temporarily.
2017-01-02 14:02:47 +00:00
Jun Wu
2419c6679a chg: respect XDG_RUNTIME_DIR
$XDG_RUNTIME_DIR [1] is a better place for user daemons. Let's use it and
fallback to $TMPDIR.

After this patch, chg will try socket paths in the following order:

  1. $CHGSOCKNAME
  2. $XDG_RUNTIME_DIR/chg/server
  3. ${TMPDIR:-tmp}/chg$UID/server

[1]: https://standards.freedesktop.org/basedir-spec/basedir-spec-latest.html
2016-12-26 00:02:42 +00:00
Jun Wu
4548a63b44 chg: make "get default sockdir" a separate method
The logic to get a default socket directory will become longer in the next
patch. So let's move it out.
2016-12-25 23:49:54 +00:00
Jun Wu
f437cb2c85 chg: handle connect failure before errno gets overridden
This patch moves the error handling logic up so that errno after connect
won't be overridden.
2016-12-25 23:32:11 +00:00
Jun Wu
d23c117db4 chg: support long socket path
This patch replaces UNIX_PATH_MAX (108) with PATH_MAX (4096) so we can have
long unix path.
2016-12-23 16:26:40 +00:00
Jun Wu
c5374c4a97 chg: remove sockdirfd
See the previous patch for the reason.
2016-12-23 16:16:44 +00:00
Jun Wu
8499ade54b chg: let hgc_open support long path
"sizeof(sun_path)" is too small. Use the chdir trick to support long socket
path, like "mercurial.util.bindunixsocket".

It's useful for cases where TMPDIR is long. Modern OS X rewrites TMPDIR to a
long value. And we probably want to use XDG_RUNTIME_DIR [2] for Linux.

The approach is a bit different from the previous plan, where we will have
hgc_openat and pass cmdserveropts.sockdirfd to it. That's because the
current change is easier: chg has to pass a full path to "hg" as the
"--address" parameter. There is no "--address-basename" or "--address-dirfd"
flags. The next patch will remove "sockdirfd".

Note: It'd be nice if we can use a native "connectat" implementation.
However, that's not available everywhere. Some platform (namely FreeBSD)
does support it, but the implementation has bugs so it cannot be used [2].

[1]: https://standards.freedesktop.org/basedir-spec/basedir-spec-latest.html
[2]: https://www.mercurial-scm.org/pipermail/mercurial-devel/2016-April/082892.html
2016-12-23 16:37:00 +00:00
Pulkit Goyal
2207f24f07 py3: add warnings in check-code related to py3
We have our own bytes versions of things like, getopt.getopt, os.sep, os.name,
sys.executable, os.environ and few more for python 3 portability. Its better
to come up with warnings if someone breaks the things which we have fixed.

After this patch, check-code will warn us to use our bytes version.
These checks run on mercurial/ and hgext/ and pycompat.py is excluded.
2016-12-21 22:42:31 +05:30
Jun Wu
e912a360fc chg: remove locks
See the previous two patches for the reason. The advantage is a simplified
code base and better throughput when starting multiple servers with multiple
confighashes. The disadvantage is starting multiple servers in parallel with
a single confighash will waste some CPU time, which is probably fine in
common use-cases.

This makes it easier to switch to relative paths to support long unix domain
socket paths.
2016-12-19 22:15:00 +00:00