I'm still cleaning this up, but it's easier to do in bite-size chunks
like this than all at once. The negative lookahead avoids one false
positive category from some output related to finding Subversion
bindings.
Differential Revision: https://phab.mercurial-scm.org/D13
It's possible to set up non-linear dependencies in Phabricator like:
o D4
|\
| o D3
| |
o | D2
|/
o D1
The old `phabread` code will print D1 twice. This patch adds de-duplication
to prevent that.
Test Plan:
Construct the above dependencies in a Phabricator test instance and make
sure the old code prints D1 twice while the new code won't.
Previously, we read Differential Revisions one by one by calling
`differential.query`.
Fetching them one by one is suboptimal. Unfortunately, there is no Conduit
API that allows us to get a stack of diffids using a single API call.
This patch tries to be smarter using a simple heuristic: when fetching D59
as a stack, previous IDs like D51, D52, D53, ..., D58 are likely belonging
to a same stack so just fetch them as well. Since `differential.query` only
returns cheap metadata without expensive diff content, it shouldn't be a big
problem for the server.
Using a test Phabricator instance, this patch reduces `phabread` reading a
10 patch stack from about 13 to 30 seconds to 8 seconds.
Previously, we call differential.getcommitmessage API to get commit
messages. Now we read that from "Differential Revision" object fetched
via "differential.query" API.
This removes one API call per patch.
This patch reworked phabread a bit so it fetches the lightweight
"Differential Revision" metadata for a stack first. Then read other data.
This allows the code to:
a) send 1 request to get all `hg:meta` data instead of N requests
b) patches are read in desired order, no need to buffer the output
"b)" reduces the memory usage from O(N^2) to O(N) since we no longer keep
old patch contents in memory.
The above `N` means the number of patches in the stack.
Previously we didn't abort. Now we abort if revs is empty. This is
consistent with "hg export" behavior. Maybe "return 1" is also a reasonable
behavior, but that's inconsistent with the existing "hg export".
Previously, `phabsend` uploads new diffs as long as the commit hash changes.
That's suboptimal because sometimes the diff is exactly the same as before,
the commit hash change is caused by a parent hash change, or commit message
change which do not affect diff content.
This patch adds a check examining actual diff contents to skip uploading new
diffs in that case.
The "hg:meta" property is for extra metadata to reconstruct the patch.
Previously it does not have node or parent information since I think by
reading the patch again, the commit message will be mangled (like, added the
"Differential Revision" line) and we cannot preserve the commit hash.
However, the "parent" information could be useful. It could be helpful to
locate the "base revision" so in case of a conflict applying the patch
directly, we might be able to use 3-way merge to resolve it correctly.
Note: "local:commits" is an existing "property" used by Phabricator that has
the node and parent information. However, it lacks of timezone information
and requires "author" and "authorEmail" to be separated. So we are using a
different "property" - "hg:meta" to be distinguished from "local:commits".
Previously, only tags can "associate" a changeset to a Differential
Revision. But the usual pattern (arc patch or hg phabread) is to put the
Differential Revision URL in commit message.
This patch makes the code read commit message to find associated
Differential Revision if associated tags are not found.
This makes some workflows possible. For example, if the author loses their
repo, or switch to another computer, they can continue download their own
patches from Phabricator and update them without needing to manually create
tags.
This patch adds a `phabread` command generating plain-text patches from
Phabricator, suitable for `hg import`. It respects `hg:meta` so user and
date information might be preserved. And it removes `Summary:` field name
which makes the commit message a bit tidier.
To support stacked diffs, a `--stack` flag was added to read dependent
patches recursively.
The `phabsend` command is intended to provide `hg email`-like experience -
sending a stack, setup dependency information and do not amend existing
changesets.
It uses differential.createrawdiff and differential.revision.edit Conduit
API to create or update a Differential Revision.
Local tags like `D123` are written indicating certain changesets were sent
to Phabricator. The `phabsend` command will use obsstore and tags
information to decide whether to update or create Differential Revisions.
The default Phabricator client arcanist is not friendly to send a stack of
changesets. It works better when a feature branch is reviewed as a single
review unit. However, we want multiple revisions per feature branch.
To be able to have an `hg email`-like UX to send and receive a stack of
commits easily, it seems we have to re-invent things. This patch adds
`phabricator.py` speaking Conduit API [1] in `contrib` as the first step.
This may also be an option for people who don't want to run PHP.
Config could be done in `hgrc` (instead of `arcrc` or `arcconfig`):
[phabricator]
# API token. Get it from https://phab.mercurial-scm.org/conduit/login/
token = cli-xxxxxxxxxxxxxxxxxxxxxxxxxxxx
url = https://phab.mercurial-scm.org/
# callsign is used by the next patch
callsign = HG
This patch only adds a single command: `debugcallconduit` to keep the patch
size small. To test it, having the above config, and run:
$ hg debugcallconduit diffusion.repository.search <<EOF
> {"constraints": {"callsigns": ["HG"]}}
> EOF
The result will be printed in prettified JSON format.
[1]: Conduit APIs are listed at https://phab.mercurial-scm.org/conduit/
The ignore regular expression has been updated to detect
"inconsistent config." If present, we track which configs have
that set and we suppress the conflicting defaults error for those
options.
I also added named groups to the regexp to aid readability.
A comment was added to profiling.py to make a desired inconsistent
value error go away.
For builds that run on hermetic environments, it's possible that the "less"
package is not installed by default, yet it's needed for tests to pass after
revision ca1519568a93 (which sets less as the fallback pager).
When trying to create a zstd bundle, the MSI based install said:
abort: compression engine zstd could not be loaded
The Inno installer is unaffected. The name will need to be updated to include
'cext' when merging into default.
Having linux wheels is going to helps system without compiler or python-dev
plus speed up the installation for everyone.
I followed the manylinux example repository
https://github.com/pypa/python-manylinux-demo
to add a make target (build-linux-wheels) using
official docker image to build python 2 linux wheels
for mercurial. It generates Python 2.6 and Python 2.7 for both
32 and 64 bits architectures.
I had to blacklist several test cases for various reasons:
* test-convert-git.t and test-subrepo-git.t because of the git version
* test-patchbomb-tls.t because of warning using tls 1.0
It's likely because the docker image is based on centos 5.0 and
openssl is outdated.
Now that environment variables override system-wide hgrc settings, we
can default Mercurial to sensible-editor and sensible-pager by default
for debian users.
Some shared-ssh installations assume that 'hg serve --stdio' is a safe
command to run for minimally trusted users. Unfortunately, the messy
implementation of argument parsing here meant that trying to access a
repo named '--debugger' would give the user a pdb prompt, thereby
sidestepping any hoped-for sandboxing. Serving repositories over HTTP(S)
is unaffected.
We're not currently hardening any subcommands other than 'serve'. If
your service exposes other commands to users with arbitrary repository
names, it is imperative that you defend against repository names of
'--debugger' and anything starting with '--config'.
The read-only mode of hg-ssh stopped working because it provided its hook
configuration to "hg serve --stdio" via --config parameter. This is banned for
security reasons now. This patch switches it to directly call ui.setconfig().
If your custom hosting infrastructure relies on passing --config to
"hg serve --stdio", you'll need to find a different way to get that configuration
into Mercurial, either by using ui.setconfig() as hg-ssh does in this patch,
or by placing an hgrc file someplace where Mercurial will read it.
mitrandir@fb.com provided some extra fixes for the dispatch code and
for hg-ssh in places that I overlooked.
This patch also makes some expected output lines in tests glob-ed for
persistence of them.
BTW, files below aren't yet changed in 2017, but this patch also
updates copyright of them, because:
- mercurial/help/hg.1.txt
almost all of "man hg" output comes from online help of hg
command, and is already changed in 2017
- mercurial/help/hgignore.5.txt
- mercurial/help/hgrc.5
"copyright 2005-201X Matt Mackall" in them mentions about
copyright of Mercurial itself
This patch also adds new check-code.py pattern to detect invalid usage
of "mercurial@selenic.com".
Change for test-convert-tla.t is tested, but similar change for almost
same test-convert-baz.t isn't yet tested actually, because I couldn't
find out the way to get "GNU Arch baz client".
AFAIK, buildbot skips test-convert-baz.t, too. Does anybody have
appropriate environment for testing?
This is a cherry pick of an upstream fix. The free() of uninitialed
memory could likely only occur if a malloc() inside zstd fails.
The patched functions aren't currently used by Mercurial. But I don't
like leaving footguns sitting around.
Previously, when runcommand raises, chg aborts with, and does not wait for
pager. The call stack is like:
hgc_runcommand -> handleresponse -> readchannel -> debugmsg("failed to
read channel") -> exit(255)
That means, chg returns to the shell, then both the pager and the shell will
read from the terminal at the same time, causing problems.
This patch fixes that by using "atexit" to register the pager cleanup
function so chg will always wait for pager even if runcommand raises.
Commit 63c68d6f5fc8de4afd9bde81b13b537beb4e47e8 from
https://github.com/indygreg/python-zstandard is imported without
modifications (other than removing unwanted files).
This includes minor performance and feature improvements. It also
changes the vendored zstd library from 1.1.1 to 1.1.2.
# no-check-commit
pycompat.getenv returns os.getenvb on py3 which is not available on Windows.
This patch replaces them with encoding.environ.get and checks to ensure no
new instances of os.getenv or os.setenv are introduced.
Module-level @cachefunc usage is risky because it can easily create a
memory "leak". Let's reject it completely for now. If a valid usage
comes up in the future, we can always improve the check or reconsider.
Now that the revlog has a reference to a compressor, it is
possible to swap in other compression engines. So, teach
`hg perfrevlogchunks` to do that.
The default behavior of `hg perfrevlogchunks` is now to measure the
compression performance of all compression engines implementing the
revlog compressor API. This effectively adds the no-op "none"
compressor and zstd (when available) into the default set.
While we can't yet plug alternate compressors into revlogs, this
command gives us a preview of the performance. On the mozilla-unified
repository:
$ hg perfrevlogchunks -c
! compress w/ none
! wall 0.115159 comb 0.110000 user 0.110000 sys 0.000000 (best of 86)
! compress w/ zlib
! wall 5.681406 comb 5.680000 user 5.680000 sys 0.000000 (best of 3)
! compress w/ zstd
! wall 2.624781 comb 2.620000 user 2.620000 sys 0.000000 (best of 4)
$ hg perfrevlogchunks -m
! compress w/ none
! wall 0.124486 comb 0.120000 user 0.120000 sys 0.000000 (best of 79)
! compress w/ zlib
! wall 10.144701 comb 10.150000 user 10.150000 sys 0.000000 (best of 3)
! compress w/ zstd
! wall 4.383118 comb 4.390000 user 4.390000 sys 0.000000 (best of 3)
Those numbers for zstd look promising. But they aren't the full story.
For that, we'll need to look at decompression times and storage sizes.
Stay tuned...
Upcoming patches will convert revlogs to use the compression engine
APIs to perform all things compression. The yet-to-be-introduced
APIs support a persistent "compressor" object so the same object
can be reused for multiple compression operations, leading to
better performance. In addition, compression engines like zstd
may wish to tweak compression engine state based on the revlog
(e.g. per-revlog compression dictionaries).
A global and shared decompress() function will shortly no longer
make much sense. So, we move decompress() to be a method of the
revlog class. It joins compress() there.
On the mozilla-unified repo, we can measure the impact of this change
on reading performance:
$ hg perfrevlogchunks -c
! chunk
! wall 1.932573 comb 1.930000 user 1.900000 sys 0.030000 (best of 6)
! wall 1.955183 comb 1.960000 user 1.930000 sys 0.030000 (best of 6)
! chunk batch
! wall 1.787879 comb 1.780000 user 1.770000 sys 0.010000 (best of 6
! wall 1.774444 comb 1.770000 user 1.750000 sys 0.020000 (best of 6)
"chunk" appeared to become slower but "chunk batch" got faster. Upon
further examination by running both sets multiple times, the numbers
appear to converge across all runs. This tells me that there is no
perceived performance impact to this refactor.
Selecting single and multiple revisions is closely related, so let's
put it in one place, so users can easily find it. We actually did not
even point to "hg help revsets" from "hg help revisions", but now that
they're on a single page, that won't be necessary.
This patch uses the newly introduced "setprocname" interface to update the
process title server-side, to make it easier to tell what a worker is actually
doing.
The new title is "chg[worker/$PID]", where PID is the process ID of the
connected client. It can be directly observed using "ps -AF" under Linux, or
"ps -A" under FreeBSD.
We have enough bits to switch to the new chg pager code path in runcommand.
So just remove the legacy getpager support.
This is a red-only patch, and will break chg's pager support temporarily.
This patch implements the simple S-channel pager handling at chg
client-side.
Note: It does not deal with environ and cwd currently for simplicity, which
will be fixed later.
Previously S channel is only used to send system commands. It will also be
used to send pager commands. So add a type parameter.
This breaks older chg clients. But chg and hg should always come from a
single commit and be packed into a single package. Supporting running
inconsistent versions of chg and hg seems to be unnecessarily complicated
with little benefit. So just make the change and assume people won't use
inconsistent chg with hg.