`_optimize` calculates weights of subtrees. "small" affects some weight
calculation (either 1 or 0.5). The weights are now only useful in `and`
optimization where we might swap two arguments and use `andsmally`.
In the real world, it seems unlikely that revsets with weight of 0.5 or 1
matters the `and` order optimization. I think the important thing is to get
weights of expensive revsets right (ex. `contains`).
This patch removes the `small` argument to simplify the interface.
As for choosing between 0.5 vs 1, things returning a single revision
(`ancestor`, `string`) has a weight of 0.5. Things returning multiple
revisions returns 1. This could be sometimes useful in the `andsmally`
optimization, ex.
(((:)-2) & expensive()) & ((1-2) & expensive())
^^^ ^
^^^^^^^ ^^^^^
^^^^^^^^^^^^^^^^^^^^^^^ ^^^^^^^^^^^^^^^^^^^^^
weight=1 weight=0.5
would have an `andsmally` optimization so `1-2` gets executed first, which
seems to be desirable.
Differential Revision: https://phab.mercurial-scm.org/D656
Now that the part processing loop is tiny, let's move it to a separate function.
This will allow extensions to completely replace the part processing logic,
without having to replace the overall bundle processing logic or the stream
maintenance logic.
This will be useful for the infinitepush extension, so it can completely take
over receiving a bundle and rerouting it to a side store. This will also make it
easier to upstream the infinitepush functionality later.
Differential Revision: https://phab.mercurial-scm.org/D709
As part of refactoring bundle part processing let's move handler validation to
its own function.
Differential Revision: https://phab.mercurial-scm.org/D707
The processpart function also did some stream maintenance, so let's move it to
the part iterator as well, as part of moving all part iteration logic into the
class.
There is one place processpart is called outside of the normal loop, so we
manually handle the seek there.
The now-empty try/finally will be removed in a later patch, for ease of review.
Differential Revision: https://phab.mercurial-scm.org/D706
This is a trivial backport of 9f516a26a962 performed by
augie@google.com, but the change is still really Durham's not mine, so
I [augie] am leaving him as the author.
It seems like we used to pick the oldest possible version of the
changegroup to use for bundles created by the repair module (used
e.g. by "hg strip" and for temporary bundles by "hg rebase"). I tried
to preserve that behavior when I created the changegroup.safeversion()
method in 77f74106b264 (changegroup: introduce safeversion(),
2016-01-19).
However, we have recently chagned our minds and decided that these
commands are only used locally and downgrades are unlikely. That
decicion allowed us to start adding obsmarker and phase information to
these bundles. However, as the bug report shows, it means we get
different behavior e.g. when generaldelta is not enabled (because when
it was enabled, it forced us to use bundle2). The commit that actually
caused the reported bug was 26d535788092 (strip: include phases in
bundle (BC), 2017-06-15).
So, since we now depend on having more information in the bundles,
let's make sure we instead pick the newest possible changegroup
version.
Differential Revision: https://phab.mercurial-scm.org/D715
It can be useful for extensions to be able to produce the shortest
unambiguous hash (including the in-tree "show" extension). That logic
is currently inside the shortest() template function. Let's move it
out of the templater. I've put it on revlog since it's closely related
to revlog._partialmatch. We may also want a convenience method on
context, but I'll leave that for a later patch.
Differential Revision: https://phab.mercurial-scm.org/D724
I've tripped on this several times now, and am tired of debugging. Now
the header parts are part of the error message when the ''.join()
fails, which makes debugging obvious.
This was the original intent, but I bungled the logic. Otherwise if there is a
certificate chain issue, the repository can't be cloned in order for there to be
a repo object. I think I missed this case because I was inside of a Mercurial
clone as I was originally developing and testing this.
This is a regression caused by 10c1efcbeb1e. Code prior to 10c1efcbeb1e
seems to miss the "\ No newline at end of file" line.
Differential Revision: https://phab.mercurial-scm.org/D528
As Augie reported in the bug, the current heuristic of choosing the
best tag of a merge commit by taking the one with newest tag (in terms
of tagging date) currently fails in the Mercurial repo itself. Copying
the example from Yuya:
$ hg glog -T '{node|short} {latesttag}+{latesttagdistance}\n' \
-r '4.2.3: & (merge() + parents(merge()) + tag())'
o cc59efae4cc0 4.2.3+5
|\
| o 06f60e88fc3a 4.2.3+4
| |\
| | o c191a9eb0b10 4.3-rc+109
| | |
| | ~
o | 49ada93fdc10 4.3.1+2
: |
o | 229937197835 4.3.1+0
|/
o 6a83ad94c0f2 4.2.3+3
|\
| ~
o 8e9dcdd1de74 4.2.3+2
:
o 525f2b18248f 4.2.3+0
|
~
It seems to me like the best choice is the tag with the smallest
number of changes since it (across all paths, not the longest single
path). So that's what this patch does, even though it's
costly. Best-of-5 timings for Yuya's command above shows a slowdown
from 1.293s to 1.610s. We can optimize it later.
Differential Revision: https://phab.mercurial-scm.org/D447
45345e9870c3 and b30126fa95bc refactored ui methods to no longer
silently swallow some IOError instances. This is arguably the
correct thing to do. However, it had the unfortunate side-effect
of causing StdioError to bubble up to sensitive code like
transaction aborts, leading to an uncaught exceptions and failures
to e.g. roll back a transaction. This could occur when a remote
HTTP or SSH client connection dropped. The new behavior is
resulting in semi-frequent "abandonded transaction" errors on
multiple high-volume repositories at Mozilla.
This commit effectively reverts 45345e9870c3 and b30126fa95bc to
restore the old behavior.
I agree with the principle that I/O errors shouldn't be ignored.
That makes this change... unfortunate. However, our hands are tied
for what to do on stable. I think the proper solution is for the
ui's behavior to be configurable (possibly via a context manager).
During critical sections like transaction rollback and abort, it
should be possible to suppress errors. But this feature would not
be appropriate on stable.
Old versions of python 2.7 don't like that the second argument to
struct.unpack_from is a bytearray, so the change removing the util.buffer
around that argument in branchmap broke running on older versions of python
2.7.
Differential Revision: https://phab.mercurial-scm.org/D330
This vulnerability was fixed by the previous patch and there were more ways
to exploit than using '|shellcmd'. So it doesn't make sense to reject only
pipe character.
Test cases are updated to actually try to exploit the bug. As the SSH bridge
of git/svn subrepos are not managed by our code, the tests for non-hg subrepos
are just removed.
This may be folded into the original patches.
'ssh://' has an exploit that will pass the url blindly to the ssh
command, allowing a malicious person to have a subrepo with
'-oProxyCommand' which could run arbitrary code on a user's machine. In
addition, at least on Windows, a pipe '|' is able to execute arbitrary
commands.
When this happens, let's throw a big abort into the user's face so that
they can inspect what's going on.
'ssh://' has an exploit that will pass the url blindly to the ssh
command, allowing a malicious person to have a subrepo with
'-oProxyCommand' which could run arbitrary code on a user's machine. In
addition, at least on Windows, a pipe '|' is able to execute arbitrary
commands.
When this happens, let's throw a big abort into the user's face so that
they can inspect what's going on.
Our use of SSH has an exploit that will parse the first part of an url
blindly as a hostname. Prior to this set of security patches, a url
with '-oProxyCommand' could run arbitrary code on a user's machine. In
addition, at least on Windows, a pipe '|' can be abused to execute
arbitrary commands in a similar fashion.
We defend against this by checking ssh:// URLs and looking for a
hostname that starts with a - or contains a |.
When this happens, let's throw a big abort into the user's face so
that they can inspect what's going on.
The initial attempt was to discard cache when appropriate, but it appears
to be error prone. We had to carefully inspect all places where audit() is
called e.g. without actually updating filesystem, before removing files and
directories, etc.
So, this patch disables the cache of audited paths by default, and enables
it only for the following cases:
- short-lived auditor objects
- repo.vfs, repo.svfs, and repo.cachevfs, which are managed directories
and considered sort of append-only (a file/directory would never be
replaced with a symlink)
There would be more cacheable vfs objects (e.g. mq.queue.opener), but I
decided not to inspect all of them in this patch. We can make them cached
later.
Benchmark result:
- using old clone of http://selenic.com/repo/linux-2.6/ (38319 files)
- on tmpfs
- run HGRCPATH=/dev/null hg up -q --time tip && hg up -q null
- try 4 times and take the last three results
original:
real 7.480 secs (user 1.140+22.760 sys 0.150+1.690)
real 8.010 secs (user 1.070+22.280 sys 0.170+2.120)
real 7.470 secs (user 1.120+22.390 sys 0.120+1.910)
clearcache (the other series):
real 7.680 secs (user 1.120+23.420 sys 0.140+1.970)
real 7.670 secs (user 1.110+23.620 sys 0.130+1.810)
real 7.740 secs (user 1.090+23.510 sys 0.160+1.940)
enable cache only for vfs and svfs (this series):
real 8.730 secs (user 1.500+25.190 sys 0.260+2.260)
real 8.750 secs (user 1.490+25.170 sys 0.250+2.340)
real 9.010 secs (user 1.680+25.340 sys 0.280+2.540)
remove cache function at all (for reference):
real 9.620 secs (user 1.440+27.120 sys 0.250+2.980)
real 9.420 secs (user 1.400+26.940 sys 0.320+3.130)
real 9.760 secs (user 1.530+27.270 sys 0.250+2.970)
Without this patch on Windows 'hg ci -i' hangs waiting for user input
and "examine changes to 'file'? [Ynesfdaq?]" is never displayed (at least
if the diff is sufficiently small). When Ctrl+C is pressed, this prompt
becomes visible, which suggests that the buffer just wasn't flushed.
I've never seen this happening on Linux, but this looks harmless enough
to not platform-gate it.
Before this patch, explicit --pager=on is unintentionally ignored by
any disabling factor, even if priority of it is less than --pager=on
(e.g. "[ui] paginate = off").
statichttprepo inherits from localrepository. In doing so, it
obtains default implementations of various methods, like wlock().
Before this change, tags cache writing would call repo.wlock().
This failed on statichttprepo due to localrepository's wlock()
looking for an instance attribute that doesn't exist on statichttprepo
(statichttprepo doesn't call localrepository.__init__).
We /could/ define missing attributes until the base wlock() works.
However, a statichttprepo is remote and read-only and can't be
locked. The class already has a lock() that short circuits. So
it makes sense to implement a short-circuited wlock() as well. That
is what this patch does.
LockError is expected to be raised when locking fails. The constructor
takes a number of arguments that are local repository centric. Rather
than rework LockError to not require them (which would not be
appropriate for stable), this commit populates dummy values. I don't
believe they'll ever be seen by the user, as lock failures on
static http repos should be limited to well-defined (and tested)
scenarios. We can and should revisit the LockError type to improve
this.
When we changed basematcher.visitdir() in 0ca205268beb (match: make
base matcher return True for visitdir, 2017-07-14), we forgot to add
an override in nevermatcher. This led to tests failing in narrowhg.
As Durham pointed out, it's high time to add unit tests for the
matcher, so this patch also adds a first unit test.
Differential Revision: https://phab.mercurial-scm.org/D151
Without this, multiple spaces or tabs in the commit message aren't
preserved and things like tables don't align properly.
As part of adding the CSS rule, we had to cuddle the content
with the <div> to not introduce leading and trailing whitespace.
The "addbreaks" filter was also removed because it would insert
an additional newline, effectively double spacing content.
Differential Revision: https://phab.mercurial-scm.org/D113
The presence of a sparse checkout can confuse legacy clients or
clients without sparse enabled for reasons that should be obvious.
This commit introduces a new repository requirement that tracks
whether sparse is enabled. The requirement is added when a sparse
config is activated and removed when the sparse config is reset.
The localrepository constructor has been taught to not open repos
with this requirement unless the sparse feature is enabled. It yields
a more actionable error message than what you would get if the
lockout were handled strictly at the requirements verification phase.
Old clients that aren't sparse aware will see the generic
"repository requires features unknown to this Mercurial" error,
however.
The new requirement has "exp" in its name to reflect the
experimental nature of sparse. There's a chance that the eventual
non-experimental feature won't change significantly and we could
have squatted on the "sparse" requirement without ill effect. If
that happens, we can teach new clients to still recognize the old
name. But I suspect we'll sneak in some BC and we'll want a new
requirement to convey new meaning.
Differential Revision: https://phab.mercurial-scm.org/D110
In 3 functions we were writing the sparse config and updating the
working directory. In two of them we had a transaction-like process
for restoring the sparse config in case of wdir update fail.
Because the pattern is common, we've already made mistakes, and the
complexity will increase in the near future, let's consolidate the
code into a reusable function.
As part of this refactor, we end up reading the "sparse" file twice
when updating it. This is a bit sub-optimal. But I don't think it
is worth the code complexity to pass around the variables to avoid
the redundancy.
Differential Revision: https://phab.mercurial-scm.org/D109
The repo instance is currently only used to provide a changeset
lookup function as part of parsing revsets. I /think/ this allows
node fragments to resolve. I'm not sure why we wouldn't want this
to always "just work" if parsing a revset string.
Plus, an upcoming commit will introduce a new consumer that needs a
handle on the repo. So passing it more often will make that code
work more.
Passing a repo instance in all callers of revset.match* results in
a bunch of test changes. Notably, branch and tags caches get
populated as part of evaluating revsets. I'm not sure if this is
desirable. So this patch takes the conservative approach and only
passes the repo if we're passing a ui instance.
Differential Revision: https://phab.mercurial-scm.org/D97
Previously, [include] was implicit and pattern lines before a
[section] were added to includes.
Because the format may change in the future and explicit behavior,
well, more explicit, this commit changes the config parser to
reject pattern lines that don't occur in a [section].
Differential Revision: https://phab.mercurial-scm.org/D96