This was the original intent, but I bungled the logic. Otherwise, if there is a
certificate chain issue, the repository can't be cloned, so there is never a
repo object. I think I missed this case because I was inside of a Mercurial
clone as I was originally developing and testing this.
This is a regression caused by 10c1efcbeb1e. Code prior to 10c1efcbeb1e
seems to miss the "\ No newline at end of file" line.
Differential Revision: https://phab.mercurial-scm.org/D528
As Augie reported in the bug, the current heuristic of choosing the
best tag of a merge commit by taking the newest one (in terms of
tagging date) fails in the Mercurial repo itself. Copying
the example from Yuya:
$ hg glog -T '{node|short} {latesttag}+{latesttagdistance}\n' \
-r '4.2.3: & (merge() + parents(merge()) + tag())'
o cc59efae4cc0 4.2.3+5
|\
| o 06f60e88fc3a 4.2.3+4
| |\
| | o c191a9eb0b10 4.3-rc+109
| | |
| | ~
o | 49ada93fdc10 4.3.1+2
: |
o | 229937197835 4.3.1+0
|/
o 6a83ad94c0f2 4.2.3+3
|\
| ~
o 8e9dcdd1de74 4.2.3+2
:
o 525f2b18248f 4.2.3+0
|
~
It seems to me like the best choice is the tag with the smallest
number of changes since it (across all paths, not the longest single
path). So that's what this patch does, even though it's
costly. Best-of-5 timings for Yuya's command above show a slowdown
from 1.293s to 1.610s. We can optimize it later.
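
A minimal sketch of the selection rule described above, run on a toy DAG
rather than on Mercurial's changelog. The `parents` mapping, the node names
and both helper functions are hypothetical and only illustrate "fewest
changes since the tag, across all paths":

```python
def ancestors(parents, node):
    """Return the set of all ancestors of node, including node itself."""
    seen = set()
    stack = [node]
    while stack:
        n = stack.pop()
        if n not in seen:
            seen.add(n)
            stack.extend(parents.get(n, ()))
    return seen


def best_tag(parents, tags, node):
    """Pick the tagged ancestor with the fewest changes since it.

    "Changes since the tag" counts every ancestor of node that is not an
    ancestor of the tagged node, across all paths.
    """
    reachable = ancestors(parents, node)
    best = None
    for tagged, name in tags.items():
        if tagged not in reachable:
            continue
        distance = len(reachable - ancestors(parents, tagged))
        if best is None or distance < best[1]:
            best = (name, distance)
    return best


# Toy history: "m" merges a branch tagged 1.0 at "b" (followed by "c" and
# "d") with a branch whose tip "f" is tagged 1.1.
parents = {
    "a": [],
    "b": ["a"],
    "c": ["b"],
    "d": ["c"],
    "e": ["a"],
    "f": ["e"],
    "m": ["d", "f"],
}
tags = {"b": "1.0", "f": "1.1"}

print(best_tag(parents, tags, "m"))  # ('1.1', 4)
```

Computing a full ancestor set per tag is what makes this approach costly
compared to tracking a single longest path.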
Differential Revision: https://phab.mercurial-scm.org/D447
45345e9870c3 and b30126fa95bc refactored ui methods to no longer
silently swallow some IOError instances. This is arguably the
correct thing to do. However, it had the unfortunate side-effect
of causing StdioError to bubble up to sensitive code like
transaction aborts, leading to uncaught exceptions and failures
to e.g. roll back a transaction. This could occur when a remote
HTTP or SSH client connection dropped. The new behavior is
resulting in semi-frequent "abandoned transaction" errors on
multiple high-volume repositories at Mozilla.
This commit effectively reverts 45345e9870c3 and b30126fa95bc to
restore the old behavior.
I agree with the principle that I/O errors shouldn't be ignored.
That makes this change... unfortunate. However, our hands are tied
for what to do on stable. I think the proper solution is for the
ui's behavior to be configurable (possibly via a context manager).
During critical sections like transaction rollback and abort, it
should be possible to suppress errors. But this feature would not
be appropriate on stable.
Old versions of python 2.7 don't like that the second argument to
struct.unpack_from is a bytearray, so the change removing the util.buffer
around that argument in branchmap broke running on older versions of python
2.7.
Differential Revision: https://phab.mercurial-scm.org/D330
This vulnerability was fixed by the previous patch and there were more ways
to exploit it than using '|shellcmd'. So it doesn't make sense to reject only
the pipe character.
Test cases are updated to actually try to exploit the bug. As the SSH bridge
of git/svn subrepos is not managed by our code, the tests for non-hg subrepos
are just removed.
This may be folded into the original patches.
'ssh://' has an exploit that will pass the url blindly to the ssh
command, allowing a malicious person to have a subrepo with
'-oProxyCommand' which could run arbitrary code on a user's machine. In
addition, at least on Windows, a pipe '|' is able to execute arbitrary
commands.
When this happens, let's throw a big abort into the user's face so that
they can inspect what's going on.
Our use of SSH has an exploit that will parse the first part of a URL
blindly as a hostname. Prior to this set of security patches, a URL
with '-oProxyCommand' could run arbitrary code on a user's machine. In
addition, at least on Windows, a pipe '|' can be abused to execute
arbitrary commands in a similar fashion.
We defend against this by checking ssh:// URLs and looking for a
hostname that starts with a - or contains a |.
When this happens, let's throw a big abort into the user's face so
that they can inspect what's going on.
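
A minimal sketch of that check, assuming Python 3's `urllib.parse` for URL
splitting. The function and exception names are hypothetical and this is not
the code in these patches; it only illustrates "hostname starts with a - or
contains a |":

```python
from urllib.parse import unquote, urlsplit


class UnsafeSshUrl(ValueError):
    """Raised for ssh:// URLs whose host could be read as an option."""


def is_unsafe_ssh_url(url):
    """Return True if url is an ssh:// URL with a suspicious host part."""
    parts = urlsplit(url)
    if parts.scheme != "ssh":
        return False
    # Drop any "user@" prefix, then decode percent-escapes so that an
    # encoded "-" or "|" cannot slip past the check.
    host = unquote(parts.netloc.rpartition("@")[2])
    return host.startswith("-") or "|" in host


def check_safe_ssh_url(url):
    """Abort loudly instead of handing a suspicious host to ssh."""
    if is_unsafe_ssh_url(url):
        raise UnsafeSshUrl("potentially unsafe url: %r" % url)
```

For example, `check_safe_ssh_url("ssh://-oProxyCommand=touch%20owned/path")`
raises, while `check_safe_ssh_url("ssh://user@example.com/repo")` returns
normally.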
The initial attempt was to discard cache when appropriate, but it appears
to be error prone. We had to carefully inspect all places where audit() is
called e.g. without actually updating filesystem, before removing files and
directories, etc.
So, this patch disables the cache of audited paths by default, and enables
it only for the following cases:
- short-lived auditor objects
- repo.vfs, repo.svfs, and repo.cachevfs, which are managed directories
and considered sort of append-only (a file/directory would never be
replaced with a symlink)
There would be more cacheable vfs objects (e.g. mq.queue.opener), but I
decided not to inspect all of them in this patch. We can make them cached
later.
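
A rough sketch of the opt-in caching idea, using a toy auditor. The class
name, the `cached` flag and the `check` callable are hypothetical stand-ins
for illustration, not the actual auditor interface:

```python
class PathAuditor:
    """Toy auditor: `check` stands in for the expensive filesystem checks."""

    def __init__(self, check, cached=False):
        self._check = check
        self._cached = cached
        self._audited = set()

    def __call__(self, path):
        if path in self._audited:
            return
        self._check(path)
        if self._cached:
            # Only remember results when the caller opted in, e.g. for a
            # managed, append-only directory or a short-lived auditor.
            self._audited.add(path)


calls = []

uncached = PathAuditor(calls.append)
uncached("a/b")
uncached("a/b")  # audited again: nothing was remembered

cached = PathAuditor(calls.append, cached=True)
cached("a/b")
cached("a/b")  # skipped: already in the cache

print(len(calls))  # 3
```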
Benchmark result:
- using old clone of http://selenic.com/repo/linux-2.6/ (38319 files)
- on tmpfs
- run HGRCPATH=/dev/null hg up -q --time tip && hg up -q null
- try 4 times and take the last three results
original:
real 7.480 secs (user 1.140+22.760 sys 0.150+1.690)
real 8.010 secs (user 1.070+22.280 sys 0.170+2.120)
real 7.470 secs (user 1.120+22.390 sys 0.120+1.910)
clearcache (the other series):
real 7.680 secs (user 1.120+23.420 sys 0.140+1.970)
real 7.670 secs (user 1.110+23.620 sys 0.130+1.810)
real 7.740 secs (user 1.090+23.510 sys 0.160+1.940)
enable cache only for vfs and svfs (this series):
real 8.730 secs (user 1.500+25.190 sys 0.260+2.260)
real 8.750 secs (user 1.490+25.170 sys 0.250+2.340)
real 9.010 secs (user 1.680+25.340 sys 0.280+2.540)
remove the cache entirely (for reference):
real 9.620 secs (user 1.440+27.120 sys 0.250+2.980)
real 9.420 secs (user 1.400+26.940 sys 0.320+3.130)
real 9.760 secs (user 1.530+27.270 sys 0.250+2.970)
Without this patch on Windows 'hg ci -i' hangs waiting for user input
and "examine changes to 'file'? [Ynesfdaq?]" is never displayed (at least
if the diff is sufficiently small). When Ctrl+C is pressed, this prompt
becomes visible, which suggests that the buffer just wasn't flushed.
I've never seen this happening on Linux, but this looks harmless enough
to not platform-gate it.
Before this patch, an explicit --pager=on was unintentionally overridden
by any disabling factor, even one whose priority is lower than that of
--pager=on (e.g. "[ui] paginate = off").
statichttprepo inherits from localrepository. In doing so, it
obtains default implementations of various methods, like wlock().
Before this change, tags cache writing would call repo.wlock().
This failed on statichttprepo due to localrepository's wlock()
looking for an instance attribute that doesn't exist on statichttprepo
(statichttprepo doesn't call localrepository.__init__).
We /could/ define missing attributes until the base wlock() works.
However, a statichttprepo is remote and read-only and can't be
locked. The class already has a lock() that short circuits. So
it makes sense to implement a short-circuited wlock() as well. That
is what this patch does.
LockError is expected to be raised when locking fails. The constructor
takes a number of arguments that are local repository centric. Rather
than rework LockError to not require them (which would not be
appropriate for stable), this commit populates dummy values. I don't
believe they'll ever be seen by the user, as lock failures on
static http repos should be limited to well-defined (and tested)
scenarios. We can and should revisit the LockError type to improve
this.
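
A simplified sketch of the situation, using stand-in classes. The class
names, the `_wlockref` attribute and the dummy constructor values are
illustrative assumptions, not the real implementation:

```python
class LockError(Exception):
    """Stand-in for a lock failure that carries repository-centric fields."""

    def __init__(self, errno, strerror, filename, desc):
        super().__init__(errno, strerror, filename, desc)
        self.errno = errno
        self.strerror = strerror
        self.filename = filename
        self.desc = desc


class LocalRepo:
    def __init__(self):
        self._wlockref = None  # only ever set up by this constructor

    def wlock(self):
        # Relies on state created in __init__.
        return self._wlockref


class StaticHttpRepo(LocalRepo):
    def __init__(self):
        # Deliberately does not call LocalRepo.__init__, mirroring the
        # situation described above.
        pass

    def wlock(self):
        # Short-circuit: a remote, read-only repo can never be locked.
        # The first three arguments are dummy values.
        raise LockError(0, "lock not available", "lock",
                        "cannot lock static-http repository")


try:
    StaticHttpRepo().wlock()
except LockError as err:
    failure = err.desc

print(failure)  # cannot lock static-http repository
```

Without the override, the inherited `wlock()` would fail with an
`AttributeError` on the missing attribute instead of a lock error that
callers know how to handle.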
When we changed basematcher.visitdir() in 0ca205268beb (match: make
base matcher return True for visitdir, 2017-07-14), we forgot to add
an override in nevermatcher. This led to tests failing in narrowhg.
As Durham pointed out, it's high time to add unit tests for the
matcher, so this patch also adds a first unit test.
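
A toy illustration of the missing override. The classes below are simplified
stand-ins written for this sketch (including the `dirs_to_visit` helper), not
the real matcher classes:

```python
class BaseMatcher:
    def __call__(self, path):
        return True

    def visitdir(self, directory):
        # Conservative default: any directory may contain matches.
        return True


class NeverMatcher(BaseMatcher):
    def __call__(self, path):
        return False

    def visitdir(self, directory):
        # Nothing ever matches, so no directory needs to be visited.
        return False


def dirs_to_visit(matcher, directories):
    """Return the directories a walker would descend into."""
    return [d for d in directories if matcher.visitdir(d)]
```

With the override in place, `dirs_to_visit(NeverMatcher(), ["dir", "dir/sub"])`
is empty, whereas inheriting the base `visitdir()` would make a walker
descend into every directory even though no file can match.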
Differential Revision: https://phab.mercurial-scm.org/D151
Without this, multiple spaces or tabs in the commit message aren't
preserved and things like tables don't align properly.
As part of adding the CSS rule, we had to cuddle the content
with the <div> to not introduce leading and trailing whitespace.
The "addbreaks" filter was also removed because it would insert
an additional newline, effectively double spacing content.
Differential Revision: https://phab.mercurial-scm.org/D113
The presence of a sparse checkout can confuse legacy clients or
clients without sparse enabled for reasons that should be obvious.
This commit introduces a new repository requirement that tracks
whether sparse is enabled. The requirement is added when a sparse
config is activated and removed when the sparse config is reset.
The localrepository constructor has been taught to not open repos
with this requirement unless the sparse feature is enabled. It yields
a more actionable error message than what you would get if the
lockout were handled strictly at the requirements verification phase.
Old clients that aren't sparse aware will see the generic
"repository requires features unknown to this Mercurial" error,
however.
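
A hedged sketch of the two-stage lockout described above. The requirement
name `exp-sparse`, the set of supported requirements and the wording of the
first error are assumptions made for illustration; only the generic "unknown
features" message is quoted from the text above:

```python
SPARSE_REQUIREMENT = "exp-sparse"  # assumed name; the text only says "exp"
SUPPORTED = {"revlogv1", "store", SPARSE_REQUIREMENT}  # illustrative set


class RequirementError(Exception):
    pass


def check_can_open(requirements, sparse_enabled):
    """Refuse to open a repo whose requirements cannot be honored."""
    # Checked first so that the user gets an actionable hint.
    if SPARSE_REQUIREMENT in requirements and not sparse_enabled:
        raise RequirementError(
            "repository uses sparse, but the sparse feature is not enabled"
        )
    unknown = set(requirements) - SUPPORTED
    if unknown:
        raise RequirementError(
            "repository requires features unknown to this Mercurial: %s"
            % " ".join(sorted(unknown))
        )


try:
    check_can_open({"store", SPARSE_REQUIREMENT}, sparse_enabled=False)
except RequirementError as err:
    sparse_message = str(err)
```

A client that has never heard of the requirement only has the second check,
which is why it reports the generic error.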
The new requirement has "exp" in its name to reflect the
experimental nature of sparse. There's a chance that the eventual
non-experimental feature won't change significantly and we could
have squatted on the "sparse" requirement without ill effect. If
that happens, we can teach new clients to still recognize the old
name. But I suspect we'll sneak in some BC and we'll want a new
requirement to convey new meaning.
Differential Revision: https://phab.mercurial-scm.org/D110
In 3 functions we were writing the sparse config and updating the
working directory. In two of them we had a transaction-like process
for restoring the sparse config in case of wdir update fail.
Because the pattern is common, we've already made mistakes, and the
complexity will increase in the near future, let's consolidate the
code into a reusable function.
As part of this refactor, we end up reading the "sparse" file twice
when updating it. This is a bit sub-optimal. But I don't think it
is worth the code complexity to pass around the variables to avoid
the redundancy.
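
The consolidated pattern can be sketched roughly as follows. The function
name and the dict-backed `state` are hypothetical; the real helper works on
the repository's "sparse" file and working directory:

```python
def update_config(state, new_config, refresh_wdir):
    """Write new_config, refresh the working directory, undo on failure."""
    old_config = state["sparse"]
    state["sparse"] = new_config
    try:
        refresh_wdir(new_config)
    except Exception:
        # Put the previous config back before propagating the failure.
        state["sparse"] = old_config
        raise


def failing_refresh(config):
    raise RuntimeError("working directory update failed")


state = {"sparse": "[include]\nsrc/\n"}
try:
    update_config(state, "[include]\ndocs/\n", failing_refresh)
except RuntimeError:
    pass

print(state["sparse"] == "[include]\nsrc/\n")  # True
```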
Differential Revision: https://phab.mercurial-scm.org/D109
The repo instance is currently only used to provide a changeset
lookup function as part of parsing revsets. I /think/ this allows
node fragments to resolve. I'm not sure why we wouldn't want this
to always "just work" if parsing a revset string.
Plus, an upcoming commit will introduce a new consumer that needs a
handle on the repo. So passing it more often will make that code
work in more cases.
Passing a repo instance in all callers of revset.match* results in
a bunch of test changes. Notably, branch and tags caches get
populated as part of evaluating revsets. I'm not sure if this is
desirable. So this patch takes the conservative approach and only
passes the repo if we're passing a ui instance.
Differential Revision: https://phab.mercurial-scm.org/D97
Previously, [include] was implicit and pattern lines before a
[section] were added to includes.
Because the format may change in the future and explicit behavior is,
well, more explicit, this commit changes the config parser to
reject pattern lines that don't occur in a [section].
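
A small sketch of the stricter parsing rule. The section names handled here,
the `#` comment syntax and the error wording are assumptions for
illustration; the real parser supports more than this:

```python
def parse_config(text):
    """Parse a sparse-style config into (includes, excludes)."""
    includes, excludes = set(), set()
    current = None  # no implicit section any more
    for lineno, raw in enumerate(text.splitlines(), 1):
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        if line == "[include]":
            current = includes
        elif line == "[exclude]":
            current = excludes
        elif current is None:
            raise ValueError(
                "line %d: pattern outside of a [section]: %r" % (lineno, line)
            )
        else:
            current.add(line)
    return includes, excludes


includes, excludes = parse_config("[include]\nsrc/\n[exclude]\nsrc/tests/\n")

try:
    parse_config("src/\n[include]\ndocs/\n")
except ValueError as err:
    rejected = str(err)
```

Under the previous behavior the second input would have been accepted, with
`src/` silently treated as an include.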
Differential Revision: https://phab.mercurial-scm.org/D96
The changeset displayer allows setting extra keywords to be available
to the templating layer. This patch adds an argument to displaygraph()
to pass a dict of extra properties to be available to every changeset.
Differential Revision: https://phab.mercurial-scm.org/D555
This is in the same style as https://phab.mercurial-scm.org/D493.
In general, this replaces patterns such as:
```
if f in self._map:
    entry = self._map[f]
```
with:
```
entry = self._map.get(f)
if entry is not None:
    # use entry
```
Test Plan:
`make tests`
Differential Revision: https://phab.mercurial-scm.org/D663
Config items are likely to be used during extension setup, so we must
register them before that.
For example, this applies to the 'win32text.warn' option.
Some of the extra data needs to be registered earlier than it currently is
(e.g. config items). We first factor out the logic to register it into a small
function before reusing it in the next changeset.
copytrace extension in fb-hgext has a heuristic implementation of copy tracing
which is faster than the current copy tracing. The heuristic limits the search
of copies to just files that are either:
1) Renames in the same directory
2) Moved to other directory with same name
The default copytrace implementation is very slow as it finds all the new files
that were added from merge base up to the head commit and for each file it
checks whether it was a copied or moved version of a different file.
Stash@fb did an analysis of the above heuristics on the fb repo and found that
among 2,443,768 moves/copies only 32,234 do not fall under the above
heuristics, which is approx. 0.013 (1.3%) of total copies.
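
A minimal sketch of the candidate filter implied by the two heuristics. The
function name and the plain list of added files are hypothetical; the real
implementation works on manifests and changesets:

```python
import posixpath


def heuristic_candidates(missing, added):
    """Return added files that could be a copy or move of `missing`.

    A candidate either lives in the same directory (a rename in place) or
    keeps the same basename (a move to another directory).
    """
    dirname, basename = posixpath.split(missing)
    return [
        f
        for f in added
        if posixpath.dirname(f) == dirname
        or posixpath.basename(f) == basename
    ]


print(heuristic_candidates("src/a.py", ["src/b.py", "lib/a.py", "lib/c.py"]))
# ['src/b.py', 'lib/a.py']
```

Only the surviving candidates then need the expensive "is this really a
copy" check, instead of every file added since the merge base.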
This patch moves the heuristics algorithm under config
`experimental.copytrace=heuristics`.
While moving fbext to core, this patch removes a couple of less useful config
options named `sourcecommitlimit` and `maxmovescandidatestocheck`.
Tests are also added for the heuristics algorithm, which are basically copied
from fbext/tests/test-copytrace.t. The tests follow a pattern of creating a
server repo and then cloning it to a local repo to create public and draft
changesets, a distinction which will be useful in upcoming patches.
After this patch `experimental.copytrace` has the following behaviour:
1) `off`: turns off copytracing
2) `heuristics`: use the heuristic algorithm added in this patch.
3) everything else: use the full copytracing algorithm
.. feature::
A new fast heuristic algorithm for copytracing which assumes that file
moves are either::
1) Renames in the same directory
2) Moves in other directories with same names
You can use this algorithm by setting `experimental.copytrace=heuristics`.
Differential Revision: https://phab.mercurial-scm.org/D623
As part of separating the part iteration logic from the part handling logic,
let's move the exception handling to the part iterator class.
Differential Revision: https://phab.mercurial-scm.org/D705
As part of moving the part iterator logic to a separate class, let's move the
part counting logic and the output for it.
Differential Revision: https://phab.mercurial-scm.org/D704
Currently, the part iterator logic is tightly coupled with the part handling
logic, which means it's hard to replace the part handling logic without
duplicating the part iterator bits.
In a future diff we'll want to be able to replace all part handling, so let's
begin refactoring the part iterator logic to its own class.
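
One possible shape for such a split, sketched as a context manager. The class
name, the `log` callable and the messages are hypothetical and only
illustrate keeping iteration, counting and error reporting apart from part
handling:

```python
class PartIterator:
    """Toy iterator that owns counting and error reporting for parts."""

    def __init__(self, parts, log):
        self._parts = parts
        self._log = log
        self.count = 0

    def __enter__(self):
        def counted():
            for part in self._parts:
                self.count += 1
                yield part

        return counted()

    def __exit__(self, exc_type, exc, tb):
        self._log("processed %d parts" % self.count)
        if exc is not None:
            self._log("aborted: %s" % exc)
        return False  # never swallow the exception


def process(parts, handle, log):
    """Part handling is now just a loop body that can be swapped out."""
    with PartIterator(parts, log) as iterator:
        for part in iterator:
            handle(part)


handled, messages = [], []
process(["changegroup", "phases"], handled.append, messages.append)
print(messages)  # ['processed 2 parts']
```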
Differential Revision: https://phab.mercurial-scm.org/D703
Extensions, like remotefilelog, will want to look at the source of a pull when
determining what manifests to add to a changegroup. For instance, on push they
will include everything, while on pull they won't.
Differential Revision: https://phab.mercurial-scm.org/D686
Previously revlog.addgroup would accept a changegroup and a linkmapper and use
them to iterate over the deltas. As part of untangling the revlog-changegroup
interdependency, let's move the changegroup delta iteration logic to its own
function and pass the simple iterator to the revlog instead.
This will make it easier to introduce non-revlog stores in the future, without
reinventing any changegroup specific logic.
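
A rough sketch of the decoupling, with hypothetical names throughout
(`deltaiter`, `ListStore`, the dict-shaped chunks and the tuple layout are
all made up for illustration):

```python
def deltaiter(chunks, linkmapper):
    """Turn changegroup-like chunks into plain delta tuples."""
    for chunk in chunks:
        yield (
            chunk["node"],
            chunk["base"],
            linkmapper(chunk["linknode"]),
            chunk["delta"],
        )


class ListStore:
    """A non-revlog store that only needs an iterable of deltas."""

    def __init__(self):
        self.entries = []

    def addgroup(self, deltas):
        added = []
        for node, base, linkrev, delta in deltas:
            self.entries.append((node, base, linkrev, delta))
            added.append(node)
        return added


chunks = [
    {"node": "n1", "base": None, "linknode": "c1", "delta": b"first"},
    {"node": "n2", "base": "n1", "linknode": "c2", "delta": b"second"},
]
linkrevs = {"c1": 0, "c2": 1}

store = ListStore()
added = store.addgroup(deltaiter(chunks, linkrevs.__getitem__))
print(added)  # ['n1', 'n2']
```

The store never sees the changegroup or the linkmapper, so the
changegroup-specific parsing lives in one place.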
Differential Revision: https://phab.mercurial-scm.org/D688
Previously the addgroup loop would set chain to be the result of
self._addrevision(node,...). Since _addrevision now always returns the passed in
node, we can drop that behavior and just always set chain = node in the loop.
This will be useful in a future patch where we refactor the cg.deltachunk logic
to another function and therefore chain disappears entirely from this function.
Differential Revision: https://phab.mercurial-scm.org/D699
When ui.origbackuppath is set, .orig files are stored outside of the working
copy, however they still have a .orig suffix appended to them. This can cause
unexpected conflicts, particularly when tracked files or directories have .orig
at the end.
This change removes the .orig suffix from files stored in an out-of-tree
origbackuppath.
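
A simplified sketch of the naming rule after this change. The function is
hypothetical and assumes repo-relative, slash-separated paths; it ignores
details such as creating directories or resolving the configured path:

```python
import posixpath


def backup_path(filepath, origbackuppath=None):
    """Return where the backup of filepath is stored."""
    if origbackuppath is None:
        # In-tree backups still need the suffix to avoid clobbering the
        # file itself.
        return filepath + ".orig"
    # Out-of-tree backups mirror the original path without any suffix.
    return posixpath.join(origbackuppath, filepath)


print(backup_path("dir/file.txt"))
# dir/file.txt.orig
print(backup_path("dir/file.txt", ".hg/origbackups"))
# .hg/origbackups/dir/file.txt
```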
Test Plan:
Update and run unit tests.
Differential Revision: https://phab.mercurial-scm.org/D679
encoding.fromlocal() never tries to decode an ascii string since 3cb2361c60fc,
and there's no universal non-ascii string which can be decoded as any valid
character set.
Previously this check happened in the changegroup code itself. Since its
refactor, this logic needs to move out to callers that care about it, such as
this one. Otherwise we get empty bundle devel-warnings in certain extensions.
Differential Revision: https://phab.mercurial-scm.org/D690
``recordupdates`` calls into the dirstate which requires the files to be
there, so this is the last possible moment we can flush anything.
Differential Revision: https://phab.mercurial-scm.org/D673