Rebase is becoming more complex and it looks like a good idea to try some
brute force enumeration to cover cases that are hard to find manually.
Using brute force to generate repos in different shapes and enumerating the
rebase source and destination would generate too many cases that takes too
long to compute. This patch limits the "brute force" to only the "rebase
source" part. Repo and destination are still manual.
The added test cases are crafted manually to reveal some behaviors that are
not covered by other tests:
- "revlog index out of range" crash
- after rebase, p1 == p2, p2 != null
- "nothing to merge" abort
In the future we might want to add more tests here. For now I'm more
interested in revealing interesting behaviors in a minified way. I tried
some more complex cases but didn't find other interesting behaviors.
Differential Revision: https://phab.mercurial-scm.org/D262
The changelog object can store recently added revisions in memory
until the transaction is committed. We don't want to lose those
changes even if repo.invalidate(clearfilecache=True), so let's skip
the changelog when it has such 'delayed' changes.
Differential Revision: https://phab.mercurial-scm.org/D152
It is possible that the incoming note fragments have some similar content as the
existing release notes. In case of a bug fix, we match for issueNNNN in the
existing notes. For other general cases, it makes use of fuzzywuzzy library to get
a similarity score. If the score is above a certain threshold, we ignore the
fragment, otherwise add it. But the score might be misleading for small commit
messages. So, it uses similarity function only if the length of string (in words)
is above a certain value. The patch adds tests related to its usage. But it needs
improvement in the sense of combining incoming notes. We can use interactive mode
for adding notes. Maybe we can do this if similarity is under a certain range.
Essentially, these were acting as a verbose (?) flag, since they weren't being
dropped when required. Foozy has a nice description [1]. Basically, a couple
more places needed to check the features before treating it as optional.
I don't like how test-run-tests.py had to be hacked, but _hghave() can't be made
a static method. The test change was a change while developing `debugssl`,
prior to tightening up the cases where the message is printed, that this fix
would have caught.
[1] https://www.mercurial-scm.org/pipermail/mercurial-devel/2017-July/101941.html
pushvars extension in fbext adds a --pushvars flag to push command using which
one send strings to server which becomes environment variables there prepended
with HG_USERVAR_. These variables can then be used to run hooks on the server.
The extension is moved directly to core and unbundling of the strings and
converting them to environment variables at server is disabled by default for
security reasons. One can turn that on by following config:
[push]
pushvars.server = true
This patch also adds the test for the extension.
Differential Revision: https://phab.mercurial-scm.org/D210
Rename bumped to phase-divergent in all external user-facing output. Only
update user-facing output for the moment, variables names, templates keyword
and potentially configuration would be done in later series.
The renaming is done according to
https://www.mercurial-scm.org/wiki/CEDVocabulary.
Differential Revision: https://phab.mercurial-scm.org/D216
Rename divergent to content-divergent in all external user-facing output. Only
update user-facing output for the moment, variables names, templates keyword
and potentially configuration would be done in later series.
The renaming is done according to
https://www.mercurial-scm.org/wiki/CEDVocabulary.
Differential Revision: https://phab.mercurial-scm.org/D215
This commit makes it so sparse treats passed paths as CWD-relative,
not repo-root-realive. This is a more intuitive behavior in my (and some
other FB people's) opinion.
This is breaking change however. My hope here is that since sparse is
experimental, it's ok to introduce BCs.
The reason (glob)s are needed in the test is this: in these two cases we
do not supply path together with slashes, but `os.path.join` adds them, which
means that under Windows they can be backslashes. To demonstrate this behavior,
one could remove the (glob)s and run `./run-tests.py test-sparse.t` from
MinGW's terminal on Windows.
Current logic is misleading (it says it drops only absolute paths, but
it actually drops all of them), not cross-platform (does not support Windows)
and IMO just wrong (as it should just error out if absolute paths are given).
This commit fixes it.
As the test case shows, when "hg rebase -d G -r 'B + D + F'" is run on
the following graph, we crash with traceback. It's reasonable to fail
because we can not easily produce a correct rebased F. The problem is
what diff to apply to either the rebased B or the rebased D. We could
potentially produce the result by e.g. applying the (F-D) diff to the
rebased B and then applying the reverse (E-D) diff on top, but that
could result in merge conflicts in each of those steps, which we don't
have a way of dealing with. So for now, let's just add a test case to
demonstrate that we crash (i.e. the AssertionError is clearly
incorrect since the user can run into it).
F
/|
C E
| |
B D G
\|/
A
Differential Revision: https://phab.mercurial-scm.org/D212
The fix in c97497e78a9b (rebase: handle successor targets (issue5198),
2016-04-11) only fixed the case where p2's successor was in the
destination, and only when the successor was exactly the destination
(i.e. not when the successor was an ancestor of it). This patch adds a
test case for when p1's successor is in the destination. It adds
another one for when the successor is an ancestor of the
destination. To do that simply, it also rewrites the test case using
drawdag.
Differential Revision: https://phab.mercurial-scm.org/D211
Rename unstable to orphan in all external user-facing output. Only update
user-facing output for the moment, variables names, templates keyword and
potentially configuration would be done in later series.
The renaming is done according to
https://www.mercurial-scm.org/wiki/CEDVocabulary.
Differential Revision: https://phab.mercurial-scm.org/D214
Rename trouble(s) to instability in all external user-facing output. Only
update user-facing output for the moment, variables names, templates keyword
and potentially configuration would be done in later series.
The renaming is done according to
https://www.mercurial-scm.org/wiki/CEDVocabulary.
Differential Revision: https://phab.mercurial-scm.org/D213
When a transaction is started, we must load the hookargs from the
bundleoperation object to the transaction so that they can be used in the
transaction. Also this patch makes sure no more hookargs are added to the
bundleoperation object once the transaction starts.
This is a part of porting fb extension bundle2hooks to core.
Differential Revision: https://phab.mercurial-scm.org/D209
This was previously landed as 4bc0c14fb501 but backed out in b63351f6a2 because
it broke hooks mid-rebase and caused conflict resolution data loss in the event
of unexpected exceptions. This new version adds the behavior back but behind a
config flag, since the performance improvement is notable in large repositories.
The old commit message was:
Recently we switched rebases to run the entire rebase inside a single
transaction, which dramatically improved the speed of rebases in repos with
large working copies. Let's also move the dirstate into a single dirstateguard
to get the same benefits. This let's us avoid serializing the dirstate after
each commit.
In a large repo, rebasing 27 commits is sped up by about 20%.
I believe the test changes are because us touching the dirstate gave the
transaction something to actually rollback.
(grafted from 9e3dc3a1638b9754b58a0cb26aaa75d868058109)
(grafted from 7d38b41d2266d9a02a15c64229fae0da5738dcec)
Differential Revision: https://phab.mercurial-scm.org/D135
This library was a backport of the Python 3 "selectors" library. It is
useful to provide a better selector interface for Python2, to address some
issues of the plain old select.select, mentioned in the next patch.
The code [1] was ported using the MIT license, with some minor modifications
to make our test happy:
1. "# no-check-code" was added since it's foreign code.
2. "from __future__ import absolute_import" was added.
3. "from collections import namedtuple, Mapping" changed to avoid direct
symbol import.
[1]: d27dbd2fdc/selectors2.py
# no-check-commit
I was several directories deep in the kernel tree, ran `hg add`, and got the
warning about the size of one of the files. I noticed that it suggested undoing
the add with a specific revert command. The problem is, it would have failed
since the path printed was relative to the repo root instead of cwd. While
here, I just fixed the other messages too. As an added benefit, these messages
now look the same as the verbose/inexact messages for the corresponding command.
I don't think most of these messages are reachable (typically the corresponding
cmdutil function does the check). I wasn't able to figure out why the keyword
tests were failing when using pathto()- I couldn't cause an absolute path to be
used by manipulating the --cwd flag on a regular add. (I did notice that
keyword is adding the file without holding wlock.)
More Windows sadness. Maybe someone can figure out how to make win32 color
work, but I think we avoid importing stuff from the mercurial package in this
module. On the plus side, this conditionalizes away a test failure.
Instead of treating files that are outside the sparse config as
ignored, this makes it so we list only those that are within the
sparse config by passing the sparse matcher to dirstate.walk().
Once we add support for narrow (sparseness applied to history, not
just working copy), we will need to do a similar restriction of the
walk over manifests, so this will be more consistent then. It also
simplifies the code a bit.
Note that a side-effect of this change is that files outside the
sparse config used to be listed as ignored, but they will now not be
listed at all. This can be seen in the test case where "hg purge" no
longer has any effect because it doesn't see that the files outside
the space config exist. To fix that, I think we should add an option
to dirstate.walk() to walk outside the sparse config. We might expose
that to the user as --no-sparse flag to e.g. "hg status" and "hg
purge", but that's work for another day.
Differential Revision: https://phab.mercurial-scm.org/D59
This is a Windows only thing. Unfortunately, the socket is closed at this point
(so the certificate is unavailable to check the chain). That means it's printed
out when verification fails as a guess, on the assumption that 1) most of the
time verification won't fail, and 2) sites using expired or certs that are too
new will be rare. Maybe this is an argument for adding more functionality to
debugssl, to test for problems and print certificate info. Or maybe it's an
argument for bundling certificates with the Windows builds. That idea was set
aside when the enhanced SSL code went in last summer, and it looks like there
were issues with using certifi on Windows anyway[1].
This was tested by deleting the certificate out of certmgr.msc > "Third-Party
Root Certification Authorities" > "Certificates", seeing `hg pull` fail (with
the new message), trying this command, and then successfully performing the pull
command.
[1] https://www.mercurial-scm.org/pipermail/mercurial-devel/2016-October/089573.html
This is only useful on Windows, and avoids the need to use Internet Explorer to
build the certificate chain. I can see this being extended in the future to
print information about the certificate(s) to help debug issues on any platform.
Maybe even perform some of the python checks listed on the secure connections
wiki page. But for now, all I need is 1) a command that can be invoked in a
setup script to ensure the certificate is installed, and 2) a command that the
user can run if/when a certificate changes in the future.
It would have been nice to leverage the sslutil library to pick up host specific
settings, but attempting to use sslutil.wrapsocket() failed the
'not sslsocket.cipher()' check in it and aborted.
The output is a little more chatty than some commands, but I've seen the update
take 10+ seconds, and this is only a debug command.
find_deepest is used to find the "best" ancestors given a list. In the main
loop it keeps an invariant called 'ninteresting' which is supposed to contain
the number of non-zero entries in the 'interesting' array. This invariant is
incorrectly maintained, however, which leads the the algorithm returning an
empty result for certain graphs. This has been fixed.
Also, the 'interesting' array is supposed to fit 2^ancestors values, but is
incorrectly allocated to twice that size. This has been fixed as well.
The tests in test-ancestor.py compare the Python and C versions of the code,
and report the error correctly, since the Python version works correct. Even
so, I have added an additional test against the expected result, in the event
that both algorithms have an identical error in the future.
This fixes issue5623.
In some case, the default of one value is derived from other value. We add a
way to register them anyway and an associated devel-warning.
The registration is very naive for the moment. We might be able to have a
better way for registering each of these cases but it could be done later.
Now that we have all tracking in place, the data in `tr.changes['phases']`
dictionary should be correct and we should test it.
It is a bit late in the cycle to discuss to add any public API (eg: hooks)
that expose the data to the user, so we just add a small test extension
displaying the data. It is enabled for the phases tests.
New output have been manually checked for consistency.
This was meant as a substitute for Python's "with" with multiple
context managers before we moved to Python 2.7. We're now on 2.7, so
we should have no reason to keep ctxmanager. "hg grep --all
ctxmanager" says that it was never used anyway.
Differential Revision: https://phab.mercurial-scm.org/D73
we wrap 'repo.svfs.audit' to check for the store lock when accessing file in
'.hg/store' for writing. This caught a couple of instance where the transaction
was released after the lock, we should probably have a dedicated checker for
that case.
When the appropriate developer warnings are enabled, We wrap 'repo.vfs.audit' to
check for locks when accessing file in '.hg' for writing. Another changeset will
add a 'ward' for the store vfs (svfs).
This check system has caught a handful of locking issues that have been fixed
in previous series (mostly in 4.0). I expect another batch to be caught in third
party extensions.
We introduce two real exceptions from extensions 'blackbox.log' (because a lot of
read-only operations add entry to it), and 'last-email.txt' (because 'hg email'
is currently a read only operation and there is value to keep it this way).
In addition we are currently allowing bisect to operate outside of the lock
because the current code is a bit hard to get properly locked for now. Multiple
clean up have been made but there is still a couple of them to do and the freeze
is coming.
The backslashes in the local paths were being escaped with another backslash,
and $TESTTMP doesn't match against the double backslashed path. This doesn't
happen without the 'json' filter.
IMHO, these tests were listed up in blacklist at f82d98daa0de, because
fsmonitor extension wasn't correctly installed before actual testing
at that revision (this issue was fixed by 5b36c6365794).
A leading '/' without a following drive letter is resolved to the location of
the MSYS installation. These could be collapsed into a single line of output,
but this seems more self documenting.
The output of run-tests has no formatting by default, which hampers readability.
This patch colors the diff output when pygments is available. To avoid coloring
even when pygments is available, use --color never.
The proposed syntax [1] was originally 'set{n rel}', but it seemed slightly
confusing if template is involved. On the other hand, we want to keep 'set[n]'
for future extension. So this patch introduces 'set#rel[n]' ternary operator.
I chose '#' just because it looks like applying an attribute.
This also adds stubs for 'set[n]' and 'set#rel' operators since these syntax
elements are fundamental for constructing 'set#rel[n]'.
[1]: https://www.mercurial-scm.org/wiki/RevsetOperatorPlan#ideas_from_mpm
It's sometimes useful to show hyperlinks in log output.
"{get(peerpaths, "default")}/rev/{node}"
Since each path may have sub options, "{peerpaths}" is structured as a dict
of dicts, but the inner dict is rendered as if it were a string URL. The
implementation is ad-hoc, so there are some weird behaviors described in
the test. We might need to introduce a proper way of handling a hybrid
scalar object.
This patch adds _hybrid.__getitem__() so d['path']['url'] works.
The keyword is named as "peerpaths" since "paths" seemed too generic in
log context.
If we are bundling secret changeset and the bundle will contain phase, we
request the changegroup to be applied as secret.
It will be useful for next patch as we are now sure that secrets changesets
are applied as secret and not applied as draft then forced to secret.
Various third parties have implemented the `amend` command, which is in high
demand. This patch adds it as an experimental extension so its interface
could be formalized in core directly.
Since `commit --amend` is basically what `amend` should do. The command is
just a thin wrapper around `commit --amend` and just prevent the editor from
popping up by passing `--message`.
This changeset attempts to solve two issues with the "followlines" UI in
hgweb. First the "followlines" action is currently not easily discoverable
(one has to hover on a line for some time, wait for the invite message to
appear and then perform some action). Second, it gets in the way of natural
line selection, especially in filerevision view.
This changeset introduces an additional markup element (a <button
class="btn-followlines">) alongside each content line of the view. This button
now holds events for line selection that were previously plugged onto content
lines directly. Consequently, there's no more action on content lines, hence
restoring the "natural line selection" behavior (solving the second problem).
These buttons are hidden by default and get displayed upon hover of content
lines; then upon hover of a button itself, a text inviting followlines section
shows up. This solves the first problem (discoverability) as we now have a
clear visual element indicating that "some action could be perform" (i.e. a
button) and that is self-documented.
In followlines.js, all event listeners are now attached to these <button>
elements. The custom "floating tooltip" element is dropped as <button>
elements are now self-documented through a "title" attribute that changes
depending on preceding actions (selection started or not, in particular).
The new <button> element is inserted in followlines.js script (thus only
visible if JavaScript is activated); it contains a "+" and "-" with a
"diff-semantics" style; upon hover, it scales up.
To find the parent element under which to insert the <button> we either rely
on the "data-selectabletag" attribute (which defines the HTML tag of children
of class="sourcelines" element e.g. <span> for filerevision view and <tr> for
annotate view) or use a child of the latter elements if we find an element
with class="followlines-btn-parent" (useful for annotate view, for which we
have to find the <td> in which to insert the <button>).
On noticeable change in CSS concerns the "margin-left" of span:before
pseudo-elements in filelog view that has been increased a bit in order to
leave space for the new button to appear between line number column and
line content one.
Also note the "z-index" addition for "annotate-info" box so that the latter
appears on top of new buttons (instead of getting hidden).
In some respect, the UI similar to line commenting feature that is implemented
in popular code hosting site like GitHub, BitBucket or Kallithea.
Converting from CVS to Mercurial assumes that CVS log messages in "cvs
rlog" output are encoded in UTF-8 (or basic Latin-1). But cvs itself
is usually unaware of encoding of log messages, in practice.
Therefore, if there are commits, of which log message is encoded in
other than UTF-8, log message of corresponded revisions in the
converted repository will be broken.
To avoid such broken log messages, this patch transcodes CVS log
messages by encoding specified via "convert.cvsps.logencoding"
configuration.
This patch accepts multiple encoding for convenience, because
"multiple encoding mixed in a repository" easily occurs. For example,
UTF-8 (recent POSIX), cp932 (Windows), and EUC-JP (legacy POSIX) are
well known encoding for Japanese.
Currently, sslutil._hostsettings() performs validation that web.cacerts
exists. However, client certificates are passed in to the function
and not all callers may validate them. This includes
httpconnection.readauthforuri(), which loads the [auth] section.
If a missing file is specified, the ssl module will raise a generic
IOException. And, it doesn't even give us the courtesy of telling
us which file is missing! Mercurial then prints a generic
"abort: No such file or directory" (or similar) error, leaving users
to scratch their head as to what file is missing.
This commit introduces explicit validation of all paths passed as
arguments to wrapsocket() and wrapserversocket(). Any missing file
is alerted about explicitly.
We should probably catch missing files earlier - as part of loading
the [auth] section. However, I think the sslutil functions should
check for file presence regardless of what callers do because that's
the only way to be sure that missing files are always detected.
This revset returns all successors, including transit nodes and the source
nodes (to be consistent with existing revsets like "ancestors").
To filter out transit nodes, use `successors(X)-obsolete()`.
To filter out divergent case, use `successors(X)-divergent()-obsolete()`.
The revset could be useful to define rebase destination, like:
`max(successors(BASE)-divergent()-obsolete())`. The `max` is to deal with
splits.
There are other implementations where `successors` returns just one level of
successors, and `allsuccessors` returns everything. I think `successors`
returning all successors by default is more user friendly. We have seen
cases in production where people use 1-level `successors` while they really
want `allsuccessors`. So it seems better to just have one single revset
returning all successors by default to avoid user errors.
In the future we might want to add `depth` keyword argument to it and for
other revsets like `ancestors` etc. Or even build some flexible indexing
syntax [1] to satisfy people having the depth limit requirement.
[1]: https://www.mercurial-scm.org/pipermail/mercurial-devel/2017-July/101140.html
At the interactive update prompt, if (c) is chosen and then followed by `hg rm`,
both `status -R` and `status -S` show the file as 'R', and `files -R` shows no
files (OK, because explicitly removed files aren't supposed to be listed). If
`rm` follows selecting (c), then both flavors of `status` list the file as '!',
and `files -R` lists the missing file. So somehow, the (d) option has followed
a third path.
Well, mostly. The annotation on subrepo functions tacks on a parenthetical to
the abort message, which seems reasonable for a generic mechanism. But now all
messages consistently spell out 'subrepository', and double quote the name of
the repo. I noticed the inconsistency in the change for the last commit.
This simply passes the 'missing' argument down from the context of the parent
repo, so the same rules apply. subrepo.bailifchanged() is hardcoded to care
about missing files, because cmdutil.bailifchanged() is too.
In the end, it looks like this addresses inconsistencies with 'archive',
'identify', blackbox logs, 'merge', and 'update --check'. I wasn't sure how to
implement this in git, so that's left for someone more familiar with it.
Since the identify command adds a '+' for missing files, it's reasonable that
this does too. Perhaps the node field's hex value should be p1+p2 for merges?
This is a continuation of 5ba3f753c9b1. I overlooked that blackbox logs also
have a dirty marker. Also, the `hg update --check` test was updating to a
revision where the deleted file wasn't tracked, which is why status seemed to
show the deleted file was restored.
The regexes are passed to re.match(), which matches against the
beginning of the input, so the '^' doesn't do anything.
Note that unrooted patterns, such as globs and regexes from .hgignore
are instead achieved by adding '.*' to the expression given by the
user. (That's unless the user's expression started with '^', in which
case the '.*' is not added, perhaps to keep the regex cleaner?)
This is more consistent with other commands, like "commit -v" won't show
bookmark movement messages.
It will make migrating to scmutil.cleanupnodes easier.
I'm not sure if the difference on Windows for test-sparse.t is expected or not.
It looks like unless the leading '/' is followed by a drive letter, '/' is
resolved to 'C:/MinGW/msys/1.0'. But both cases abort with "not under root"
instead of just warning.
Previously repo.anyrevs only expand aliases in [revsetalias] config. This
patch makes it more flexible to accept a customized dict defining aliases
without having to couple with ui.
revsetlang.expandaliases now has the signature (tree, aliases, warn=None)
which is more consistent with templater.expandaliases. revsetlang.py is now
free from "ui", which seems to be a good thing.
When unquoted, MSYS sees the colon between the drive letter and path as a Unix
path separator and unhelpfully splits on it, feeding only the drive letter as
the command. Much chaos ensues.
I vaguely remember trying to get the test runner to use /letter/path/to/exe
syntax the last time this happened, without success. I doubt a check-code rule
would work, since sometimes it is quoted, and sometimes the quotes are escaped.
This patch migrates rebase to use scmutil.cleanupnodes API. It simplifies
the code and makes rebase code reusable inside a transaction.
This is a BC because the backup file is no longer strip-backup/*-backup.hg,
but strip-backup/*-rebase.hg. The latter looks more reasonable since the
directory name is "strip-backup" so there is no need to repeat "backup".
I think the backup file name change is probably fine as a BC, since we have
changed it before (2e51c9a7a08f) and didn't get complains. The end result
of this series will be a much more consistent and unified backup names:
command | old backup file suffix | new backup file suffix
-------------------------------------------------------------------
amend | amend-backup.hg | amend.hg
histedit | backup.hg (could be 2 files) | histedit.hg (single file)
rebase | backup.hg | rebase.hg
strip | backup.hg | backup.hg
(note: backup files are under .hg/strip-backup)
It also fixes issue5606 as a side effect because the new "delayedstrip" code
path will carefully examine nodes (safestriproots) to make sure orphaned
changesets won't get stripped by accident.
Some warning messages are changed to the new "warning: orphaned descendants
detected, not stripping HASHES", which provides more information about
exactly what changesets are left behind.
Another minor behavior change is when there is an obsoleted changeset with a
successor in the destination branch, bookmarks pointing to that obsoleted
changeset will not be moved. I have commented in test-rebase-obsolete.t
explaining why that is more desirable.
cleanupnodes takes care of bookmark movement, and bookmark movement could
cause bookmark divergent resolution as a side effect. This patch adds such
bookmark divergent resolution logic so future rebase migration will be
easier.
The revset is carefully written to be equivalent to what rebase does today.
Although I think it might make sense to remove divergent bookmarks more
aggressively, for example:
F book@1
|
E book@2
|
| D book
| |
| C
|/
B book@3
|
A
When rebase -s C -d E, "book@1" will be removed, "book@3" will be kept,
and the end result is:
D book
|
C
|
F
|
E book@2 (?)
|
B book@3
|
A
The question is should we keep book@2? The current logic keeps it. If we
choose not to (makes some sense to me), the "deleterevs" revset could be
simplified to "newnode % oldnode".
For now, I just make it compatible with the existing behavior. If we want to
make the "deleterevs" revset simpler, we can always do it in the future.
In some valid usecases, the "mapping" received by scmutil.cleanupnodes have
filtered nodes. Use unfiltered repo to access them correctly.
The added test case will fail with the old cleanupnodes code.
This is important to migrate histedit to use the cleanupnodes API.
Aliases define optional alternatives to existing options. For example the old
option ui.user was deprecated and replaced by ui.username. With this mechanism,
it's even possible to create an alias to an option in a different section.
Add ui.user as alias to ui.username as an example of this concept.
The old alternates principle in ui.config is removed as it was used only for
this option.
If the matching command lives in an in-tree extension (which is all we
scan for), and the user has disabled that extension with
"extensions.<name>=!", we were not finding it, because the path in
_disabledextensions was the empty string. If the user had set
"extensions.<name>=!<valid path>" it would work, so it seems like just
a mistake that it didn't work.
This includes one test showing how disabling a command with e.g.
"extensions.rebase=!" results in the command not being
suggested. We'll fix that next.
This is a pretty straightforward move of the code.
I converted the "force" argument to a keyword argument.
Like other recent changes, this code is tightly coupled with
working directory update code in merge.py. I suspect the code
will become more tightly coupled over time, possibly even moved
to merge.py. For now, let's get the code in core.
The delaypush() function had a reference to "repo" that was clearly
supposed to be "pushop.repo". Instead of just fixing that, let's
extract "pushop.repo.ui" to a variable, since that's the only
piece of the repo that's needed in the function.
I have not looked into why I saw a different result in the test to
start with, but that's for another patch anyway.
The code was refactored during the move to be more procedural
instead of using string formatting. This has the benefit of not
writing empty sections, which changed tests.
Sparse checkout is still highly experimental and not protected
by BC guarantees yet. We also haven't had a discussion on the UX.
To discourage use, we rename the sparse command to debugsparse.
Facebook has developed an extension to enable "sparse" checkouts -
a working directory with a subset of files. This feature is a critical
component in enabling repositories to scale to infinite number of
files while retaining reasonable performance. It's worth noting
that sparse checkout is only one possible solution to this problem:
another is virtual filesystems that realize files on first access.
But given that virtual filesystems may not be accessible to all
users, sparse checkout is necessary as a fallback.
Per mailing list discussion at
https://www.mercurial-scm.org/pipermail/mercurial-devel/2017-March/095868.html
we want to add sparse checkout to the Mercurial distribution via
roughly the following mechanism:
1. Vendor extension as-is with minimal modifications (this patch)
2. Refactor extension so it is more clearly experimental and inline
with Mercurial practices
3. Move code from extension into core where possible
4. Drop experimental labeling and/or move feature into core
after sign-off from narrow clone feature owners
This commit essentially copies the sparse extension and tests
from revision 71e0a2aeca92a4078fe1b8c76e32c88ff1929737 of the
https://bitbucket.org/facebook/hg-experimental repository.
A list of modifications made as part of vendoring is as follows:
* "EXPERIMENTAL" added to module docstring
* Imports were changed to match Mercurial style conventions
* "testedwith" value was updated to core Mercurial special value and
comment boilerplate was inserted
* A "clone_sparse" function was renamed to "clonesparse" to appease
the style checker
* Paths to the sparse extension in tests reflect built-in location
* test-sparse-extensions.t was renamed to test-sparse-fsmonitor.t
and references to "simplecache" were removed. The test always skips
because it isn't trivial to run it given the way we currently run
fsmonitor tests
* A double empty line was removed from test-sparse-profiles.t
There are aspects of the added code that are obviously not ideal.
The goal is to make a minimal number of modifications as part of
the vendoring to make it easier to track changes from the original
implementation. Refactoring will occur in subsequent patches.
In blockdescendants(), we had an assertion when line range of a merge
changeset was not consistent depending on which parent was considered for
computation. For instance, this might occur when file content (in lookup
range) is significantly different between parent branches of the merge as
demonstrated in added tests (where we almost completely rewrite the "baz" file
while also introducing similarities with its content in the other branch we
later merge to).
Now, in such case, we combine line ranges from all parents by storing the
envelope of both line ranges. This is conservative (the line range is
extended, possibly unnecessarily) but at least this should avoid missing
descendants with changes in a range that would fall in that of one parent but
not in another one (the case of "baz: narrow change (2->2+)" changeset in
tests).
Advancing mtime by os.utime() fails for EPERM, if the target file is
owned by another. 0d920bcb0fd1 and related changes made some code
paths give advancing mtime up in such case, to fix issue5418.
This causes file stat ambiguity (again), if it is owned by another.
https://www.mercurial-scm.org/wiki/ExactCacheValidationPlan
To avoid file stat ambiguity in such case, especially for
.hg/dirstate, c75c7b3e3284 made vfs.rename() copy the target file, and
advance mtime of renamed one again, if EPERM (see issue5584 for detail).
But straightforward "copy if EPERM" isn't reasonable for truncation of
append-only files at rollbacking, because rollbacking might cost much
for truncation of many filelogs, even though filelogs aren't
filecache-ed.
Therefore, this patch introduces blacklist "checkambigfiles", and
avoids file stat ambiguity only for files specified in this blacklist.
This patch consists of two parts below, which should be applied at
once in order to avoid regression.
- specify 'checkambig=True' at vfs.open(mode='a') in _playback()
according to checkambigfiles
- invoke _playback() with checkambigfiles
- add transaction.__init__() checkambigfiles argument, for _abort()
- make localrepo instantiate transaction with _cachedfiles
- add rollback() checkambigfiles argument, for "hg rollback/recover"
- make localrepo invoke rollback() with _cachedfiles
After this patch, straightforward "copy if EPERM" will be reasonable
at closing the file opened with checkambig=True, because this policy
is applied only on files, which are listed in blacklist
"checkambigfiles".
Add a 'successorssets' template that returns the list of all closest known
sucessorssets for a changectx. The elements of the list are changesets.
The "closest successors" are the first locally known revisions encountered
while, walking successors markers. It uses successorsets previously modified
to support the closest argument.
This logic respect repository filtering. So hidden revision will be skipped by
this logic unless --hidden is specified. Since we only display the visible
predecessors, this template will not display anything in most case. It makes a
good candidate for inclusion in the default log output.
I updated the test-obsmarker-template.t test file introduced with the
predecessors template to test successorssets template.
We add new tests for improving the coverage of existing obs-markers related
template (predecessors) and the new one we are introducing (successorssets).
Add a closest argument to successorssets changing the definition of latest
successors.
With "closest=false" (current behavior), latest successors are "leafs" on the
obsmarker graph. They don't have any successor and are known locally.
With "closest=true", latest successors are the closest locally-known
changesets that are visible in the repository or repoview. Closest successors
can be then obsolete, orphan.
This will be used in a later patch to show the closest successor of
changesets with the successorssets template.
These are some simple cases. More to come in a future change.
Reviewers: krbullock
Reviewed By: krbullock
Differential Revision: https://phab.mercurial-scm.org/D4
This avoids some false positives in an upcoming check-code rule.
Reviewers: krbullock
Reviewed By: krbullock
Differential Revision: https://phab.mercurial-scm.org/D3
This is a first basic visible usage of the changes tracking in the transaction.
We adds a new function computing the pre-existing changesets obsoleted by a
transaction and a transaction call back displaying this information.
Example output:
added 1 changesets with 1 changes to 1 files (+1 heads)
3 new obsolescence markers
obsoleted 1 changesets
The goal is to evolve the transaction summary into something bigger, gathering
existing output there and adding new useful one. This patch is a good first step
on this road. The new output is basic but give a user to the content of
tr.changes['obsmarkers'] and give an idea of the new options we haves. I expect
to revisit the message soon.
The caller recording the transaction summary should also be moved into a more
generic location but further refactoring is needed before it can happen.
Repository cloned-bookmark-default and tobundle exist in the working
directory of main test repository "repo". We should take care for
them, because it is known issue that fsmonitor can't handle nested
repositories.
These nested repositories are cloned from "repo", and the number of
unknown files = files in these repositories (including files under
.hg) will be changed easily in the future. But testing with fsmonitor
is not ordinary.
Therefore, test-bookmarks.t with fsmonitor might be broken silently.
This is reason why this patch uses "(glob)" for the number of unknown
files in "hg summary" output.
BTW, this patch doesn't use .hgignore to make test portable, because
.hgignore might cause another issue related to "walk_on_invalidate"
configuration of fsmonitor.
The general delta heuristic to select a delta do not scale with the number of
branch. The delta base is frequently too far away to be able to reuse a chain
according to the "distance" criteria. This leads to insertion of larger delta (or
even full text) that themselves push the bases for the next delta further away
leading to more large deltas and full texts. This full text and frequent
recomputation throw Mercurial performance in disarray.
For example of a slightly large repository
280 000 files (2 150 000 versions)
430 000 changesets (10 000 topological heads)
Number below compares repository with and without the distance criteria:
manifest size:
with: 21.4 GB
without: 0.3 GB
store size:
with: 28.7 GB
without 7.4 GB
bundle last 15 00 revisions:
with: 800 seconds
971 MB
without: 50 seconds
73 MB
unbundle time (of the last 15K revisions):
with: 1150 seconds (~19 minutes)
without: 35 seconds
Similar issues has been observed in other repositories.
Adding a new option or "feature" on stable is uncommon. However, given that this
issues is making Mercurial practically unusable, I'm exceptionally targeting
this patch for stable.
What is actually needed is a full rework of the delta building and reading
logic. However, that will be a longer process and churn not suitable for stable.
In the meantime, we introduces a quick and dirty mitigation of this in the
'experimental' config space. The new option introduces a way to set the maximum
amount of memory usable to store a diff in memory. This extend the ability for
Mercurial to create chains without removing all safe guard regarding memory
access. The option should be phased out when core has a more proper solution
available.
Setting the limit to '0' remove all limits, setting it to '-1' use the default
limit (textsize x 4).
The bundled hg should work flawlessly in most cases. Make it depend on
the external installation only if necessary since we can't control the
whole environment.
This patch doesn't implement the "exit 80" idea proposed by Jun. I don't
want to keep the capability checking sync with the actual tests.
Since os.environ may be overridden in run-tests.py, several important
variables such as PATH weren't restored.
I don't like the idea of using the system hg *by default* because the
executable and the configs are out of our control. But I don't mind as
long as the tests pass.
People often want to know what they are working on *now*. As part of
this, they also commonly want to know how that work is related to other
changesets in the repo so they can perform common actions like rebase,
histedit, and merge.
`hg show work` made headway into this space. However, it is geared
towards a complete repo view as opposed to just the current line of
work. If you have a lot of in-flight work or the repo has many heads,
the output can be overwhelming. The closest thing Mercurial has to
"show me the current thing I'm working on" that doesn't require custom
revsets is `hg qseries`. And this requires MQ, which completely changes
workflows and repository behavior and has horrible performance on large
repos. But as sub-optimal as MQ is, it does some things right, such as
expose a model of the repo that is easy for people to reason about.
This simplicity is why I think a lot of people prefer to use MQ, despite
its shortcomings.
One common development workflow is to author a series of linear
changesets, using bookmarks, branches, anonymous heads, or even topics
(3rd party extension). I'll call this a "stack." You periodically
rewrite history in place (using `hg histedit`) and reparent the stack
against newer changesets (using `hg rebase`). This workflow can be
difficult because there is no obvious way to quickly see the current
"stack" nor its relation to other changesets. Figuring out arguments to
`hg rebase` can be difficult and may require highlighting and pasting
multiple changeset nodes to construct a command.
The goal of this commit is to make stack based workflows simpler
by exposing a view of the current stack and its relationship to
other releant changesets, notably the parent of the base changeset
in the stack and newer heads that the stack could be rebased or merged
into.
Introduced is the `hg show stack` view. Essentially, it finds all
mutable changesets from the working directory revision in both
directions, stopping at a merge or branch point. This limits the
revisions to a DAG linear range.
The stack is rendered as a concise list of changesets. Alongside the
stack is a visualization of the DAG, similar to `hg log -G`.
Newer public heads from the branch point of the stack are rendered
above the stack. The presence of these heads helps people understand
the DAG model and the relationship between the stack and changes made
since the branch point of that stack. If the "rebase" command is
available, a `hg rebase` command is printed for each head so a user
can perform a simple copy and paste to perform a rebase.
This view is alpha quality. There are tons of TODOs documented
inline. But I think it is good enough for a first iteration.
Not only is the output of these commands inconsistent with respect to each
other when a file is deleted, they are internally inconsistent depending upon
whether the deleted file is in the top level repo or a subrepo. It seemed
easier to show the problems, rather than describe them. The original goal was
to fix the summary command with respect to deleted files. I haven't fixed any
of the other issues yet, in case anybody believes the current subrepo behavior
is correct.
I think a natural understanding of clean/dirty is that they are two opposite
values of a single binary repo state. If `hg update --clean -r .` changes a
file, then naturally that repo was dirty, and `hg update --check` should have
blocked it. Deleted files are special, in that they don't block a commit. But
they make the filesystem content not the same as a clean checkout.
The ignore regular expression has been updated to detect
"inconsistent config." If present, we track which configs have
that set and we suppress the conflicting defaults error for those
options.
I also added named groups to the regexp to aid readability.
A comment was added to profiling.py to make a desired inconsistent
value error go away.
External hooks end up launching cmd.exe, which knows nothing about $VAR syntax.
For some reason, I thought that Mercurial would substitute in the value, in
order to paper over the platform difference. But I can't find that in the
documentation, and there's at least one other use of this pattern [1].
[1] https://www.mercurial-scm.org/repo/hg/file/tip/tests/test-histedit-fold.t#l477
Extensions sometimes wants to add other information in the default log output
format (when no templating is used).
Add an empty function named '_exthook' for easing the extension life.
Extensions will be able to wrap this function and collaborate to display
additional information.
Exthook is called after displaying troubles and just before displaying the
files, extra and description.
Add a new test file to test it and not pollute other test files.
This patch adds special comment handling so one can create obsmarkers in
drawdag comments like "# replace: A -> B -> C", "# prune: X, Y, Z",
"split: P -> M, N" and they are just self-explained.
Update the code to correctly anchor the expression on the end of the name, to
require that the entire name match this expression. It was already anchored at
the start by using re.match(), but this does not anchor it at the end.
The 4.2 release introduces a regression regarding the behavior of rebase with
some hook failures. We add the tests from the bug report from Henrik Stuart to
our test base to prevent further regression on this.
Having a single transaction for rebase means the whole transaction gets rolled back
on error. To work around this a small hack has been added to detect merge
conflict and commit the work done so far before exiting. This hack works because
there is nothing transaction related going on during the merge phase.
However, if a hook blocks the rebase to create a changeset, it is too late to commit the
work done in the transaction before the problematic changeset was created. This
leads to the whole rebase so far being rolled back. Losing merge resolution and
other work in the process. (note: rebase state will be fully lost too).
Since issue5610 is a pretty serious regression and the next stable release is a
couple day away, we are taking the backout route until we can figure out
something better to do.
In the process of fixing issue5610 in 4.2.2, we are trying to backout
507f16f4aa51. This changeset is making changes that depend on 507f16f4aa51,
so we need to back it out first.
Since issue5610 is pretty serious regression and the next stable release is a
couple of days away, we are taking the backout route until we can figure out
something better to do.
As part of using `hg show` in my daily workflow, I've found it slightly
annoying to have to type full view names, complete with a space. I've
locally registered an alias for "swork = show work."
I think others will have this same complaint and could benefit from
some automation to streamline the creation of aliases. So, this
commit introduces a config option that allows `hg show` views to be
automatically aliased using a given prefix. e.g. a value of "s"
will automatically register "swork" and "sbookmarks." Multiple
values can be given for ultimate flexibility. This arguably isn't
needed now. But since we don't register aliases if there will be
a collision and we're bound to have a collision, it makes sense to
allow multiple prefixes so specific views can avoid collisions by
using different prefixes.
Extensions can have a 'configtable' mapping and use
'registrar.configitem(table)' to retrieve the registration function.
This behave in the same way as the other way for extensions to register new
items (commands, colors, etc).
Update the syshgenv function to attempt to completely restore the original
environment, rather than only updating a few specific variables. run_tests.py
now generates a shell script that can be used to restore the original
environment, and syshgenv sources it.
This is a bit more complicated than the previous code, but should do a better
job of running the system hg in the correct environment.
I've tested it on Linux using python 2.x, but let me know if it causes issues
in other environments. I'm not terribly familiar with how the tests get run on
Windows, for instance, and how the environment needs to be updated there.
Ancient hg does not have "hg files" so test-check-*.t will fail with
"unknown command 'files'":
$ hg files
hg: unknown command 'files'
$ hg --version
Mercurial Distributed SCM (version 2.6.2)
Test "hg files" and give up using syshg if it does not have "files" command.
Most test scripts use "hg" to interact with a temporary test repository.
However a few tests also want to run hg commands to interact with the local
repository containing the mercurial source code. Notably, many of the
test-check-* tests want to check local files and commit messages.
These tests were previously using the version of hg being tested to query the
source repository. However, this will fail if the source repository requires
extensions or other settings not supported by the version of mercurial being
tested. The source repository was typically initially cloned using the system
hg installation, so we should use the system hg installation to query it.
There was already a helpers-testrepo.sh script designed to help cope with
different requirements for the source repository versus the test repositories.
However, it only handled the evolve extension. This new behavior works with
any extensions that are different between the system installation and the test
installation.
When running the tests, define ORIG_PATH and ORIG_PYTHONPATH environment
variables that contain the original contents of PATH and PYTHONPATH, before
they were modified by run-tests.py
This will make it possible for tests to refer to the original contents of these
variables if necessary. In particular, this is necessary for invoking the
correct version of hg for examining the local repository (the mercurial
repository itself, not the temporary test repositories). Various tests examine
the local repository to check the file lists and contents of commit messages.
It's now common that an old node gets replaced by zero or more new nodes,
that could happen with amend, rebase, histedit, etc. And it's a common
requirement to do bookmark movements, strip or obsolete nodes and even
moving working copy parent.
Previously, amend, rebase, history have their own logic doing the above.
This patch is an attempt to unify them and future code.
This enables new developers to be able to do "replace X with Y" thing
correctly, without any knowledge about bookmarks, strip or obsstore.
The next step will be migrating rebase to the new API, so it works inside a
transaction, and its code could be simplified.
For long, the fact that strip does not work inside a transaction and some
code has to work with both obsstore and fallback to strip lead to duplicated
code like:
with repo.transaction():
....
if obsstore:
obsstore.createmarkers(...)
if not obsstore:
repair.strip(...)
Things get more complex when you want to call something which may call strip
under the hood. Like you cannot simply write:
with repo.transaction():
....
rebasemod.rebase(...) # may call "strip", so this doesn't work
But you do want rebase to run inside a same transaction if possible, so the
code may look like:
with repo.transaction():
....
if obsstore:
rebasemod.rebase(...)
obsstore.createmarkers(...)
if not obsstore:
rebasemod.rebase(...)
repair.strip(...)
That's ugly and error-prone. Ideally it's possible to just write:
with repo.transaction():
rebasemod.rebase(...)
saferemovenodes(...)
This patch is the first step towards that. It adds a "delayedstrip" method
to repair.py which maintains a postclose callback in the transaction object.
This is naive implementation using two-pass scanning. Tracking descendants
isn't an easy problem if both start and stop depths are specified. It's
impractical to remember all possible depths of each node while scanning from
roots to descendants because the number of depths explodes. Instead, we could
cache (min, max) depths as a good approximation and track ancestors back when
needed, but that's likely to have off-by-one bug.
Since this implementation appears not significantly slower, and is quite
straightforward, I think it's good enough for practical use cases. The time
and space complexity is O(n) ish.
revisions:
0) 1-pass scanning with (min, max)-depth cache (worst-case quadratic)
1) 2-pass scanning (this version)
repository:
mozilla-central
# descendants(0) (for reference)
*) 0.430353
# descendants(0, depth=1000)
0) 0.264889
1) 0.398289
# descendants(limit(tip:0, 1, offset=10000), depth=1000)
0) 0.025478
1) 0.029099
# descendants(0, depth=2000, startdepth=1000)
0) painfully slow (due to quadratic backtracking of ancestors)
1) 1.531138
Prepares for adding depth support. I want to process depth=0 in
revdescendants() to make things simpler.
only() also calls dagop.revdescendants(), but it filters out root revisions
explicitly. So this should cause no problem.
# descendants(0) using hg repo
0) 0.052380
1) 0.051226
# only(tip) using hg repo
0) 0.001433
1) 0.001425