Prior to this the return value was potentially None, a string, or a
list of strings. It now always returns a list of strings where each
string is always only one email address.
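A minimal sketch of that kind of normalization. The helper name and the
comma-separated input are assumptions made for illustration, not the
function changed by this patch, and the naive split does not handle
quoted display names that contain commas:

```python
def normalize_addresses(value):
    """Return a list of strings, each holding exactly one address.

    Hypothetical helper: accepts None, a single (possibly
    comma-separated) string, or a list of such strings.
    """
    if value is None:
        return []
    if isinstance(value, str):
        value = [value]
    addresses = []
    for item in value:
        # Assumption: several addresses in one string are comma-separated.
        for part in item.split(","):
            part = part.strip()
            if part:
                addresses.append(part)
    return addresses
```

Callers can then iterate over the result without checking its type first.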
watchman's paths encoding can differ from filesystem encoding. For example,
on Windows, it's always utf-8.
Before this patch, on Windows, mismatch in path comparison between fsmonitor
state and osutil.statfiles would yield a clean status for added/modified files.
In addition to status reporting wrong results, this leads to files being
discarded from changesets while doing history editing operations such as rebase.
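The conversion described above can be sketched roughly as follows. This is
an illustration only: the function name, its signature and the strict
error handling are assumptions, not the code added to fsmonitor:

```python
import sys

# Per the text above, watchman reports paths as utf-8 on Windows.
WATCHMAN_ENCODING = "utf-8"


def watchman_to_fs_encoding(path, fs_encoding=None):
    """Re-encode a bytes path from watchman's encoding to the filesystem's.

    Hypothetical sketch: raises UnicodeError if the path cannot be
    represented in the target encoding.
    """
    if fs_encoding is None:
        fs_encoding = sys.getfilesystemencoding()
    if fs_encoding.lower().replace("_", "-") == WATCHMAN_ENCODING:
        # Same encoding on both sides: nothing to convert.
        return path
    return path.decode(WATCHMAN_ENCODING).encode(fs_encoding)
```

Comparing paths only after such a conversion avoids the mismatch between
the two sources of file names.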
Benchmark:
There is a little overhead at module import:
python -m timeit "import hgext.fsmonitor"
Windows before patch: 1000000 loops, best of 3: 0.563 usec per loop
Windows after patch: 1000000 loops, best of 3: 0.583 usec per loop
Linux before patch: 1000000 loops, best of 3: 0.579 usec per loop
Linux after patch: 1000000 loops, best of 3: 0.588 usec per loop
10000 calls to _watchmantofsencoding:
python -m timeit -s "from hgext.fsmonitor import _watchmantofsencoding, _fixencoding" "fname = '/path/to/file'" "for i in range(10000):" " if _fixencoding: fname = _watchmantofsencoding(fname)"
Windows (_fixencoding is True): 100 loops, best of 3: 19.5 msec per loop
Linux (_fixencoding is False): 100 loops, best of 3: 3.08 msec per loop
Mixing bytes and unicode creates a mess. Do things in bytes as much as possible.
New sysbytes() helper only takes care of ASCII characters, but avoids raising
nasty unicode exception. This is the same design principle as sysstr().
Currently, Mercurial has a number of commands to show information. And,
there are features coming down the pipe that will introduce more
commands for showing information.
Currently, when introducing a new class of data or a view that we
wish to expose to the user, the strategy is to introduce a new command
or overload an existing command, sometimes both. For example, there is
a desire to formalize the wip/smartlog/underway/mine functionality that
many have devised. There is also a desire to introduce a "topics"
concept. Others would like views of "the current stack." In the
current model, we'd need a new command for wip/smartlog/etc (that
behaves a lot like a pre-defined alias of `hg log`). For topics,
we'd likely overload `hg topic[s]` to both display and manipulate
topics.
Adding new commands for every pre-defined query doesn't scale well
and pollutes `hg help`. Overloading commands to perform read-only and
write operations is arguably a UX anti-pattern: while having all
functionality for a given concept in one command is nice, having a
single command doing multiple discrete operations is not. Furthermore,
a user may be surprised that a command they thought was read-only
actually changes something.
We discussed this at the Mercurial 4.0 Sprint in Paris and decided that
having a single command where we could hang pre-defined views of
various data would be a good idea. Having such a command would:
* Help prevent an explosion of new query-related commands
* Create a clear separation between read and write operations
(mitigates footguns)
* Avoid overloading the meaning of commands that manipulate data
(bookmark, tag, branch, etc) (while we can't take away the
existing behavior for BC reasons, we now won't introduce this
behavior on new commands)
* Allow users to discover informational views more easily by
aggregating them in a single location
* Lower the barrier to creating the new views (since the barrier
to creating a top-level command is relatively high)
So, this commit introduces the `hg show` command via the "show"
extension. This command accepts a positional argument of the
"view" to show. New views can be registered with a decorator. To
prove it works, we implement the "bookmarks" view, which shows a
table of bookmarks and their associated nodes.
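To illustrate the registration idea, here is a minimal sketch of a
decorator-based view registry. All names here (VIEWS, showview, show) are
hypothetical and the bookmarks view takes a plain dict, so this is not the
extension's actual API. It also truncates nodes to a fixed 5 characters
instead of using shortest():

```python
VIEWS = {}  # hypothetical registry: view name -> implementing function


def showview(name):
    """Decorator registering a function as the implementation of a view."""
    def decorator(func):
        if name in VIEWS:
            raise ValueError("view already registered: %s" % name)
        VIEWS[name] = func
        return func
    return decorator


@showview("bookmarks")
def showbookmarks(bookmarks):
    """Return table rows of bookmark names and truncated nodes."""
    if not bookmarks:
        return []
    width = max(len(name) for name in bookmarks)
    return ["%s  %s" % (name.ljust(width), node[:5])
            for name, node in sorted(bookmarks.items())]


def show(view, *args):
    """Dispatch on the positional "view" argument."""
    if view not in VIEWS:
        raise LookupError("unknown view: %s" % view)
    return VIEWS[view](*args)
```

Adding a view then only requires writing a decorated function, not a new
top-level command.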
We introduce a new style to hold everything used by `hg show`.
For our initial bookmarks view, the output varies from `hg bookmarks`:
* Padding is performed in the template itself as opposed to Python
* Revision integers are not shown
* shortest() is used to display a 5 character node by default (as
opposed to static 12 characters)
I chose to implement the "bookmarks" view first because it is simple
and shouldn't invite too much bikeshedding that detracts from the
evaluation of `hg show` itself. But there is an important point
to consider: we now have 2 ways to show a list of bookmarks. I'm not
a fan of introducing multiple ways to do very similar things. So it
might be worth discussing how we wish to tackle this issue for
bookmarks, tags, branches, MQ series, etc.
I also made the choice of explicitly declaring the default show
template not part of the standard BC guarantees. History has shown
that we make mistakes and poor choices with output formatting but
can't fix these mistakes later because random tools are parsing
output and we don't want to break these tools. Optimizing for human
consumption is one of my goals for `hg show`. So, by not covering
the formatting as part of BC, the barrier to future change is much
lower and humans benefit.
There are some improvements that can be made to formatting. For
example, we don't yet use label() in the templates. We obviously
want this for color. But I'm not sure if we should reuse the existing
log.* labels or invent new ones. I figure we can punt that to a
follow-up.
At the aforementioned Sprint, we discussed and discarded various
alternatives to `hg show`.
We considered making `hg log <view>` perform this behavior. The main
reason we can't do this is because a positional argument to `hg log`
can be a file path and if there is a conflict between a path name and
a view name, behavior is ambiguous. We could have introduced
`hg log --view` or similar, but we felt that required too much typing
(we don't want to require a command flag to show a view) and wasn't
very discoverable. Furthermore, `hg log` is optimized for showing
changelog data and there are things that `hg show` could display
that aren't changelog centric.
There were concerns about using "show" as the command name.
Some users already have a "show" alias that is similar to `hg export`.
There were also concerns that Git users adapted to `git show` would
be confused by `hg show`'s different behavior. The main difference
here is `git show` prints an `hg export` like view of the current
commit by default and `hg show` requires an argument. `git show`
can also display any Git object. `git show` does not support
displaying more complex views: just single objects. If we
implemented `hg show <hash>` or `hg show <identifier>`, `hg show`
would be a superset of `git show`. Although, I'm hesitant to do that
at this time because I view `hg show` as a higher-level querying
command and there are namespace collisions between valid identifiers
and registered views.
There is also a prefix collision with `hg showconfig`, which is an
alias of `hg config`.
We also considered `hg view`, but that is already used by the "hgk"
extension.
`hg display` was also proposed at one point. It has a prefix collision
with `hg diff`. General consensus was "show" or "view" are the best
verbs. And since "view" was taken, "show" was chosen.
There are a number of inline TODOs in this patch. Some of these
represent decisions yet to be made. Others represent features
requiring non-trivial complexity. Rather than bloat the patch or
invite additional bikeshedding, I figured I'd document future
enhancements via TODO so we can get a minimal implementation landed.
Something is better than nothing.
BTW, C implementation of hexdigest() for SHA-1/256/512 returns hex
hash in lower case, and doctest in Python standard hashlib assumes
that, too. But it isn't explicitly described in API document or so.
Therefore, we can't assume that hexdigest() always returns hex hash in
lower case, for any hash algorithms, on any Python runtimes and
versions.
From point of view of that, it is reasonable for portability that
77f8c025a6ef applies lower() on hex hash in overridefilemerge().
But on the other hand, in largefiles extension, there are still many
code paths comparing between hex hashes or storing hex hash into
standin file, without lower().
Switching to hash algorithm other than SHA-1 may be a good chance to
clarify our policy about hexdigest()-ed hash value string.
- assume that hexdigest() always returns hex hash in lower case, or
- apply lower() on hex hash in appropriate layers to ensure
lower-case-ness of it for portability
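The second option can be sketched as below. The helper names are
hypothetical and this is not code from the largefiles extension. It only
shows normalizing with lower() at the point where a hash is produced or
compared:

```python
import hashlib


def normalized_hexdigest(data, algorithm="sha1"):
    """Hash bytes and force the hex string to lower case.

    lower() is a no-op where hexdigest() already returns lower case,
    so the normalization is harmless on such runtimes.
    """
    return hashlib.new(algorithm, data).hexdigest().lower()


def same_hash(a, b):
    """Compare two hex hashes without relying on their letter case."""
    return a.lower() == b.lower()
```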
As the name describes, the 2nd argument 'revorctx' of copytostore()
can accept non-changectx value, for historical reasons.
But, since e91ac285f700, copyalltostore(), the only one copytostore()
client in Mercurial source tree, always passes changectx as
'revorctx'.
Therefore, it is reasonable to make copytostore() accept only
changectx as the 2nd argument, now.
AFAIK, 'uploaded' argument of copytostore() (or copytocache(), before
renaming at e2d2a21b7e90) has never been used on either the caller or
the callee side, since official release of bundled largefiles extension.
copyalltostore(), only one caller of copytostore(), already knows
standin file name of the target largefile. Therefore, passing it to
copytostore() is more efficient than calculating it in copytostore()
or readstandin().
This will be used to centralize and encapsulate the logic to read hash
from given (filectx of) standin file. readstandin() isn't suitable for
this purpose, because there are some code paths, which want to read
hex hash directly from filectx.
Previously, the pull would succeed, but the subsequent rebase would fail due
to the rebase.requiredest flag. Now abort earlier with a more useful error
message.
This adds an explicit active-bookmark-handling logic
to shelve. Traditional shelve handles it by transaction aborts,
but it is a bit ugly and having an explicit functionality
seems better.
Before this patch, updatestandin() takes "standin" argument, and
applies splitstandin() on it to pick out a path to largefile (aka
"lfile" or so) from standin.
But in fact, all callers already know "lfile". In addition to it,
many callers know both "standin" and "lfile".
Therefore, making updatestandin() take only one of "standin" or
"lfile" is inefficient.
repo['.'] is called not as "working context" but as "parent context".
In this code path, hash value of current content of file should be
compared against hash value recorded in "parent context".
Therefore, "wctx" may cause misunderstanding in this case.
Before this patch, this code path contains two loops for m._files: one
for replacement with standin, and another for elimination of None,
which comes from previous replacement ("standin in wctx or
lfdirstate[f] == 'r'" case in tostandin()).
These two loops can be unified into simple one "for" loop.
Updating standin for newly added largefile is needed, only if same
name largefile exists in destination context at linear merging. In
such case, updated standin is used to detect divergence of largefile
at overridefilemerge().
Otherwise, standin doesn't have any responsibility for its content
(usually, it is empty).
This patch also renames argument of hexsha1(), not only for
readability ("data" isn't good name for file-like object), but also
for reviewability (including hexsha1() code helps reviewers to confirm
how these functions are similar).
BTW, copyandhash() has also similar logic, but it can't reuse
hexsha1(), because it writes read-in data into specified fileobj
simultaneously.
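For readers comparing the two helpers, here is a rough sketch of the
difference. The function names follow the text, but the bodies,
signatures and chunk size are assumptions rather than the code in
lfutil.py:

```python
import hashlib
import io


def hexsha1(fileobj, chunksize=128 * 1024):
    """Return the SHA-1 hex digest of a binary file-like object."""
    hasher = hashlib.sha1()
    for chunk in iter(lambda: fileobj.read(chunksize), b""):
        hasher.update(chunk)
    return hasher.hexdigest()


def copyandhash(instream, outfile, chunksize=128 * 1024):
    """Copy instream to outfile and return the SHA-1 of the copied data.

    Each chunk is hashed and written in the same pass, which is why
    this loop cannot simply delegate to hexsha1().
    """
    hasher = hashlib.sha1()
    for chunk in iter(lambda: instream.read(chunksize), b""):
        hasher.update(chunk)
        outfile.write(chunk)
    return hasher.hexdigest()
```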
Before 33e44341bb82, histedit (like rebase) only created markers on final
success, from the old rewritten nodes to the newly created nodes. In case of
abort, the nodes created by the aborted attempt were stripped to restore the
repository to its state prior to the attempt.
This use of strip was on purpose. Using markers in this case introduces various
issues. The main one is that keeping the partial result of histedit as obsolete
prevents us from recreating the same nodes in a second attempt. The same
operation will lead to identical results, using an identical node that already
exists in the repository as obsolete.
To conclude, we cannot and should not switch to obsolescence markers creation on
histedit --abort and we backout 33e44341bb82. A test to catch this class of
issue will be introduced in the next changeset.
Now that rebasestate is serialized as part of the transaction, the repo state it
sees is the version at the end of the transaction, which may have hidden nodes.
Therefore, it's possible parts of the rebase commit set are no longer visible by
the time the transaction is closing, which causes a filtered revision error in
this code. I don't think state serialization should be blocked from accessing
commits it knows exist, especially if all it's trying to do is get the hex of
them, so let's use an unfiltered repo here.
Unfortunately, the only known repro is with the fbamend Facebook extension, so
I'm not sure how to repro it in core Mercurial for a test.
There are some code paths, which apply standin() on same value
multiple times instead of using already standin()-ed value.
"fstandin" is common name for "path to standin file" in lfutil.py, to
avoid shadowing "standin()".
readstandin() takes "node" argument to get changectx by "repo[node]".
There are some readstandin() invocations, which use ctx.node(),
ctx.rev(), or '.' as "node" argument above, even though the corresponding
changectx object is already looked up on caller side.
This patch calls readstandin() with already known changectx itself, to
avoid meaningless re-construction of changectx (indirect case via
copytostore() is also included).
BTW, copytostore() uses "rev" argument only for readstandin()
invocation. Therefore, this patch also renames it to "revorctx" to
indicate that it can take not only revision ID or so but also
changectx, for readability.
There are many isstandin() invocations before splitstandin().
The former examines whether specified path starts with ".hglf/". The
latter returns after ".hglf/" of specified path if it starts with that
prefix, or returns None otherwise.
Therefore, value returned by splitstandin() can be used for
replacement of preceding isstandin(), and this replacement can omit
redundant string comparison after isstandin().
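The replacement can be sketched as follows. The two helpers are simplified
stand-ins written from the description above, and largefiles() is a
hypothetical caller:

```python
SHORTNAME = ".hglf"


def isstandin(path):
    """Return True if path is below the standin directory."""
    return path.startswith(SHORTNAME + "/")


def splitstandin(path):
    """Return the part of path after ".hglf/", or None if not a standin."""
    if isstandin(path):
        return path[len(SHORTNAME) + 1:]
    return None


def largefiles(paths):
    """Collect largefile names using a single prefix check per path."""
    result = []
    for path in paths:
        lfile = splitstandin(path)
        # The None check replaces a preceding isstandin() call.
        if lfile is not None:
            result.append(lfile)
    return result
```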
Empty changelist descriptions are valid in Perforce. If we encounter one of
them we are currently running into an IndexError. In case of empty commit
messages set the commit message to **empty changelist description**, which
follows Perforce terminology.
We only have commands.{update,rebase}.requiredest so far. We should
clearly ignore those two if HGPLAIN is in effect, and it seems like we
should ignore any future config that will be added in [commands] since
that is about changing the behavior of commands.
Thanks to Yuya for suggesting to centralize the code in ui.py.
While at it, remove the unnecessary False values passed to
ui.configbool() for the aforementioned config options.
Inspired by the dirstate fix in 39954a8760cd, this should fix any race
conditions with the fsmonitor state changing from underneath.
Since we now grab the wlock for any non-invalidate writes, the only situation
this appears to happen in is with a concurrent invalidation. Test that.
This means that the state will not be written if:
(1) either the wlock can't be obtained
(2) something else came along and changed the dirstate while we were in the
middle of a status run.
Per discussion on the mailing list, we want better release notes
for Mercurial.
This patch introduces an extension that provides a command for
producing release notes files. Functionality is implemented
as an extension because it could be useful outside of the
Mercurial project and because there is some code (like rst
parsing) that already exists in Mercurial and it doesn't make
sense to reinvent the wheel.
The general idea with the extension is that changeset authors
declare release notes in commit messages using rst directives.
Periodically (such as at publishing or release time), a project
maintainer runs `hg releasenotes` to extract release notes
fragments from commit messages and format them to an auto-generated
release notes file. More details are explained inline in docstrings.
There are several things that need to be addressed before this is ready
for prime time:
* Moar tests
* Interactive merge mode
* Implement similarity detection for individual notes items
* Support customizing section names/titles
* Parsing improvements for bullet lists and paragraphs
* Document which rst primitives can be parsed
* Retain arbitrary content (e.g. header section/paragraphs)
from existing release notes file
* Better error messages (line numbers, hints, etc)
Add the -B/--bookmark option to select a bookmark whose changesets
and its ancestors will be selected unless a new bookmark/head is
found.
This is inspired by hg strip -B option.
E.g. tags and bookmarks can reveal revisions that would otherwise be
hidden. A revision can also be revealed because one of its descendants
is visible. Let's use the term "pinned" for the former case
(bookmarks etc.).
In some mercurial workflows, the default destination for rebase does not
always work well and can lead to confusing behavior. With this flag enabled,
every rebase command will require passing an explicit destination, eliminating
this confusion.
We could create a patch of such name, but it wouldn't be processed properly
by mq as parseseries() strips leading/trailing whitespace.
The test of default message (added by 063a2c623014) is no longer useful,
so it is removed.
This issue was reported as:
https://bitbucket.org/tortoisehg/thg/issues/4693/
Since we are introducing obs-based shelve, we are no longer
stripping temporary nodes, we are obsoleting them. Therefore
it looks like stripnodes would be a misleading name, while
prune has a connotation of "strip but with obsolescence", so
nodestoprune seems like a good rename.
Obsolescence-based shelve only needs metadata stored in .hg/shelved
and it feels that this metadata should be stored in a
simplekeyvaluefile format for potential extensibility purposes.
I want to avoid storing it in an unstructured text file where
order of lines determines their semantic meaning (as now
happens in .hg/shelvedstate, .hg/rebasestate and I suspect other
state files as well).
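To show why a key-value layout is easier to extend than position-based
lines, here is a minimal sketch. The functions, the file name and the keys
are hypothetical; this is not Mercurial's simplekeyvaluefile
implementation:

```python
import os
import tempfile


def write_keyvalue(path, data):
    """Write a dict of str -> str as sorted "key=value" lines."""
    lines = []
    for key, value in sorted(data.items()):
        if not key or "=" in key or "\n" in key or "\n" in value:
            raise ValueError("invalid entry: %r" % key)
        lines.append("%s=%s\n" % (key, value))
    with open(path, "w") as fp:
        fp.writelines(lines)


def read_keyvalue(path):
    """Read "key=value" lines back into a dict; line order is irrelevant."""
    result = {}
    with open(path) as fp:
        for line in fp:
            line = line.rstrip("\n")
            if not line:
                continue
            key, value = line.split("=", 1)
            result[key] = value
    return result


with tempfile.TemporaryDirectory() as tmpdir:
    statefile = os.path.join(tmpdir, "shelved-sketch")
    write_keyvalue(statefile, {"node": "0123abcd", "user": "alice"})
    loaded = read_keyvalue(statefile)
```

A reader that looks fields up by key keeps working when a later version
writes additional keys.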
Not included in this series, I have ~30 commits, doubling test-shelve.t
in size and testing almost every tested shelve usecase for obs-shelve.
Here's the series for the curious now: http://pastebin.com/tGJKx0vM
I would like to send it to the mailing list and get accepted as well,
but:
1. it's big, so should I send like 6 patches a time or so?
2. instead of having a commit per test case, it is more like
a commit per some amount of copy-pasted code. I tried to keep
it meaningful and named commits somewhat properly, but it is
far from this list standards IMO. Any advice on how to get it
in without turning it into a 100 commits and spending many
days writing descriptions?
3. it makes test-shelve.t run for twice as long (and it is already
a slow test). Newest test-shelve.t runs for ~1 minute.
Move "cleanupnode" (unsafe strip) into "safecleanupnode" so it's impossible
to call the unsafe function directly.
This helps reduce future programming errors.
The new method will decide between:
- cleanupnode, which calls the unsafe repair.strip
- create obsmarkers
Ideally, nobody calls "cleanupnode" directly except for "safecleanupnode".
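The structure can be sketched like this. The parameters (a use_markers
flag and injected strip/createmarkers callables) are assumptions made to
keep the example self-contained and do not match the real signature:

```python
def safecleanupnode(nodes, use_markers, strip, createmarkers):
    """Clean up nodes, choosing between obsmarkers and strip."""

    def cleanupnode():
        # Unsafe path: nested so that it cannot be called from outside.
        strip(nodes)

    if not nodes:
        return
    if use_markers:
        createmarkers(nodes)
    else:
        cleanupnode()


# Usage with list.extend standing in for the real operations.
stripped, marked = [], []
safecleanupnode(["n1"], False, stripped.extend, marked.extend)
safecleanupnode(["n2"], True, stripped.extend, marked.extend)
```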
Recently we switched rebases to run the entire rebase inside a single
transaction, which dramatically improved the speed of rebases in repos with
large working copies. Let's also move the dirstate into a single dirstateguard
to get the same benefits. This lets us avoid serializing the dirstate after
each commit.
In a large repo, rebasing 27 commits is sped up by about 20%.
I believe the test changes are because us touching the dirstate gave the
transaction something to actually rollback.
This adds an option (which defaults to False) to run entire histedits in a
single transaction. This results in 20-25% faster histedits in large repos where
transaction startup cost is expensive.
I didn't want to enable this by default because it has some unfortunate side
effects. For instance, if a pretxncommit hook throws midway through the
histedit, it will rollback the entire histedit and lose any progress the user
had made. Same if the user aborts editing a commit message. It's still worth
turning this on for large repos, but probably not for normal sized repos.
Long term, once we have inmemory merging, we could do the entire histedit in
memory, without a transaction, then we could selectively rollback just parts of
it in the event of an exception.
Tested it by running the tests with
`--extra-config-opt=histedit.singletransaction=True`. The only failure was
related to the hook rollback issue I mention above.
We only want to pop the action after the action is completed, since if the
action aborts part way through we want it to remain at the front of the list so
continue/abort will start with it.
Previously we relied on the fact that we only serialized the state file at the
beginning of the action, so the pop wasn't serialized until the next iteration
of the loop. In a future patch we will be adding a large transaction around this
area, which means if we pop the list early it might get serialized if the action
throws a user InterventionRequired error, at which point the action is not in
the list anymore. So let's only pop it once the action is really truly done.
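The ordering can be sketched as follows. The exception class and
run_actions() are simplified stand-ins, not histedit's code:

```python
class InterventionRequired(Exception):
    """Stand-in for the error raised when the user must step in."""


def run_actions(actions, run):
    """Run actions in order, popping each only after it has completed."""
    while actions:
        action = actions[0]   # peek, do not pop yet
        run(action)           # may raise InterventionRequired
        actions.pop(0)        # reached only if the action finished
```

If run() raises, the interrupted action is still at the front of the list,
so any state serialized at that point lets continue/abort start with it.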
Changeset 97936471dc8d removed the mutable default value, but did not explicitly
test for None. Such implicit testing can introduce semantic and performance
issues. We move to an explicit test for None as recommended by PEP8:
https://www.python.org/dev/peps/pep-0008/#programming-recommendations
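A small illustration of the semantic difference, using hypothetical
functions rather than the code touched by this changeset:

```python
def collect_implicit(item, seen=None):
    """Implicit test: an empty list passed by the caller is also replaced."""
    if not seen:
        seen = []
    seen.append(item)
    return seen


def collect_explicit(item, seen=None):
    """Explicit test: only a missing argument gets a fresh list."""
    if seen is None:
        seen = []
    seen.append(item)
    return seen


shared_implicit = []
collect_implicit("x", shared_implicit)   # caller's list is left untouched

shared_explicit = []
collect_explicit("x", shared_explicit)   # caller's list receives "x"
```

The implicit version also evaluates the truthiness of the argument, which
for some container types costs more than an identity comparison with None.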
Changeset 33b71926122d removed the mutable default value, but did not explicitly
test for None. Such implicit checking can introduce semantic and performance
issues. We move to an explicit check for None as recommended by PEP8:
https://www.python.org/dev/peps/pep-0008/#programming-recommendations
This patch makes us respect pager.attend again if the extension is
enabled. It also brings back the default attend list, so e.g. summary
is not paged by default, just like it used to be before pager was
moved into core.
The named branch of the leaf changeset can be changed by updating to it,
setting the branch, and amending.
But previously, there was no good way to *just* change the branch of several
linear changes. If rebasing changes with another parent to '.', it would pick
up a pending branch change. But when rebasing changes that have the same
parent, it would fail with 'nothing to rebase', even when the branch name was
set differently.
To fix this, allow rebasing to same parent when a branch has been set.