Previously, the --template/-T option didn't show up in help because it's marked
as experimental. It's not really experimental for show, and its quite important
for show's funcationality, so let's make sure it always shows up.
This is the beginning of a wip/smartlog view. It is basically a manually
constructed (read: fast) revset function to collect "relevant"
changesets combined with a custom template and a graph displayer.
It obviously needs a lot of work.
I'd like to get *something* usable in 4.2 so `hg show` has some value
to end-users.
Let the bikeshedding begin.
Because we're formatting to RST, short lines wrap and there
needs to be an extra line break between paragraphs to prevent
that.
In addition, the indentation in the old code was a bit off.
Refactor the code to a function (so we don't leak variables outside
the module) and modify it so it renders more correctly.
watchman's paths encoding can differ from filesystem encoding. For example,
on Windows, it's always utf-8.
Before this patch, on Windows, mismatch in path comparison between fsmonitor
state and osutil.statfiles would yield a clean status for added/modified files.
In addition to status reporting wrong results, this leads to files being
discarded from changesets while doing history editing operations such as rebase.
Benchmark:
There is a little overhead at module import:
python -m timeit "import hgext.fsmonitor"
Windows before patch: 1000000 loops, best of 3: 0.563 usec per loop
Windows after patch: 1000000 loops, best of 3: 0.583 usec per loop
Linx before patch: 1000000 loops, best of 3: 0.579 usec per loop
Linux after patch: 1000000 loops, best of 3: 0.588 usec per loop
10000 calls to _watchmantofsencoding:
python -m timeit -s "from hgext.fsmonitor import _watchmantofsencoding, _fixencoding" "fname = '/path/to/file'" "for i in range(10000):" " if _fixencoding: fname = _watchmantofsencoding(fname)"
Windows (_fixencoding is True): 100 loops, best of 3: 19.5 msec per loop
Linux (_fixencoding is False): 100 loops, best of 3: 3.08 msec per loop
Mixing bytes and unicode creates a mess. Do things in bytes as possible.
New sysbytes() helper only takes care of ASCII characters, but avoids raising
nasty unicode exception. This is the same design principle as sysstr().
Currently, Mercurial has a number of commands to show information. And,
there are features coming down the pipe that will introduce more
commands for showing information.
Currently, when introducing a new class of data or a view that we
wish to expose to the user, the strategy is to introduce a new command
or overload an existing command, sometimes both. For example, there is
a desire to formalize the wip/smartlog/underway/mine functionality that
many have devised. There is also a desire to introduce a "topics"
concept. Others would like views of "the current stack." In the
current model, we'd need a new command for wip/smartlog/etc (that
behaves a lot like a pre-defined alias of `hg log`). For topics,
we'd likely overload `hg topic[s]` to both display and manipulate
topics.
Adding new commands for every pre-defined query doesn't scale well
and pollutes `hg help`. Overloading commands to perform read-only and
write operations is arguably an UX anti-pattern: while having all
functionality for a given concept in one command is nice, having a
single command doing multiple discrete operations is not. Furthermore,
a user may be surprised that a command they thought was read-only
actually changes something.
We discussed this at the Mercurial 4.0 Sprint in Paris and decided that
having a single command where we could hang pre-defined views of
various data would be a good idea. Having such a command would:
* Help prevent an explosion of new query-related commands
* Create a clear separation between read and write operations
(mitigates footguns)
* Avoids overloading the meaning of commands that manipulate data
(bookmark, tag, branch, etc) (while we can't take away the
existing behavior for BC reasons, we now won't introduce this
behavior on new commands)
* Allows users to discover informational views more easily by
aggregating them in a single location
* Lowers the barrier to creating the new views (since the barrier
to creating a top-level command is relatively high)
So, this commit introduces the `hg show` command via the "show"
extension. This command accepts a positional argument of the
"view" to show. New views can be registered with a decorator. To
prove it works, we implement the "bookmarks" view, which shows a
table of bookmarks and their associated nodes.
We introduce a new style to hold everything used by `hg show`.
For our initial bookmarks view, the output varies from `hg bookmarks`:
* Padding is performed in the template itself as opposed to Python
* Revision integers are not shown
* shortest() is used to display a 5 character node by default (as
opposed to static 12 characters)
I chose to implement the "bookmarks" view first because it is simple
and shouldn't invite too much bikeshedding that detracts from the
evaluation of `hg show` itself. But there is an important point
to consider: we now have 2 ways to show a list of bookmarks. I'm not
a fan of introducing multiple ways to do very similar things. So it
might be worth discussing how we wish to tackle this issue for
bookmarks, tags, branches, MQ series, etc.
I also made the choice of explicitly declaring the default show
template not part of the standard BC guarantees. History has shown
that we make mistakes and poor choices with output formatting but
can't fix these mistakes later because random tools are parsing
output and we don't want to break these tools. Optimizing for human
consumption is one of my goals for `hg show`. So, by not covering
the formatting as part of BC, the barrier to future change is much
lower and humans benefit.
There are some improvements that can be made to formatting. For
example, we don't yet use label() in the templates. We obviously
want this for color. But I'm not sure if we should reuse the existing
log.* labels or invent new ones. I figure we can punt that to a
follow-up.
At the aforementioned Sprint, we discussed and discarded various
alternatives to `hg show`.
We considered making `hg log <view>` perform this behavior. The main
reason we can't do this is because a positional argument to `hg log`
can be a file path and if there is a conflict between a path name and
a view name, behavior is ambiguous. We could have introduced
`hg log --view` or similar, but we felt that required too much typing
(we don't want to require a command flag to show a view) and wasn't
very discoverable. Furthermore, `hg log` is optimized for showing
changelog data and there are things that `hg display` could display
that aren't changelog centric.
There were concerns about using "show" as the command name.
Some users already have a "show" alias that is similar to `hg export`.
There were also concerns that Git users adapted to `git show` would
be confused by `hg show`'s different behavior. The main difference
here is `git show` prints an `hg export` like view of the current
commit by default and `hg show` requires an argument. `git show`
can also display any Git object. `git show` does not support
displaying more complex views: just single objects. If we
implemented `hg show <hash>` or `hg show <identifier>`, `hg show`
would be a superset of `git show`. Although, I'm hesitant to do that
at this time because I view `hg show` as a higher-level querying
command and there are namespace collisions between valid identifiers
and registered views.
There is also a prefix collision with `hg showconfig`, which is an
alias of `hg config`.
We also considered `hg view`, but that is already used by the "hgk"
extension.
`hg display` was also proposed at one point. It has a prefix collision
with `hg diff`. General consensus was "show" or "view" are the best
verbs. And since "view" was taken, "show" was chosen.
There are a number of inline TODOs in this patch. Some of these
represent decisions yet to be made. Others represent features
requiring non-trivial complexity. Rather than bloat the patch or
invite additional bikeshedding, I figured I'd document future
enhancements via TODO so we can get a minimal implmentation landed.
Something is better than nothing.
BTW, C implementation of hexdigest() for SHA-1/256/512 returns hex
hash in lower case, and doctest in Python standard hashlib assumes
that, too. But it isn't explicitly described in API document or so.
Therefore, we can't assume that hexdigest() always returns hex hash in
lower case, for any hash algorithms, on any Python runtimes and
versions.
From point of view of that, it is reasonable for portability that
77f8c025a6ef applies lower() on hex hash in overridefilemerge().
But on the other hand, in largefiles extension, there are still many
code paths comparing between hex hashes or storing hex hash into
standin file, without lower().
Switching to hash algorithm other than SHA-1 may be good chance to
clarify our policy about hexdigest()-ed hash value string.
- assume that hexdigest() always returns hex hash in lower case, or
- apply lower() on hex hash in appropriate layers to ensure
lower-case-ness of it for portability
As the name describes, the 2nd argument 'revorctx' of copytostore()
can accept non-changectx value, for historical reason,
But, since e91ac285f700, copyalltostore(), the only one copytostore()
client in Mercurial source tree, always passes changectx as
'revorctx'.
Therefore, it is reasonable to make copytostore() accept only
changectx as the 2nd argument, now.
AFAIK, 'uploaded' argument of copytostore() (or copytocache(), before
renaming at e2d2a21b7e90) has been never used both on caller and
callee sides, since official release of bundled largefiles extension.
copyalltostore(), only one caller of copytostore(), already knows
standin file name of the target largefile. Therefore, passing it to
copytostore() is more efficient than calculating it in copytostore()
or readstandin().
This will be used to centralize and encapsulate the logic to read hash
from given (filectx of) standin file. readstandin() isn't suitable for
this purpose, because there are some code paths, which want to read
hex hash directly from filectx.
Previously, the pull would succeed, but the subsequent rebase would fail due
to the rebase.requiredest flag. Now abort earlier with a more useful error
message.
This adds an explicit active-bookmark-handling logic
to shelve. Traditional shelve handles it by transaction aborts,
but it is a bit ugly and having an explicit functionality
seems better.
Before this patch, updatestandin() takes "standin" argument, and
applies splitstandin() on it to pick out a path to largefile (aka
"lfile" or so) from standin.
But in fact, all callers already knows "lfile". In addition to it,
many callers knows both "standin" and "lfile".
Therefore, making updatestandin() take only one of "standin" or
"lfile" is inefficient.
repo['.'] is called not as "working context" but as "parent context".
In this code path, hash value of current content of file should be
compared against hash value recorded in "parent context".
Therefore, "wctx" may cause misunderstanding in this case.
Before this patch, this code path contains two loops for m._files: one
for replacement with standin, and another for elimination of None,
which comes from previous replacement ("standin in wctx or
lfdirstate[f] == 'r'" case in tostandin()).
These two loops can be unified into simple one "for" loop.
Updating standin for newly added largefile is needed, only if same
name largefile exists in destination context at linear merging. In
such case, updated standin is used to detect divergence of largefile
at overridefilemerge().
Otherwise, standin doesn't have any responsibility for its content
(usually, it is empty).
This patch also renames argument of hexsha1(), not only for
readability ("data" isn't good name for file-like object), but also
for reviewability (including hexsha1() code helps reviewers to confirm
how these functions are similar).
BTW, copyandhash() has also similar logic, but it can't reuse
hexsha1(), because it writes read-in data into specified fileobj
simultaneously.
Before 33e44341bb82, histedit (like rebase) was only creating markers on final
success from the old-rewritten node to the newly created nodes (as of before
33e44341bb82). In case of abort the aborted attempt were stripped to restore the
repository in its state prior to the attempt.
This use of strip was on purpose. Using markers in this case introduces various
issues. The main one is that keeping the partial result of histedit as obsolete
prevents us to recreates the same nodes in a second attempt. The same operation
will lead to an identical results, using an identical node that already exists
in the repository as obsolete.
To conclude, we cannot and should not switch to obsolescence markers creation on
histedit --abort and we backout 33e44341bb82. A test to catch this class of
issue will be introduced in the next changeset.
Now that rebasestate is serialized as part of the transaction, the repo state it
sees is the version at the end of the transaction, which may have hidden nodes.
Therefore, it's possible parts of the rebase commit set are no longer visible by
the time the transaction is closing, which causes a filtered revision error in
this code. I don't think state serialization should be blocked from accessing
commits it knows exist, especially if all it's trying to do is get the hex of
them, so let's use an unfiltered repo here.
Unfortunately, the only known repro is with the fbamend Facebook extension, so
I'm not sure how to repro it in core Mercurial for a test.
There are some code paths, which apply standin() on same value
multilpe times instead of using already standin()-ed value.
"fstandin" is common name for "path to standin file" in lfutil.py, to
avoid shadowing "standin()".
readstandin() takes "node" argument to get changectx by "repo[node]".
There are some readstandin() invocations, which use ctx.node(),
ctx.rev(), or '.' as "node" argument above, even though corresponded
changectx object is already looked up on caller side.
This patch calls readstandin() with already known changectx itself, to
avoid meaningless re-construction of changectx (indirect case via
copytostore() is also included).
BTW, copytostore() uses "rev" argument only for readstandin()
invocation. Therefore, this patch also renames it to "revorctx" to
indicate that it can take not only revision ID or so but also
changectx, for readability.
There are many isstandin() invocations before splitstandin().
The former examines whether specified path starts with ".hglf/". The
latter returns after ".hglf/" of specified path if it starts with that
prefix, or returns None otherwise.
Therefore, value returned by splitstandin() can be used for
replacement of preceding isstandin(), and this replacement can omit
redundant string comparison after isstandin().
Empty changelist descriptions are valid in Perforce. If we encounter one of
them we are currently running into an IndexError. In case of empty commit
messages set the commit message to **empty changelist description**, which
follows Perforce terminology.
We only have commands.{update,rebase}.requiredest so far. We should
clearly ignore those two if HGPLAIN is in effect, and it seems like we
should ignore any future config that will be added in [commands] since
that is about changing the behavior of commands.
Thanks to Yuya for suggesting to centralize the code in ui.py.
While at it, remove the unnecessary False values passed to
ui.configbool() for the aforementioned config options.
In some mercurial workflows, the default destination for rebase does not
always work well and can lead to confusing behavior. With this flag enabled,
every rebase command will require passing an explicit destination, eliminating
this confusion.