This was originally written for JSON templating where we would have to be
careful to not add extra comma, but seems generally useful.
Inner loop started by % operator has its own counter.
Previously, revlog.revision(raw=False) may try to apply the delta chain
on _cache hit. That happens if flags are non-empty. This patch makes rawtext
reused so delta chain application is avoided.
"_cache" and "rev" are moved a bit to avoid unnecessary assignments.
When writing the revlog-ng index, the third field is len(rawtext). See
revlog._addrevision:
textlen = len(rawtext)
....
e = (offset_type(offset, flags), l, textlen,
base, link, p1r, p2r, node)
self.index.insert(-1, e)
Therefore, revlog.index[rev][2] returned by revlog.rawsize should be
len(rawtext), where "rawtext" is revlog.revision(raw=True).
Unfortunately it's hard to add a test for this code path because "if l >= 0"
catches most cases.
Commit 81e1f5bbf1fc54808649562d3ed829730765c540 from
https://github.com/indygreg/python-zstandard is imported without
modifications (other than removing unwanted files).
Updates relevant to Mercurial include:
* Support for multi-threaded compression (we can use this for
bundle and wire protocol compression).
* APIs for batch compression and decompression operations using
multiple threads and optimal memory allocation mechanism. (Can
be useful for revlog perf improvements.)
* A ``BufferWithSegments`` type that models a single memory buffer
containing N discrete items of known lengths. This type can be
used for very efficient 0-copy data operations.
# no-check-commit
We now have a dedicated help topic to describe bundle specification
strings. Let's update `hg bundle`'s documentation to reflect its
existence.
While I was hear, I also tweaked some wording which I felt was out
of date and needed tweaking. Specifically, `hg bundle` no longer
just deals with "changegroup" data: it can also generate files
that have non-changegroup data.
I softly formalized the concept of a "bundle specification" a while
ago when I was working on clone bundles and stream clone bundles and
wanted a more robust way to define what exactly is in a bundle file.
The concept has existed for a while. Since it is part of the clone
bundles feature and exposed to the user via the "-t" argument to
`hg bundle`, it is something we need to support for the long haul.
After the 4.1 release, I heard a few people comment that they didn't
realize you could generate zstd bundles with `hg bundle`. I'm
partially to blame for not documenting it in bundle's docstring.
Additionally, I added a hacky, experimental feature for controlling
the compression level of bundles in 054e64c4d837. As the commit
message says, I went with a quick and dirty solution out of time
constraints. Furthermore, I wanted to eventually store this
configuration in the "bundlespec" so it could be made more flexible.
Given:
a) bundlespecs are here to stay
b) we don't have great documentation over what they are, despite being
a user-facing feature
c) the list of available compression engines and their behavior isn't
exposed
d) we need an extensible place to modify behavior of compression
engines
I want to move forward with formalizing bundlespecs as a user-facing
feature. This commit does that by introducing a "bundlespec" help
page. Leaning on the just-added compression engine documentation
and API, the topic also conveniently lists available compression
engines and details about them. This makes features like zstd
bundle compression more discoverable. e.g. you can now
`hg help -k zstd` and it lists the "bundlespec" topic.
An upcoming patch will add support for documenting bundle
specifications in more detail. As part of this, we'd like to
enumerate available bundle compression formats. In order to do
this, we need to provide the help mechanism a dict of names
and objects with docstrings.
This patch adds docstrings to compengine.bundletype and adds
a function for retrieving a dict of them. The code is not yet
used.
Previously, --headeronly would prevent --twice from working
because the ETag wasn't stored when --headeronly was used.
This feels like a bug. That feeling is reaffirmed by the fact
that this change doesn't regress any tests.
A common exploit in web applications that access paths is to insert
path separator strings like ".." to try to get the server to serve up
files it shouldn't.
We have code for detecting this in staticfile(). A subsequent commit
will need to perform this test as well. Since this is security code,
let's factor the check so we don't have to reinvent the wheel.
The previous CSS rule would also apply in pages where followlines UI was not
available (e.g. "changeset" view at /rev/<node>/). We insert a
"followlines-select" class in JavaScript on actually selectable lines and
restrict the CSS selector to use it.
We define the listener of document's "DOMContentLoaded" inline in registration
and use a function expression (anonymous) with everything inside. This makes
it clearer that this file is not a library of JavaScript functions but rather
an executable script.
(Most of changes consists of reindenting the "followlinesBox" function, so
mostly white space changes.)
Text IO sucks on Python 3 as it must be a unicode stream. We could introduce
a wrapper that converts unicode back to bytes, but it wouldn't be simple to
handle offsets transparently from/to underlying IOBase API.
Fortunately, we don't need to process huge text files, so let's stick to
bytes IO and convert EOL in memory.
These tests would run if hghave.has_serve() were enabled on Windows. Windows
has no issue allowing an unpriviledged process to open port 13, so it doesn't
abort. The other tests are related to how MSYS tries to be helpful and converts
Unix constructs to the Windows equivalent. There isn't any way to disable this
behavior, though it supposedly doesn't happen if the exe is linked against the
MSYS library.
The daemonized serve process doesn't print these lines out (see 0c9eeb8c6b87).
I was able to get it to with the following hack:
diff --git a/mercurial/win32.py b/mercurial/win32.py
--- a/mercurial/win32.py
+++ b/mercurial/win32.py
@@ -418,6 +418,11 @@
return str(ppid)
def spawndetached(args):
+
+ import subprocess
+ return subprocess.Popen(args, cwd=pycompat.getcwd(), env=encoding.environ,
+ creationflags=subprocess.CREATE_NEW_PROCESS_GROUP).pid
+
# No standard library function really spawns a fully detached
# process under win32 because they allocate pipes or other objects
# to handle standard streams communications. Passing these objects
However, MSYS translates --prefixes starting with '/' to 'C:/MinGW/msys/1.0',
which changes the output. The output isn't so important that I want to spend a
bunch of time on this, and risk breaking some subtle behavior of `hg serve -d`
with the more complicated code.
The http test simply wasn't updated in d264983c27fa for Windows. It looks like
the https test meant to glob away the error message in fa7f5826debe, but forgot
the '*', and was subsequently removed in 01af1e5e4062.
Currently, Mercurial has a number of commands to show information. And,
there are features coming down the pipe that will introduce more
commands for showing information.
Currently, when introducing a new class of data or a view that we
wish to expose to the user, the strategy is to introduce a new command
or overload an existing command, sometimes both. For example, there is
a desire to formalize the wip/smartlog/underway/mine functionality that
many have devised. There is also a desire to introduce a "topics"
concept. Others would like views of "the current stack." In the
current model, we'd need a new command for wip/smartlog/etc (that
behaves a lot like a pre-defined alias of `hg log`). For topics,
we'd likely overload `hg topic[s]` to both display and manipulate
topics.
Adding new commands for every pre-defined query doesn't scale well
and pollutes `hg help`. Overloading commands to perform read-only and
write operations is arguably an UX anti-pattern: while having all
functionality for a given concept in one command is nice, having a
single command doing multiple discrete operations is not. Furthermore,
a user may be surprised that a command they thought was read-only
actually changes something.
We discussed this at the Mercurial 4.0 Sprint in Paris and decided that
having a single command where we could hang pre-defined views of
various data would be a good idea. Having such a command would:
* Help prevent an explosion of new query-related commands
* Create a clear separation between read and write operations
(mitigates footguns)
* Avoids overloading the meaning of commands that manipulate data
(bookmark, tag, branch, etc) (while we can't take away the
existing behavior for BC reasons, we now won't introduce this
behavior on new commands)
* Allows users to discover informational views more easily by
aggregating them in a single location
* Lowers the barrier to creating the new views (since the barrier
to creating a top-level command is relatively high)
So, this commit introduces the `hg show` command via the "show"
extension. This command accepts a positional argument of the
"view" to show. New views can be registered with a decorator. To
prove it works, we implement the "bookmarks" view, which shows a
table of bookmarks and their associated nodes.
We introduce a new style to hold everything used by `hg show`.
For our initial bookmarks view, the output varies from `hg bookmarks`:
* Padding is performed in the template itself as opposed to Python
* Revision integers are not shown
* shortest() is used to display a 5 character node by default (as
opposed to static 12 characters)
I chose to implement the "bookmarks" view first because it is simple
and shouldn't invite too much bikeshedding that detracts from the
evaluation of `hg show` itself. But there is an important point
to consider: we now have 2 ways to show a list of bookmarks. I'm not
a fan of introducing multiple ways to do very similar things. So it
might be worth discussing how we wish to tackle this issue for
bookmarks, tags, branches, MQ series, etc.
I also made the choice of explicitly declaring the default show
template not part of the standard BC guarantees. History has shown
that we make mistakes and poor choices with output formatting but
can't fix these mistakes later because random tools are parsing
output and we don't want to break these tools. Optimizing for human
consumption is one of my goals for `hg show`. So, by not covering
the formatting as part of BC, the barrier to future change is much
lower and humans benefit.
There are some improvements that can be made to formatting. For
example, we don't yet use label() in the templates. We obviously
want this for color. But I'm not sure if we should reuse the existing
log.* labels or invent new ones. I figure we can punt that to a
follow-up.
At the aforementioned Sprint, we discussed and discarded various
alternatives to `hg show`.
We considered making `hg log <view>` perform this behavior. The main
reason we can't do this is because a positional argument to `hg log`
can be a file path and if there is a conflict between a path name and
a view name, behavior is ambiguous. We could have introduced
`hg log --view` or similar, but we felt that required too much typing
(we don't want to require a command flag to show a view) and wasn't
very discoverable. Furthermore, `hg log` is optimized for showing
changelog data and there are things that `hg display` could display
that aren't changelog centric.
There were concerns about using "show" as the command name.
Some users already have a "show" alias that is similar to `hg export`.
There were also concerns that Git users adapted to `git show` would
be confused by `hg show`'s different behavior. The main difference
here is `git show` prints an `hg export` like view of the current
commit by default and `hg show` requires an argument. `git show`
can also display any Git object. `git show` does not support
displaying more complex views: just single objects. If we
implemented `hg show <hash>` or `hg show <identifier>`, `hg show`
would be a superset of `git show`. Although, I'm hesitant to do that
at this time because I view `hg show` as a higher-level querying
command and there are namespace collisions between valid identifiers
and registered views.
There is also a prefix collision with `hg showconfig`, which is an
alias of `hg config`.
We also considered `hg view`, but that is already used by the "hgk"
extension.
`hg display` was also proposed at one point. It has a prefix collision
with `hg diff`. General consensus was "show" or "view" are the best
verbs. And since "view" was taken, "show" was chosen.
There are a number of inline TODOs in this patch. Some of these
represent decisions yet to be made. Others represent features
requiring non-trivial complexity. Rather than bloat the patch or
invite additional bikeshedding, I figured I'd document future
enhancements via TODO so we can get a minimal implmentation landed.
Something is better than nothing.
The "genbits" implementation is actually incorrect. This patch fixes it. A
good "genbits" implementation should pass the below assertion:
n = 3 # or other number
l = list(genbits(n))
assert 2**(n*2) == len(set((l[i]<<n)+l[i+1] for i in range(len(l)-1)))
An assertion is added to make sure "genbits" won't work unexpectedly.
According to the document added above, we should check L1 == L2, and the
only way to get L1 in all cases is to call "rawsize()", and the only way to
get L2 is to call "revision(raw=True)". Therefore the fix.
Meanwhile there are still a lot of things about flagprocessor broken in
revlog.py. Tests will be added after revlog.py gets fixed.
It seems a good idea to list all kinds of "surprises" and expected behavior
to make the upcoming changes easier to understand.
Note: the comment added does not reflect the actual behavior of the current
code.
In filerevision view (/file/<rev>/<fname>) we add some event listeners on
mouse clicks of <span> elements in the <pre class="sourcelines"> block.
Those listeners will capture a range of lines selected between two mouse
clicks and a box inviting to follow the history of selected lines will then
show up. Selected lines (i.e. the block of lines) get a CSS class which make
them highlighted. Selection can be cancelled (and restarted) by either
clicking on the cancel ("x") button in the invite box or clicking on any other
source line. Also clicking twice on the same line will abort the selection and
reset event listeners to restart the process.
As a first step, this action is only advertised by the "cursor: cell" CSS rule
on source lines elements as any other mechanisms would make the code
significantly more complicated. This might be improved later.
All JavaScript code lives in a new "linerangelog.js" file, sourced in
filerevision template (only in "paper" style for now).
If cache hit and flags are empty, no flag processor runs and "text" equals
to "rawtext". So we check flags, and return rawtext.
This resolves performance issue introduced by a previous patch.
All 3 users of _addrevision use raw:
- addrevision: passing rawtext to _addrevision
- addgroup: passing rawtext and raw=True to _addrevision
- clone: passing rawtext to _addrevision
There is no real user using _addrevision(raw=False). On the other hand,
_addrevision is low-level code dealing with raw revlog deltas and rawtexts.
It should not transform rawtext to non-raw text.
This patch removes the "raw" parameter from "_addrevision", and does some
rename and doc change to make it clear that "_addrevision" expects rawtext.
Archeology shows 886a08012bbe added "raw" flag to "_addrevision", follow-ups
fe1e206cb389 and 1cfa6239c923 seem to make the flag unnecessary.
test-revlog-raw.py no longer complains.