About 3 years ago, in August 2014, the logic to select what markers to select on
push was ported from the evolve extension to Mercurial core. However, for some
unclear reasons, the tests for that logic were not ported alongside.
I realised it a couple of weeks ago while working on another push related issue.
I've made a clean up pass on the tests and they are now ready to integrate the
core test suite. This series of changesets do not change any logic. I just adds
test for logic that has been around for about 10 versions of Mercurial.
They are a patch for each test case. It makes it easier to review and postpone
one with documentation issues without rejecting the wholes series.
This patch introduce the common script that setup the basic environment for the
test cases. Once this script is in. We can accept the other patches in any
order.
Each test case comes it in own test file. It help parallelism and does not
introduce a significant overhead from having a single unified giant test file.
Here are timing to support this claim.
# Multiple test files version:
# run-tests.py --local -j 1 test-exchange-*.t
53.40s user 6.82s system 85% cpu 1:10.76 total
52.79s user 6.97s system 85% cpu 1:09.97 total
52.94s user 6.82s system 85% cpu 1:09.69 total
# Single test file version:
# run-tests.py --local -j 1 test-exchange-obsmarkers.t
52.97s user 6.85s system 85% cpu 1:10.10 total
52.64s user 6.79s system 85% cpu 1:09.63 total
53.70s user 7.00s system 85% cpu 1:11.17 total
Previously, when runcommand raises, chg aborts with, and does not wait for
pager. The call stack is like:
hgc_runcommand -> handleresponse -> readchannel -> debugmsg("failed to
read channel") -> exit(255)
That means, chg returns to the shell, then both the pager and the shell will
read from the terminal at the same time, causing problems.
This patch fixes that by using "atexit" to register the pager cleanup
function so chg will always wait for pager even if runcommand raises.
Unlike revset, function arguments are pre-processed in templater. That's why
we need to define argspec per function. An argspec field looks somewhat
redundant in @templatefunc definition as a name field contains human-readable
list of arguments. I'll make function doc be built from argspec later.
Ported separate() function as an example.
Before this patch, there was a single way to see multiple shelves using
`--patch --list` which show all the shelves. Doing `--patch s1 s2` returns an
error. This patch allows to show multiple shelves using `--patch` and `--stat`.
The final part of integrating the compression manager APIs into
revlog storage is the plumbing for repositories to advertise they
are using non-zlib storage and for revlogs to instantiate a non-zlib
compression engine.
The main intent of the compression manager work was to zstd all
of the things. Adding zstd to revlogs has proved to be more involved
than other places because revlogs are... special. Very small inputs
and the use of delta chains (which are themselves a form of
compression) are a completely different use case from streaming
compression, which bundles and the wire protocol employ. I've
conducted numerous experiments with zstd in revlogs and have yet
to formalize compression settings and a storage architecture that
I'm confident I won't regret later. In other words, I'm not yet
ready to commit to a new mechanism for using zstd - or any other
compression format - in revlogs.
That being said, having some support for zstd (and other compression
formats) in revlogs in core is beneficial. It can allow others to
conduct experiments.
This patch introduces *highly experimental* support for non-zlib
compression formats in revlogs. Introduced is a config option to
control which compression engine to use. Also introduced is a namespace
of "exp-compression-*" requirements to denote support for non-zlib
compression in revlogs. I've prefixed the namespace with "exp-"
(short for "experimental") because I'm not confident of the
requirements "schema" and in no way want to give the illusion of
supporting these requirements in the future. I fully intend to drop
support for these requirements once we figure out what we're doing
with zstd in revlogs.
A good portion of the patch is teaching the requirements system
about registered compression engines and passing the requested
compression engine as an opener option so revlogs can instantiate
the proper compression engine for new operations.
That's a verbose way of saying "we can now use zstd in revlogs!"
On an `hg pull` conversion of the mozilla-unified repo with no extra
redelta settings (like aggressivemergedeltas), we can see the impact
of zstd vs zlib in revlogs:
$ hg perfrevlogchunks -c
! chunk
! wall 2.032052 comb 2.040000 user 1.990000 sys 0.050000 (best of 5)
! wall 1.866360 comb 1.860000 user 1.820000 sys 0.040000 (best of 6)
! chunk batch
! wall 1.877261 comb 1.870000 user 1.860000 sys 0.010000 (best of 6)
! wall 1.705410 comb 1.710000 user 1.690000 sys 0.020000 (best of 6)
$ hg perfrevlogchunks -m
! chunk
! wall 2.721427 comb 2.720000 user 2.640000 sys 0.080000 (best of 4)
! wall 2.035076 comb 2.030000 user 1.950000 sys 0.080000 (best of 5)
! chunk batch
! wall 2.614561 comb 2.620000 user 2.580000 sys 0.040000 (best of 4)
! wall 1.910252 comb 1.910000 user 1.880000 sys 0.030000 (best of 6)
$ hg perfrevlog -c -d 1
! wall 4.812885 comb 4.820000 user 4.800000 sys 0.020000 (best of 3)
! wall 4.699621 comb 4.710000 user 4.700000 sys 0.010000 (best of 3)
$ hg perfrevlog -m -d 1000
! wall 34.252800 comb 34.250000 user 33.730000 sys 0.520000 (best of 3)
! wall 24.094999 comb 24.090000 user 23.320000 sys 0.770000 (best of 3)
Only modest wins for the changelog. But manifest reading is
significantly faster. What's going on?
One reason might be data volume. zstd decompresses faster. So given
more bytes, it will put more distance between it and zlib.
Another reason is size. In the current design, zstd revlogs are
*larger*:
debugcreatestreamclonebundle (size in bytes)
zlib: 1,638,852,492
zstd: 1,680,601,332
I haven't investigated this fully, but I reckon a significant cause of
larger revlogs is that the zstd frame/header has more bytes than
zlib's. For very small inputs or data that doesn't compress well, we'll
tend to store more uncompressed chunks than with zlib (because the
compressed size isn't smaller than original). This will make revlog
reading faster because it is doing less decompression.
Moving on to bundle performance:
$ hg bundle -a -t none-v2 (total CPU time)
zlib: 102.79s
zstd: 97.75s
So, marginal CPU decrease for reading all chunks in all revlogs
(this is somewhat disappointing).
$ hg bundle -a -t <engine>-v2 (total CPU time)
zlib: 191.59s
zstd: 115.36s
This last test effectively measures the difference between zlib->zlib
and zstd->zstd for revlogs to bundle. This is a rough approximation of
what a server does during `hg clone`.
There are some promising results for zstd. But not enough for me to
feel comfortable advertising it to users. We'll get there...
readline() returns '' only when EOF is encountered, in which case, Python's
getpass() raises EOFError. We should do the same to abort the session as
"response expected."
This bug was reported to
https://bitbucket.org/tortoisehg/thg/issues/4659/
When converting a Git repository to Mercurial at Mozilla, I encountered
a scenario where I didn't want `hg convert` to automatically add the
"committer: <committer>" line to commit messages. While I can hack around
this by rewriting the Git commit before it is fed into `hg convert`,
I figured it would be a useful knob to control.
This patch introduces a config option that allows lots of control
over the committer value. I initially implemented this as a single
boolean flag to control whether to save the committer message. But
then there was feedback that it would be useful to save the committer
in extra data. While this patch doesn't implement support for saving
in extra data, it does add a mechanism for extending which actions
to take on the committer field. We should be able to easily add
actions to save in extra data.
Some of the implemented features weren't asked for. But I figured they
could be useful. If nothing else they demonstrate the extensibility
of this mechanism.
The result of diffstatdata should not depend on having noprefix set or not, as
was reported in issue 4755. Forcing noprefix to false on call makes sure the
parser receives the diff in the correct format and returns the proper result.
Another way to fix this would have been to change the regular expressions in
path.diffstatdata(), but that would have introduced many unecessary special
cases.
The rev argument has the same meaning as startrev of follow(), and I think
startrev is more informative.
followlines() is new function, we can make BC now.
revlog.compress() compares the compressed size to the input size
and throws away the compressed data if it is larger than the input.
This is the correct thing to do, as storing compressed data that
is larger than the input takes up more storage space and makes reading
slower.
However, the comparison was implemented inconsistently. For the
streaming compression mode, we threw away the result if it was
greater than or equal to the input size. But for the one-shot
compression, we threw away the compression only if it was greater
than the input size!
This patch changes the comparison for the simple case so it is
consistent with the streaming case.
As a few tests demonstrate, this adds 1 byte to some revlog entries.
This is because of an added 'u' header on the chunk. It seems
somewhat wrong to increase the revlog size here. However, IMO the cost
of 1 byte in storage is insignificant compared to the performance gains
of avoiding decompression. This patch should invite questions around
the heuristic for throwing away compressed data. For example, I'd argue
we should be more liberal about rejecting compressed data, additionally
doing so where the number of bytes saved fails to reach a threshold.
But we can have this discussion another time.
It was probably unintentional for regex, as the meaning of some sequences like
\S and \s is actually inverted by changing the case. For backward compatibility
however, the matching is forced to case insensitive.
Since we did a directory rename on the stores, the source
repository's lock path now references the dest repository's
lock path and the dest repository's lock path now references
a non-existent filename.
So releasing the lock on the source will unlock the dest and
releasing the lock on the dest will no-op because it fails due
to file not found. So we clean up the dest's lock manually.
The store contains more than just revlogs. This patch teaches the
upgrade code to copy regular files as well.
As the test changes demonstrate, the phaseroots file is now copied.
Our next step for in-place upgrade is to migrate store data. Revlogs
are the biggest source of data within the store and a store is useless
without them, so we implement their migration first.
Our strategy for migrating revlogs is to walk the store and call
`revlog.clone()` on each revlog. There are some minor complications.
Because revlogs have different storage options (e.g. changelog has
generaldelta and delta chains disabled), we need to obtain the
correct class of revlog so inserted data is encoded properly for its
type.
Various attempts at implementing progress indicators that didn't lead
to frustration from false "it's almost done" indicators were made.
I initially used a single progress bar based on number of revlogs.
However, this quickly churned through all filelogs, got to 99% then
effectively froze at 99.99% when it got to the manifest.
So I converted the progress bar to total revision count. This was a
little bit better. But the manifest was still significantly slower
than filelogs and it took forever to process the last few percent.
I then tried both revision/chunk bytes and raw bytes as the
denominator. This had the opposite effect: because so much data is in
manifests, it would churn through filelogs without showing much
progress. When it got to manifests, it would fill in 90+% of the
progress bar.
I finally gave up having a unified progress bar and instead implemented
3 progress bars: 1 for filelog revisions, 1 for manifest revisions, and
1 for changelog revisions. I added extra messages indicating the total
number of revisions of each so users know there are more progress bars
coming.
I also added extra messages before and after each stage to give extra
details about what is happening. Strictly speaking, this isn't
necessary. But the numbers are impressive. For example, when converting
a non-generaldelta mozilla-central repository, the messages you see are:
migrating 2475593 total revisions (1833043 in filelogs, 321156 in manifests, 321394 in changelog)
migrating 1.67 GB in store; 2508 GB tracked data
migrating 267868 filelogs containing 1833043 revisions (1.09 GB in store; 57.3 GB tracked data)
finished migrating 1833043 filelog revisions across 267868 filelogs; change in size: -415776 bytes
migrating 1 manifests containing 321156 revisions (518 MB in store; 2451 GB tracked data)
That "2508 GB" figure really blew me away. I had no clue that the raw
tracked data in mozilla-central was that large. Granted, 2451 GB is in
the manifest and "only" 57.3 GB is in filelogs. But still.
It's worth noting that gratuitous loading of source revlogs in order
to display numbers and progress bars does serve a purpose: it ensures
we can open all source revlogs. We don't want to spend several minutes
copying revlogs only to encounter a permissions error or similar later.
As part of this commit, we also add swapping of the store directory
to the upgrade function. After revlogs are converted, we move the
old store into the backup directory then move the temporary repo's
store into the old store's location. On well-behaved systems, this
should be 2 atomic operations and the window of inconsistency show be
very narrow.
There are still a few improvements to be made to store copying and
upgrading. But this commit gets the bulk of the work out of the way.
Now that all the upgrade planning work is in place, we can start
doing the real work: actually upgrading a repository.
The main goal of this commit is to get the "framework" for running
in-place upgrade actions in place.
Rather than get too clever and low-level with regards to in-place
upgrades, our strategy is to create a new, temporary repository,
copy data to it, then replace the old data with the new. This allows
us to reuse a lot of code in localrepo.py around store interaction,
which will eventually consume the bulk of the upgrade code.
But we have to start small. This patch implements adding new
repository requirements. But it still sets up a temporary
repository and locks it and the source repo before performing the
requirements file swap. This means all the plumbing is in place
to implement store copying in subsequent commits.
This commit introduces code for determining what actions/improvements
an upgrade should perform.
The "upgradefindimprovements" function introduces a mechanism to
return a list of improvements that can be made to a repository.
Each improvement is effectively an action that an upgrade will
perform. Associated with each of these improvements is metadata
that will be used to inform users what's wrong and what an
upgrade will do.
Each "improvement" is categorized as a "deficiency" or an
"optimization." TBH, I'm not thrilled about the terminology and
am receptive to constructive bikeshedding. The main difference
between a "deficiency" and an "optimization" is a deficiency
is always corrected (if it deviates from the current config) and
an "optimization" is an optional action that goes above and beyond
to improve the state of the repository (usually by requiring more
CPU during upgrade).
Our initial set of improvements identifies missing repository
requirements, a single, easily correctable problem with
changelog storage, and a set of "optimizations" related to delta
recalculation.
The main "upgraderepo" function has been expanded to handle
improvements. It queries for the list of improvements and determines
which of them will run based on the current repository state and user
I went through numerous iterations of the output format before
settling on a ReST-inspired definition list format. (I used
bulleted lists in the first submission of this commit and could
not get it to format just right.) Even with the various iterations,
I'm still not super thrilled with the format. But, this is a debug*
command, so that should mean we can refine the output without BC
concerns.
This commit introduces functionality for upgrading a repository in
place. The first part that's implemented is testing for upgrade
"compatibility." This is done by examining repository requirements.
There are 5 functions returning sets of requirements that control
upgrading. Why so many functions? Mainly to support extensions.
Functions are easier to monkeypatch than module variables.
Astute readers will see that we don't support "manifestv2" and
"treemanifest" requirements in the upgrade mechanism. I don't have
a great answer for why other than this is a complex set of patches
and I don't want to deal with the complexity of these experimental
features just yet. We can teach the upgrade mechanism about them
later, once the basic upgrade mechanism is in place.
This commit also introduces the "upgraderepo" function. This will be
our main routine for performing an in-place upgrade. Currently, it
just implements requirements checking. The structure of some code in
this function may look a bit weird (e.g. the inline function that is
only called once). But this will make sense after future commits.
Currently, if Mercurial introduces a new repository/store feature or
changes behavior of an existing feature, users must perform an
`hg clone` to create a new repository with hopefully the
correct/optimal settings. Unfortunately, even `hg clone` may not
give the correct results. For example, if you do a local `hg clone`,
you may get hardlinks to revlog files that inherit the old state.
If you `hg clone` from a remote or `hg clone --pull`, changegroup
application may bypass some optimization, such as converting to
generaldelta.
Optimizing a repository is harder than it seems and requires more
than a simple `hg` command invocation.
This commit starts the process of changing that. We introduce
`hg debugupgraderepo`, a command that performs an in-place upgrade
of a repository to use new, optimal features. The command is just
a stub right now. Features will be added in subsequent commits.
This commit does foreshadow some of the behavior of the new command,
notably that it doesn't do anything by default and that it takes
arguments that influence what actions it performs. These will be
explained more in subsequent commits.
Selecting single and multiple revisions is closely related, so let's
put it in one place, so users can easily find it. We actually did not
even point to "hg help revsets" from "hg help revisions", but now that
they're on a single page, that won't be necessary.
Content-Security-Policy (CSP) is a web security feature that allows
servers to declare what loaded content is allowed to do. For example,
a policy can prevent loading of images, JavaScript, CSS, etc unless
the source of that content is whitelisted (by hostname, URI scheme,
hashes of content, etc). It's a nifty security feature that provides
extra mitigation against some attacks, notably XSS.
Mitigation against these attacks is important for Mercurial because
hgweb renders repository data, which is commonly untrusted. While we
make attempts to escape things, etc, there's the possibility that
malicious data could be injected into the site content. If this happens
today, the full power of the web browser is available to that
malicious content. A restrictive CSP policy (defined by the server
operator and sent in an HTTP header which is outside the control of
malicious content), could restrict browser capabilities and mitigate
security problems posed by malicious data.
CSP works by emitting an HTTP header declaring the policy that browsers
should apply. Ideally, this header would be emitted by a layer above
Mercurial (likely the HTTP server doing the WSGI "proxying"). This
works for some CSP policies, but not all.
For example, policies to allow inline JavaScript may require setting
a "nonce" attribute on <script>. This attribute value must be unique
and non-guessable. And, the value must be present in the HTTP header
and the HTML body. This means that coordinating the value between
Mercurial and another HTTP server could be difficult: it is much
easier to generate and emit the nonce in a central location.
This commit introduces support for emitting a
Content-Security-Policy header from hgweb. A config option defines
the header value. If present, the header is emitted. A special
"%nonce%" syntax in the value triggers generation of a nonce and
inclusion in <script> elements in templates. The inclusion of a
nonce does not occur unless "%nonce%" is present. This makes this
commit completely backwards compatible and the feature opt-in.
The nonce is a type 4 UUID, which is the flavor that is randomly
generated. It has 122 random bits, which should be plenty to satisfy
the guarantees of a nonce.
All the hgweb templates include mercurial.js in their header. All
the hgweb templates have the same <script> boilerplate to run
process_dates(). This patch factors that function call into
mercurial.js as part of a DOMContentLoaded event listener.
With this commit, the HTTP transport now parses the X-HgProto-<N>
header to determine what media type and compression engine to use for
responses. So far, we only compress responses that are already being
compressed with zlib today (stream response types to specific
commands). We can expand things to cover additional response types
later.
The practical side-effect of this commit is that non-zlib compression
engines will be used if both ends support them. This means if both
ends have zstd support, zstd - not zlib - will be used to compress
data!
When cloning the mozilla-unified repository between a local HTTP
server and client, the benefits of non-zlib compression are quite
noticeable:
engine server CPU (s) client CPU (s) bundle size
zlib (l=6) 174.1 283.2 1,148,547,026
zstd (l=1) 99.2 267.3 1,127,513,841
zstd (l=3) 103.1 266.9 1,018,861,363
zstd (l=7) 128.3 269.7 919,190,278
zstd (l=10) 162.0 - 894,547,179
none 95.3 277.2 4,097,566,064
The default zstd compression level is 3. So if you deploy zstd
capable Mercurial to your clients and servers and CPU time on
your server is dominated by "getbundle" requests (clients cloning
and pulling) - and my experience at Mozilla tells me this is often
the case - this commit could drastically reduce your server-side
CPU usage *and* save on bandwidth costs!
Another benefit of this change is that server operators can install
*any* compression engine. While it isn't enabled by default, the
"none" compression engine can now be used to disable wire protocol
compression completely. Previously, commands like "getbundle" always
zlib compressed output, adding considerable overhead to generating
responses. If you are on a high speed network and your server is under
high load, it might be advantageous to trade bandwidth for CPU.
Although, zstd at level 1 doesn't use that much CPU, so I'm not
convinced that disabling compression wholesale is worthwhile. And, my
data seems to indicate a slow down on the client without compression.
I suspect this is due to a lack of buffering resulting in an increase
in socket read() calls and/or the fact we're transferring an extra 3 GB
of data (parsing HTTP chunked transfer and processing extra TCP packets
can add up). This is definitely worth investigating and optimizing. But
since the "none" compressor isn't enabled by default, I'm inclined to
punt on this issue.
This commit introduces tons of tests. Some of these should arguably
have been implemented on previous commits. But it was difficult to
test without the server functionality in place.
Now that servers expose a capability indicating they support
application/mercurial-0.2 and compression, clients can key off
this to say they support responses that are compressed with
various compression formats.
After this commit, the HTTP wire protocol client now sends an
"X-HgProto-<N>" request header indicating its support for
"application/mercurial-0.2" media type and various compression
formats.
This commit also implements support for handling
"application/mercurial-0.2" responses. It simply reads the header
compression engine identifier then routes the remainder of the
response to the appropriate decompressor.
There were some test changes, but only to logging. That points to
an obvious gap in our test coverage. This will be addressed in a
subsequent commit once server support is in place (it is hard to
test without server support).
This commit introduces support for advertising a server's support for
media types and compression formats in accordance with the spec defined
in internals.wireproto.
The bulk of the new code is a helper function in wireproto.py to
obtain a prioritized list of compression engines available to the
wire protocol. While not utilized yet, we implement support
for obtaining the list of compression engines advertised by the
client.
The upcoming HTTP protocol enhancements are a bit lower-level than
existing tests (most existing tests are command centric). So,
this commit establishes a new test file that will be appropriate
for holding tests around the functionality of the HTTP protocol
itself.
Rounding out this change, `hg debuginstall` now prints compression
engines available to the server.
Currently, bundle compression uses the default compression level
for the active compression engine. The default compression level
is tuned as a compromise between speed and size.
Some scenarios may call for a different compression level. For
example, with clone bundles, bundles are generated once and used
several times. Since the cost to generate is paid infrequently,
server operators may wish to trade extra CPU time for better
compression ratios.
This patch introduces an experimental and undocumented config
option to control the bundle compression level. As the inline
comment says, this approach is a bit hacky. I'd prefer for
the compression level to be encoded in the bundle spec. e.g.
"zstd-v2;complevel=15." However, given that the 4.1 freeze is
imminent, I'm not comfortable implementing this user-facing
change without much time to test and consider the implications.
So, we're going with the quick and dirty solution for now.
Having this option in the 4.1 release will enable Mozilla to
easily produce and test zlib and zstd bundles with non-default
compression levels in production. This will help drive future
development of the feature and zstd integration with Mercurial.
Detailed hint message is now provided when 'pull --rebase' operation detects
unclean working dir, for example:
abort: uncommitted changes
(cannot pull with rebase: please commit or shelve your changes first)
Added tests for uncommitted merge, and for subrepo support verifying that same
hint is also passed to subrepo state check.
Add the ability for revlog objects to process revision flags and apply
registered transforms on read/write operations.
This patch introduces:
- the 'revlog._processflags()' method that looks at revision flags and applies
flag processors registered on them. Due to the need to handle non-commutative
operations, flag transforms are applied in stable order but the order in which
the transforms are applied is reversed between read and write operations.
- the 'addflagprocessor()' method allowing to register processors on flags.
Flag processors are defined as a 3-tuple of (read, write, raw) functions to be
applied depending on the operation being performed.
- an update on 'revlog.addrevision()' behavior. The current flagprocessor design
relies on extensions to wrap around 'addrevision()' to set flags on revision
data, and on the flagprocessor to perform the actual transformation of its
contents. In the lfs case, this means we need to process flags before we meet
the 2GB size check, leading to performing some operations before it happens:
- if flags are set on the revision data, we assume some extensions might be
modifying the contents using the flag processor next, and we compute the
node for the original revision data (still allowing extension to override
the node by wrapping around 'addrevision()').
- we then invoke the flag processor to apply registered transforms (in lfs's
case, drastically reducing the size of large blobs).
- finally, we proceed with the 2GB size check.
Note: In the case a cachedelta is passed to 'addrevision()' and we detect the
flag processor modified the revision data, we chose to trust the flag processor
and drop the cachedelta.
The conditional was updating the repository, which wasn't reflected in
subsequent logs on Windows, so the conditional is narrowed to just the serve
commands. The serve operation generates log files, so those are deleted to keep
the output of summary consistent.
The extra glob in test-command-template.t caused it to say no result was
reported. It used to be (within the past year), that both this and the missing
glob cases could be fixed simply by editing any output in the test, and
re-running it in interactive mode. But that no longer works, and I had to diff
*.t against *.t.err. I didn't dig into what changed.
Refuse to run 'hg pull --rebase' if there are uncommitted changes:
so that instead of going ahead with fetching changes and then suddenly aborting
the rebase, we can warn user of uncommitted changes (or unclean repo state)
right up front.
In tests, we create a 'histedit' session to verify that also an unfinished
state is detected and handled.
This revset returns the history of a range of lines (fromline, toline) of a
file starting from `rev` or the current working directory.
Added tests in test-annotate.t which already contains a reasonably complex
repository.
The function filters diff blocks as generated by mdiff.allblock function based
on whether they are contained in a given line range based on the "b-side" of
blocks.
The "troubles" template keyword returns a list of evolution troubles.
It is EXPERIMENTAL, as anything else related to changeset evolution.
Test it in test-obsolete.t which has troubled changesets.
Every other template has the "raw" link load "raw-file." However,
fileannotate.tmpl's "raw" link loads "raw-annotate." This feels
inconsistent and wrong.
As far as I can tell, linking to the "raw annotate" view has occurred
since 2006.
Similar to git, we add a special string:
HG: ------------------------ >8 ------------------------
that means anything below it is ignored in a commit message.
This is helpful for integrating with third-party tools that display the
Let's resurrect the docstring since our help module can detect the EXPERIMENTAL
tag and display it only if -v is specified.
This patch updates the test added by bbdfa2d5aaa2 since wdir() is now
documented.
Adds a new function to treemanifest that allows walking over the directories in
the tree. Currently it only accepts a matcher to prune the walk, but in the
future it will also accept a list of trees and will only walk over subtrees that
differ from the versions in the list. This will be useful for identifying what
parts of the tree are new to this revision, which is useful when deciding the
minimal set of trees to send to a client given that they have a certain tree
already.
Since this is intended for an extension to use, the only current consumer is a
test. In the future this function may be useful for implementing other
algorithms like diff and changegroup generation.
Previously, revlog.size equals to revlog.rawsize. However, the flag
processor framework could make a difference - "size" could mean the length
of len(revision(raw=False)), while "rawsize" means len(revision(raw=True)).
This patch makes it so.
This corrects "hg status" output when flag processor is involved. The call
stack looks like:
basectx.status -> workingctx._buildstatus -> workingctx._dirstatestatus
-> workingctx._checklookup -> filectx.cmp -> filelog.cmp -> filelog.size
-> revlog.size
watchman's paths encoding can differ from filesystem encoding. For example,
on Windows, it's always utf-8.
Before this patch, on Windows, mismatch in path comparison between fsmonitor
state and osutil.statfiles would yield a clean status for added/modified files.
In addition to status reporting wrong results, this leads to files being
discarded from changesets while doing history editing operations such as rebase.
Benchmark:
There is a little overhead at module import:
python -m timeit "import hgext.fsmonitor"
Windows before patch: 1000000 loops, best of 3: 0.563 usec per loop
Windows after patch: 1000000 loops, best of 3: 0.583 usec per loop
Linx before patch: 1000000 loops, best of 3: 0.579 usec per loop
Linux after patch: 1000000 loops, best of 3: 0.588 usec per loop
10000 calls to _watchmantofsencoding:
python -m timeit -s "from hgext.fsmonitor import _watchmantofsencoding, _fixencoding" "fname = '/path/to/file'" "for i in range(10000):" " if _fixencoding: fname = _watchmantofsencoding(fname)"
Windows (_fixencoding is True): 100 loops, best of 3: 19.5 msec per loop
Linux (_fixencoding is False): 100 loops, best of 3: 3.08 msec per loop
"pylint --version" shows:
pylint 2.0.0,
astroid 1.5.0
Python 2.7.13 (default, Dec 21 2016, 07:16:46)
[GCC 6.2.1 20160830]
I got "Your code has been rated at 10.00/10" every time and didn't know how
to turn it off. Therefore the fix.
This is similar to "revlog: use raw revisions in revdiff". revdiff()
generates raw text used in revlog directly.
This makes test-flagprocessor.t happy.
Similar to fixes in revlog.py, this patch uses "rawtext" to explicitly label
contents expected to be raw, and makes sure content stored in _cache is raw
text.
Now test-flagprocessor.t points us to another issue.
"baserevision" returns the text that will be used to apply deltas. Since
deltas are against raw texts, "baserevision" should return raw text.
Now test-flagprocessor.t points us to a new error.
This shows flag processor is broken with a bundle repo.
The test creates non-liner history to exercise code path where the
deltaparent cannot be reused.
Duplicating entire tests just because the output is different is both error
prone and can make the tests harder to read. This harnesses the existing '(?)'
infrastructure, both to improve readability, and because it seemed like the path
of least resistance.
The form is:
$ test_cmd
output (hghave-feature !) # required if hghave.has_feature(), else optional
out2 (no-hghave-feature2 !) # req if not hghave.has_feature2(), else optional
I originally extended the '(?)' syntax. For example, this:
2 r4/.hg/cache/checkisexec (execbit ?)
pretty naturally reads as "checkisexec, if execbit". In some ways though, this
inverts the meaning of '?'. For '(?)', the line is purely optional. In the
example, it is mandatory iff execbit. Otherwise, it is carried forward as
optional, to preserve the test output. I tried it the other way, (listing
'no-exec' in the example), but that is too confusing to read. Kostia suggested
using '!', and that seems fine.
Previously, if a series of optional output lines marked with '(?)' had a (glob)
in one of the first lines, the output would be reordered such that it came last
if none of the lines were output. The (re) declaration wasn't affected, which
was helpful in figuring this out. There were no tests for '(re) (?)' so add
that to make sure everything plays nice.
This patch adds --binary option to `hg diff` and `hg export` to allow more
control about when binary diffs are displayed in Git mode as well as some
tests to verify it behaves correctly (issue5510).
This was originally written for JSON templating where we would have to be
careful to not add extra comma, but seems generally useful.
Inner loop started by % operator has its own counter.
Commit 81e1f5bbf1fc54808649562d3ed829730765c540 from
https://github.com/indygreg/python-zstandard is imported without
modifications (other than removing unwanted files).
Updates relevant to Mercurial include:
* Support for multi-threaded compression (we can use this for
bundle and wire protocol compression).
* APIs for batch compression and decompression operations using
multiple threads and optimal memory allocation mechanism. (Can
be useful for revlog perf improvements.)
* A ``BufferWithSegments`` type that models a single memory buffer
containing N discrete items of known lengths. This type can be
used for very efficient 0-copy data operations.
# no-check-commit
We now have a dedicated help topic to describe bundle specification
strings. Let's update `hg bundle`'s documentation to reflect its
existence.
While I was hear, I also tweaked some wording which I felt was out
of date and needed tweaking. Specifically, `hg bundle` no longer
just deals with "changegroup" data: it can also generate files
that have non-changegroup data.
I softly formalized the concept of a "bundle specification" a while
ago when I was working on clone bundles and stream clone bundles and
wanted a more robust way to define what exactly is in a bundle file.
The concept has existed for a while. Since it is part of the clone
bundles feature and exposed to the user via the "-t" argument to
`hg bundle`, it is something we need to support for the long haul.
After the 4.1 release, I heard a few people comment that they didn't
realize you could generate zstd bundles with `hg bundle`. I'm
partially to blame for not documenting it in bundle's docstring.
Additionally, I added a hacky, experimental feature for controlling
the compression level of bundles in 054e64c4d837. As the commit
message says, I went with a quick and dirty solution out of time
constraints. Furthermore, I wanted to eventually store this
configuration in the "bundlespec" so it could be made more flexible.
Given:
a) bundlespecs are here to stay
b) we don't have great documentation over what they are, despite being
a user-facing feature
c) the list of available compression engines and their behavior isn't
exposed
d) we need an extensible place to modify behavior of compression
engines
I want to move forward with formalizing bundlespecs as a user-facing
feature. This commit does that by introducing a "bundlespec" help
page. Leaning on the just-added compression engine documentation
and API, the topic also conveniently lists available compression
engines and details about them. This makes features like zstd
bundle compression more discoverable. e.g. you can now
`hg help -k zstd` and it lists the "bundlespec" topic.
Previously, --headeronly would prevent --twice from working
because the ETag wasn't stored when --headeronly was used.
This feels like a bug. That feeling is reaffirmed by the fact
that this change doesn't regress any tests.
These tests would run if hghave.has_serve() were enabled on Windows. Windows
has no issue allowing an unpriviledged process to open port 13, so it doesn't
abort. The other tests are related to how MSYS tries to be helpful and converts
Unix constructs to the Windows equivalent. There isn't any way to disable this
behavior, though it supposedly doesn't happen if the exe is linked against the
MSYS library.
The daemonized serve process doesn't print these lines out (see 0c9eeb8c6b87).
I was able to get it to with the following hack:
diff --git a/mercurial/win32.py b/mercurial/win32.py
--- a/mercurial/win32.py
+++ b/mercurial/win32.py
@@ -418,6 +418,11 @@
return str(ppid)
def spawndetached(args):
+
+ import subprocess
+ return subprocess.Popen(args, cwd=pycompat.getcwd(), env=encoding.environ,
+ creationflags=subprocess.CREATE_NEW_PROCESS_GROUP).pid
+
# No standard library function really spawns a fully detached
# process under win32 because they allocate pipes or other objects
# to handle standard streams communications. Passing these objects
However, MSYS translates --prefixes starting with '/' to 'C:/MinGW/msys/1.0',
which changes the output. The output isn't so important that I want to spend a
bunch of time on this, and risk breaking some subtle behavior of `hg serve -d`
with the more complicated code.
The http test simply wasn't updated in d264983c27fa for Windows. It looks like
the https test meant to glob away the error message in fa7f5826debe, but forgot
the '*', and was subsequently removed in 01af1e5e4062.
Currently, Mercurial has a number of commands to show information. And,
there are features coming down the pipe that will introduce more
commands for showing information.
Currently, when introducing a new class of data or a view that we
wish to expose to the user, the strategy is to introduce a new command
or overload an existing command, sometimes both. For example, there is
a desire to formalize the wip/smartlog/underway/mine functionality that
many have devised. There is also a desire to introduce a "topics"
concept. Others would like views of "the current stack." In the
current model, we'd need a new command for wip/smartlog/etc (that
behaves a lot like a pre-defined alias of `hg log`). For topics,
we'd likely overload `hg topic[s]` to both display and manipulate
topics.
Adding new commands for every pre-defined query doesn't scale well
and pollutes `hg help`. Overloading commands to perform read-only and
write operations is arguably an UX anti-pattern: while having all
functionality for a given concept in one command is nice, having a
single command doing multiple discrete operations is not. Furthermore,
a user may be surprised that a command they thought was read-only
actually changes something.
We discussed this at the Mercurial 4.0 Sprint in Paris and decided that
having a single command where we could hang pre-defined views of
various data would be a good idea. Having such a command would:
* Help prevent an explosion of new query-related commands
* Create a clear separation between read and write operations
(mitigates footguns)
* Avoids overloading the meaning of commands that manipulate data
(bookmark, tag, branch, etc) (while we can't take away the
existing behavior for BC reasons, we now won't introduce this
behavior on new commands)
* Allows users to discover informational views more easily by
aggregating them in a single location
* Lowers the barrier to creating the new views (since the barrier
to creating a top-level command is relatively high)
So, this commit introduces the `hg show` command via the "show"
extension. This command accepts a positional argument of the
"view" to show. New views can be registered with a decorator. To
prove it works, we implement the "bookmarks" view, which shows a
table of bookmarks and their associated nodes.
We introduce a new style to hold everything used by `hg show`.
For our initial bookmarks view, the output varies from `hg bookmarks`:
* Padding is performed in the template itself as opposed to Python
* Revision integers are not shown
* shortest() is used to display a 5 character node by default (as
opposed to static 12 characters)
I chose to implement the "bookmarks" view first because it is simple
and shouldn't invite too much bikeshedding that detracts from the
evaluation of `hg show` itself. But there is an important point
to consider: we now have 2 ways to show a list of bookmarks. I'm not
a fan of introducing multiple ways to do very similar things. So it
might be worth discussing how we wish to tackle this issue for
bookmarks, tags, branches, MQ series, etc.
I also made the choice of explicitly declaring the default show
template not part of the standard BC guarantees. History has shown
that we make mistakes and poor choices with output formatting but
can't fix these mistakes later because random tools are parsing
output and we don't want to break these tools. Optimizing for human
consumption is one of my goals for `hg show`. So, by not covering
the formatting as part of BC, the barrier to future change is much
lower and humans benefit.
There are some improvements that can be made to formatting. For
example, we don't yet use label() in the templates. We obviously
want this for color. But I'm not sure if we should reuse the existing
log.* labels or invent new ones. I figure we can punt that to a
follow-up.
At the aforementioned Sprint, we discussed and discarded various
alternatives to `hg show`.
We considered making `hg log <view>` perform this behavior. The main
reason we can't do this is because a positional argument to `hg log`
can be a file path and if there is a conflict between a path name and
a view name, behavior is ambiguous. We could have introduced
`hg log --view` or similar, but we felt that required too much typing
(we don't want to require a command flag to show a view) and wasn't
very discoverable. Furthermore, `hg log` is optimized for showing
changelog data and there are things that `hg display` could display
that aren't changelog centric.
There were concerns about using "show" as the command name.
Some users already have a "show" alias that is similar to `hg export`.
There were also concerns that Git users adapted to `git show` would
be confused by `hg show`'s different behavior. The main difference
here is `git show` prints an `hg export` like view of the current
commit by default and `hg show` requires an argument. `git show`
can also display any Git object. `git show` does not support
displaying more complex views: just single objects. If we
implemented `hg show <hash>` or `hg show <identifier>`, `hg show`
would be a superset of `git show`. Although, I'm hesitant to do that
at this time because I view `hg show` as a higher-level querying
command and there are namespace collisions between valid identifiers
and registered views.
There is also a prefix collision with `hg showconfig`, which is an
alias of `hg config`.
We also considered `hg view`, but that is already used by the "hgk"
extension.
`hg display` was also proposed at one point. It has a prefix collision
with `hg diff`. General consensus was "show" or "view" are the best
verbs. And since "view" was taken, "show" was chosen.
There are a number of inline TODOs in this patch. Some of these
represent decisions yet to be made. Others represent features
requiring non-trivial complexity. Rather than bloat the patch or
invite additional bikeshedding, I figured I'd document future
enhancements via TODO so we can get a minimal implmentation landed.
Something is better than nothing.
The "genbits" implementation is actually incorrect. This patch fixes it. A
good "genbits" implementation should pass the below assertion:
n = 3 # or other number
l = list(genbits(n))
assert 2**(n*2) == len(set((l[i]<<n)+l[i+1] for i in range(len(l)-1)))
An assertion is added to make sure "genbits" won't work unexpectedly.
In filerevision view (/file/<rev>/<fname>) we add some event listeners on
mouse clicks of <span> elements in the <pre class="sourcelines"> block.
Those listeners will capture a range of lines selected between two mouse
clicks and a box inviting to follow the history of selected lines will then
show up. Selected lines (i.e. the block of lines) get a CSS class which make
them highlighted. Selection can be cancelled (and restarted) by either
clicking on the cancel ("x") button in the invite box or clicking on any other
source line. Also clicking twice on the same line will abort the selection and
reset event listeners to restart the process.
As a first step, this action is only advertised by the "cursor: cell" CSS rule
on source lines elements as any other mechanisms would make the code
significantly more complicated. This might be improved later.
All JavaScript code lives in a new "linerangelog.js" file, sourced in
filerevision template (only in "paper" style for now).
All 3 users of _addrevision use raw:
- addrevision: passing rawtext to _addrevision
- addgroup: passing rawtext and raw=True to _addrevision
- clone: passing rawtext to _addrevision
There is no real user using _addrevision(raw=False). On the other hand,
_addrevision is low-level code dealing with raw revlog deltas and rawtexts.
It should not transform rawtext to non-raw text.
This patch removes the "raw" parameter from "_addrevision", and does some
rename and doc change to make it clear that "_addrevision" expects rawtext.
Archeology shows 886a08012bbe added "raw" flag to "_addrevision", follow-ups
fe1e206cb389 and 1cfa6239c923 seem to make the flag unnecessary.
test-revlog-raw.py no longer complains.
See the added comment. revdiff is meant to output the raw delta that will be
written to revlog. It should use raw.
test-revlog-raw.py now shows "addgroupcopy test passed", but there is more
to fix.
Using external content provided by flagprocessor when building revlog delta
is wrong, because deltas are applied to raw contents in revlog.
This patch fixes the above issue by adding "raw=True".
test-revlog-raw.py now shows "local test passed", but there is more to fix.
As documented at revlog.__init__, revlog._cache stores raw text.
The current read and write usage of "_cache" in revlog.revision lacks of
raw=True check.
This patch fixes that by adding check about raw, and storing rawtext
explicitly in _cache.
Note: it may slow down cache hit code path when raw=False and flags=0. That
performance issue will be fixed in a later patch.
test-revlog-raw now points us to a new problem.
The python hooks have access to the hook type information. There is not reason
for external hook to not be aware of it too.
For the record my use case is to make sure a hook script is configured for the
right type.
Hooks related to the transaction are aware of the transaction id. By definition
this txn-id is unique and different for each transaction. As a result it can
never be predicted in test and always needs matching. As a result, touching any
like with this data is annoying. We solve the problem once and for all by
installing an automatic replacement. In test, this will now show as:
TXNID=TXN:$ID$
Previously, the pull would succeed, but the subsequent rebase would fail due
to the rebase.requiredest flag. Now abort earlier with a more useful error
message.
As shown in issue5513, --continue is broken when destination is required. This
adds a patch that demonstates this silly behavior, which will be fixed in a
future patch.
When "patch" query parameter is present in requests to filelog view, line ids
in patches diff are no longer unique in the page since several patches are
shown on the same page. We now prefix line id by changeset shortnode when
several patches are displayed in the same page to have unique line ids
overall.
revlog.revision takes either node or rev, but taking a rev is more
efficient, because converting rev to node is just a seek and read.
That's cheaper than converting node to rev, which may require O(n) walk in
revlog index for the first times, and then triggering building the radix
tree index. Even with the radix tree built, rev -> node is still faster than
node -> rev because the radix tree requires more jumps in memory.
So r.revision(r.node(rev)) should be changed to r.revision(rev). This patch
adds a check-code rule to detect that.
The previous clause for filter out a diff hunk was too restrictive. We need to
consider the following cases (assuming linerange=(lb, ub) and the @s2,l2
hunkrange):
<-(s2)--------(s2+l2)->
<-(lb)---(ub)->
<-(lb)---(ub)->
<-(lb)---(ub)->
previously on the first and last situations were considered.
In test-hgweb-filelog.t, add a couple of lines at the beginning of file "b" so
that the line range we will follow does not start at the beginning of file.
This covers the change in aforementioned diff hunk filter clause.
a91c6275 introduces flushing ui buffers after a worker finished. If the ui was
not flushed before the worker was started, fork will copy the existing buffers
to the worker. This causes messages issued before the worker started to be
written to the terminal for each worker.
We are now flushing the ui before we start a worker and add an appropriate test
which will fail before this patch.
This is similar to what 9704c8e70d2d does. Since 1363aaf74791 has changed
"127.0.0.1" to "$LOCALIP". The glob pattern needs update accordingly. It is
expected to fix tests running in some BSD jails.
We now handle a "linerange" URL query parameter to filter filelog using
a logic similar to followlines() revset.
The URL syntax is: log/<rev>/<file>?linerange=<fromline>:<toline>
As a result, filelog entries only consists of revision changing specified
line range.
The linerange information is propagated to "more"/"less" navigation links but
not to numeric navigation links as this would apparently require a dedicated
"revnav" class.
Only update the "paper" template in this patch.
Add support for a "patch" query parameter in filelog web command similar to
--patch option of `hg log` to display the diff of each changeset in the table
of revisions. The diff text is displayed in a dedicated row of the table that
follows the existing one for each entry and spans over all columns. Only
update "paper" template in this patch.
Before 33e44341bb82, histedit (like rebase) was only creating markers on final
success from the old-rewritten node to the newly created nodes (as of before
33e44341bb82). In case of abort the aborted attempt were stripped to restore the
repository in its state prior to the attempt.
This use of strip was on purpose. Using markers in this case introduces various
issues. The main one is that keeping the partial result of histedit as obsolete
prevents us to recreates the same nodes in a second attempt. The same operation
will lead to an identical results, using an identical node that already exists
in the repository as obsolete.
To conclude, we cannot and should not switch to obsolescence markers creation on
histedit --abort and we backout 33e44341bb82. A test to catch this class of
issue will be introduced in the next changeset.
Commit messages often contain vertically aligned text. The default
paper style already uses monospace fonts for rendering commit messages.
And, AFAICT, a number of Git servers also render commit messages
with monospace. It seems like the reasonable thing to do.
This commit converts all instances of the full commit message
in the gitweb style to render with monospace.
Empty changelist descriptions are valid in Perforce. If we encounter one of
them we are currently running into an IndexError. In case of empty commit
messages set the commit message to **empty changelist description**, which
follows Perforce terminology.
When the config is set to true, status output becomes relative to the
working directory. This has bugged me since I started using hg and it
turns it is sillily simple to support it (unless I missed something,
of course).
We could also add a --relative flag, but I would personally always
want that on, and I haven't heard any use for having it sometimes on,
so this patch only lets you enable it via config.
We only have commands.{update,rebase}.requiredest so far. We should
clearly ignore those two if HGPLAIN is in effect, and it seems like we
should ignore any future config that will be added in [commands] since
that is about changing the behavior of commands.
Thanks to Yuya for suggesting to centralize the code in ui.py.
While at it, remove the unnecessary False values passed to
ui.configbool() for the aforementioned config options.
Unlike p1 = null, p2 = null denotes the revision has only one parent, which
shouldn't be considered a child of the null revision. This was spotted while
fixing the issue4682 and rediscovered as issue5439.
This is the behavior of the default __import__() function, which doesn't
validate the existence of the fromlist items. Later on, the missing attribute
is detected while processing the import statement.
https://hg.python.org/cpython/file/v2.7.13/Python/import.c#l2575
The comtypes library relies on this (maybe) undocumented behavior, and we
got a bug report to TortoiseHg, sigh.
https://bitbucket.org/tortoisehg/thg/issues/4647/
The test added at 0be19b069edf verifies the behavior of the import statement,
so this patch only adds the test of __import__() function and works around
CPython/PyPy difference.
Currently the warning is ambiguous about whether the new tag (possibly specified
via --rev) is being added on a branch head or whether the working directory is
based on a branch head. Clarify the error message to eliminate this ambiguity.
test-largefiles-update.t creates temporary file exec-bit.patch inside
the working directory for no-execbit platform specific test, but
subsequent tests aren't aware of it.
On execbit platform, subsequent tests can run successfully, because
exec-bit.patch isn't created.
But on no-execbit platform, this temporary file makes subsequent tests
show "? exec-bit.patch" at each "hg status".
journal extension uses util.shellquote() to record command line, but
result of it depends on runtime platform: double quotation is used on
Windows and OpenVMS, but single quotation is used otherwise.
test-journal-share.t sometimes specifies commit messages including
white space on command line. It makes journal output depend on runtime
platform, but commit message itself isn't important in this test case.
On Windows, strftime() doesn't support format code "%s", and it causes
"invalid format string" error.
https://msdn.microsoft.com/en-us/library/fe06s4ak.aspx
test-command-template.t examines not seconds value in UTC, but
arithmetic calculation. Therefore, using format code "%Y" instead of
"%s" should be reasonable.
FYI:
- Python standard library reference doesn't list "%s" up in format
code list required for "C standard (1989 version)", even though it
also mentions that additional format codes are required for "C
standard (1999 version)"
https://docs.python.org/2.7/library/datetime.html#strftime-and-strptime-behavior
- The Open Group Base Specifications Issue 7 (IEEE Std 1003.1-2008,
2016 Edition) doesn't require strftime to support format code "%s"
http://pubs.opengroup.org/onlinepubs/9699919799/functions/strftime.html
- "man strftime" of (Open/Oracle) Solaris and Mac OS X (= UNIX
certified OSs) describes about format code "%s"
If environment variable looks like PATH or so (e.g. any of components
joined by ":" contains "/"), ":" in it is replaced with ";" by MinGW
at spawning Windows native process, to follow path concatenation style
of Windows.
Therefore, "bundle:../full.hg" is converted into "bundle;..\full.hg"
on MinGW.
Difference between "/" and "\" is automatically ignored by "(glob)",
but difference between ":" and ";" should be globed explicitly.
On Windows platform, invoking printenv.py directly via hook is
problematic, because:
- unless binding between *.py suffix and python runtime, application
selector dialog is displayed, and running test is blocked at each
printenv.py invocations
- it isn't safe to assume binding between *.py suffix and python
runtime, because application binding is easily broken
For example, installing IDE (VisualStudio with Python Tools, or
so) often requires binding between source files and IDE itself.
This patch invokes printenv.py via sh -c for test portability. This is
a kind of follow up for 9e4331825bea, which eliminated explicit
"python" for printenv.py. There are already other 'sh -c "printenv.py"'
in *.t files, and this fix should be reasonable.
This changes were confirmed in cases below:
- without any application binding for *.py suffix
- with binding between *.py suffix and VisualStudio
This patch also replaces "echo + redirection" style with "heredoc"
style, because:
- hook command line is parsed by cmd.exe as shell at first, and
- single quotation can't quote arguments on cmd.exe, therefore,
- "printenv.py foobar" should be quoted by double quotation, but
- nested quoting (or tricky escaping) isn't readable
cl._partialmatch() can be pretty slow if hidden revisions are involved. This
patch cancels the slowdown introduced by the previous patch by using an
unfiltered changelog, which means shortest(node) isn't always the shortest.
The result isn't perfect, but seems okay as long as shortest(node) is short
enough to type and can be used as an identifier.
(with hidden revisions)
% hg log -R hg-committed -r0:20000 -T '{node|shortest}\n' --time > /dev/null
(.^^) time: real 1.530 secs (user 1.480+0.000 sys 0.040+0.000)
(.^) time: real 43.080 secs (user 43.060+0.000 sys 0.030+0.000)
(.) time: real 1.680 secs (user 1.650+0.000 sys 0.020+0.000)
cl.index.partialmatch() isn't a drop-in replacement for cl._partialmatch().
It has no knowledge about hidden revisions, and it raises ValueError if a node
shorter than 4 chars is given. Instead, use index.partialmatch() through
cl._partialmatch(), which has no such problems and gives the identical result
with/without --pure.
The test output was sampled with --pure without this patch, which shows the
most correct result. However, we'll need to switch to using an unfiltered
changelog because _partialmatch() of a filtered changelog can be an order of
magnitude slower.
(with hidden revisions)
% hg log -R hg-committed -r0:20000 -T '{node|shortest}\n' --time > /dev/null
(.^) time: real 1.530 secs (user 1.480+0.000 sys 0.040+0.000)
(.) time: real 43.080 secs (user 43.060+0.000 sys 0.030+0.000)
On some platforms, cwd can't be removed. In which case, util.unlinkpath()
continues with no error since the failure of directory removal isn't critical.
So it doesn't make sense to run the test added by 6395630fdfdc on those
platforms. OTOH, we need to run the test in test-rebase-scenario-global.t
since the repository is referenced after that.
This is a fix for a regression introduced by the patches for issue4028.
The test changes are due to us doing fewer _checkcopies searches now, which
makes some test outputs revert to the pre-issue4028 behavior. That issue itself
remains fixed, we only skip copy tracing for files where it isn't relevant.
As a nice side effect, this makes copy detection much faster when tracing
backwards through lots of renames.
Over the past week I've had to instruct multiple people to run
Python code to query the ssl module to see what TLS protocol support
is present. I think it would be useful for `hg debuginstall` to print
this info to make it easier to access and debug why Mercurial is
complaining about using an insecure TLS 1.0 protocol.
Ideally we'd also print the path to the CA cert bundle. But the APIs
for querying that in sslutil can emit warnings, making it slightly
more difficult to integrate into `hg debuginstall`. That work will
have to wait for another day.
A code snippet that has been around since largefiles was introduced was wrong:
Standins no longer found in lfdirstate has *not* been removed -
they have probably just been deleted ... or not created.
This wrong reporting did that 'up -C' didn't undo the change and didn't sync
the two dirstates.
Instead of reporting such files as removed, propagate the deletion to the
standin file and report the file as deleted.
Largefiles are fragile with the design where dirstate and lfdirstate must be
kept in sync.
To be less fragile, mark all clean largefiles as unsure ("normallookup") before
updating standins. After standins have been updated and we know exactly which
largefile standins actually was changed, mark the unchanged largefiles back to
clean ("normal").
This will make the failure mode more safe. If interrupted, the next command
will continue to perform extra hashing of all largefiles. That will do that all
largefiles that are out of sync with their standin will be marked dirty and
they will show up in status and can be cleaned with update --clean.
Test using existing changesets in a clean working directory, revealing problems
with files that don't show up as modified or do show up as removed when they
just not have been written yet.
Revlog can now be configured to store full snapshot only. This is used on the
changelog. However, the changegroup packing was still recomputing deltas to be
sent over the wire.
We now just reuse the full snapshot directly in this case, skipping delta
computation. This provides use with a large speed up(-30%):
# perfchangegroupchangelog on mercurial
! wall 2.010326 comb 2.020000 user 2.000000 sys 0.020000 (best of 5)
! wall 1.382039 comb 1.380000 user 1.370000 sys 0.010000 (best of 8)
# perfchangegroupchangelog on pypy
! wall 5.792589 comb 5.780000 user 5.780000 sys 0.000000 (best of 3)
! wall 3.911158 comb 3.920000 user 3.900000 sys 0.020000 (best of 3)
# perfchangegroupchangelog on mozilla central
! wall 20.683727 comb 20.680000 user 20.630000 sys 0.050000 (best of 3)
! wall 14.190204 comb 14.190000 user 14.150000 sys 0.040000 (best of 3)
Many tests have to be updated because of the change in bundle content. All
theses update have been verified. Because diffing changelog was not very
valuable, the resulting bundle have similar size (often a bit smaller):
# full bundle of mozilla central
with delta: 1142740533B
without delta: 1142173300B
So this is a win all over the board.
When working in a rotated DAG (for a graftlike merge), there can be files
that are renamed both between the base and the topological CA, and between
the TCA and the endpoint farther from the base. Such renames span the TCA
(and thus need both passes of _checkcopies to be fully detected), but may
not necessarily be divergent.
Make _checkcopies return "incomplete copies" and "incomplete divergences"
in this case, and let mergecopies recombine them once data from both passes
of _checkcopies is available.
With this patch, all known cases involving renames and grafts pass.
(Developed together with Pierre-Yves David)
Add a "trouble" line in changeset header along with a couple of labels on
"log.changeset" line to indicate whether a changeset is troubled or not and
which kind trouble occurs.
During a graftlike merge, _checkcopies runs from ctx to tca, possibly
passing over the merge base. If there is a rename both before and after
the base, then we're actually dealing with divergent renames.
If there is no rename on the other side of tca, then the divergence is
contained entirely in the range of one _checkcopies invocation, and
should be detected "in the loop" without having to rely on the other
_checkcopies pass.