Similar to what was explained in the previous commit, the diff code
expected copy source to be in "ctx1", which is not always the case
during a merge. This has been broken since before hg 2.0.
Also similar to the previous commit, we fix the problem by fixing up
the copy dict.
During a merge, if the user removes a file that came from parent 2 and
did not exist in parent 1, the file's status will be "removed". This
surprises the diff code, which crashes because it expects removed
files exist in parent 1. This has been broken since ff976121fb34
(trydiff: use 'not in addedset' for symmetry with 'not in removedset',
2014-12-23).
Fix by fixing up the list of removed file, similar to how we currently
fix up the list of modified and added files during a merge.
This prepares for future patches, and it also lets us remove the ugly
"ctx1" argument to _filepairs() (ugly because of its assymmetry --
there's no "ctx2" argument).
The behaviour in this case is undefined. Instead of silently doing something
"random" and surprising, at least issue a warning.
(This should perhaps be considered a "deprecation" and turned into an error in
a future release.)
Positional parameters are also treated as revisions, but the order of revisions
matters and it will often be wrong if the user understands it as `-r` taking
multiple revisions as `-r REV1 REV2`.
(Alternatively, `-r` could be turned into a no-op flag as the documentation
suggests. That would however be less "semantic markup" and I agree with the
implementation in 40cbb25097c8 but not the documentation.)
Closing files that have been appended to is slow on Windows/NTFS.
CloseHandle() calls on this platform often take 1-10ms - and that's
on my i7-6700K Skylake processor with a modern and fast SSD. Contrast
with other I/O operations, such as writing data, which take <100us.
This means that creating/appending thousands of files can add
significant overhead. For example, cloning mozilla-central creates
~232,000 revlog files. Assuming 1ms per CloseHandle(), that yields
232s (3:52) of wall time waiting for file closes!
The impact of this overhead can be measured most directly when applying
stream clone bundles. Applying these files is effectively uncompressing
a tar archive (read: it's very fast).
Using a RAM disk (read: no I/O wait), the difference in wall time for a
`hg debugapplystreamclonebundle` for a ~1731 MB mozilla-central bundle
between Windows and Linux from the same machine is drastic:
Linux: ~12.8s (128MB/s)
Windows: ~352.0s (4.7MB/s)
Windows is ~27.5x slower. Yikes!
After this patch:
Linux: ~12.8s (128MB/s)
Windows: ~102.1s (16.1MB/s)
Windows is now ~3.4x faster. Unfortunately, it is still ~8x slower than
Linux. Profiling reveals a few hot code paths that could likely be
improved. But those are for other patches.
This patch introduces test-clone-uncompressed.t because existing tests
of `clone --uncompressed` are scattered about and adding a variation for
background thread closing to e.g. test-http.t doesn't feel correct.
Closing files that have been appended to is relatively slow on
Windows/NTFS. This makes several Mercurial operations slower on
Windows.
The workaround to this issue is conceptually simple: use multiple
threads for I/O. Unfortunately, Python doesn't scale well to multiple
threads because of the GIL. And, refactoring our code to use threads
everywhere would be a huge undertaking. So, we decide to tackle this
problem by starting small: establishing a thread pool for closing
files.
This patch establishes a mechanism for closing file handles on separate
threads. The coordinator object is basically a queue of file handles to
operate on and a thread pool consuming from the queue.
When files are opened through the VFS layer, the caller can specify
that delay closing is allowed.
A proxy class for file handles has been added. We must use a proxy
because it isn't possible to modify __class__ on built-in types. This
adds some overhead. But as future patches will show, this overhead
is cancelled out by the benefit of closing file handles on background
threads.
This provides a general-purpose interface to all custom namespaces.
The {namespaces} keyword honors the definition order of namespaces as they
are kept by sortdict.
This is necessary to obtain a _hybrid object from a dict. If get() yields
a value, it would be stringified.
I see no benefit to make get() lazy, so this patch just changes "yield" to
"return".
In templater, a callable symbol exists for lazy evaluation, which should have
f(**mapping) signature. On the other hand, _hybrid.__call__(), which was
introduced by 4e182fb53989, generates mapping for each element.
This patch renames _hybrid.__call__() to _hybrid.itermaps() so that a _hybrid
object can be a value of a mapping dict.
{namespaces % "{namespace}: {names % "{name }"}\n"}
~~~~~
a _hybrid object
The added content is inside a verbose container.
I figure it makes sense to explicitly document behavior, including
with the caveat it may change later. People can't say they weren't
warned!
Before the existence of `hg debugbundle --spec`, the process for
defining the BUNDLESPEC value in manifests was not very clear and not
trivial to automate, especially in the case of stream clone bundles.
This patch adds documentation to note the existence of
`hg debugbundle --spec`. We drop the reference to stream clone
requirements handling because it is now redundant with
`hg debugbundle --spec`. While we are here, we further reinforce the
importance of defining BUNDLESPEC.
We don't currently have a mechanism for inferring bundle spec strings
from bundle files. This patch adds one.
This will eventually be used to make the producing of clone bundles
manifests easier.
RFC 7159 does not state that U+007F must be escaped, but it is widely
considered a control character. As '\x7f' is invisible on a terminal, and
Python's json.dumps() escapes '\x7f', let's do the same.
Some evil evil awful tool adds resource forks to files it's comparing.
Our Mac-specific code to do bulk stats was accidentally using "total
size" which includes those forks in the file size, causing them to be
reported as modified. This changes it to only care about the normal
data size and thus agree with what Mercurial's expecting.
If we move all the files out of one directory, but into two different
directories, we should not consider it a directory rename. The
detection of this case was broken.
This lets us greatly simply acquire/release cycles.
If the block completes without raising an exception, the transaction
is closed.
Code pattern before:
try:
tr = repo.transaction('x')
# zillions of lines of code
tr.close()
finally:
tr.release()
And after:
with tr.transaction('x'):
# ...