Previously, nodemap or childmap were removed if they were found broken at load
time, but not at runtime. That means with verify=1, the repo would stay in a
broken state forever if either cache was broken. This patch removes the cache
files if things go wrong at runtime.
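A minimal sketch of the idea, not the extension's actual code. `CorruptedCache`, `lookup_or_reset` and the way cache paths are passed in are hypothetical names used only for illustration:

```python
import os


class CorruptedCache(Exception):
    """Hypothetical error raised when a cache file fails verification."""


def lookup_or_reset(lookup, cache_paths):
    """Run lookup(); if the cache proves broken, delete the cache files.

    Deleting the files lets the next invocation rebuild them from scratch
    instead of hitting the same corruption again.
    """
    try:
        return lookup()
    except CorruptedCache:
        for path in cache_paths:
            try:
                os.unlink(path)
            except OSError:
                pass  # already gone or not removable; nothing more to do
        raise
```

The error is re-raised after the cleanup so the current command still fails visibly; only later commands benefit from the rebuilt cache.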
Differential Revision: https://phab.mercurial-scm.org/D1711
After applying the suggestions from https://phab.mercurial-scm.org/D1564
to catch all exceptions in the same way, I actually broke the handling of
KeyboardInterrupt on Windows. The reason is that, starting from Python 2.5,
KeyboardInterrupt derives from BaseException rather than Exception:
https://docs.python.org/2/library/exceptions.html
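The difference can be reproduced in isolation. This is a small standalone sketch, not the worker code touched by this patch; `run_guarded` and `interrupted` are made-up helpers:

```python
def run_guarded(func, catch):
    """Call func(); report whether an exception of type `catch` was caught."""
    try:
        func()
    except catch as exc:
        return "caught %s" % type(exc).__name__
    return "no exception"


def interrupted():
    # Stands in for the user pressing ctrl-c while a worker is running.
    raise KeyboardInterrupt()


# KeyboardInterrupt sits beside Exception in the hierarchy, not below it.
assert issubclass(KeyboardInterrupt, BaseException)
assert not issubclass(KeyboardInterrupt, Exception)

# Catching BaseException handles the interrupt...
print(run_guarded(interrupted, BaseException))

# ...while catching Exception lets it escape to the caller.
try:
    run_guarded(interrupted, Exception)
except KeyboardInterrupt:
    print("escaped the Exception handler")
```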
Test Plan:
Run hg on Windows and press ctrl-c during a large update. No random
exceptions from threads surface in the shell. Previously we'd nearly always get
stack traces from some of the threads.
Run tests ./run-tests.py
[...]
Failed test-convert-svn-encoding.t: output changed
# Ran 622 tests, 41 skipped, 1 failed.
python hash seed: 2962682116
The failing test seems to have nothing to do with this change; it fails on the
base revision as well.
Differential Revision: https://phab.mercurial-scm.org/D1718
02c30db0443d (lfs: add a repo requirement for this extension once an lfs
file is committed) introduced a regression that prevents committing file
deletion. This patch fixes that.
Differential Revision: https://phab.mercurial-scm.org/D1717
The changes I see on the buildbot match the ones I see on my laptop,
and all look reasonable.
Differential Revision: https://phab.mercurial-scm.org/D1713
The indexes module depends directly on the python27-sys crate for low-level
manipulation of Python structures. This is located in a subdir in the cpython
git repository.
While that works for git repositories, this causes problems when the crate is
vendored, as Cargo is no longer able to find the directory.
Add a comment to the git URL line with the name of the subdirectory. This will
be used by the build script to locate the vendored version of the crate.
Differential Revision: https://phab.mercurial-scm.org/D1712
I've seen the following error a few times recently when running the tests with
`yes | ./run-tests.py --local -j9 -i`:
    Errored test-add.t: Traceback (most recent call last):
      File "./run-tests.py", line 821, in run
        self.runTest()
      File "./run-tests.py", line 910, in runTest
        if self._result.addOutputMismatch(self, ret, out, self._refout):
      File "./run-tests.py", line 1774, in addOutputMismatch
        rename(test.errpath, test.path)
      File "./run-tests.py", line 571, in rename
        os.remove(src)
    WindowsError: [Error 32] The process cannot access the file because it is being
    used by another process: 'c:\\Users\\Matt\\projects\\hg\\tests\\test-add.t.err'
This change doesn't fix the problem, but it seems like a simple enough
improvement.
We could use patch.diffhunks() instead of patch.diff() to get filenames
without parsing patch content, but that isn't always possible because we
sometimes feed raw patch data to patch.diffstat().
Similar to upstream commit ddd65b4f (tests: alias syshg and syshgenv so they
can be switched conditionally, 2017-07-01), use testrepohg so it can use
system hg to handle additional repo requirements like lz4revlog or
treedirstate.
Differential Revision: https://phab.mercurial-scm.org/D1707
Change clindex so it can use the rust nodemap for node lookups and partial
match. By default, the code runs in a "verify" mode that compares the rust
nodemap against the original nodemap. Once we gain more confidence, we can
drop the support of the original nodemap (so it does not build a radix
tree).
After this, there is no need to enable perftweaks.cachenoderevs or
fastpartialmatch.
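The "verify" mode can be pictured roughly as follows. This is a sketch under assumptions, not the clindex code: `VerifyingNodemap`, its constructor arguments and the choice to fall back to the reference answer on a mismatch are all hypothetical.

```python
class VerifyingNodemap(object):
    """Answer lookups from `primary`, cross-checking against `reference`."""

    def __init__(self, primary, reference, report):
        self._primary = primary      # e.g. the new nodemap
        self._reference = reference  # e.g. the original nodemap
        self._report = report        # callable taking a message string

    def __getitem__(self, node):
        rev = self._primary[node]
        expected = self._reference[node]
        if rev != expected:
            # Record the disagreement and trust the reference (assumption).
            self._report("inconsistent: %r -> %r, expected %r"
                         % (node, rev, expected))
            return expected
        return rev
```

Dropping the verification later would amount to answering from `primary` alone, which is what removes the need to build the second index.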
Performance-wise, the nodemap seems to be a bit faster than the revlog.c trie.
The former takes 91% (+/-5%) of the time needed for the latter. That was
tested by running the following script 20 times on fbsource:
    from mercurial.cext import parsers
    from hgext3rd.rust import indexes
    import contextlib, os
    m = {}
    @contextlib.contextmanager
    def measure(name):
        utime, stime, cutime, cstime, elapsed = os.times()
        try:
            yield
        finally:
            utime2, stime2, cutime, cstime, elapsed2 = os.times()
            print('%14s: User %.3f Sys %.3f Real %.3f'
                  % (name, utime2 - utime, stime2 - stime, elapsed2 - elapsed))
            m[name] = elapsed2 - elapsed
    cl = open('00changelog.i').read()
    with measure('nodemap'):
        nm = indexes.nodemap(cl, indexes.nodemap.emptyindexbuffer())
        nm.partialmatch('da0443cd533f4073e8d0e324d55')
    with measure('revlog index'):
        idx = parsers.index(cl, False)
        idx.partialmatch('da0443cd533f4073e8d0e324d55')
    print('nodemap: %.2f revlog' % (m['nodemap'] / m['revlog index']))
Test Plan:
**Checks clindex with rust nodemap can be a drop-in replacement**
Change `revlog.py` temporarily to disable its inline feature so clindex will be used:
-_maxinline = 131072
+_maxinline = 0
Then run tests with the extension enabled:
./run-tests.py -l --extra-config-opt=extensions.clindex= --extra-config-opt=clindex.lagthreshold=1 --extra-config-opt=clindex.logpath=/tmp/l -j `nproc`
Check the test output change makes sense (ex. extension list change, .hg
content change, help text change) and /tmp/l does not have contents like
"inconsistent" or "corrupted".
**Check memory safety**
Rebuild Python to be valgrind friendly (--without-pymalloc --with-valgrind).
Disable re2 by changing `import re2` in util.py to `import not_exist`, since it
would otherwise trigger a lot of "read uninitialized memory" reports.
Use valgrind to run a few hg commands:
valgrind python2 hg commit -m 1 --config extensions.clindex= --config clindex.lagthreshold=1
In particular, do a rebase with obsstore disabled to exercise the "strip" code
path and make sure valgrind does not report anything except for memory leaks
(normal for Python).
Differential Revision: https://phab.mercurial-scm.org/D1472
Thanks to D1411, it's now easy to add another rust module.
Test Plan:
`make local` and make sure `hgext3rd/rust/indexes.so` is built and
importable.
Differential Revision: https://phab.mercurial-scm.org/D1471
This makes it possible to pass the changelog.i data from Python to Rust
without copying it. Since changelog.i eventually does not have to be fully
accessed, combining this with mmap is a perf win of at least 100ms.
There is a much more general-purpose `PyBuffer` implementation (supporting
writable, non-contiguous buffers) in the GitHub version of rust-cpython [1].
However, what `NodeMap` really wants is `AsRef<[u8]>` and `AsRef<[u32]>`,
which are not provided by that.
Since the code is short, let's just add our own wrapper. It also allows us
to use the published version of rust-cpython instead of pulling from github.
[1]: https://github.com/dgrunwald/rust-cpython/blob/master/src/buffer.rs
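The patch itself is on the Rust side. As a rough Python-only illustration of why mapping the file helps when only part of it is accessed, consider this sketch; `read_u32_le` is a hypothetical helper and not part of the change:

```python
import mmap
import struct


def read_u32_le(path, index):
    """Return the index-th little-endian u32 stored in the file at `path`.

    The file is memory-mapped, so only the pages that are actually touched
    get read from disk, and no bytes object holding the whole file is built.
    """
    with open(path, "rb") as f:
        m = mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ)
        try:
            # unpack_from reads through the buffer protocol, straight from
            # the mapping.
            return struct.unpack_from("<I", m, index * 4)[0]
        finally:
            m.close()
```

Compare this with `open(path, 'rb').read()`, which always pulls the entire file into a new object before anything can be looked up.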
Test Plan:
cargo build
Differential Revision: https://phab.mercurial-scm.org/D1493
A simple nodemap index (node -> rev, and prefix -> node). See comment in
the code for the actual format.
Note this does not cover every feature of a Mercurial nodemap. For example,
it does not have `__setitem__`, `__delitem__`, and does not treat rev -1
specially. Those could be implemented in a higher level.
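To make the two lookups concrete, here is a toy model of the interface. It deliberately swaps the radix tree for a sorted list with binary search, so it says nothing about the real on-disk format; `SimpleNodemap` and its method names are illustrative only.

```python
import bisect


class SimpleNodemap(object):
    """Toy node -> rev and hex prefix -> node index over a sorted list."""

    def __init__(self, hexnodes):
        # hexnodes[rev] is the hex node of revision `rev`.
        self._rev = {node: rev for rev, node in enumerate(hexnodes)}
        self._sorted = sorted(hexnodes)

    def rev(self, node):
        return self._rev[node]

    def partialmatch(self, prefix):
        """Return the unique node starting with prefix, or None if absent.

        Raises ValueError when more than one node shares the prefix.
        """
        i = bisect.bisect_left(self._sorted, prefix)
        matches = []
        while i < len(self._sorted) and self._sorted[i].startswith(prefix):
            matches.append(self._sorted[i])
            if len(matches) > 1:
                raise ValueError("ambiguous prefix: %s" % prefix)
            i += 1
        return matches[0] if matches else None
```

Like the index described above, this toy has no `__setitem__` or `__delitem__` and gives rev -1 no special meaning.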
Differential Revision: https://phab.mercurial-scm.org/D1469
Implement the main radix tree with tests. A quick benchmark shows the
insertion performance is similar to the known revlog.c implementation.
The time complexity is about `O(N * log N)` for inserting or looking up `N`
entries. The `log` part is because the prefix length is increasing. A rough
(not so accurate) real world benchmark is like:
| N | Insert | Lookup | Checked Lookup [1] | Index Size |
|---|--------|--------|--------------------|------------|
| 10k | 0.70ms | 0.25ms | 0.36ms | 0.25MB |
| 20k | 1.3ms | 0.58ms | 0.8ms | 0.45MB |
| 50k | 4.9ms | 1.9ms | 2.6ms | 1.1MB |
| 100k | 11ms | 4.5ms | 6.8ms | 2.5MB |
| 200k | 26ms | 13ms | 17ms | 4.9MB |
| 500k | 68ms | 46ms | 54ms | 11MB |
| 1M | 170ms | 130ms | 150ms | 24MB |
| 2M | 420ms | 300ms | 350ms | 51MB |
| 5M | 1.2s | 0.9s | 1.1s | 110MB |
| 10M | 2.7s | 2.3s | 2.7s | 220MB |
| 20M | 6.2s | 5.1s | 5.8s | 490MB |
| 50M | 19s | 16s | 18s | 1.2GB |
[1]: After lookup, verify the key id maps to the key. Can be skipped if key
length is fixed and index data could be trusted.
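A toy in-memory version of a base16 radix tree may help to picture insertion, lookup and the "checked" variant. It assumes fixed-length hex keys, uses nested dicts instead of the flat buffer format, and stores the key in each leaf rather than a key id; `RadixTree` is not the API of this patch.

```python
class RadixTree(object):
    """Toy base16 radix tree mapping fixed-length hex keys to integer ids."""

    def __init__(self):
        self._root = {}

    def insert(self, key, keyid):
        node = self._root
        for depth, ch in enumerate(key):
            child = node.get(ch)
            if child is None:
                # Free slot: store a leaf; deeper digits are not needed yet.
                node[ch] = (key, keyid)
                return
            if isinstance(child, tuple):
                oldkey, _oldid = child
                if oldkey == key:
                    node[ch] = (key, keyid)
                    return
                # Split the leaf: push the old entry one level down.
                child = {oldkey[depth + 1]: child}
                node[ch] = child
            node = child

    def lookup(self, key, checked=True):
        node = self._root
        for ch in key:
            child = node.get(ch)
            if child is None:
                return None
            if isinstance(child, tuple):
                storedkey, keyid = child
                # Checked lookup: confirm the leaf really holds this key.
                if checked and storedkey != key:
                    return None
                return keyid
            node = child
        return None
```

An unchecked lookup stops at the first leaf reached, so it is only safe when the caller already knows the key is present; otherwise a different key sharing the walked prefix would be returned.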
Test Plan:
`cargo test --lib`. Also use `kcov` to make sure every line is covered
except for `return false` in quickcheck functions, or things requiring a
buffer size that exceeds `u64`.
```
cargo rustc --lib --profile test -- -Ccodegen-units=1 -Clink-dead-code -Zno-landing-pads
kcov --include-path $PWD/src --verify target/kcov ./target/debug/*-????????????????
```
Differential Revision: https://phab.mercurial-scm.org/D1291
Keys are `[u8]` stored in a plain buffer. This diff adds functions to read
and write two kinds of keys: fixed-size (20 bytes) and variable-length
ones. The latter use VLQ encoding to store the key length.
Applications could implement other functions. For example, Mercurial could
use revision numbers instead of offsets to refer to commit hashes.
The read functions will be used in the main radix tree logic.
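The variable-length layout can be sketched as a VLQ length followed by the key bytes. The sketch below assumes the little-endian base-128 flavor of VLQ (7 bits per byte, high bit as continuation flag); the exact byte order used by the patch is not stated here, and all function names are hypothetical.

```python
def encode_vlq(value):
    """Encode a non-negative int, 7 bits per byte, least significant first."""
    if value < 0:
        raise ValueError("value must be non-negative")
    out = bytearray()
    while True:
        byte = value & 0x7F
        value >>= 7
        if value:
            out.append(byte | 0x80)  # high bit set: more bytes follow
        else:
            out.append(byte)
            return bytes(out)


def decode_vlq(buf, offset):
    """Decode a VLQ integer from bytearray `buf`; return (value, next_offset)."""
    value = 0
    shift = 0
    while True:
        byte = buf[offset]
        offset += 1
        value |= (byte & 0x7F) << shift
        if not byte & 0x80:
            return value, offset
        shift += 7


def write_key(buf, key):
    """Append a length-prefixed key to bytearray `buf`; return its offset."""
    offset = len(buf)
    buf += encode_vlq(len(key))
    buf += key
    return offset


def read_key(buf, offset):
    """Read the key that write_key stored at `offset`."""
    length, start = decode_vlq(buf, offset)
    return bytes(buf[start:start + length])
```

Keys shorter than 128 bytes cost a single length byte, which is why this suits short, mostly similar-sized keys.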
Test Plan:
`cargo test --lib`
Differential Revision: https://phab.mercurial-scm.org/D1432
The radix tree would be using base16 to save space and support hex-string
prefix queries. Therefore a base16 iterator is needed.
`DoubleEndedIterator`, `ExactSizeIterator` and fast paths of `skip` and
`take` are implemented so certain patterns (ex. `skip.take.rev`, used in a
future patch) could work without a temporary `Vec`.
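The same idea, expressed as a Python 3 sketch rather than the Rust iterator from this patch: a lazy view over the hex digits whose `skip`, `take` and reversal only adjust offsets. `Base16View` is an illustrative name, and the high-nibble-first order is an assumption that matches ordinary hex strings.

```python
class Base16View(object):
    """Lazy view over the hex digits (nibbles) of a bytes object."""

    def __init__(self, data, start=0, end=None):
        self._data = data
        self._start = start
        self._end = len(data) * 2 if end is None else end

    def __len__(self):
        return self._end - self._start

    def _nibble(self, i):
        byte = self._data[i // 2]
        # Even positions are the high 4 bits, odd positions the low 4 bits.
        return byte >> 4 if i % 2 == 0 else byte & 0x0F

    def skip(self, n):
        # Only the start offset moves; no digits are materialized.
        return Base16View(self._data, min(self._start + n, self._end),
                          self._end)

    def take(self, n):
        # Only the end offset moves.
        return Base16View(self._data, self._start,
                          min(self._start + n, self._end))

    def __iter__(self):
        return (self._nibble(i) for i in range(self._start, self._end))

    def __reversed__(self):
        return (self._nibble(i)
                for i in range(self._end - 1, self._start - 1, -1))
```

With this, `reversed(view.skip(1).take(3))` walks three digits backwards without building an intermediate list.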
Test Plan:
`cargo test --lib`
Differential Revision: https://phab.mercurial-scm.org/D1290
This is an implementation of the radix tree in Rust that converts `[u8]` to
integers. It is intended to be used as a basic building block for efficient
source-control related key-value lookups. For example, to convert a commit
hash to an internal number (ex. revision number).
The file format is quite similar to the C radix tree implementation in
`mercurial/cext/revlog.c`, with some improvements:
- Not coupled with CPython APIs
- Not coupled with revlog or revision numbers
- Not coupled with malloc / realloc so the abstraction managing the buffer
could be more flexible
- Support variable-length keys
- Explicitly use little-endian so the format is platform-independent
Test Plan:
`cargo test --lib`
Differential Revision: https://phab.mercurial-scm.org/D1289
Previously, hg repack would undelta every entry, then redelta it. If the delta
base node was the same before and after (which should be common), this resulted
in a lot of work and made repacking every entry in a chain an n^2 operation.
Let's use the new getdelta API to reuse the delta when possible.
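The decision can be sketched as below. The callables are hypothetical stand-ins: `get_delta(node)` is assumed to return `(old_base, delta)`, which is not necessarily the signature of the real getdelta API.

```python
def repack_delta(node, new_base, get_delta, get_fulltext, make_delta):
    """Return the delta to store for `node` against `new_base`.

    get_delta(node)         -> (old_base, delta), the stored delta as is
    get_fulltext(node)      -> full text, resolving the whole delta chain
    make_delta(base, text)  -> a new delta from `base` text to `text`
    """
    old_base, delta = get_delta(node)
    if old_base == new_base:
        # Same base before and after: reuse the delta, skipping both the
        # undelta and the redelta step.
        return delta
    # Base changed: fall back to rebuilding the delta from full texts.
    return make_delta(get_fulltext(new_base), get_fulltext(node))
```

In the common case the expensive `get_fulltext` calls never happen, which is what avoids the quadratic behavior for long chains.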
Differential Revision: https://phab.mercurial-scm.org/D1690
In order to improve repack performance, we need to be able to fetch a single
delta from a data store. This diff adds the new api and implements it in all our
existing interfaces.
Differential Revision: https://phab.mercurial-scm.org/D1688
In a future patch we will be reading delta chain entries from another function,
so let's move it to its own function.
Differential Revision: https://phab.mercurial-scm.org/D1687
In a future diff we'll be doing metadata parsing in another function as well, so
let's move it to a separate function.
Differential Revision: https://phab.mercurial-scm.org/D1686
If a backup is in progress (the backup lock is taken), we shouldn't show the
backup summary message as it will soon be out-of-date.
Differential Revision: https://phab.mercurial-scm.org/D1705
Add a templatekeyword `backingup` which evaluates to True if infinitepush is
currently performing a background backup. This allows log output to be
customized for not-backed-up revisions if a backup is currently in progress.
Differential Revision: https://phab.mercurial-scm.org/D1704
The dirstatemap identity attribute should be a property, as the dirstate
accesses it this way. Currently it is using the bound method as if it were the
identity, which it is not.
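A standalone illustration of the pitfall, using made-up classes rather than the real dirstatemap:

```python
class MethodMap(object):
    def __init__(self, identity):
        self._identity = identity

    def identity(self):
        # Plain method: `obj.identity` yields a bound method object.
        return self._identity


class PropertyMap(object):
    def __init__(self, identity):
        self._identity = identity

    @property
    def identity(self):
        # Property: `obj.identity` yields the stored value itself.
        return self._identity


# Attribute-style access only compares as intended with the property.
assert MethodMap(5).identity != 5
assert PropertyMap(5).identity == 5
```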
Differential Revision: https://phab.mercurial-scm.org/D1680
Infinitepush has a function in backupcommands.py for accessing the srcrepo of
shared repos. This is useful in other extensions, so extract it to a utility
module.
Differential Revision: https://phab.mercurial-scm.org/D1603
Verify can be extremely slow on large repos, as it tries to inspect every
commit, possibly downloading trees for each one, so this extension disables it.
This also disables the verification step of rollback.
Differential Revision: https://phab.mercurial-scm.org/D1614
I found two usages and switched them over:
1) hg ssl
2) hg diff --since-last-arc-diff
The new phabricator_graphql_client.py file came mostly from
fbsource/fbcode/phabricator, but I had to modify it so it could use the lightweight url client instead of the (unavailable) third-party "requests" library.
This requires the config changes in D6495153, and replaces D5339248.
Test Plan:
Ran "hg ssl" on www and fbsource on my sandbox, and verified that it looked the same before and after my changes.
Ran "hg diff --since-last-arc-diff" on a modified workspace and an unmodified workspace (unmodified results differ, so I'll tidy those up and edit this comment away from the test plan once they match).
Differential Revision: https://phab.mercurial-scm.org/D1608
Summary:
Core commit 0980979a0d48 made `changectx` argument required in `memfilectx`.
I have no idea what I am doing, but I looked how the commit author fixed the
other invocations and tried to mimic it.
Test Plan: - rt
Reviewers: quark, #sourcecontrol
Differential Revision: https://phabricator.intern.facebook.com/D6568569
If a backup head is hidden (e.g. by rebase), then all its ancestor commits
appear as not-backed-up, even though they are.
Consider hidden commits as part of determining whether a revset is backed up or
not, and then filter out the hidden changesets afterwards.
Test Plan:
Add a new test case.
Differential Revision: https://phab.mercurial-scm.org/D1613
The goal is to reduce the amount of hand tuning of new/changed tests that is
required on Windows. Since the OS prints the proper paths everywhere else, this
is limited to Windows. These are based on the check-code rules that were
dropped in 217bd5cb0914.
There are some minor tweaks, because those were trying to detect '/' paths
without a '(glob)' at the end, whereas these detect '\' paths. Also, it looks
like the 'no changes made to subrepo' one was broken, because the path to the
subrepo has been getting output but was not in the pattern. End anchors are
dropped because '(glob)' is no longer required, but '(feature !)' annotations
are a possibility.
The 'saved backup bundle' pattern dropped from run-tests.py was simply carrying
over the first capture group. The replace() method runs prior to evaluating
'\1', but it wasn't doing anything because of the 'r' prefix on '\\'.
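The raw-string behavior behind that no-op can be checked directly. The path below is a made-up example, and the exact expression from run-tests.py is not reproduced here:

```python
# r'\\' is two characters (two backslashes); '\\' is one.
assert len(r'\\') == 2
assert len('\\') == 1

# Single backslashes, as Windows prints path separators.
path = r'repo\.hg\strip-backup'

# Searching for two consecutive backslashes finds nothing to replace.
assert path.replace(r'\\', '/') == path

# Searching for one backslash converts the separators.
assert path.replace('\\', '/') == 'repo/.hg/strip-backup'
```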
The 'not recording move' entry is new, because I stumbled upon it searching for
some of these patterns. There are probably others.
Previously we raised a LockHeld exception if a repack was already in progress,
which caused hg repack to exit non-zero. This changes it to exit 0, since the
repack already in progress will accomplish the desired result. This is useful in
automation that wants to run repack to maintain the repository but depends on
exit codes to know if a command succeeded or not.
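The control flow amounts to something like this sketch. `LockHeld` here is a local stand-in class, and `run_repack` with its callable arguments is hypothetical rather than the extension's real entry point:

```python
class LockHeld(Exception):
    """Stand-in for the exception raised when the lock is already taken."""


def run_repack(acquire_lock, repack, log):
    """Run repack under the lock; return the process exit code."""
    try:
        lock = acquire_lock()
    except LockHeld:
        # Someone else is already repacking; treat that as success.
        log("skipping repack: another repack is already running")
        return 0
    try:
        repack()
    finally:
        lock.release()
    return 0
```

Automation that checks exit codes then sees the same result whether it performed the repack itself or found one already under way.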
Differential Revision: https://phab.mercurial-scm.org/D1668
The branch information was properly preserved in the changeset, but the
"active" branch of the working copy could be lost (the branch of the base
being used).
Histedit used to behave properly in this regard but the case was not tested
and regressed 4 years ago in c038baa4b6f0.
Summary: For now pushrebase only supports the legacy exchange mode.
Test Plan: - fewer tests fail after this change
Reviewers: #fbhgext
Differential Revision: https://phab.mercurial-scm.org/D1660
Summary:
A 'bookmarks' capability was recently introduced in the core. Let's make
tests aware of it.
Test Plan: - rt
Reviewers: #fbhgext
Differential Revision: https://phab.mercurial-scm.org/D1659
Summary:
[1] adds a new check which moves the bookmarks race check earlier, and this conflicts
with `pushrebase`. In reality `pushrebase`-enabled repos don't need this check
anyway, so we can just remove this generator.
[1] https://www.mercurial-scm.org/repo/hg/rev/f135b42f3faf
Test Plan: - run pushrebase tests, see them change their status from failing to passing
Reviewers: #fbhgext
Differential Revision: https://phab.mercurial-scm.org/D1655
A dirstate update that appends data to the treedirstate tree file, followed by
a hard reboot before the filesystem cache is flushed, can result in a dirstate
tree root that is referred to by the dirstate file, but does not contain the
correct data. Ensure the appended data is synced to disk before returning from
`Store.flush()`.
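The general pattern, shown in Python rather than in the treedirstate code itself; `append_durably` is a hypothetical helper:

```python
import os


def append_durably(path, data):
    """Append data to path and return only once it has been synced to disk.

    Returns the offset at which the data was written.
    """
    with open(path, "ab") as f:
        f.seek(0, os.SEEK_END)
        offset = f.tell()
        f.write(data)
        f.flush()              # push Python's buffer to the OS
        os.fsync(f.fileno())   # ask the OS to write its cache to disk
    return offset
```

Only after this returns is it safe to record `offset` as the new root in another file, because a reboot can no longer leave that reference pointing at data that never reached the disk.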
Differential Revision: https://phab.mercurial-scm.org/D1654