Summary:
On WWW, an `hg update` ends up writing ~15GB worth of data onto the
IndexedLogDataStore, which eats up their precious RAM. As a quick workaround,
let's reduce the max number of logs from 10 down to 4, and increase the size of
each log to keep the total expected size around ~10GB.
Ideally, both of these values should be configurable from the hg config,
but since the IndexedLog is written within hg_memcache_client, we would have to
plumb the values through to it. In the medium term, hg_memcache_client will be
folded into hg itself, and this change will be much easier by then.
Do the same for the IndexedLogHistoryStore.
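As a rough check of the arithmetic, here is a minimal sketch. It assumes the ~10GB target is split evenly across the logs; the per-log sizes it prints are derived from that assumption and are not the values configured in this diff.

```python
# Back-of-the-envelope sizing. The even split across logs is an assumption.
TARGET_TOTAL_GB = 10  # total expected size mentioned in the summary


def per_log_size_gb(max_logs, total_gb=TARGET_TOTAL_GB):
    """Size each log needs so that `max_logs` logs add up to `total_gb`."""
    return total_gb / max_logs


old_per_log = per_log_size_gb(10)  # with 10 logs
new_per_log = per_log_size_gb(4)   # with 4 logs
print(old_per_log, new_per_log)
```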
Reviewed By: quark-zju
Differential Revision: D17354856
fbshipit-source-id: 0a75953f40e1982eaf43557f7866f089873300db
Summary:
`ascii` was used as the default / fallback, which is not a user-friendly choice.
Nowadays utf-8 dominates:
- Rust stdlib is utf-8.
- Ruby since 1.9 is utf-8 by default.
- Python 3 is unicode by default.
- Windows 10 adds utf-8 code page.
Given the fact that:
- Our CI sets HGENCODING to utf-8
- Nuclide passes `--encoding=utf-8` to every command.
- Some people have misconfigured `LC_*` and complained about hg crashes.
- utf-8 is a superset of ascii; nobody complains that they want the `ascii`
encoding and that the `utf-8` encoding messed their setup up.
Let's just use `utf-8` as the default encoding. More aggressively, if someone
sets `ascii` as the encoding, it's almost always a mistake. Auto-correct that
to `utf-8` too.
This should also make future integration with Rust easier (where utf-8 is
enforced and there is no option to change the encoding). In the future we
might just drop the flexibility of choosing a customized encoding, so this diff
autofixes `ascii` to `utf-8`, instead of allowing `ascii` to be set. We cannot
enforce `utf-8` yet, because of Windows.
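The fallback and autofix rule described above can be sketched as follows. This is a hedged illustration, not the code in this diff: `resolve_encoding` is a hypothetical helper, and using `codecs.lookup` to recognize aliases of `ascii` is an assumption about how one might normalize names.

```python
import codecs

DEFAULT_ENCODING = "utf-8"


def resolve_encoding(requested=None):
    """Pick the encoding to use (hypothetical helper).

    `requested` stands for whatever HGENCODING or --encoding provided.
    """
    if not requested:
        # Nothing was requested: fall back to utf-8 instead of ascii.
        return DEFAULT_ENCODING
    try:
        canonical = codecs.lookup(requested).name
    except LookupError:
        # Unknown names are left for the caller to reject.
        return requested
    if canonical == "ascii":
        # Requesting ascii is almost always a mistake: autofix it to utf-8.
        return DEFAULT_ENCODING
    return requested
```

Other encodings are passed through unchanged, which matches the note that `utf-8` cannot be enforced yet because of Windows.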
Here is our encoding strategy vs the upstream's:
| item | upstream (current) | upstream (ideal) | ours (current) | ours (ideal) |
| CLI argv | bytes | bytes | utf-8 [1] | utf-8 |
| path | bytes | auto [3] | migrating [2] | utf-8 |
| commit message | utf-8 | utf-8 | utf-8 | utf-8 |
| bookmark name | utf-8 | utf-8 | utf-8 | utf-8 |
| file content | bytes | bytes | bytes | bytes |
[1]: A Rust wrapper accidentally enforced utf-8 for command-line arguments.
But it simplified a lot of things and is kind of ok: everything that can be
passed as a CLI argument is utf-8: -M commit message, -b bookmark, paths,
etc. There is no "file content" passed via CLI arguments.
[2]: Path is controversial, because it's possible for systems to have non-utf8
paths. The upstream behavior is incorrect if a repo gets shared among different
encoding systems (ex. both Linux and Windows). We have to know the encoding of
paths to be able to convert them into a form suitable for the local system. One
way is to enforce UTF-8 for paths. The other is to keep encoding information stored with
individual paths (like Ruby strings). The UTF-8 approach is much simpler with
the tradeoff that non-utf-8 paths become unsupported, which seems to be a
reasonable trade-off.
[3]: See https://www.mercurial-scm.org/wiki/WindowsUTF8Plan.
Reviewed By: singhsrb
Differential Revision: D17098991
fbshipit-source-id: c0ff1e586a887233bd43cdb854fb3538aa9b70c2
Summary:
It can fail with:
test-fb-hgext-treemanifest-treeonly-copyamend.t ...
--- test-fb-hgext-treemanifest-treeonly-copyamend.t
+++ test-fb-hgext-treemanifest-treeonly-copyamend.t.err
@@ -124,6 +124,7 @@
adding a/b/c/d/e/f/g/h/i/j/file3
fetching tree '' efa8fa4352b919302f90e85924e691a632d6bea0, found via 9f95b8f1011f
11 trees fetched over *s (glob)
+ 5 files fetched over 1 fetches - (5 misses, 0.00% hit ratio) over 0.00s
or:
--- test-fb-hgext-treemanifest-treeonly-copyamend.t
+++ test-fb-hgext-treemanifest-treeonly-copyamend.t.err
@@ -124,6 +124,7 @@
adding a/b/c/d/e/f/g/h/i/j/file3
fetching tree '' efa8fa4352b919302f90e85924e691a632d6bea0, found via 9f95b8f1011f
11 trees fetched over *s (glob)
+ 12 files fetched over 1 fetches - (12 misses, 0.00% hit ratio) over 0.00s
It fails more easily on Ubuntu. But it's also possible on CentOS.
Stabilize the test by allowing the optional output.
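In `.t` tests, an expected output line can be marked optional with a trailing ` (?)`. A much-simplified sketch of that matching rule follows; the `matches` helper is hypothetical, ignores `(glob)` and regex handling, and is not the `run-tests.py` implementation.

```python
OPTIONAL = " (?)"


def matches(expected, actual):
    """Return True if `actual` satisfies `expected`.

    Expected lines ending in ' (?)' may be absent from the actual output.
    """
    i = 0
    for line in expected:
        optional = line.endswith(OPTIONAL)
        text = line[: -len(OPTIONAL)] if optional else line
        if i < len(actual) and actual[i] == text:
            i += 1  # the line is present: consume it
        elif not optional:
            return False  # a required line is missing or different
    # Any leftover actual line is unexpected output.
    return i == len(actual)
```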
Reviewed By: singhsrb
Differential Revision: D17346110
fbshipit-source-id: ca6d1de5163e1b2bcb7bea5c619220d6f5e2c864
Summary:
Split the crate to improve build time.
Before this change, a naive change on any of the simple modules can still take
20+ seconds to compile, even with incremental compilation enabled.
This diff splits the crate into multiple smaller crates. A simple change to a
simple crate can take < 10 seconds to re-compile.
Unlike the pre-D13923866 state, there is still only a single Python
extension.
Reviewed By: xavierd
Differential Revision: D17345706
fbshipit-source-id: c7e2e6f0e1b86071c863cfb8989070a581825956
Summary: Diffusion does not have local commit information for imported diffs (e.g. imported from GitHub), and it returns a list for such commits. This breaks `hg ssl`. We can simply skip a commit if Diffusion gives us a list for it.
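A minimal sketch of the skip, assuming that a normal entry is a dict and that an imported diff shows up as a list. The function name and the shape of the entries are hypothetical and only illustrate the check.

```python
def local_commit_infos(diffusion_results):
    """Yield only the entries that carry local commit information.

    Assumption: regular entries are dicts; entries for imported diffs
    come back as lists and are skipped.
    """
    for info in diffusion_results:
        if isinstance(info, list):
            continue  # no local commit information for this diff
        yield info
```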
Reviewed By: quark-zju
Differential Revision: D17334156
fbshipit-source-id: 4c4278de94e24c646a3e789377c12f42adb4307e
Summary: Add a prefetch method to the `remotetreestore` in the `treemanifest` extension, along with the necessary plumbing to call it from Rust code.
Reviewed By: quark-zju
Differential Revision: D17335773
fbshipit-source-id: 2b71638f56ea7e1398348f437d737a599d8be476
Summary: This diff provides an implementation of the diff operation for trees which processes directories in BFS order (i.e., layer by layer). This allows the iterator to perform a bulk prefetch of the changed nodes in each layer at the start of each layer of the traversal. This should hopefully provide a more efficient fetch pattern than the existing implementation, which requires a full prefetch of both trees upfront for reasonable performance.
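To illustrate the traversal order, here is a small Python sketch of a layer-by-layer diff with one bulk prefetch per layer. It is not the Rust implementation from this diff: trees are modeled as nested dicts, and `bfs_diff` and the `prefetch` callback are hypothetical names.

```python
def bfs_diff(left, right, prefetch=lambda paths: None):
    """Diff two trees layer by layer and return the changed file paths.

    A dict value is a directory; any other value is a file id.
    `prefetch` is called once per layer with that layer's directory paths.
    """
    changed = []
    layer = [("", left, right)]
    while layer:
        # One bulk prefetch for the whole layer before visiting it.
        prefetch([path for path, _, _ in layer])
        next_layer = []
        for path, ldir, rdir in layer:
            for name in sorted(set(ldir) | set(rdir)):
                lv, rv = ldir.get(name), rdir.get(name)
                if lv == rv:
                    continue  # identical file or subtree: nothing to visit
                child = path + "/" + name if path else name
                l_is_dir = isinstance(lv, dict)
                r_is_dir = isinstance(rv, dict)
                if l_is_dir or r_is_dir:
                    # Changed directory: visit it in the next layer.
                    next_layer.append(
                        (child, lv if l_is_dir else {}, rv if r_is_dir else {})
                    )
                if (lv is not None and not l_is_dir) or (
                    rv is not None and not r_is_dir
                ):
                    changed.append(child)  # added, removed, or modified file
        layer = next_layer
    return sorted(changed)
```

Only directories that differ are queued for the next layer, so the prefetch for each layer covers just the changed nodes.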
Reviewed By: xavierd
Differential Revision: D17276971
fbshipit-source-id: 284f1d458f43cb76befe27e85f53a641f29d7550
Summary:
Add a `prefetch` method to the `TreeStore` trait. This will be used by code using the store to signal that certain keys will be accessed soon. The default implementation is a no-op, but in the case of stores where prefetching makes sense (such as stores backed by remote servers), the default implementation can be overridden to include the appropriate prefetching logic.
For now, this change is a no-op, but later in this stack it will be used to signal to the underlying Python data store to perform the appropriate tree fetches via the Eden API. This will be used to support a more efficient pattern of bulk tree fetches during the diff operation.
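The shape of the API can be sketched in Python as a base class with a no-op default that remote-backed stores override. The class names and the dict standing in for a remote server are hypothetical; the real change is a Rust trait method.

```python
class TreeStore:
    """Base store; `prefetch` is only a hint and does nothing by default."""

    def get(self, key):
        raise NotImplementedError

    def prefetch(self, keys):
        return None  # default implementation: no-op


class LocalStore(TreeStore):
    """Everything is already local, so the default no-op is fine."""

    def __init__(self, data):
        self._data = dict(data)

    def get(self, key):
        return self._data[key]


class RemoteBackedStore(TreeStore):
    """Overrides `prefetch` to fetch all missing keys in one round trip."""

    def __init__(self, remote):
        self._remote = remote  # stand-in for a remote server
        self._cache = {}
        self.fetches = 0  # number of simulated round trips

    def prefetch(self, keys):
        missing = [k for k in keys if k not in self._cache]
        if missing:
            self.fetches += 1
            self._cache.update({k: self._remote[k] for k in missing})

    def get(self, key):
        if key not in self._cache:
            self.prefetch([key])
        return self._cache[key]
```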
Reviewed By: sfilipco
Differential Revision: D17276970
fbshipit-source-id: 22a5d847e5be5dbf1b0a74b47587a98d840b8cdc
Summary: `scm-prompt` is a bit special: modifying it didn't trigger those tests.
Reviewed By: wez
Differential Revision: D17346163
fbshipit-source-id: ffafc017373031905cbf1fc2f80a3a8e8a606094
Summary:
This adds the remote server to logging in the lfs extensions, which will let us
know which LFS server we're talking to. This is only collected on batch
requests.
Reviewed By: ikostia
Differential Revision: D17341928
fbshipit-source-id: a458ba3b0a4dce1b3f4ab3ea0d509f9715044f0e
Summary:
I would like to change the length of the displayed hash in scm-prompt
to 8. Why such an impactful diff? Because `hg sl` shows 8 characters, and I
always get confused when the hash in my prompt doesn't match `hg sl`.
Reviewed By: wez
Differential Revision: D17312417
fbshipit-source-id: 3d7e4947c8202e93697c232dbd5abd04e7baee96
Summary:
This updates the LFS extension to send a client correlator when connecting to an
LFS server. This might be helpful for troubleshooting.
Reviewed By: quark-zju
Differential Revision: D17319281
fbshipit-source-id: 3549c0710ad010f9566a961abeedfbb5366bf49c
Summary:
`extern crate` is usually no longer needed in 2018 edition of Rust. This diff removes `extern crate` lines from fbcode where possible, replacing #[macro_use] with individual import of macros.
Before:
```
#[macro_use]
extern crate futures_ext;
extern crate serde_json;
```
After:
```
use futures_ext::try_boxfuture;
```
Reviewed By: Imxset21
Differential Revision: D17313537
fbshipit-source-id: 70462a2c161375017b77fa44aba166884ad2fdc3
Summary:
This wrapper was needed to allow internal mutability of stores. Now that the
internal mutability is done inside the store, this wrapper is redundant and
the code can be simplified significantly.
Reviewed By: quark-zju
Differential Revision: D17278152
fbshipit-source-id: c488208d4875e26e9551deb86a7c22abbda085ef
Summary:
This allows any MutableHistoryStore to be shared and written from multiple
threads.
Reviewed By: quark-zju
Differential Revision: D17278149
fbshipit-source-id: 69f81bb0b182cb27022f13b2e6330b7fc805cbaa
Summary:
This allows any MutableDeltaStore to be shared and written from multiple
threads.
Reviewed By: quark-zju
Differential Revision: D17278153
fbshipit-source-id: 17e1474ca1c6d5285cac7dbf519bfd2d5da6e08d
Summary: This will enable switching MutableHistoryStore to use `&self` instead of `&mut self`.
Reviewed By: quark-zju
Differential Revision: D17278151
fbshipit-source-id: 5a6edde5efb0ada14b994d11f33f0aa48780446e
Summary: This will enable switching MutableHistoryStore to use `&self` instead of `&mut self`.
Reviewed By: quark-zju
Differential Revision: D17278154
fbshipit-source-id: cc66d2874bd86235cd39ce3f5357d155e20ef447
Summary: This will enable switching MutableDeltaStore to use `&self` instead of `&mut self`.
Reviewed By: quark-zju
Differential Revision: D17278155
fbshipit-source-id: e7c5d464fb6ba2b31b07127832104b8bd4062fa0
Summary: This will enable switching MutableDeltaStore to use `&self` instead of `&mut self`.
Reviewed By: quark-zju
Differential Revision: D17278148
fbshipit-source-id: c90f62461f784f4a8efb4e5b0ba0c3e21a6f9f77
Summary:
The RefCell makes the IndexedLog not Sync. Let's replace the RefCell with
atomic operations to make it Sync.
The `index lookup (disk, verified)` benchmark does not change much with
this change.
Reviewed By: quark-zju
Differential Revision: D17298580
fbshipit-source-id: 41cb8fea7e06676f3e2cbca3475ac863b0d8454d
Summary: Compiling failed due to moved crates and removed feature.
Reviewed By: quark-zju
Differential Revision: D17298579
fbshipit-source-id: 02e35e71896175ad415efa75c9558074b66cbfa0
Summary:
It's hard (impossible?) to write the iter method for a Store that uses interior
mutability with RefCell and/or Mutex. Due to how MutableDeltaStore and
MutableHistoryStore are designed, it's impossible to use a MutableDeltaStore in
a UnionStore while still being able to add to it, because the add method
requires `&mut self`. To change this, the method needs to accept `&self`,
which in turn requires the use of a Mutex/RefCell.
Since the IterableStore is never used to get a subset of the keys, let's change
and rename it to always return a Vec of all the keys.
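A loose Python analogue of the trade-off: a lazy iterator over a lock-protected store would need the lock to stay held while the caller iterates, whereas returning a full snapshot of the keys does not. `SharedStore` and `all_keys` are hypothetical names, not the renamed Rust API.

```python
import threading


class SharedStore:
    """Store that can be written from several threads through a shared reference."""

    def __init__(self):
        self._lock = threading.Lock()
        self._data = {}

    def add(self, key, value):
        with self._lock:
            self._data[key] = value

    def all_keys(self):
        """Return a snapshot list of every key.

        The lock is released before the caller sees the list, so no
        borrowed iterator outlives the critical section.
        """
        with self._lock:
            return list(self._data)
```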
Reviewed By: quark-zju
Differential Revision: D17278150
fbshipit-source-id: 64bad54e69dd89a91acad0d0877d888685858994
Summary:
Change the message so it looks more like a user error, not a crash in the
software. This is motivated by some people reporting `hg rage` crash with
`abort: no repository found` messages.
Reviewed By: sfilipco
Differential Revision: D17292658
fbshipit-source-id: 9988f54f2ff8fd48949bcd35c13309c117f3afc6
Summary:
This is a short term workaround for an issue affecting sandcastle
where the scratch dir is being created as root and subsequently blocking
the job from creating new scratch dirs inside it.
We'll fix this up in some sandcastle specific config in a follow up diff,
but this should unblock the hgbuild for now.
Reviewed By: quark-zju
Differential Revision: D17296990
fbshipit-source-id: 823a86312a6684b385395bd6427c9b8ce2639d5e
Summary:
The latest error responses that we see from phabricator have only one field called
`error` that holds a message. Add handling for this case.
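A hedged sketch of handling both shapes. The `error`-only shape comes from the description above; treating `error_code`/`error_info` as the older shape is an assumption, and `ConduitError` and `check_response` are hypothetical names.

```python
class ConduitError(Exception):
    """Raised when a decoded response reports an error (hypothetical type)."""


def check_response(response):
    """Return the result of a decoded JSON response, or raise ConduitError."""
    # Assumed older shape: separate `error_code` and `error_info` fields.
    if response.get("error_code"):
        raise ConduitError(
            "%s: %s" % (response["error_code"], response.get("error_info"))
        )
    # Newer shape: a single `error` field carrying the message.
    if response.get("error"):
        raise ConduitError(str(response["error"]))
    return response.get("result")
```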
Reviewed By: quark-zju
Differential Revision: D17290349
fbshipit-source-id: ebfd9d7b07f30cbfe3171259efcfc6a00a1abdce
Summary: I think these are left over from pre-2018 code where they may have been necessary. In 2018 edition, import paths in `use` always begin with a crate name or `crate`/`super`/`self`, so `use $ident;` always refers to a crate. Since extern crates are always in scope in every module, `use $ident` does nothing.
Reviewed By: Imxset21
Differential Revision: D17290473
fbshipit-source-id: 23d86e5d0dcd5c2d4e53c7a36b4267101dd4b45c
Summary:
With `remotefilelog.fetchpacks` enabled everywhere, loosefiles are no longer
being generated. Let's add an option to stop reading the ones that are
left on disk.
Reviewed By: quark-zju
Differential Revision: D17263163
fbshipit-source-id: fa801232be4f6c2df959f57e818a418400246b5e
Summary:
The `remotefilelog.packlocaldata` has been on for a while now. Instead of
keeping the code around, let's simply remove it and fix all the tests that
assume a commit will generate loosefiles.
Reviewed By: quark-zju
Differential Revision: D17244837
fbshipit-source-id: e65ed16c9818be61be9ccbe19ce3fa18c890d70b
Summary:
As loosefiles are deprecated and being removed, testing for a filenode should
also consider looking into packfiles and indexedlogs. One easy way to achieve
this is to use the getmissing API that the various stores have.
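A minimal sketch of the idea, assuming `getmissing` takes a list of `(name, node)` keys and returns the ones the store does not have. `FakeStore` and `contains` are hypothetical and only stand in for the real stores.

```python
class FakeStore:
    """Stand-in for a loosefile, packfile, or indexedlog store."""

    def __init__(self, keys):
        self._keys = set(keys)

    def getmissing(self, keys):
        # Assumed contract: return the subset of `keys` that is absent.
        return [k for k in keys if k not in self._keys]


def contains(stores, name, node):
    """True if any store has the filenode, whatever its on-disk format."""
    key = (name, node)
    return any(not store.getmissing([key]) for store in stores)
```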
Reviewed By: quark-zju
Differential Revision: D17244835
fbshipit-source-id: 5ef0d5a048bb61fb0945167c61d77829ee4570e1
Summary:
During a shallow clone, all the local loosefiles from the server are sent over
the wire. Usually, this is never exercised, except when the clone is pulling
from another shallow client, which may have local data. Since loosefiles are no
longer being generated for draft commits, we also need to make sure the local
packfiles are sent too.
Reviewed By: quark-zju
Differential Revision: D17244836
fbshipit-source-id: 52e6d3cba8c42e69bf782e220b9561be7a6268ab
Summary:
The current error message is a bit noisy. Let's just get to the point: the
filename and line number that are interesting, without tracebacks. This only
affects functions using the `autofix.eq` API; other kinds of exceptions will
have tracebacks as usual.
Before, run-tests.py (19 lines):
--- test-empty-t.py.out
+++ test-empty-t.py.err
@@ -0,0 +1,14 @@
+Traceback (most recent call last):
+ File "hg/tests/test-empty-t.py", line 71, in <module>
+ """
+ File "hg/tests/testutil/dott/shobj.py", line 89, in __eq__
+ autofix.eq(out, rhs, nested=1, eqfunc=eqglob)
+ File "hg/tests/testutil/autofix.py", line 93, in eq
+ raise AssertionError("actual != expected\n%s" % diff)
+AssertionError: actual != expected
+--- expected
++++ actual
+@@ -1 +1 @@
+-someheads
++allheads
+
ERROR: test-empty-t.py output changed
Before, run directly via python (13 lines):
Traceback (most recent call last):
File "test-empty-t.py", line 71, in <module>
"""
File "hg/tests/testutil/dott/shobj.py", line 89, in __eq__
autofix.eq(out, rhs, nested=1, eqfunc=eqglob)
File "hg/tests/testutil/autofix.py", line 93, in eq
raise AssertionError("actual != expected\n%s" % diff)
AssertionError: actual != expected
--- expected
+++ actual
@@ -1 +1 @@
-someheads
+allheads
After, run-tests.py (8 lines):
--- test-empty-t.py:71 (expected)
+++ test-empty-t.py:71 (actual)
@@ -1 +1 @@
-someheads
+allheads
ERROR: test-empty-t.py output changed
After, run directly (5 lines):
% python test-empty-t.py
--- test-empty-t.py:71 (expected)
+++ test-empty-t.py:71 (actual)
@@ -1 +1 @@
-someheads
+allheads
Reviewed By: xavierd
Differential Revision: D17277286
fbshipit-source-id: a48d4d1e817f67e221a901977e0c0f8bdc1a62ab
Summary:
Previously `python test-foo-t.py --fix` was the only way to autofix the test.
That's a bit annoying because `run-tests.py` has more features (ex. run many
tests together).
This diff changes `run-tests.py` to pass `--fix` to Python `-t.py` tests so
the autofix works in a familiar way. `--fix` was added as an alias to
`--update-output` to make it consistent with the Python UX.
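A small sketch of the alias and the forwarding, using `argparse`. The option parser used by `run-tests.py`, the `build_parser` and `python_test_command` helpers, and the position of `--fix` after the test path are assumptions made for illustration.

```python
import argparse


def build_parser():
    """Parser where --fix is an alias of --update-output (sketch only)."""
    parser = argparse.ArgumentParser(prog="run-tests.py")
    parser.add_argument(
        "--update-output",
        "--fix",
        dest="update_output",
        action="store_true",
        help="update test outputs in place",
    )
    parser.add_argument("tests", nargs="*")
    return parser


def python_test_command(test, update_output):
    """Build the command line for a `-t.py` test, forwarding --fix if set."""
    cmd = ["python", test]
    if update_output:
        cmd.append("--fix")
    return cmd
```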
Reviewed By: xavierd
Differential Revision: D17277287
fbshipit-source-id: e815a79895161862d844de165710cc36d6709727
Summary:
Compared to `.t` tests, `dott` Python tests cannot autofix commands without
outputs. This diff makes them able to do so. It's less strict than the AST
parsing (for example, it does not handle `#` comments precisely). But
practically it might be good enough. We can update it to use real AST parsing
if it becomes an issue.
This should make `dott` Python tests easier to use.
Reviewed By: xavierd
Differential Revision: D17277285
fbshipit-source-id: 11ef6ec4327a6547d49b544c63bc000a3c351947
Summary:
The current `dott` library enforces `sh % "foo"` to output nothing. That's
done by checking in `__del__`. `__del__` is special and cannot raise exceptions,
so the current code puts exceptions in a "delayed" list and raises them later.
However, the "later" raise uses a new traceback that is useless.
This diff changes it to use `sys.exc_info` to save the traceback information
so we can re-raise with a more accurate traceback.
For example, the exception before looks like:
% python test-empty-t.py
Traceback (most recent call last):
File "test-empty-t.py", line 71, in <module>
sh % "cd .."
File "fbcode/scm/hg/tests/testutil/dott/shobj.py", line 151, in __mod__
return LazyCommand(command)
File "fbcode/scm/hg/tests/testutil/dott/shobj.py", line 28, in __init__
raise _delayedexception[0]
UnboundLocalError: local variable 'code' referenced before assignment
The `shobj.py` in traceback is pointless.
With the change, it now looks like:
% python test-empty-t.py
Traceback (most recent call last):
File "test-empty-t.py", line 71, in <module>
sh % "cd .."
File "fbcode/scm/hg/tests/testutil/dott/shobj.py", line 153, in __mod__
return LazyCommand(command)
File "fbcode/scm/hg/tests/testutil/dott/shobj.py", line 97, in __del__
autofix.eq(out, "", nested=1, fixfunc=_fixoutput)
File "fbcode/scm/hg/tests/testutil/autofix.py", line 74, in eq
fix = fixfunc(code, parse, lineno, parse(path))
UnboundLocalError: local variable 'code' referenced before assignment
Reviewed By: xavierd
Differential Revision: D17277288
fbshipit-source-id: 91f22b75b2e2efd632f5844b1d2554e7406be926
Summary:
Now that no repack code uses kind/id/cleanup, let's remove the code and all the
associated types.
Reviewed By: quark-zju
Differential Revision: D17207644
fbshipit-source-id: 7e8891b288e1b6193c4fc3a52f18115fe199a8e2
Summary:
Now that the Python code no longer uses the repack logic from Rust, let's remove
the RepackablePyExt trait and associated implementation.
Reviewed By: quark-zju
Differential Revision: D17207642
fbshipit-source-id: 22d233fb20e5846418cb27d55fa5862ade403728
Summary:
The markledger functionality is only useful when using the Python based repack
logic, and since that code now only deals with loosefiles, we can make
markledger a no-op.
Reviewed By: quark-zju
Differential Revision: D17207641
fbshipit-source-id: edbe06fac3d170e9e09e2aa7824fb4cd651bf386
Summary:
We were still using the Python based repack, let's switch to the Rust one. As
far as I can tell, this code is unused, so the test change should be safe to
do.
Reviewed By: quark-zju
Differential Revision: D17207643
fbshipit-source-id: 89d0ba85327077dfc4e26c55ade3284beeb44b50
Summary:
Set `-j 100` to work around a deadlock issue in `cc`. This should unblock our
contbuild.
Reviewed By: singhsrb
Differential Revision: D17286866
fbshipit-source-id: 547888c2e6b1f4c0e5552b4a6502839766fa8141
Summary:
Update the chg code to correctly honor the `pager.stderr` setting, and avoid
piping stderr to the pager when it is disabled.
Since only the server-side code knows the hg config values, the server-side
chgserver.py code passes this config setting back to the client when sending
the pager request.
Reviewed By: quark-zju
Differential Revision: D17109106
fbshipit-source-id: 6b69b1a7de9f61db51af7b0ba00d65fa5053a795
Summary:
This refactors the code that sends system and pager requests from the
chg server to the client.
Previously these were both sent using the `S` channel code. The `S` channel
code was slightly unusual and had special handling: in general the code
assumed that upper-case channel codes did not include any request body data,
but this wasn't true for the `S` channel data. I changed this to lower-case
in order to eliminate this special case handling, and I also split up the
system and pager data into different channel codes, since they have fairly
different behavior. System requests are now sent with an `s` channel code,
and pager requests are sent with a `p` channel code.
I also changed the code to require that the server always adds a terminating
`\0` byte after each environment variable value. Previously the client code
was responsible for adding a nul terminator on the last string, which could
potentially require the client to copy the data into a larger buffer in order
to do so.
I also made a minor change to the client-side `readchannel()` code so that it
can read the channel type and body data length with a single system call
instead of making 2 separate `recv()` calls.
The main benefit of these changes is that they will let the server pass some
additional configuration information with pager requests. The change to make
use of this new field will come in a subsequent diff.
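The framing described above can be sketched in Python. The layout assumed here is a 1-byte channel code followed by a 4-byte big-endian body length, which lets the whole header be read at once. The helper names and the example body strings are hypothetical, and the real client is written in C.

```python
import struct

# Assumed header layout: 1-byte channel code + 4-byte big-endian body length.
HEADER = struct.Struct(">cI")


def encode_request(channel, strings):
    """Server side: every string gets its own terminating NUL byte."""
    body = b"".join(s + b"\0" for s in strings)
    return HEADER.pack(channel, len(body)) + body


def parse_header(data):
    """Client side: decode channel and length from one 5-byte read."""
    channel, length = HEADER.unpack(data[: HEADER.size])
    return channel, length


def split_nul_terminated(body):
    """Split a body where each string, including the last, ends with NUL."""
    if not body:
        return []
    if not body.endswith(b"\0"):
        raise ValueError("missing terminating NUL")
    return body.split(b"\0")[:-1]
```

Because the server terminates the last string too, the client can use the body as received without copying it into a larger buffer to append a NUL.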
Reviewed By: quark-zju
Differential Revision: D17216291
fbshipit-source-id: c3044cf3d5f5e103f0b62d083e4ef3764160f20e
Summary:
For Eden we currently have a gap in our post-mount behavior;
we don't perform the update hooks for a freshly cloned and mounted repo.
The thought is that we'll explicitly trigger them by invoking `hg
debugedenrunpostupdatehook` at the end of the mount sequence.
Reviewed By: quark-zju
Differential Revision: D17237197
fbshipit-source-id: 9c2212c61735068c287eb98761503ce31bfee8a6
Summary:
When we upgraded from structopt 0.2 to 0.3 in D17138630, we had to put a no_version on all of our structopt data structures or else structopt would fail to compile with the error:
```
`CARGO_PKG_VERSION` environment variable is not defined, use `version = "version"` to
set it manually or `no_version` to not set it at all
```
Fbcode isn't using Cargo so there wasn't a CARGO_PKG_VERSION associated with each cli.
I raised with the maintainers in [TeXitoi/structopt#243](https://github.com/TeXitoi/structopt/issues/243) that requiring no_version attributes everywhere is annoying, and it's been fixed in structopt 0.3.1. This diff updates to 0.3.1 and removes 42 no_version attributes from fbcode.
Reviewed By: bolinfest
Differential Revision: D17231406
fbshipit-source-id: 2bab2864afbf23b34ea5d73884462455d863c139
Summary:
Add benchmark for the newly added mincode serialization. An example run shows:
serialize by cbor            32.573 ms    9.701 MB
deserialize by cbor          79.471 ms
serialize by cbor-packed     28.574 ms    7.801 MB
deserialize by cbor-packed   73.337 ms
serialize by bincode         25.175 ms    6.789 MB
deserialize by bincode       23.193 ms
serialize by mincode         19.687 ms    5.389 MB
deserialize by mincode       24.852 ms
serialize by handwritten      2.939 ms    5.389 MB
deserialize by handwritten    7.963 ms
serialize by abomonation      1.752 ms   10.389 MB
deserialize by abomonation    6.060 ms
Interesting facts:
- mincode serialization is actually faster than bincode. (it would appear
slower if vec was not preallocated).
- mincode is much slower than handwritten. This is partially caused by serde
not having a native "fixed array" type so it cannot do `write_all(&[u8; 20])`
but has to `write_u8(x)` 20 times (which translates to
`Vec::extend_from_slice` 20 times).
Regardless, the main reason I think mincode is compelling is its compactness
and relatively good performance. Although handwritten is the fastest, the
mincode performance is fine in the commit storage usecase, since mincode is
probably not the bottleneck.
Reviewed By: kulshrax
Differential Revision: D17087352
fbshipit-source-id: 820ff8538d3ab9ebef2eda0a40cad126e26db622