Summary: This was horribly broken, and we have no tests.
Reviewed By: singhsrb
Differential Revision: D23720984
fbshipit-source-id: 4ad47c767b0d18f700c855a7bb43f38f5c5ef317
Summary:
When I added the surrogateescape patch for the email parser decoder
used during patches, I incorrectly added a corresponding encoder on the other
end when we get the data out of the parser. It turns out the parser is
smart/dumb. When using get_payload() it attempts a few different decodings of
the data and ends up replacing all the non-ascii characters with replacement
bits (question marks). Instead we should use get_payload(decode=True), which
bizarrely actually encodes the data into bytes, correctly detecting the presence
of surrogates and using the correct ascii+surrogateescape encoding.
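The difference can be reproduced with the standard library alone. A minimal sketch, assuming a single-part message with no Content-Transfer-Encoding header; the sample bytes are made up for illustration and are not taken from the patch code:

```python
import email

# Hypothetical patch body containing two non-ASCII bytes.
raw = b"Subject: demo\n\n\xff\xfe data\n"

# message_from_bytes decodes the input as ascii + surrogateescape internally.
msg = email.message_from_bytes(raw)

# Without decode=True, the escaped bytes are replaced with U+FFFD.
text = msg.get_payload()

# With decode=True, the payload is encoded back into the original bytes.
data = msg.get_payload(decode=True)
```

Here `text` has lost the original bytes, while `data` still carries them.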
Reviewed By: singhsrb
Differential Revision: D23720111
fbshipit-source-id: ed40a15056c39730c91067b830f194fbe41e5788
Summary:
This test is flaky because `hg up` does not always read data from the stores, and
thus does not always fail to read the LFS blob. A more reliable way to force a read
from the stores is to use `hg log -p`.
Reviewed By: DurhamG, singhsrb
Differential Revision: D23718823
fbshipit-source-id: 98bc37a76e93a67d031ba7bfa124b1db816983a1
Summary: The files use Python 3 only syntax and are not really used. Skip them so the Python 2 build won't hit invalid syntax issues.
Reviewed By: chadaustin
Differential Revision: D23717662
fbshipit-source-id: f911a83937be9ccc40194f321e3b41625a68e703
Summary:
Running `setup.py` with Python 3 for a Python 2 build causes issues because
`setup.py` writes `.pyc` files in Python 3 format.
Reviewed By: chadaustin
Differential Revision: D23717661
fbshipit-source-id: 38cfabdfdf20424a21f8a5bdaf826e74da2304ac
Summary:
In preparation for moving away from SSH as an intermediate entry point for
Mononoke, let Mononoke work with newly introduced Metadata. This removes any
assumptions we now make about how certain data is presented to us, making the
current "ssh preamble" no longer central.
Metadata is primarily based around identities and provides some
backwards-compatible entry points to make sure we can satisfy downstream
consumers of commits like hooks and logs.
Similarly, we now do our own reverse DNS resolving instead of relying on what's
been provided by the client. This is done in an async manner and we don't rely
on the result, so Mononoke can keep functioning in case DNS is offline.
Reviewed By: farnz
Differential Revision: D23596262
fbshipit-source-id: 3a4e97a429b13bae76ae1cdf428de0246e684a27
Summary:
As it says in the title, this adds support for receiving compressed responses
in the revisionstore LFS client. This is controlled by a flag, which I'll
roll out through dynamicconfig.
The hope is that this should greatly improve our throughput to corp, where
our bandwidth is fairly scarce.
Reviewed By: StanislavGlebik
Differential Revision: D23652306
fbshipit-source-id: 53bf86d194657564bc3bd532e1a62208d39666df
Summary:
This imports the async-compression crate. We have an equivalent-ish in
common/rust, but it targets Tokio 0.1, whereas this community-supported crate
targets Tokio 0.2 (it offers a richer API, notably in the sense that we
can use it for Streams, whereas the async-compression crate we have is only for
AsyncWrite).
In the immediate term, I'd like to use this for transfer compression in
Mononoke's LFS Server. In the future, we might also use it in Mononoke where we
currently use our own async compression crate when all that stuff moves to
Tokio 0.2.
Finally, this also updates zstd: the version we link to from tp2 is actually
zstd 1.4.5, so it's a good idea to just get the same version of the zstd crate.
The zstd crate doesn't keep a great changelog, so it's hard to tell what has changed.
At a glance, it looks like the answer is not much, but I'm going to look to Sandcastle
to root out potential issues here.
Reviewed By: StanislavGlebik
Differential Revision: D23652335
fbshipit-source-id: e250cef7a52d640bbbcccd72448fd2d4f548a48a
Summary: That might be used to pass more data to the server
Reviewed By: markbt
Differential Revision: D23704722
fbshipit-source-id: a6e41d615f6548f2f8fd036814c59573a45f93bc
Summary:
EdenFS is adding a Python 3 Thrift client intended for use by other
projects, and the Mercurial Python 2 build doesn't understand Python 3
syntax files, so switch the default getdeps build to Python 3.
Reviewed By: quark-zju
Differential Revision: D23587932
fbshipit-source-id: 6f47f1605987f9b37f888d29b49a848370d2eb0e
Summary:
We've often had cases where we need to nuke people's caches for various
reasons. It's a huge pain since we haven't had a way to communicate with all hg
clients. Now that we have configerator dynamicconfigs, we can use that to reach
all clients.
This diff adds support for configs like:
```
[hgcache-purge]
foo=2020-08-20
```
The key, 'foo' in this case, is an identifier used to only run this purge once.
The value is a date after which this purge will no longer run. This is useful
for bounding the damage from forgetting about a purge and having it delete caches
over and over in the future for new repos or repos where the run once marker
file is deleted for some reason.
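The run-once and expiry rules described above can be sketched as follows. This is an illustration only: the function name, the marker file name, and the choice to still run on the expiry date itself are assumptions, not the actual implementation.

```python
import datetime
import os
import tempfile


def should_purge(marker_dir, key, expiry, today):
    """Return True if purge `key` should run now, and record that it ran."""
    # The config value is the date after which the purge no longer runs.
    if today > datetime.date.fromisoformat(expiry):
        return False
    # Hypothetical run-once marker; the real marker name may differ.
    marker = os.path.join(marker_dir, "purge-" + key)
    if os.path.exists(marker):
        return False  # this purge already ran for this cache
    with open(marker, "w") as f:
        f.write(today.isoformat())
    return True


cache = tempfile.mkdtemp()
first = should_purge(cache, "foo", "2020-08-20", datetime.date(2020, 8, 1))
second = should_purge(cache, "foo", "2020-08-20", datetime.date(2020, 8, 2))
late = should_purge(cache, "bar", "2020-08-20", datetime.date(2020, 9, 1))
```

The first call runs the purge, the second is suppressed by the marker, and the third is suppressed by the expiry date.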
Reviewed By: quark-zju
Differential Revision: D23044205
fbshipit-source-id: 8394fcf9ba6df09f391b5317bad134f369e9b416
Summary:
`hg cloud rejoin` is used in fbclone.
By providing a bit more information about the available workspaces, we can improve the
user experience and try to eliminate the confusion that multiple workspaces cause.
Reviewed By: mitrandir77
Differential Revision: D23623063
fbshipit-source-id: 7598c1b58597032c9cfcef0b44b0ec1b00510ffa
Summary:
The corpus rev that biggrep has indexed may not be available in the
local client. Later on in the function it will pull that revision, but earlier
in the function the new logic I added a few weeks ago is just crashing.
That logic was trying to diff against the earlier revision, but that's pretty
arbitrary. Let's just diff against one of the revs at random
(deterministically) and get rid of the need for the hash to exist in the repo
early in the command.
Reviewed By: sfilipco
Differential Revision: D23635801
fbshipit-source-id: 1c284d710b8df9539a696e900183bc10d5d71869
Summary:
Fixes a few issues with Mononoke tests in Python 3.
1. We need to use different APIs to account for the unicode vs bytes difference
for path hash encoding.
2. We need to set the language environment for tests that create utf8 file
paths.
3. We need the redaction message and marker to be bytes. Oddly this test still
fails with jq CLI errors, but it makes it past the original error.
Reviewed By: quark-zju
Differential Revision: D23582976
fbshipit-source-id: 44959903aedc5dc9c492ec09a17b9c8e3bdf9457
Summary:
For repositories that have the old-style LFS extension enabled, the pointers
are stored in packfiles/indexedlog alongside a flag that signifies to the
upper layers that the blob is externally stored. With the new way of doing LFS,
pointers are stored separately.
When both are enabled, we are observing some interesting behavior where
different get and get_meta calls may return different blobs/metadata for the
same filenode. This may happen if a filenode is stored both in a packfile as an
LFS pointer and in the LFS store. Guaranteeing that the revisionstore code is
deterministic in this situation is unfortunately way too costly (a get_meta
call would for instance have to fully validate the sha256 of the blob, and this
wouldn't guarantee that it wouldn't become corrupted on disk before calling
get).
The solution taken here is to simply ignore all the LFS pointers from
packfiles/indexedlog when remotefilelog.lfs is enabled. This way, there is no
risk of reading the metadata from the packfiles, and the blob from the
LFSStore. This brings another complication, however, for user-created blobs:
these are stored in packfiles and would thus become unreadable. The solution is
to perform a one-time full repack of the local store so that all the pointers
are moved from the packfiles to the LFSStore.
In the code, the Python bindings use ExtStoredPolicy::Ignore directly, as
these are only used in the treemanifest code where no LFS pointers should be
present. The repack code uses ExtStoredPolicy::Use so that it can read the
pointers, which it could not do otherwise.
Reviewed By: DurhamG
Differential Revision: D22951598
fbshipit-source-id: 0e929708ba5a3bb2a02c0891fd62dae1ccf18204
Summary:
hg-http's built client should provide integration with Mercurial's stats
collection mechanisms.
Reviewed By: kulshrax
Differential Revision: D23577867
fbshipit-source-id: 93c777021bc347511322269d678d6879710eed3e
Summary:
Add `with_stats_reporting` to HttpClient. It takes a closure that will be
called with all `Stats` objects generated. We then use this function in
the hg-http crate to integrate with the metrics backend used in Mercurial.
Reviewed By: kulshrax
Differential Revision: D23577869
fbshipit-source-id: 5ac23f00183f3c3d956627a869393cd4b27610d4
Summary: Rust based metrics so that even Rust libraries can write metrics.
Reviewed By: quark-zju
Differential Revision: D23577870
fbshipit-source-id: b19904968d9372c8ce19775fb37c7af53a370ea5
Summary:
We start off simple here. Python only really has counters so we only implement
counters. There are a lot of options on how to improve this and things get
slightly complicated when we look at the whole ecosystem and fb303. Anyway,
simple start.
Reviewed By: quark-zju
Differential Revision: D23577874
fbshipit-source-id: d50f5b2ba302d900b254200308bff7446121ae1d
Summary:
Slash is probably the standard metric delimiter nowadays. Since we don't have
that many metrics I think that it makes sense to look at slash as the
standard metric delimiter going forward.
This diff updates parsing of metric names to treat both '_' and '/' as
delimiters.
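A minimal sketch of parsing with both delimiters. The function name and the sample metric names are hypothetical, and dropping empty components is an assumption rather than the documented behavior:

```python
import re


def split_metric_name(name):
    """Split a metric name on either '_' or '/'."""
    # Empty components (e.g. from a doubled delimiter) are dropped here;
    # that is a choice made for this sketch.
    return [part for part in re.split(r"[_/]", name) if part]


old_style = split_metric_name("http_client_requests")
new_style = split_metric_name("http/client/requests")
```

Both spellings yield the same components, so existing underscore-delimited names keep working.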
Reviewed By: quark-zju
Differential Revision: D23577876
fbshipit-source-id: 03997b1285df9c52d6e2837b5af5372deb69b133
Summary:
The command is easier to use than `hg cloud join --switch`.
Also highlight the workspace name in the output of `hg cloud status`
Reviewed By: mitrandir77
Differential Revision: D23601507
fbshipit-source-id: 74eb17c9366a9dbe96881c8e3e0705619fadb3d6
Summary:
Streaming clone implementation did not check that received files have the corrects. This change addresses it.
Before this change, if the connection was interrupted for whatever reason, the client would treat the changeset fetch as successful and proceed with the clone, but later checks would report corruption of the internal state of hg data. This is based on a user [report](https://fb.workplace.com/groups/scm/permalink/3177150312334567/)
Reviewed By: quark-zju, krallin
Differential Revision: D23572058
fbshipit-source-id: d740b45ca217cd6db0a65e01aabc2ba9a4835221
Summary: The Mercurial codebase uses hyphens in crate names rather than underscores. This is similar to the convention favored by the larger Rust community, though it is different from Mononoke, which uses underscores. While we'll probably need to eventually settle on a consistent convention for all of the projects in the Eden SCM repo, for now, `http_client` should be made consistent with the adjacent crates.
Reviewed By: sfilipco
Differential Revision: D23585721
fbshipit-source-id: d2e690d86815be02d7b8d645198bcd28e8cbd6e0
Summary: No more tokio-core! More `async/await`.
Reviewed By: kulshrax
Differential Revision: D23586509
fbshipit-source-id: b2e766ddb7575bc96963432f0c8582b4370b19aa
Summary:
This diff adds a `SocketTransport` implementation that no longer uses legacy `tokio-core` based futures but `tokio-tower` and `tower-service` for processing Thrift requests.
The old implementation is renamed to `SocketTransportLegacy` for better transitioning.
Reviewed By: dtolnay
Differential Revision: D20019196
fbshipit-source-id: 3bee684e9254bf1a81669ef0d2c2262a55e75daa
Summary:
In order to keep the hgcache size bounded we need to keep track of pack
file size even during normal operations and delete excess packs.
This has the negative side effect of deleting necessary data if the operation is
legitimately huge, but we'd rather have extra downloading time than fill up the
entire disk.
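The size bound can be sketched as below. The oldest-first eviction order and the `(mtime, name, size)` tuples are assumptions made for illustration; the actual selection policy is not described here.

```python
def packs_to_delete(packs, max_bytes):
    """Pick packs to delete so the total size fits within max_bytes.

    `packs` is a list of (mtime, name, size) tuples. Oldest packs go first,
    which is an assumption of this sketch.
    """
    total = sum(size for _, _, size in packs)
    doomed = []
    for _, name, size in sorted(packs):
        if total <= max_bytes:
            break
        doomed.append(name)
        total -= size
    return doomed


packs = [(3, "c.datapack", 10), (1, "a.datapack", 10), (2, "b.datapack", 10)]
doomed = packs_to_delete(packs, max_bytes=15)
```

Note that nothing here distinguishes packs written by the current operation, which is why a legitimately huge operation can lose data it still needs.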
Reviewed By: quark-zju
Differential Revision: D23486922
fbshipit-source-id: d21be095a8671d2bfc794c85918f796358dc4834
Summary:
In a future diff we'll add logic to delete old pack files. We'll want
to use this pack iteration code, so let's move it to a function.
Reviewed By: quark-zju
Differential Revision: D23486920
fbshipit-source-id: 5f872e946ffe816289c925dd2e03c292e29da5af
Summary:
As the repository grows the opportunity for large downloads increases.
Today all writes to data packs get sent straight to disk, but we have no way to
prevent this from eating all the disk.
Let's automatically flush datapacks when they reach a certain size (default
4GB). In a future diff this will let us automatically garbage collect data packs
to bound the maximum size of packs.
Rotatelog already has this behavior.
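A rough sketch of flushing once a size threshold is reached. The class and attribute names are hypothetical, and an in-memory list stands in for packs written to disk:

```python
class RotatingPackWriter:
    """Buffers entries and starts a new pack once a size limit is reached."""

    def __init__(self, max_bytes=4 * 1024 ** 3):  # 4GB default, as above
        self.max_bytes = max_bytes
        self.current = []
        self.current_size = 0
        self.flushed = []  # stands in for packs that were written out

    def add(self, data):
        self.current.append(data)
        self.current_size += len(data)
        if self.current_size >= self.max_bytes:
            self.flush()

    def flush(self):
        if self.current:
            self.flushed.append(b"".join(self.current))
            self.current = []
            self.current_size = 0


writer = RotatingPackWriter(max_bytes=4)
writer.add(b"ab")
writer.add(b"cd")  # reaches the limit, so the pack is flushed
writer.add(b"e")   # starts a new pack
```

Because each pack is closed at a bounded size, a later cleanup pass can reason about whole packs instead of one ever-growing file.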
Reviewed By: quark-zju
Differential Revision: D23478780
fbshipit-source-id: 14f9f707e8bffc59260c2d04c18b1e4f6bdb2f90
Summary:
See D23538897 for context. This adds a killswitch so we can rollout client
certs gradually through dynamicconfig.
Reviewed By: StanislavGlebik
Differential Revision: D23563905
fbshipit-source-id: 52141365d89c3892ad749800db36af08b79c3d0c
Summary:
Like it says in the title, this updates remotefilelog to present client
certificates when connecting to LFS (this was historically the case in the
previous LFS extension). This has a few upsides:
- It lets us understand who is connecting, which makes debugging easier;
- It lets us enforce ACLs.
- It lets us apply different rate limits to different use cases.
Config-wise, those certs were historically set up for Ovrsource, and the auth
mechanism will ignore them if not found, so this should be safe. That said, I'd
like to add a killswitch for this nonetheless. I'll reach out to Durham to see if I
can use dynamic config for that.
Also, while I was in there, I cleaned up a few functions that were taking
ownership of things but didn't need it.
Reviewed By: DurhamG
Differential Revision: D23538897
fbshipit-source-id: 5658e7ae9f74d385fb134b88d40add0531b6fd10
Summary:
Generated by formatting with rustfmt 2.0.0-rc.2 and then a second time with fbsource's current rustfmt (1.4.14).
This results in formatting for which rustfmt 1.4 is idempotent but is closer to the style of rustfmt 2.0, reducing the amount of code that will need to change atomically in that upgrade.
---
*Why now?* The 1.x branch is no longer being developed and fixes like https://github.com/rust-lang/rustfmt/issues/4159 (which we need in fbcode) only land to the 2.0 branch.
---
Reviewed By: zertosh
Differential Revision: D23568779
fbshipit-source-id: 477200f35b280a4f6471d8e574e37e5f57917baf
Summary:
This makes it easy for `metaedit` to be used by automation. Given
a simple JSON file with a hash -> {user, message} mapping, `metaedit` will
do all of its work without any prompts.
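For illustration, a mapping of that shape could look like the one below. The hash, user, and message are placeholders, the loader is hypothetical, and rejecting unknown keys is an assumption rather than documented `metaedit` behavior:

```python
import json

# Hypothetical input; the hash and field values are placeholders.
text = """
{
  "1111111111111111111111111111111111111111": {
    "user": "Alice <alice@example.com>",
    "message": "fix typo in docs"
  }
}
"""


def load_edits(text):
    """Parse a hash -> {user, message} mapping and reject unknown keys."""
    edits = json.loads(text)
    for node, fields in edits.items():
        unknown = set(fields) - {"user", "message"}
        if unknown:
            raise ValueError("%s: unexpected keys %s" % (node, sorted(unknown)))
    return edits


edits = load_edits(text)
```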
Reviewed By: quark-zju
Differential Revision: D23545527
fbshipit-source-id: 18763ecacff9143b9ad492faf654b176b0f86d1f
Summary:
The "meaningfulparents" concept is coupled with rev numbers.
Remove it. This changes default templates to not show parents, and `{parents}`
template to show parents.
Reviewed By: DurhamG
Differential Revision: D23408970
fbshipit-source-id: f1a8060122ee6655d9f64147b35a321af839266e
Summary:
Now that the Rust revisionstore records undesired filename fetches,
let's log those results to Scuba in Python.
Reviewed By: StanislavGlebik
Differential Revision: D23462572
fbshipit-source-id: b55f2290e30e3a5c3b67d9f612b24bc3aad403a8
Summary:
We want to be able to record when fetches to certain paths happen.
Let's add recording infrastructure to the new ReportingRemoteDataStore.
A future diff will make the seen paths accessible from Python for scuba logging.
Reviewed By: xavierd
Differential Revision: D23462574
fbshipit-source-id: 5d749f2429e26e8e7fe4fb5adc29140b4309eac9
Summary:
We want to monitor what paths are fetched from our remote servers.
Since all of our remote stores are hidden behind the RemoteDataStore interface,
let's create a wrapper around that. A future diff will insert the actual
monitoring and reporting.
Reviewed By: quark-zju
Differential Revision: D23462571
fbshipit-source-id: e6031f19db23f7d1b09767efb9613d7528fb457d
Summary: This hopefully makes it more obvious so it looks less like an hg crash.
Reviewed By: kulshrax
Differential Revision: D23509569
fbshipit-source-id: 7174780bc7e9841e3f89a482280c49427b62fb74
Summary:
The revs can change after flush. For example, during pushrebase, some ctx might
initially have a non-master Id assigned, and later got assigned an Id in the
master group:
```
ipdb> p self.__dict__
{'_repo': <edenscm.hgext.fastannotate.protocol.localreposetup.<locals>.fastannotaterepo object at 0x7f2415b3f8e0>, '_rev': 72057594038527478, '_node': b'\xb6\x12\xcd\x81b#\xa3\x01\xe2pP\x84\x05{\xd2He\xbe\xcc\xf0'}
ipdb> p self._node
b'\xb6\x12\xcd\x81b#\xa3\x01\xe2pP\x84\x05{\xd2He\xbe\xcc\xf0'
ipdb> p self._repo.changelog.rev(self._node)
7198913
ipdb> p self._rev
72057594038527478
```
Note that `self._rev` becomes inconsistent with `changelog.rev(self._node)`.
The error looks like:
$ hg push -r . --to master --debug --trace --traceback --verbose
...
pushing rev 556400239977 to destination ...
...
1 commits found
list of changesets:
556400239977b9ed523eae5ad28773784c975f7f
sending unbundle command
...
added 79 commits with 0 changes to 0 files
moving remote bookmark 'remote/master' to 84829e9242e4
...
using eden update code path
Traceback (most recent call last):
...
File "/opt/fb/mercurial/edenscm/mercurial/merge.py", line 2220, in update
return eden_update.update(
File "/opt/fb/mercurial/edenscm/mercurial/eden_update.py", line 126, in update
stats, actions = _handle_update_conflicts(
...
File "/opt/fb/mercurial/edenscm/mercurial/context.py", line 503, in _changeset
return self._repo.changelog.changelogrevision(self.rev())
# self = <changectx 84829e9242e4>
File "/opt/fb/mercurial/edenscm/mercurial/changelog2.py", line 312, in changelogrevision
return changelogrevision(self.revision(nodeorrev))
# nodeorrev = 72057594038527521
File "/opt/fb/mercurial/edenscm/mercurial/changelog2.py", line 365, in revision
node = self.node(nodeorrev)
# nodeorrev = 72057594038527521
File "/opt/fb/mercurial/edenscm/mercurial/changelog2.py", line 280, in node
raise IndexError("revlog index out of range")
Traceback (most recent call last):
File "/opt/fb/mercurial/edenscm/mercurial/changelog2.py", line 278, in node
return self.idmap.id2node(rev)
error.CommitLookupError: 'N599585 cannot be found'
Change the `context` object to not memoize revs.
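The idea can be sketched with stand-in classes (these are not the real `changectx` or changelog types): resolve the rev from the node on every call, so a reassignment after flush cannot leave a stale value behind.

```python
class FakeChangelog:
    """Stand-in for a changelog whose node -> rev mapping can change."""

    def __init__(self):
        self.revs = {}

    def rev(self, node):
        return self.revs[node]


class Ctx:
    """Stand-in context that stores only the node, never the rev."""

    def __init__(self, changelog, node):
        self._changelog = changelog
        self._node = node

    def rev(self):
        # Look the rev up each time instead of caching it.
        return self._changelog.rev(self._node)


cl = FakeChangelog()
node = b"\xb6\x12\xcd\x81"  # shortened placeholder node
cl.revs[node] = 72057594038527478  # non-master Id before flush
ctx = Ctx(cl, node)
before = ctx.rev()
cl.revs[node] = 7198913  # reassigned into the master group after flush
after = ctx.rev()
```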
Reviewed By: DurhamG
Differential Revision: D23468702
fbshipit-source-id: b623bcec99b09d61169371e08c69fc6d6f38935c
Summary:
Based on fbsource data, building level 5 proves not to be useful.
This would save 300ms in the write path.
Reviewed By: sfilipco
Differential Revision: D23494505
fbshipit-source-id: ca795b4900af40dbfdaa463d36f3169413bf6a62
Summary:
Previously the IdMap's "Name -> Id" index simply ignored the "reassign
non-master" request. It turns out stale entries in that index can cause
issues as demonstrated by the previous diff.
Update IdMap to actually remove both indexes of non-master group on
remove_non_master so it cannot have stale entries.
To optimize the index, the format of IdMap is changed from:
[ 8 bytes Id (Big Endian) ] [ Name ]
to:
[ 8 bytes Id (Big Endian) ] [ 1 byte Group ] [ Name ]
So the index can use reference to the slice, instead of embedding the bytes, to
reduce index size.
The filesystem directory name for IdMap used by NameDag is bumped to `idmap2`
so it won't read the incompatible old `idmap` data.
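The new entry layout can be illustrated with Python's `struct` module. The helper names and the sample group value are hypothetical; only the byte layout comes from the description above.

```python
import struct

# ">QB" is an 8-byte big-endian unsigned Id followed by a 1-byte Group.
HEADER = ">QB"
HEADER_LEN = struct.calcsize(HEADER)  # 9 bytes, no padding


def encode_entry(id_, group, name):
    """Build [ 8 bytes Id (Big Endian) ] [ 1 byte Group ] [ Name ]."""
    return struct.pack(HEADER, id_, group) + name


def decode_entry(entry):
    """Split an entry back into (id, group, name)."""
    id_, group = struct.unpack_from(HEADER, entry)
    return id_, group, entry[HEADER_LEN:]


entry = encode_entry(7198913, 0, b"\xb6\x12\xcd\x81")
decoded = decode_entry(entry)
```

Since the Group byte and the Name are adjacent in the entry, an index keyed on them can point at that slice of the entry rather than store its own copy.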
Reviewed By: sfilipco
Differential Revision: D23494508
fbshipit-source-id: 3cb7782577750ba5bd13515b370f787519ed3894
Summary: Some vertexes can disappear from the graph!
Reviewed By: sfilipco
Differential Revision: D23494506
fbshipit-source-id: ecbf2a4169e5fc82596e89a4bfe4c442a82e9cd2
Summary: The TestDag struct will be used to do some more complicated tests.
Reviewed By: sfilipco
Differential Revision: D23494507
fbshipit-source-id: 11350f9e448725ae49f50a7b6f19efc57ad84448
Summary:
A few things here:
- The heads must be bytes.
- The arguments to wireproto must be strings (we used to encode / decode them,
but we shouldn't).
- The bookmark must be a string (otherwise it gets serialized as `"b\"foo\""`
and then it deserializes to that instead of `foo`).
Reviewed By: StanislavGlebik
Differential Revision: D23499846
fbshipit-source-id: c8a657f24c161080c2d829eb214d17bc1c3d13ef
Summary:
Replacing places where the tokio runtime is instantiated inside the edenapi
client crate.
Reviewed By: quark-zju
Differential Revision: D23468596
fbshipit-source-id: ef68718c7d5b89b6477a2946daaa51618b53d06a
Summary:
At open time, it's pointless to attempt to create new levels. So let's just
read the existing max_level and do not try to build max_level + 1.
This turns out to save 300ms in profiling results.
Reviewed By: sfilipco
Differential Revision: D23494509
fbshipit-source-id: 4ea326a3cc21792790ea0b87e5bf608a94ae382b
Summary:
With MultiLog, per-log meta was previously entirely ignored. However, it can
be useful for updated indexes. For example, an application defines a new index
and opens a Log via MultiLog. The application would expect the new index to be
built only once. Without MultiLog, per-log meta is updated at open time in
place. With MultiLog, the updated index meta is not written back to the
multimeta so the new index would be rebuilt multiple times undesirably.
Update MultiLog to reuse the per-log meta if it's compatible so it can pick up
new indexes.
Reviewed By: sfilipco
Differential Revision: D23488212
fbshipit-source-id: c8b3e6b5589dbda2e76a143d15085862a93dae22
Summary:
The poisoned meta makes investigation harder, e.g. `debugdumpindexlog` won't
work on those logs.
Reviewed By: sfilipco
Differential Revision: D23488213
fbshipit-source-id: b33894d8c605694b6adf5afdaed45707fbd7357e
Summary:
Change dag_ops benchmarks to use different IdDagStores. An example run shows:
benchmarking dag::iddagstore::indexedlog_store::IndexedLogStore
building segments (old) 856.803 ms
building segments (new) 127.831 ms
ancestors 54.288 ms
children (spans) 619.966 ms
children (1 id) 12.596 ms
common_ancestors (spans) 3.050 s
descendants (small subset) 35.652 ms
gca_one (2 ids) 164.296 ms
gca_one (spans) 3.132 s
gca_all (2 ids) 270.542 ms
gca_all (spans) 2.817 s
heads 247.504 ms
heads_ancestors 40.106 ms
is_ancestor 108.719 ms
parents 243.317 ms
parent_ids 10.752 ms
range (2 ids) 7.370 ms
range (spans) 23.933 ms
roots 620.150 ms
benchmarking dag::iddagstore::in_process_store::InProcessStore
building segments (old) 790.429 ms
building segments (new) 55.007 ms
ancestors 8.618 ms
children (spans) 196.562 ms
children (1 id) 2.488 ms
common_ancestors (spans) 545.344 ms
descendants (small subset) 8.093 ms
gca_one (2 ids) 24.569 ms
gca_one (spans) 529.080 ms
gca_all (2 ids) 38.462 ms
gca_all (spans) 540.486 ms
heads 103.930 ms
heads_ancestors 6.763 ms
is_ancestor 16.208 ms
parents 103.889 ms
parent_ids 0.822 ms
range (2 ids) 1.748 ms
range (spans) 6.157 ms
roots 197.924 ms
benchmarking dag::iddagstore::bytes_store::BytesStore
building segments (old) 724.467 ms
building segments (new) 90.207 ms
ancestors 23.812 ms
children (spans) 348.237 ms
children (1 id) 4.609 ms
common_ancestors (spans) 1.315 s
descendants (small subset) 20.819 ms
gca_one (2 ids) 72.423 ms
gca_one (spans) 1.346 s
gca_all (2 ids) 116.025 ms
gca_all (spans) 1.470 s
heads 155.667 ms
heads_ancestors 19.486 ms
is_ancestor 51.529 ms
parents 157.285 ms
parent_ids 5.427 ms
range (2 ids) 4.448 ms
range (spans) 13.874 ms
roots 365.568 ms
Overall, InProcessStore > BytesStore > IndexedLogStore. The InProcessStore
uses `Vec<BTreeMap<Id, StoreId>>` for the level-head index, which is more
efficient on the "Level" lookup (Vec), and more cache efficient (BTree).
BytesStore outperforms IndexedLogStore because it does not need to verify
checksum on every read access - the checksum was verified at store creation
(IdDag::from_bytes).
Note: The `BytesStore` is something optimized for serialization, and hasn't been sent.
Reviewed By: sfilipco
Differential Revision: D23438174
fbshipit-source-id: 6e5f15188e3b935659ccde25fac573e9b963b78f
Summary: This allows them to use the SyncableIdDag APIs.
Reviewed By: sfilipco
Differential Revision: D23438170
fbshipit-source-id: 7ec7288cfb8186b88f85f0212a913cb0dffe7345
Summary: Other IdDagStores can also use the API. This will be used in benchmarks.
Reviewed By: sfilipco
Differential Revision: D23438180
fbshipit-source-id: 565552b66372dcfbb268c397883f627491d6e154
Summary:
Similar to `IdDagStore::sync` -> `GetLock::persist`, `reload` is more related
to filesystem/internal state exchange, and should be protected by a lock. So
let's move the API there, and require a lock.
Reviewed By: sfilipco
Differential Revision: D23438169
fbshipit-source-id: 4228106b7739a1a758677adfddd213ad54aa4b6a
Summary:
`NameDag::reload` is used in `flush` to get a "fresh" NameDag.
In a future diff the `IdDag::reload` API gets changed, so let's
remove NameDag's use of it.
Instead, let's just re-`open` the path again to get a fresh NameDag.
It's a bit more expensive but probably okay, and easier to understand.
`get_new_segment_size()` was added as an internal API to preserve tests.
This also solves an issue where `NameDag` cannot recover properly if its
`flush` fails, because the old `NameDag` state is not lost.
After removing `NameDag::reload`, `IdMap::reload` is no longer used publicly
and was made private.
Reviewed By: sfilipco
Differential Revision: D23438179
fbshipit-source-id: 0a32556a2cd786919c233d7efcae1cb9cbc5fb09
Summary:
The word "sync" is bi-directional: flush + reload. It was indexedlog::Log's
behavior. However, in the IdDag context "sync" is confusing - it is actually
only used to write data out, with protection from a lock. Rename to `persist`
to clarify it's memory -> disk. In addition, require a reference to a lock object
as a lightweight proof that some lock is held.
Reviewed By: sfilipco
Differential Revision: D23438175
fbshipit-source-id: 3d9ccd7431691d1c4e2ee74f3c80d95f5e7243b5
Summary:
This removes the need of cloning `IdMap`.
SyncableIdMap is a bit tricky. I added some comments to clarify things.
Reviewed By: sfilipco
Differential Revision: D23438176
fbshipit-source-id: fe66071da07067ed6c53a6437790af1d81b28586
Summary:
Make the test cover IndexedLogIdDagStore. The only change is the parent index
returns children in a different order.
Reviewed By: sfilipco
Differential Revision: D23438173
fbshipit-source-id: bcfabcd329e45bbc5e7e773103fa42307c23c35d
Summary:
There aren't too many things that we can do with the responses that we get back
from the server. Things are somewhat application specific for this endpoint.
One option that is not available right now and might make sense to add is
limiting the number of entries that are printed for a given location.
Reviewed By: kulshrax
Differential Revision: D23456220
fbshipit-source-id: eb24602c3dea39b568859b82fc27b7f6acc77600
Summary:
To reduce the size over the wire on cases where we would be traversing the
changelog on the client, we want to allow the endpoint to return a whole parent
chain with their hashes.
Reviewed By: kulshrax
Differential Revision: D23456216
fbshipit-source-id: d048462fa8415d0466dd8e814144347df7a3452a
Summary:
Renaming all the LocationToHash related structures to CommitLocationToHash.
This is done for consistency. I realized the issue when the command for reading
the request from cbor was not what I was expecting it to be. The reason was that
the commit prefix was used inconsistently for LocationToHash.
Reviewed By: kulshrax
Differential Revision: D23456221
fbshipit-source-id: 0181dcaf81368b978902d8ca79c5405838e4b184
Summary:
The default archive behavior archives the entire working copy. That is
undesirable and easy to accidentally trigger in a large repository. Let's
prevent it and require users to specify what they want archived.
Reviewed By: quark-zju
Differential Revision: D23464818
fbshipit-source-id: c39a631d618c2007e442e691cda542400cf8f4c3
Summary:
Replacing uses of the custom Runtime in lfs with the global runtime in the
`async-runtime` crate.
Reviewed By: xavierd
Differential Revision: D23468347
fbshipit-source-id: 61d2858634a37eb2d7d807104702d24889ec047a
Summary:
debugstacktrace is broken right now on Python 3: it wants to write to stderr,
which expects `bytes`, but it tries to write a `str`. This fixes it.
Reviewed By: DurhamG
Differential Revision: D23447984
fbshipit-source-id: 5896ae858f6022276fa47e08636c700159a2a678
Summary: Make it possible to test other IdDagStores.
Reviewed By: sfilipco
Differential Revision: D23438178
fbshipit-source-id: e5fc1b20833c71dd7569c77c31c76a26a6e357fe
Summary:
Now that SpanSet can easily support `push_front`, we can just use SpanSet
efficiently without SpanSetAsc.
Reviewed By: sfilipco
Differential Revision: D23385246
fbshipit-source-id: b2e0086f014977fa990d5142e6eee844293e7ca5
Summary: To remove SpanSetAsc, its API needs to be implemented on SpanSet.
Reviewed By: sfilipco
Differential Revision: D23385250
fbshipit-source-id: ebd9d537287b5c1cde6e2c52ffb6da57dbd71852
Summary: This will make it possible to `push_front` and remove SpanSetAsc special case.
Reviewed By: sfilipco
Differential Revision: D23385249
fbshipit-source-id: 63ac67e9bce7cb281236399b3fb86eba23bbf8a0
Summary:
This makes it easier to replace Vec<Span> with VecDeque<Span> in SpanSet for
efficient push_front and deprecates SpanSetAsc (which uses Id in a somewhat
hacky way, since they are not real Ids).
Reviewed By: sfilipco
Differential Revision: D23385245
fbshipit-source-id: b612cd816223a301e2705084057bd24865beccf0
Summary:
One user reported a very slow rebase (tens of minutes and still running). The
commit is not very large. Python 2 can complete the rebase in 6 seconds.
I tracked it down to this code path. Making the change makes the Python 3
rebase fast too (< 10 seconds). I haven't tracked down exactly why Python
3 is slow yet (maybe an N^2 `a += b`?).
Some numbers about the slow merge:
ipdb> p len(m3.atext)
17984924
ipdb> p len(m3.btext)
17948110
ipdb> p len(m3.a)
613353
ipdb> p len(m3.b)
612129
ipdb> p len(m3.base)
612135
Reviewed By: singhsrb
Differential Revision: D23441221
fbshipit-source-id: 14b725439f4ecd3352edca512cdde32958b2ce29
Summary:
Previously the `is_valid()` function only checked ordering.
Make it also check "no mergeable adjacent spans" and `span.low<=span.high`.
To provide better debug messages, the function does assertions
directly without returning a bool.
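The three checks can be sketched in Python as below. Spans are modeled as inclusive `(low, high)` pairs in ascending order, which is an assumption of this sketch and may not match the order `SpanSet` uses internally:

```python
def assert_valid(spans):
    """Assert the invariants directly so a failure names the broken rule."""
    for low, high in spans:
        assert low <= high, "span %d..=%d has low > high" % (low, high)
    for (_, prev_high), (low, _) in zip(spans, spans[1:]):
        assert prev_high < low, "spans out of order or overlapping"
        assert prev_high + 1 < low, "adjacent spans should be merged"


assert_valid([(1, 4), (6, 9)])  # passes silently
```

Asserting in place means the failure message points at the specific rule that was violated, instead of a bare `False`.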
Reviewed By: sfilipco
Differential Revision: D23385247
fbshipit-source-id: 84829e9242e47e68dc2a4b2a6775b13331eba959
Summary:
Previously, `SpanSet::from_sorted_spans` allows having adjacent spans like
`[1..=2, 3..=4]`, while `SpanSet::from_spans` would merge them into `[1..=4]`.
Change it so `SpanSet::from_sorted_spans` merges them too. This simplifies
the `contains` logic and could make some Sets more efficient.
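A small sketch of the merging behavior, using inclusive `(low, high)` pairs in ascending order as in the example above. This is a Python stand-in written for illustration, not the Rust implementation:

```python
def from_sorted_spans(spans):
    """Merge touching or overlapping spans from an ascending input."""
    merged = []
    for low, high in spans:
        if merged and merged[-1][1] + 1 >= low:
            # Extend the previous span instead of starting a new one.
            prev_low, prev_high = merged[-1]
            merged[-1] = (prev_low, max(prev_high, high))
        else:
            merged.append((low, high))
    return merged


adjacent = from_sorted_spans([(1, 2), (3, 4)])
separate = from_sorted_spans([(1, 2), (4, 5)])
```

With adjacent spans always merged, every gap between stored spans is a real gap.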
Reviewed By: sfilipco
Differential Revision: D23385248
fbshipit-source-id: 85b5ba9533f15034779e93255085a4fa09c6328a
Summary:
See the test change. Partially successful auto restack should have bookmarks
moved.
Reviewed By: DurhamG
Differential Revision: D23441932
fbshipit-source-id: 07e509a70bcc5cf81f702d40ec1b8dc4a5a781ff
Summary:
Those commands are broken right now: they try to write bytes but don't use
`writebytes`.
Reviewed By: DurhamG
Differential Revision: D23450968
fbshipit-source-id: 5d554771459f81718d90e5bad9a4c439cbb05d97
Summary:
When Python 3 wants to upload a file-like object, it does something a bit
awkward: it sets the `Transfer-Encoding` to `chunked`, but doesn't actually
chunk the data. Also, for some reason, it still sets the `Content-Length`. I'm
not sure where that is coming from.
The thing is, when you set `Transfer-Encoding` to `chunked`, you do need to
chunk, or the other end is going to get very confused.
Unfortunately, this is not what happens here (note that the "send" logs are
from enabling http tracing in Python here, and those logs are basically one
line before `.send()` into a socket, so the chunking doesn't appear to happen
elsewhere):
```
[torozco@devbig051]~/opsfiles_bin % echo "aaaa" | ~/fbcode/buck-out/gen/eden/scm/__hg-py3__/hg-py3.sh debuglfssend https://mononoke-lfs.internal.tfbnw.net/opsfiles_bin
send: b'PUT /opsfiles_bin/upload/11a77c3d96c06974b53d7f40a577e6813739eb5c811b2a86f59038ea90add772/5 HTTP/1.1\r\nAccept-Encoding: identity\r\nContent-length: 5\r\nx-client-correlator: tQT3yBfFEzhVtqI5\r\naccept: application/mercurial-0.1\r\ncontent-type: application/x-www-form-urlencoded\r\nhost: mononoke-lfs.internal.tfbnw.net\r\ntransfer-encoding: chunked\r\nuser-agent: mercurial/4.4.2_dev git/2.15.1\r\n\r\n'
sendIng a read()able
send: b'aaaa\n'
reply: 'HTTP/1.1 400 Bad request\r\n'
header: Content-Type: text/html; charset=utf-8
header: Access-Control-Allow-Origin: *
header: proxy-status: client_read_error; e_upip="AcLKajO63Vab0hC4kzGZQsqck3P_YOu7HsBzshC-NCbuo31tlWWqCiVw5xVLh44LYYe7qioCPqYSb8-1cBpdvFDZb_t5oYRP1Q"; e_proxy="AcJjRKHG02qo6Bv6fEPCUVF7DpCyrq3rmSnXhRLWakKWREEvVpk4jc-tzDyG6l9jvn3vNo8PYPG_5hLtC3L1"
header: Date: Tue, 01 Sep 2020 13:10:35 GMT
header: Connection: close
header: Content-Length: 2959
```
What's a bit confusing to me here is where this Content-length header comes
from. Indeed, normally Python 3 will:
- Not infer a content-length for file-like objects (which is what we have)
https://fburl.com/ms94eq31
- Set Transfer-Encoding if no Content-Length is present:
https://fburl.com/f81g8v2j
So, it's a bit unexpected that a) we have a Content-Length (we shouldn't), and
that b) we also have a Transfer-Encoding header. That said, setting the
Content-Length does fix the problem, so that's what this diff does.
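For reference, this is roughly what a chunked body should look like on the wire under HTTP/1.1, as opposed to the raw `b'aaaa\n'` seen in the log above. It is a standalone sketch of the framing only, not the fix in this diff; `encode_chunked` is a hypothetical helper.

```python
def encode_chunked(body: bytes, chunk_size: int = 4096) -> bytes:
    """Frame `body` using HTTP/1.1 chunked transfer coding."""
    out = []
    for start in range(0, len(body), chunk_size):
        chunk = body[start:start + chunk_size]
        # Each chunk: size in hex, CRLF, the data, CRLF.
        out.append(b"%x\r\n" % len(chunk) + chunk + b"\r\n")
    # A zero-sized chunk followed by an empty line terminates the body.
    out.append(b"0\r\n\r\n")
    return b"".join(out)


print(encode_chunked(b"aaaa\n"))  # b'5\r\naaaa\n\r\n0\r\n\r\n'
```

A server that honors `Transfer-Encoding: chunked` would try to parse `aaaa` as a hex chunk size, which is consistent with the 400 response above.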
Reviewed By: DurhamG
Differential Revision: D23450969
fbshipit-source-id: e1f535ff3d0b49c0c914130593d9aebe89ba18ca
Summary:
As a follow up to the previous diff, let's also warn if dirstate includes
marker files that should not be included in any sparse profiles.
Reviewed By: DurhamG
Differential Revision: D23414361
fbshipit-source-id: 3d171328bf0ba5754e5bacde85f09abb4fed8603
Summary: There seems to be no need to use a shell.
Reviewed By: DurhamG
Differential Revision: D23124756
fbshipit-source-id: 7de1c23e2325fe88dc4c6a2c90563d06f109ed2f
Summary:
The Rust process utility avoids issues with interaction with Python and can do file
redirection on Windows.
Reviewed By: DurhamG
Differential Revision: D23124755
fbshipit-source-id: f72b88bafd19b3b41e53afbf6a4095d0d6bcb93a
Summary:
The Rust bindings handle the cross-platform differences and avoid issues
with Python / Rust interaction. Use it.
As we're here, extend the API to support cwd and env.
Reviewed By: DurhamG
Differential Revision: D23124171
fbshipit-source-id: fdc13f6eaeb25c05b53d385eb220af33dad984e1
Summary:
Spawning processes turns out to be tricky.
Python 2:
- "fork & exec" in plain Python is potentially dangerous. See D22855986 (c35b8088ef).
Disabling GC might have solved it, but still seems fragile.
- "close_fds=True" works on Windows if there is no redirection.
- Does not work well with `disable_standard_handle_inheritability` from `hgmain`.
We patched it. See `contrib/python2-winbuild/0002-windows-make-subprocess-work-with-non-inheritable-st.patch`.
Python 3:
- "subprocess" uses native code for "fork & exec". It's safer.
- (>= 3.8) "close_fds=True" works on Windows even with redirection.
- "subprocess" exposes options to tweak low-level details on Windows.
Rust:
- No "close_fds=True" support on either Windows or Unix.
- Does not have the `disable_standard_handle_inheritability` issue on Windows.
- Impossible to cleanly support "close_fds=True" on Windows with existing stdlib.
https://github.com/rust-lang/rust/pull/75551 attempts to add that to stdlib.
D23124167 provides a short-term solution that can have corner cases.
Mercurial:
- `win32.spawndetached` uses raw Win32 APIs to spawn processes, bypassing
the `subprocess` Python stdlib.
- Its use of `CreateProcessA` is undesirable. We probably want `CreateProcessW`
(unless `CreateProcessA` speaks utf-8 natively).
We are still on Python 2 on Windows, and we'd need to spawn processes correctly
from Rust anyway, and D23124167 kind of fills the missing feature of `close_fds=True`
from Python. So let's expose the Rust APIs.
The binding APIs closely match the Rust API. So when we migrate from Python to
Rust, the translation is more straightforward.
Reviewed By: DurhamG
Differential Revision: D23124168
fbshipit-source-id: 94a404f19326e9b4cca7661da07a4b4c55bcc395
Summary:
The Rust upstream took the "set F_CLOEXEC on every opened file" approach and
provided no support for closing fds at spawn time to make spawn lightweight [1].
However, that does not play well in our case:
- On Windows:
- stdin/stdout/stderr are not created by Rust, and inheritable by
default (other processes like `cargo` or `dotslash` might leak them too).
- a few other handles like "Null", "Afd" are inheritable. It's
unclear how they get created, though.
- Fortunately, files opened by Python or C in edenscm (ex. packfiles) seem to
be not inheritable and do not require special handling.
- On Linux:
- Files opened by Python or C likely lack F_CLOEXEC and need special
handling.
Implement logic to close file handles (or set F_CLOEXEC) explicitly.
[1]: https://github.com/rust-lang/rust/issues/12148
Reviewed By: DurhamG
Differential Revision: D23124167
fbshipit-source-id: 32f3a1b9e3ae3a9475609df282151c9d6c4badd4
Summary:
It uses `sys.argv`, which might be rewritten by `debugshell`. Capture
`sys.argv` to make hgcmd more reliable.
Reviewed By: DurhamG
Differential Revision: D22993215
fbshipit-source-id: 5fa319e8023b656c6cdf96cb3229ea9f2c9b9b99
Summary: This allows us to run commands after changes were made to the repo.
Reviewed By: DurhamG
Differential Revision: D22993218
fbshipit-source-id: d9943dcda94da42970fb9107f48f4caa14b6a9d4
Summary:
Some code paths (ex. metalog.commit) use `util.timer()` as a way to get
seconds since epoch, and get 0 for tests. Other use-cases of `util.timer()`
are ad-hoc time measurements for displaying speed / progress. They do not need high
precision or strong guarantee that the clock does not go backwards. Drop the
`time.perf_counter()` to meet the first use-case's expectation.
Reviewed By: singhsrb
Differential Revision: D23431253
fbshipit-source-id: 8bf2d1ed32e284e17285742e1d0fd7178f181fb3
Summary:
With segments backend, the revision numbers will be longer than commit hashes
and are confusing.
Reviewed By: DurhamG
Differential Revision: D23408971
fbshipit-source-id: e2057fa644fc7b6be4291f879eee3235bb4e687b
Summary:
Pulling from older repos (ex. years ago) could require GBs of commit text data.
Flush commit data once it exceeds a certain size.
This is for revlog compatibility.
In the future we will probably just make commit text lazy to avoid this kind of issue.
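The flush-on-size idea can be sketched as below. This is an illustration under assumptions, not the code in this diff: `CommitBuffer`, `write_batch`, and the `(node, text)` shape are hypothetical names.

```python
from typing import Callable, List, Tuple

Commit = Tuple[bytes, bytes]  # (node, text); hypothetical shape


class CommitBuffer:
    """Collect commits and hand them to `write_batch` once enough text has piled up."""

    def __init__(self, write_batch: Callable[[List[Commit]], None], max_bytes: int) -> None:
        self._write_batch = write_batch
        self._max_bytes = max_bytes
        self._pending: List[Commit] = []
        self._pending_bytes = 0

    def add(self, node: bytes, text: bytes) -> None:
        self._pending.append((node, text))
        self._pending_bytes += len(text)
        if self._pending_bytes >= self._max_bytes:
            self.flush()

    def flush(self) -> None:
        if self._pending:
            self._write_batch(self._pending)
            # Start a fresh list so the flushed batch is no longer held here.
            self._pending = []
            self._pending_bytes = 0


batches: List[List[Commit]] = []
buf = CommitBuffer(batches.append, max_bytes=10)
for i in range(5):
    buf.add(b"node%d" % i, b"12345")  # 5 bytes of text per commit
buf.flush()  # write whatever is left
print([len(batch) for batch in batches])  # [2, 2, 1]
```

Memory held by the buffer stays bounded by roughly `max_bytes` plus one commit, regardless of how many commits the pull brings in.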
Reviewed By: DurhamG
Differential Revision: D23408834
fbshipit-source-id: 273384f5a05be07877bb1c9871c17b53ba436233
Summary: This would be used to avoid excessive memory usage during pull.
Reviewed By: DurhamG
Differential Revision: D23408833
fbshipit-source-id: 8edd95ab8201697074f65cc118d14755a230567d
Summary:
`addcommits` is designed to be more efficient if called with a batch of
commits. So let's buffer the commits to add, then only call it once.
This avoids some N^2 behaviors. For example, the NameDag internally will
prepare a "snapshot" of itself, which involves copying the pending Rust vecs
about the segments and the id <-> hash map.
The change takes `pull` from unusably slow to usable:
Original Python Revlog backend:
```
In [1]: %trace repo.pull(bookmarknames=['master'],quiet=False)
5191 +466 | Apply Changegroup edenscm.mercurial.bundle2 line 516
| - Commits = 125 :
| - Range = a1d1b3ade136:2e3fe78af189 :
5191 +466 | changegroup.cg1unpacker.apply edenscm.mercurial.changegroup line 313
5192 +416 | Progress Bar: commits (progressbar)
5192 +415 | changelog.changelog.addgroup edenscm.mercurial.changelog line 536
5192 +409 | revlog.revlog.addgroup edenscm.mercurial.revlog line 2116
5215 +371 | changelog.changelog._addrevision (125 times) edenscm.mercurial.changelog line 558
```
DoubleWrite (Segments + Revlog) backend, Before:
```
In [2]: %trace repo.pull(bookmarknames=['master'],quiet=False)
2396 +154059 | Apply Changegroup edenscm.mercurial.bundle2 line 516
| - Commits = 323 :
| - Range = cb0b100180ba:5fb57c74f72e :
2396 +154059 | changegroup.cg1unpacker.apply edenscm.mercurial.changegroup line 313
2397 +151433 \ Progress Bar: commits (progressbar)
2397 +151433 | changelog2.changelog.addgroup edenscm.mercurial.changelog2 line 334
```
DoubleWrite (Segments + Revlog) backend, After:
```
In [2]: %trace repo.pull(bookmarknames=['master'],quiet=False)
4629 +512 | Apply Changegroup edenscm.mercurial.bundle2 line 516
| - Commits = 45 :
| - Range = cf23c6972934:1ff0c5f0e7ad :
4629 +512 | changegroup.cg1unpacker.apply edenscm.mercurial.changegroup line 313
4630 +494 | changelog2.changelog.addgroup edenscm.mercurial.changelog2 line 334
```
Reviewed By: DurhamG
Differential Revision: D23390435
fbshipit-source-id: dd97a5008dedd844d4134b87bfef190fa739a80b
Summary:
The users of addrevisioncb are gone.
This also removes the "alwayscache" parameter of "_addrevision".
Reviewed By: DurhamG
Differential Revision: D23390437
fbshipit-source-id: 7edd9dd0b93d4cb9d4f35d088a1aef719b450ec1
Summary: It is about legacy revlog formats that are no longer relevant.
Reviewed By: DurhamG
Differential Revision: D23390436
fbshipit-source-id: 58c2c432804181bcc6517d6c988777b843fc9ba4
Summary:
We have a few safeguards against creating full checkouts. However we have
sparse profiles that are not full, but that include very large directories
which normally should not be included.
This diff adds logic that checks if a new sparse profile includes any of the "marker"
files, i.e. files from a folder that should not be included. The operation
aborts if that is the case; however, there's always a way to work around that.
Reviewed By: DurhamG
Differential Revision: D23414200
fbshipit-source-id: 626f392319eb1be8b35f39cadafb61f3c1dfefe3
Summary:
"hg diff" has --sparse option which diffs only files inside a sparse checkout.
The problem is that it doesn't work on eden checkouts because the eden repo doesn't
have a sparsematch() function.
This diff makes it so that if sparsematch() function doesn't exist then
--sparse option is just ignored.
The motivation for this change is
https://fb.workplace.com/groups/corehg/?post_id=687768245151742. There are some
diff calls that are triggered by arc lint that race with "hg update" and might download
loads of data on people's laptops. This diff doesn't fix the race, but:
1) It makes sure we don't download too much data that is not in sparse profiles.
2) arc lint doesn't care about files outside of sparse profiles anyway, so
running with --sparse makes sense.
Reviewed By: DurhamG
Differential Revision: D23396918
fbshipit-source-id: 2a386fdbeab85187e2c2acab69cb86b74124d46f
Summary:
This is practically just 0 in our production setup during `pull`s. In the
future when the commit data becomes lazy, it will no longer be possible to read the
files locally. So let's just not scan the commits.
Reviewed By: DurhamG
Differential Revision: D23390438
fbshipit-source-id: 4c54c4aac5fd840205296ab86955ec1b8ab76607
Summary:
Mergedrivers can call dirstate.add directly and are adding paths with
"." and "..". Let's block those paths.
Reviewed By: quark-zju
Differential Revision: D23375469
fbshipit-source-id: 64e9f20169cfd50325ecd8ebcc1dd3be7a5cb202
Summary:
extdiff uses shutil.rmtree, which calls os.rmdir with new Python 3
options. Since we patch os.rmdir, we need to support those options.
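A sketch of a wrapper that tolerates the extra options by forwarding them untouched. It assumes the option in question is the keyword-only `dir_fd` parameter that Python 3's `os.rmdir` accepts; `patched_rmdir` is a hypothetical name, and the example does not actually replace `os.rmdir`.

```python
import os
import tempfile

_original_rmdir = os.rmdir


def patched_rmdir(path, *args, **kwargs):
    """Hypothetical wrapper: forward any extra options (e.g. dir_fd) untouched."""
    # Whatever custom behaviour the patch adds would go here.
    return _original_rmdir(path, *args, **kwargs)


with tempfile.TemporaryDirectory() as parent:
    os.mkdir(os.path.join(parent, "plain"))
    patched_rmdir(os.path.join(parent, "plain"))
    removed_plain = not os.path.exists(os.path.join(parent, "plain"))

    removed_via_fd = None  # stays None on platforms without dir_fd support
    if os.rmdir in os.supports_dir_fd:
        os.mkdir(os.path.join(parent, "via_fd"))
        fd = os.open(parent, os.O_RDONLY)
        try:
            # Path is resolved relative to the directory descriptor.
            patched_rmdir("via_fd", dir_fd=fd)
        finally:
            os.close(fd)
        removed_via_fd = not os.path.exists(os.path.join(parent, "via_fd"))
```

Accepting `*args, **kwargs` keeps the wrapper compatible with both the Python 2 and Python 3 signatures without listing each option.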
Reviewed By: quark-zju
Differential Revision: D23350968
fbshipit-source-id: 081d179dcd67b51ffdeb6b85899adf4e574a8d0f
Summary: Similar to D18528858 so module names do not need to be spelled twice.
Reviewed By: markbt
Differential Revision: D23091380
fbshipit-source-id: a2a261abc9c78c8805cea62b38498ba65398796d
Summary: This crate would fail to build without the "fb" feature because `serde_json` was listed as an optional dependency (but is used in a way that isn't conditional on the `fb` feature). This diff makes the dependency non-optional, and also silences several dead code warnings that are emitted when building without the "fb" feature.
Reviewed By: quark-zju
Differential Revision: D23386786
fbshipit-source-id: b00a8b0b8b0b978c1cfab2838629fcb388a076e9
Summary:
The `debugfsync` command calls fsync on newly modified files in svfs.
Right now it only includes locations that we know have constant number
of files.
The fsync logic is put in a separate crate to avoid slow compiles.
Reviewed By: DurhamG
Differential Revision: D23124169
fbshipit-source-id: 438296002eed14db599d6ec225183bf824096940
Summary:
A warning means that every tree fetched will be printed in the edenfs log,
which is way too much. Let's decrease this to a debug message.
Reviewed By: genevievehelsel
Differential Revision: D23385778
fbshipit-source-id: d77f1cac3efb945d4b95750822f2f12f48c75ffe
Summary: `len(repo)` can no longer predict the next rev number. Use nodes instead.
Reviewed By: DurhamG
Differential Revision: D23307791
fbshipit-source-id: cc20e53f039eee2a714748352e8e98aab253095a
Summary:
Some functions might be called very frequently. For example,
`phases.phasecache.loadphaserevs` might be called 100k+ times.
That makes the tracing data harder to process.
Limit the count of spans to 1k by default so the data is cheaper to process,
and some highly repetitive cases can now be reasoned about. Note the limit
is only put on static Span Ids. If a span uses dynamic metadata or asks for
a different Span Id each time, it will not be limited.
In debugshell,
td = %trace repo.revs('smartlog()')
len(td.serialize())
dropped from 6MB to 0.87MB.
It's also possible to reason about:
td = %trace len(repo.revs('ancestors(.)'))
in debugshell (taking 30s, 98KB serialized, vs 21s without tracing), while
previously the result would be too large to show (`%trace` just hangs).
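The per-id limit can be pictured like this. It is only a Python sketch with hypothetical names (`SpanLimiter`, `should_record`); the actual tracing code may be structured quite differently.

```python
from collections import Counter


class SpanLimiter:
    """Allow at most `limit` recorded spans per static span id."""

    def __init__(self, limit: int = 1000) -> None:
        self.limit = limit
        self.counts = Counter()

    def should_record(self, span_id: int) -> bool:
        self.counts[span_id] += 1
        return self.counts[span_id] <= self.limit


limiter = SpanLimiter(limit=3)
decisions = [limiter.should_record(42) for _ in range(5)]
print(decisions)  # [True, True, True, False, False]
```

Because the counter is keyed by span id, a call site that gets a fresh id on every call never reaches the limit, which matches the caveat about dynamic spans above.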
Reviewed By: DurhamG
Differential Revision: D23307793
fbshipit-source-id: 3c1e9885ce7a275c2abd8935a4e4539a4f14ce83
Summary: Set a default limit so the output won't be too long.
Reviewed By: DurhamG
Differential Revision: D23307792
fbshipit-source-id: 7e2ed99e96bbde06436a034e78f899fc2e3e03f8
Summary:
The debugshell command can be long running and contains uninteresting stuff.
Do not profile it.
Practically this hides showing the background statprof thread when using `%trace`.
Reviewed By: DurhamG
Differential Revision: D23278597
fbshipit-source-id: bad97de22e1be2be8b866bee705ea3a6755aa54b
Summary:
This allows entering ipdb for code like: `ipdb` or `ipdb()`. It can be handy to
debug something.
Reviewed By: DurhamG
Differential Revision: D23278599
fbshipit-source-id: 4355dd1944617aeb795450935789f01f66f094eb
Summary: This makes it possible to get tracing results, or run hg commands directly.
Reviewed By: DurhamG
Differential Revision: D23278601
fbshipit-source-id: e7dc92080d2881cb4155a481df5ca93f324828fc
Summary:
The `--trace` flag enables tracing Python modules.
For compatibility reasons, it also enables `--traceback`.
It can be used with debugshell to make `%trace` more useful.
Reviewed By: sfilipco
Differential Revision: D23278600
fbshipit-source-id: d6d0b34bd5c48111f8cd33d7df115f349b0e95b6
Summary:
I found this when I aborted a rebase of Dxxx, tried rebasing again, and it
complained about "nothing to rebase". It was caused by Dxxx resolving to
a hidden commit.
Reviewed By: sfilipco
Differential Revision: D23307794
fbshipit-source-id: f7a956b5300240089b6a4648f28cf4a152ee2433
Summary:
We shouldn't delete from a dictionary while iterating over it. Instead we should iterate over a copy and then delete from the original.
`.items()` returns a view of the dict, while wrapping it in `list` makes a (shallow) copy of the items that is unaffected by later deletions.
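A small standalone example of the pattern (the dictionaries here are made up for illustration):

```python
cache = {"a": 1, "b": 0, "c": 2, "d": 0}

# list() snapshots the (key, value) pairs, so deleting from `cache` is safe.
for key, value in list(cache.items()):
    if value == 0:
        del cache[key]

print(cache)  # {'a': 1, 'c': 2}

# Without the snapshot, Python 3 notices the size change and raises.
broken = {"a": 0, "b": 0}
error = None
try:
    for key in broken:
        del broken[key]
except RuntimeError as err:
    error = str(err)

print(error)  # dictionary changed size during iteration
```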
Reviewed By: DurhamG
Differential Revision: D23283668
fbshipit-source-id: a168eef1ed2a1ce02fe71b3f6e3aed090965d2a4
Summary:
Mononoke throws an error if we request the nullid. In the long term we
want to get rid of the concept of the nullid entirely, so let's just add some
Python level blocks to prevent us from attempting to fetch it. This way we can
start to limit how much Rust has to know about these concepts.
Reviewed By: sfilipco
Differential Revision: D23332359
fbshipit-source-id: 8a67703ba1197ead00d4984411f7ae0325612605
Summary:
Corp has a different concept of tier than prod. Let's load the corp
tier into our tier set as well.
Reviewed By: quark-zju
Differential Revision: D23354056
fbshipit-source-id: c9543b8253f042c7b1224578e0687b4bdf21738e
Summary:
The Python 3 email library internally stores the message as text, even
though our input and requested output is bytes. Let's make our own wrapper
around the parser to use ascii surrogateescape encoding so we can get the
actual bytes out later and not get universal newlines.
Based off the upstream 7b12a2d2eedc995405187cdf9a35736a14d60706,
which is basically a copy of the BytesParser implementation (https://github.com/python/cpython/blob/3.8/Lib/email/parser.py) with
newline=chr(10) added.
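The two properties relied on here can be checked in isolation. This is a sketch using only the standard library, not the wrapper added by this diff; the sample bytes are made up.

```python
import io

raw = b"caf\xe9 line1\r\nline2\n"  # non-ascii byte plus a CRLF line ending

# ascii + surrogateescape maps undecodable bytes to lone surrogates and back,
# so the round trip is lossless.
text = raw.decode("ascii", errors="surrogateescape")
roundtrip = text.encode("ascii", errors="surrogateescape")

# newline="\n" (i.e. chr(10)) disables universal-newline translation on read.
exact = io.TextIOWrapper(
    io.BytesIO(raw), encoding="ascii", errors="surrogateescape", newline="\n"
).read()

# The default newline=None translates "\r\n" to "\n", changing the payload.
translated = io.TextIOWrapper(
    io.BytesIO(raw), encoding="ascii", errors="surrogateescape"
).read()

print(roundtrip == raw)                                   # True
print(exact.encode("ascii", "surrogateescape") == raw)    # True
print(translated.encode("ascii", "surrogateescape"))      # b'caf\xe9 line1\nline2\n'
```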
Reviewed By: quark-zju
Differential Revision: D23363965
fbshipit-source-id: 880f0642cce96edfdd22da5908c0b573887bed12
Summary:
`hg cloud rejoin` command is used in fbclone and it is supposed to print a
message on RegistrationError but this has been broken recently.
Reviewed By: markbt
Differential Revision: D23342773
fbshipit-source-id: 4f3318848953656dea65a2b5d4d832694f6b353c
Summary:
There are users who prefer to run `hg cloud leave` if they notice they are
connected to commit cloud sync.
Providing more information and adding a prompt might help them change their mind.
For users who have already left, a new fbclone will connect them back, so on the next leave they can learn more about Commit Cloud Workspaces.
Reviewed By: markbt
Differential Revision: D23346091
fbshipit-source-id: 72f170f7133cd64b772ec75ae29a85dc8809e351
Summary:
When updating to the null commit, the logic that computes the update
distance was broken. The null commit is pre-resolved to -1, which when passed to
a revset raw gets resolved as the tip commit. In large repositories this can
take a long time and use a lot of memory, since it's computing the difference
between tip and null.
Let's fix it to not pass the raw rev number, and also to handle the case of a 0
distance update.
Reviewed By: quark-zju
Differential Revision: D23358402
fbshipit-source-id: 3b0a1fe1bbcb07effba4d0ab2c092e66bdc02e67
Summary:
Pull Request resolved: https://github.com/facebookexperimental/eden/pull/46
See https://github.com/facebookexperimental/eden/runs/1034006668:
error: unused import: `env::set_var`
--> src/lfs.rs:1539:15
|
1539 | use std::{env::set_var, str::FromStr};
| ^^^^^^^^^^^^
|
note: the lint level is defined here
--> src/lib.rs:125:9
|
125 | #![deny(warnings)]
| ^^^^^^^^
= note: `#[deny(unused_imports)]` implied by `#[deny(warnings)]`
error: unnecessary braces around method argument
--> src/lfs.rs:2439:36
|
2439 | remote.batch_upload(&objs, { move |sha256| local_lfs.blobs.get(&sha256) })?;
| ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ help: remove these braces
|
note: the lint level is defined here
--> src/lib.rs:125:9
|
125 | #![deny(warnings)]
| ^^^^^^^^
= note: `#[deny(unused_braces)]` implied by `#[deny(warnings)]`
error: aborting due to 2 previous errors
error: could not compile `revisionstore`.
I dropped `#![deny(warnings)]` as I don't think warnings like the above ones
should break the build. (denying specific warnings that we care about explicitly
might be a better approach)
Reviewed By: singhsrb
Differential Revision: D23362178
fbshipit-source-id: 02258f57727edfac9818cd29dda5e451c7ca80a7
Summary: Now that it is possible to control which features are enabled on manually-managed dependencies, we can reenable autocargo for `edenapi`. See D23216925, D23327844, and D23329351 (840e6dd6f6) for context.
Reviewed By: dtolnay
Differential Revision: D23335122
fbshipit-source-id: 8ce250c3a106d2a02f457f7ed531623dd866232f
Summary: The command does not crash but `-` lines are ignored.
Reviewed By: DurhamG
Differential Revision: D23357655
fbshipit-source-id: f48568bc193f947503bc19f3e192b33346c317e1
Summary:
Pull Request resolved: https://github.com/facebookexperimental/eden/pull/45
Fix referring to 'version' without proper codegen by making 'version' compile
without codegen. This fixes configparser test when version/src/lib.rs was not
generated.
Make unneeded deps without 'fb' feature optional.
This would hopefully fix the "EdenSCM Rust Libraries" GitHub workflow.
Reviewed By: DurhamG
Differential Revision: D23269864
fbshipit-source-id: f9e691fe0a75159c4530177b8a96dad47d2494a9
Summary: This makes the code simpler.
Reviewed By: sfilipco
Differential Revision: D23269858
fbshipit-source-id: bb9ac0bd1696f7429ca1856e6c63e04fabc2757a
Summary: This makes the code simpler.
Reviewed By: sfilipco
Differential Revision: D23269866
fbshipit-source-id: 30c9e9d218378c0d6df8b822b2a81df2b38f5b01
Summary: Will be used to simplify code.
Reviewed By: sfilipco
Differential Revision: D23269859
fbshipit-source-id: bed0c4dca075ff60900025642af1d84bdd03452d
Summary:
`impl<T> Trait for T` in the current Rust makes it impossible to have
`impl<Q> Trait for Q`. Avoid using it for IdConvert and PrefixLookup.
Reviewed By: sfilipco
Differential Revision: D23269861
fbshipit-source-id: a837f3984ff4e1bd5a3983dd1642b9f064f51a36
Summary:
`impl<T> Trait for T` in the current Rust makes it impossible to have
`impl<Q> Trait for Q`. Avoid using it for DagAlgorithm.
Reviewed By: sfilipco
Differential Revision: D23269860
fbshipit-source-id: 031e75e9bf1f1eec2b9e8f36220ef8b817a143a5
Summary: LowLevelAccess is a subset of NameDagStorage. Use the latter instead.
Reviewed By: sfilipco
Differential Revision: D23269865
fbshipit-source-id: 81ebb1e986d8b02c968a9a237ad9a97d4afd54bf
Summary:
If there are too many heads, the current `descendants` algorithm would visit
all "old" heads. For example, with this graph:
head9999 (N9999)
/
Z (master)
:
: (many heads)
:/
: head2 (N2)
:/
C head1 (N1)
|/
B head0 (N0)
|/
A
`A::head9999` or `Z::head9999` will visit N0, N1, ..., N9999, because
`descendands_up_to` is provided with `max_id = N9999` and Z as a vertex in the
master group, is before N0 in non-master. The current algorithm also means
`descendands_up_to` gets linearly slower as the user uses the repo more, which
is quite undesirable.
This diff changes `descendands_up_to` to take an `ancestors` set, which is
`::head9999` in this case, and iterate non-master flat segments in it. So it
will skip N0 to N9998 directly by finding the N9999 flat segment and only use
it. The number of heads will have a smaller impact on performance.
Another source of slowness is `draft::draft_heads`: if there are too many `draft_heads`,
the internal calculation of `::draft_heads` can be slow. Optimize it by
limiting `draft_heads` to `draft:`. Practically this affects `y::` revset as
`y::` is translated to `y::visible_heads` and `visible_heads` can be large.
`cargo bench --bench dag_ops -- '::-master'` shows significant difference:
Before:
range (master::draft) 18.112 s
range (recent_draft::drafts) 2.594 s
After:
range (master::draft) 72.542 ms
range (recent_draft::drafts) 14.932 ms
In my fbsource checkout there were 20k+ heads. The improvement of
`master::recent_draft` (`x::y`) is pretty visible, and `y::` is also improved:
% lhg debugbenchmarkrevsets -m -x 'p1(min(7e8c86ae % master))' -Y 'draft() & 7e8c86ae' -e 'x::y' -e 'y::' --no-default
# x: 168f5228e570fb6b2ff7f851bd82413102748d84 (p1(min(7e8c86ae % master)))
# y: 7e8c86aec68ebc6e0b8254afcb381315991fd21c (draft() & 7e8c86ae)
# before
| revset \ backend | segments | revlog | revlog-cpy |
|------------------|----------|--------|------------|
| x::y | 17ms | 0.1ms | 0.5ms |
| y:: | 3.3ms | 0.7ms | 1.3ms |
# after
| revset \ backend | segments | revlog | revlog-cpy |
|------------------|----------|--------|------------|
| x::y | 0.2ms | 0.1ms | 0.6ms |
| y:: | 1.0ms | 0.7ms | 1.3ms |
Reviewed By: sfilipco
Differential Revision: D23214387
fbshipit-source-id: 4d11db84cd28f4e04e8b991cbc650c9d5781fd27
Summary:
Lots of non-master heads is not an exercised graph in the benchmarks.
Add it as it practically happens. This will be used by the next change.
Reviewed By: sfilipco
Differential Revision: D23259879
fbshipit-source-id: 7fe290d14403e42e6d135bde56e2d5c8519ae530
Summary:
Currently the fuzz test only uses the master group. Let it exercise non-master
group too.
Reviewed By: DurhamG
Differential Revision: D23214388
fbshipit-source-id: 7108a1055fbdda2b012f93c5948fb83ef3b9a96f
Summary:
The calculation can take tens of milliseconds. Cache it.
Invalidate the cache on transaction commit.
This will improve perf on revsets like `descendants` that will use
`head()`.
Reviewed By: DurhamG
Differential Revision: D23196412
fbshipit-source-id: 2913310ebb97e1c0346198c1e2738799799c740a
Summary: Provide a way to see segments.
Reviewed By: sfilipco
Differential Revision: D23196408
fbshipit-source-id: b1418f945a5a3364ac73b0f97466d973dd4b6300