Summary:
Instead of manually dropping some of the datapack/historypack fields, we can
drop the entire object. This allows implementing the Drop trait more easily.
But, this prevents the code from later using some of the object fields. We can
use replace to move them in a zero-copy fashion.
Reviewed By: DurhamG
Differential Revision: D15076017
fbshipit-source-id: 4831dfcc2005c957862d32eeda02f62796be3afb
Summary:
Use a dedicated Span type so we can enforce reverse ordering and `start <= end`
directly on the Span structure.
The constructor of `SpanSet` becomes more expensive because it recreates the
`Vec`, and sorts it. Practically, hopefully it's fine. Internal logic like union will
not use that constructor.
Some comments and tweaks have been made to make the code easier to read.
There are some performance changes, though:
Before:
intersection 5.030 ms
union 5.920 ms
difference 4.804 ms
After:
intersection 6.036 ms
union 5.426 ms
difference 4.710 ms
`intersection` becomes slower, while `union` and `difference` become a bit faster.
Hopefully the regression is within the acceptable range.
Reviewed By: sfilipco
Differential Revision: D15023651
fbshipit-source-id: ea7845d5d20faf204cfb85c66fc3bd6e25c9fc0c
Summary: This would provide some data about changes around SpanSet.
Reviewed By: sfilipco
Differential Revision: D15023652
fbshipit-source-id: 4cff7d1876fe20cd876f26926f31e018b6c88fd9
Summary:
Complete the IdMap interface so it's usable.
There are 2 possible use patterns:
- On-disk IdMap + In-memory additions. Practically, the server provides an
on-disk map, and the client might assign missing commits on demand. The
client still needs to update the IdMap during pull.
- Everything is on-disk. There are no in-memory additions. This is more complex
because the local commits might become part of the server commits in the
future, and it might require Ids for those commits to be re-assigned.
I haven't decided which way to go exactly. So let's keep the interface flexible
for both.
That said, I do want to reduce the chance of causing filesystem race conditions
for filesystem writes. In this case, both reads and writes should hold a lock.
So a dedicated type is used to encourage the pattern of:
- get the dedicated type (and hold the filesystem lock)
- read, write, sync
Write related methods are not moved to the dedicated type, to cover the
in-memory addition use-case.
Reviewed By: sfilipco
Differential Revision: D15008517
fbshipit-source-id: 5d117ed7f2947aed6ed524a3b5199c071908c4ae
Summary:
There will be lots of algorithms or structures that operate on integers as
commit identities. The source of truth for commit identities is the commit
hash. Add a map to translate between them.
The map is designed to be sparse, so it can be used as a cache if the map
is moved to server-side.
The map does not take `[u8; 20]` as its value type, with the intention to
support other hash functions. For example, Bonsai Blake2 hashes have 32 bytes.
Since the integer id is in a global namespace and can conflict if there
are multiple writers, the interface is designed to make sure an explicit
critical section is needed for write (to filesystem) operations.
Reviewed By: sfilipco
Differential Revision: D15008518
fbshipit-source-id: 9f53aae551c54e1b47b5f837642ea00fca8579c3
Summary:
The spanset is a set of integer spans. It will be used by some DAG related
operations. It'll be used as a subset of mercurial/smartset.py.
Note: smartset.py also has a Python `spanset` structure. That is different
from this Rust spanset in these ways:
- The Rust set does not preserve ordering.
- The Rust set can have multiple spans, instead of just one.
- The Rust set is less abstract (for now). Its set operations (union, etc.)
only work on the same type.
This diff adds some initial functions for it.
Reviewed By: sfilipco
Differential Revision: D15004985
fbshipit-source-id: c2e5e2a80e2e4681c2f443e0d8a83dc97f7be371
Summary: The scmdag library is going to have things related to the commit graph.
Reviewed By: sfilipco
Differential Revision: D15004984
fbshipit-source-id: f274cceeabae4a57985763216572f7cd055f8e07
Summary: Similar to D7562864, but work with cargo workspaces.
Reviewed By: singhsrb
Differential Revision: D15084323
fbshipit-source-id: 3c15f2ceabb73dd54028523b6da5eb7857e7c842
Summary:
This test is failing on OSX due to case collisions. Let's just avoid
the case collisions by using the directory name as `x` instead of `a`.
Reviewed By: quark-zju
Differential Revision: D15083754
fbshipit-source-id: 0752f06c71c315e349a8eea8dbe7da14e564f1b2
Summary:
It's been failing in continuous runs. Somehow one line of the traceback is missing
in the opt run. Not sure why, but it's easy enough to skip it.
Reviewed By: quark-zju
Differential Revision: D15080872
fbshipit-source-id: 55eff2d471da05b109faa04b6801db1e6245d7a6
Summary:
I'm getting reports of Eden failing to import data from Mercurial. One of the error messages I have seen is `local variable 'rcvd' referenced before assignment` (unfortunately we don't have the backtrace to locate the exact problem), and this is the only place in our codebase that has a variable named `rcvd`.
The log contains this error: P62827869$210-217
So if there is any exception raised in `self._connect`, `rcvd` and `total` will not be initialized, and these variables are being used to log the error message in the `except` block:
diffusion/FBS/browse/master/fbcode/scm/hg/edenscm/hgext/remotefilelog/fileserverclient.py;18aa2052b752f1255dd53474d541c4a4177bfef5$732-737
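The failure mode and one way to avoid it can be sketched as follows. This is a minimal illustration, not the actual code in `fileserverclient.py`: `fetch_all`, `failing_connect`, and the message formats are hypothetical, and `connect` stands in for `self._connect`. Initializing the counters before the `try` block means the `except` handler can always reference them.

```python
def fetch_all(connect):
    """Hypothetical fetch loop; `connect` stands in for `self._connect`."""
    # Initialize before the try block so the except handler can always
    # reference these names, even when connect() itself raises.
    rcvd = 0
    total = 0
    try:
        items = connect()  # may raise before anything is received
        total = len(items)
        for _item in items:
            rcvd += 1
    except Exception as ex:
        # Without the initialization above, this line would fail with
        # "local variable 'rcvd' referenced before assignment".
        return "fetch failed after %d of %d items: %s" % (rcvd, total, ex)
    return "fetched %d of %d items" % (rcvd, total)


def failing_connect():
    # Simulates an exception raised while connecting.
    raise IOError("connection refused")
```

Calling `fetch_all(failing_connect)` returns the error message instead of masking the original exception with an `UnboundLocalError`.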
Reviewed By: quark-zju
Differential Revision: D15069162
fbshipit-source-id: 16ec56820107fbbb24d426ce309b38a88d7eae5b
Summary: A more robust parsing regex for task references. It avoids some additional false positives, yet should cover all valid cases.
Reviewed By: Daij-Djan
Differential Revision: D15053374
fbshipit-source-id: 85410b5181eb9921d513b3a61ef3a1591b54539b
Summary:
The `path` argument passed to the follow revset is absolute. So `path` instead
of `relpath` should be used.
Reviewed By: DurhamG
Differential Revision: D15071189
fbshipit-source-id: 6aec76fa1a8cabd545a375aa40448cc75dbd1d6d
Summary: Release the GIL during data fetching to allow for progress bars to update properly. The data fetching code is pure Rust and does not interact with the Python interpreter at all, so releasing the GIL here is safe.
Differential Revision: D15051852
fbshipit-source-id: 144da953720951f9a30aadfc2b7fc8c8bc6b14aa
Summary:
When detecting split completion for the purposes of recording mutation, we used
`cmdutil.comparechunks`. This doesn't work for curses-recordings where the
only change is that some lines of a patch were excluded.
A simpler check is to just generate the patch that split is going to use, and
see if that matches the original patch.
Additionally, prevent the insertion of empty split records, as this can
cause crashes.
Reviewed By: mitrandir77
Differential Revision: D15063983
fbshipit-source-id: ba717d7f065faea93a500caaf10a1dbf582c7ab1
Summary:
Rather than relying on implicit conversion from a missing visibleheads file,
explicitly convert visibility information from obsmarkers when enabling
visibility tracking.
Reviewed By: mitrandir77
Differential Revision: D15063290
fbshipit-source-id: 44309f3cdf92c4ae100570b3bd98a240999ed558
Summary: Track visibility changes in the blackbox log to help with debugging issues.
Reviewed By: mitrandir77
Differential Revision: D15062971
fbshipit-source-id: c547618168f5eb08e6343e0b0d97db136e151a7a
Summary:
Ordinarily loops are prevented in the mutation graph as the predecessors must exist
at the point that the successor is created. However, backfilling from a complex
obsolescence graph may inadvertently introduce cycles.
Since loops are invalid, we can safely ignore any mutation edges that may
introduce them. The `allpredecessors` and `allsuccessors` functions already
do this.
Add loop detection and ignoring to the `predecessorsset` and `successorssets`
functions.
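The idea of ignoring edges that would introduce a loop can be sketched with a much simpler traversal than the real `predecessorsset` and `successorssets` functions. In this sketch the graph is assumed to be a plain dict mapping each commit to a list of its predecessors, and `all_predecessors_sketch` is a hypothetical name.

```python
def all_predecessors_sketch(graph, node):
    """Collect every predecessor reachable from node.

    graph is a hypothetical dict: commit -> list of predecessor commits.
    """
    seen = {node}
    stack = [node]
    result = []
    while stack:
        current = stack.pop()
        for pred in graph.get(current, ()):
            if pred in seen:
                # Already visited: either a cycle or a second path to the
                # same commit. Skip the edge so the walk terminates.
                continue
            seen.add(pred)
            result.append(pred)
            stack.append(pred)
    return result
```

With a cyclic graph such as `{"c": ["b"], "b": ["a"], "a": ["c"]}`, the edge from `a` back to `c` is ignored and the walk still terminates.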
Reviewed By: mitrandir77
Differential Revision: D15062399
fbshipit-source-id: fe892d9236c8d8dc4e1322b82618ab4bca35d30a
Summary:
Use the graphnode `-` for all invisible commits, even obsolete ones.
Users will only see them in their logs if:
- they run log with `--hidden`.
- they have invisible commits that are temporarily unhidden (e.g. they've checked it out).
Reviewed By: mitrandir77
Differential Revision: D15061894
fbshipit-source-id: 86873bd86cb15cef72dae248b8e2a636378cc547
Summary:
When computing the fate of a commit, if we find the immediate successor is the
next visible commit, we return the operation of the mutation record for that
single operation. If it's not available, we then look at whether or not the
commit is public to decide between the fallbacks of `land` or `rewrite`.
Unfortunately, not all land operations end up with a `land` op. In particular,
obsmarkers created with pullcreatemarkers and then backfilled into mutation
records have a blank operation, and so we show `rewrite` here when we should
show `land`.
Switch around how we calculate the fate. Compute a default operation of `rewrite`
or `land`, based on the phase of the successor, and then use that if the successor
is not the immediate successor, or if the recorded operation is blank.
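The reordered logic can be sketched as a small function. The function name and its parameters are hypothetical and only mirror the description above; the real code works on repository and mutation record objects.

```python
def fate_operation(successor_is_public, is_immediate_successor, recorded_op):
    """Return the operation name to show for a commit's fate (sketch)."""
    # The default depends only on the phase of the successor.
    default_op = "land" if successor_is_public else "rewrite"
    # The recorded operation is used only when the successor is the
    # immediate successor and the mutation record has a non-blank op.
    if is_immediate_successor and recorded_op:
        return recorded_op
    return default_op
```

For a backfilled record with a blank operation and a public successor, this returns `land` rather than `rewrite`.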
Reviewed By: mitrandir77
Differential Revision: D15061863
fbshipit-source-id: 753b0b58f84e653b40f9918f7ad3b3adfff359d8
Summary:
In the automigrate step at the start of pull, perform automigrations for
mutation and visibility.
If `mutation.automigrate` is set to true, then backfill obsmarkers into the
mutation store. This step can take a couple of seconds for large obsstores, so
print a message.
If `visibility.automigrate` is set to `"start"`, switch to explicit visibility
tracking. If it is set to `"stop"`, switch back to obsmarker-based tracking.
Reviewed By: mitrandir77
Differential Revision: D15046139
fbshipit-source-id: 284268d42b52c6b296c5c1b73db7bc218ae29a0a
Summary: Reading a comment is easier than trying to figure out the on-disk format.
Reviewed By: kulshrax
Differential Revision: D15056859
fbshipit-source-id: 097ed8bcaa51369aba4bcc9ed1cc95ebd6a67a66
Summary:
Compressing/Decompressing data can be expensive, so avoid doing it when not
needed. I thought about using a RefCell but decided on just using a mutable
reference, as an Entry will always be private to indexedlogdatastore.rs.
Reviewed By: kulshrax
Differential Revision: D15056862
fbshipit-source-id: ac0b811f2df563be86e3ade9abe89476db5d13cc
Summary: This will allow decompression to be done on the fly as opposed to always.
Reviewed By: kulshrax
Differential Revision: D15056860
fbshipit-source-id: 60635c431579fc924a61d08b35688222ec4930bb
Summary:
Delta chains are only created during repack, as every download operation
fetches the full content of the file. Even if we wanted to support them,
interrupted chains add undesirable complexity, as they can lead to chain loops if
we're not careful. Let's just not support delta chains for now to avoid this.
Reviewed By: kulshrax
Differential Revision: D15056861
fbshipit-source-id: 4b0474ce134e946952a70f363190faf50850abe0
Summary: Now that IndexedLog is also in this crate, its name is no longer relevant.
Reviewed By: kulshrax
Differential Revision: D15056502
fbshipit-source-id: cb00c8322ac4ff7da97c8faaec2959e5f68ca4ca
Summary:
Once `remotefilelog.fetchpacks` is enabled, `hg gc` will no longer be able to
limit the size of the hgcache. This will be particularly challenging for
Sandcastle/Quicksand as they already see hgcache over 100GB.
The long-term plan is switching to IndexedLog based stores with a log rotate
functionality to control the cache size. In the meantime, we can implement
basic logic that enforces the size of the hgcache by simply removing packfiles
once the cache is over the configured size.
One complication of this method is that several concurrent Mercurial processes
could be running and accessing the packfiles being removed. In this case, we
can split the packfiles into 2 categories: ones created a while back, and new
ones. When packfiles from the first category are removed, lookups will simply raise a
KeyError and data will be re-fetched from Memcache/Mononoke, i.e. failure is
acceptable. The second category consists of packfiles that were just created by
downloading them from Memcache/Mononoke, and the code strongly assumes that they
will stick around. A failure at this point cannot be recovered from.
One way of fixing this would be to handle these failures properly and simply
retry; the other is to not remove new packfiles. A time of 10 minutes was chosen
to categorize the packfiles.
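The selection of packfiles to remove could look roughly like the sketch below. It is not the actual implementation: `packs_to_remove` and the `(name, size, mtime)` tuple layout are hypothetical, and removing the oldest packfiles first is an assumption, since the description above only says that packfiles are removed once the cache is over the configured size.

```python
MIN_AGE_SECONDS = 10 * 60  # the 10 minute threshold described above


def packs_to_remove(packs, limit, now):
    """Pick packfiles to delete so the cache fits within limit bytes.

    packs: iterable of hypothetical (name, size_in_bytes, mtime) tuples.
    Packfiles younger than MIN_AGE_SECONDS are never selected.
    """
    packs = list(packs)
    total = sum(size for _name, size, _mtime in packs)
    # Only packfiles created a while back are candidates.
    # Assumption: the oldest candidates are removed first.
    removable = sorted(
        (p for p in packs if now - p[2] >= MIN_AGE_SECONDS),
        key=lambda p: p[2],
    )
    selected = []
    for name, size, _mtime in removable:
        if total <= limit:
            break
        selected.append(name)
        total -= size
    return selected
```

Note that the cache can stay over the limit when the remaining packfiles are all recent; in this sketch that is accepted rather than risking an unrecoverable failure.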
Reviewed By: quark-zju
Differential Revision: D15014076
fbshipit-source-id: 014eea0251ea3a630aaaa75759cd492271a5c5cd
Summary:
Clean up some of the calls to `ui.log` and how they appear in blackbox logging.
* Make the names of the events consistently use `snake_case`.
* For watchman, only log once for each watchman command. Include whether or not it failed.
* Unify `fsmonitor` logging under the `fsmonitor` event.
* Omit the second argument when it is empty - it is optional and does nothing when empty.
* Increase the number of blackbox lines included in rage to 100.
Reviewed By: quark-zju
Differential Revision: D14949868
fbshipit-source-id: a9aa8251e71ae7ca556c08116f8f7c61ff472218
Summary: Per the title, if we attempt to fetch file data and history over HTTP and the fetch fails, fall back to SSH rather than crashing.
Differential Revision: D15035947
fbshipit-source-id: 2d00a49a51a0c8809daf1d28a6e3ab7f571415b0
Summary: Add a debug option for HTTP data fetching. The intended usage is for it to gate verbose debug messages; the option can be set by Chef for users in the hg_dev tier.
Differential Revision: D15040988
fbshipit-source-id: b7eaa3bab4200e083cffc5822fb9873611725e6b
Summary:
On Windows, all the tests that are expecting to find some files in $CACHEDIR
would fail due to the directory not existing. Interestingly enough, printing
$CACHEDIR would print a reasonable path, which is the same as $TESTTMP.
Trying to understand this better, I passed --keep-tmpdir to run-tests and
realized that the "real" $TESTTMP was somewhere in my home directory, while
the real $CACHEDIR was in fact C:\tmp.
I haven't fully understood why, but it looks like $PWD is expanded into C:\tmp,
while $TESTTMP is expanded into something else.
Reviewed By: quark-zju
Differential Revision: D15041274
fbshipit-source-id: 0d167183d74df5f6ab84360c5699e96808fceb9b
Summary: Almost all tests are failing on Windows due to a warning about BaseException.message being deprecated. Replace it with str(e).
Reviewed By: kulshrax
Differential Revision: D15039020
fbshipit-source-id: d984af91ec447b2f721eab2e3c6d39a0b350fb57
Summary:
The prechangegroup hook didn't have throw=True set, so if the hooks
failed we ignored it. This seems to have been the case for a long time, but we
only recently hit it.
Reviewed By: kulshrax
Differential Revision: D15038494
fbshipit-source-id: 4fa9ed4924c02732e3e4070e747a80fbe63564c9
Summary: Add a new config option to toggle file validation.
Differential Revision: D15034687
fbshipit-source-id: 3783ea1dacad9d1e494a5de1388f703db0ed1129
Summary: We have to pass a lot of config options across the FFI boundary; these are currently passed as arguments to the Eden API client constructor. Let's use argument unpacking to avoid repeating a bunch of argument names in the call to the constructor.
Differential Revision: D15034480
fbshipit-source-id: 74d0830c686c8863fcede6e57404aec3f0a58ea1
Summary:
When Eden requests a tree, it manually commits the pending mutable pack files. In
the unlikely case where the temporary files are removed from the disk, the
pack.close() operation will fail. Since the pending packs aren't reset, the
next commit that happens while the repo object is closed will try again. This
time, it may try to close an already closed packfile, leading to P62634761.
Reviewed By: quark-zju
Differential Revision: D15015632
fbshipit-source-id: 016617334498c0161feed9dcec5ce24df931ad9c
Summary: Updated help text for hg amend and moved some options to verbose
Reviewed By: kulshrax
Differential Revision: D15004588
fbshipit-source-id: 4c9e0bffb522184ac8750ed8aa4eb5a53b309bd0
Summary:
I want to give Store a more specific name so that it doesn't get
confused with other Store abstractions that we will add in the
future.
Reviewed By: singhsrb
Differential Revision: D15007383
fbshipit-source-id: 499bcda4aecd5389e3bc1eba5206ba72a69c4c3d
Summary:
Python's `next()` can raise the `StopIteration` exception, unlike Rust's `next`
which just returns `None` instead.
Fix it by providing a default value `None`.
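As a small illustration of the difference, the built-in `next()` accepts a second argument that is returned when the iterator is exhausted. The helper name `first_or_none` below is hypothetical.

```python
def first_or_none(iterable):
    # With a default, next() returns it instead of raising StopIteration
    # when the iterator has no more items.
    return next(iter(iterable), None)
```

`first_or_none([])` returns `None`, whereas `next(iter([]))` would raise `StopIteration`.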
Reviewed By: singhsrb
Differential Revision: D15008773
fbshipit-source-id: df885c63b8130ceac38f86c89f2547dde2d519ba
Summary:
The warning isn't that useful, and can actually cause more harm than good, as running `hg prefetch -r .`
can download gigabytes of unnecessary data to the hgcache.
Reviewed By: quark-zju
Differential Revision: D14999458
fbshipit-source-id: b0ff2c2ad0e441622066fac10a5efafe8de588db
Summary:
The pdf generated by D14940136 used an older version of texlive. It has issues
with text colors after a footnotemark.
Correct it by recompiling with a newer version of texlive. Besides, I made some
minor edits to make the style more consistent (ex. capitalized some words,
removed some periods).
Reviewed By: sfilipco
Differential Revision: D15001655
fbshipit-source-id: 31bb7741ab18bba1594c553650e4710f537c2399
Summary:
`Log::lookup_range` exposes the range query feature provided by `Index`.
The iterator is made double-ended by the way.
Reviewed By: sfilipco
Differential Revision: D14895477
fbshipit-source-id: 6aef0973e009bf8fc6f3b5e5a8f6c54e57c81360
Summary:
The RangeIter is actually faster. The main reason is that it avoids recursion.
RangeIter does require double Vec, which seems like extra overhead. Practically
it does not seem to matter much.
The RangeIter code is also better written than PrefixIter. So let's delete
PrefixIter, and switch prefix lookups to use RangeIter.
Before:
index prefix scan (2B) 89.788 ms
index prefix scan (1B) 72.337 ms
index prefix scan (2B, disk) 102.098 ms
index prefix scan (1B, disk) 90.445 ms
After:
index prefix scan (2B) 76.335 ms
index prefix scan (1B) 54.517 ms
index prefix scan (2B, disk) 91.798 ms
index prefix scan (1B, disk) 67.143 ms
Reviewed By: sfilipco
Differential Revision: D14895478
fbshipit-source-id: 79a01774fb640c78fc5733db82f86f0f9403c960
Summary:
This would provide data about scan_prefix performance.
The benchmark code is slightly changed to share the index across test cases.
That reduces test setup cost.
Reviewed By: sfilipco
Differential Revision: D14895481
fbshipit-source-id: e70098bd202e102822a0829c0ae28de8d49fbe85
Summary:
This API allows range query, similar to `BTreeMap::range`.
It's going to be used by segmented changelog. There are spans (start, end)
stored in the index, and we need to find spans by rev (start <= rev <= end).
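The span lookup can be illustrated outside of the index with a sorted list and binary search. This is a Python sketch of the query itself, not of `Index` or `RangeIter`; `find_span` is a hypothetical helper, and it assumes the spans do not overlap and are sorted by `start`.

```python
import bisect


def find_span(spans, rev):
    """Return the (start, end) span containing rev, or None.

    spans: list of non-overlapping (start, end) tuples sorted by start
    (an assumption made for this sketch).
    """
    starts = [start for start, _end in spans]
    # Index of the rightmost span whose start <= rev.
    i = bisect.bisect_right(starts, rev) - 1
    if i >= 0 and spans[i][1] >= rev:
        return spans[i]
    return None
```

A range query over ordered keys plays the role of `bisect_right` here: it locates the last span starting at or before `rev`, and the `end` check confirms that `rev` falls inside it.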
Initially I was changing PrefixIter incrementally towards the new RangeIter.
There are too many small commits and I got some useful feedback early. Now
it seems cleaner to just introduce the desired state of RangeIter first.
We can later migrate prefix lookup to RangeIter, if perf regression is
negligible.
The added code is long, but some of it is modified from existing code:
- `RangeIter::next_internal` is modified from `PrefixIter::next`.
- `Index::get_stack_by_bound` is modified from `Index::scan_prefix_base16`.
The tests helped find some issues in the code. I hope they're not too weak.
Reviewed By: sfilipco
Differential Revision: D14895479
fbshipit-source-id: fb8f1bd35c61187fe5f7764fa485206bbb13c8e0
Summary:
This test was broken by D14971701 on OSX because it has a case
insensitive filesystem.
Reviewed By: kulshrax
Differential Revision: D14986692
fbshipit-source-id: a2a924d7aae4f3b96e7691e824a82087c1ff8513