Summary: Implement the main history logic by visiting, skipping, and splitting segments.
Reviewed By: DurhamG
Differential Revision: D33265878
fbshipit-source-id: f3165752cf9fc8cd0bd245be9427769804f9e556
Summary: Those are individual algorithms used by upcoming changes.
Reviewed By: DurhamG
Differential Revision: D33265884
fbshipit-source-id: b09813df4fb6477c4f0aa60a853b68af026b43c7
Summary:
The problem is: given a list of paths, and a root tree, find
the content ids of the paths.
It can be a bit complex, if these are considered:
- avoid resolving common prefixes of paths multiple times
(ex. if paths are "a/b/c" and "a/b/d", only visit "a" and "a/b" once)
- prefetch in batches per tree depth
(need O(max tree depth) round-trips)
This module is to solve the problem. See the docstring for details.
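A minimal sketch of the idea above, not the module's actual code: it assumes a hypothetical in-memory `store` mapping tree ids to `{name: (child id, is_dir)}`, and the function name `resolve_paths` is made up for illustration. Each loop iteration stands in for one batched fetch, so shared prefixes are visited once and the number of batches is bounded by the tree depth.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical store layout: tree id -> {entry name: (child id, is_dir)}.
TreeStore = Dict[str, Dict[str, Tuple[str, bool]]]


def resolve_paths(
    store: TreeStore, root_id: str, paths: List[str]
) -> Tuple[Dict[str, Optional[str]], int]:
    """Return ({path: content id or None}, number of batched fetches)."""
    result: Dict[str, Optional[str]] = {p: None for p in paths}
    split = {p: tuple(p.split("/")) for p in paths}
    # Directory prefix (tuple of components) -> tree id to fetch at this depth.
    frontier: Dict[tuple, str] = {(): root_id}
    depth = 0
    batches = 0
    while frontier:
        # One "round-trip": fetch every tree needed at this depth together.
        trees = {prefix: store.get(tid, {}) for prefix, tid in frontier.items()}
        batches += 1
        next_frontier: Dict[tuple, str] = {}
        for path, parts in split.items():
            if len(parts) <= depth:
                continue
            tree = trees.get(parts[:depth])
            if tree is None:
                continue  # an earlier component was missing
            entry = tree.get(parts[depth])
            if entry is None:
                continue
            child_id, is_dir = entry
            if len(parts) == depth + 1:
                result[path] = child_id
            elif is_dir:
                # Keyed by prefix, so "a/b" is scheduled once for all paths.
                next_frontier[parts[: depth + 1]] = child_id
        frontier = next_frontier
        depth += 1
    return result, batches
```

With paths "a/b/c" and "a/b/d", the trees for "a" and "a/b" each appear once in the frontier, regardless of how many paths share them.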
Reviewed By: DurhamG
Differential Revision: D33339416
fbshipit-source-id: 3400e17799b42cf489576228bd486a671ddaaa5f
Summary:
The path history area is problematic in multiple ways:
- Visiting commits and checking their trees following the commit graph can be too
slow.
- The "fastlog" service in Mononoke can provide faster path history for a
single file or a single directory. However, it requires complex infra to
maintain the indexes and does not handle following multiple paths nicely.
- hg's linkrev is a tech-debt we'd like to remove but it's not easy to do so.
Partially because a replacement will need new storage and protocol design,
and might face questions like offline UX, etc.
This crate is an attempt to tackle problems above:
- Visit segments and skip segments aggressively.
- If we use commit graph and regular tree reads, then there is no need for an
external service, or hg's linkrev.
with one main downside caused by bisect:
- "change + revert" might cancel out. History can be incomplete.
But that downside seems acceptable considering the other wins.
This diff adds the crate with some high level comments.
Reviewed By: DurhamG
Differential Revision: D33265881
fbshipit-source-id: e29be3d8e9fa8cd9f011144a7104429edcb25ec4
Summary:
Instead of passing just the old and new master nodes, pass a list to support
more complicated cases.
Reviewed By: DurhamG
Differential Revision: D33594530
fbshipit-source-id: 2087d4fce79eb5cff3c1d381cfc82f9bd6ad89c4
Summary:
debughiddencommit can produce visible ephemeral commits if the backup
fails (like if certs are invalid). Let's ensure we make them invisible even in
the case of a backup error.
https://fb.workplace.com/groups/asic.infra/posts/1314042865686399/
Reviewed By: mrkmndz
Differential Revision: D33668078
fbshipit-source-id: 6df48709ef183afa229f96fa7a526c479e8b4c0a
Summary:
The phrasing implied that "update --clean" would only discard the
conflicting files, but in reality it discards everything. Let's make the message
clearer.
Reviewed By: quark-zju
Differential Revision: D33662474
fbshipit-source-id: 60aeb7db72d45e894d959d9f83285f34132c603b
Summary:
Some git repos (ex. linux.git) contain non-utf8 commit messages.
It crashes `ctx.description()` in various places (ex. parsing
Phabricator URL, showing commit message template, etc.)
Let's just use the `from_utf8_lossy` function from Rust to avoid
such encoding issues.
This does not change the write path of git commits.
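For illustration only, a rough Python analogue of lossy decoding (the change itself uses Rust's `from_utf8_lossy`; the helper name `lossy_description` is hypothetical). Invalid byte sequences become U+FFFD instead of raising an error:

```python
def lossy_description(raw: bytes) -> str:
    """Decode commit message bytes, replacing invalid UTF-8 with U+FFFD."""
    # errors="replace" never raises; bad bytes turn into "\ufffd".
    return raw.decode("utf-8", errors="replace")
```

A latin-1 encoded message such as `b"fix caf\xe9 bug"` decodes without crashing, at the cost of losing the original character.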
Reviewed By: DurhamG
Differential Revision: D33280048
fbshipit-source-id: bf6abbcf0aaf48ec2593c78756a1892cdc556e93
Summary: `dedup` removes duplicated items in a list while maintaining the item order.
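A minimal sketch of an order-preserving dedup, assuming hashable items; this is an illustration of the behavior described, not necessarily how the helper in the diff is written:

```python
def dedup(items):
    """Remove duplicates from a list, keeping the first occurrence of each."""
    seen = set()
    result = []
    for item in items:
        if item not in seen:
            seen.add(item)
            result.append(item)
    return result
```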
Reviewed By: DurhamG
Differential Revision: D33486862
fbshipit-source-id: c891922826d9f3fb3b7300a407791345d75b4b6c
Summary: Fix some doctests failing on Python 3. Most of them are encoding issues.
Reviewed By: DurhamG
Differential Revision: D33486863
fbshipit-source-id: 0258a0b6306718a33e1d966e8e3c3a465f183cc2
Summary: Fix some doctests failing on Python 3. Most of them are encoding issues.
Reviewed By: DurhamG
Differential Revision: D33486866
fbshipit-source-id: 5f47dc4f773431022cc4976f7a3e91c77eb99809
Summary:
They are no longer used. This also avoids issues fixing their doctest, which is
failing on Python 3 due to encoding issues.
Reviewed By: DurhamG
Differential Revision: D33486867
fbshipit-source-id: 66186f39c6aa19f2eada8dc6e4b751871debe126
Summary: Drop dependency on fancyopts so we can remove it and its broken doctests.
Reviewed By: DurhamG
Differential Revision: D33489535
fbshipit-source-id: ce491526bfedba909a5391bad5cc21af82b3db12
Summary:
Drop dependency on fancyopts so we can remove it and its broken doctests.
The error type is slightly different, which affects the tests.
Reviewed By: DurhamG
Differential Revision: D33489537
fbshipit-source-id: 6aa680227f80536ba2573e77a9a0caf26131c0ee
Summary:
The Python codebase wants `opts['foo_bar']` for flag `foo-bar`. Previously this
normalization happened in `dispatch.py`. Move it to `pycliparser` so `pycliparser`
can be used in more places.
This will be used to replace fancyopts, so doctest in fancyopts can be deleted.
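The normalization described above can be sketched as follows. The function name `normalize_opts` is hypothetical and this is not the `pycliparser` implementation; it only shows the `foo-bar` to `foo_bar` key mapping:

```python
def normalize_opts(opts):
    """Map flag names like "foo-bar" to Python-friendly keys like "foo_bar"."""
    return {name.replace("-", "_"): value for name, value in opts.items()}
```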
Reviewed By: danielocfb
Differential Revision: D33489539
fbshipit-source-id: ca6a23dde3408a9bfa07557b8ba16cbe1d546ab1
Summary:
The only callsite in dispatch.py actually provides the global flags. So there is
no need to append the global flags again. This makes `parsecommand` more flexible.
Reviewed By: danielocfb
Differential Revision: D33489536
fbshipit-source-id: 05d623a3ec51d585d9307cb5660d841d6d222bc2
Summary:
Previously parsecommand only took 3-item tuples. Practically it could be 4 or
5 items. Handle them so parsecommand is easier to use.
Reviewed By: danielocfb
Differential Revision: D33489538
fbshipit-source-id: 568207d55a6a55b68d2a54a2aeef6a34f9603a5c
Summary: In uiconfig.py, we now use parselist from Rust. Let's just drop the Python configlist implementation.
Reviewed By: DurhamG
Differential Revision: D33486865
fbshipit-source-id: 4633b93e49b634dd0f20c2acad756957e29d4ab5
Summary: I just added the ignorematcher but didn't clean up all its code from my first stab. In particular, don't reference _matchers directly.
Reviewed By: quark-zju
Differential Revision: D33590121
fbshipit-source-id: f50dfcdddcc2a9ba52b53e5c6aff8b9170c9bcc7
Summary:
We were seeing two mmap's open for a given indexedlog. It turns out the
Index OpenOptions keeps a reference to the original key_buf, which meant we held
onto the original mmap forever.
This diff stops doing that.
Props to xavierd for noticing the double mmaps.
Reviewed By: quark-zju
Differential Revision: D33594545
fbshipit-source-id: f1ac3f6752886971a0f325874ac581f937234a4d
Summary:
When using curses during interactive revert, we now properly handle the transition to or from having no newline at the end of file. We do this by peeking ahead one line and trimming the apparent newline if the next line is "No newline at end of file".
This is motivated by upstream https://phab.mercurial-scm.org/D8762, but that test didn't exercise the bug for me, and the code change didn't work properly when reverting back to the no-newline case.
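A minimal sketch of the peek-ahead idea, assuming hunk body lines are given without their line terminators and that the marker line starts with a backslash as in unified diffs. The function name `hunk_content` is made up and this is not the curses code itself:

```python
NO_NEWLINE = "\\ No newline at end of file"


def hunk_content(lines):
    """Turn hunk body lines into (prefix, text) pairs.

    `text` ends with "\n" unless the following line is the
    "No newline at end of file" marker.
    """
    out = []
    for i, line in enumerate(lines):
        if not line or line.startswith("\\"):
            continue  # skip blank input and the marker line itself
        # Peek ahead one line to see whether this line really ends with "\n".
        nxt = lines[i + 1] if i + 1 < len(lines) else ""
        text = line[1:] + "\n"
        if nxt.startswith(NO_NEWLINE):
            text = text[:-1]  # trim the apparent newline
        out.append((line[0], text))
    return out
```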
Reviewed By: quark-zju
Differential Revision: D33541600
fbshipit-source-id: 6e605fe2f6017baad0aa8232313a209f68fc871c
Summary: This adds two columns that show the current download and upload speed for each process in either Kb/s or Mb/s.
Reviewed By: quark-zju
Differential Revision: D33557150
fbshipit-source-id: f279904d78ac1e06a9bf1d3c286e3af7285b73a9
Summary:
This adds a column that shows whether a process is running. It shows `RUNNING` if the process is currently executing, or `TERMINATED (n)` if the process has finished. Here `n` is the exit code of the process (e.g. 0). Processes that terminate before debugtop starts running are not shown.
An option is also added to the command for controlling how long a terminated process is shown after it finishes.
Reviewed By: quark-zju
Differential Revision: D33522251
fbshipit-source-id: f8444298155aabecf4a33387b6ca56b67068367a
Summary:
This appears to have broken lfs fetching on a lot of laptops. https://fb.workplace.com/groups/mercurialusers/posts/4639345729448344/
Original commit changeset: 1447c880c767
Original Phabricator Diff: D33506809 (e791747460)
Reviewed By: quark-zju
Differential Revision: D33588115
fbshipit-source-id: d8aee673a582d22124f4354f58829fdd186ea33c
Summary: It will be used by pathhistory.
Reviewed By: DurhamG
Differential Revision: D33339951
fbshipit-source-id: dbb1bd509cce2fb54bc7f8d392ab8bcb11788e03
Summary: Hack things up so the sparse "ignore" matcher delegates to the gitignorematcher's "explain" method. This allows the "debugignore" command to give more useful information about why a particular file is ignored.
Reviewed By: quark-zju
Differential Revision: D33586208
fbshipit-source-id: 51bf69f39dbba2c724e9ec28211d3bc0b6c9b0fd
Summary:
This allows users to create a new snapshot reusing the latest snapshot's storage. This allows uploading very similar snapshots even faster (after the bugfix on D33096364).
There is already an optimisation to avoid re-uploading files. However, it still performs a get/put on the server side. This allows bypassing it altogether. The tradeoff is that the same bubble is used so we don't extend the lifetime.
In the future, we want to avoid the get/put on server side either way, but that needs a bunch more work.
Reviewed By: markbt
Differential Revision: D33098266
fbshipit-source-id: 94baad6d1db516a6300963d240c354c86a90fc05
Summary:
hg (and edenfs) spend a significant amount of time hashing lfs data when reading from the indexedlog store. Indexedlog already has checksums for each chunk of data, so, assuming the data was correct when inserted, we only need to verify the total content size (i.e. that we have all the chunks). In a previous commit I added content verification when inserting lfs data into indexedlog, and this commit introduces a config flag to remove the verification when reading. We still check the blob's total size which covers the case of missing chunks.
Note that there is risk if existing indexedlog lfs entries are invalid since this commit removes the validity check. I added a trace point I will use to verify that we don't currently get hash mismatches in practice.
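The scheme (hash once at insertion, check only the size on read) can be sketched like this. `BlobStore` and the `verify_hash_on_read` flag are hypothetical stand-ins for the indexedlog-backed lfs store and its config flag, and a plain dict replaces the real storage:

```python
import hashlib


class BlobStore:
    """Toy blob store keyed by SHA-256 hex digest."""

    def __init__(self, verify_hash_on_read=False):
        self._blobs = {}
        self.verify_hash_on_read = verify_hash_on_read

    def insert(self, sha256_hex, size, data):
        # Full content verification happens once, at insertion time.
        if len(data) != size:
            raise ValueError("size mismatch on insert")
        if hashlib.sha256(data).hexdigest() != sha256_hex:
            raise ValueError("hash mismatch on insert")
        self._blobs[sha256_hex] = data

    def read(self, sha256_hex, size):
        data = self._blobs[sha256_hex]
        # Cheap check that still catches missing chunks.
        if len(data) != size:
            raise ValueError("size mismatch on read")
        # Expensive check, only when explicitly enabled.
        if self.verify_hash_on_read:
            if hashlib.sha256(data).hexdigest() != sha256_hex:
                raise ValueError("hash mismatch on read")
        return data
```

As the note above says, this trades detection of already-corrupt entries for read speed, so it relies on inserts having been verified.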
Reviewed By: DurhamG
Differential Revision: D32444377
fbshipit-source-id: 8f2e857c0d88c00687500ad107b3a5ebc79956d6
Summary: We now have an explicit check verifying the blob's size when reading from the indexedlog. This is redundant with the current content hash verification, but I'm preparing to remove the content hashing on read.
Reviewed By: DurhamG
Differential Revision: D32444383
fbshipit-source-id: 79868175563621e234f2a7e8055afc83fad56f12
Summary:
Currently we verify the content hash every time we read lfs data from indexedlog, but this is slow. Instead, we can verify once at insertion time and skip the verification when reading. This commit adds the insertion time verification.
Note that this introduces an error if the hash mismatches where previously the data would be inserted and silently ignored when reading.
Reviewed By: DurhamG
Differential Revision: D32444380
fbshipit-source-id: c096263f7a13279e21e31216a3a7132c52e630f2
Summary: I accidentally triggered this code path in a test and was seeing an exception: "Expected type that converts to PyBytes but received str". Fix by calling "encode()" on the python strs.
Reviewed By: quark-zju
Differential Revision: D32444381
fbshipit-source-id: cac33c4b4b06ecf71329bbd8746fbfef7f5be1ad
Summary:
The sampling layer writes configured tracing events out to the hg sampling file. To make things play well with our existing TracingCollector, I added a tracing Filter for the sampling layer.
Note that the sampling Layer will not work properly if you set EDENSCM_LOG or LOG since that activates the EnvFilter which does not respect per-Layer filtering and will filter events before they make it to the sampling layer.
Reviewed By: quark-zju
Differential Revision: D32444378
fbshipit-source-id: 6eeb782b4a8c0aa6e9b19fc319ca7663d4cf45d8
Summary: "sampling" refers to the python hg feature where certain ui.log keys can be marked for export to a specified file. The scm telemetry wrapper shuffles the file contents off to scuba. In rust, we now have a tracing Layer that implements the event export format. It matches tracing events using the "target" metadata attribute since that can be checked statically before the event is instantiated. Note that I am not currently taking advantage of that, but will in a following commit.
Reviewed By: quark-zju
Differential Revision: D32444379
fbshipit-source-id: c5d9fd5e28271656082d82f6584925b304ab02eb
Summary: Use LevelFilter instead of implementing Layer::enabled. This way the filtering only applies to this Layer rather than all Layers. This is in preparation for adding another Layer.
Reviewed By: quark-zju
Differential Revision: D32444382
fbshipit-source-id: cd6a78d33d1de91ab41c92b7f76895b9d335a80f
Summary: The added tests allow testing most parts of debugtop without compiling the entirety of hg or running its integration test.
Differential Revision: D33485457
fbshipit-source-id: 8ec37322ec04b4a73d4b4e2c0f053d5206e224d1
Summary: This moves most of the contents of debugtop to another crate in order to improve modularity and compile times.
Differential Revision: D33484736
fbshipit-source-id: 15df453fc3b3e263878779998767d31aa885640a
Summary:
If there's a hard reboot, the backup file could be empty and this
version check could throw an index error. Let's handle that gracefully.
Differential Revision: D33553319
fbshipit-source-id: de2fec48766d9f7e75adaf3d1642b48a09d67cf3
Summary: This will stop us reading on-disk certs for lfs.
Reviewed By: farnz
Differential Revision: D33506809
fbshipit-source-id: 1447c880c767106e85994ff1c419e90d843d82eb
Summary:
With fastcopytrace, attempting to rebase a directory rename over a file rename
(or vice versa) is not successful, as the copy source that fastcopytrace comes
up with doesn't exist in the rebase source commit.
Currently this crashes with an obscure `[copy source filename] not found in manifest`
error. We can do better: if the copy source doesn't exist in the source
manifest at the point where we are merging, we can treat this as a conflict.
The user can either resolve this manually (by renaming the file to the new
destination), or they can try again with full copytrace, which should succeed.
While it's not strictly accurate, we treat this as a "change/delete" conflict,
as there is no "rename/delete" conflict type.
Reviewed By: DurhamG
Differential Revision: D33259386
fbshipit-source-id: 321f1942b0e31c3d97a4c4a32ee1eae6b6a740ce
Summary:
Add a test that demonstrates that fastcopytrace fails when a file that is
renamed later in the stack is renamed in the base commit, and then the rest of
the stack is restacked.
Reviewed By: quark-zju
Differential Revision: D18170733
fbshipit-source-id: 89c12abd8da598e07cf1b32ada11ac013a1945b0
Summary:
D33159847 (03a71ef9db) is unsound. The hg tree format is "file name + ... + hex hash", not
"hex hash" first. So file name containing spaces would cause the data to be
treated as git format incorrectly.
Fix it by passing the format from store explicitly to `Entry`.
Reviewed By: DurhamG
Differential Revision: D33534887
fbshipit-source-id: 31f12cc082f62b24794a46675efcdbf92c2551d5
Summary:
When creating a transaction we now automatically clean up an existing (abandoned) transaction if it is empty. This seems safe since recover() should be a no-op (other than cleaning up the tx files).
I've seen multiple cases of empty transaction files due to commands crashing/being killed in a transaction (but before anything has been written).
Reviewed By: quark-zju
Differential Revision: D33482320
fbshipit-source-id: a6ef74a30de96c600385a701ab2ab61bb149afb9
Summary:
This change fixes a bunch of typos that I stumbled upon reading through code and
documentation.
Reviewed By: quark-zju
Differential Revision: D33511166
fbshipit-source-id: 185ce3ac9dd2311d757fc2a3859b63c253f44dd2
Summary:
We're seeing issues where EdenFS continues to see TLS errors even when
the on-disk cert has renewed. I believe this is due to persistent TLS session
state captured in the Multi.
Let's throw away the Multi if we hit a TLS issue.
Reviewed By: quark-zju
Differential Revision: D33242914
fbshipit-source-id: 629c150383149735643cd762833ef1be95f93a2b
Summary: Adds a field for the total of downloaded bytes and another one for the total of uploaded bytes
Differential Revision: D33344429
fbshipit-source-id: ed8a0d15f077f8b3490d5f808f2e8b6f7d674511
Summary: Currently, using `hg snapshot show` without a snapshot id simply crashes. This diff makes it exit with a better message.
Reviewed By: markbt
Differential Revision: D33185824
fbshipit-source-id: c2eaf98623416ae7de3dc4221c601f9fd47f5fd4
Summary:
This will fail a snapshot creation if two commands are run in parallel. We don't want this. Since writing latest to the metalog is of secondary importance, let's not fail the command if it fails.
In the future we want to always overwrite the metalog, but that doesn't seem so straightforward so I'll leave it to another diff.
We grab a lock when running create, but metalog is created when hg starts running, so it conflicts anyway. If we could reload after grabbing the lock, the issue would also be fixed.
Reviewed By: markbt
Differential Revision: D33168496
fbshipit-source-id: f4927c3bbca1a76669079b232d323636e56e8918
Summary: This adds a way to check if a given snapshot is the working copy. It will be used for some optimisations and asserts by ASIC use-case.
Reviewed By: markbt
Differential Revision: D33166960
fbshipit-source-id: 834cbd7f8c2ca3946ece6119164291773210c896
Summary:
Add a builtin support for retrying flaky tests.
This will be used by the next change.
Differential Revision: D33515456
fbshipit-source-id: 345fc08f059b06656cf557eaae2cf52fdaca5bd6
Summary: The test didn't account for `debugtop` running for 0.1 seconds or more. This adds a regex check to fix it.
Reviewed By: quark-zju
Differential Revision: D33486332
fbshipit-source-id: 7aa51dc4f0da7093cf45a3d85548405d241df79f
Summary: There was one background progresstest invocation we weren't waiting for, so sometimes it was writing its final runlog file and messing up later tests.
Reviewed By: quark-zju
Differential Revision: D33481623
fbshipit-source-id: e56d00d3c2139d9628b02b60449f0d4c617f7a30
Summary:
Change a couple spinner topics:
- "updating to <dest>" => "updating"
- "checking for conflicts" => "conflict check"
Previously the spinner would truncate to the first word, but we now allow two words which left you hanging with a topic of "Updating to" or "Checking for". Quick fix for now, but we should support longer topics.
Reviewed By: DurhamG
Differential Revision: D33190199
fbshipit-source-id: a8314d5e2e091a16df727a7e03df5728e871aeb3
Summary: This change is necessary for making D33344429 work, but was put on another diff for the sake of clarity.
Differential Revision: D33344585
fbshipit-source-id: ce391e64ad85f3b92c92a8a6e84c17d5200a3b26
Summary:
This will only be printed in environments that have cats configured - sandcastle.
It will hopefully give us some insight into why some i18n jobs are not passing cats even though they should.
Reviewed By: Croohand
Differential Revision: D33430311
fbshipit-source-id: 569531c4fc0cd882b4d7ccb691a88ea12453a794
Summary: This test very occasionally flakes. Add some extra output for next time it fails.
Reviewed By: DurhamG
Differential Revision: D33178609
fbshipit-source-id: 0e4e4cfd38a28b16d41248f3bdee774ce63e58a6
Summary:
The test doesn't pass because "socat" is not installed (on osx, at least).
Original commit changeset: 5fa732ae5f0a
Original Phabricator Diff: D33019682 (86d6ab8f6a)
Differential Revision: D33437436
fbshipit-source-id: 9a141d686de3a1c198f389f7f709b24a3c4dab08
Summary:
This adds a top command for `hg` by leveraging `runlog`. Currently it has four columns:
- `PID`: The process ID of the mercurial command
- `PROGRESS`: In case the process has progress bars, the position of the first one divided by the total of the first one. Otherwise `-`.
- `TIME SPENT`: The number of hours, minutes, seconds, and milliseconds the process has been running for. Has the same format as the original top.
- `CMD`: The `hg` subcommand and its arguments.
For now it prints a table instead of refreshing.
Differential Revision: D33177164
fbshipit-source-id: e2a1f274ca86f392cb086fabb46ddf03c2afdadb
Summary: ssh has been replaced by a tls connection to mononoke & is dead in prod, so this failing test can be deleted.
Reviewed By: singhsrb
Differential Revision: D33441306
fbshipit-source-id: 941b79dc2bc6ead52aaf3cdad2c7e961cf7ad51b
Summary:
The "Portions" license cannot be updated automatically. So this is a manual
update using:
sd -s 'Portions Copyright (c) Facebook, Inc. and its affiliates.' 'Portions Copyright (c) Meta Platforms, Inc. and affiliates.' `rg -l Facebook`
sd -s 'Copyright (c) Facebook, Inc. and its affiliates.' 'Copyright (c) Meta Platforms, Inc. and affiliates.' `rg -l Facebook`
Differential Revision: D33420114
fbshipit-source-id: 49ae00a7b62e3b8cc6c5dd839b3c104a75e72a56
Summary:
This affects the telemetry wrapper. Namely, the `git` binary will spawn a
`scm-telem-log` process which might have side effects writing `.scm.sqlite`
files when trying to log to scuba, which breaks the git tracking test in
`test-run-tests.t`.
A cleaner fix would be changing the git wrapper to stop spawning
`scm-telem-log` when `TESTTMP` is set. But that would take longer to roll out
and be effective.
Differential Revision: D33438494
fbshipit-source-id: d6af78be434cc0e966170dccaf4ff5c2b3868a52
Summary:
The test is for hg servers with some SQL requirements. It's no longer relevant
because we have migrated commit cloud to Mononoke in production.
Differential Revision: D33437823
fbshipit-source-id: 71caa2b4bb46f7413cc1317ed37d415a726d8af2
Summary: From D33197621 (06b3283905) the repository default not existing is no longer a failure. Update test.
Differential Revision: D33436746
fbshipit-source-id: 49df6f86aa694b8af77eebb6b140dd0b28733718
Summary:
You can't run Autocargo right now, it fails with:
```
Error: File CargoTomlPath { file: PathInFbcode("configerator/structs/data_access_policies/upf/eval/Cargo.toml"), dir: PathInFbcode("configerator/structs/data_access_policies/upf/eval") } is covered by both ctp and pxl projects
```
This is affecting me (on Buck2) since some of our dependencies use Autocargo
for their Cargo.tomls and those are now missing some dependencies (notably,
`below` needs `nix`). This also affects people trying to vendor crates:
https://fb.workplace.com/groups/rust.language/permalink/7625318880849994/
https://fb.workplace.com/groups/rust.language/permalink/7577471608968055/
This diff fixes it, and regenerates Cargo.tomls as a result.
Note: I'm not sure exactly what "pxl" is, but it seems to be owned by the team working on data access so that's a better home for that crate than ctp.
Reviewed By: ahornby
Differential Revision: D33428045
fbshipit-source-id: f3feab3ae04069672040423c145c69a58445ef96
Summary:
CentOS 8 `python3` defaults to Python 3.6, which has issues with IPython
shell tab completion. Prefer Python 3.8 to avoid issues.
Note: I also tried Python 3.9 but it seems the Cython we use generates
incompatible code. So I'm sticking to Python 3.8 for now.
Reviewed By: markbt
Differential Revision: D33339419
fbshipit-source-id: 4708a6713ce9ce63ed4bd5e3cc08a3086ae57edb
Summary:
It's normal if a repo does not have an `.arcconfig`.
Do nothing in that case.
Reviewed By: markbt
Differential Revision: D33351382
fbshipit-source-id: a0d2ce9e5e92ae3e25454d8f21bc858f460341c6
Summary: It will be used by upcoming changes.
Reviewed By: markbt
Differential Revision: D33339415
fbshipit-source-id: 759de1d7929ce3db56d7cffda7e6ac352287a49a
Summary:
See previous diffs for context. The "Greater => break" fast path might be
unsound if directories are involved in git. Fix it by continuing to check a few
more entries.
Reviewed By: markbt
Differential Revision: D33369604
fbshipit-source-id: 196020bcac7c5a3839164555f9def894c0448946
Summary:
Take some extra effort to force the right order when serializing entries to
bytes. Previously the order happens to match `BTreeMap<PathComponentBuf, _>`
order. But that is problematic for git.
This also makes `lookup_hg` able to use fast path to skip for tests that do
not have elements sorted.
Reviewed By: markbt
Differential Revision: D33369605
fbshipit-source-id: e939cc2089be63b69ccd590870b28f7468cb35ed
Summary:
Remove the use of `Result` and `IntoIterator` for `Entry::from_elements`.
This will make the next change easier.
Reviewed By: markbt
Differential Revision: D33369608
fbshipit-source-id: f0ef9b35ae4157e1aaab67ef079e260f47c6f854
Summary:
The git tree has some complexity in item ordering. Failing to respect its
ordering could lead to wrong "lookup" or serialization. Let's add some
functions to compare basenames like what git does.
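As an illustration of the ordering rule (not the functions added by this diff): git compares tree entry names bytewise, but treats a directory name as if it had a trailing `/`. A sort key that appends `/` to directory names reproduces that order; `git_sort_key` is a hypothetical name:

```python
def git_sort_key(name: bytes, is_dir: bool) -> bytes:
    """Sort key matching git's tree entry order for basenames."""
    # Directories compare as if their name ended with "/".
    return name + b"/" if is_dir else name
```

For a directory `foo` next to files `foo-bar` and `foo.txt`, a plain bytewise sort puts `foo` first, while git puts it last because `/` (0x2f) sorts after `-` (0x2d) and `.` (0x2e).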
Reviewed By: markbt
Differential Revision: D33369603
fbshipit-source-id: 306db58d9e403186f5e51a7b25c7fab3bfe08a1b
Summary:
The latter API takes bytes directly without utf8 overhead. This addresses
a TODO comment.
Reviewed By: markbt
Differential Revision: D33339417
fbshipit-source-id: af3c6bc49e5dfbb857c5dc7439b783c79d62d493
Summary:
The main API of `Elements` is the `next()`. It currently constructs `Element`,
with allocated `PathComponentBuf` and fully parsed flags for every entry.
The `next()` API is more suitable for the `TreeManifest` structure, to parse
and cache massive paths within the same root tree (1 root tree x N paths).
It is less efficient when there are a lot of different root trees, and each
needs to resolve a single path (N root trees x 1 path).
This diff adds an `Elements::lookup` API more suitable for the
(N root trees x 1 path) use-case. It seems to cut down the tree
lookup cost by at least 1/3 in upcoming pathhistory logic.
Reviewed By: markbt
Differential Revision: D33339410
fbshipit-source-id: 8cf36cda2b60e88d458758fa6eef6695c962b9e3
Summary: They will be reused in upcoming changes.
Reviewed By: markbt
Differential Revision: D33339409
fbshipit-source-id: 6b94b6ebdc31b114c7ec5c7f05fb3801fc600b95
Summary: The method can deal with both hg and git format. Rename to clarify.
Reviewed By: markbt
Differential Revision: D33339414
fbshipit-source-id: d6435199bb419bd0bb33931821777d94ed5270ca
Summary:
It seems 100644 is not the only mode. Treat 100664 as regular, too.
Note: 100664 is the only "non standard" mode according to git fsck.
See also:
- The latest git fsck logic: bb6832d552/fsck.c (L669)
- Introduced by: 42ea9cb286
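A small sketch of mode classification with 100664 accepted as a regular file. The category names and the function `classify_git_mode` are assumptions for illustration; only the treatment of 100664 comes from the summary above, and the other modes are the standard git tree entry modes:

```python
def classify_git_mode(mode: int) -> str:
    """Classify a git tree entry mode (given as an octal integer)."""
    if mode in (0o100644, 0o100664):  # 100664 is the tolerated non-standard mode
        return "regular"
    if mode == 0o100755:
        return "executable"
    if mode == 0o120000:
        return "symlink"
    if mode == 0o040000:
        return "directory"
    if mode == 0o160000:
        return "submodule"
    raise ValueError(f"unsupported git mode: {mode:o}")
```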
Reviewed By: markbt
Differential Revision: D33339413
fbshipit-source-id: 2a9c4ae795a31de24f2bd73e482eb99d7b5e56f8
Summary: It will be used by upcoming changes.
Reviewed By: markbt
Differential Revision: D33339952
fbshipit-source-id: 72639d8f9fe7e6740cf382597be246562a472d0d
Summary:
Previously, segments are internal details that aren't much exposed, because
none of the call-site care about them. However, there are new use-cases that
might benefit from accessing the segments hierarchy, such as a "log" history
algorithm.
This diff adds APIs to convert `IdSet` to segments with level chosen by
the callsite. It will be used by pathhistory, which was trying to use bisect to
answer history queries. Context on why it needs the segments details:
I first tried to use a plain linear bisect for pathhistory, but it turns out
the bisect must be aware of the graph shape more deeply than I thought, and
it seems more efficient if the bisect algorithm has access to segments
information directly.
The issue I ran into is "leaked" heads or roots. When bisecting a range
`low:high`, `heads(low:high)` might be not just `high`, or `roots(low:high)`
might be not just `low`. If we find `low` and `high` have same content on
given paths, and blindly mark `low:high` as "unchanged", then those
"leaked" heads and roots might be marked incorrectly.
Using existing Dag APIs, it's possible to find the "leaked" commits by
`(low:high)-descendants(low)-ancestors(high)` and unmark them as "unchanged".
However, the `descendants` algorithm is `O(flat segments, aka. merges)`.
That's too slow for the whole bisect history algorithm.
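To make the "leaked" commits concrete, here is a toy sketch on a hand-written graph. It assumes commits are integer ids with a `parents` dict and uses naive graph walks; the function names are hypothetical and this is neither the Dag API nor the segment-based approach the diff enables:

```python
def _reach(edges, start):
    """Return all nodes reachable from `start` (inclusive) via `edges`."""
    seen, stack = set(), [start]
    while stack:
        node = stack.pop()
        if node not in seen:
            seen.add(node)
            stack.extend(edges[node])
    return seen


def ancestors(parents, node):
    return _reach(parents, node)


def descendants(parents, node):
    # Invert the parent edges to get child edges.
    children = {n: [] for n in parents}
    for n, ps in parents.items():
        for p in ps:
            children[p].append(n)
    return _reach(children, node)


def leaked(parents, low, high):
    """Ids in low:high that are neither descendants of low nor ancestors of high."""
    id_range = set(range(low, high + 1))
    return id_range - descendants(parents, low) - ancestors(parents, high)
```

In the graph `{0: [], 1: [0], 2: [0], 3: [1], 4: [2, 3]}`, the range `1:3` contains commit 2, which is both a root and a head of the range but is unrelated to 1 and 3, so marking all of `1:3` as unchanged based on 1 and 3 alone would be wrong for it.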
Reviewed By: markbt
Differential Revision: D33339418
fbshipit-source-id: e0af56bfb046eb1a38dc6586a1970325c608aad5
Summary:
The `segments` being "collect()"ed into was changed to a `BTreeSet` so there is
no need to care about the order of the `Vec` segments.
Reviewed By: markbt
Differential Revision: D33339412
fbshipit-source-id: 2c0bcaaee0e31244dd19d16305c714222768721d
Summary: Make `repo.changelog.inner` able to be used as a commit -> tree converter.
Reviewed By: DurhamG
Differential Revision: D33265880
fbshipit-source-id: 78b7cb59568731c3ab1f2cc8007d20ec2de3016b
Summary:
This can be useful to move Python importhelper logic to native Rust, and for
upcoming changes about pathhistory.
Reviewed By: DurhamG
Differential Revision: D33265879
fbshipit-source-id: a228ad86aac01ce8d40ae5ba43142f4f6c3f90c3
Summary: This is useful to get an independent commit text reader.
Reviewed By: DurhamG
Differential Revision: D33265883
fbshipit-source-id: 21493f2512c6cce87456732c264d2184bbbd00d4
Summary: This will be used by upcoming changes.
Reviewed By: DurhamG
Differential Revision: D33265877
fbshipit-source-id: 1178448f2a566c88f38d9120d34618d2677f4cc2
Summary: The NULL id is a valid tree id in hg. Handle it in Rust.
Reviewed By: DurhamG
Differential Revision: D33280047
fbshipit-source-id: 7b1f5038fd49d253c6f33d3ba81e17ff097eb5ab
Summary:
This will be used by pathhistory to do graph calculations without relying on
(suboptimal patterns of) network requests.
Reviewed By: DurhamG
Differential Revision: D33280045
fbshipit-source-id: 37af9c686a765b159b7976a9c828d78f19c198a5
Summary:
Allows the IdSet iterator to have faster random access.
This will be used by pathhistory.
Reviewed By: DurhamG
Differential Revision: D33280044
fbshipit-source-id: 70e473ff4a955efc14080bc060ad6f0d1919605b
Summary:
See previous diff for context. I also searched `.iter_segments`, and confirmed
other use-cases are fine (only using level 0).
Reviewed By: DurhamG
Differential Revision: D33280046
fbshipit-source-id: 6178d9592027f57e8ea7f58cbf78ea1bdd54b007
Summary:
The `children_set` algorithm does the wrong thing if the highest level segments
are missing, which leads to wrong `roots` result. This is found when I test
pathhistory logic.
Reviewed By: DurhamG
Differential Revision: D33280043
fbshipit-source-id: 17a9d04b8627cec7084dc0eeb3933fae6fe6f774
Summary:
When I tried to checkout `linux.git` using system config I got
errors about filescmstore. Pretend the git fileslog has it to
solve the problem.
Reviewed By: DurhamG
Differential Revision: D33237339
fbshipit-source-id: 23aa543d2cc3da3ea041f6524031b3532303344b
Summary:
Git tags are translated to remotenames. Auto clean could cause trouble.
Therefore disable auto clean for git repos.
Reviewed By: DurhamG
Differential Revision: D33237333
fbshipit-source-id: 2e857899e122803c4bad03280827f59858826fde
Summary: Use the new reference updating API to make bookmark changes effective.
Reviewed By: DurhamG
Differential Revision: D33237331
fbshipit-source-id: f9103c78315d4d3efed715f14d0d6e8db17ca25a
Summary:
Previously we only sync git references to metalog. Add an API to
do the reverse sync.
Git tags are moved from bookmarks to remotenames with `tags/` prefix so they won't look too verbose in smartlog output.
Reviewed By: DurhamG
Differential Revision: D33237337
fbshipit-source-id: 0a73c4fa718fdd80bd30c58e48d037fa292c30ac
Summary:
Use Rust refencode to encode and decode commit references in Python.
This removes duplicated implementation.
Reviewed By: DurhamG
Differential Revision: D33237327
fbshipit-source-id: 2e4baa5a422fa562f355da2a82633d5b02c78a7d
Summary:
Commit reference encoding and decoding were previously Python only.
Make them available in Rust.
Reviewed By: DurhamG
Differential Revision: D33237334
fbshipit-source-id: ec363873be93f30bf88e321a20aedcdcc7b60659
Summary:
This allows writing git commits using the same Rust abstraction.
The commit text must be in git format, though.
Note: This only allows adding commits to the git storage without
updating references.
Reviewed By: DurhamG
Differential Revision: D33237332
fbshipit-source-id: 643a309718b4ded22343c96ac668f9a8c044c59f
Summary: Move some reusable logic so it can be reused in the next diff.
Reviewed By: DurhamG
Differential Revision: D33237328
fbshipit-source-id: 57f71de6edf02f9bda0b5589264be96411450877
Summary:
Make `flush()` able to write git trees.
Note: `flush()` cannot write hg trees with genuine SHA1s, because it does not
know p1 and p2. `finalize()` is used to write hg trees. `finalize()` has too
much hg complexity, which makes it less feasible for the git tree writes.
The "incorrect" `flush()` behavior for hg trees is kept for tests. Assertions
are added to ensure it's not misused.
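To illustrate why `flush()` can produce genuine ids for git trees but not for
hg trees: a git object id depends only on the object type and body, while an
hg node id also hashes the two parent ids. A minimal Python sketch, assuming
SHA-1 ids; the helper names `git_object_id` and `hg_node_id` are made up for
illustration and are not APIs from this codebase:

```python
import hashlib

NULL_ID = b"\0" * 20  # 20-byte null parent id


def git_object_id(kind: str, body: bytes) -> str:
    """Git id: SHA-1 over "<kind> <len>\\0" followed by the body."""
    header = b"%s %d\0" % (kind.encode(), len(body))
    return hashlib.sha1(header + body).hexdigest()


def hg_node_id(text: bytes, p1: bytes = NULL_ID, p2: bytes = NULL_ID) -> str:
    """Hg id: SHA-1 over the two sorted parent ids, then the text."""
    first, second = sorted([p1, p2])
    return hashlib.sha1(first + second + text).hexdigest()


# The same body always yields the same git id, with no parent information.
print(git_object_id("tree", b""))
# The same text yields different hg ids once the parents differ.
print(hg_node_id(b"x") != hg_node_id(b"x", p1=b"\x01" * 20))
```

Without p1 and p2, the second function cannot be evaluated, which matches the
limitation described above.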
Reviewed By: DurhamG
Differential Revision: D33237338
fbshipit-source-id: 390f1c8314f482b6ac3acf9b53c0626bedf1d76b
Summary:
Make TreeStore provide info about whether it wants git or hg serialization.
Will be used by `flush()`.
Reviewed By: DurhamG
Differential Revision: D33237326
fbshipit-source-id: f63782dbbd5bb2711ddcc8f7c0581ef077d58dcc
Summary: Add missing methods to gitfilelog so status can work.
Reviewed By: DurhamG
Differential Revision: D33237329
fbshipit-source-id: 3da9ca54163811c7d95b1f681ccee46c99585a06
Summary:
The required trait `TreeStore` was moved from `manifest-tree` to `storemodel`.
So there is no need to depend on `manifest-tree`.
Reviewed By: DurhamG
Differential Revision: D33198004
fbshipit-source-id: 88660b305483672338938da616643d4ffa1756be
Summary:
Since the default behaviour involves rebasing when there are no conflicts, make
the `--rebase` option non-advanced, and explain how to use both variants in the
help text.
Reviewed By: DurhamG
Differential Revision: D33235437
fbshipit-source-id: 6a690a5eb7cef2ac4db0dceee23d8f98285ad8dc
Summary:
When hiding a bookmark, the quotes are in the wrong place when printing out the
original bookmark location. Fix this up, and change to single quotes for
consistency with other bookmark output.
Reviewed By: DurhamG
Differential Revision: D33235253
fbshipit-source-id: 9f18c2f357f7794a94118ba27355252ddb698850
Summary:
The `prune` command is left over from obsmarkers. Previously, it was a way to manually construct obsmarkers. Now it is functionally the same as `hg hide`.
There is a small difference: `hg prune` would move bookmarks to the parent,
rather than removing them when a bookmarked commit was hidden. Prune is very
rarely used, so making this consistent seems ok.
Reviewed By: DurhamG
Differential Revision: D33235254
fbshipit-source-id: 7f688de2f60c13c172ce53f68a986a4cc94349fb
Summary:
Currently, the rust filesystem walker is single-threaded. Parallelizing the walk would increase the speed at which the walker can traverse the entire repository. The change is implemented with the thread paradigm over async to avoid compatibility issues.
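The thread-based approach can be sketched roughly as follows. This is a Python
illustration only, not the Rust walker from this diff: `parallel_walk` and its
queue-based design are assumptions made for the example. Worker threads pull
directories from a shared queue, push subdirectories back, and collect files.

```python
import os
import queue
import threading


def parallel_walk(root, thread_count=4):
    """Walk `root` with worker threads; return a sorted list of file paths."""
    if thread_count < 1:
        raise ValueError("thread_count must be at least 1")
    pending = queue.Queue()  # directories waiting to be scanned
    files = []
    files_lock = threading.Lock()

    def worker():
        while True:
            directory = pending.get()
            if directory is None:  # sentinel: no more work
                pending.task_done()
                return
            try:
                found = []
                with os.scandir(directory) as entries:
                    for entry in entries:
                        if entry.is_dir(follow_symlinks=False):
                            pending.put(entry.path)
                        else:
                            found.append(entry.path)
                with files_lock:
                    files.extend(found)
            except OSError:
                pass  # unreadable directory: skipped in this sketch
            finally:
                pending.task_done()

    threads = [threading.Thread(target=worker) for _ in range(thread_count)]
    for t in threads:
        t.start()
    pending.put(root)
    pending.join()  # returns once every queued directory was processed
    for _ in threads:
        pending.put(None)
    for t in threads:
        t.join()
    return sorted(files)
```

Subdirectories are queued before the parent directory is marked done, so
`pending.join()` cannot return while work is still outstanding.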
**Performance Tests:**
*Pay attention to the performance timer print statement in `walker.rs`* - must checkout `19420b94aa9de691ece95`
```
$ cd ~/fbsource-noneden/fbobjc
$ time ~/fbsource/fbcode/buck-out/gen/eden/scm/__hg__/hg.sh purge --all --config workingcopy.enablerustwalker=True --config workingcopy.rustwalkerthreads=thread_count
```
| Devserver | `walk()` time | `hg purge` time |
| -- | -- | -- |
| **`thread_count=0` (SingleWalker)** | N/A | `58.80s user 31.51s system 99% cpu 1:30.43 total` |
| **`thread_count=1` (MultiWalker)** | `26.727605348s` | `57.32s user 33.76s system 99% cpu 1:31.50 total` |
| **`thread_count=2` (MultiWalker)** | `16.211584683s` | `61.12s user 33.54s system 118% cpu 1:20.04 total` |
| **`thread_count=4` (MultiWalker)** | `9.326652975s` | `66.80s user 33.47s system 128% cpu 1:18.23 total` |
| **`thread_count=8` (MultiWalker)** | `5.969804152s` | `72.37s user 35.69s system 132% cpu 1:21.76 total` |
| **`thread_count=64` (MultiWalker)** | `4.97111422s` | `78.74s user 39.43s system 145% cpu 1:21.07 total` |
| Mac | `walk()` time | `hg purge` time |
| -- | -- | --|
| **`thread_count=0` (SingleWalker)** |
| **`thread_count=1` (MultiWalker)** |
| **`thread_count=2` (MultiWalker)** |
| **`thread_count=4` (MultiWalker)** |
| **`thread_count=8` (MultiWalker)** |
| **`thread_count=64` (MultiWalker)** |
Reviewed By: DurhamG
Differential Revision: D29691021
fbshipit-source-id: 4aa7a43d2b24316e2f889d96b212c35b27c43cfd
Summary:
The shelve test used to have code to check visibility (with obsmarkers) worked
ok in a transaction. We've removed obsmarkers, so this doesn't apply anymore.
Delete the test code - we've been running without it ok for a couple of years, and `shelve` is likely to get replaced by snapshots at some point anyway.
Reviewed By: DurhamG
Differential Revision: D33234288
fbshipit-source-id: 809e469016cd0c84927735f2f459e57743e90ec3
Summary:
If paths.default is missing (ex. an `hg init` or a git repo), the commitcloud
revset (and `hg sl` using it) crash like:
Traceback (most recent call last):
File "edenscm/hgext/commitcloud/__init__.py", line 241, in _smartlog
status.summary(repo)
File "edenscm/hgext/commitcloud/status.py", line 68, in summary
unbackeduprevs = repo.revs("notbackedup()")
File "edenscm/mercurial/localrepo.py", line 1409, in revs
return m(self, subset=subset)
File "edenscm/mercurial/revset.py", line 2799, in mfunc
return getset(repo, subset, tree, order)
# order = 'define'
# subset = <fullreposet+ []>
File "edenscm/mercurial/revset.py", line 105, in getset
return methods[x[0]](repo, subset, *x[1:], order=order)
# order = 'define'
# subset = <fullreposet+ []>
File "edenscm/mercurial/revset.py", line 359, in func
return func(repo, subset, b)
# subset = <fullreposet+ []>
File "edenscm/hgext/commitcloud/__init__.py", line 318, in notbackedup
heads = backupstate.BackupState(repo, ccutil.getremotepath(repo.ui)).heads
File "edenscm/hgext/commitcloud/util.py", line 71, in getremotepath
path = ui.paths.getpath("default")
File "edenscm/mercurial/ui.py", line 1890, in getpath
raise error.RepoError(_("repository %s does not exist") % name)
# name = 'default'
Traceback (most recent call last):
File "edenscm/mercurial/ui.py", line 1883, in getpath
return self[name]
KeyError: 'default'
Avoid that by detecting the empty path and returning empty sets.
Differential Revision: D33197621
fbshipit-source-id: fb5676a7f7be4937328d487824c6d5ee0b18fc59
Summary:
This endpoint will be useful to support pulling multiple branches, or even
initial clone.
Differential Revision: D32800590
fbshipit-source-id: 4a74b95d22e3ea2ac451ec516106d7c1775db7dc
Summary:
This endpoint will work for multiple heads cases.
It can eventually replace both `clone_data` and `pull_fast_forward_master`.
Differential Revision: D32800593
fbshipit-source-id: 9b58403bda5ab532f54d4d2e02afe9aa6cae7543
Summary:
Change `import_pull_data` from a linear iteration of segments to a DFS of
segments from heads. This allows us to properly handle id reservation.
There are other benefits. For example, the new code path allows
`import_pull_data` to de-fragment segments, ignore segments that are
not referenced by heads, etc. This can be seen from the test change.
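A rough Python sketch of the traversal idea, under simplifying assumptions:
segments are plain names, `parents_of` is a hypothetical mapping from a segment
to its parent segments, and id reservation and de-fragmentation are not
modelled. It only shows how a DFS from heads visits parents before children
and never touches segments that no head refers to.

```python
def segments_reachable_from(heads, parents_of):
    """Return segments reachable from `heads`, parents before children."""
    order = []
    seen = set()
    for head in heads:
        stack = [(head, False)]
        while stack:
            seg, expanded = stack.pop()
            if expanded:
                # All parents of `seg` were emitted already.
                order.append(seg)
                continue
            if seg in seen:
                continue
            seen.add(seg)
            stack.append((seg, True))
            for parent in reversed(parents_of.get(seg, [])):
                if parent not in seen:
                    stack.append((parent, False))
    return order


# "X" is not reachable from the head "C", so it is ignored.
parents_of = {"A": [], "B": ["A"], "C": ["A", "B"], "X": ["A"]}
print(segments_reachable_from(["C"], parents_of))  # ['A', 'B', 'C']
```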
Differential Revision: D32800589
fbshipit-source-id: bfa7d7c2e5c1e77cc2c736ca6628c1ccd30540e6
Summary: If `reserve_size` is `0`, then there is no need to look up the vertex.
Differential Revision: D32800596
fbshipit-source-id: 3da9e9d861421501105220098c7d163e8f060868
Summary:
With the reader lock, we can detect active readers. If they exist,
skip auto repair in `open_with_repair` to be safe.
Reviewed By: DurhamG
Differential Revision: D33137394
fbshipit-source-id: 0fba5e20835b2fd065f2064e50d8a30a446562fb
Summary:
The reader locks provide a way to test if there is an active reader.
They are useful to decide whether it's safe to run automatic repair.
Reviewed By: DurhamG
Differential Revision: D33137392
fbshipit-source-id: 24532fb642b148b84b5722a62ffabaabf3483420
Summary:
ScopedDirLock is blocking and exclusive. Extend it so it can be used for
shared and non-blocking use-cases, which is useful for testing whether there
are running readers.
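The "test for running readers" idea can be sketched with POSIX `flock`. This is
a Unix-only Python illustration, not the ScopedDirLock implementation;
`try_lock` and `has_active_readers` are hypothetical helpers. Readers hold a
shared lock, and a non-blocking exclusive attempt fails while any of them exist.

```python
import fcntl
import os
import tempfile


def try_lock(path, exclusive):
    """Try a non-blocking flock on `path`.

    Returns the open file holding the lock, or None if the lock is
    currently held in a conflicting mode.
    """
    f = open(path, "a+")
    mode = fcntl.LOCK_EX if exclusive else fcntl.LOCK_SH
    try:
        fcntl.flock(f, mode | fcntl.LOCK_NB)
    except BlockingIOError:
        f.close()
        return None
    return f


def has_active_readers(path):
    """True if an exclusive lock cannot be taken right now.

    Note: this is also True while another exclusive holder exists.
    """
    probe = try_lock(path, exclusive=True)
    if probe is None:
        return True
    probe.close()  # closing the file releases the probe lock
    return False


lock_path = os.path.join(tempfile.mkdtemp(), "rlock")
reader = try_lock(lock_path, exclusive=False)
print(has_active_readers(lock_path))  # True while `reader` is open
reader.close()
print(has_active_readers(lock_path))  # False
```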
Reviewed By: DurhamG
Differential Revision: D33137390
fbshipit-source-id: 07d290a0a06248cdb0769b70ebf2dde3f063a293
Summary:
This helps resolve certain conflicts efficiently. For example, rebasing
D32800583 (66a7c13139) V3 caused a lot of conflicts like:
<<<<<<< dest: cf04306ad6ac - quark: [hg] edenapi: use "reponame" associated...
.allow_threads(|| block_unless_interrupted(api.files_attrs(repo, spec.0))
=======
.allow_threads(|| block_unless_interrupted(inner.files_attrs(spec.0)))
>>>>>>> source: e4fbe1289312 - quark: [hg] edenapi: drop reponame from edenap...
These conflicts are "obvious" to resolve using word-level diff and merge.
Run the ordinary line-based merge algorithm first. Then, for each conflicted
region, try the word-based merge algorithm. If it succeeds, pick its result.
Otherwise, still show the line-based conflict.
The word merge feature is off by default so it won't affect existing behaviors.
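A minimal Python sketch of a word-level three-way merge for one conflicted
region. This is not the implementation in this diff: the tokenizer, the use of
`difflib`, and the conservative "touching edits conflict" rule are assumptions
chosen to keep the example short.

```python
import difflib
import re


def tokenize(text):
    """Split into words, whitespace runs, and single punctuation chars."""
    return re.findall(r"\w+|\s+|[^\w\s]", text)


def _edits(base, other):
    """List (base_start, base_end, replacement) for each change to base."""
    sm = difflib.SequenceMatcher(None, base, other, autojunk=False)
    return [(i1, i2, other[j1:j2])
            for tag, i1, i2, j1, j2 in sm.get_opcodes() if tag != "equal"]


def merge_tokens(base, local, other):
    """Three-way merge of token lists; None means a real conflict."""
    ea, eb = _edits(base, local), _edits(base, other)
    merged, pos = [], 0
    while ea or eb:
        if ea and eb:
            a, b = ea[0], eb[0]
            if a == b:  # both sides made the same change
                edit = ea.pop(0)
                eb.pop(0)
            elif a[1] < b[0]:
                edit = ea.pop(0)
            elif b[1] < a[0]:
                edit = eb.pop(0)
            else:  # overlapping or touching edits
                return None
        else:
            edit = (ea or eb).pop(0)
        start, end, replacement = edit
        merged.extend(base[pos:start])
        merged.extend(replacement)
        pos = end
    merged.extend(base[pos:])
    return merged


def resolve_region(base, local, other):
    """Return merged text, or None to keep the line-based conflict."""
    merged = merge_tokens(tokenize(base), tokenize(local), tokenize(other))
    return None if merged is None else "".join(merged)
```

For instance, with the made-up base `f(inner.get(repo, x))`, one side renaming
`inner` to `api` and the other dropping `repo, ` merge to `f(api.get(x))`,
while `x = 2` versus `x = 3` on top of `x = 1` stays a conflict.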
Reviewed By: mitrandir77
Differential Revision: D10456947
fbshipit-source-id: 30f942ed2b26b2efba1d37a078e5b57ee3a56dcc
Summary:
Minimal changes to make diff work.
I haven't checked corner cases like rename handling.
They probably need some extra work.
Reviewed By: DurhamG
Differential Revision: D33159848
fbshipit-source-id: 8033fd5386dd79d90cc2fb9a292044743e139ade
Summary: Use the git store accordingly to support checkout.
Reviewed By: DurhamG
Differential Revision: D33159850
fbshipit-source-id: c1e84cb75fe330636e8e07a90ce0a5e59c4691ef
Summary:
Previously, only the changelog was aware of git.
Move the `gitdir` and requirement stuff to a dedicated place,
so non-changelog logic can use them.
Reviewed By: DurhamG
Differential Revision: D33159849
fbshipit-source-id: 631fa729001d3e0e9a195df27ad2b633c1574985
Summary:
Thin wrapper around GitStore. It can be used in pycheckout or pymanifest in the
native code paths.
Reviewed By: DurhamG
Differential Revision: D33159845
fbshipit-source-id: e395818b2e5bd6c9d560e76fe128a21047cc7000
Summary:
Implement a `GitStore` that can fit nativecheckout and manifest-tree
abstractions.
Reviewed By: DurhamG
Differential Revision: D33159851
fbshipit-source-id: 5f354b130567abf390de6207e5f626b274d22efe
Summary:
Implement the git tree deserialization.
The git tree format is different from hg tree and can be detected
automatically.
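For reference, the raw git tree layout is a sequence of
`<octal mode> <name>\0<binary object id>` entries. A small Python sketch of
parsing it, assuming 20-byte SHA-1 ids and a body without the `tree <len>\0`
object header; `parse_git_tree` is a name made up for this example, and the
automatic format detection mentioned above is not shown.

```python
def parse_git_tree(data: bytes):
    """Parse a raw git tree body into (mode, name, hex_id) tuples."""
    entries = []
    pos = 0
    while pos < len(data):
        space = data.index(b" ", pos)        # end of the octal mode
        nul = data.index(b"\0", space + 1)   # end of the entry name
        mode = int(data[pos:space], 8)
        name = data[space + 1:nul]
        oid = data[nul + 1:nul + 21]         # 20 raw bytes follow the NUL
        if len(oid) != 20:
            raise ValueError("truncated object id")
        entries.append((mode, name, oid.hex()))
        pos = nul + 21
    return entries


raw = b"100644 a.txt\0" + bytes(range(20)) + b"40000 dir\0" + b"\xff" * 20
for mode, name, oid in parse_git_tree(raw):
    print(oct(mode), name, oid)
```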
Reviewed By: DurhamG
Differential Revision: D33159847
fbshipit-source-id: 73b2a470e62d8380cc576347728374aa770942ee
Summary:
See the previous diff for context. Make `pycheckout` use abstractions so
it can accept `ReadFileContents` defined by different crates.
There is no need to leak the "scmstore" detail to `pycheckout`.
The `apply_scmstore*` methods are removed, since they will be just the
same as `apply*`.
Reviewed By: DurhamG
Differential Revision: D33159846
fbshipit-source-id: 20e4df4ba6bb287e1bdb3a0636013a45f80bc7fc
Summary:
See the previous diff for context. Make `pymanifest` use abstractions so
it can accept `TreeStore` defined by different crates.
`revisionstore` details are moved back to `pyrevisionstore`.
This also seems to improve `prefetch` and `subdirdiff` performance since
they will no longer go through Python.
Reviewed By: DurhamG
Differential Revision: D33157647
fbshipit-source-id: e65183689ca54d8c2beb277c6b8e52aa0f1571ac
Summary:
Pure Rust code can have abstractions like:
fn foo(x: impl Into<T>)
However, the `impl` syntax cannot be used in Python functions. That's a
problem because functions that need abstractions will have to take a `PyObject`,
and downcast manually with all actual impls in their dependencies.
For example, suppose `foostore` and `barstore` provide Rust types implementing
the `TreeStore` trait (defined by `treemodel`). And `pymanifest` wants an `impl
TreeStore`, then `pymanifest`'s dependency tree will look like:
pymanifest
+- pyfoostore - foostore - treemodel
+- pybarstore - barstore - treemodel
And changes to pymanifest are required to support any new store types.
This diff introduces a way to achieve the `impl Into` in Python binding
functions. One can write:
def foo(x: ImplInto<T>)
The dependency tree can be cleaner:
pymanifest
+- treemodel
And adding new store types or bindings does not require changing `pymanifest`,
which uses the abstraction.
Reviewed By: yancouto
Differential Revision: D33157648
fbshipit-source-id: 41e4bf03770f1ed568dbec7f7cb69ca3f67251a8
Summary:
"convert-cert" is required for HTTPS to work on Windows, but previously it didn't get set by default. Mercurial set it globally, edenapi client set it locally (for usage via EdenFS), but revisionstore/lfs did not set it locally (so it didn't work via EdenFS on Windows).
hg_http::http_config() is a new function that creates an http_client::Config from an hg config. This config is then passed to hg_http::http_client() (and the client passes it along to requests). This is the new unified way to configure http knobs via the hg config. Currently it only configures convert_cert, but I will move more settings over.
This commit should fix rust LFS fetching via EdenFS on Windows.
Reviewed By: DurhamG
Differential Revision: D33107052
fbshipit-source-id: 2eb46b3745046a90ecac532b3526371843a79fed
Summary: Now instead of calling Request::post(url) you call client.post(url). This makes it possible for configuration to flow from the client to the request automatically. This is a step towards getting rid of global request configuration hooks.
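The shape of this change can be modelled in a few lines of Python. This is only
an illustration of the pattern, not the Rust `http_client` API: the classes
below are stand-ins, and the two config fields are borrowed from the
neighbouring diffs for the sake of the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Config:
    convert_cert: bool = False
    verbose_stats: bool = False


@dataclass
class Request:
    method: str
    url: str
    config: Config


class HttpClient:
    def __init__(self, config: Config):
        self.config = config

    def post(self, url: str) -> Request:
        # The request inherits the client's config; no global hook is needed.
        return Request("POST", url, self.config)


client = HttpClient(Config(convert_cert=True))
request = client.post("https://example.com/upload")
print(request.config.convert_cert)  # True
```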
Reviewed By: DurhamG
Differential Revision: D33107054
fbshipit-source-id: 21eaecfec09f60b402703431070807e694cd85c4
Summary:
Move a couple of existing HttpClient fields onto a new Config object and pipe through to existing users. This is step one to unifying http client configuration.
I renamed "verbose" to "verbose_stats" to differentiate from http.verbose which turns on verbose request output.
Reviewed By: DurhamG
Differential Revision: D33107053
fbshipit-source-id: d00e5c0d21a04cff1bb62ebd86dfc59962983249
Summary:
The repair messages can be useful to see if things are wrong.
Previously they were only kept in memory. Attempt to write them to `repair.log` for
easier investigation.
Reviewed By: DurhamG
Differential Revision: D33137391
fbshipit-source-id: d84b35f84c0a70930c5ff9a0d5ccb7e295f64838
Summary:
According to buck, the bindings are unused in the eden project:
fbcode % buck query 'rdeps("//eden/...","//eden/scm/lib/edenapi/bindings:edenapithin",1)'
//eden/scm/lib/edenapi/bindings:edenapithin
//eden/scm/lib/edenapi/bindings:c_api
fbcode % buck query 'rdeps("//eden/...","//eden/scm/lib/edenapi/bindings:c_api",1)'
//eden/scm/lib/edenapi/bindings:c_api
It seems that:
- D25104843 is the next step to make it useful in C++ but that diff never landed.
- The EdenApi feature was added in the `backingstore` crate, ex. D18605549 (ae1dae6b96), and D32371710 (74115cda20)
Therefore it seems unlikely the edenapi C bindings will get used. Let's just
remove them.
Reviewed By: chadaustin
Differential Revision: D33139302
fbshipit-source-id: b57cec42d3b2ccfe0d9a4fb5af77a05f98b510d6
Summary:
In production we simply cannot avoid OS crashes or hard reboots, or hard VM
shutdown without proper fsync implementation. These lead to annoying corruption
issues that are nonetheless easy to fix (ex. missing 'latest' or 'meta' files).
Use open_with_repair to reduce the support burden for these kinds of issues.
Reviewed By: DurhamG
Differential Revision: D33109477
fbshipit-source-id: c72861761d61df42e774fb2a43e3c22bd7e13ab3