Summary:
The core `clone.streamclone` is the new clean way to do a streaming clone with
selectivepull. Detect the use of it and skip remotenames' own exclone logic.
Reviewed By: DurhamG
Differential Revision: D21011396
fbshipit-source-id: 50fdbf4c2761a96c50e23f21a87ef636fac74afb
Summary:
The current `clone --shallow` command has some issues:
- It fetches *all* remote bookmarks, since selectivepull does not work with
streamclone, then remove most remote bookmarks in a second transaction.
- It goes through remotenames, which is racy, and D20703268 does not fix the
clone case. Possible cause of T65349853.
- Too many wrappers (ex. in remotefilelog, remotenames, fastdiscovery) wtih
many configurations (ex. narrow-heads on/off) makes it hard to reason about.
Instead of bandaidding the clone function, this diff adds a new clone implementation
that aims to solve the issues:
- Use streamclone, but do not pull all remote names.
- Pull selectivepull names explicitly with a working "discovery" strategy
(repo heads should be non-empty with narrow-heads on or off).
- Do clone in one transaction. Outside world won't see an incomplete state.
- Use `repo.pull` API, which is not subject to race conditions.
- Eventually, this might be the only supported "clone" after Mononoke becoming
the single source of truth.
Note: the code path still goes through bookmarks.py and remotenames.py.
They will be cleaned up in upcoming diffs.
Reviewed By: DurhamG
Differential Revision: D21011401
fbshipit-source-id: d8751ac9bd643e9661e58c87b683be285f0dc925
Summary:
In the past we hide the revlog headrevs API with the idea that calculating
heads in the DAG is not going to scale, and heads should be based on references
(remotenames, visible heads). Practically calculating heads in the DAG based
on segmented changelog is not going to be painfully slow so we probably can
afford it.
Therefore let's just re-expose the DAG-based heads API as rawheads. The only
user of it is in dagutil.py.
This will be used in the next diff where streamclone first gets the revlog
changelog copied without remote bookmarks. Then it needs to do a pull
which requires the heads information.
Reviewed By: DurhamG
Differential Revision: D21296530
fbshipit-source-id: a81a61e3b58c921a3390fda8f716bd7ae0e55ed1
Summary:
Move the logic to write repo hgrc ([paths]) and set [paths] config options
earlier, so other logic can use the [paths] config.
Some tests are changed because remotenames can now write bookmarks in more
cases.
Reviewed By: DurhamG
Differential Revision: D21011397
fbshipit-source-id: 4b921a02c20daeef31d44a03264a89b975303aa5
Summary:
"[paths] being empty" will no longer be a way to test initial clone, use
transaction name instead.
Reviewed By: DurhamG
Differential Revision: D21011395
fbshipit-source-id: e257fe8eb2efd45ac52fad7c74363151b0a8c417
Summary: This will be used by hg-git to test initial clone.
Reviewed By: DurhamG
Differential Revision: D21011400
fbshipit-source-id: 11a1a41631830273a6407e419ebe5ff21964e7de
Summary: It is not used and makes the already complicated clone logic more complicated.
Reviewed By: DurhamG
Differential Revision: D21011394
fbshipit-source-id: 3620f7372a9f3cefc60618052c768c6c2cbe04f9
Summary:
They are not used. Remove it to make it a bit easier to move stream_out_shallow
to core.
Note: this does not remove all include/excludepatterns yet.
Reviewed By: DurhamG
Differential Revision: D21011403
fbshipit-source-id: f6d27a3e2472f6c69f95a958ac99f75a8b8f8b74
Summary: It will be used in the next change.
Reviewed By: DurhamG
Differential Revision: D21011399
fbshipit-source-id: 6bdffc79af0474e42562686109417882a8cb2cd6
Summary:
While this isn't the right fix, this is what shipped in our packages, for the
sake of being able to reproduce the package, let's land this as it is. A
future change will remove this ifdef.
Below is pkaush original description:
In Eden Windows we treat all the files as regular files and don't have a
concept of symlinks and executable files. Fixing the TreeEntryType::getType()
to return REGULAR_FILE for executable file and symlink.
Reviewed By: wez
Differential Revision: D20481051
fbshipit-source-id: 0b0c4d7aea28134383ef45aeafc02930b420286b
Summary:
Right now, we debug-print the root cause and pretty-print everything else. This
is pretty bad because the root cause is usually the one thing we would want to
pretty print so we can add instructions there (such as "your hooks failed, fix
it").
This fixes this so we stop pretty-printing the root cause, but also debug print
the whole error, which gives us more developer-friendly context and is easier
for automation to match on.
This is actually in common/rust ... but we're the only people using it AFAICT.
Reviewed By: StanislavGlebik
Differential Revision: D21522518
fbshipit-source-id: 10158811574b56024e14852229e4541da19d5609
Summary:
Add the `hg cloud hide` command. This allows removal of commits, bookmarks and
remote bookmarks from a cloud workspace, even when the items are omitted
locally.
Reviewed By: DurhamG, quark-zju
Differential Revision: D21409384
fbshipit-source-id: 24b64c207c78f9b0258e9cf6a578db7b14c84901
Summary:
Limits on concurrent calls are a bit hard to reason about, and it's not super
obvious what a good limit when all our underlying limits are expressed in QPS
(and when our data sets don't have peak concurrency - instead they have
completion time + # blob accesses).
Considering our past experience with ThrottledBlob has been quite positive
overall, I'd like to just use the same approach in ContextConcurrencyBlobstore.
To be safe, I've also updated this to be driven by tunables, which make it
easier to rollout and rollback.
Note that I removed `Debug` on `CoreContext` as part of this because it wasn't
used anywhere. We can bring back a meaningful implementation of `Debug` there
in the future if we want to. That triggered some warnings about unused fields,
which for now I just silenced.
Reviewed By: farnz
Differential Revision: D21449405
fbshipit-source-id: 5ca843694607888653a75067a4396b36e572f070
Summary:
The motivation for making this function async is that it needs to spawn things,
so it should only ever execute while polled by an executor. If we don't do
this, then it can panic if there is no executor, which is annoying.
I've been wanting to do this for a while but hadn't done it because it required
refactoring a lot of things (see the rest of this stack). But, now, it's done.
Reviewed By: mitrandir77
Differential Revision: D21427348
fbshipit-source-id: bad077b90bcf893f38b90e5c470538d2781c51e9
Summary:
This updates our blobrepo factory code to async / await. The underlying
motivation is to make this easier to modify. I've ran into this a few times
now, and I'm sure others have to, so I think it's time.
In doing so, I've simplified the code a little bit to stop passing futures
around when values will do. This makes the code a bit more sequential, but
considering none of those futures were eager in any way, it shouldn't really
make any difference.
Reviewed By: markbt
Differential Revision: D21427290
fbshipit-source-id: e70500b6421a95895247109cec75ca7fde317169
Summary:
I'd like to add some async fns to session creation. The motivation is that I
want to instantiate an AsyncRateLimiter there, and that requires an async
context because it needs to spawn on the Tokio runtime, and the best way to
enforce this is to just make the function async.
Reviewed By: StanislavGlebik
Differential Revision: D21427291
fbshipit-source-id: 75b0d41b62a77ade3d624e24adc57a938b722d9c
Summary:
At least let's tell the use what to do about the problem and, where we can,
what the conflicting file was (see the attached task).
Reviewed By: farnz
Differential Revision: D21459412
fbshipit-source-id: 52b90cf7d41ebe6550083c6673b4e93b10edf5e2
Summary:
I initially wanted to modify this and it'll be easier to do so if it's
async-await. While in there, add tests and update the code to bail early if any
conflict is hit.
In writing the tests, I noted that the code that we need is already there and
his does work as expected, so I'm not actually going to modify this more, but
it's probably stil worth it to land the tests.
Reviewed By: StanislavGlebik
Differential Revision: D21457899
fbshipit-source-id: 91350962fa2d96a88e4595d1ae47ef7678dad8cb
Summary: I'm going to asyncify some things here. Let's start with this.
Reviewed By: farnz
Differential Revision: D21451761
fbshipit-source-id: 64c78de4ab640b826a3ec1d6d84149d46f225024
Summary:
Right now we're only logging hooks that outright fail, which isn't great. Let's
log rejections as well.
Reviewed By: johansglock
Differential Revision: D21522804
fbshipit-source-id: 6bfc6b12394099b04faa9d23f164b436935f9fb3
Summary: This will reduce the amount of space they take in scuba.
Reviewed By: xavierd
Differential Revision: D21483472
fbshipit-source-id: 9de49dedef480932f8583dd17fe6625d222a3285
Summary: add two methods calls as typehints (no real code change).
Reviewed By: zeroxoneb
Differential Revision: D21337646
fbshipit-source-id: 8079883f7f989251965d3308c5374f097023b57a
Summary: This allows us to query tracing data for fsmonitor walk events.
Reviewed By: DurhamG
Differential Revision: D19797709
fbshipit-source-id: 1ff76dd6122cf56787e7928711f604f9c3d571cc
Summary:
Pass `configparser::config::ConfigSet` to `repack` in
`revisionstore/src/repack.rs` so that we can use various config values in `filter_incrementalpacks`.
* `repack.maxdatapacksize`, `repack.maxhistpacksize`
* The overall max pack size
* `repack.sizelimit`
* The size limit for any individual pack
* `repack.maxpacks`
* The maximum number of packs we want to have after repack (overrides sizelimit)
Reviewed By: xavierd
Differential Revision: D21484836
fbshipit-source-id: 0407d50dfd69f23694fb736e729819b7285f480f
Summary:
Let's not allow proceeding with memcommit if repo is locked. This what normal
push flow does, so we should allow it here as well.
Reviewed By: markbt
Differential Revision: D21502435
fbshipit-source-id: 80e665f065fb0cd882bc99482769a3de01d3de30
Summary:
Make the repo path in Option<WrappedPath> available in stream output in preparation for using it in the corpus dumper to write to disk
The path is Option as not all nodes can have an associated file system path (e.g. BonsaiChangeset)
The headlines changes are in sampling.rs and sizing.rs. The progress.rs change slightly generalises to allow any type convertible to NodeType as the main walk identifier in the output stream.
Some refactors done as part of this
* NodeSamplingHandler is renamed to WalkSampleMapping to reflect this is what it stores.
* WalkSampleMapping generic parameters are extended to take both a key and a sample type
* NodeSamplingHandler::start_node() is moved to a new SampleTrigger::map_keys() type. This is so that SamplingWalkVisitor doesn't need the full WalkSampleMapping generic parameters.
Reviewed By: krallin
Differential Revision: D20835662
fbshipit-source-id: 58db622dc63d7f869a092739d1187a34b77219f6
Summary: Make sampling blobstore handlers fallible in preparation for corpus dumper so we can know if writes to disk/directory creations failed.
Reviewed By: farnz
Differential Revision: D21168632
fbshipit-source-id: d25123435e8f54c75aaabfc72f5fa653e5cf573d
Summary:
Not all node types can have a path associated
Reset the tracked path to None if the route is taking us through a node type that can't have a repo path.
Reviewed By: krallin
Differential Revision: D21228372
fbshipit-source-id: 2b1e291f09232500adce79c630d428f09cd2d2cc
Summary:
Add new --sample-offset argument so that in combination with the existing --sample-rate the whole repo can be sampled in slices
For --sample-rate=N, this allows us to scrub or corpus dump 1/Nth of the repo a time, which is particularly useful for corpus dumping on machines with limited disk.
Also factored out the sampling args construction as 3 of the 4 walk variants use them (only validate does not)
Reviewed By: krallin
Differential Revision: D21158486
fbshipit-source-id: 94f98ceb71c22e0e9d368a563cdb04225b6fc459
Summary: use ArcIntern for WrappedPath to reduced walker memory usage for paths
Reviewed By: farnz
Differential Revision: D21230828
fbshipit-source-id: 525bac5a14b205659e177e03bd83bf06d1444617
Summary:
- Added uptime field to DaemonInfo thrift struct
- Created startTime member variable in EdenServer
- Made appropriate refactoring changes to EdenMain and EdenServer
- Changed main.py and util.py to use the new uptime value
Reviewed By: genevievehelsel
Differential Revision: D21471140
fbshipit-source-id: 8868de667dfb95de93e3e71b90c0412fb3825388
Summary:
If http_proxy.no is set, we should respect it to avoid sending traffic to it
whenever required.
Reviewed By: wez
Differential Revision: D21383138
fbshipit-source-id: 4c8286aaaf51cbe19402bcf8e4ed03e0d167228b
Summary:
When Qing implemented all the get method, the translate_lfs_missing function
didn't exist, and I forgot to add them in the right places when landing the
diff that added it. Fix this.
Reviewed By: sfilipco
Differential Revision: D21418043
fbshipit-source-id: baf67b0fe60ed20aeb2c1acd50a209d04dc91c5e
Summary: Make them reusable in other Python bindings, ex. pymutation.
Reviewed By: sfilipco
Differential Revision: D21486524
fbshipit-source-id: 258455c6a442353c77588fadcb560cb5a170926e
Summary: This makes it easier to visualize a MemNameDag.
Reviewed By: sfilipco
Differential Revision: D21486523
fbshipit-source-id: c65f1fc421bd654dc820faae3c93f2aa57f910d4
Summary:
This will allow clients to operate on MemNameDag.
Unfortunately, it isn't that easy to reuse code in `py_class!`. Since they are
just thin wrappers, I live with the copy-paste for now.
Reviewed By: sfilipco
Differential Revision: D21479015
fbshipit-source-id: ddcc7f5c7ede6bb1e9c73d058779805875b09200
Summary: This would be handy to visualize a MemNameDag.
Reviewed By: sfilipco
Differential Revision: D21486522
fbshipit-source-id: c8d7147dc53a1a7c1b8b09ce055493c69cceba2f
Summary:
Use MemNameDag::from_ascii to simplify the tests. This removes the need of:
- using tempdir
- converting between Id and VertexName manually via an IdMap
- depending on drawdag directly
Reviewed By: sfilipco
Differential Revision: D21486519
fbshipit-source-id: f04061d8892f043de40e7e321273acc51e15308a
Summary:
It seems handy to construct a Dag just from ASCII. Therefore move it to a
public interface.
Reviewed By: sfilipco
Differential Revision: D21486525
fbshipit-source-id: de7f4b8dfcbcc486798928d4334c655431373276
Summary:
They are part of the read-only algorithms that are not specific to a certain
type of NameDag.
Reviewed By: sfilipco
Differential Revision: D21479017
fbshipit-source-id: 3fa58071ac43246d3cd45d84384ee93c7385f414
Summary:
Adds an in-memory NameDag so we can construct the DAG and use its algorithms by
just providing parents function and heads.
Reviewed By: sfilipco
Differential Revision: D21479021
fbshipit-source-id: e12d53a97afec77b2307d5efbb280bd506dee0ba
Summary: Adds an in-memory IdMap to be used in an in-memory NameDag.
Reviewed By: sfilipco
Differential Revision: D21479018
fbshipit-source-id: bc702762b059e8659c6ab322f3c39f032e95d5b6
Summary:
This allows them to switch to a different IdMap implementation relatively
easily.
Reviewed By: sfilipco
Differential Revision: D21479023
fbshipit-source-id: 8ecb99cafe2093ec7d14b848ffa08581c5300414
Summary: This will allow different IdMap implementations.
Reviewed By: sfilipco
Differential Revision: D21479016
fbshipit-source-id: 852501896fddcb82624338acd9dceee41150e302
Summary:
`NameDag::add_heads` API changes the internal `dag` state without updating
`snapshot_map`. That will cause queries relying on `snapshot_map` to fail.
Update it so that `snapshot_map` gets updated by `add_heads`.
Reviewed By: sfilipco
Differential Revision: D21479019
fbshipit-source-id: 70528aa4a488cef3dc71bf21dd89e45cfe763794
Summary:
This makes it easier to add an "in-memory-only" NameDag with all the algorithms
implemented.
Reviewed By: sfilipco
Differential Revision: D21479020
fbshipit-source-id: c1a73e95f3291c273c800650f70db2a7eb0966d7