Summary:
It seems both are accepted, but +0000 is more "standard" and is what is used by
git.
This change restores some commit hashes in tests to pre-D33668305 (4dad74346b) state.
Reviewed By: DurhamG
Differential Revision: D33694197
fbshipit-source-id: aef6a587144c4b3b9045fa13dbaf8e7e2ac75af9
Summary: The git repos using git protocols cannot use commitcloud and infinitepush.
Reviewed By: DurhamG
Differential Revision: D33688111
fbshipit-source-id: bc8f8d7b8b5c84f567546b486bef15b5c92c1f32
Summary:
Provide a way to test if an extension is enabled or not. It will be used by the
next change.
Reviewed By: DurhamG
Differential Revision: D33688114
fbshipit-source-id: e0d73a3ec376b2b8acbe4daadc17f83f63f9beed
Summary:
There are 2 problems:
- timezone offset format "+HHMM" is in minutes, not seconds.
- the sign +/- is flipped.
The first issue was found by `git fsck`.
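A minimal sketch of the corrected conversion, assuming the stored offset is in seconds west of UTC (the Mercurial convention) while git expects `+HHMM` east of UTC. The name `git_tz` is hypothetical and is not the helper touched by this change:

```python
def git_tz(offset_seconds: int) -> str:
    """Format an offset given in seconds west of UTC as git's "+HHMM"."""
    # Flip the sign: west-of-UTC seconds become an east-of-UTC offset.
    east_seconds = -offset_seconds
    sign = "-" if east_seconds < 0 else "+"
    # "+HHMM" is hours and minutes, so convert seconds to minutes first.
    total_minutes = abs(east_seconds) // 60
    return f"{sign}{total_minutes // 60:02d}{total_minutes % 60:02d}"
```

With this sketch, a zero offset formats as `+0000`.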
Reviewed By: DurhamG
Differential Revision: D33668305
fbshipit-source-id: 4af64caea7ab2dc2cec920573140b858e8fda1e7
Summary:
Previously, the `git` requirement indicates using the git format, store, and
protocols. In the future we might use the git format without using the git
store, or protocols. Use different requirements so future changes can be
easier.
Reviewed By: danielocfb
Differential Revision: D33521792
fbshipit-source-id: 1a764afb66380af106c4088b49a902e1f0eafb64
Summary:
Tags are treated as remote bookmarks prefixed with `tags/`. With previous
changes, pulling (via auto pull) or pushing tags just works. Add tests for it.
Reviewed By: DurhamG
Differential Revision: D33518819
fbshipit-source-id: 4e59d23d27cb92d884b66a24bf61bf2e36161fa7
Summary:
Previously the git remote url (ex. url of origin) is stored in git (by
`git remote add`). It has the benefit that git recognizes the names like
`"origin"`, and does the reference mapping work (ex. `refs/heads/foo` ->
`refs/remotes/origin/foo`) during pull.
But it turns out the approach has more downsides, namely:
- `hg paths --add` does not work.
- auto pulling like `hg checkout remote/tags/v1` does not work.
- `repo.pull` API is broken for git repos.
This diff changes the remote urls in git repos to be stored in `hgrc`, just
like a regular hg repo, to address the above issues.
As for git, we can now maintain the references explicitly, so there is
no need for git to do the mapping on fetch.
Reviewed By: DurhamG
Differential Revision: D33518822
fbshipit-source-id: 3955d76bf76d73be60cb1363cbc3684f5336f605
Summary:
Previously, `pull origin -B foo --update` will try to resolve `repo['foo']`,
instead of `repo['origin/foo']`. This only works if `origin` is set as
`remotenames.hoist`. Change it so it works for other remote names.
Reviewed By: DurhamG
Differential Revision: D33518831
fbshipit-source-id: f73540ae1e3fa9e1906b1f6a3e42cb63684fc423
Summary:
`pull -B main --checkout` should checkout `main` even if nothing got pulled.
This fixes a test in a future change.
Reviewed By: DurhamG
Differential Revision: D33518827
fbshipit-source-id: 632588199ef5feabafe8814d9efde838b1613bf5
Summary:
Previously, `pull` takes refspecs. The refspec is a git detail. It forces the
callsite to understand the git refs and git-fetch refspec. Provide a higher
level `pull` so the callsite only needs to understand remote names (ex. "main")
and commit hashes.
This makes `git.pull` more similar to `repo.pull`, and makes it easier to add
auto pull and `hg paths` features to git repos.
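To illustrate what the higher-level API hides from the callsite, here is a sketch that builds the refspecs from plain names. `build_refspecs` is a hypothetical helper, and the `refs/remotes/<remote>/<name>` destination is the usual git layout, assumed here rather than confirmed for this change:

```python
from typing import List


def build_refspecs(remote: str, names: List[str]) -> List[str]:
    """Turn remote bookmark names into git-fetch refspecs (assumed layout)."""
    # "+" allows non-fast-forward updates of the remote-tracking ref.
    return [f"+refs/heads/{name}:refs/remotes/{remote}/{name}" for name in names]
```

A callsite would then only pass `"origin"` and `["main"]` and never see the refspec syntax.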
Reviewed By: DurhamG
Differential Revision: D33518825
fbshipit-source-id: ed109b42e26c897c7f9569c00f6630709372a258
Summary:
Make memfilectx able to preserve the "m" flag and the submodule filenode.
Without this change rebase will turn submodules into regular files.
Reviewed By: DurhamG
Differential Revision: D33518829
fbshipit-source-id: 954210ad0b0d140244bb8da973cd9dc457ede3cc
Summary: Use the `Subproject commit SHA1` format similar to git.
Reviewed By: DurhamG
Differential Revision: D33518824
fbshipit-source-id: 676c1ae8c0b35df1f965d1ed787ae2580352a1ef
Summary:
Repos like pytorch use nested submodules, and the same dependency can appear in
different nested submodules. For example, `pybind11` has 4 copies:
pytorch-git/.git % find | grep 'pybind11$'
./modules/third_party/onnx/modules/third_party/pybind11
./modules/third_party/onnx-tensorrt/modules/third_party/onnx/modules/third_party/pybind11
./modules/third_party/pybind11
./modules/third_party/tensorpipe/modules/third_party/pybind11
Practically, this reduces pytorch's `.hg/store/gitmodules` size from 506MB
to 442MB. Note the git version of `.git/modules` takes 862MB. That might be
caused by git fetching `HEAD` or the `branch` config specified in `.gitmodules`
instead of just fetching the commit needed to checkout submodules.
Cloning them in different places is a waste of space. Let's make nested repos
with the same URLs share the same backing repo.
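One way to picture the sharing is a backing directory keyed by URL instead of by nesting path. This is only a sketch: `backing_repo_path` and the `<name>-<digest>` layout are assumptions, not the layout used in `.hg/store/gitmodules`:

```python
import hashlib
import os


def backing_repo_path(store_dir: str, url: str) -> str:
    """Map a submodule URL to a single shared directory (hypothetical layout)."""
    # The digest depends only on the URL, so every nested copy of the same
    # dependency resolves to the same backing repo.
    digest = hashlib.sha1(url.encode("utf-8")).hexdigest()[:12]
    name = os.path.basename(url.rstrip("/")) or "repo"
    return os.path.join(store_dir, f"{name}-{digest}")
```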
Reviewed By: DurhamG
Differential Revision: D33518828
fbshipit-source-id: 1c091bf41c263d701eecdc266ba49723b3ad5eec
Summary:
Make workingctx provide the "m" (submodule) flag, and the submodule node in
its manifest. So repo.commitctx can use them.
Reviewed By: DurhamG
Differential Revision: D33518820
fbshipit-source-id: 2bc943f1727608cf96553699baf7d1c81311c2fa
Summary:
This makes it possible to select submodule changes using matcher and commit
them.
Reviewed By: DurhamG
Differential Revision: D33518826
fbshipit-source-id: bd3f3555dfacbed1a00fe372af24849ff6182970
Summary:
Constructing the submodule repos can be slow. Let's just check the dirstate p1
and avoid heavyweight checkout if possible. `--clean` can force a slower
submodule checkout.
Practically this reduces pytorch (with 37 submodules) checkout from 1.5s to 0.37s.
Reviewed By: DurhamG
Differential Revision: D33518833
fbshipit-source-id: f1b5a2803a4f0560e93b56597db743f290bda303
Summary:
Submodules seem commonly used. Add basic read (checkout) support for them.
Things become complicated with watchman and edenfs. But this diff
ignores them for now.
Reviewed By: DurhamG
Differential Revision: D33518834
fbshipit-source-id: 829800cc01e64e07386b0767b753052881db300e
Summary:
Upcoming changes will add more git tests. Move the GIT_ environment variables
to test runner so they won't need to be repeated in tests.
Reviewed By: danielocfb
Differential Revision: D33518823
fbshipit-source-id: 5fd69905215ef6ebfa4c49815992deec62fe794d
Summary:
The old-style class warning only applies to Python 2. I want to use Python 3
dataclass in a future diff.
Reviewed By: DurhamG
Differential Revision: D33518830
fbshipit-source-id: 89a8b653f1ff083975911af7e3e095737635d622
Summary:
Some git repos (ex. pytorch) have thousands of branches, which won't scale well.
Change no-argument pull to respect selective pull to mitigate it.
Reviewed By: DurhamG
Differential Revision: D33485186
fbshipit-source-id: b735704850847c49b0bfeb94972e4bb778e726a4
Summary: Instead of returning a set, return a list with maintained order.
Reviewed By: danielocfb
Differential Revision: D33486861
fbshipit-source-id: 36e7703f4b9f13994bc75ca0f6f243e7d3abfeb6
Summary:
FETCH_HEAD is a bit noisy in the `pull` output, especially with upcoming
changes. It's currently only used to figure out `pull --update` or the initial
checkout commit during clone. Let's do that without using FETCH_HEAD.
Reviewed By: DurhamG
Differential Revision: D33485187
fbshipit-source-id: dab92d3db677fd77a06022f50f499dd8c9ef000f
Summary:
Previously, clone runs git fetch on its own. Make it reuse the `pull` method so
upcoming changes are easier.
Reviewed By: danielocfb
Differential Revision: D33485188
fbshipit-source-id: 9f215fcb7a013cd947c7fce263999911446dd5c4
Summary:
Reading a null OID causes an error in libgit2:
odb: cannot read object: null OID cannot exist; class=Odb (9); code=NotFound (-3)
libgit2 special-cases the null id, but hg wants to read null sometimes.
Let's just handle it in the gitstore.
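The idea can be sketched as a thin wrapper that answers null reads itself. The class name, the `read_object` callable, and the choice to return empty content for null are all assumptions for illustration, not the actual gitstore code:

```python
NULL_ID = b"\0" * 20


class NullSafeStore:
    """Wrap a blob reader so the null id never reaches the backend."""

    def __init__(self, read_object):
        # Hypothetical callable: 20-byte id -> bytes (raises for unknown ids).
        self._read_object = read_object

    def read(self, node: bytes) -> bytes:
        if node == NULL_ID:
            # Assumption: the null id reads as empty content.
            return b""
        return self._read_object(node)
```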
Reviewed By: markbt
Differential Revision: D33485189
fbshipit-source-id: ed96921353ef592750fe361eee06ebcc0a25f07b
Summary:
Pass binary nodes instead, since the `node` variable is expected to be binary.
This was caught by the `assert len(node) == 20` in gitfilelog.
With this diff, rebase with merging now seems to work for git.
Reviewed By: danielocfb
Differential Revision: D33372099
fbshipit-source-id: 58b36d13337b9557632e9146f7ff7f963531940c
Summary: Add a way to init an empty repo backed by git without cloning.
Reviewed By: danielocfb
Differential Revision: D33372098
fbshipit-source-id: ef5758c8289f0cf8eea830e1c5c9ba3af30135b2
Summary:
The code `os.path.basename(repo.ui.config("paths", "default"))` will crash if paths.default is not set.
There is no need to read paths.default for reponame manually after D32570353 (30f98e1fad). So let's just remove it.
This makes `cloud sync` show the right error messages without crashing in a git
repo.
Reviewed By: markbt
Differential Revision: D33370562
fbshipit-source-id: 8e893a81bb012876ccc57cde44b2549f81af789c
Summary:
There isn't a good tag UX in hg. Too many tags can also slow down
graph syncing. Let's just skip the tags in clone and pull for now.
Maybe we can treat them like lazy remote names in the future.
Reviewed By: danielocfb
Differential Revision: D33370564
fbshipit-source-id: 311f4d6367cd952c7c7b0686fab3229e2de2c0b0
Summary:
Previously, `bookmarks --remote` connects to the remote with listkeys
capability without checking if the remote is valid or not. That crashes in a
git repo where `[paths]` is empty. Fix it by falling back to local remotenames
listing.
Reviewed By: DurhamG
Differential Revision: D33370563
fbshipit-source-id: 2626a57dd39d35fd0205af79ffa02c5587490c69
Summary:
Support git submodule in the trees (serialization + deserialization).
This enables reading git trees with submodules without errors, and
modifying the trees without losing the submodules entries.
The added "submodule" type forces downstream users to update.
Namely, `checkout` is updated to silently ignore them.
Reviewed By: DurhamG
Differential Revision: D33369607
fbshipit-source-id: c2bce1882df3958b272dd8a6dcc3e4052f2704f5
Summary:
Use an enum variant for regular files so `Option` becomes unnecessary.
This makes the type simpler and makes upcoming changes cleaner.
Note: The type now matches `manifest::FileType` but `manifest::FileType`
will be changed in upcoming changes.
Reviewed By: danielocfb
Differential Revision: D33369606
fbshipit-source-id: 2b01d74875f97675d79082fe39bf4bdea26fdb07
Summary:
Change `ctx.files()` to run diff for git repos. This makes a few revsets
and templates work.
Reviewed By: danielocfb
Differential Revision: D33354722
fbshipit-source-id: c6449d7f405f2d2c568294eeebe47d371e46cb12
Summary:
Support pushing to a git repo using `git push` command.
This is implemented in remotenames' `push`, since that's
what is used in production. The vanilla `push` is unchanged.
The `%include builtin:git.rc` in hgrc should have remotenames
enabled.
Reviewed By: DurhamG
Differential Revision: D33351380
fbshipit-source-id: f1e2dfd64168b83d1cf608c0490e56634688da15
Summary: There is no need to prefetch trees if there is no remote repo set.
Reviewed By: DurhamG
Differential Revision: D33351378
fbshipit-source-id: 13a0dc56b712bc6be5373958b763febc1652e76b
Summary: Found it when I was reading the code.
Reviewed By: DurhamG
Differential Revision: D33458497
fbshipit-source-id: 06cc1e3d2afbcd345ba9b0471546aa5719106e77
Summary:
The `%include builtin:git.rc` will be written to repo `hgrc` when creating
repos backed by git. The main benefit is that we can update the builtin
git.rc without migrating existing hgrc files in the future.
Reviewed By: DurhamG
Differential Revision: D33352777
fbshipit-source-id: 9b6904e4bc0eff53bf623ce9ae382d723da1213e
Summary:
Sometimes it's unclear why a config is ineffective. This makes it
possible to figure out what's going on.
Reviewed By: DurhamG
Differential Revision: D33352778
fbshipit-source-id: bd7d591200e6a15a5e09f2057cc9303e5f1c344c
Summary: Delegate the pull work to `git fetch` to complete its task.
Reviewed By: DurhamG
Differential Revision: D33351385
fbshipit-source-id: 16af5312d805d53f3ba1275baae3137357134ee7
Summary:
dev-logger is used for tracing logs in tests. Make it easier to use
by reading the LOG env var, instead of RUST_LOG, and use a shorter
format without color for fast outputting, easier redirect and editing.
Reviewed By: DurhamG
Differential Revision: D33339411
fbshipit-source-id: a4a9e0336b17856c07076cf19f56bd99064d94e4
Summary:
It's not used for features. But having access to the segments in Python
makes it easier to prototype stuff. For example, this is the Python
prototype of "pathhistory" in `debugshell.py` with some parameters
to tweak:
from collections import deque

@command("lg")
def lg(ui, repo, *args, **opts):
    logfile(repo, *args, **opts)

def logfile(repo, path="Documentation/logo.gif"):
    cache = {}

    def get(
        rev,
        rev2node=repo.changelog.idmap.id2node,
        clr=repo.changelog.changelogrevision,
        tm=bindings.manifest.treemanifest,
        ts=repo.manifestlog.datastore,
        path=path,
        cache=cache,
    ):
        if rev in cache:
            return cache[rev]
        node = rev2node(rev)
        mnode = clr(node).manifest
        fnode = tm(ts, mnode).get(path)
        cache[rev] = fnode
        return fnode

    cl = repo.changelog
    dag = repo.changelog.dag
    nset = repo.dageval(lambda: ancestors(lookup(".")))
    q = deque(dag.segments(nset))
    torevs = repo.changelog.torevs
    roots = repo.changelog.torevs(dag.roots(nset))
    spans = bindings.dag.spans
    spansetfromrange = bindings.dag.spans.unsaferange
    dagsegments = dag.segments
    tonodes = cl.tonodes
    skipbylevel = [0] * 5
    splitbylevel = [0] * 5
    skipbypdiff = [0] * 5
    visitbylevel = [0] * 5
    psameskip = 0
    psameblock = 0
    skipmulti = 0
    out = []
    skippedfnodes = set()
    skippedparents = spans([])
    opt_dedicated_l0_bisect = False
    opt_skip_none_ancestors = False
    opt_skip_multi = False
    opt_skip_any_psame = True

    def d(msg, write=repo.ui.write_err, verbose=repo.ui.verbose):
        if verbose:
            write("%s\n" % msg)

    d(f"QUEUE SIZE: {len(q)}")
    interesting = spans([782537])
    skipanc = spans([])
    while q:
        # try testing 8 ranges, as long as parents
        th = 4
        if opt_skip_multi and len(q) > th:
            multiparents = []
            j = 0
            for i in range(th):
                if q[i]["has_root"]:
                    break
                h = q[i]["high"]
                if i > 0 and (h not in q[i - 1]["parents"]):
                    break
                multiparents += [p for p in q[i - 1]["parents"] if p != h]
                j = i + 1
            # Maybe track parent origins, so skip multi tests more efficiently?
            if j > 1:
                multiparents += q[j - 1]["parents"]
                high = q[0]["high"]
                f = get(high)
                mpdiff = [p for p in multiparents if get(p) != f]
                if not mpdiff:
                    skipmulti += 1
                    d(f"SKIP MULTI {j}")
                    for i in range(j):
                        q.popleft()
                    continue
        s = q.popleft()
        indent = s.get("indent", 0)
        low = s["low"]
        high = s["high"]
        parents = s["parents"]
        ppdiff = s.get("ppdiff", [])
        pintersect = [p for p in parents if p in ppdiff]
        hasroot = s["has_root"]
        level = s["level"]
        f = get(high)
        plen = len(parents)
        pdiff = [p for p in parents if get(p) != f]
        psame = [p for p in parents if get(p) == f]
        rs = roots & spansetfromrange(low, high)
        rdiff = [p for p in rs if get(p) != f]
        rsame = not rdiff
        visitbylevel[level] += 1
        action = None
        if low == high:
            if (not psame) and (
                plen > 0 or f is not None
            ):  # all parents are different, or no parents
                action = "TAKE  "
                out.append(low)
            else:
                action = "SKIP1 "
        # if not action and high in skippedparents:
        if not action and len(spansetfromrange(low, high) & skippedparents) == high - low + 1:
            action = "SKIPP "
        if not action and high in skipanc:
            action = "SKIPA "
        # skip ancestors?
        # (can be wrong optimization?)
        if opt_skip_none_ancestors and not action and f is None:
            anc = dag.ancestors([cl.node(high)])
            ancids = torevs(anc)
            skipanc = skipanc + ancids
        # skip it
        if opt_skip_any_psame and psame:
            # mark pdiff as "skipped", so we don't output reverted commits
            newskip = cl.torevs(cl.dag.only(cl.tonodes(pdiff), cl.tonodes(psame)))
            if newskip:
                d(f"      {' ' * indent}NEW SKIP: {newskip}")
                if newskip & interesting:
                    d(f"      {' ' * indent}  SKIP & INTERESTING: {newskip & interesting}")
                skippedparents = skippedparents + newskip
                psameskip += 1
        if not action and (((opt_skip_any_psame) and psame or len(psame) == plen) and rsame):
            # skip it
            action = "SKIPS "
            skipbylevel[level] += high - low + 1
            # but need to check the roots
            if rs:
                d(f"ROOTS {rs}")
                for r in rs:
                    if get(r) is not None:
                        out.append(r)
        # output it
        if not action and (low == high):
            if psame == 0:
                action = "TAKE0 "
                out.append(low)
            else:
                action = "SKIP0 "
        if not action:
            action = "SPLIT "
            splitbylevel[level] += 1
            if level > 0:
                subsegs = dagsegments(
                    tonodes(spansetfromrange(low, high)), maxlevel=level - 1
                )
                # filter by pdiff
                pdiffspans = spans(pdiff)
                for subseg in reversed(subsegs):
                    if (
                        all(p not in pdiffspans for p in subseg["parents"])
                        and not subseg["has_root"]
                    ):
                        # skip it
                        skipbypdiff[subseg["level"]] += 1
                    else:
                        # keep it, put it in pdiffspans
                        pdiffspans = pdiffspans + spansetfromrange(
                            subseg["low"], subseg["high"]
                        )
                        subseg["indent"] = indent + 1
                        subseg["ppdiff"] = pdiff
                        q.appendleft(subseg)
            else:
                # Dedicated L0 bisect
                if opt_dedicated_l0_bisect:
                    # high --- mid --- low ---- end
                    end = low
                    while high > end:
                        # Check bisect results... tend to be the merge?
                        d(f"BISEC{' ' * indent} {high} -- {low} {end}")
                        mid = (low + high) // 2
                        if get(mid) == get(high):
                            high = mid
                            low = (high + end) // 2
                        else:
                            if mid + 1 == high:
                                # output
                                d(f"TAKE {' ' * indent} {high}")
                                out.append(high)
                                high = mid
                                low = end
                            else:
                                low = mid
                else:
                    mid = (low + high) // 2
                    s1 = {
                        "low": mid + 1,
                        "high": high,
                        "parents": [mid],
                        "has_root": False,
                        "level": 0,
                        "indent": indent + 1,
                    }
                    s2 = {
                        "low": low,
                        "high": mid,
                        "parents": parents,
                        "has_root": not parents,
                        "level": 0,
                        "indent": indent + 1,
                    }
                    q.appendleft(s2)
                    q.appendleft(s1)
        d(
            f"{action}{' ' * indent}L{level} {high-low+1} {high} -> {low} (pdiff: {pdiff} / {plen}; rdiff: {rdiff} / {len(rs)})"
        )
        if pintersect:
            d(f"      {' ' * indent}PPDIFF: {pintersect}")
        iintersect = interesting & spansetfromrange(low, high)
        if iintersect:
            d(f"      {' ' * indent}INTERESTING: {iintersect}")
    repo.ui.write("skips:  %r\n" % (skipbylevel,))
    repo.ui.write("skippd: %r\n" % (skipbypdiff,))
    repo.ui.write("splits: %r\n" % (splitbylevel,))
    repo.ui.write("visits: %r\n" % (visitbylevel,))
    repo.ui.write("out[:12] %r %d\n" % (out[:12], len(out)))
    repo.ui.write("cache: %r\n" % (len(cache),))
    repo.ui.write("skipmult %d\n" % (skipmulti,))
    repo.ui.write("sameskip %d / %d\n" % (psameskip, psameblock))
    globals().update(locals())
    return
Reviewed By: DurhamG
Differential Revision: D33339407
fbshipit-source-id: 8f773ba465e36e896606548cdf71c38b1c31c147
Summary:
The git backend does not have filelog or linkrev to calculate pathcopies using
the existing code path. So let's skip it for now.
Reviewed By: DurhamG
Differential Revision: D33282042
fbshipit-source-id: 65a427f87d7c08db91ba0ef3d73da51663f3095c
Summary:
The old code paths might access filelog and linkrev to figure out what files to
show, which do not work in git.
Avoid them in the PathHistory log path. This simplifies stuff and makes `log -p`
work for git.
Reviewed By: DurhamG
Differential Revision: D33282043
fbshipit-source-id: 4b0a48d4bafbc2cb54742620d7cd4f05d2a0f3ba
Summary:
Make `hg log PATH` work in a git repo. Under the hood it uses the PathHistory
abstraction, which handles trees, multiple paths, and removed paths that are
difficult to handle without PathHistory.
Reviewed By: DurhamG
Differential Revision: D33280173
fbshipit-source-id: 6126a0f7498fb39e3b93f6ac44b443a681e467d0
Summary:
Now that we have a files2 endpoint that can return errors, let's add an
option to call it from the client.
Reviewed By: quark-zju
Differential Revision: D33283030
fbshipit-source-id: 3eda0fe870be1f4b74e0f0f60b11518c7bfe508f
Summary:
Currently fetching files from edenapi has no mechanism to report errors
to the client, so if the server can't find the key or hits an error, the client
just doesn't hear anything back.
As part of improving our network reliability and debuggability, let's enable
passing errors back. The commit APIs use a pattern involving a response object
that includes the request input (in files case a Key) and a Result, so let's
follow that same pattern. This will require a new files2 endpoint, which is
introduced in a subsequent diff.
In this diff, we just introduce the FileResponse type and convert the current
FileEntry response from files v1 into the new type so we can make the endpoint
swappable later.
D27549923 (82b689ad9d) has discussion on why to choose this pattern.
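The response pattern described above can be sketched in Python as follows. The real types live in the edenapi Rust code; the field names `data` and `error`, the `Key` fields, and the `fetch_files` helper here are illustrative assumptions only:

```python
from dataclasses import dataclass
from typing import Dict, Iterable, List, Optional


@dataclass(frozen=True)
class Key:
    path: str
    hgid: str


@dataclass
class FileResponse:
    key: Key                      # echoes the request input
    data: Optional[bytes] = None  # set on success
    error: Optional[str] = None   # set on failure

    def unwrap(self) -> bytes:
        # Plays the role of the Result: raise on error, else return the data.
        if self.error is not None:
            raise LookupError(f"{self.key.path}: {self.error}")
        return self.data


def fetch_files(store: Dict[Key, bytes], keys: Iterable[Key]) -> List[FileResponse]:
    """Return one response per requested key, so a miss is reported, not dropped."""
    responses = []
    for key in keys:
        if key in store:
            responses.append(FileResponse(key, data=store[key]))
        else:
            responses.append(FileResponse(key, error="key not found"))
    return responses
```

Because every response carries its key, the client can tell exactly which request failed.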
Reviewed By: quark-zju
Differential Revision: D33283031
fbshipit-source-id: ec2e34760ee47ead95964e2d33e0be4173bb4e77
Summary:
Made the PathHistory feature accessible via revset, or `repo.pathhistory`
if one does not want the rev number tech-debt.
It seems usable. For paths with long history, the bisect overhead becomes
significant and it can be much slower than simple traversal.
In linux.git (note `--time` excludes Python start up overhead):
% lhg log -r '_pathhistory(::.,"README")' --time >/dev/null
time: real 0.690 secs (user 0.550+0.000 sys 0.130+0.000)
% lhg log -r '_pathhistory(::master,"mm/damon")' --time >/dev/null
time: real 0.420 secs (user 0.320+0.000 sys 0.090+0.000)
% lhg log -r '_pathhistory(::master,"mm")' --time >/dev/null
time: real 9.760 secs (user 7.290+0.000 sys 1.840+0.000)
(17k commits in output)
% lhg log -r '_pathhistory(::master,"")' -T '{node}\n' >/dev/null
time: real 178.720 secs (user 167.540+0.000 sys 11.250+0.000)
(1060k commits in output)
Git:
% time git log README >/dev/null
0.46s user 0.12s system 99% cpu 0.576 total
% time git log mm/damon >/dev/null
0.63s user 0.13s system 99% cpu 0.770 total
% time git log mm >/dev/null
1.78s user 0.27s system 98% cpu 2.085 total
% time git log --format=%H >/dev/null
12.61s user 0.61s system 99% cpu 13.256 total
In fbsource, files with shallow paths and short history work okay:
The `tools/signedsource` has 8 changes. It takes about 1 second for a change
running from a devserver closer to the server.
Cold manifest cache + cold commit cache + semi-cold server cache:
% hg clone --configfile /etc/mercurial/repo-specific/fbsource.rc -U fb://fbsource /tmp/fbs1
% rm -rf /var/cache/hgcache/fbsource/manifests
% lhg --cwd /tmp/fbs1 log -r '_pathhistory(::master,"tools/signedsource")' --time >/dev/null
time: real 10.780 secs (user 1.120+0.000 sys 0.490+0.000)
Cold manifest cache + cold commit cache + semi-warm server cache:
% # same commands
time: real 6.850 secs (user 1.480+0.000 sys 0.790+0.000)
Cold manifest cache + warm commit cache + semi-warm server cache:
% rm -rf /var/cache/hgcache/fbsource/manifests
% lhg --cwd /tmp/fbs1 log -r '_pathhistory(::master,"tools/signedsource")' --time >/dev/null
time: real 1.330 secs (user 0.370+0.000 sys 0.160+0.000)
Warm local cache:
% lhg --cwd /tmp/fbs1 log -r '_pathhistory(::master,"tools/signedsource")' --time >/dev/null
time: real 0.260 secs (user 0.180+0.000 sys 0.090+0.000)
Files with long history or deeper paths can be slow. But the output is
streamed, so the progress is visible:
% lhg log -r '_pathhistory(::master,"fbcode/eden/scm/edenscm/mercurial/dispatch.py")' --time -T '{node|short} {desc|firstline}\n' > /tmp/log
time: real 169.840 secs (user 7.960+0.000 sys 3.690+0.000)
% wc -l /tmp/log
79
% lhg log -r '_pathhistory(::master,"fbcode/eden/scm/edenscm/mercurial/dispatch.py")' --time -T '{node|short} {desc|firstline}\n' > /tmp/dlog
time: real 1.220 secs (user 0.820+0.000 sys 0.390+0.000)
Reviewed By: DurhamG
Differential Revision: D33280174
fbshipit-source-id: 97334b3b110f21722be7be3dce095468a887a0d6
Summary: Provide access to PathHistory features. The main API is `__next__`.
Reviewed By: DurhamG
Differential Revision: D33280042
fbshipit-source-id: e89b970991efce04937ce22d92ad4bbcd495b85f
Summary: Implement the main history logic by visiting, skipping, and splitting segments.
Reviewed By: DurhamG
Differential Revision: D33265878
fbshipit-source-id: f3165752cf9fc8cd0bd245be9427769804f9e556
Summary: Those are individual algorithms used by upcoming changes.
Reviewed By: DurhamG
Differential Revision: D33265884
fbshipit-source-id: b09813df4fb6477c4f0aa60a853b68af026b43c7
Summary:
The problem is: given a list of paths, and a root tree, find
the content ids of the paths.
It can be a bit complex, if these are considered:
- avoid resolving common prefixes of paths multiple times
(ex. if paths are "a/b/c" and "a/b/d", only visit "a" and "a/b" once)
- prefetch in batches per tree depth
(need O(max tree depth) round-trips)
This module solves the problem. See the docstring for details.
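The two points above (shared prefixes, one batch per depth) can be sketched as below. This is an illustration, not the module's implementation: `resolve_paths`, the `(content id, is_tree)` entry shape, and the `fetch_trees` batch callback are all hypothetical:

```python
from typing import Callable, Dict, List, Optional, Tuple

Entry = Tuple[str, bool]  # (content id, is_tree)
Tree = Dict[str, Entry]   # child name -> entry


def resolve_paths(
    root_id: str,
    paths: List[str],
    fetch_trees: Callable[[List[str]], Dict[str, Tree]],
) -> Dict[str, Optional[str]]:
    """Resolve each path to its content id, one batched fetch per tree depth."""
    result: Dict[str, Optional[str]] = {p: None for p in paths}
    # frontier: directory prefix -> tree id to read at the current depth.
    frontier = {"": root_id}
    # pending: directory prefix -> (full path, remaining components).
    pending = {"": [(p, p.split("/")) for p in paths]}
    while frontier:
        # A shared prefix appears once in the frontier, so it is fetched once.
        trees = fetch_trees(sorted(set(frontier.values())))
        next_frontier: Dict[str, str] = {}
        next_pending: Dict[str, list] = {}
        for prefix, tree_id in frontier.items():
            tree = trees.get(tree_id, {})
            for full, parts in pending[prefix]:
                entry = tree.get(parts[0])
                if entry is None:
                    continue  # path does not exist; result stays None
                child_id, is_tree = entry
                if len(parts) == 1:
                    result[full] = child_id
                elif is_tree:
                    child_prefix = prefix + parts[0] + "/"
                    next_frontier[child_prefix] = child_id
                    next_pending.setdefault(child_prefix, []).append((full, parts[1:]))
        frontier, pending = next_frontier, next_pending
    return result
```

For paths "a/b/c" and "a/b/d" this makes three `fetch_trees` calls (root, "a", "a/b"), matching the O(max tree depth) round-trips noted above.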
Reviewed By: DurhamG
Differential Revision: D33339416
fbshipit-source-id: 3400e17799b42cf489576228bd486a671ddaaa5f
Summary:
The path history area is problematic in multiple ways:
- Visiting commits and checking their trees following commit graph can be too
slow.
- The "fastlog" service in Mononoke can provide faster path history for a
single file or a single directory. However, it requires complex infra to
maintain the indexes and does not handle following multiple paths nicely.
- hg's linkrev is a tech-debt we'd like to remove but it's not easy to do so.
Partially because a replacement will need new storage and protocol design,
and might face questions like offline UX, etc.
This crate is an attempt to tackle problems above:
- Visit segments and skip segments aggressively.
- If we use commit graph and regular tree reads, then there is no need for an
external service, or hg's linkrev.
with one main downside caused by bisect:
- "change + revert" might cancel out. History can be incomplete.
But that downside seems acceptable considering the other wins.
This diff adds the crate with some high level comments.
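To make the skip idea and its downside concrete, here is a sketch restricted to a linear history, where `content_ids[i]` is the path's content id at rev `i`. The crate works on commit graph segments; `changed_revs` and this linear simplification are assumptions for illustration only:

```python
from typing import List, Optional


def changed_revs(content_ids: List[Optional[str]]) -> List[int]:
    """Return revs that changed the path, newest first, skipping ranges by bisect."""
    out: List[int] = []

    def visit(low: int, high: int) -> None:
        # Content just before the range (None: the path did not exist yet).
        before = content_ids[low - 1] if low > 0 else None
        if before == content_ids[high]:
            # Same content at both ends: skip the whole range without reading
            # anything inside it. A change followed by a revert is lost here.
            return
        if low == high:
            out.append(low)
            return
        mid = (low + high) // 2
        visit(mid + 1, high)  # newer half first
        visit(low, mid)

    if content_ids:
        visit(0, len(content_ids) - 1)
    return out
```

For `["a", "a", "b", "b", "c"]` this reports revs 4, 2 and 0. For `["a", "a", "b", "a"]` it reports only rev 0: the range covering revs 2..3 starts and ends with "a", so the change and its revert cancel out.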
Reviewed By: DurhamG
Differential Revision: D33265881
fbshipit-source-id: e29be3d8e9fa8cd9f011144a7104429edcb25ec4
Summary:
Instead of passing just the old and new master nodes, pass a list to support
more complicated cases.
Reviewed By: DurhamG
Differential Revision: D33594530
fbshipit-source-id: 2087d4fce79eb5cff3c1d381cfc82f9bd6ad89c4
Summary:
debughiddencommit can produce visible ephemeral commits if the backup
fails (like if certs are invalid). Let's ensure we make them invisible even in
the case of a backup error.
https://fb.workplace.com/groups/asic.infra/posts/1314042865686399/
Reviewed By: mrkmndz
Differential Revision: D33668078
fbshipit-source-id: 6df48709ef183afa229f96fa7a526c479e8b4c0a
Summary:
The phrasing implied that "update --clean" would only discard the
conflicting files, but in reality it discards everything. Let's make the message
clearer.
Reviewed By: quark-zju
Differential Revision: D33662474
fbshipit-source-id: 60aeb7db72d45e894d959d9f83285f34132c603b
Summary:
Some git repos (ex. linux.git) contain non-utf8 commit messages.
These crash `ctx.description()` in various places (ex. parsing
Phabricator URL, showing commit message template, etc.)
Let's just use the `from_utf8_lossy` function from Rust to avoid
such encoding issues.
This does not change the write path of git commits.
Reviewed By: DurhamG
Differential Revision: D33280048
fbshipit-source-id: bf6abbcf0aaf48ec2593c78756a1892cdc556e93
Summary: `dedup` removes duplicated items in a list while maintaining the item order.
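A minimal sketch of such a helper, assuming the items are hashable (the actual implementation in this change may differ):

```python
def dedup(items):
    """Remove duplicates, keeping the first occurrence of each item in order."""
    # dict preserves insertion order, so the keys come back in first-seen order.
    return list(dict.fromkeys(items))
```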
Reviewed By: DurhamG
Differential Revision: D33486862
fbshipit-source-id: c891922826d9f3fb3b7300a407791345d75b4b6c
Summary: Fix some doctests failing on Python 3. Most of them are encoding issues.
Reviewed By: DurhamG
Differential Revision: D33486863
fbshipit-source-id: 0258a0b6306718a33e1d966e8e3c3a465f183cc2
Summary: Fix some doctests failing on Python 3. Most of them are encoding issues.
Reviewed By: DurhamG
Differential Revision: D33486866
fbshipit-source-id: 5f47dc4f773431022cc4976f7a3e91c77eb99809
Summary:
They are no longer used. This also avoids issues fixing their doctest, which is
failing on Python 3 due to encoding issues.
Reviewed By: DurhamG
Differential Revision: D33486867
fbshipit-source-id: 66186f39c6aa19f2eada8dc6e4b751871debe126
Summary: Drop dependency on fancyopts so we can remove it and its broken doctests.
Reviewed By: DurhamG
Differential Revision: D33489535
fbshipit-source-id: ce491526bfedba909a5391bad5cc21af82b3db12
Summary:
Drop dependency on fancyopts so we can remove it and its broken doctests.
The error type is slightly different, which affects the tests.
Reviewed By: DurhamG
Differential Revision: D33489537
fbshipit-source-id: 6aa680227f80536ba2573e77a9a0caf26131c0ee
Summary:
The Python codebase wants `opts['foo_bar']` for flag `foo-bar`. Previously this
normalization happens in `dispatch.py`. Move it to `pycliparser` so `pycliparser`
can be used in more places.
This will be used to replace fancyopts, so doctest in fancyopts can be deleted.
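The normalization itself is small. A sketch, with `normalize_opts` as a hypothetical name rather than the `pycliparser` function:

```python
def normalize_opts(opts):
    """Rename flag keys like "foo-bar" to "foo_bar", keeping the values."""
    return {name.replace("-", "_"): value for name, value in opts.items()}
```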
Reviewed By: danielocfb
Differential Revision: D33489539
fbshipit-source-id: ca6a23dde3408a9bfa07557b8ba16cbe1d546ab1
Summary:
The only callsite in dispatch.py actually provides the global flags. So there is
no need to append the global flags again. This makes `parsecommand` more flexible.
Reviewed By: danielocfb
Differential Revision: D33489536
fbshipit-source-id: 05d623a3ec51d585d9307cb5660d841d6d222bc2
Summary:
Previously parsecommand only takes 3-item tuples. Practically it could be 4 or
5 items. Handle them so parsecommand is easier to use.
Reviewed By: danielocfb
Differential Revision: D33489538
fbshipit-source-id: 568207d55a6a55b68d2a54a2aeef6a34f9603a5c
Summary: In uiconfig.py, we now use parselist from Rust. Let's just drop the Python configlist implementation.
Reviewed By: DurhamG
Differential Revision: D33486865
fbshipit-source-id: 4633b93e49b634dd0f20c2acad756957e29d4ab5
Summary: I just added the ignorematcher but didn't clean up all its code from my first stab. In particular, don't reference _matchers directly.
Reviewed By: quark-zju
Differential Revision: D33590121
fbshipit-source-id: f50dfcdddcc2a9ba52b53e5c6aff8b9170c9bcc7
Summary:
We were seeing two mmap's open for a given indexedlog. It turns out the
Index OpenOptions keeps a reference to the original key_buf, which meant we held
onto the original mmap forever.
This diff stops doing that.
Props to xavierd for noticing the double mmaps.
Reviewed By: quark-zju
Differential Revision: D33594545
fbshipit-source-id: f1ac3f6752886971a0f325874ac581f937234a4d
Summary:
When using curses during interactive revert, we now properly handle the transition to or from having no newline at the end of file. We do this by peeking ahead one line and trimming the apparent newline if the next line is "No newline at end of file".
This is motivated by upstream https://phab.mercurial-scm.org/D8762, but that test didn't exercise the bug for me, and the code change didn't work properly when reverting back to the no-newline case.
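The peek-ahead can be sketched like this. It is a simplified stand-in for the curses record code: `hunk_body_lines` is a hypothetical name and it operates on a plain list of hunk lines:

```python
NO_NEWLINE_MARKER = "\\ No newline at end of file"


def hunk_body_lines(lines):
    """Drop the marker lines and the trailing newline each marker refers to."""
    out = []
    for i, line in enumerate(lines):
        if line.startswith(NO_NEWLINE_MARKER):
            continue  # the marker is not part of the file content
        # Peek ahead one line: the marker applies to the line before it.
        nxt = lines[i + 1] if i + 1 < len(lines) else ""
        if nxt.startswith(NO_NEWLINE_MARKER) and line.endswith("\n"):
            line = line[:-1]
        out.append(line)
    return out
```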
Reviewed By: quark-zju
Differential Revision: D33541600
fbshipit-source-id: 6e605fe2f6017baad0aa8232313a209f68fc871c
Summary: This adds two columns that show the current download and upload speed for each process in either Kb/s or Mb/s
Reviewed By: quark-zju
Differential Revision: D33557150
fbshipit-source-id: f279904d78ac1e06a9bf1d3c286e3af7285b73a9
Summary:
This adds a column that shows whether a process is running. It shows `RUNNING` if the process is currently executing, or `TERMINATED (n)` if the process has finished. Here `n` is the exit code of the process (e.g. 0). Processes that terminate before debugtop starts running are not shown.
An option is also added to the command for controlling how long a terminated process is shown after it finishes.
Reviewed By: quark-zju
Differential Revision: D33522251
fbshipit-source-id: f8444298155aabecf4a33387b6ca56b67068367a
Summary:
This appears to have broken lfs fetching on a lot of laptops. https://fb.workplace.com/groups/mercurialusers/posts/4639345729448344/
Original commit changeset: 1447c880c767
Original Phabricator Diff: D33506809 (e791747460)
Reviewed By: quark-zju
Differential Revision: D33588115
fbshipit-source-id: d8aee673a582d22124f4354f58829fdd186ea33c
Summary: It will be used by pathhistory.
Reviewed By: DurhamG
Differential Revision: D33339951
fbshipit-source-id: dbb1bd509cce2fb54bc7f8d392ab8bcb11788e03
Summary: Hack things up so the sparse "ignore" matcher delegates to the gitignorematcher's "explain" method. This allows the "debugignore" command to give more useful information about why a particular file is ignored.
Reviewed By: quark-zju
Differential Revision: D33586208
fbshipit-source-id: 51bf69f39dbba2c724e9ec28211d3bc0b6c9b0fd
Summary:
This allows users to create a new snapshot reusing the latest snapshot's storage. This allows uploading very similar snapshots even faster (after the bugfix on D33096364).
There is already an optimisation to avoid re-uploading files. However, it still performs a get/put on the server side. This allows bypassing it altogether. The tradeoff is that the same bubble is used so we don't extend the lifetime.
In the future, we want to avoid the get/put on server side either way, but that needs a bunch more work.
Reviewed By: markbt
Differential Revision: D33098266
fbshipit-source-id: 94baad6d1db516a6300963d240c354c86a90fc05
Summary:
hg (and edenfs) spend a significant amount of time hashing lfs data when reading from the indexedlog store. Indexedlog already has checksums for each chunk of data, so, assuming the data was correct when inserted, we only need to verify the total content size (i.e. that we have all the chunks). In a previous commit I added content verification when inserting lfs data into indexedlog, and this commit introduces a config flag to remove the verification when reading. We still check the blob's total size which covers the case of missing chunks.
Note that there is risk if existing indexedlog lfs entries are invalid since this commit removes the validity check. I added a trace point I will use to verify that we don't currently get hash mismatches in practice.
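The read path described here can be pictured with the sketch below. It is illustrative Python rather than the actual store code; `read_blob`, `CorruptBlobError`, and the `verify_hash` flag are hypothetical names standing in for the config flag, and sha256 is assumed as the lfs content hash.

```python
import hashlib


class CorruptBlobError(Exception):
    """Raised when a blob read back from the store fails validation."""


def read_blob(chunks, expected_size, expected_sha256, verify_hash=False):
    """Reassemble a blob; always check its size, hash only when asked to."""
    data = b"".join(chunks)
    # Cheap check that still catches missing chunks.
    if len(data) != expected_size:
        raise CorruptBlobError(f"size mismatch: {len(data)} != {expected_size}")
    # Expensive check, skipped by default in this sketch.
    if verify_hash and hashlib.sha256(data).hexdigest() != expected_sha256:
        raise CorruptBlobError("content hash mismatch")
    return data
```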
Reviewed By: DurhamG
Differential Revision: D32444377
fbshipit-source-id: 8f2e857c0d88c00687500ad107b3a5ebc79956d6
Summary: We now have an explicit check verifying the blob's size when reading from the indexedlog. This is redundant with the current content hash verification, but I'm preparing to remove the content hashing on read.
Reviewed By: DurhamG
Differential Revision: D32444383
fbshipit-source-id: 79868175563621e234f2a7e8055afc83fad56f12
Summary:
Currently we verify the content hash every time we read lfs data from indexedlog, but this is slow. Instead, we can verify once at insertion time and skip the verification when reading. This commit adds the insertion time verification.
Note that this introduces an error if the hash mismatches where previously the data would be inserted and silently ignored when reading.
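In outline, the insertion-time check behaves like the following sketch. The dict-backed `store` and the `insert_blob` name are stand-ins for illustration, and sha256 is assumed as the content hash.

```python
import hashlib


def insert_blob(store, expected_sha256, data):
    """Verify content once at insertion; refuse to store mismatching data."""
    actual = hashlib.sha256(data).hexdigest()
    if actual != expected_sha256:
        # Previously such data would have been stored and ignored on read.
        raise ValueError(f"hash mismatch: expected {expected_sha256}, got {actual}")
    store[expected_sha256] = data
```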
Reviewed By: DurhamG
Differential Revision: D32444380
fbshipit-source-id: c096263f7a13279e21e31216a3a7132c52e630f2
Summary: I accidentally triggered this code path in a test and was seeing an exception: "Expected type that converts to PyBytes but received str". Fix by calling "encode()" on the python strs.
Reviewed By: quark-zju
Differential Revision: D32444381
fbshipit-source-id: cac33c4b4b06ecf71329bbd8746fbfef7f5be1ad
Summary:
The sampling layer writes configured tracing events out to the hg sampling file. To make things play well with our existing TracingCollector, I added a tracing Filter for the sampling layer.
Note that the sampling Layer will not work properly if you set EDENSCM_LOG or LOG, since that activates the EnvFilter, which does not respect per-Layer filtering and will filter events before they make it to the sampling layer.
Reviewed By: quark-zju
Differential Revision: D32444378
fbshipit-source-id: 6eeb782b4a8c0aa6e9b19fc319ca7663d4cf45d8
Summary: "sampling" refers to the python hg feature where certain ui.log keys can be marked for export to a specified file. The scm telemetry wrapper shuffles the file contents off to scuba. In rust, we now have a tracing Layer that implements the event export format. It matches tracing events using the "target" metadata attribute since that can be checked statically before the event is instantiated. Note that I am not currently taking advantage of that, but will in a following commit.
Reviewed By: quark-zju
Differential Revision: D32444379
fbshipit-source-id: c5d9fd5e28271656082d82f6584925b304ab02eb
Summary: Use LevelFilter instead of implementing Layer::enabled. This way the filtering only applies to this Layer rather than all Layers. This is in preparation for adding another Layer.
Reviewed By: quark-zju
Differential Revision: D32444382
fbshipit-source-id: cd6a78d33d1de91ab41c92b7f76895b9d335a80f
Summary: The added tests allow testing most parts of debugtop without compiling the entirety of hg or running its integration test.
Differential Revision: D33485457
fbshipit-source-id: 8ec37322ec04b4a73d4b4e2c0f053d5206e224d1
Summary: This moves most of the contents of debugtop to another crate in order to improve modularity and compile times.
Differential Revision: D33484736
fbshipit-source-id: 15df453fc3b3e263878779998767d31aa885640a
Summary:
If there's a hard reboot, the backup file could be empty and this
version check could throw an index error. Let's handle that gracefully.
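A minimal sketch of the graceful handling, assuming (hypothetically) that the version is an integer on the first line of the backup file; `read_backup_version` is not the real function name:

```python
def read_backup_version(text):
    """Return the version from the first line, or None if the file is empty/invalid."""
    lines = text.splitlines()
    if not lines:
        # An empty file (e.g. after a hard reboot) would otherwise raise IndexError.
        return None
    try:
        return int(lines[0].strip())
    except ValueError:
        return None
```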
Differential Revision: D33553319
fbshipit-source-id: de2fec48766d9f7e75adaf3d1642b48a09d67cf3
Summary: This will stop us reading on-disk certs for lfs.
Reviewed By: farnz
Differential Revision: D33506809
fbshipit-source-id: 1447c880c767106e85994ff1c419e90d843d82eb
Summary:
With fastcopytrace, attempting to rebase a directory rename over a file rename
(or vice versa) is not successful, as the copy source that fastcopytrace comes
up with doesn't exist in the rebase source commit.
Currently this crashes with an obscure `[copy source filename] not found in manifest`
error. We can do better: if the copy source doesn't exist in the source
manifest at the point where we are merging, we can treat this as a conflict.
The user can either resolve this manually (by renaming the file to the new
destination), or they can try again with full copytrace, which should succeed.
While it's not strictly accurate, we treat this as a "change/delete" conflict,
as there is no "rename/delete" conflict type.
Reviewed By: DurhamG
Differential Revision: D33259386
fbshipit-source-id: 321f1942b0e31c3d97a4c4a32ee1eae6b6a740ce
Summary:
Add a test that demonstrates that fastcopytrace fails when a file that is
renamed later in the stack is renamed in the base commit, and then the rest of
the stack is restacked.
Reviewed By: quark-zju
Differential Revision: D18170733
fbshipit-source-id: 89c12abd8da598e07cf1b32ada11ac013a1945b0
Summary:
D33159847 (03a71ef9db) is unsound. The hg tree format is "file name + ... + hex hash", not
"hex hash" first. So a file name containing spaces would cause the data to be
incorrectly treated as git format.
Fix it by passing the format from store explicitly to `Entry`.
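To illustrate why sniffing the content is fragile, here is a sketch in Python. `guess_is_git` is a made-up heuristic for demonstration and is not the check from D33159847; the entry layouts assumed are `name\0hexhash...` for hg and `mode name\0binaryhash` for git.

```python
def guess_is_git(data):
    """Hypothetical sniffing: git entries have a space (after the mode) before the NUL."""
    head = data.split(b"\0", 1)[0]
    return b" " in head


def first_entry_name(data, fmt):
    """Parse the first entry's name using an explicitly supplied format."""
    head = data.split(b"\0", 1)[0]
    if fmt == "git":
        _mode, name = head.split(b" ", 1)
        return name
    return head


hg_entry = b"my file.txt\0" + b"a" * 40 + b"\n"
git_entry = b"100644 a.txt\0" + bytes(20)
```

With these inputs the heuristic misclassifies `hg_entry` as git because of the space in the file name, while passing the format explicitly parses both entries correctly.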
Reviewed By: DurhamG
Differential Revision: D33534887
fbshipit-source-id: 31f12cc082f62b24794a46675efcdbf92c2551d5
Summary:
When creating a transaction we now automatically clean up an existing (abandoned) transaction if it is empty. This seems safe since recover() should be a no-op (other than cleaning up the tx files).
I've seen multiple cases of empty transaction files due to commands crashing/being killed in a transaction (but before anything has been written).
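The idea can be sketched as below. This is a simplified illustration with a hypothetical helper name and a single journal file; the real code deals with more transaction files and presumably runs under the repo lock.

```python
import os


def cleanup_empty_journal(journal_path):
    """Remove an abandoned journal only if it is empty; return True if removed."""
    try:
        if os.path.getsize(journal_path) == 0:
            os.unlink(journal_path)
            return True
    except FileNotFoundError:
        # No abandoned transaction at all.
        pass
    return False
```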
Reviewed By: quark-zju
Differential Revision: D33482320
fbshipit-source-id: a6ef74a30de96c600385a701ab2ab61bb149afb9
Summary:
This change fixes a bunch of typos that I stumbled upon reading through code and
documentation.
Reviewed By: quark-zju
Differential Revision: D33511166
fbshipit-source-id: 185ce3ac9dd2311d757fc2a3859b63c253f44dd2