Commit Graph

13758 Commits

Author SHA1 Message Date
Jun Wu
e6d1d7fa2b bookmarks: preserve order of selective pull bookmarks
Summary: Instead of returning a set, return a list with maintained order.
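
As a minimal sketch of the idea (not the actual change), duplicates can be dropped while keeping first-seen order. The function name `ordered_unique` and the bookmark names are hypothetical.

```python
def ordered_unique(names):
    """Return names without duplicates, keeping first-seen order."""
    # A dict preserves insertion order (Python 3.7+); a set does not.
    return list(dict.fromkeys(names))


bookmarks = ordered_unique(["master", "stable", "master", "release"])
print(bookmarks)
```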

Reviewed By: danielocfb

Differential Revision: D33486861

fbshipit-source-id: 36e7703f4b9f13994bc75ca0f6f243e7d3abfeb6
2022-01-20 10:21:48 -08:00
Jun Wu
ef44f12024 git: avoid using FETCH_HEAD
Summary:
FETCH_HEAD is a bit noisy in the `pull` output, especially with upcoming
changes. It's currently only used to figure out `pull --update` or the initial
checkout commit during clone.  Let's do that without using FETCH_HEAD.

Reviewed By: DurhamG

Differential Revision: D33485187

fbshipit-source-id: dab92d3db677fd77a06022f50f499dd8c9ef000f
2022-01-20 10:21:48 -08:00
Jun Wu
6ba3152ea6 git: reuse pull function in clone
Summary:
Previously, clone ran `git fetch` on its own. Make it reuse the `pull` method so
upcoming changes are easier.

Reviewed By: danielocfb

Differential Revision: D33485188

fbshipit-source-id: 9f215fcb7a013cd947c7fce263999911446dd5c4
2022-01-20 10:21:47 -08:00
Jun Wu
9157671124 git: handle null id
Summary:
Reading a null OID causes an error in libgit2:

  odb: cannot read object: null OID cannot exist; class=Odb (9); code=NotFound (-3)

libgit2 special-cases the null id, but hg sometimes wants to read null.
Let's just handle it in the gitstore.
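
A rough sketch of that kind of special-casing, under stated assumptions: `NullAwareStore` and `DictBackend` are hypothetical names, and returning empty bytes for the null id is an assumption, not something the commit message states.

```python
NULL_ID = b"\0" * 20  # 20-byte all-zero id, the conventional "null" node


class DictBackend:
    """Stand-in backend that, like libgit2, refuses the null id."""

    def __init__(self, objects):
        self._objects = objects

    def read(self, oid):
        if oid == NULL_ID:
            raise KeyError("null OID cannot exist")
        return self._objects[oid]


class NullAwareStore:
    """Hypothetical wrapper: answers null-id reads itself, delegates the rest."""

    def __init__(self, inner):
        self._inner = inner  # any object with a read(oid) -> bytes method

    def read(self, oid):
        if oid == NULL_ID:
            # Assumption: null reads yield empty content instead of an error.
            return b""
        return self._inner.read(oid)


store = NullAwareStore(DictBackend({b"\x01" * 20: b"hello"}))
print(store.read(NULL_ID), store.read(b"\x01" * 20))
```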

Reviewed By: markbt

Differential Revision: D33485189

fbshipit-source-id: ed96921353ef592750fe361eee06ebcc0a25f07b
2022-01-20 10:21:47 -08:00
Jun Wu
d1be9f5546 merge: do not pass hex node to filecontext
Summary:
Pass binary nodes instead, since the `node` variable is expected to be binary.
This was caught by the `assert len(node) == 20` in gitfilelog.

With this diff, rebase with merging now seems to work for git.
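
To illustrate the hex versus binary distinction, here is a small sketch. The helper `to_binary_node` is hypothetical and not part of the codebase; only the 20-byte length check comes from the message above.

```python
def to_binary_node(node):
    """Accept a 40-character hex string or 20 raw bytes; return 20 raw bytes."""
    if isinstance(node, str):
        node = bytes.fromhex(node)
    # Same invariant as the assertion mentioned above.
    assert len(node) == 20, "expected a 20-byte binary node"
    return node


hex_node = "ab" * 20  # arbitrary 40-character hex string
bin_node = to_binary_node(hex_node)
print(len(hex_node), len(bin_node))
```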

Reviewed By: danielocfb

Differential Revision: D33372099

fbshipit-source-id: 58b36d13337b9557632e9146f7ff7f963531940c
2022-01-20 10:21:47 -08:00
Jun Wu
b9f29e42bb init: support --git
Summary: Add a way to init an empty repo backed by git without cloning.

Reviewed By: danielocfb

Differential Revision: D33372098

fbshipit-source-id: ef5758c8289f0cf8eea830e1c5c9ba3af30135b2
2022-01-20 10:21:47 -08:00
Jun Wu
55d98bd7b4 commitcloud: avoid crash if neither reponame nor path is set
Summary:
The code `os.path.basename(repo.ui.config("paths", "default"))` will crash if paths.default is not set.

There is no need to read paths.default for reponame manually after D32570353 (30f98e1fad). So let's just remove it.

This makes `cloud sync` show the right error messages without crashing in a git
repo.
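
The failure mode can be reproduced in isolation, assuming the config lookup returns `None` when the value is unset. The guard function `reponame_from_path` is purely illustrative; the actual fix removed the manual read instead of guarding it.

```python
import os.path


def reponame_from_path(default_path):
    """Hypothetical guard: derive a name only when a path is configured."""
    if not default_path:
        return None
    return os.path.basename(default_path)


# Unguarded, passing None (an unset config value) raises TypeError.
try:
    os.path.basename(None)
    crashed = False
except TypeError:
    crashed = True

print(crashed, reponame_from_path(None), reponame_from_path("ssh://host/repo"))
```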

Reviewed By: markbt

Differential Revision: D33370562

fbshipit-source-id: 8e893a81bb012876ccc57cde44b2549f81af789c
2022-01-20 10:21:47 -08:00
Jun Wu
f3c6c85d4c git: do not fetch tags
Summary:
There isn't a good tag UX in hg. Too many tags can also slow down
graph syncing. Let's just skip the tags in clone and pull for now.

Maybe we can treat them like lazy remote names in the future.

Reviewed By: danielocfb

Differential Revision: D33370564

fbshipit-source-id: 311f4d6367cd952c7c7b0686fab3229e2de2c0b0
2022-01-20 10:21:47 -08:00
Jun Wu
23ff84fcdb git: do not crash on bookmarks --remote
Summary:
Previously, `bookmarks --remote` connected to the remote with the listkeys
capability without checking whether the remote is valid. That crashed in a
git repo where `[paths]` is empty. Fix it by falling back to local remotenames
listing.

Reviewed By: DurhamG

Differential Revision: D33370563

fbshipit-source-id: 2626a57dd39d35fd0205af79ffa02c5587490c69
2022-01-20 10:21:47 -08:00
Jun Wu
ff71611134 manifest-tree: do not crash seeing git submodules
Summary:
Support git submodules in the trees (serialization + deserialization).
This enables reading git trees with submodules without errors, and
modifying the trees without losing the submodules entries.

The added "submodule" type forces downstream users to update.
Namely, `checkout` is updated to silently ignore them.

Reviewed By: DurhamG

Differential Revision: D33369607

fbshipit-source-id: c2bce1882df3958b272dd8a6dcc3e4052f2704f5
2022-01-20 10:21:46 -08:00
Jun Wu
f94dc779ab vfs: Option<UpdateFlag> -> UpdateFlag
Summary:
Use an enum variant for regular files so `Option` becomes unnecessary.
This makes the type simpler and makes upcoming changes cleaner.

Note: The type now matches `manifest::FileType` but `manifest::FileType`
will be changed in upcoming changes.

Reviewed By: danielocfb

Differential Revision: D33369606

fbshipit-source-id: 2b01d74875f97675d79082fe39bf4bdea26fdb07
2022-01-20 10:21:46 -08:00
Jun Wu
484e37311e git: support '{files}' template
Summary:
Change `ctx.files()` to run diff for git repos. This makes a few revsets
and templates work.

Reviewed By: danielocfb

Differential Revision: D33354722

fbshipit-source-id: c6449d7f405f2d2c568294eeebe47d371e46cb12
2022-01-20 10:21:46 -08:00
Jun Wu
471b5a49a4 git: support simple push
Summary:
Support pushing to a git repo using the `git push` command.
This is implemented in remotenames' `push`, since that's
what is used in production. The vanilla `push` is unchanged.

The `%include builtin:git.rc` in hgrc should have remotenames
enabled.

Reviewed By: DurhamG

Differential Revision: D33351380

fbshipit-source-id: f1e2dfd64168b83d1cf608c0490e56634688da15
2022-01-20 10:21:46 -08:00
Jun Wu
2209a1c032 treemanifest: do not prefetch trees if paths.default is not set
Summary: There is no need to prefetch trees if there is no remote repo set.

Reviewed By: DurhamG

Differential Revision: D33351378

fbshipit-source-id: 13a0dc56b712bc6be5373958b763febc1652e76b
2022-01-20 10:21:46 -08:00
Mark Juggurnauth-Thomas
0a771a579e mononoke_app: add new CLI library based on Clap 3
Summary:
Start the introduction of a new CLI library for Mononoke binaries
based on Clap 3, using `structopt`-style structures for specifying
arguments.

This adds an example binary which just prints out the repo identity,
and fills in the minimum amount of command line options required to
get Mononoke started.

There are many TODOs left in the code: these will be addressed by
subsequent diffs.

Reviewed By: mitrandir77

Differential Revision: D33622581

fbshipit-source-id: 553c09718146c5d4bcb3281476934c1f8082b94d
2022-01-20 09:34:23 -08:00
Harvey Hunt
34df634f61 mononoke: integration: Don't start mononoke server for hook_tailer test
Summary:
The hook_tailer just needs a repo to be blobimported, however it also
starts a Mononoke server.

Reviewed By: farnz

Differential Revision: D33658629

fbshipit-source-id: b79975bfc0700b19987f818ae404d82249b96d19
2022-01-20 04:07:49 -08:00
Egor Tkachenko
78c19c7a79 Add segmented changelog tailing to the repo_import tool
Summary: Let's populate segmented changelog with newly imported commits before merging them into the target repo. This way the repo will be ready to use right after merge.

Reviewed By: farnz

Differential Revision: D33654925

fbshipit-source-id: 1c00d280f429e6c3feb36f133cc71215311cf5de
2022-01-20 01:30:59 -08:00
Jun Wu
e4d5bac63f cpython-ext: fix a typo
Summary: Found it when I was reading the code.

Reviewed By: DurhamG

Differential Revision: D33458497

fbshipit-source-id: 06cc1e3d2afbcd345ba9b0471546aa5719106e77
2022-01-19 21:19:58 -08:00
Jun Wu
80845a4fb9 git: support simple clone
Summary: Support cloning a git repo using `git+` URLs.

Reviewed By: DurhamG

Differential Revision: D33351383

fbshipit-source-id: 684ac78111b201e44256b0dfebe2aa52d46f7693
2022-01-19 17:39:11 -08:00
Jun Wu
cf612fcd26 configparser: support "%include builtin:git.rc"
Summary:
The `%include builtin:git.rc` will be written to repo `hgrc` when creating
repos backed by git. The main benefit is that we can update the builtin
git.rc without migrating existing hgrc files in the future.
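
A toy sketch of how a `builtin:` include could be expanded from an in-memory table. The function `expand_includes`, the `BUILTINS` table, and the sample contents of `git.rc` are all hypothetical; the real config parser is not shown in this log.

```python
# Hypothetical builtin config table; the real git.rc contents are not shown here.
BUILTINS = {"git.rc": "[extensions]\nremotenames=\n"}

PREFIX = "%include builtin:"


def expand_includes(text, builtins=BUILTINS):
    """Replace each '%include builtin:NAME' line with the builtin's lines."""
    out = []
    for line in text.splitlines():
        if line.startswith(PREFIX):
            name = line[len(PREFIX):].strip()
            # Unknown builtin names expand to nothing in this sketch.
            out.extend(builtins.get(name, "").splitlines())
        else:
            out.append(line)
    return "\n".join(out)


hgrc = "%include builtin:git.rc\n[ui]\nusername=x"
print(expand_includes(hgrc))
```

Because the repo's `hgrc` stores only the include line, changing the builtin table changes the effective config without rewriting any existing file.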

Reviewed By: DurhamG

Differential Revision: D33352777

fbshipit-source-id: 9b6904e4bc0eff53bf623ce9ae382d723da1213e
2022-01-19 17:39:10 -08:00
Jun Wu
fe6348cc0e configparser: add some tracing messages
Summary:
Sometimes it's unclear why a config is ineffective. This makes it
possible to figure out what's going on.

Reviewed By: DurhamG

Differential Revision: D33352778

fbshipit-source-id: bd7d591200e6a15a5e09f2057cc9303e5f1c344c
2022-01-19 17:39:10 -08:00
Jun Wu
ee77518bb0 git: support simple pull
Summary: Delegate the pull work to `git fetch` to complete its task.

Reviewed By: DurhamG

Differential Revision: D33351385

fbshipit-source-id: 16af5312d805d53f3ba1275baae3137357134ee7
2022-01-19 17:39:10 -08:00
Jun Wu
adac9e8d29 git: add a way to read git config
Summary: Use `git config -l` to read git config.
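
A minimal sketch of parsing that output, assuming the plain `key=value` per-line format. The function `parse_git_config_list` and the sample text are hypothetical, and values containing newlines are not handled here.

```python
def parse_git_config_list(output):
    """Parse `git config -l` style text into {key: [values, ...]}."""
    config = {}
    for line in output.splitlines():
        if not line:
            continue
        # Each line is "section.key=value"; a key without "=" has no value.
        key, sep, value = line.partition("=")
        config.setdefault(key, []).append(value if sep else None)
    return config


sample = (
    "user.name=Alice\n"
    "remote.origin.url=https://example.com/repo.git\n"
    "remote.origin.fetch=+refs/heads/*:refs/remotes/origin/*\n"
)
parsed = parse_git_config_list(sample)
print(parsed["remote.origin.url"])
```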

Reviewed By: DurhamG

Differential Revision: D33351381

fbshipit-source-id: 6205f2252035b54edfb8f6d50411fdd7b8fb2c57
2022-01-19 17:39:10 -08:00
Jun Wu
94eb4d41fa dev-logger: make it easier to use
Summary:
dev-logger is used for tracing logs in tests. Make it easier to use
by reading the LOG env var, instead of RUST_LOG, and use a shorter
format without color for fast outputting, easier redirect and editing.

Reviewed By: DurhamG

Differential Revision: D33339411

fbshipit-source-id: a4a9e0336b17856c07076cf19f56bd99064d94e4
2022-01-19 17:39:10 -08:00
Jun Wu
e9ff2ed3b9 pydag: expose API to get segments
Summary:
It's not used for features. But having access to the segments in Python
makes it easier to prototype stuff. For example, this is the Python
prototype of "pathhistory" in `debugshell.py` with some parameters
to tweak:

  from collections import deque

  @command("lg")
  def lg(ui, repo, *args, **opts):
      logfile(repo, *args, **opts)

  def logfile(repo, path="Documentation/logo.gif"):
      cache = {}

      def get(
          rev,
          rev2node=repo.changelog.idmap.id2node,
          clr=repo.changelog.changelogrevision,
          tm=bindings.manifest.treemanifest,
          ts=repo.manifestlog.datastore,
          path=path,
          cache=cache,
      ):
          if rev in cache:
              return cache[rev]
          node = rev2node(rev)
          mnode = clr(node).manifest
          fnode = tm(ts, mnode).get(path)
          cache[rev] = fnode
          return fnode

      cl = repo.changelog
      dag = repo.changelog.dag
      nset = repo.dageval(lambda: ancestors(lookup(".")))
      q = deque(dag.segments(nset))
      torevs = repo.changelog.torevs
      roots = repo.changelog.torevs(dag.roots(nset))

      spans = bindings.dag.spans
      spansetfromrange = bindings.dag.spans.unsaferange
      dagsegments = dag.segments
      tonodes = cl.tonodes

      skipbylevel = [0] * 5
      splitbylevel = [0] * 5
      skipbypdiff = [0] * 5
      visitbylevel = [0] * 5
      psameskip = 0
      psameblock = 0
      skipmulti = 0
      out = []
      skippedfnodes = set()
      skippedparents = spans([])

      opt_dedicated_l0_bisect = False
      opt_skip_none_ancestors = False
      opt_skip_multi = False
      opt_skip_any_psame = True

      def d(msg, write=repo.ui.write_err, verbose=repo.ui.verbose):
          if verbose:
              write("%s\n" % msg)

      d(f"QUEUE SIZE: {len(q)}")
      interesting = spans([782537])
      skipanc = spans([])
      while q:
          # try testing 8 ranges, as long as parents
          th = 4
          if opt_skip_multi and len(q) > th:
              multiparents = []
              j = 0
              for i in range(th):
                  if q[i]["has_root"]:
                      break
                  h = q[i]["high"]
                  if i > 0 and (h not in q[i - 1]["parents"]):
                      break
                  multiparents += [p for p in q[i - 1]["parents"] if p != h]
                  j = i + 1
              # Maybe track parent origins, so skip multi tests more efficiently?
              if j > 1:
                  multiparents += q[j - 1]["parents"]
                  high = q[0]["high"]
                  f = get(high)
                  mpdiff = [p for p in multiparents if get(p) != f]
                  if not mpdiff:
                      skipmulti += 1
                      d(f"SKIP MULTI {j}")
                      for i in range(j):
                          q.popleft()
                      continue

          s = q.popleft()
          indent = s.get("indent", 0)
          low = s["low"]
          high = s["high"]
          parents = s["parents"]
          ppdiff = s.get("ppdiff", [])
          pintersect = [p for p in parents if p in ppdiff]
          hasroot = s["has_root"]
          level = s["level"]
          f = get(high)

          plen = len(parents)
          pdiff = [p for p in parents if get(p) != f]
          psame = [p for p in parents if get(p) == f]
          rs = roots & spansetfromrange(low, high)
          rdiff = [p for p in rs if get(p) != f]
          rsame = not rdiff
          visitbylevel[level] += 1
          action = None
          if low == high:
              if (not psame) and (
                  plen > 0 or f is not None
              ):  # all parents are different, or no parents
                  action = "TAKE  "
                  out.append(low)
              else:
                  action = "SKIP1 "
          # if not action and high in skippedparents:
          if not action and len(spansetfromrange(low, high) & skippedparents) == high - low + 1:
              action = "SKIPP "
          if not action and high in skipanc:
              action = "SKIPA "
          # skip ancestors?
          # (can be wrong optimization?)
          if opt_skip_none_ancestors and not action and f is None:
              anc = dag.ancestors([cl.node(high)])
              ancids = torevs(anc)
              skipanc = skipanc + ancids
          # skip it
          if opt_skip_any_psame and psame:
              # mark pdiff as "skipped", so we don't output reverted commits
              newskip = cl.torevs(cl.dag.only(cl.tonodes(pdiff), cl.tonodes(psame)))
              if newskip:
                  d(f"      {' ' * indent}NEW SKIP: {newskip}")
              if newskip & interesting:
                  d(f"      {' ' * indent}  SKIP & INTERESTING: {newskip & interesting}")
              skippedparents = skippedparents + newskip
              psameskip += 1
          if not action and (((opt_skip_any_psame) and psame or len(psame) == plen) and rsame):
              # skip it
              action = "SKIPS "
              skipbylevel[level] += high - low + 1
              # but need to check the roots
              if rs:
                  d(f"ROOTS {rs}")
                  for r in rs:
                      if get(r) is not None:
                          out.append(r)
          # output it
          if not action and (low == high):
              if not psame:
                  action = "TAKE0 "
                  out.append(low)
              else:
                  action = "SKIP0 "
          if not action:
              action = "SPLIT "
              splitbylevel[level] += 1
              if level > 0:
                  subsegs = dagsegments(
                      tonodes(spansetfromrange(low, high)), maxlevel=level - 1
                  )
                  # filter by pdiff
                  pdiffspans = spans(pdiff)
                  for subseg in reversed(subsegs):
                      if (
                          all(p not in pdiffspans for p in subseg["parents"])
                          and not subseg["has_root"]
                      ):
                          # skip it
                          skipbypdiff[subseg["level"]] += 1
                      else:
                          # keep it, put it in pdiffspans
                          pdiffspans = pdiffspans + spansetfromrange(
                              subseg["low"], subseg["high"]
                          )
                          subseg["indent"] = indent + 1
                          subseg["ppdiff"] = pdiff
                          q.appendleft(subseg)
              else:
                  # Dedicated L0 bisect
                  if opt_dedicated_l0_bisect:
                      # high --- mid --- low ---- end
                      end = low
                      while high > end:
                          # Check bisect results... tend to be the merge?
                          d(f"BISEC{' ' * indent} {high} -- {low}  {end}")
                          mid = (low + high) // 2
                          if get(mid) == get(high):
                              high = mid
                              low = (high + end) // 2
                          else:
                              if mid + 1 == high:
                                  # output
                                  d(f"TAKE  {' ' * indent} {high}")
                                  out.append(high)
                                  high = mid
                                  low = end
                              else:
                                  low = mid

                  else:
                      mid = (low + high) // 2
                      s1 = {
                          "low": mid + 1,
                          "high": high,
                          "parents": [mid],
                          "has_root": False,
                          "level": 0,
                          "indent": indent + 1,
                      }
                      s2 = {
                          "low": low,
                          "high": mid,
                          "parents": parents,
                          "has_root": not parents,
                          "level": 0,
                          "indent": indent + 1,
                      }
                      q.appendleft(s2)
                      q.appendleft(s1)
          d(
              f"{action}{' ' * indent}L{level} {high-low+1} {high} -> {low} (pdiff: {pdiff} / {plen}; rdiff: {rdiff} / {len(rs)})"
          )
          if pintersect:
              d(f"      {' ' * indent}PPDIFF: {pintersect}")
          iintersect = interesting & spansetfromrange(low, high)
          if iintersect:
              d(f"      {' ' * indent}INTERESTING: {iintersect}")

      repo.ui.write("skips:   %r\n" % (skipbylevel,))
      repo.ui.write("skippd:  %r\n" % (skipbypdiff,))
      repo.ui.write("splits:  %r\n" % (splitbylevel,))
      repo.ui.write("visits:  %r\n" % (visitbylevel,))
      repo.ui.write("out[:12] %r %d\n" % (out[:12], len(out)))
      repo.ui.write("cache:   %r\n" % (len(cache),))
      repo.ui.write("skipmult %d\n" % (skipmulti,))
      repo.ui.write("sameskip %d / %d\n" % (psameskip, psameblock))

      globals().update(locals())
      return

Reviewed By: DurhamG

Differential Revision: D33339407

fbshipit-source-id: 8f773ba465e36e896606548cdf71c38b1c31c147
2022-01-19 17:39:10 -08:00
Jun Wu
38bbd60d91 git: skip calculating pathcopies
Summary:
The git backend has no filelog or linkrev, so pathcopies cannot be calculated
using the existing code path. Let's skip it for now.

Reviewed By: DurhamG

Differential Revision: D33282042

fbshipit-source-id: 65a427f87d7c08db91ba0ef3d73da51663f3095c
2022-01-19 17:39:10 -08:00
Jun Wu
b349cb7c79 git: support log --patch
Summary:
The old code paths might access filelog and linkrev to figure out what files to
show, which do not work in git.

Avoid them in the PathHistory log path. This simplifies the code and makes
`log -p` work for git.

Reviewed By: DurhamG

Differential Revision: D33282043

fbshipit-source-id: 4b0a48d4bafbc2cb54742620d7cd4f05d2a0f3ba
2022-01-19 17:39:09 -08:00
Jun Wu
29d72487b3 git: support basic log FILE
Summary:
Make `hg log PATH` work in a git repo. Under the hood it uses the PathHistory
abstraction, which handles trees, multiple paths, and removed paths that are
difficult to handle without PathHistory.

Reviewed By: DurhamG

Differential Revision: D33280173

fbshipit-source-id: 6126a0f7498fb39e3b93f6ac44b443a681e467d0
2022-01-19 17:39:09 -08:00
Durham Goode
e1307b12fc edenapi: add option to call files2 endpoint instead
Summary:
Now that we have a files2 endpoint that can return errors, let's add an
option to call it from the client.

Reviewed By: quark-zju

Differential Revision: D33283030

fbshipit-source-id: 3eda0fe870be1f4b74e0f0f60b11518c7bfe508f
2022-01-19 17:22:49 -08:00
Durham Goode
c43e88e2c8 mononoke: add files2 endpoint for returning file fetch errors
Summary:
The existing /files endpoint returns a stream of file results, but any
errors are just silently dropped and the client just sees no result for that
key. As part of improving our network reliability, we should always be returning
some sort of result from the server for a given key fetch.

To start that process, let's introduce a new /files2 endpoint that returns
FileResponse instead of just FileEntry. This matches the existing commit api
endpoint. See D27549923 (82b689ad9d) for discussion on this pattern.

Reviewed By: quark-zju

Differential Revision: D33283029

fbshipit-source-id: f5590c3f0cefba72a8dd669b472834a42d08985e
2022-01-19 17:22:48 -08:00
Durham Goode
d589c916d5 edenapi: change client file fetching to work with FileResponse
Summary:
Currently fetching files from edenapi has no mechanism to report errors
to the client, so if the server can't find the key or hits an error, the client
just doesn't hear anything back.

As part of improving our network reliability and debuggability, let's enable
passing errors back. The commit APIs use a pattern involving a response object
that includes the request input (in files case a Key) and a Result, so let's
follow that same pattern. This will require a new files2 endpoint, which is
introduced in a subsequent diff.

In this diff, we just introduce the FileResponse type and convert the current
FileEntry response from files v1 into the new type so we can make the endpoint
swappable later.
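
The response pattern described above can be sketched in Python as follows. This is only an illustration: the field names, the `fetch_files` helper, and the in-memory store are assumptions, and the real types live in the edenapi crates.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class FileResponse:
    """Hypothetical mirror of the pattern: the request key plus a result."""

    key: str
    entry: Optional[bytes] = None  # set on success
    error: Optional[str] = None    # set on failure


def fetch_files(keys, store):
    """Yield one response per requested key, never dropping a key silently."""
    for key in keys:
        if key in store:
            yield FileResponse(key=key, entry=store[key])
        else:
            yield FileResponse(key=key, error="not found")


responses = list(fetch_files(["a.txt", "missing.txt"], {"a.txt": b"data"}))
for r in responses:
    print(r.key, r.entry, r.error)
```

With this shape, a client can tell a missing key apart from a dropped response, because every requested key produces exactly one response.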

D27549923 (82b689ad9d) has discussion on why to choose this pattern.

Reviewed By: quark-zju

Differential Revision: D33283031

fbshipit-source-id: ec2e34760ee47ead95964e2d33e0be4173bb4e77
2022-01-19 17:22:48 -08:00
Jun Wu
8d5ae8b80f pypathhistory: expose feature as repo and revset API
Summary:
Made the PathHistory feature accessible via revset, or `repo.pathhistory`
if one does not want the rev number tech-debt.

It seems useable. For paths with long history, the bisect overhead becomes
significant and it can be much slower than simple traversal.

In linux.git (note `--time` excludes Python start up overhead):

  % lhg log -r '_pathhistory(::.,"README")' --time >/dev/null
  time: real 0.690 secs (user 0.550+0.000 sys 0.130+0.000)

  % lhg log -r '_pathhistory(::master,"mm/damon")' --time >/dev/null
  time: real 0.420 secs (user 0.320+0.000 sys 0.090+0.000)

  % lhg log -r '_pathhistory(::master,"mm")' --time >/dev/null
  time: real 9.760 secs (user 7.290+0.000 sys 1.840+0.000)
  (17k commits in output)

  % lhg log -r '_pathhistory(::master,"")' -T '{node}\n' >/dev/null
  time: real 178.720 secs (user 167.540+0.000 sys 11.250+0.000)
  (1060k commits in output)

Git:

  % time git log README >/dev/null
  0.46s user 0.12s system 99% cpu 0.576 total

  % time git log mm/damon >/dev/null
  0.63s user 0.13s system 99% cpu 0.770 total

  % time git log mm >/dev/null
  1.78s user 0.27s system 98% cpu 2.085 total

  % time git log --format=%H >/dev/null
  12.61s user 0.61s system 99% cpu 13.256 total

In fbsource, files with shallow paths and short histories work okay:

The `tools/signedsource` has 8 changes. It takes about 1 second for a change
running from a devserver closer to the server.

Cold manifest cache + cold commit cache + semi-cold server cache:

  % hg clone --configfile /etc/mercurial/repo-specific/fbsource.rc -U  fb://fbsource /tmp/fbs1
  % rm -rf /var/cache/hgcache/fbsource/manifests
  % lhg --cwd /tmp/fbs1 log -r '_pathhistory(::master,"tools/signedsource")' --time >/dev/null
  time: real 10.780 secs (user 1.120+0.000 sys 0.490+0.000)

Cold manifest cache + cold commit cache + semi-warm server cache:

  % # same commands
  time: real 6.850 secs (user 1.480+0.000 sys 0.790+0.000)

Cold manifest cache + warm commit cache + semi-warm server cache:

  % rm -rf /var/cache/hgcache/fbsource/manifests
  % lhg --cwd /tmp/fbs1 log -r '_pathhistory(::master,"tools/signedsource")' --time >/dev/null
  time: real 1.330 secs (user 0.370+0.000 sys 0.160+0.000)

Warm local cache:

  % lhg --cwd /tmp/fbs1 log -r '_pathhistory(::master,"tools/signedsource")' --time >/dev/null
  time: real 0.260 secs (user 0.180+0.000 sys 0.090+0.000)

Files with long history or deeper paths can be slow. But the output is
streamed, so progress is visible:

  % lhg log -r '_pathhistory(::master,"fbcode/eden/scm/edenscm/mercurial/dispatch.py")' --time -T '{node|short} {desc|firstline}\n' > /tmp/log
  time: real 169.840 secs (user 7.960+0.000 sys 3.690+0.000)
  % wc -l /tmp/log
  79
  % lhg log -r '_pathhistory(::master,"fbcode/eden/scm/edenscm/mercurial/dispatch.py")' --time -T '{node|short} {desc|firstline}\n' > /tmp/dlog
  time: real 1.220 secs (user 0.820+0.000 sys 0.390+0.000)

Reviewed By: DurhamG

Differential Revision: D33280174

fbshipit-source-id: 97334b3b110f21722be7be3dce095468a887a0d6
2022-01-19 16:59:11 -08:00
Jun Wu
43f0f699b9 pypathhistory: bindings for PathHistory
Summary: Provide access to PathHistory features. The main API is `__next__`.

Reviewed By: DurhamG

Differential Revision: D33280042

fbshipit-source-id: e89b970991efce04937ce22d92ad4bbcd495b85f
2022-01-19 16:59:11 -08:00
Jun Wu
f0fb12a685 pathhistory: main implementation
Summary: Implement the main history logic by visiting, skipping, and splitting segments.

Reviewed By: DurhamG

Differential Revision: D33265878

fbshipit-source-id: f3165752cf9fc8cd0bd245be9427769804f9e556
2022-01-19 16:59:11 -08:00
Jun Wu
436a9a5e28 pathhistory: add some utility algorithms
Summary: Those are individual algorithms used by upcoming changes.

Reviewed By: DurhamG

Differential Revision: D33265884

fbshipit-source-id: b09813df4fb6477c4f0aa60a853b68af026b43c7
2022-01-19 16:59:11 -08:00
Jun Wu
173a75ed8d pathhistory: add tree traversal algorithm
Summary:
The problem is: given a list of paths, and a root tree, find
the content ids of the paths.

It can be a bit complex, if these are considered:
- avoid resolving common prefixes of paths multiple times
  (ex. if paths are "a/b/c" and "a/b/d", only visit "a" and "a/b" once)
- prefetch in batches per tree depth
  (need O(max tree depth) round-trips)

This module is to solve the problem. See the docstring for details.
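
The two considerations above can be sketched as a level-by-level walk. This is a simplified illustration, not the crate's algorithm: `resolve_paths`, the `fetch_batch` callback, and the toy `TREES` store are all hypothetical.

```python
def resolve_paths(root, paths, fetch_batch):
    """Resolve each path to a content id with one batched fetch per depth.

    fetch_batch(ids) -> {tree_id: {name: child_id}} is a hypothetical
    stand-in for a batched tree prefetch.
    """
    split = {p: p.split("/") for p in paths}
    resolved = {(): root}  # prefix (tuple of components) -> id
    max_depth = max((len(c) for c in split.values()), default=0)
    for depth in range(max_depth):
        # Distinct prefixes at this depth, so shared prefixes are fetched once.
        prefixes = {tuple(c[:depth]) for c in split.values() if len(c) > depth}
        ids = {resolved[p] for p in prefixes if resolved.get(p) is not None}
        trees = fetch_batch(sorted(ids)) if ids else {}
        for comps in split.values():
            if len(comps) <= depth:
                continue
            parent = tuple(comps[:depth])
            entries = trees.get(resolved.get(parent), {})
            resolved[tuple(comps[: depth + 1])] = entries.get(comps[depth])
    return {p: resolved.get(tuple(c)) for p, c in split.items()}


# Toy tree store: tree id -> {entry name: child id}.
TREES = {
    "t0": {"a": "t1"},
    "t1": {"b": "t2"},
    "t2": {"c": "f-c", "d": "f-d"},
}
calls = []


def fetch_batch(ids):
    calls.append(list(ids))  # record each round-trip
    return {i: TREES[i] for i in ids if i in TREES}


result = resolve_paths("t0", ["a/b/c", "a/b/d", "a/x"], fetch_batch)
print(result, calls)
```

In this toy run, three paths of depth at most 3 need three batched fetches, and the shared prefixes "a" and "a/b" are each fetched once.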

Reviewed By: DurhamG

Differential Revision: D33339416

fbshipit-source-id: 3400e17799b42cf489576228bd486a671ddaaa5f
2022-01-19 16:59:10 -08:00
Jun Wu
ed91af1e91 pathhistory: path (file or dir) history based on segmented changelog
Summary:
The path history area is problematic in multiple ways:
- Visiting commits and checking their trees following commit graph can be too
  slow.
- The "fastlog" service in Mononoke can provide faster path history for a
  single file or a single directory. However, it requires complex infra to
  maintain the indexes and does not handle following multiple paths nicely.
- hg's linkrev is a tech-debt we'd like to remove but it's not easy to do so.
  Partially because a replacement will need new storage and protocol design,
  and might face questions like offline UX, etc.

This crate is an attempt to tackle problems above:
- Visit segments and skip segments aggressively.
- If we use commit graph and regular tree reads, then there is no need for an
  external service, or hg's linkrev.

with one main downside caused by bisect:
- "change + revert" might cancel out. History can be incomplete.

But that downside seems acceptable considering the other wins.
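
The bisect downside can be shown on a toy linear history. This sketch is a deliberate simplification (a plain list of per-commit content, no segments or merges), and `changed_revs` is a hypothetical name, not the crate's API.

```python
def changed_revs(content, low, high):
    """Return revs in (low, high] where content changed, found by bisecting.

    Any range whose two endpoints have identical content is skipped whole.
    """
    if content[low] == content[high]:
        return []  # endpoints match: assume nothing changed in between
    if high - low == 1:
        return [high]
    mid = (low + high) // 2
    return changed_revs(content, low, mid) + changed_revs(content, mid, high)


plain = ["v1", "v1", "v2", "v2", "v2"]     # one change at rev 2
reverted = ["v1", "v2", "v1", "v1", "v1"]  # change at rev 1, revert at rev 2

print(changed_revs(plain, 0, 4))
print(changed_revs(reverted, 0, 4))
```

In the second history the endpoints match, so the whole range is skipped and both the change and its revert are missing from the output.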

This diff adds the crate with some high-level comments.

Reviewed By: DurhamG

Differential Revision: D33265881

fbshipit-source-id: e29be3d8e9fa8cd9f011144a7104429edcb25ec4
2022-01-19 16:59:10 -08:00
Jun Wu
95b39a040c pyedenapi: add bindings for pull_lazy endpoint
Summary: Expose the pull_lazy endpoint client-side.

Reviewed By: DurhamG

Differential Revision: D33594531

fbshipit-source-id: 848b77557416aa2a110bd4d03938107e80e2a35d
2022-01-19 16:37:08 -08:00
Jun Wu
1da6f656d6 edenapi: add pull_lazy client-side implementation
Summary: This will replace pull_fast_forward_master.

Reviewed By: DurhamG

Differential Revision: D33594527

fbshipit-source-id: a8a52567ec2395978228f4c3bf5ff385ad0e6d46
2022-01-19 16:37:08 -08:00
Jun Wu
5600b0f82f edenapi_service: define pull_lazy endpoint
Summary: The `pull_lazy` endpoint is a more flexible version of `pull_fast_forward`.

Reviewed By: farnz

Differential Revision: D33594529

fbshipit-source-id: 50aa9053294761f45a319f195ca888de3d9575d0
2022-01-19 16:37:08 -08:00
Jun Wu
360dbdf66d edenapi/types: add types for the new fast pull API
Summary:
Instead of passing just the old and new master nodes, pass a list to support
more complicated cases.

Reviewed By: DurhamG

Differential Revision: D33594530

fbshipit-source-id: 2087d4fce79eb5cff3c1d381cfc82f9bd6ad89c4
2022-01-19 16:37:08 -08:00
Durham Goode
bfc533ab20 debughiddencommit: hide commit even if backup fails
Summary:
debughiddencommit can produce visible ephemeral commits if the backup
fails (like if certs are invalid). Let's ensure we make them invisible even in
the case of a backup error.

https://fb.workplace.com/groups/asic.infra/posts/1314042865686399/

Reviewed By: mrkmndz

Differential Revision: D33668078

fbshipit-source-id: 6df48709ef183afa229f96fa7a526c479e8b4c0a
2022-01-19 15:00:54 -08:00
Chad Austin
b4f10a1727 update straggling license headers
Reviewed By: xavierd

Differential Revision: D33666997

fbshipit-source-id: 9a20b7f1dd68cc56055d0775993ba8dfc7347acc
2022-01-19 14:37:11 -08:00
Xavier Deguillard
cbf46db8a9 cli: improve error message when cloning over an existing clone
Summary:
When EdenFS is stopped, users have attempted to remove the root directory of
the mount point, in the hope that it deletes their repository. Cloning a new
repository at that location gives them an unhelpful error message claiming that
the directory isn't empty:

  error: [Errno 41] Directory not empty: 'C:\open\foobar'

Let's improve this error to help guide the user towards using `eden rm`.

Reviewed By: kmancini

Differential Revision: D33574906

fbshipit-source-id: 3f2ae567dd9b2af2493e6e52c52c85a8216c993b
2022-01-19 12:44:04 -08:00
Durham Goode
6784190eb0 merge: improve conflict hint
Summary:
The phrasing implied that "update --clean" would only discard the
conflicting files, but in reality it discards everything. Let's make the message
clearer.

Reviewed By: quark-zju

Differential Revision: D33662474

fbshipit-source-id: 60aeb7db72d45e894d959d9f83285f34132c603b
2022-01-19 12:17:24 -08:00
Jun Wu
39b019ade5 edenapi_service: add repo/health_check endpoint
Summary:
Currently `edenapi.url` and `remotefilelog.reponame` decide remote URLs.
But `paths.default` is the more "standard" way to specify a remote.

In the future we want to use a single config `paths.default` to decide URLs
instead. Prepare for that by duplicating the only non-repo endpoint so one
can specify `paths.default` like `edenapi://hostname/repo` and forget about
`edenapi.url` or `remotefilelog.reponame`.

Reviewed By: farnz

Differential Revision: D33594528

fbshipit-source-id: 1f0e134a5def2f54e22619751c1fb0a754b5dbb5
2022-01-19 11:11:23 -08:00
Jun Wu
0f262798d3 segmented_changelog: remove pull_fast_forward_master
Summary:
With the pull_data API, we can remove its special case pull_fast_forward_master.

Note the pull_fast_forward_master endpoint is kept for compatibility.

Differential Revision: D33204764

fbshipit-source-id: 90cd93a34c1733ed6ce2bb927ec98839f749dc39
2022-01-19 10:55:35 -08:00
Jun Wu
e704fd5605 segmented_changelog: implement pull_data method
Summary:
The pull_data method takes lists of old and new heads instead of a single old
head and a single new head. It is more flexible and can be used to pull multiple
branches, or replace the clone endpoint.

The existing `pull_fast_forward_master` is changed to use `pull_data` instead.

Reviewed By: farnz

Differential Revision: D32800580

fbshipit-source-id: f971cdf65fce71cf0f759e32d50d5105f4840f38
2022-01-19 10:55:35 -08:00
Alex Hornby
9f21750df1 mononoke: allow more granular integration tests
Summary: Allow more granular integration tests, add a few demonstration targets

Reviewed By: HarveyHunt

Differential Revision: D33622747

fbshipit-source-id: 1ac2f2280803e489886fc276620cd3a4e4ff570d
2022-01-19 10:48:31 -08:00
Alex Hornby
58b772e2e9 mononoke: add --discovered-test arg to integration test runner
Summary:
The next diff adds granular target tests; this diff adds a feature to the runner so that they can successfully run.

When `buck test` is used, `tpx` first calls the integration_runner with the `--dry-run` flag. This allows the integration_runner to return a list of tests that are available. `tpx` will then iterate through that list, calling the `integration_runner` for each test and passing `run-tests,<test name>` once. This format is called the simple test selector.

With the previous logic, the test runner would return all ~300 tests during discovery and then tpx would call each test. Update the runner to accept multiple `--discovered-test` args, which the runner will return when `tpx` does test discovery. As these discovered tests aren't used during test running, the simple test selector that is provided is used.
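
A rough sketch of a repeatable `--discovered-test` flag, using `argparse` purely for illustration. The real runner's argument parsing is not shown in this log; `build_parser`, `discover`, and the test names are hypothetical.

```python
import argparse


def build_parser():
    """Hypothetical parser; only --dry-run and --discovered-test are from the text."""
    parser = argparse.ArgumentParser(prog="integration_runner")
    parser.add_argument("--dry-run", action="store_true")
    # action="append" lets the flag be passed multiple times.
    parser.add_argument("--discovered-test", action="append", default=[])
    return parser


def discover(args, all_tests):
    """Report only the explicitly listed tests, else fall back to all tests."""
    return args.discovered_test or all_tests


args = build_parser().parse_args(
    ["--dry-run", "--discovered-test", "a.t", "--discovered-test", "b.t"]
)
print(discover(args, ["x.t", "y.t"]))
```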

In a later diff we can remove the existing test discovery and just rely on the tests that buck tells us about.

Reviewed By: HarveyHunt

Differential Revision: D33630573

fbshipit-source-id: 20dcd0578ed6e193f8459bdc7acb8820cee7f3b0
2022-01-19 10:48:31 -08:00