sapling/hgext3rd/fbconduit.py
Zach Amsden f1ecf43a83 Parallel callout to scmquery through conduit
Summary:
Parallel callout to scmquery through conduit
This implements the special case of log on a single directory, preserving
follow behavior.  To do this, we backtrackfrom the current head to find
 all draft revisions along the path, then find the common public ancestor of
those.  Once we find that, we can begin paging in results from the scmquery
service.

Pretty much a working diff at this point.  Limits and boundary conditions
have not been fully tested.  Every once in a while I run into a bum query,
which I suspect to be either a bad proxy or a service router failure; still
debugging that.  Could also be an issue with conduit.  Other than that,
things seem to work.

Test Plan:
Testing log with fastest setting (no revsets), revsets (using -M), and with extension disabled (--sparse):

  [zamsden@devbig192.prn1 ~/local/fbcode] time hg log tao --config extensions.fbconduit=~/local/fb-hgext/hgext3rd/fbconduit.py --config fbconduit.host=our.zamsden.devbig192.prn1.facebook.com --config extensions.fastlog=~/local/fb-hgext/hgext3rd/fastlog.py --pager=off -l 100 -M > a

  real	0m1.895s
  user	0m0.003s
  sys	0m0.004s
  [zamsden@devbig192.prn1 ~/local/fbcode] time hg log tao --config extensions.fbconduit=~/local/fb-hgext/hgext3rd/fbconduit.py --config fbconduit.host=our.zamsden.devbig192.prn1.facebook.com --config extensions.fastlog=~/local/fb-hgext/hgext3rd/fastlog.py --pager=off -l 100 > b

  real	0m1.308s
  user	0m0.005s
  sys	0m0.001s
  [zamsden@devbig192.prn1 ~/local/fbcode] diff a b
  [zamsden@devbig192.prn1 ~/local/fbcode] time hg log tao --config extensions.fbconduit=~/local/fb-hgext/hgext3rd/fbconduit.py --config fbconduit.host=our.zamsden.devbig192.prn1.facebook.com --config extensions.fastlog=~/local/fb-hgext/hgext3rd/fastlog.py --pager=off -l 100 --sparse > c

  real	0m7.320s
  user	0m0.004s
  sys	0m0.003s
  [zamsden@devbig192.prn1 ~/local/fbcode] diff a c




Testing --user option:



  [zamsden@devbig192.prn1 ~/local/fbsource] time hg log fbcode/dragon --config extensions.fbconduit=~/local/fb-hgext/hgext3rd/fbconduit.py --config fbconduit.host=our.zamsden.devbig192.prn1.facebook.com --config extensions.fastlog=~/local/fb-hgext/hgext3rd/fastlog.py --pager=off -l 5 -u dmitri > a

  real	0m2.765s
  user	0m0.002s
  sys	0m0.004s
  [zamsden@devbig192.prn1 ~/local/fbsource] time hg log fbcode/dragon --config extensions.fbconduit=~/local/fb-hgext/hgext3rd/fbconduit.py --config fbconduit.host=our.zamsden.devbig192.prn1.facebook.com --config extensions.fastlog=~/local/fb-hgext/hgext3rd/fastlog.py --pager=off -l 5 -u dmitri --sparse > b

  real	0m23.247s
  user	0m0.005s
  sys	0m0.001s
  [zamsden@devbig192.prn1 ~/local/fbsource] diff a b
  [zamsden@devbig192.prn1 ~/local/fbsource]



Testing same output enabled / disabled for -X option:



  [zamsden@devbig192.prn1 ~/local/fbsource] time hg log fbcode/lithium/ --config extensions.fbconduit=~/local/fb-hgext/hgext3rd/fbconduit.py --config fbconduit.host=our.zamsden.devbig192.prn1.facebook.com --config extensions.fastlog=~/local/fb-hgext/hgext3rd/fastlog.py --pager=off -l 20 -X '**TARGETS' > a

  real	0m1.292s
  user	0m0.002s
  sys	0m0.003s
  [zamsden@devbig192.prn1 ~/local/fbsource] time hg log fbcode/lithium/ --config extensions.fbconduit=~/local/fb-hgext/hgext3rd/fbconduit.py --config fbconduit.host=our.zamsden.devbig192.prn1.facebook.com --config extensions.fastlog=~/local/fb-hgext/hgext3rd/fastlog.py --pager=off -l 20 -X '**TARGETS' --sparse > b

  real	0m2.697s
  user	0m0.002s
  sys	0m0.004s
  [zamsden@devbig192.prn1 ~/local/fbsource] diff a b
  [zamsden@devbig192.prn1 ~/local/fbsource]



Testing -k option:



  [zamsden@devbig192.prn1 ~/local/fbsource] time hg log fbcode/lithium/ --config extensions.fbconduit=~/local/fb-hgext/hgext3rd/fbconduit.py --config fbconduit.host=our.zamsden.devbig192.prn1.facebook.com --config extensions.fastlog=~/local/fb-hgext/hgext3rd/fastlog.py --pager=off -k 'e' -X '**TARGETS' -l 10 > a

  real	0m1.174s
  user	0m0.003s
  sys	0m0.003s
  [zamsden@devbig192.prn1 ~/local/fbsource] time hg log fbcode/lithium/ --config extensions.fbconduit=~/local/fb-hgext/hgext3rd/fbconduit.py --config fbconduit.host=our.zamsden.devbig192.prn1.facebook.com --config extensions.fastlog=~/local/fb-hgext/hgext3rd/fastlog.py --pager=off -k 'e' -X '**TARGETS' -l 10 --sparse > b

  real	0m1.259s
  user	0m0.004s
  sys	0m0.002s
  [zamsden@devbig192.prn1 ~/local/fbsource] diff a b
  [zamsden@devbig192.prn1 ~/local/fbsource]



Testing -I option:



  [zamsden@devbig192.prn1 ~/local/fbsource] time hg log fbcode/scm/ --config extensions.fbconduit=~/local/fb-hgext/hgext3rd/fbconduit.py --config fbconduit.host=our.zamsden.devbig192.prn1.facebook.com --config extensions.fastlog=~/local/fb-hgext/hgext3rd/fastlog.py --pager=off -l 20 -I '**.py' > a

  real	0m1.473s
  user	0m0.005s
  sys	0m0.003s
  [zamsden@devbig192.prn1 ~/local/fbsource] time hg log fbcode/scm/ --config extensions.fbconduit=~/local/fb-hgext/hgext3rd/fbconduit.py --config fbconduit.host=our.zamsden.devbig192.prn1.facebook.com --config extensions.fastlog=~/local/fb-hgext/hgext3rd/fastlog.py --pager=off -l 20 -I '**.py' --sparse > b

  real	0m2.911s
  user	0m0.003s
  sys	0m0.002s
  [zamsden@devbig192.prn1 ~/local/fbsource] diff a b
  [zamsden@devbig192.prn1 ~/local/fbsource]




Testing multiple directory output in all three modes - revset, fast filtered, and forcing original fallback with --sparse



  [zamsden@devbig192.prn1 ~/local/fbsource] time hg log fbcode/hphp/hhvm fbcode/hphp/runtime/ --config extensions.fbconduit=~/local/fb-hgext/hgext3rd/fbconduit.py --config fbconduit.host=our.zamsden.devbig192.prn1.facebook.com --config extensions.fastlog=~/local/fb-hgext/hgext3rd/fastlog.py --pager=off -l 100 -M > a

  real	0m2.892s
  user	0m0.003s
  sys	0m0.006s
  [zamsden@devbig192.prn1 ~/local/fbsource] time hg log fbcode/hphp/hhvm fbcode/hphp/runtime/ --config extensions.fbconduit=~/local/fb-hgext/hgext3rd/fbconduit.py --config fbconduit.host=our.zamsden.devbig192.prn1.facebook.com --config extensions.fastlog=~/local/fb-hgext/hgext3rd/fastlog.py --pager=off -l 100 > b

  real	0m2.575s
  user	0m0.697s
  sys	0m0.077s
  [zamsden@devbig192.prn1 ~/local/fbsource] time hg log fbcode/hphp/hhvm fbcode/hphp/runtime/ --config extensions.fbconduit=~/local/fb-hgext/hgext3rd/fbconduit.py --config fbconduit.host=our.zamsden.devbig192.prn1.facebook.com --config extensions.fastlog=~/local/fb-hgext/hgext3rd/fastlog.py --pager=off -l 100 --sparse > c

  real	0m7.339s
  user	0m0.003s
  sys	0m0.004s
  [zamsden@devbig192.prn1 ~/local/fbsource] diff a b
  [zamsden@devbig192.prn1 ~/local/fbsource] diff a c
  [zamsden@devbig192.prn1 ~/local/fbsource]

Reviewers: rmcelroy, #scmquery, #mercurial, ttung, lcharignon, durham, stash

Reviewed By: stash

Subscribers: cdykes, lcharignon, quark, stash, mjpieters, jeroenv

Differential Revision: https://phabricator.intern.facebook.com/D3634075

Tasks: 12341014

Signature: t1:3634075:1471039212:0989839636847a8e5da6a0ef9150035fcf5bb797
2016-08-12 15:52:50 -07:00

256 lines
7.7 KiB
Python

# fbconduit.py
#
# An extension to query remote servers for extra information via conduit RPC
#
# Copyright 2015 Facebook, Inc.
from mercurial import (
error,
extensions,
node,
revset,
templatekw,
templater,
)
from mercurial.i18n import _
import re
import json
from urllib import urlencode
from mercurial.util import httplib
conduit_host = None
conduit_path = None
connection = None
MAX_CONNECT_RETRIES = 3
class ConduitError(Exception):
pass
class HttpError(Exception):
pass
githashre = re.compile('g([0-9a-fA-F]{12,40})')
phabhashre = re.compile('^r([A-Z]+)([0-9a-f]{12,40})$')
fbsvnhash = re.compile('^r[A-Z]+(\d+)$')
def extsetup(ui):
if not conduit_config(ui):
ui.warn(_('No conduit host specified in config; disabling fbconduit\n'))
return
templater.funcs['mirrornode'] = mirrornode
templatekw.keywords['gitnode'] = showgitnode
revset.symbols['gitnode'] = gitnode
extensions.wrapfunction(revset, 'stringset', overridestringset)
revset.symbols['stringset'] = revset.stringset
revset.methods['string'] = revset.stringset
revset.methods['symbol'] = revset.stringset
def conduit_config(ui, host=None, path=None, protocol=None):
global conduit_host, conduit_path, conduit_protocol
conduit_host = host or ui.config('fbconduit', 'host')
conduit_path = path or ui.config('fbconduit', 'path')
conduit_protocol = protocol or ui.config('fbconduit', 'protocol')
if conduit_host is None:
return False
if conduit_protocol is None:
conduit_protocol = 'https'
return True
def call_conduit(method, **kwargs):
global connection, conduit_host, conduit_path, conduit_protocol
# start connection
if connection is None:
if conduit_protocol == 'https':
connection = httplib.HTTPSConnection(conduit_host)
elif conduit_protocol == 'http':
connection = httplib.HTTPConnection(conduit_host)
# send request
path = conduit_path + method
args = urlencode({'params': json.dumps(kwargs)})
for attempt in xrange(MAX_CONNECT_RETRIES):
try:
connection.request('POST', path, args, {'Connection': 'Keep-Alive'})
break
except httplib.HTTPException as e:
connection.connect()
else:
raise e
# read http response
response = connection.getresponse()
if response.status != 200:
raise HttpError(response.reason)
result = response.read()
# strip jsonp header and parse
assert result.startswith('for(;;);')
result = json.loads(result[8:])
# check for conduit errors
if result['error_code']:
raise ConduitError(result['error_info'])
# return RPC result
return result['result']
# don't close the connection b/c we want to avoid the connection overhead
def mirrornode(ctx, mapping, args):
'''template: find this commit in other repositories'''
reponame = mapping['repo'].ui.config('fbconduit', 'reponame')
if not reponame:
# We don't know who we are, so we can't ask for a translation
return ''
if mapping['ctx'].mutable():
# Local commits don't have translations
return ''
node = mapping['ctx'].hex()
args = [f(ctx, mapping, a) for f, a in args]
if len(args) == 1:
torepo, totype = reponame, args[0]
else:
torepo, totype = args
try:
result = call_conduit('scmquery.get.mirrored.revs',
from_repo=reponame,
from_scm='hg',
to_repo=torepo,
to_scm=totype,
revs=[node]
)
except ConduitError as e:
if 'unknown revision' not in str(e.args):
mapping['repo'].ui.warn((str(e.args) + '\n'))
return ''
return result.get(node, '')
def showgitnode(repo, ctx, templ, **args):
"""Return the git revision corresponding to a given hg rev"""
reponame = repo.ui.config('fbconduit', 'reponame')
if not reponame:
# We don't know who we are, so we can't ask for a translation
return ''
backingrepos = repo.ui.configlist('fbconduit', 'backingrepos',
default=[reponame])
if ctx.mutable():
# Local commits don't have translations
return ''
matches = []
for backingrepo in backingrepos:
try:
result = call_conduit('scmquery.get.mirrored.revs',
from_repo=reponame,
from_scm='hg',
to_repo=backingrepo,
to_scm='git',
revs=[ctx.hex()]
)
githash = result[ctx.hex()]
if githash != "":
matches.append((backingrepo, githash))
except ConduitError:
pass
if len(matches) == 0:
return ''
elif len(backingrepos) == 1:
return matches[0][1]
else:
# in case it's not clear, the sort() is to ensure the output is in a
# deterministic order.
matches.sort()
return "; ".join(["{0}: {1}".format(*match)
for match in matches])
def gitnode(repo, subset, x):
"""``gitnode(id)``
Return the hg revision corresponding to a given git rev."""
l = revset.getargs(x, 1, 1, _("id requires one argument"))
n = revset.getstring(l[0], _("id requires a string"))
reponame = repo.ui.config('fbconduit', 'reponame')
if not reponame:
# We don't know who we are, so we can't ask for a translation
return subset.filter(lambda r: false)
backingrepos = repo.ui.configlist('fbconduit', 'backingrepos',
default=[reponame])
peerpath = repo.ui.expandpath('default')
translationerror = False
for backingrepo in backingrepos:
try:
result = call_conduit('scmquery.get.mirrored.revs',
from_repo=backingrepo,
from_scm='git',
to_repo=reponame,
to_scm='hg',
revs=[n]
)
hghash = result[n]
if hghash != '':
break
except ConduitError as e:
pass
else:
translationerror = True
if translationerror or result[n] == "":
repo.ui.warn(("Could not translate revision {0}.\n".format(n)))
return subset.filter(lambda r: False)
rn = repo[node.bin(result[n])].rev()
return subset.filter(lambda r: r == rn)
def overridestringset(orig, repo, subset, x):
# Is the given revset a phabricator hg hash (ie: rHGEXTaaacb34aacb34aa)
phabmatch = phabhashre.match(x)
if phabmatch:
phabrepo = phabmatch.group(1)
phabhash = phabmatch.group(2)
# The hash may be a git hash
if phabrepo in repo.ui.configlist('fbconduit', 'gitcallsigns', []):
return overridestringset(orig, repo, subset, 'g%s' % phabhash)
if phabhash in repo:
return orig(repo, subset, phabhash)
# Is the given revset a phabricator svn revision (rO11223232323232)?
svnrev = fbsvnhash.match(x)
if svnrev and not x in repo:
try:
extensions.find('hgsubversion')
meta = repo.svnmeta()
desiredrevision = int(svnrev.group(1))
# For some odd reason, the key is a tuple instead of a revision num
# The second member always seems to be None
revmapkey = (desiredrevision, None)
hghash = meta.revmap.get(revmapkey)
if hghash:
return orig(repo, subset, hghash)
except KeyError:
pass
m = githashre.match(x)
if m is not None:
githash = m.group(1)
if len(githash) == 40:
return gitnode(repo, subset, ('string', githash))
else:
raise error.Abort('git hash must be 40 characters')
return orig(repo, subset, x)