mirror of https://github.com/facebook/sapling.git (synced 2024-10-10 08:47:12 +03:00)
git_handler: use convert_list to cache git objects
getnewgitcommits() does a weird traversal in which a commit SHA is revisited once for every parent that still has to be processed, and each visit reads the object again. In the standard case of one parent, this doubles the number of object reads. This patch makes convert_list a cache for objects, so that a particular Git object is read just once. On a mostly linear repository with over 50,000 commits, this brings a no-op hg pull down from 70 seconds to 38, which is close to half the time, as expected. Note that even a no-op hg pull currently does a full DAG traversal -- an upcoming patch will fix this.
parent 36052aca77
commit 6f79df86d2
@@ -620,7 +620,11 @@ class GitHandler(object):
                 todo.pop()
                 continue
             assert isinstance(sha, str)
-            obj = self.git.get_object(sha)
+            if sha in convert_list:
+                obj = convert_list[sha]
+            else:
+                obj = self.git.get_object(sha)
+                convert_list[sha] = obj
             assert isinstance(obj, Commit)
             for p in obj.parents:
                 if p not in done:
@@ -630,7 +634,6 @@ class GitHandler(object):
                     break
             else:
                 commits.append(sha)
-                convert_list[sha] = obj
                 done.add(sha)
                 todo.pop()
 
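The effect of the change can be illustrated with a small standalone sketch. It is not the hg-git code: `CountingStore`, `traverse`, `linear_history`, and `count_reads` are hypothetical names introduced here, the "object" is reduced to a plain list of parent SHAs, and the `cache` dict only plays the role that `convert_list` has in the patch. The loop follows the shape of the traversal shown in the diff, with a switch to turn the cache on or off so the read counts can be compared.

```python
class CountingStore:
    """Hypothetical stand-in for the Git object store; counts every read."""

    def __init__(self, parents_by_sha):
        self.parents_by_sha = parents_by_sha
        self.reads = 0

    def get_object(self, sha):
        self.reads += 1
        # The "object" here is just the list of parent SHAs.
        return self.parents_by_sha[sha]


def traverse(store, heads, use_cache):
    """Return commits in parents-first order, in the style of the loop in the diff."""
    cache = {}  # plays the role of convert_list (assumption: a plain dict)
    done = set()
    commits = []
    todo = list(heads)
    while todo:
        sha = todo[-1]
        if sha in done:
            todo.pop()
            continue
        if use_cache and sha in cache:
            parents = cache[sha]
        else:
            parents = store.get_object(sha)
            cache[sha] = parents
        for p in parents:
            if p not in done:
                # Process the parent first; sha stays on the stack and is
                # visited again once the parent is done.
                todo.append(p)
                break
        else:
            commits.append(sha)
            done.add(sha)
            todo.pop()
    return commits


def linear_history(n):
    """Build a linear chain c0 <- c1 <- ... <- c(n-1) as {sha: [parents]}."""
    return {"c%d" % i: (["c%d" % (i - 1)] if i else []) for i in range(n)}


def count_reads(n, use_cache):
    """Traverse a linear history of n commits; return (order, number of reads)."""
    store = CountingStore(linear_history(n))
    order = traverse(store, ["c%d" % (n - 1)], use_cache)
    return order, store.reads
```

On a linear history of n commits, this sketch performs 2n - 1 reads without the cache (every commit except the root is read twice) and n reads with it. That matches the roughly halved pull time reported in the commit message, under the assumption that object reads dominate the cost.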