Summary:
Today, people running codemods or search/replace on their repos often accidentally corrupt their repos, and everyone ends up sad.
It's better to make them read-only.
Test Plan: python run-tests.py
Reviewers: rmcelroy, #sourcecontrol, durham, ttung
Reviewed By: durham
Subscribers: mitrandir, quark, durham
Differential Revision: https://phabricator.fb.com/D2807369
Tasks: 9431187
Signature: t1:2807369:1452192329:b5ed6606cb66b1c830fc3d3fb5a81e6120387b38
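A minimal sketch of the read-only idea, assuming "them" refers to files the extension writes to disk. The helper name make_readonly is hypothetical and not taken from the extension.

```python
import os
import stat


def make_readonly(path):
    """Clear every write bit on path, leaving the other permission bits alone."""
    mode = stat.S_IMODE(os.stat(path).st_mode)
    writebits = stat.S_IWUSR | stat.S_IWGRP | stat.S_IWOTH
    os.chmod(path, mode & ~writebits)
```

A codemod or search/replace tool that opens such a file for writing then fails with a permission error instead of silently rewriting it (unless it runs as a user that bypasses permission checks).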
Summary:
Historically we would move the old backup data blob to <name>+<int> so we had a
record of all the old data blobs we could search through for good commit
histories.
Since we no longer require that the data blobs have perfect commit histories,
these extra blobs just take up space.
This change makes us only store one old version (for debugging and recovery
purposes), which should save space on clients.
Also switched to atomic rename writes while we're at it.
Test Plan: Ran the tests
Reviewers: #sourcecontrol, ttung
Differential Revision: https://phabricator.fb.com/D2770675
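A minimal sketch of a single-backup write that uses atomic renames. The helper names and the "_old" suffix are hypothetical; the real blob layout may differ.

```python
import os
import tempfile


def atomic_write(path, data):
    """Write data to a temp file in the same directory, then rename it over path."""
    dirname = os.path.dirname(path) or "."
    fd, tmppath = tempfile.mkstemp(dir=dirname)
    try:
        with os.fdopen(fd, "wb") as f:
            f.write(data)
        # os.replace is an atomic rename when source and target share a filesystem.
        os.replace(tmppath, path)
    except BaseException:
        try:
            os.unlink(tmppath)
        except OSError:
            pass
        raise


def write_with_backup(path, data, suffix="_old"):
    """Keep exactly one previous version at path + suffix, then write the new data."""
    try:
        with open(path, "rb") as f:
            previous = f.read()
    except FileNotFoundError:
        previous = None
    if previous is not None:
        atomic_write(path + suffix, previous)
    atomic_write(path, data)
```

Readers see either the old or the new contents of each file, never a partial write, and repeated writes overwrite the one backup instead of accumulating numbered copies.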
Summary:
There was a race condition where there could be an exception when trying to
create directories that already exist.
Test Plan: Ran the tests
Reviewers: #sourcecontrol, ttung
Differential Revision: https://phabricator.fb.com/D2736268
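A minimal sketch of race-tolerant directory creation. The helper name ensure_dir is hypothetical. The point is to attempt the creation and tolerate "already exists" rather than checking for existence first, since another process can create the directory between the check and the call.

```python
import errno
import os


def ensure_dir(path):
    """Create path (and parents); succeed quietly if it is already a directory."""
    try:
        os.makedirs(path)
    except OSError as e:
        # Another process may have created the directory first; that is fine.
        if e.errno != errno.EEXIST or not os.path.isdir(path):
            raise
```

On Python 3, os.makedirs(path, exist_ok=True) covers the same case.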
Summary:
Attempting to maintain perfect history in the file blobs has become the most
complex, bug-prone, and performance-hurting aspect of remotefilelog. Let's just
drop this requirement and rely on upstream Mercurial's ability to fix up linkrevs
in the face of imperfect data.
The real solution for this class of problems is to make it so that the filelog
hashes are unique with respect to the commit that introduces them, but that's a
much harder problem.
Test Plan:
Ran the tests.
Made a commit with 1000 file changes. hg commit went from 15s to 7.5s. The difference will be even more dramatic for certain situations that are known to have caused problems in the past.
Reviewers: #sourcecontrol, pyd
Subscribers: rmcelroy, pyd
Differential Revision: https://phabricator.fb.com/D2686318
The rev graph building code was flawed because it didn't track second parents
correctly. This was caught when someone was developing an extension and
attempted to commit a merge commit in some way.
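A minimal sketch of building a temporary rev graph that keeps both parents. This is an illustration, not the extension's code: the function name and the shape of the input (a dict mapping each node to a (p1, p2) tuple, with None for a missing parent) are assumptions.

```python
NULLREV = -1


def build_rev_graph(start, parents):
    """Number the ancestors of start so every parent gets a lower rev than its child.

    Returns (revmap, parentrevs): node -> rev, and rev -> (p1rev, p2rev).
    """
    order = []
    seen = set()
    stack = [(start, False)]
    while stack:
        node, expanded = stack.pop()
        if node in seen:
            continue
        if expanded:
            # All parents of node have been emitted already.
            seen.add(node)
            order.append(node)
            continue
        stack.append((node, True))
        for p in parents.get(node, ()):
            if p is not None and p not in seen:
                stack.append((p, False))

    revmap = {node: rev for rev, node in enumerate(order)}
    parentrevs = {}
    for node, rev in revmap.items():
        ps = [p for p in parents.get(node, ()) if p is not None]
        p1 = revmap[ps[0]] if len(ps) > 0 else NULLREV
        p2 = revmap[ps[1]] if len(ps) > 1 else NULLREV  # the second parent must not be dropped
        parentrevs[rev] = (p1, p2)
    return revmap, parentrevs
```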
Summary:
Per @pyd's review of D1933267, we need to check for the linknode in cl.nodemap,
not in cl (whose __contains__ method only looks for revs and doesn't even check
for visibility... lolz).
Test Plan: ran tests
Reviewers: durham, sid0, pyd, ericsumner, lcharignon, davidsp, mitrandir
Reviewed By: mitrandir
Subscribers: akushner, daviser, pyd
Differential Revision: https://phabricator.fb.com/D1934941
Tasks: 6573011
Signature: t1:1934941:1427130649:b084635db9bfcd28c4d4a1bcf12a7500c06b323c
Summary:
The new version of adjust linknodes wasn't accounting for the fact that some
ancestries contained nodes that no longer exist. Check for that before looking
for common ancestors.
The old version of this code survived by luck. We were catching KeyErrors as one
base case, and it just happens that LookupError from the changelog is also a
KeyError, so it was getting caught and eaten.
Test Plan:
We should probably add a test, but I have to leave shortly and this is pretty
broken, so we'll have to take a rain check.
Reviewers: rmcelroy, pyd, sid0
Differential Revision: https://phabricator.fb.com/D1933267
Summary:
The new fixmappinglinknodes function was using recursion to traverse the file
history, but this would break for files with history that was extremely long
(stack overflow). Switch to using a manual stack approach.
Test Plan: Ran the tests (I'd added a test to cover this logic before).
Reviewers: sid0, davidsp, mitrandir, lcharignon, pyd, rmcelroy
Reviewed By: rmcelroy
Subscribers: michaelbarton
Differential Revision: https://phabricator.fb.com/D1931944
Signature: t1:1931944:1426884986:3a0ef144fb55b8c0533e5c5de90699a1823b891f
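A minimal sketch of replacing recursion with a manual stack. It is not the fixmappinglinknodes code itself: the function name and the input (a dict mapping each node to a tuple of parent nodes) are assumptions made for illustration.

```python
def collect_ancestors(start, parents):
    """Return start and all of its ancestors without using recursion."""
    seen = set()
    stack = [start]
    while stack:
        node = stack.pop()
        if node in seen:
            continue
        seen.add(node)
        # Push the parents instead of recursing into them.
        stack.extend(p for p in parents.get(node, ()) if p is not None)
    return seen
```

The stack is an ordinary list, so history depth is bounded by memory rather than by the interpreter's recursion limit.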
Summary:
Previously remotefilelog did not produce all the necessary local data blobs
when doing a peer push/pull if the incoming changegroup had two manifests
that referred to the same file revision. We would only create a file blob
containing the history for the first occurrence, then if the user tried to
access the file history for other occurrences they got an exception.
The fix is to add linkrev fixup logic, similar to the adjustlinkrev() method
from core Mercurial's filectx. Now, if no valid local file blob can be found, we
will compute a valid history by reading the changelog.
We might be able to write this data to disk in the future as well to prevent
having to repeatedly compute this.
Test Plan: Added a test
Reviewers: sid0, rmcelroy, pyd, mitrandir, lcharignon
Differential Revision: https://phabricator.fb.com/D1904453
Summary:
We've gotten reports of corrupt cache files, and the error message is pretty
obtuse (ValueError for converting a string to an int). This refactors the size
check into a function and provides a better error message.
Test Plan: Added a test
Reviewers: sid0, pyd, mitrandir, ericsumner, rmcelroy
Reviewed By: rmcelroy
Differential Revision: https://phabricator.fb.com/D1774721
Signature: t1:1774721:1420830671:afd54dde8fdc00e08ed1c6cb73bf9fdc7fac2327
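A minimal sketch of a size check with a clearer error. It assumes a blob starts with a decimal size followed by a NUL byte; that layout, the exception class, and the function name are assumptions for illustration.

```python
class CorruptBlobError(Exception):
    """Raised when a cache file does not start with a usable size header."""


def parse_size(raw, name="<unknown>"):
    """Return (headerlength, size) for a blob of the form b'<size>\\0<data>...'."""
    index = raw.find(b"\0")
    if index == -1:
        raise CorruptBlobError("corrupt cache file %s: missing size header" % name)
    header = raw[:index]
    try:
        size = int(header)
    except ValueError:
        raise CorruptBlobError(
            "corrupt cache file %s: invalid size header %r" % (name, header)
        )
    if size < 0:
        raise CorruptBlobError("corrupt cache file %s: negative size %d" % (name, size))
    return index, size
```

The caller sees which file is corrupt and why, rather than a bare ValueError from int().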
It is part of the revlog API and some extensions like tortoisehg rely on it. The
default implementation is the same as size so we can safely mimic this here.
A recent fix to make ancestor maps work with changeset evolution actually caused
a pretty serious regression. The ancestormap validation code was returning
ancestormaps with hidden ancestors if the first commit in the history was a
hidden node. This resulted in lots of invalid ancestries being returned.
Instead we only want to allow hidden ancestors in the map if the relativeto
commit has been explicitly set to a hidden node.
Summary:
When doing 'hg unshelve foo.txt' with Changeset Evolution enabled, uncommit will
first prune the commit, then try to read the filelog history to determine if any
renames need to be undone. Since the commit is now pruned, remotefilelog fails
to find any valid histories.
This fixes it to allow hidden histories if the filectx commit is hidden. It
also tweaks remotefilectx to produce commit-relative histories when possible,
which will result in more accurate histories.
Test Plan:
Ran hg uncommit in the evolve repo that had problems before. Verified
it now worked.
Reviewers: pyd, sid0
Differential Revision: https://phabricator.fb.com/D1587306
Summary: API change
Test Plan: @durham ran an amend.
Reviewers: durham
Reviewed By: durham
Subscribers: durham
Differential Revision: https://phabricator.fb.com/D1569510
Summary:
Upstream Mercurial changed the way merging works and added
revlog.commonancestorsheads. This changes remotefilelog to implement the same
API.
Previously we were able to use ancestors.genericancestors to do the graph
traversal. Upstream Mercurial has deleted that function though (since it is now
unused), so remotefilelog must now build a temporary rev graph in order to use
the ancestors.* apis.
Test Plan: Added a test. It failed without the fix, it passes with the fix.
Reviewers: sid0, davidsp, pyd
Differential Revision: https://phabricator.fb.com/D1566787
Summary: This was broken by recent changes.
Test Plan: Ran test suite.
Reviewers: durham
Reviewed By: durham
Differential Revision: https://phabricator.fb.com/D1558890
Tasks: 5170539
The current local cache is just files on disk, and this implementation detail
was spread across the extension. This change refactors it to hide the
implementation inside a class so that we can replace it with other
implementations (such as a sqlite local cache) later.
Previously the file service client was a global object that all repos could
share. This was a bit hacky and is no longer needed. Now the file service
client exists per repo instance.
This is part of a series of changes to abstract the local caching and remote
file service in such a way that we can plug and play implementations.
The alternate lookup code was mistakenly looking for only the last digit
instead of looking at the entire prefix. This meant files with more than 10
alternates would start failing to find histories, which breaks rebase.
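A minimal sketch of parsing the whole alternate number instead of a single digit. It assumes alternates are named <name>+<int>; the function name is hypothetical.

```python
def alternate_index(filename, base):
    """Return the integer N if filename is base + '+' + N, else None."""
    prefix = base + "+"
    if not filename.startswith(prefix):
        return None
    suffix = filename[len(prefix):]
    # Use the full numeric suffix, not just its last character.
    if not suffix.isdecimal():
        return None
    return int(suffix)
```

With this, "abc+12" is read as alternate 12 of "abc" rather than being mistaken for alternate 2.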
Enables specifying a name for a repo that is used in the cache key.
This allows multiple repos on a machine to share a cache without the
risk of keys overlapping.
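A minimal sketch of a cache key that includes the repo name. The key layout (repo name, then a hash of the file path, then the file node) and the function name are assumptions for illustration, not the extension's actual format.

```python
import hashlib
import os


def cache_key(reponame, filename, nodehex):
    """Build a cache key that is namespaced by the repo name."""
    pathhash = hashlib.sha1(filename.encode("utf-8")).hexdigest()
    return os.path.join(reponame, pathhash[:2], pathhash[2:], nodehex)
```

Two repos that contain the same path and node then map to different keys, so they can share one cache directory.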
The previous algorithm thought that if the system cache had the file rev, it was
guaranteed to be valid. This isn't true in the case of a machine in which
multiple people share the cache (one person may have pulled a rev but the other
hasn't).
The new algorithm is more explicit. It checks:
- system cache
- local cache
- local cache fallbacks
- remote cache
- master server
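The lookup order above can be sketched as a chain of sources tried in turn, where a hit only counts if it passes a validity check for the current repo. The function name, the (name, lookup) pairs, and the isvalid callback are assumptions for illustration.

```python
def find_valid_blob(key, sources, isvalid):
    """Try each source in order and return the first blob that is valid here.

    sources is an ordered list of (name, lookup) pairs; lookup(key) returns
    the blob or None. isvalid(blob) decides whether this repo can use it.
    """
    for name, lookup in sources:
        blob = lookup(key)
        if blob is not None and isvalid(blob):
            return name, blob
    raise KeyError(key)
```

A hit in the shared system cache that another user populated is skipped when it is not valid for this repo, and the search continues down the list.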
A rare bug can occur where the local file blob might not exist, but a valid old
version of that blob does exist. This refactors the linknode logic in ancestormap
to check the old versions if the server fetch fails to find the blob.
It still prints an ugly warning message from the server, but this whole issue is
quite rare anyway.
When the cache is stored on a filesystem, excessive stat calls can slow
Mercurial updates down dramatically. This reduces it to a single open call for
the cache location and if that fails, a single open call for the local location.
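A minimal sketch of the open-first approach, with no stat or exists checks beforehand. The function name and the two path arguments are hypothetical.

```python
def read_blob(cachepath, localpath):
    """Read a blob with one open() per location, cache first, then local."""
    for path in (cachepath, localpath):
        try:
            with open(path, "rb") as f:
                return f.read()
        except FileNotFoundError:
            # A missing file is an ordinary miss; move on to the next location.
            continue
    return None
```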