Summary:
Expands environment variables in the cacheprocess and cachepath config options,
so users can specify something like remotefilelog.cachepath=$HOME/.hgcache
Test Plan:
Set my cachepath to $HOME/.hgcache on my laptop and manually
performed a shallow clone. Verified data was put in ~/.hgcache
Reviewers: sid0
Differential Revision: https://phabricator.fb.com/D1342174
The current local cache is just files on disk, and this implementation detail
was spread across the extension. This change refactors it to hide the
implementation inside a class so that we can replace it with other
implementations (such as a sqlite local cache) later.
Previously the file service client was a global object that all repos could
share. This was a bit hacky and is no longer needed. Now the file service
client exists per repo instance.
This is part of a series of changes to abstract the local caching and remote
file service in such a way that we can plug and play implementations.
If the memcache process exited early, remotefilelog was throwing an exception
instead of falling back to the server. This change makes it fall back to the
server, and also print a warning that the cache connection closed early.
When falling back to the master server for cache misses, we only kept two
requests in flight at any time. Over high latency connections (like across
oceans) this resulted in very slow downloads.
This change increases the request size to 10,000 keys at once. This will keep
the size of the request lower than the tcp buffer size, while allowing us to
maximize our throughput.
Previously we sent the entire list of files to the fallback repo in a single ssh
write/flush. If the size of this write exceeded the tcp buffer on the receiving
end, the call would hang until the buffer had room. The problem is that the
receiving end (the server) is hung trying to send data back to the
client. Therefore it deadlocked.
The fix is to send and receive requests one at a time. We always have the next
request in flight while receiving so we shouldn't be waiting on requests too
often.
Enables specifying a name for a repo that is used in the cache key.
This allows multiple repos on a machine to share a cache without the
risk of keys overlapping.
The previous algorithm thought that if the system cache had the file rev, it was
guaranteed to be valid. This isn't true in the case of a machine in which
multiple people share the cache (one person may have pulled a rev but the other
hasn't).
The new algorithm is more explicit. It checks:
- system cache
- local cache
- local cache fallbacks
- remote cache
- master server
The remotefilelog extension currently doesn't work with tags. Adding include and
exclude patterns allows users to specify which files they want to treat as
shallow and which the want to download the entire history for. By excluding
.hgtags from being shallow, this enables tags to work in a mostly shallow repo.
This also enables largefile like scenarios where most files are full and only a
few large ones are kept remote.
A rare bug can occur where the local file blob might not exist, but a valid old
version of that blob does exist. This refactor the linknode logic in ancestormap
to check the old versions if the server fetch fails to find the blob.
It still prints an ugly warning message from the server, but this whole issue is
quite rare anyway.
A workingctx produces manifest entries with nullid+'a' or nullid+'m'
for any added or modified files. The extension was trying to prefetch
these but they didn't exist and caused an error. Luckily they are length
42 so we can check for them and not prefetch them.