Summary:
This introduces the high level classes that will implement the generic repack
logic.
Test Plan: Ran the repack in conjunction with later commits that use these apis.
Reviewers: rmcelroy, ttung, lcharignon, quark, mitrandir
Reviewed By: mitrandir
Differential Revision: https://phabricator.intern.facebook.com/D3249577
Signature: t1:3249577:1462225435:000f9cc29ae2a3d7fdbedf546c8936ef45d1e4cf
Summary:
Using range() allocates a full list, which is 2**16 entries in the fanout case.
Let's use xrange instead. This is a notable performance win when checking many
keys.
Also removed an unused variable and switched to a local `index` variable
instead of `self._index`, since this is a hot path.
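A minimal sketch of the pattern, assuming Python 2 and illustrative names (the
real index structure lives in the pack code):
```python
class packindex(object):
    # Illustrative only: `_index` and `fanoutsize` are assumptions, not
    # the actual remotefilelog attributes.
    def __init__(self, entries):
        self._index = entries
        self.fanoutsize = 2 ** 16

    def counthits(self, wanted):
        # range(self.fanoutsize) would allocate a 65536-entry list on
        # every call; xrange is lazy, so nothing is built up front.
        index = self._index  # hoist the attribute lookup out of the hot loop
        hits = 0
        for i in xrange(self.fanoutsize):
            if index.get(i) in wanted:
                hits += 1
        return hits
```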
Test Plan: Ran hg repack
Reviewers: rmcelroy, ttung, lcharignon, quark, mitrandir
Reviewed By: mitrandir
Differential Revision: https://phabricator.intern.facebook.com/D3249563
Signature: t1:3249563:1462240834:c19d6cbf0b6237f15ca8d81e8da856752df0ec59
Summary:
This adds a basic test suite for the historypack class, and fixes some issues it
found.
Test Plan: ./run-tests.py test-historypack.py
Reviewers: mitrandir, rmcelroy, ttung, lcharignon
Reviewed By: lcharignon
Differential Revision: https://phabricator.intern.facebook.com/D3237858
Signature: t1:3237858:1461884966:c0ec90a2735255e5ef70eade09915066a7b71ee5
Summary: Adds the same check-code test that upstream Mercurial uses.
Test Plan:
Ran it, and fixed all the failures. I won't land this commit until
all the failure fixes are landed.
Reviewers: #sourcecontrol, ttung, rmcelroy, wez
Reviewed By: wez
Subscribers: quark, rmcelroy, wez
Differential Revision: https://phabricator.intern.facebook.com/D3221380
Signature: t1:3221380:1461802769:19f5bdc209c05edb442faa70ae572ce31e2fbc95
Summary: Fix check-code failures in various store-related files
Test Plan: Ran the tests
Reviewers: #sourcecontrol, mitrandir, ttung
Reviewed By: mitrandir
Differential Revision: https://phabricator.intern.facebook.com/D3222465
Signature: t1:3222465:1461701300:34560288be4dc921f0252d4ad8fdc9c8d9357e23
Summary: These were missing, and only needed in exception cases.
Test Plan: nope
Reviewers: #sourcecontrol, rmcelroy, ttung
Reviewed By: rmcelroy
Subscribers: rmcelroy
Differential Revision: https://phabricator.intern.facebook.com/D3219749
Signature: t1:3219749:1461608742:91e3a721e78188c52431b6c5d1b3ad091e249c3a
Summary:
Now that we can read and write histpack files, let's add a store implementation that
can serve packed content.
My next set of commits (which haven't been written yet) will:
- add tests for all of this
Test Plan:
Ran the tests. Also repacked a repo, deleted the old cache files,
ran hg log FILE, and verified it produced results without hitting the network.
Reviewers: #sourcecontrol, ttung, mitrandir, rmcelroy
Reviewed By: mitrandir, rmcelroy
Subscribers: rmcelroy, mitrandir
Differential Revision: https://phabricator.intern.facebook.com/D3219765
Signature: t1:3219765:1461717992:9b2e8646c0555472fa00ee7059c0f283fd4c2c65
Summary:
The previous patch added logic to repack store history and write it to
a histpack file. This patch adds a pack reader implementation that knows how to
read histpacks.
Test Plan:
Ran the tests. Also tested this in conjunction with the next patch
which actually reads from the data structure.
Reviewers: #sourcecontrol, ttung, mitrandir, rmcelroy
Reviewed By: mitrandir, rmcelroy
Subscribers: rmcelroy, mitrandir
Differential Revision: https://phabricator.intern.facebook.com/D3219764
Signature: t1:3219764:1461718081:9d812b6aea87fe9eb48fdac9dbef282e4775c3c9
Summary:
This is an initial implementation of a history pack file creator and a repacker
class that can produce it. A history pack is a pack file that contains no file
content, just history information (parents and linknodes).
A histpack is two files:
- a .histpack file consisting of a series of file sections, each of which
contains a series of revision entries (node, p1, p2, linknode)
- a .histidx file containing a filename based index to the various file sections
in the histpack.
See the code for documentation of the exact format.
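As a hedged sketch of the entry shape only (the real on-disk layout, with its
headers, file sections, and index, is documented in the code), each revision
entry is four 20-byte sha1 nodes packed back to back:
```python
import struct

# Illustrative constants; the actual format lives in the pack code.
ENTRYFORMAT = '!20s20s20s20s'  # node, p1, p2, linknode
ENTRYSIZE = struct.calcsize(ENTRYFORMAT)  # 80 bytes

def packentry(node, p1, p2, linknode):
    return struct.pack(ENTRYFORMAT, node, p1, p2, linknode)

def unpackentry(data, offset=0):
    # Returns the (node, p1, p2, linknode) tuple at the given offset.
    return struct.unpack_from(ENTRYFORMAT, data, offset)
```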
Test Plan:
ran the tests. A future diff will add unit tests for all the new pack
structures.
Ran `hg repack` on a large repo. Verified pack files were produced in
.hg/store/packs. With a later diff applied, I verified that the data could be read
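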
correctly.
Reviewers: #sourcecontrol, mitrandir, ttung, rmcelroy
Reviewed By: rmcelroy
Subscribers: mitrandir, rmcelroy, mjpieters
Differential Revision: https://phabricator.intern.facebook.com/D3219762
Signature: t1:3219762:1461751982:e7bbc65e8f01c812fc1eb566d2d48208b0913766
Summary:
This forces the revisions in the datapack to be added in alphabetical order.
This makes the algorithm more deterministic, but otherwise has little effect.
Test Plan: Ran the tests, ran repack
Reviewers: #sourcecontrol, rmcelroy, ttung
Reviewed By: rmcelroy
Subscribers: rmcelroy
Differential Revision: https://phabricator.intern.facebook.com/D3219760
Signature: t1:3219760:1461687720:7be5fdc1419f8214c8c83074494b33214b3684ae
Summary:
Now that we can read and write datapack files, let's add a store implementation that
can serve packed content. With this patch, it's technically possible for someone
to prefetch and repack large portions of history for long term storage with
remotefilelog.
My next set of commits (which haven't been written yet) will:
- add tests for all of this
- add an indexpack format for packing ancestor metadata (the datapack only packs
revision content)
Test Plan:
Ran the tests. Also repacked a repo, deleted the old cache files, ran
hg up null && hg up master, and verified it checked out master with the
right files and without fetching blobs from the server.
Reviewers: #sourcecontrol, ttung, rmcelroy
Reviewed By: rmcelroy
Subscribers: rmcelroy
Differential Revision: https://phabricator.intern.facebook.com/D3205351
Signature: t1:3205351:1461751649:45a56b57d962a282aeef9478500a3b23495a0eb7
Summary:
The previous patch added logic to repack store contents and write it to a
datapack file. This patch adds a new store implementation that knows how to read
datapacks.
It's just a simple implementation without any parallelism. So there's room for
improvement.
Test Plan:
Ran the tests. Also tested this in conjunction with the next patch
which actually reads from the data structure.
Reviewers: #sourcecontrol, ttung, rmcelroy
Reviewed By: rmcelroy
Subscribers: rmcelroy
Differential Revision: https://phabricator.intern.facebook.com/D3205342
Signature: t1:3205342:1461750967:84377517cb1f285d37694a3f503d60ae85bacb66
Summary:
This is an initial implementation of a repack algorithm that can read data from
an arbitrary store (in this case the remotefilelog content store), and repack it
into a datapack.
A datapack is two files:
- a .datapack file consisting of a series of deltas (a delta may be a full text if the delta base is the nullid)
- a .dataidx file consisting of delta information and an index into the deltas
See the code for documentation of the exact format.
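A minimal sketch of the delta-chain idea, assuming illustrative names and a
caller-supplied delta patcher (the actual file layout is documented in the
code):
```python
nullid = '\0' * 20

def entrytext(entry, getbasetext, applydelta):
    """entry is an assumed (node, deltabase, delta) tuple; getbasetext
    resolves a node to its full text; applydelta patches a base text."""
    node, deltabase, delta = entry
    if deltabase == nullid:
        return delta  # a delta based on nullid is a full text
    return applydelta(getbasetext(deltabase), delta)
```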
Test Plan:
ran the tests
Ran `hg repack` in a large repo. Verified that a datapack and a dataidx file
were created in .hg/store/packs. The datapack used 148MB instead of the 439MB the
old remotefilelog storage used.
Reviewers: #sourcecontrol, ttung, rmcelroy
Reviewed By: rmcelroy
Subscribers: rmcelroy
Differential Revision: https://phabricator.intern.facebook.com/D3205334
Signature: t1:3205334:1461751366:ee4bf6a580ffb667071a8046fda6f0858b7f25ae
Summary:
This adds an API to the store contract that allows a store to return a list of
the name/node pairs it contains. This will be used to allow a repack
algorithm to list the contents of the store so it can repack it into another
store. The old remotefilelog blob store used namehash+node keys, which is
different from the new store API's name+node keys, so the getfiles()
implementation here has to perform a reverse namehash->name lookup to
satisfy the store API contract.
In the remotefilelog basestore implementation, it reads the file names from the
local data directory and the shared cache directory, and reverse resolves the
file name hashes into filenames to produce the list.
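A rough sketch of the reverse lookup described above, assuming a flat
hash-per-directory layout and a `knownnames` iterable to resolve against
(both are illustrative, not the real store layout):
```python
import hashlib
import os

def getfiles(storepath, knownnames):
    # Build the namehash -> name map once, then walk the store.
    reverse = dict((hashlib.sha1(name).hexdigest(), name)
                   for name in knownnames)
    for namehash in os.listdir(storepath):
        name = reverse.get(namehash)
        if name is None:
            continue  # a hash we cannot resolve back to a filename
        for node in os.listdir(os.path.join(storepath, namehash)):
            yield name, node
```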
Test Plan: ran the tests
Reviewers: #sourcecontrol, ttung, rmcelroy
Reviewed By: rmcelroy
Subscribers: rmcelroy
Differential Revision: https://phabricator.intern.facebook.com/D3205321
Signature: t1:3205321:1461751437:a7c44c2bbe153122a3b85b8d82907a112cf77b1a
Summary:
The old store API required that each store be able to return the complete
ancestor history for a given name/node pair. This patch allows a store to return
only the parts of history it knows about, and the union store will combine that
history with the history from other stores to produce the full result. This is
useful for stores like bundle files, which contain only a partial history
that needs to be annotated by the real store.
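A minimal sketch of how a union store might stitch partial histories together;
the `getancestors` contract here (a {node: (p1, p2, linknode)} map, empty when
the store knows nothing) is an assumption for illustration:
```python
nullid = '\0' * 20

def unionancestors(stores, name, node):
    ancestors = {}
    queue = [node]
    while queue:
        current = queue.pop()
        if current == nullid or current in ancestors:
            continue
        for store in stores:
            partial = store.getancestors(name, current)
            if partial:
                # Take whatever slice of history this store knows and
                # keep walking the parents it mentions.
                ancestors.update(partial)
                for p1, p2, linknode in partial.values():
                    queue.extend([p1, p2])
                break
    return ancestors
```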
Test Plan: ran the tests
Reviewers: #sourcecontrol, ttung, rmcelroy
Reviewed By: rmcelroy
Subscribers: rmcelroy
Differential Revision: https://phabricator.intern.facebook.com/D3205319
Signature: t1:3205319:1461751511:210740b82cc6767b2f0c393715ac93d8f1b96bc7
Summary:
The old store contracts required that every store be able to produce the full
text for a revision. This patch modifies the contract so that a store (like a
bundle file store) can serve a delta chain and the union store can combine delta
chains from multiple stores together to create the final full text.
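A hedged sketch of the combined-delta-chain idea (`getdeltachain` and its
return shape are assumptions for illustration):
```python
nullid = '\0' * 20

def fulltext(stores, name, node, applydelta):
    if node == nullid:
        return ''
    # Follow the chain across stores until we hit the nullid base.
    chain = []
    current = node
    while current != nullid:
        for store in stores:
            link = store.getdeltachain(name, current)  # (deltabase, delta) or None
            if link is not None:
                deltabase, delta = link
                chain.append(delta)
                current = deltabase
                break
        else:
            raise KeyError('no store has %s:%s' % (name, current))
    # The entry based on nullid is a full text; apply the rest on top.
    text = chain.pop()
    for delta in reversed(chain):
        text = applydelta(text, delta)
    return text
```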
Test Plan: ran the tests
Reviewers: #sourcecontrol, rmcelroy
Reviewed By: rmcelroy
Subscribers: rmcelroy
Differential Revision: https://phabricator.fb.com/D3205315
Signature: t1:3205315:1461669845:3eb8968566285f6221c7c44435b855cc65da33f4
Summary:
Instead of hard coding the list of stores in each union store, let's make it a
list and just test each store in order. This will allow easily adding new stores
and reordering the priority of the existing ones.
Also fix the remote store's contains function: 'contains' is the old name, and
it needs to be renamed to getmissing to fit the store contract.
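A sketch of the shape this enables, with getmissing narrowing the missing set
through each store in priority order:
```python
class unionstore(object):
    # Illustrative: stores are tried in list order, so adding a new store
    # or changing priorities is just editing the list.
    def __init__(self, *stores):
        self.stores = list(stores)

    def getmissing(self, keys):
        missing = keys
        for store in self.stores:
            if not missing:
                break
            missing = store.getmissing(missing)
        return missing
```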
Test Plan: ran the tests
Reviewers: #sourcecontrol, ttung, rmcelroy
Reviewed By: rmcelroy
Differential Revision: https://phabricator.fb.com/D3205314
Signature: t1:3205314:1461606028:3a513ac82c5de668a7e40bbf7cc88d8754e2f0bb
Summary:
A future patch is going to change the union store to just contain an ordered
list of stores. Therefore we need a special spot to record which store is the
one that should receive writes.
Test Plan: ran the tests
Reviewers: #sourcecontrol
Differential Revision: https://phabricator.fb.com/D3205307
Summary:
This is a generic topological sort and will be useful in the upcoming repacking
code.
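A generic Kahn-style sketch of the kind of utility described (not necessarily
the exact algorithm the repack code uses):
```python
import collections

def toposort(nodes, parentsof):
    """nodes: iterable of items; parentsof(n): iterable of n's parents.
    Returns the nodes ordered so every parent precedes its children."""
    nodes = list(nodes)
    nodeset = set(nodes)
    children = collections.defaultdict(list)
    degree = {}
    for n in nodes:
        ingraph = [p for p in parentsof(n) if p in nodeset]
        degree[n] = len(ingraph)
        for p in ingraph:
            children[p].append(n)
    queue = [n for n in nodes if degree[n] == 0]
    result = []
    while queue:
        n = queue.pop()
        result.append(n)
        for c in children[n]:
            degree[c] -= 1
            if degree[c] == 0:
                queue.append(c)
    return result
```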
Test Plan: Ran the tests
Reviewers: #sourcecontrol, ttung
Reviewed By: ttung
Differential Revision: https://phabricator.fb.com/D3204124
Signature: t1:3204124:1461260520:e1cb5c9d496f11e5f44e0cdbc5ba851b1573d2e1
Summary: Fix failures found by check-code.
Test Plan: Ran the tests
Reviewers: #sourcecontrol, ttung
Reviewed By: ttung
Differential Revision: https://phabricator.fb.com/D3221375
Signature: t1:3221375:1461648312:7dbdd59e6370cb32b90d864a623d8066028741e7
Summary: Fix failures found by check-code.
Test Plan: Ran the tests
Reviewers: #sourcecontrol, ttung
Reviewed By: ttung
Differential Revision: https://phabricator.fb.com/D3221373
Signature: t1:3221373:1461648284:23203c17f4a87e33ff4e9be17a8b99bddbcdff05
Summary: Fix failures found by check-code.
Test Plan: Ran the tests
Reviewers: #sourcecontrol, ttung
Reviewed By: ttung
Differential Revision: https://phabricator.fb.com/D3221371
Signature: t1:3221371:1461648217:e9702d761ab8fd6f85dee60a4c192cf25e784f11
Summary: Fix failures found by check-code.
Test Plan: Ran the tests
Reviewers: #sourcecontrol, ttung
Reviewed By: ttung
Differential Revision: https://phabricator.fb.com/D3221369
Signature: t1:3221369:1461648197:185cbbba61a9d1a7a1beacd64153185d0d0826ed
Summary: Fix failures found by check-code.
Test Plan: Ran the tests
Reviewers: #sourcecontrol, ttung
Reviewed By: ttung
Differential Revision: https://phabricator.fb.com/D3221366
Signature: t1:3221366:1461648117:088f3a5837393499e1a383af860bd1a935e0cba7
Summary: Fix failures found by check-code.
Test Plan: Ran the tests
Reviewers: #sourcecontrol, ttung
Reviewed By: ttung
Differential Revision: https://phabricator.fb.com/D3221365
Signature: t1:3221365:1461646159:efeb0478c66cbd49d4a0a6c02a79d530b42f8248
Summary: Apparently we need to `import errno` in `shallowutil.py`
Test Plan: Code Review
Reviewers: #sourcecontrol, ttung, durham
Reviewed By: durham
Differential Revision: https://phabricator.fb.com/D3195117
Signature: t1:3195117:1461031210:424912a96448a2a8cb37197f006cfa95d4ab1cb1
The recent refactor caused remotefilelog.size() to include rename metadata in
the size count, which meant the size didn't match what the rest of Mercurial
expected. This caused clean files to show up as dirty in hg status if they had a
'lookup' dirstate state and were renames.
Summary:
We've received a few complaints that receivemissing is throwing corrupt data
exceptions. My best guess is that we're not receiving all of the data for some
reason. Let's add an assertion to ensure all the data is present, so we can
narrow it down to a connection issue instead of actual corrupt data.
Test Plan: Ran the tests
Reviewers: #sourcecontrol, ttung
Differential Revision: https://phabricator.fb.com/D3136203
This was meant to be part of the previous stack of commits, but I pushed the
wrong stack. This patch addresses a number of code review feedback points, the
most visible being the renaming of 'contains' to something else (in this case
'getmissing').
The old way of fetching from the server required that the base store API expose
a way for outside callers to add fetch handlers to the store. This exposed some
of the underlying details of how data is fetched unnecessarily and added an
awkward subscription API.
Let's just treat our remote caches as another store we can fetch from, and
let the overarching configuration logic (in shallowrepo.py) connect all our
stores together in a union store.
The last major piece of functionality that needs to be moved into the new store
is the gc algorithm. This is just a copy-paste of the one that exists in
localcache.
Now that we have the new store abstraction, and now that remotefilelog.py writes
via it, let's also make fileserverclient write to the store via that API.
This required some refactoring of how receivemissing worked, so we could pass
the filename down, as that is required for writing to the store.
Now that remotefilelog.revision is implemented using the new contentstore, let's
switch remotefilelog.read to use that instead. This logic is almost identical to
what's in filelog.read.
We are refactoring the storage to be behind more abstract APIs. This patch
creates the new store objects on the repo and passes them to the
fileserverclient so it can add itself as a file provider, in the case of misses.
Future patches will refactor the storage logic into a more abstract API. This
patch adds a union store, which will allow us to check both local client storage and
shared cache storage, without exposing the difference at higher levels.
Summary:
When running inside chg, `reposetup` will be called once since `serve` is not
a `norepo` command. Then if the user runs a `norepo` command like `help`,
`runcommand` will receive `repo = None` and error out. Fix it by checking
`repo` explicitly.
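The shape of the fix, sketched as an illustrative wrapper (the real hook is
remotefilelog's runcommand override):
```python
def runcommand(orig, lui, repo, *args, **kwargs):
    # Under chg, norepo commands like `hg help` arrive with repo = None,
    # so guard before touching any repo attribute.
    if repo is not None:
        pass  # repo-dependent remotefilelog setup would go here
    return orig(lui, repo, *args, **kwargs)
```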
Test Plan: Run `chg help` and no exception is thrown.
Reviewers: #sourcecontrol, ttung, durham
Reviewed By: durham
Differential Revision: https://phabricator.fb.com/D3136328
Signature: t1:3136328:1459811387:3b86df9765aa5e20677031d6e9fc4bc3d524efa6
Summary:
Since we added the C code ancestor walk to this function, this python ancestor
walk is completely unnecessary, and can cause significant slowdowns if none of
the ancestors are known linknodes (it walks the entire history).
Test Plan: Ran the tests
Reviewers: #sourcecontrol, ttung
Differential Revision: https://phabricator.fb.com/D3136150
Summary:
Discovered by `hg log filename` in the hg-committed repo. It seems we missed
a check here.
Test Plan:
Run `hg log filename` in a non-remotefilelog repo with remotefilelog enabled
and make sure "warning: file log can be slow on large repos" is not printed.
Reviewers: #sourcecontrol, ttung, durham
Reviewed By: durham
Differential Revision: https://phabricator.fb.com/D3132523
Signature: t1:3132523:1459801676:bcba3bbcaf1c358ad11e8ad25c0a1d3cc2637a76
Summary: We would like to utilize Martijn's logtoprocess extension to log cache hit rate.
Test Plan: None so far, will update the diff later.
Reviewers: #sourcecontrol, ttung
Differential Revision: https://phabricator.fb.com/D3094765
Summary:
addchangegroupfiles doesn't take the pr (progress) function as a parameter
anymore; see the upstream change https://selenic.com/hg/rev/982e3ef7f5bf.
Test Plan: tests are passing now on the release branch
Reviewers: #sourcecontrol, ttung, durham
Reviewed By: durham
Differential Revision: https://phabricator.fb.com/D3107217
Signature: t1:3107217:1459211189:4ece7531aff6043fc3acbfe43e2f471781c25c9d
This allows the client to send a single batch request for all file contents
and then handle the responses as they stream back to the client, which should
improve both running time and the user experience around progress reporting.
Summary:
We've recently had to dig into two different issues that resulted in broken
files landing in the localcache; one was due to a problem with the data source
for our cacheprocess becoming corrupt and the other was due to a failed write
(ENOSPC) causing a truncated file to be left in the local cache.
It is desirable to perform some lightweight consistency checks before we return
data up to the caller of localcache, but prior to this diff the validation
functionality was coupled to configuring a log file.
Due to the shared nature of the localcache it's not always clear-cut where we
want to log localcache consistency issues, so it feels more flexible to
decouple logging from enabling checks.
This diff introduces `remotefilelog.validatecache` as a separate option that
can have three values:
* `off` - no checks are performed
* `on` - checks are performed during read and write
* `strict` - checks are performed during __contains__, read and write
The default is now `on`.
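A minimal sketch of how the three values might gate the checks (`checkblob` is
a hypothetical stand-in for the lightweight consistency check):
```python
import os

def checkblob(path):
    # Hypothetical lightweight check: file exists and is non-empty.
    return os.path.exists(path) and os.path.getsize(path) > 0

def validate(path, action, level='on'):
    """level is remotefilelog.validatecache: 'off', 'on', or 'strict'.
    'on' validates reads and writes; 'strict' also validates contains."""
    if level == 'off':
        return True
    if action == 'contains' and level != 'strict':
        return True
    return checkblob(path)
```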
Test Plan: `./run-tests.py --with-hg=../../hg-crew/hg`
Reviewers: #sourcecontrol, ttung
Differential Revision: https://phabricator.fb.com/D2941067
Tasks: 10044183, 9987694
Summary:
The srcrev passed to adjustlinknode can sometimes be None, which causes an
exception. The code that throws the exception was introduced recently as part of
taking advantage of a C fast path.
The fix is to move the srcrev check to be after the None handling.
Test Plan:
I'm not sure how to reproduce this naturally. I tried writing
tests that did rebases of renames, but they didn't trigger it. I verified the
fix manually by using the debugger to insert a None srcrev at the beginning of
adjustlinknode.
Reviewers: lcharignon, #sourcecontrol, ttung, mitrandir
Reviewed By: mitrandir
Differential Revision: https://phabricator.fb.com/D2944899
Tasks: 10066192
Signature: t1:2944899:1455735567:c8eea240885847061239bf3df0ea59dbbd0e4858
Summary:
I debugged an issue this past week where a set of machines had exhausted the
disk space available on the partition where the local cache was situated. This
particular tier didn't use cacheprocess, only the local cache. There were some
number of truncated files in the local cache.
Inspecting the code here, it looks like we're using atomictempfile incorrectly.
atomictempfile.close() will unconditionally rename the temp file into place,
and we were calling this from a finally handler.
It seems safest to remove the try/finally from around this section of code and
just let the destructor trigger to clean up the temporary file in the error
path; if we make it through writing the data, we then call close and have it
move the file into place.
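A sketch of the corrected shape, assuming Mercurial's atomictempfile semantics
(close() renames into place; the destructor discards the temp file):
```python
def writecacheblob(vfs, path, data):
    f = vfs(path, 'w', atomictemp=True)
    # Deliberately no try/finally around close(): if write() raises, we
    # never call close(), so the half-written temp file is discarded by
    # the destructor instead of being renamed over the real cache entry.
    f.write(data)
    f.close()  # only reached on success; renames the temp file into place
```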
Test Plan:
ran the tests. They don't cover this case, but at least I didn't
obviously break anything:
```
$ ./run-tests.py --with-hg=../../hg-crew/hg
...................
# Ran 19 tests, 0 skipped, 0 warned, 0 failed.
```
Reviewers: #sourcecontrol, ttung, mitrandir
Reviewed By: mitrandir
Subscribers: scyost
Differential Revision: https://phabricator.fb.com/D2940861
Tasks: 10044183
Signature: t1:2940861:1455673078:a7593d70c32151e13c8ccc31f92387e9c8cb23a0
Summary:
The adjustlinknode logic was pretty slow, since it did all the ancestry
traversal in python. This patch makes it first use the C fastpath to check if
the provided linknode is correct (which it usually is), before proceeding to the
slow path.
The fastpath can process about 300,000 commits per second, versus the 9,000
commits per second by the slow path.
This cuts 'hg log <file>' down from 5s to 2.5s in situations where the log spans
several hundred thousand commits.
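The control flow, sketched with caller-supplied callables standing in for the
real C and python walks:
```python
def adjustlinknode(linknode, fastcheck, slowwalk):
    """fastcheck: C-backed predicate (~300,000 commits/s) verifying the
    recorded linknode; slowwalk: python ancestry walk (~9,000 commits/s)."""
    if fastcheck(linknode):
        return linknode  # the usual case: the recorded linknode is correct
    return slowwalk(linknode)
```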
Test Plan:
Ran the tests, and ran hg log <file> on a file with a lot of history
and verified the time gain.
Reviewers: pyd, #sourcecontrol, ttung, quark
Reviewed By: quark
Subscribers: quark
Differential Revision: https://phabricator.fb.com/D2908532
Signature: t1:2908532:1454718666:c4e63d73057572f035082943ef2e6fe0a49238c1
Previously we limited the changelog scan for old commits to the most recent
100,000, under the assumption that most changes would be within that time frame.
This turned out to not be a good assumption, so let's remove the limitation.
For our uses of remotefilelog, life is significantly easier if we also
have the file path rather than just a hash of the file path. Hide this
behind a config knob so users can enable it or not as makes sense.
Summary:
The old linkrev lookup logic depended on the repo containing the latest commit
to have contained that particular version of the file. If the latest version had
been stripped however (like what happens in rebase --abort currently), the
linkrev function would attempt to scan history from the current rev,
trying to find the linkrev node.
If the filectx was not provided with a 'current node', the linkrev function
would return None. This caused certain places to break, like the Mercurial
merge conflict resolution logic (which constructs a filectx using only a
fileid, and no changeid, for the merge ancestor).
The fix is to allow scanning all the latest commits in the repo, looking for the
appropriate linkrev. This is pretty slow (1 second for every 14,000 commits
inspected), but is better than just returning None and crashing.
Test Plan:
Manually repro'd the issue by making a commit, amending it, stripping the
amended version and going back to the original, making two sibling commits on
top of the original, then rebasing sibling 1 onto sibling 2 (so that the
original commit that had the bad linknode data was the ancestor during the
merge). Previously this failed, now it passes. I'd write a test, but it's 11pm
and I'm tired and I need this in by early tomorrow morning to make the cut.
Reviewers: #sourcecontrol, ttung, rmcelroy
Reviewed By: rmcelroy
Subscribers: trunkagent, rmcelroy
Differential Revision: https://phabricator.fb.com/D2826850
Signature: t1:2826850:1452680293:cb8c1f8c20ce13ad632925137dbdce6e994ab360
Summary:
I somehow got a stacktrace with IPython on a non-remotefilelog repo that ran
this code and complained that fileservice didn't exist. I am not sure how it
happened, but let's make the call safer to match the pattern used elsewhere in
the file.
Test Plan: No stacktrace seen after that; it's a one-line change.
Reviewers: durham
Differential Revision: https://phabricator.fb.com/D2819402
Summary:
Today, people running codemods or search/replace on their repos often accidentally corrupt their repos, and everyone ends up sad.
It's better to make them read-only.
Test Plan: python run-tests.py
Reviewers: rmcelroy, #sourcecontrol, durham, ttung
Reviewed By: durham
Subscribers: mitrandir, quark, durham
Differential Revision: https://phabricator.fb.com/D2807369
Tasks: 9431187
Signature: t1:2807369:1452192329:b5ed6606cb66b1c830fc3d3fb5a81e6120387b38
Summary:
In 4fb35d8c2105 in core @durham removed _verify and replaced it with
verify, this patch makes remotefilelog compatible with those changes.
Test Plan: The tests still fail afterwards, but no longer fail on this.
Reviewers: ericsumner
Subscribers: durham
Differential Revision: https://phabricator.fb.com/D2791847
Summary:
Historically we would move the old backup data blob to <name>+<int> so we had a
record of all the old data blobs we could search through for good commit
histories.
Since we no longer require that the data blobs have perfect commit histories,
these extra blobs just take up space.
This change makes us store only one old version (for debugging and recovery
purposes), which should save space on clients.
Also switched to atomic rename writes while we're at it.
Test Plan: Ran the tests
Reviewers: #sourcecontrol, ttung
Differential Revision: https://phabricator.fb.com/D2770675
The newly added checkunknown prefetching apparently gets handed the full list of
files that are not present on disk right now, which includes all the files
outside of the sparse checkout. So we need to filter those out here.
Summary:
When running addremove, it needs to see the contents of the removed files so it
can determine if they are a rename. So we need to add bulk prefetching in this
situation.
Test Plan: Added a test
Reviewers: #sourcecontrol, ttung, rmcelroy
Reviewed By: rmcelroy
Subscribers: dcapra
Differential Revision: https://phabricator.fb.com/D2756979
Signature: t1:2756979:1450132279:668b8b160d792cad1ac37e2069716e20ea304f57
Summary:
During hg status Mercurial sometimes needs to look at the size of the contents
the file and compare it to what's in history, which requires the file blob.
This patch causes those files to be batch downloaded before they are compared.
There was a previous attempt at this (see the deleted code), but it only wrapped
the dirstate once at the beginning, so it was lost if the dirstate object was
replaced at any point.
Test Plan: Added a test to verify unknown files require only one fetch.
Reviewers: #sourcecontrol, ttung
Reviewed By: ttung
Subscribers: dcapra
Differential Revision: https://phabricator.fb.com/D2756768
Signature: t1:2756768:1450130997:7c7101efe66c998e3182dfbd848aa6b1a57d509f
Summary:
When doing an update, Mercurial checks if unknown files on disk match
what's in memory, otherwise it stops the checkout so it doesn't cause data loss.
We need to batch fetch the necessary files from the remotefilelog server for
this operation.
Test Plan: Added a test
Reviewers: #sourcecontrol, ttung, rmcelroy
Reviewed By: rmcelroy
Subscribers: dcapra
Differential Revision: https://phabricator.fb.com/D2756837
Signature: t1:2756837:1450132288:bc0530a07ea40aaeb2af1a93e4da82778cc11369
Summary:
Previously we recreated the ssh connection for each prefetch. In the case where
we were fetching files one by one (like when we forgot to batch request files),
it results in a 1+ second overhead for each fetch.
This change makes us hold onto the ssh connection and simply issue new requests
along the same connection.
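A sketch of the reuse pattern (the `connect` callable and `getfiles` peer
method are illustrative):
```python
class fileserverclient(object):
    def __init__(self, connect):
        self._connect = connect  # opens a new ssh peer (~1s of overhead)
        self._conn = None

    def request(self, keys):
        if self._conn is None:
            self._conn = self._connect()  # pay the setup cost once
        return self._conn.getfiles(keys)  # reuse the live connection
```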
Test Plan:
Some of the tests execute this code path (I know because I saw them
fail when I had bugs)
Reviewers: #sourcecontrol, ttung
Differential Revision: https://phabricator.fb.com/D2744688
localrepo.clone() was removed in hg revision 9996a5eb7344 (localrepo:
remove clone method by hoisting into hg.py, 2015-11-11).
Instead of localrepo.clone(), we now use exchange.pull(). However,
that method was already overridden in onetimeclientsetup(), which is
called from our new overriding of exchange.pull(). Since it should be
done first, we move that overriding from onetimeclientsetup() to
uisetup().
Summary:
There was a race condition where there could be an exception when trying to
create directories that already exist.
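The usual shape of the fix: treat EEXIST from makedirs as success, since
another process can create the directory between our check and our mkdir:
```python
import errno
import os

def ensuredirs(path):
    try:
        os.makedirs(path)
    except OSError as ex:
        if ex.errno != errno.EEXIST:
            raise  # a real failure, not the benign race
```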
Test Plan: Ran the tests
Reviewers: #sourcecontrol, ttung
Differential Revision: https://phabricator.fb.com/D2736268
Summary:
Previously hg gc would try to keep all files relevant to all heads in the repo.
If the repo has a lot of heads, reading the manifest for all of them and
building a massive set of all the files can be extremely slow.
Let's just keep files related to the most recent public heads.
Test Plan: Ran the tests. This improves 'hg gc' time on some repos from 2 hours to 10 minutes.
Reviewers: #sourcecontrol, ttung
Reviewed By: ttung
Differential Revision: https://phabricator.fb.com/D2733157
Signature: t1:2733157:1449558332:14bbea343600959155f5927913552304ab8f94a7
Before this patch, if the repofile was empty or containing bad entries we were
just crashing. This patch prevents the crash by catching the error and displays
some interesting information to debug issues.
If another process deletes files managed by localcache, then, the gc step would
fail. This patch prevents the failure and add interesting information to debug
the problem.
Summary:
Attempting to maintain perfect history in the file blobs has become the most
complex, bug-prone, and performance-hurting aspect of remotefilelog. Let's just
drop this requirement and rely on upstream Mercurial's ability to fixup linkrevs
in the face of imperfect data.
The real solution for this class of problems is to make it so that the filelog
hashes are unique with respect to the commit that introduces them, but that's a
much harder problem.
Test Plan:
Ran the tests.
Made a commit with 1000 files changes. hg commit went from 15s to 7.5s. The difference will be even more dramatic for certain situations that have known to have caused problems in the past.
Reviewers: #sourcecontrol, pyd
Subscribers: rmcelroy, pyd
Differential Revision: https://phabricator.fb.com/D2686318
Summary:
Previously we would keep all server cache files for any head in the repo, even
if that head was really old. This resulted in unnecessarily large server caches.
The new strategy is to keep the files necessary for any commit within the past
25,000 revs or so. Even on repos with large commit rates this equates to
multiple weeks of time.
Test Plan: Ran the tests
Reviewers: #sourcecontrol
Differential Revision: https://phabricator.fb.com/D2652542
Summary:
Previously, hg log -fr master file/ was very slow with remotefilelog because
Mercurial decides whether to take the slowpath (i.e. walk the changelog) or the
filelog path based on whether the filelog exists in the repo. remotefilelog has
no way to know if the filelog exists (since there's no full list of filelogs),
so it fakes it by returning 'True' any time Mercurial asks; then, when the
filelog is needed, remotefilelog walks the entire changelog to build a
fake-looking filelog. So Mercurial attempted to take the filelog path, and
remotefilelog did a very slow walk.
The fix is to force mercurial to take the slowpath when it sees 'hg log -fr
revset file'. Technically we could take the fast path by inspecting all the
results of the revset and seeing if the file/pattern exists as a file in any of
those. But that could be expensive and complicated, so this naive fix will
suffice for now.
Test Plan: Added a test. Previously it resulted in no output
Reviewers: cdelahousse, rmcelroy, #sourcecontrol
Differential Revision: https://phabricator.fb.com/D2634918
Summary: thg and 'hg serve' stack trace when trying to view a file. The
correct fix is to walk back the changelog and look to see which commit was the
first one to touch the specific file. In the meantime, this makes the
graphical UIs usable.
Test Plan: ran tests
Reviewers: durham, rmcelroy
Reviewed by: rmcelroy
Summary:
It is possible to mark the cache connection as closed but never close
the pipes, which leads to an error the next time the connection is opened for
use. Make sure we actually close and terminate everything when close is called.
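A hedged sketch of the fix's shape (the attribute names are illustrative):
```python
class cacheconnection(object):
    def __init__(self, process):
        self.process = process  # the cache subprocess
        self.pipei = process.stdin
        self.pipeo = process.stdout
        self.connected = True

    def close(self):
        # Tear down the pipes and the subprocess, rather than only
        # flipping the flag and leaving half-closed state behind.
        if not self.connected:
            return
        try:
            self.pipei.write('exit\n')
            self.pipei.close()
            self.pipeo.close()
        except IOError:
            pass  # the far side may already be gone
        self.process.wait()  # make sure the process actually terminates
        self.connected = False
```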
Test Plan: ran the tests
Reviewers: #sourcecontrol, durham
Reviewed By: durham
Differential Revision: https://phabricator.fb.com/D2540680
Tasks: 8712950
Signature: t1:2540680:1444841805:e9fd8f21ab370a599138bd8b0c3241543418521a
Summary:
We've received reports of non-batch fetches that do a ton of individual file
downloads. This patch adds logging to the blackbox for that.
Test Plan:
manually changed the code to trigger the logging and verified it came
out in the blackbox and had a warning message.
Reviewers: #sourcecontrol
Differential Revision: https://phabricator.fb.com/D2488803
We've gotten reports of users receiving corrupt file blobs directly from the
server. The corruption doesn't enter the cache pool, and we don't get any
further reports of it, so I think it's a transient issue caused by certain
readers reading the file before the writer has finished writing it.
Let's use atomic rename files to make this not happen.
I saw some crazy looking stack traces like this while testing an
improved implementation of our internal cacheprocess binary:
```
fileservice.prefetch([(self.filename, id)])
File "/usr/lib/python2.6/site-packages/remotefilelog/remotefilelog.py", line 78, in read
File "/usr/lib/python2.6/site-packages/remotefilelog/fileserverclient.py", line 357, in prefetch
raw = self._read(hex(node))
File "/usr/lib/python2.6/site-packages/remotefilelog/remotefilelog.py", line 283, in _read
missingids = self.request(missingids)
File "/usr/lib/python2.6/site-packages/remotefilelog/fileserverclient.py", line 196, in request
fileservice.prefetch([(self.filename, id)])
File "/usr/lib/python2.6/site-packages/remotefilelog/fileserverclient.py", line 357, in prefetch
missingid = cache.receiveline()
File "/usr/lib/python2.6/site-packages/remotefilelog/fileserverclient.py", line 105, in receiveline
self.close()
missingids = self.request(missingids)
File "/usr/lib/python2.6/site-packages/remotefilelog/fileserverclient.py", line 76, in close
File "/usr/lib/python2.6/site-packages/remotefilelog/fileserverclient.py", line 196, in request
self.pipei.write("exit\n")
missingid = cache.receiveline()
File "/usr/lib/python2.6/site-packages/remotefilelog/fileserverclient.py", line 105, in receiveline
ValueError: I/O operation on closed file self.close()
File "/usr/lib/python2.6/site-packages/remotefilelog/fileserverclient.py", line 76, in close
self.pipei.write("exit\n")
ValueError: I/O operation on closed file
```
it looks like we are somehow re-entrant (maybe referenced from multiple generators?) and get tripped
up if we're not careful about checking for or catching issues during the close() method call.
So let's be a little more careful :-)
Summary:
_walkstreamfiles() uses mercurial.store.decodedir(), so
mercurial.store needs to be imported.
Test Plan:
Confirmed that _walkstreamfiles() no longer throws an exception when cloning a
remote shallow repository.
Reviewers: durham, pyd, rmcelroy
Reviewed By: rmcelroy
Subscribers: net-systems-diffs@, exa, yogeshwer
Differential Revision: https://phabricator.fb.com/D2409648
Signature: t1:2409648:1441245825:00a758f6f0884b77572078589f18592ca6cb6fa4
Streaming clones were taking a while because apparently self.datafiles()
actually stats each .i file instead of just returning the list straight from
fncache. To fix this, let's not call datafiles() when we know the matcher is
going to reject everything anyways.
This significantly speeds up streaming clones.
Previously we'd just send one enormous batch for everything to the
server. This led to prolonged periods of no progress output for the
user. Now we send batches in smaller chunks (default is 100) which
gives the user some idea that things are working.
Includes a trivial test, which doesn't really verify that the batching
logic is used as described, but at least prevents the boneheaded error
I had in an earlier (unmailed) version of this patch which forgot to
use configint() when loading the config setting.
Without this, the only way to report a failure of a file load in a
batched set of getfile requests is to fail the entire batch, which is
potentially painful. Instead, add our own error reporting in-band
which the client can then detect and raise.
I'm not completely happy with the somewhat adhoc error reporting here,
but we expect our server to have at least one additional error ("not
allowed to see file contents") which will require some special
handling on our end, so we need some level of flexibility in the error
reporting protocol so we can extend it later. Sigh.
Open question: should we reserve some range of error codes so that
it's easy for strange custom servers to have related monkeypatches to
client code for custom handling of unforeseen-by-remotefilelog
conditions?
I couldn't figure out how to actually get the client to try loading
file contents over http in the test, but the get-with-headers test at
least proves that the server responses look the way I expect.
We were not prefetching the potential dependent files for the filelog revisions
we received over the wire. This resulted in a lot of non-batched downloads,
which was super slow. This fixes it by batch downloading the parents and delta
parents of the incoming filelog revisions and adds a test.
This lets clients send many getfile requests in a single transaction.
Note that this requires 76fcf62accb0 be applied to your Mercurial, or
you'll be bitten by a bug[0] in Mercurial's wireproto batching. As a
result of this change, remotefilelog now effectively requires the
upcoming Mercurial 3.5 if you want to use a specific release.
0: http://bz.selenic.com/show_bug.cgi?id=4739
Right now, this is a naive fetch-one-file method. The next change will
mark the method as batchable and use a batch in the client so that
many files can be requested in a single RPC.
The way the protocol is defined for getfiles interleaves reading
filenames and sending file contents, which works fine over ssh but is
incompatible with http.
This change is probably not necessary now that remotefilelog
correctly checks for its own capability first, but it helped me debug
so I left it in for completeness.
If we instead wrap wireproto.capabilities, then our capabilities don't
get transmitted via the hello command, so not all clients will notice
the new capability unless we do the wrapping here.
Test output is in the test that previously demonstrated the
defect. Note that there's still a defect: we're advertising the
capability over http even though we have no hope of the getfiles
method working over http.
The magic string 'internal' causes Mercurial to never blame
remotefilelog for being broken. I had suspected that remotefilelog
might work with 3.4, but the tests fail against 3.4.1, so I'm just
making testedwith empty.
The rev graph building code was flawed because it didn't track second parents
correctly. This was caught when someone was developing an extension and
attempted to commit a merge commit in some way.
repo.sopener has been deprecated since hg 2.3, and repo.svfs replaces
it. Since it's been dead for so long, let's just use svfs and call it
good enough.
Summary:
The incominghook was meant to pregenerate any remotefilelog blobs that were
likely to be needed shortly. Unfortunately it actually just slows down pushes,
since in large repos the hook takes longer than the push does sometimes.
So let's just remove it.
Test Plan: Apparently there were no tests for this :p
Reviewers: sid0, lcharignon, mitrandir, ericsumner, rmcelroy
Reviewed By: rmcelroy
Differential Revision: https://phabricator.fb.com/D2185894
Signature: t1:2185894:1435126819:e1e1125520411356eccff4baee31ab2938ebc0fe
Summary: I really don't think it should be in this list.
Test Plan: `hg`
Reviewers: durham, #sourcecontrol, rmcelroy
Reviewed By: durham, #sourcecontrol, rmcelroy
Subscribers: rmcelroy
Differential Revision: https://phabricator.fb.com/D1997655
Signature: t1:1997655:1429189594:aa8f355a6fc61e300f824be6b2fbd64a42dde2b5
Summary:
When adjustlinkrevs got moved to the filectx upstream, we incorrectly
moved it to the remotefilectx inside remotefilelog. We don't actually use
remotefilectx on the server, so wrapping it did nothing.
The fix is to move the wrapping to be in remotefilelogserver.py so it is
executed on the server side.
Test Plan:
Did a checkout with my shallow client pointed at a full repo with no
blob cache. Verified it went quickly (minutes, instead of hours).
Reviewers: pyd
Differential Revision: https://phabricator.fb.com/D2097851
Summary:
Since we only prefetch things that are in the sparse checkout, copy tracing
(which touches everything in the manifest diff) would do individual file
downloads for every file. Let's just remove those files from the copy tracing
check entirely since the user probably doesn't care if they're outside the
sparse checkout.
Test Plan: Added a test
Reviewers: sid0, rmcelroy, lcharignon, pyd
Differential Revision: https://phabricator.fb.com/D2083768
Summary:
Match with the latest version of core to pass the tests.
There were a couple of changes in core that broke the extension, I matched
those changes to make the test pass.
Test Plan: The tests are all passing
Reviewers: durham
Differential Revision: https://phabricator.fb.com/D2053958
Upstream now has a matcher on _computeforwardmissing which will allow us to only
prefetch the necessary parts of a sparse checkout.
Since we're now being returned an iterator, we need to convert it to a list
since we iterate over it and return it.
Summary:
Previously remotefilelog would prefetch every file in a commit. With the sparse
checkout extension we want to only prefetch things in the sparse checkout.
This commit makes remotefilelog aware of the possible existence of a sparse
matcher.
Test Plan: Added tests
Reviewers: sid0, rmcelroy, pyd, lcharignon
Subscribers: kang
Differential Revision: https://phabricator.fb.com/D1967207
Summary:
Per @pyd's review of D1933267, we need to check for the linknode in cl.nodemap,
not in cl (whose __contains__ method only looks for revs and doesn't even check
for visibility... lolz).
Test Plan: ran tests
Reviewers: durham, sid0, pyd, ericsumner, lcharignon, davidsp, mitrandir
Reviewed By: mitrandir
Subscribers: akushner, daviser, pyd
Differential Revision: https://phabricator.fb.com/D1934941
Tasks: 6573011
Signature: t1:1934941:1427130649:b084635db9bfcd28c4d4a1bcf12a7500c06b323c
Summary:
The new version of adjust linknodes wasn't accounting for the fact that some
ancestries contained nodes that no longer exist. Check for that before looking
for common ancestors.
The old version of this code survived by luck. We were catching KeyErrors as one
base case, and it just happens that LookupError from the changelog is also a
KeyError, so it was getting caught and eaten.
Test Plan:
We should probably add a test, but I have to leave shortly and this is pretty
broken, so we'll have to take a rain check.
Reviewers: rmcelroy, pyd, sid0
Differential Revision: https://phabricator.fb.com/D1933267
Summary:
The new fixmappinglinknodes function was using recursion to traverse the file
history, but this would break for files with history that was extremely long
(stack overflow). Switch to using a manual stack approach.
Test Plan: Ran the tests (I'd added a test to cover this logic before).
Reviewers: sid0, davidsp, mitrandir, lcharignon, pyd, rmcelroy
Reviewed By: rmcelroy
Subscribers: michaelbarton
Differential Revision: https://phabricator.fb.com/D1931944
Signature: t1:1931944:1426884986:3a0ef144fb55b8c0533e5c5de90699a1823b891f
Summary: I'm going to add a new parameter upstream. Make this more generic so that we don't have to try and support both the old and the new versions.
Test Plan: Ran tests with both old and new hg.
Reviewers: davidsp, rmcelroy, akushner, pyd, daviser, mitrandir, ericsumner, durham
Reviewed By: durham
Differential Revision: https://phabricator.fb.com/D1920172
Signature: t1:1920172:1426615175:d90bda3b3cc30f6e5f3149af82ae9e43dee39455
Summary:
Previously remotefilelog did not produce all the necessary local data blobs
when doing a peer push/pull if the incoming changegroup had two manifests
that referred to the same file revision. We would only create a file blob
containing the history for the first occurrence, then if the user tried to
access the file history for other occurrences they got an exception.
The fix is to add linkrev fixup logic, similar to the adjustlinkrev() method
from core Mercurial's filectx. Now, if no valid local file blob can be found, we
will compute a valid history by reading the changelog.
We might be able to write this data to disk in the future as well to prevent
having to repeatedly compute this.
Test Plan: Added a test
Reviewers: sid0, rmcelroy, pyd, mitrandir, lcharignon
Differential Revision: https://phabricator.fb.com/D1904453
Summary:
For hg-git conversions we're going to create commits without actually updating to the base. Currently, this causes lots of individual fetches.
The test demonstrates the issue -- without this patch it'll fetch the 2 files over 2 fetches, but with it it'll fetch the files over 1 fetch.
Test Plan: Ran the tests.
Reviewers: davidsp, rmcelroy, akushner, pyd, daviser, mitrandir, ericsumner, durham
Reviewed By: durham
Differential Revision: https://phabricator.fb.com/D1893721
Tasks: 6390769
Signature: t1:1893721:1425624679:5651f71d5023919e9321646275b681b573847c44
Upstream has refactored the copy logic to compute the file lists in separate
functions, so we no longer need to compute the file lists ourselves.
Update the README's Mercurial min-version since this change depends on new APIs
inside Mercurial.
Summary:
Upstream has moved _adjustlinkrev from being a global function to one
on the filectx. Let's do the same.
Test Plan: Ran the tests
Reviewers: mitrandir
Differential Revision: https://phabricator.fb.com/D1825043
Summary:
adjustlinkrev makes ancestor reading orders of magnitude slower,
so we need to avoid using it. Since adjustlinkrev already returns the linkrev in
certain cases, let's just force it to always return that during file blob
creation.
Test Plan:
Generated a few thousand blobs for www and fbcode using the old and new
methods and verified that they were byte-for-byte identical.
Reviewers: sid0, pyd, mpm, rmcelroy
Differential Revision: https://phabricator.fb.com/D1782400
Summary:
If the remotefilelog server was not specified in the hgrc, or if the project
hgrc wasn't trusted, it would throw an obtuse error about a NoneType string.
This fixes it to give a more informative error explaining the problem.
Test Plan: Added a test
Reviewers: sid0, pyd, mitrandir, ericsumner, rmcelroy
Reviewed By: rmcelroy
Differential Revision: https://phabricator.fb.com/D1774743
Signature: t1:1774743:1420830544:5122a8e11f668ee8c35996e0f4395883a31ce8b0
Summary:
There are reports of the local cache becoming invalid when stored on disk. This
adds an option that will do some basic validation and remediation for those
entries, and log some data to disk.
This is optional, since it incurs some performance overhead. We just want to use
it long enough to track down the issue.
Test Plan: Added a test
Reviewers: sid0, pyd, ericsumner, rmcelroy, mitrandir
Reviewed By: mitrandir
Differential Revision: https://phabricator.fb.com/D1774724
Signature: t1:1774724:1420827432:06ace9d1dc078f469e0f61ebd7f604fc3b606f6d
Summary:
We've gotten reports of corrupt cache files, and the error message is pretty
obtuse (ValueError for converting a string to an int). This refactors the size
check into a function and provides a better error message.
Test Plan: Added a test
Reviewers: sid0, pyd, mitrandir, ericsumner, rmcelroy
Reviewed By: rmcelroy
Differential Revision: https://phabricator.fb.com/D1774721
Signature: t1:1774721:1420830671:afd54dde8fdc00e08ed1c6cb73bf9fdc7fac2327
Summary: We were forgetting to pass these arguments on to the child function.
Test Plan: Visual inspection.
Reviewers: durham, davidsp, rmcelroy, akushner, pyd, daviser, mitrandir, ericsumner
Reviewed By: ericsumner
Differential Revision: https://phabricator.fb.com/D1773782
Signature: t1:1773782:1420765574:d73be08ab25265e4769d8bf70671f2ea1c13f8dd
Mercurial upstream does some fancy stuff inside introrev now to provide the
correct introrev. It relies on having the filelog though, so we need to avoid
it. Remotefilelog has perfect history knowledge, so we can just return the
correct linkrev.
Summary:
We're seeing some weird cache corruption errors when writing the cache to disk.
My best bet is there's multiple writes colliding and causing bad data, so let's
do atomic renames.
Test Plan: Ran the test suite
Reviewers: sid0, pyd, davidsp, rmcelroy
Reviewed By: rmcelroy
Subscribers: ericsumner, mitrandir
Differential Revision: https://phabricator.fb.com/D1747190
Signature: t1:1747190:1418865586:0a07e5243dfe9c1d5ea24f81874910d1080f24e2
It is part of the revlog API and some extensions like tortoisehg rely on it. The
default implementation is the same as size so we can safely mimic this here.
A recent fix to make ancestor maps work with changeset evolution actually caused
a pretty serious regression. The ancestormap validation code was returning
ancestormaps with hidden ancestors if the first commit in the history was a
hidden node. This resulted in lots of invalid ancestries being returned.
Instead we only want to allow hidden ancestors in the map if the relativeto
commit has been explicitly set to a hidden node.
Summary: Last bits needed to get remotefilelog over bundle2 working. Includes tests.
Test Plan: Ran tests, including with `--extra-config-opt experimental.bundle2-exp=True`
Reviewers: davidsp, akushner, pyd, rmcelroy, daviser, durham
Reviewed By: durham
Differential Revision: https://phabricator.fb.com/D1671738
Tasks: 5568731
Signature: t1:1671738:1415676482:b9e7a1f308919526b0c41fee54d89da876518ec7
Certain filectx constructions used the rev number of self._changeid. We
need to convert that to a node before using it. This was breaking blame. I've
now added a blame test too.
Bundlerepos work by providing a fake revlog layer above an existing revlog.
Since remotefilelog doesn't use revlogs for filelogs, bundlerepo's did not work.
This commit fixes it such that you can now hg pull from a bundle, as long as
that bundle is shallow (i.e. contains no file contents). This will work for the
common use case of trying to recover data from .hg/strip-backups.
For reference, shallow bundles don't contain any file data because we never
delete any file data from .hg/store/data when using remotefilelog. Even after
the commits have been stripped.
Upstream Mercurial has moved localrepo.pull into exchange.pull. This moves our
wrapping of that command out of shallowrepo and into __init__. Exchange is
becoming an increasingly important class, so we may want to think about moving
all exchange wrapper logic out to a separate module in remotefilelog.
repo.revs() no longer returns an object that can be indexed, so we can't use []
on it anymore. Let's call list() on it first.
The bookmark output from upstream Mercurial has also changed, so we need to
update the tests.
Summary:
When doing 'hg unshelve foo.txt' with Changeset Evolution enabled, uncommit will
first prune the commit, then try to read the filelog history to determine if any
renames need to be undone. Since the commit is now pruned, remotefilelog fails
to find any valid histories.
This fixes it to allow hidden histories if the filectx commit is hidden. It
also tweaks remotefilectx to produce commit-relative histories when possible,
which will result in more accurate histories.
Test Plan:
Ran hg uncommit in the evolve repo that had problems before. Verified
it now worked.
Reviewers: pyd, sid0
Differential Revision: https://phabricator.fb.com/D1587306
Summary:
Pull-prefetch would not download file versions from the server if the file
version already existed in the local cache or the local store data.
Unfortunately, if someone landed their commit, then later stripped their local
version, the local store data file version might become invalid and no local
cache version would exist. Meaning things like 'commit' might fail when offline.
This changes prefetch to always fetch from the server when dealing with files it
knows are from revs on the server.
Test Plan:
Added a test that makes local commits that already exist on the
server, and verifies that a pull-prefetch fetches the server file version,
despite that same version existing locally.
Reviewers: sid0, pyd, davidsp
Subscribers: orip
Differential Revision: https://phabricator.fb.com/D1607260
Summary:
The ResponseError exception expects a second argument; otherwise the code
handling it crashes.
Test Plan: The handling of the response error stops crashing.
Reviewers: durham
Differential Revision: https://phabricator.fb.com/D1581574
Summary:
If the orig function crashes before the fileservice is installed, the finally
clause explodes, shadowing the original error. This fixes that.
Test Plan:
The original crash is no longer shadowed by a crash in the finally clause.
Reviewers: durham
Differential Revision: https://phabricator.fb.com/D1581562
Summary: API change
Test Plan: @durham ran an amend.
Reviewers: durham
Reviewed By: durham
Subscribers: durham
Differential Revision: https://phabricator.fb.com/D1569510
Summary:
Upstream Mercurial changed the way merging works and added
revlog.commonancestorsheads. This changes remotefilelog to implement the same
API.
Previously we were able to use ancestors.genericancestors to do the graph
traversal. Upstream Mercurial has deleted that function though (since it is now
unused), so remotefilelog must now build a temporary rev graph in order to use
the ancestors.* apis.
Test Plan: Added a test. It failed without the fix, it passes with the fix.
Reviewers: sid0, davidsp, pyd
Differential Revision: https://phabricator.fb.com/D1566787
Summary: This was broken by recent changes.
Test Plan: Ran test suite.
Reviewers: durham
Reviewed By: durham
Differential Revision: https://phabricator.fb.com/D1558890
Tasks: 5170539
Summary:
With recent versions of Mercurial (>= 3.2, 4dfcf21a6aa7), revert uses status
information to determine the files that need to be touched. It then offers a
simple hook for extensions that need prefetching.
Test Plan:
Ran the tests. Certain tests depended on the old revert behavior (of
prefetching everything), so they required slight changes.
Reviewers: pyd, sid0, davidsp
Differential Revision: https://phabricator.fb.com/D1551059
Summary:
Changegroups have been refactored upstream and we need to update our
remotefilelog monkey patching accordingly.
Also fix an issue with the tests where 'function foo()' was not considered valid
on certain systems.
Test Plan: Ran the tests
Reviewers: pyd, sid0, davidsp
Differential Revision: https://phabricator.fb.com/D1551019
Summary:
Previously, if pullprefetch was set, we'd perform a prefetch of the
entire manifest of the specified revs (usually the public bookmarks). This
involved stat-ing all the relevant files in the cache to see if they already
existed, which added an extra 6 seconds or so to every pull.
Now we only prefetch the files that are different from our working copy. We
assume we already have all the files that are in our working copy. This reduces
the pullprefetch overhead significantly.
Test Plan:
Did a pull on my laptop. Verified it didn't hang for 6 seconds at the
prefetch stage. Also updated a test
Reviewers: davidsp, pyd, sid0
Reviewed By: sid0
Differential Revision: https://phabricator.fb.com/D1505841
Tasks: 4608894
Summary:
Previously, pullprefetch was executed during the repo.pull stage. This happens
before the bookmarks have been moved, so revsets like 'bookmark()' would
prefetch the wrong commits.
This change moves the pullprefetch logic to after the pull command is completely
finished. Updated a test to make sure this is caught.
Also fixes a bug where we were using linkrevs to read a manifest rev entry. We
should be using the manifest rev instead.
Test Plan: Added a test. Ran it.
Reviewers: sid0, pyd, davidsp
Differential Revision: https://phabricator.fb.com/D1483345
Summary: These commands (well, not the debug one) were visible in the shortlist that showed up when you type `hg`. They're not basic commands.
Test Plan: Ran `hg` with the extension enabled, didn't see those commands.
Reviewers: durham
Reviewed By: durham
Differential Revision: https://phabricator.fb.com/D1454931
Summary:
Due to a change in upstream mercurial, hg log with patterns was no longer
working. This fixes it by forcing hg log to take the slow path when using
patterns.
It also updates the warning messages to work when running hg log <file> from
within a subdirectory.
Test Plan: Ran the new tests
Reviewers: sid0
Differential Revision: https://phabricator.fb.com/D1450193
Summary:
Adds a remotefilelog.pullprefetch config option that accepts a revset. Whenever
a pull is run, the revs matched by that revset will be prefetched. The most
common value for this will be '(bookmark() + heads(all())) & public()', since it will download
almost everything necessary to work offline.
Test Plan: Added a test. Ran it.
Reviewers: davidsp, pyd, sid0
Reviewed By: sid0
Differential Revision: https://phabricator.fb.com/D1419420
Summary:
Expands environment variables in the cacheprocess and cachepath config options,
so users can specify something like remotefilelog.cachepath=$HOME/.hgcache
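The expansion itself is a one-liner; sketched here around an illustrative
config read:
```python
import os

def getcachepath(ui):
    path = ui.config('remotefilelog', 'cachepath')
    if path:
        # e.g. $HOME/.hgcache -> /home/user/.hgcache
        path = os.path.expandvars(path)
    return path
```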
Test Plan:
Set my cachepath to $HOME/.hgcache on my laptop and manually
performed a shallow clone. Verified data was put in ~/.hgcache
Reviewers: sid0
Differential Revision: https://phabricator.fb.com/D1342174
Summary: Pulling from a local non-remotefilelog repo to a remotefilelog repo was broken. This fixes it.
Test Plan: `hg pull` from a local non-remotefilelog repo to a remotefilelog repo.
Reviewers: durham
Reviewed By: durham
Differential Revision: https://phabricator.fb.com/D1341059
Summary:
Recent changes to upstream Mercurial have moved localrepo.getbundle and
localrepo.addchangegroupfiles to changegroup.py. remotefilelog wraps these
functions, and thus needs to be updated.
Applyupdate also had a function signature change, which is fixed here.
Minor fix to a test as well, which had a hard coded time instead of a glob.
Test Plan: ./run-tests.py --with-hg=/data/users/durham/hg/hg
Reviewers: sid0, davidsp, pyd, dschleimer
Differential Revision: https://phabricator.fb.com/D1260737
Previously shallow clones only worked using the streaming clone protocol. With
this change they work for the standard getbundle protocol as well. This is what
the majority of Mercurial users use, so we need to support that.
The current local cache is just files on disk, and this implementation detail
was spread across the extension. This change refactors it to hide the
implementation inside a class so that we can replace it with other
implementations (such as a sqlite local cache) later.
Previously the file service client was a global object that all repos could
share. This was a bit hacky and is no longer needed. Now the file service
client exists per repo instance.
This is part of a series of changes to abstract the local caching and remote
file service in such a way that we can plug and play implementations.
If the memcache process exited early, remotefilelog was throwing an exception
instead of falling back to the server. This change makes it fall back to the
server, and also print a warning that the cache connection closed early.
Summary:
hg bundle was producing shallow bundles. This change makes it produce full
sized bundles so they can be used in other repos.
Test Plan: Added a test
Reviewers: sid0
Reviewed By: sid0
CC: keegancsmith
Differential Revision: https://phabricator.fb.com/D1167462
Summary:
Previously requesting remotefilelog file blobs from the server required write
access in order to write the blob to the cache. This changes it to not abort
entirely if the user doesn't have write access.
Test Plan:
cd tests
./run-tests.py --with-hg=/data/users/durham/hg/hg test-permissions.t
Also ran the test without the fix and verified it fails.
Reviewers: sid0, davidsp, pyd, dschleimer
Reviewed By: dschleimer
Differential Revision: https://phabricator.fb.com/D1145976
Task ID: 3601184
Summary:
Adds a 'hg prefetch' command to remotefilelog for prepopulating the
local cache. Supports specifying revsets and file patterns to limit what is
downloaded.
Test Plan: ./run-tests.py test-prefetch.t --with-hg=/data/users/durham/hg/hg
Reviewers: dschleimer, sid0, davidsp, pyd, mpm
CC: kunalb, minyoung
Differential Revision: https://phabricator.fb.com/D1129942
The alternate lookup code was mistakenly looking for only the last digit
instead of looking at the entire prefix. This meant files with more than 10
alternates would start failing to find histories, which breaks rebase.
When falling back to the master server for cache misses, we only kept two
requests in flight at any time. Over high latency connections (like across
oceans) this resulted in very slow downloads.
This change increases the request size to 10,000 keys at once. This will keep
the size of the request lower than the tcp buffer size, while allowing us to
maximize our throughput.
Previously we sent the entire list of files to the fallback repo in a single ssh
write/flush. If the size of this write exceeded the tcp buffer on the receiving
end, the call would hang until the buffer had room. The problem is that the
receiving end (the server) is hung trying to send data back to the
client. Therefore it deadlocked.
The fix is to send and receive requests one at a time. We always have the next
request in flight while receiving so we shouldn't be waiting on requests too
often.
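A sketch of the one-request-in-flight pipelining (the pipe's send/receive
methods are illustrative):
```python
def fetchall(pipe, keys):
    # Keep exactly one request in flight: neither side ever has to
    # buffer the whole key list, so the TCP buffers cannot deadlock.
    keys = list(keys)
    results = []
    if not keys:
        return results
    pipe.send(keys[0])  # prime the pipeline
    for nextkey in keys[1:]:
        pipe.send(nextkey)              # next request goes out...
        results.append(pipe.receive())  # ...while we read the previous reply
    results.append(pipe.receive())  # drain the final response
    return results
```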
Enables specifying a name for a repo that is used in the cache key.
This allows multiple repos on a machine to share a cache without the
risk of keys overlapping.
Previously we used a global variable to track if the incoming connection was
from a shallow remote (based on whether the network command was a *_shallow command).
This is hacky and overall a bad idea. The new implementation stores the shallow
flag as a bundlecapability passed to the getbundle command.
A side effect of this is remotefilelog won't work with versions of mercurial
that don't use the getbundle command.
The previous algorithm thought that if the system cache had the file rev, it was
guaranteed to be valid. This isn't true in the case of a machine in which
multiple people share the cache (one person may have pulled a rev but the other
hasn't).
The new algorithm is more explicit. It checks:
- system cache
- local cache
- local cache fallbacks
- remote cache
- master server
Adds a cache client implementation using the opensource python-memcached
library. It's more of an educational example than a production-ready one since
it doesn't perform the requests asynchronously. It does however split up large
files into smaller chunks for you.
The remotefilelog extension currently doesn't work with tags. Adding include and
exclude patterns allows users to specify which files they want to treat as
shallow and which they want to download the entire history for. By excluding
.hgtags from being shallow, this enables tags to work in a mostly shallow repo.
This also enables largefile like scenarios where most files are full and only a
few large ones are kept remote.
A rare bug can occur where the local file blob might not exist, but a valid old
version of that blob does exist. This refactors the linknode logic in ancestormap
to check the old versions if the server fetch fails to find the blob.
It still prints an ugly warning message from the server, but this whole issue is
quite rare anyway.
When the cache is stored on a filesystem, excessive stat calls can slow
mercurial updates down dramatically. This reduces it to a single open call for
the cache location and if that fails, a single open call for the local location.
A workingctx produces manifest entries with nullid+'a' or nullid+'m'
for any added or modified files. The extension was trying to prefetch
these but they didn't exist and caused an error. Luckily their ids are length
42, so we can check for them and not prefetch them.
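The guard, sketched: a 20-byte node plus the one-character flag hex-encodes to
42 characters, so such ids can be filtered before prefetching (the names here
are illustrative):
```python
def prefetchable(fileids):
    # fileids: assumed (filename, hexid) pairs; workingctx entries whose
    # hex id is 42 characters (node + 'a'/'m' flag) can never be fetched.
    return [(f, hexid) for f, hexid in fileids if len(hexid) != 42]
```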