sapling

mirror of https://github.com/facebook/sapling.git synced 2024-10-10 08:47:12 +03:00

Author	SHA1	Message	Date
Siddharth Agarwal	89c409af60	verify: add new command to verify the contents of a Mercurial rev Since the Git to Mercurial conversion process is incremental, it's at risk of missing files, or recording files the wrong way, or recording the wrong commit metadata. Add a command called 'gverify' that can verify the contents of a particular Mercurial rev against the corresponding Git commit. Currently, this is limited to checking file names, flags and contents, but this can be made as robust as desired. Further additions will probably require refactoring git_handler.py a bit though. This function is pretty fast: on a Linux machine with a warm cache, verifying a repository with around 50,000 files takes just 20 seconds. There is scope for further improvement through parallelization, but conducting tree walks in parallel is non-trivial with the current worker infrastructure in Mercurial.	2014-02-26 14:19:24 -08:00
Siddharth Agarwal	56cbe49bb0	git_handler: remove init_if_missing This function is a no-op and can be removed.	2014-02-25 20:01:42 -08:00
Siddharth Agarwal	6d1bd2e02e	git_handler: make self.git a lazily evaluated property This allows other functions to be able to use the `git` property without needing to care about initializing it. An upcoming patch will remove the `init_if_missing` function.	2014-02-25 19:51:02 -08:00
Siddharth Agarwal	bbfc3bf8b0	overlayrevlog: handle root commits correctly Previously, we'd try to access commit.parents[0] and fail. Now, check for commit.parents being empty and return what Mercurial thinks is a repository root in that case.	2014-02-25 00:23:12 -08:00
Siddharth Agarwal	a5e956514d	overlayrevlog: handle rev = 0 correctly Previously we'd just test if gitrev was falsy, which it is if the rev returned is 0, even though it shouldn't be. With this patch, test against None explicitly. This unmasks another bug: see next patch for a fix and a test.	2014-02-25 00:20:22 -08:00
Siddharth Agarwal	fea85d57c6	git_handler: fix call to self.ui.progress in flush Since we now directly use progress on self.ui, we shouldn't pass in self.ui as the first argument. Oops.	2014-02-24 15:29:31 -08:00
Siddharth Agarwal	2cc81f4c1f	git_handler: don't compute tags for each tag imported Previously we'd recompute the repo tags each time we'd consider importing a Git tag. This is O(n^2) in the number of tags and produced noticeable slowdowns in repos with large numbers of tags. To fix this, compute the tags just once. This is correct because the only case where we'd have issues is if multiple new Git tags with the same name were introduced, which can't happen because Git tags cannot share names. For a repository with over 200 tags, this causes a no-op hg pull to be sped up by around 0.5 seconds.	2014-02-24 11:38:00 -08:00
Siddharth Agarwal	27784bf3bc	util: drop support for Mercurial < 1.4	2014-02-19 18:49:42 -08:00
Siddharth Agarwal	01fb9068a0	git_handler: replace util.progress with ui.progress util.progress was a shim for Mercurial < 1.4.	2014-02-19 18:49:28 -08:00
Siddharth Agarwal	01896282f6	overlay: drop support for Mercurial < 1.9	2014-02-19 18:46:56 -08:00
Siddharth Agarwal	a49ac12684	git_handler: remove old and bogus code for deleting entries from tags cache This code never worked for Mercurial >= 2.0, since it neither had repo._tags nor repo.tagscache.	2014-02-19 18:45:36 -08:00
Siddharth Agarwal	d7bad71d02	git_handler.save_tags: drop support for Mercurial < 1.9	2014-02-19 16:12:27 -08:00
Siddharth Agarwal	5ef555a629	git_handler.save_map: drop support for Mercurial < 1.9	2014-02-19 16:10:35 -08:00
Siddharth Agarwal	8f2d697a54	hgrepo.tags: drop support for Mercurial < 2.0 A new property called _tagscache was introduced in Mercurial 2.0, so the cache wasn't actually working. The contract for tags() also changed at some point -- it stopped returning nodes that weren't in the repo. This will need to be accounted for if we start using the tags cache again. However, it isn't very clear whether the Mercurial tags cache is actually worth doing, since we already have a separate in-memory cache for Git tags in the handler.	2014-02-19 16:09:23 -08:00
Siddharth Agarwal	41357ce554	hgrepo.push: drop support for Mercurial < 1.6	2014-02-19 15:55:45 -08:00
Siddharth Agarwal	732da34592	gitrepo: drop support for Mercurial < 1.7	2014-02-19 15:54:37 -08:00
Siddharth Agarwal	7e329463b5	getremotechanges: drop support for Mercurial < 1.7	2014-02-19 15:54:04 -08:00
Siddharth Agarwal	a1a2eb9b35	nodetags: drop support for Mercurial < 1.6	2014-02-19 15:53:14 -08:00
Siddharth Agarwal	c11b48a4e7	extsetup: drop support for Mercurial < 1.7	2014-02-19 15:52:14 -08:00
Siddharth Agarwal	6a0d42bac0	version: drop support for Mercurial 1.9.3 Upcoming patches will clean up some code that makes hg-git work with Mercurial versions < 2.0.	2014-02-19 15:48:27 -08:00
Siddharth Agarwal	6b4e5f67db	hg2git: fix subrepo handling to be deterministic Previously, the correctness of _handle_subrepos was based on the order the files were processed in. For example, consider the case where a subrepo at location 'loc' is replaced with a file at 'loc', while another subrepo exists. This would cause .hgsubstate and .hgsub to be modified and the file added. If .hgsubstate was seen _before_ 'loc' in the modified/added loop, then _handle_subrepos would run and remove 'loc' correctly, before 'loc' was added back later. If, however, .hgsubstate was seen _after_ 'loc', then _handle_subrepos would run after 'loc' was added and would remove 'loc'. With this patch, _handle_subrepos merely computes the changes that need to be applied. The changes are then applied, making sure removed files and subrepos are processed before added ones. This was detected by setting a random PYTHONHASHSEED (in this case, 3910358828) and running the test suite against it. An upcoming patch will randomize the PYTHONHASHSEED in run-tests.py, just like is done in Mercurial.	2014-02-19 20:52:59 -08:00
Siddharth Agarwal	689b38dc44	hg2git: move parse_subrepos to top level durin42 expressed a desire for this function to be at the top level.	2014-02-19 20:18:43 -08:00
Siddharth Agarwal	08f028a3c9	gitnodekw: use githandler from repo Since a fresh GitHandler is no longer created for every commit, this speeds up the {gitnode} template massively. For a repo with over 50,000 commits, the command hg log -l 10 --template '{gitnode}\n' speeds up from 2.4 seconds to 0.3.	2014-02-19 15:23:36 -08:00
Siddharth Agarwal	e7c06facc2	revset_gitnode: use githandler from repo	2014-02-19 15:22:54 -08:00
Siddharth Agarwal	1d58a0a197	revset_fromgit: use githandler from repo	2014-02-19 15:22:36 -08:00
Siddharth Agarwal	b1bbd30c48	getremotechanges: use githandler from repo	2014-02-19 15:15:01 -08:00
Siddharth Agarwal	886532ea23	findcommonoutgoing: use githandler from repo	2014-02-19 15:13:43 -08:00
Siddharth Agarwal	068acd034c	gclear: use githandler from repo	2014-02-19 15:12:59 -08:00
Siddharth Agarwal	fcd7e472fc	gexport: use githandler from repo	2014-02-19 15:12:42 -08:00
Siddharth Agarwal	298b98a518	gimport: use githandler from repo	2014-02-19 15:12:20 -08:00
Siddharth Agarwal	dd8bbcebed	gitrepo: drop unused _initializehandler function and handler property Also drop the GitHandler import. All this now lives on hgrepo.	2014-02-19 15:11:14 -08:00
Siddharth Agarwal	d239f557d1	gitrepo.listkeys: use githandler from localrepo Previously we'd load the git and hg maps twice on separate git handler objects. This avoids that. For a repo with over 50,000 commits, this brings a no-op hg pull down from 2.45 seconds to 2.37.	2014-02-19 15:07:19 -08:00
Siddharth Agarwal	772133c48a	hgrepo.tags: use githandler property Currently we call hgrepo.tags() separately for each tag. (This should be fixed at some point.) This avoids initializing a separate git handler for each tag. For a repository with over 150 tags, this brings down a no-op hg pull by 0.05 seconds.	2014-02-19 14:16:40 -08:00
Siddharth Agarwal	232c6612ae	hgrepo._findtags: use githandler property	2014-02-19 14:15:33 -08:00
Siddharth Agarwal	1c6dc044d5	hgrepo.findoutgoing: use githandler property	2014-02-19 14:14:54 -08:00
Siddharth Agarwal	f87f28dc3a	hgrepo.push: use githandler property	2014-02-19 14:14:01 -08:00
Siddharth Agarwal	63f40c5059	hgrepo.pull: use githandler property	2014-02-19 14:12:38 -08:00
Siddharth Agarwal	728f8df8de	hgrepo: expose git handler as a property This and upcoming patches have the goal of initializing a GitHandler just once for a Mercurial repo.	2014-02-19 14:12:03 -08:00
Siddharth Agarwal	7d37b2a516	git_handler: terminate new commit DAG traversal at known commits Any commit in _map_git is already known, so there's no point walking further down the DAG. For a repo with over 50,000 commits, this brings down a no-op hg pull from 38 seconds to 2.5.	2014-02-18 20:30:27 -08:00
Siddharth Agarwal	6f79df86d2	git_handler: use convert_list to cache git objects getnewgitcommits() does a weird traversal where a particular commit SHA is visited as many times as the number of parents it has, effectively doubling object reads in the standard case with one parent. This patch makes the convert_list a cache for objects, so that a particular Git object is read just once. On a mostly linear repository with over 50,000 commits, this brings a no-op hg pull down from 70 seconds to 38, which is close to half the time, as expected. Note that even a no-op hg pull currently does a full DAG traversal -- an upcoming patch will fix this.	2014-02-18 20:22:13 -08:00
Siddharth Agarwal	36052aca77	git_handler: note that new commits are returned in topo order This wasn't obvious to me at first.	2014-02-18 20:13:15 -08:00
Siddharth Agarwal	5e72b26e7b	git_handler: fix progress reset call	2014-02-16 01:13:10 -08:00
Siddharth Agarwal	298fec2a4b	git_handler: use repo.changelog.node instead of repo.lookup For a repo with over 50,000 commits, this brings down the computation of 'export' from 1.25 seconds to 0.25 seconds. To scale this to hundreds of thousands of commits, one solution might be to maintain the mapping in a DAG data structure mirroring the changelog, over which findcommonmissing can be used.	2014-02-16 01:11:47 -08:00
Siddharth Agarwal	d7dbce79bd	hg2git: call _handle_subrepos when .hgsubstate is removed Now that _handle_subrepos can handle .hgsubstate being removed, we should use it for that. The test changes make sure that the SHAs roundtrip.	2014-02-12 22:55:16 -08:00
Siddharth Agarwal	39d1c15298	hg2git: make _handle_subrepos work in the removed case A test for this will be included in an upcoming patch.	2014-02-12 21:19:04 -08:00
Siddharth Agarwal	ca74d6d967	hg2git: add 'new' prefix to _handle_subrepos variables An upcoming patch will introduce similar variables for self._ctx. This helps disambiguate.	2014-02-12 20:34:09 -08:00
Siddharth Agarwal	3cadf19b94	hg2git: factor out subrepo parsing into a separate function This code will be used in multiple contexts in an upcoming patch.	2014-02-12 20:28:28 -08:00
Siddharth Agarwal	44c13be822	hg2git: factor out remove path logic into a separate function This will be used by _handle_subrepos in an upcoming patch.	2014-02-12 19:50:56 -08:00
Siddharth Agarwal	94957f9a66	git_handler: remove collect_gitlinks now that it is unused	2014-02-15 16:21:49 -08:00
Siddharth Agarwal	8d0c4fe9f2	git_handler: fix hgsubstate generation Before this patch, in the git to hg conversion, .hgsubstate once created is never deleted, even if no submodules are any longer present. This is broken state, as shown by the test for which the SHA changes. Fix that by looking at the diff instead of just what submodules are present. Since 'gitlinks' now contains changed gitlinks, not all gitlinks, it no longer makes sense to gate gitmodules checks on that. This patch simply demonstrates that the test was broken; an upcoming patch will introduce more tests. Bonus: this also makes the import process faster because we no longer need to walk the entire tree to collect gitlinks. This will cause the SHAs of repos that have submodules added and then removed to change.	2014-02-14 15:44:50 -08:00
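
The rev = 0 fix in a5e956514d comes down to the difference between a falsy check and an explicit None check. A minimal Python sketch of that pitfall follows. All names here (REVS, lookup_rev, is_known_buggy, is_known_fixed) are hypothetical illustrations and are not taken from hg-git's overlayrevlog code.

```python
# Hypothetical mapping from a node id to its revision number.
# Illustrative data only; this is not hg-git's real structure.
REVS = {"aaa": 0, "bbb": 1}


def lookup_rev(node):
    """Return the revision number for node, or None if it is unknown."""
    return REVS.get(node)


def is_known_buggy(node):
    # Bug: revision 0 is falsy, so the first revision looks unknown.
    return bool(lookup_rev(node))


def is_known_fixed(node):
    # Fix: only None means "not found"; 0 is a valid revision number.
    return lookup_rev(node) is not None
```

With this sketch, is_known_buggy("aaa") reports the first revision as unknown, while is_known_fixed("aaa") accepts it and still rejects a node that is absent from the mapping.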


244 Commits