sapling

mirror of https://github.com/facebook/sapling.git synced 2024-10-10 00:45:18 +03:00

Author	SHA1	Message	Date
Pierre-Yves David	64e5cd2f7e	upgrade: extract code in its own module Given about 2/3 of 'mercurial.repair' is now about repository upgrade, I think it is fair to move it into its own module. An expected benefit is the ability to drop the 'upgrade' prefix of many functions. This will be done in coming changesets.	2017-04-07 18:53:17 +02:00
Jun Wu	809dfffea4	repair: use ProgrammingError	2017-03-26 16:53:28 -07:00
Matt Harbison	721da7fc07	repair: use context manager for lock management If repo.lock() raised inside of the try block, 'tr' would have been None in the finally block where it tries to release(). Modernize the syntax instead of just winching the lock out of the try block. I found several other instances of acquiring the lock inside of the 'try', but those finally blocks handle None references. I also started switching some trivial try/finally blocks to context managers, but didn't get them all because indenting over 3x for lock, wlock and transaction would have spilled over 80 characters. That got me wondering if there should be a repo.rwlock(), to handle locking and unlocking in the proper order. It also looks like py27 supports multiple context managers for a single 'with' statement. Should I hold off on the rest until py26 is dropped?	2017-03-23 23:47:23 -04:00
Pierre-Yves David	197ab7aeb0	repair: directly use repo.vfs.join The 'repo.join' method is about to be deprecated.	2017-03-08 16:53:39 -08:00
Pierre-Yves David	37925b72f7	vfs: use 'vfs' module directly in 'mercurial.repair' Now that the 'vfs' classes moved in their own module, let's use the new module directly. We update code iteratively to help with possible bisect needs in the future.	2017-03-02 13:29:43 +01:00
Simon Farnsworth	e0b70e4f7f	mercurial: switch to util.timer for all interval timings util.timer is now the best available interval timer, at the expense of not having a known epoch. Let's use it whenever the epoch is irrelevant.	2017-02-15 13:17:39 -08:00
Gregory Szorc	abe1c0e17e	repair: clean up stale lock file from store backup Since we did a directory rename on the stores, the source repository's lock path now references the dest repository's lock path and the dest repository's lock path now references a non-existent filename. So releasing the lock on the source will unlock the dest and releasing the lock on the dest will no-op because it fails due to file not found. So we clean up the dest's lock manually.	2016-11-24 18:45:29 -08:00
Gregory Szorc	a400e3d753	repair: copy non-revlog store files during upgrade The store contains more than just revlogs. This patch teaches the upgrade code to copy regular files as well. As the test changes demonstrate, the phaseroots file is now copied.	2016-11-24 18:34:50 -08:00
Gregory Szorc	93504084a0	repair: migrate revlogs during upgrade Our next step for in-place upgrade is to migrate store data. Revlogs are the biggest source of data within the store and a store is useless without them, so we implement their migration first. Our strategy for migrating revlogs is to walk the store and call `revlog.clone()` on each revlog. There are some minor complications. Because revlogs have different storage options (e.g. changelog has generaldelta and delta chains disabled), we need to obtain the correct class of revlog so inserted data is encoded properly for its type. Various attempts at implementing progress indicators that didn't lead to frustration from false "it's almost done" indicators were made. I initially used a single progress bar based on number of revlogs. However, this quickly churned through all filelogs, got to 99% then effectively froze at 99.99% when it got to the manifest. So I converted the progress bar to total revision count. This was a little bit better. But the manifest was still significantly slower than filelogs and it took forever to process the last few percent. I then tried both revision/chunk bytes and raw bytes as the denominator. This had the opposite effect: because so much data is in manifests, it would churn through filelogs without showing much progress. When it got to manifests, it would fill in 90+% of the progress bar. I finally gave up having a unified progress bar and instead implemented 3 progress bars: 1 for filelog revisions, 1 for manifest revisions, and 1 for changelog revisions. I added extra messages indicating the total number of revisions of each so users know there are more progress bars coming. I also added extra messages before and after each stage to give extra details about what is happening. Strictly speaking, this isn't necessary. But the numbers are impressive. For example, when converting a non-generaldelta mozilla-central repository, the messages you see are: migrating 2475593 total revisions (1833043 in filelogs, 321156 in manifests, 321394 in changelog) migrating 1.67 GB in store; 2508 GB tracked data migrating 267868 filelogs containing 1833043 revisions (1.09 GB in store; 57.3 GB tracked data) finished migrating 1833043 filelog revisions across 267868 filelogs; change in size: -415776 bytes migrating 1 manifests containing 321156 revisions (518 MB in store; 2451 GB tracked data) That "2508 GB" figure really blew me away. I had no clue that the raw tracked data in mozilla-central was that large. Granted, 2451 GB is in the manifest and "only" 57.3 GB is in filelogs. But still. It's worth noting that gratuitous loading of source revlogs in order to display numbers and progress bars does serve a purpose: it ensures we can open all source revlogs. We don't want to spend several minutes copying revlogs only to encounter a permissions error or similar later. As part of this commit, we also add swapping of the store directory to the upgrade function. After revlogs are converted, we move the old store into the backup directory then move the temporary repo's store into the old store's location. On well-behaved systems, this should be 2 atomic operations and the window of inconsistency should be very narrow. There are still a few improvements to be made to store copying and upgrading. But this commit gets the bulk of the work out of the way.	2016-12-18 17:00:15 -08:00
Gregory Szorc	b9b6954ea9	repair: begin implementation of in-place upgrading Now that all the upgrade planning work is in place, we can start doing the real work: actually upgrading a repository. The main goal of this commit is to get the "framework" for running in-place upgrade actions in place. Rather than get too clever and low-level with regards to in-place upgrades, our strategy is to create a new, temporary repository, copy data to it, then replace the old data with the new. This allows us to reuse a lot of code in localrepo.py around store interaction, which will eventually consume the bulk of the upgrade code. But we have to start small. This patch implements adding new repository requirements. But it still sets up a temporary repository and locks it and the source repo before performing the requirements file swap. This means all the plumbing is in place to implement store copying in subsequent commits.	2016-12-18 16:59:04 -08:00
Gregory Szorc	a3569d4b71	repair: determine what upgrade will do This commit introduces code for determining what actions/improvements an upgrade should perform. The "upgradefindimprovements" function introduces a mechanism to return a list of improvements that can be made to a repository. Each improvement is effectively an action that an upgrade will perform. Associated with each of these improvements is metadata that will be used to inform users what's wrong and what an upgrade will do. Each "improvement" is categorized as a "deficiency" or an "optimization." TBH, I'm not thrilled about the terminology and am receptive to constructive bikeshedding. The main difference between a "deficiency" and an "optimization" is a deficiency is always corrected (if it deviates from the current config) and an "optimization" is an optional action that goes above and beyond to improve the state of the repository (usually by requiring more CPU during upgrade). Our initial set of improvements identifies missing repository requirements, a single, easily correctable problem with changelog storage, and a set of "optimizations" related to delta recalculation. The main "upgraderepo" function has been expanded to handle improvements. It queries for the list of improvements and determines which of them will run based on the current repository state and user I went through numerous iterations of the output format before settling on a ReST-inspired definition list format. (I used bulleted lists in the first submission of this commit and could not get it to format just right.) Even with the various iterations, I'm still not super thrilled with the format. But, this is a debug* command, so that should mean we can refine the output without BC concerns.	2016-12-18 16:51:09 -08:00
Gregory Szorc	f42e2dcaac	repair: implement requirements checking for upgrades This commit introduces functionality for upgrading a repository in place. The first part that's implemented is testing for upgrade "compatibility." This is done by examining repository requirements. There are 5 functions returning sets of requirements that control upgrading. Why so many functions? Mainly to support extensions. Functions are easier to monkeypatch than module variables. Astute readers will see that we don't support "manifestv2" and "treemanifest" requirements in the upgrade mechanism. I don't have a great answer for why other than this is a complex set of patches and I don't want to deal with the complexity of these experimental features just yet. We can teach the upgrade mechanism about them later, once the basic upgrade mechanism is in place. This commit also introduces the "upgraderepo" function. This will be our main routine for performing an in-place upgrade. Currently, it just implements requirements checking. The structure of some code in this function may look a bit weird (e.g. the inline function that is only called once). But this will make sense after future commits.	2016-12-18 16:16:54 -08:00
Martin von Zweigbergk	e1f0ba8ef9	repair: combine two loops over changelog revisions This just saves a few lines.	2017-01-04 10:35:04 -08:00
Martin von Zweigbergk	92d0334538	repair: speed up stripping of many roots repair.strip() expects a set of root revisions to strip. It then builds the full set of descendants by walking the descendants of each. It is rare that more than a few roots get passed in, but if that happens, it will wastefully walk the changelog for each root. So let's just walk it once. I noticed this because the narrowhg extension was passing not only roots, but all the commits to strip. When there were tens of thousands of commits to strip, this resulted in quadratic behavior with that extension.	2017-01-04 10:07:12 -08:00
Durham Goode	52b8095f37	manifest: remove last uses of repo.manifest Now that all the functionality has been moved to manifestlog/manifestrevlog/etc, we can finally change all the uses of repo.manifest to use the new versions. A future diff will then delete repo.manifest. One additional change in this commit is to change repo.manifestlog to be a @storecache property instead of @property. This is required because some uses of repo.manifest require that it be settable (contrib/perf.py and the static http server). We can't do this in a prior change because we can't use @storecache on this until repo.manifest is no longer used anywhere.	2016-11-10 02:13:19 -08:00
Durham Goode	f980c11277	manifest: delete unused dirlog and _newmanifest functions As part of migrating all manifest functionality out of manifest.manifest, let's migrate a couple spots off of manifest.dirlog() to use the revlog specific accessor. Then we can delete manifest.dirlog() and other unused functions.	2016-11-10 02:13:19 -08:00
Martin von Zweigbergk	422165fd86	repair: make strip() return backup file path narrowhg wants to strip some commits and then re-apply them after applying another bundle. Having repair.strip() return the bundle path will be helpful for it.	2016-10-31 15:40:30 -07:00
FUJIWARA Katsunori	49079a5fce	repair: open a file with checkambig=True to avoid file stat ambiguity Before this patch, if the steps below occur at "the same time in sec", all of mtime, ctime and size are the same between (1) and (3). 1. append data to revlog-style file (and close transaction) 2. discard appended data by truncation of strip 3. append same size but different data to revlog-style file again Therefore, cache validation doesn't work after (3) as expected. To avoid such file stat ambiguity around truncation, this patch opens a file with checkambig=True. This patch also introduces "with" statement style, to ensure immediate invocation of close() after truncation, because closing the file is the only trigger to check (and get rid of) file stat ambiguity. This is a part of ExactCacheValidationPlan. https://www.mercurial-scm.org/wiki/ExactCacheValidationPlan	2016-09-22 21:52:00 +09:00
Martin von Zweigbergk	e3fc1041dc	strip: don't use "full" and "partial" to describe bundles The partial bundle is not a subset of the full bundle, and the full bundle is not full in any way that I see. The most obvious interpretation of "full" I can think of is that it has all commits back to the null revision, but that is not what the "full" bundle is. The "full" bundle is simply a backup of what the user asked us to strip (unless --no-backup). The "partial" bundle contains the revisions we temporarily stripped because they had higher revision numbers than some commit that the user asked us to strip. The "full" bundle is already called "backup" in the code, so let's use that in user-facing messages too. Let's call the "partial" bundle "temporary" in the code.	2016-09-19 09:14:35 -07:00
Martin von Zweigbergk	fe544b2a62	strip: clarify that user action is required to recover temp bundle If strip fails when applying the temporary bundle, the commits in the temporary bundle have not yet been applied, so the user will almost definitely want to apply the bundle. We should be more clear to the user about that than our current "partial bundle stored in...". Note that we will probably not be able to recover it automatically, since whatever made it fail (e.g. a hook) will most likely make it fail again. We need to give control back to the user to fix the problem before trying again.	2016-09-19 09:14:32 -07:00
Martin von Zweigbergk	c435ce4f96	strip: report both bundle files in case of exception (issue5368) If strip fails while recovering the temporary bundle (e.g. because a hook fails), we tell the user only about the backup bundle, not about the temporary bundle. Since the user did not ask to strip the commits in the temporary bundle, that's the more important bundle to mention, so let's do that (and also mention the backup bundle as usual).	2016-09-15 09:45:29 -07:00
Martin von Zweigbergk	19513cd917	strip: simplify some repeated conditions We check "if saveheads or savebases" in several places to see if we should or have created a bundle of the changesets to apply after truncating the revlogs. One of the conditions is actually just "if saveheads", but since there can't be savebases without saveheads, that is effectively the same condition. It seems simpler to check only once and from then on see if we created the file.	2016-09-15 10:18:56 -07:00
Augie Fackler	cc39e93327	repair: build dirlogs using manifest, rather than repo shortcut method As before, this was rarely used, so let's get rid of the convenience method.	2016-08-05 13:01:01 -04:00
Martin von Zweigbergk	82a5e7d944	treemanifests: actually strip directory manifests Stripping has only partly worked since f41815302d49 (repair: use cg3 for treemanifests, 2016-01-19): the bundle seems to have been created correctly, but revlog entries in subdirectory revlogs were not stripped. This meant that e.g. "hg verify" would fail after stripping in a tree manifest repo. To find the revisions to strip, we simply iterate over all directories in the repo (included in store.datafiles()). This is inefficient for stripping few commits, but efficient for stripping many commits. To optimize for stripping few commits, we could instead walk the tree from the root and find modified subdirectories, just like we do in the changegroup code. I'm leaving that for another day.	2016-06-30 13:06:19 -07:00
Augie Fackler	ad67b99d20	cleanup: replace uses of util.(md5|sha1|sha256|sha512) with hashlib.\1 All versions of Python we support or hope to support make the hash functions available in the same way under the same name, so we may as well drop the util forwards.	2016-06-10 00:12:33 -04:00
Laurent Charignon	94c489ecf6	strip: invalidate phase cache after stripping changeset (issue5235) When we remove a changeset from the changelog, the phase cache must be invalidated, otherwise it could refer to changesets that are no longer in the repo. To reproduce the failure, I created an extension querying the phase cache after the strip transaction is over. To do that, I stripped two commits with a bookmark on one of them to force another transaction (we open a transaction for moving bookmarks) after the strip transaction. Without the fix in this patch, the test leads to a stacktrace showing the issue: repair.strip(ui, repo, revs, backup) File "/Users/lcharignon/facebook-hg-rpms/hg-crew/mercurial/repair.py", line 205, in strip tr.close() File "/Users/lcharignon/facebook-hg-rpms/hg-crew/mercurial/transaction.py", line 44, in _active return func(self, *args, **kwds) File "/Users/lcharignon/facebook-hg-rpms/hg-crew/mercurial/transaction.py", line 490, in close self._postclosecallback[cat](self) File "$TESTTMP/crashstrip2.py", line 4, in test [repo.changelog.node(r) for r in repo.revs("not public()")] File "/Users/lcharignon/facebook-hg-rpms/hg-crew/mercurial/changelog.py", line 337, in node return super(changelog, self).node(rev) File "/Users/lcharignon/facebook-hg-rpms/hg-crew/mercurial/revlog.py", line 377, in node return self.index[rev][7] IndexError: revlog index out of range The situation was encountered in inhibit (evolve's repo) where we would crash following the volatile set invalidation submitted by Augie in cbc52a99d057d11790cf5011e877c6f698bf57bf. Before his patch the issue was masked as we were not accessing the phasecache after stripping a revision. This bug uncovered another bug in histedit (see explanation in issue5235). I changed the histedit test accordingly to avoid fixing two things at once.	2016-05-12 06:13:59 -07:00
Kostia Balytskyi	47727221bc	obsstore: move delete function from obsstore class to repair module Since one of the original patches was accepted already and people on the mailing list still have suggestions as to how this should be improved, I'm implementing those suggestions in the following patches (this and the ones that might follow).	2016-04-12 04:06:50 -07:00
Martin von Zweigbergk	4cc86f7b27	bundle: move writebundle() from changegroup.py to bundle2.py (API) writebundle() writes a bundle2 bundle or a plain changegroup1. Imagine away the "2" in "bundle2.py" for a moment and this change should make sense. The bundle wraps the changegroup, so it makes sense that it knows about it. Another sign that this is correct is that the delayed import of bundle2 in changegroup goes away. I'll leave it for another time to remove the "2" in "bundle2.py" (alternatively, extract a new bundle.py from it).	2016-03-28 14:41:29 -07:00
Anton Shestakov	e4ad4cc290	repair: specify unit for ui.progress in rebuildfncache()	2016-03-11 20:44:40 +08:00
Anton Shestakov	9c0d085179	repair: use 'rebuilding' progress topic in rebuildfncache()	2016-03-11 20:39:29 +08:00
Martin von Zweigbergk	d7ef7dd40a	treemanifest: fix debugrebuildfncache When I taught debugrebuildfncache about dirlogs in ebe9dacc63ba (treemanifests: fix streaming clone, 2016-02-04), I added a last-minute "if 'treemanifest' in repo" guard. That should have been checking for "... in repo.requirements". Fix that and add tests for it.	2016-02-07 21:44:38 -08:00
Martin von Zweigbergk	e50c296659	treemanifests: fix streaming clone Similar to the previous patch, the .hg/store/meta/ directory does not get copied when using "hg clone --uncompressed". Fix by including "meta/" in store.datafiles(). This seems safe to do, as there are only a few users of this method. "hg manifest" already filters the paths by "data/" prefix. The calls from largefiles also seem safe. The use in verify needs updating to prevent it from mistaking dirlogs for orphaned filelogs. That change is included in this patch. Since the dirlogs will now be in the fncache when using fncachestore, let's also update debugrebuildfncache(). That will also allow any existing treemanifest repos to get their dirlogs into the fncache. Also update test-treemanifest.t to use a directory name that requires dot-encoding and uppercase-encoding so we test that the path encoding works.	2016-02-04 08:34:07 -08:00
Martin von Zweigbergk	857d2206c3	repair: use cg3 for treemanifests The newly created helper changegroup.safeversion() knows to pick version 03 if the repo uses treemanifests, so just using that means we pick the right changegroup version.	2016-01-19 15:38:24 -08:00
Bryan O'Sullivan	541db2c882	with: use context manager for transaction in strip	2016-01-15 13:14:49 -08:00
Bryan O'Sullivan	8bfeb98530	with: use context manager for transaction in strip	2016-01-15 13:14:49 -08:00
Bryan O'Sullivan	da377129a6	with: use context manager in rebuildfncache	2016-01-15 13:14:49 -08:00
Bryan O'Sullivan	155e1de602	with: use context manager in rebuildfncache again	2016-01-15 13:14:49 -08:00
Laurent Charignon	0a74ada9c5	repair: improves documentation of strip regarding locks This patch adds a comment making it clear that we should hold a lock before calling repair.strip. The wording is the same as what we have for obsolete.createmarkers	2015-12-29 10:21:39 -08:00
Laurent Charignon	56e13fa832	repair: use bookmarks.recordchange instead of bookmarks.write Before this patch we were using the deprecated bookmarks.write api. This patch replaces the call to bookmarks.write by a call to bookmarks.recordchange. We move the bookmark code above the code removing the undo file because with bookmarks.recordchange we have to create a transaction that would create an undo file.	2015-11-30 16:38:29 -08:00
Pierre-Yves David	0089eccb8e	strip: pass source and url to bundle2 processing Restoring from a 'bundle2' was missing this data.	2015-10-20 16:01:33 +02:00
Augie Fackler	5888f3afd8	repair: use cg?unpacker.apply() instead of changegroup.addchangegroup()	2015-10-13 17:12:46 -04:00
Ryan McElroy	61f504197f	strip: factor out revset calculation for strip -B This will allow reusing it in evolve and overriding it in other extensions.	2015-10-09 14:48:59 -07:00
Pierre-Yves David	30913031d4	error: get Abort from 'error' instead of 'util' The home of 'Abort' is 'error' not 'util' however, a lot of code seems to be confused about that and gives all the credit to 'util' instead of the hardworking 'error'. In a spirit of equity, we break the cycle of injustice and give back to 'error' the respect it deserves. And screw that 'util' poser. For great justice.	2015-10-08 12:55:45 -07:00
Pierre-Yves David	316994e165	strip: compress bundle2 backup using BZ Storing uncompressed bundles on disk would be a regression. Strip backups using bundle2 are now compressed when requested.	2015-09-29 14:42:03 -07:00
Pierre-Yves David	124ee320a0	strip: use bundle2 + cg2 by default when repository use general delta The bundle10 format (plain changegroup-01) does not support general delta and results in expensive delta re-computation when stripping. If the repository is general delta, we store backups as bundle20 containing a changegroup-02 payload. We remove the experimental feature related to strip backup bundle format because this achieves the same goal in a leaner way. Removing the experimental option is fine, that is why it is experimental in the first place. Compression of these bundles is coming in later changesets.	2015-09-29 13:16:51 -07:00
Matt Mackall	dc3c45835d	merge with stable	2015-08-12 17:01:50 -05:00
Pierre-Yves David	618daeb9ac	strip: use the 'finally: tr.release' pattern during stripping The previous code was calling 'abort' in all exception cases. This was wrong when an exception was raised by post-close callback on the transaction. Calling 'abort' on an already closed transaction resulted in an error, shadowing the original error. We now use the same pattern as everywhere else. 'tr.release()' will abort the transaction if we escape the scope without closing it. We add a test to make sure we do not regress.	2015-08-08 14:50:03 -07:00
Wagner Bruna	f6ce8ccc47	repair: fix typo in warning message	2015-07-26 09:28:52 -03:00
Matt Mackall	c8d98cbf34	bundle2: fix type of experimental option	2015-06-25 17:56:06 -05:00
Gregory Szorc	a34c01ff2d	repair: use absolute_import	2015-08-08 19:50:48 -07:00
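
The "repair: speed up stripping of many roots" entry above (92d0334538) describes replacing one changelog walk per root with a single walk. A minimal sketch of that idea follows. It is not Mercurial's code: `descendants_of_roots` and the `parents` list are hypothetical stand-ins, and the sketch assumes revision numbers are ordered so that every parent has a lower number than its children.

```python
def descendants_of_roots(parents, roots):
    """Return the roots plus all of their descendants in one pass.

    `parents` is a hypothetical stand-in for a changelog: parents[rev] is a
    tuple of the parent revision numbers of `rev`. The sketch assumes that
    a parent always has a lower revision number than its child.
    """
    tostrip = set(roots)
    if not tostrip:
        return tostrip
    # Nothing below the lowest root can descend from a root.
    first = min(tostrip)
    for rev in range(first + 1, len(parents)):
        # A revision is a descendant if any parent is already collected.
        if any(p in tostrip for p in parents[rev]):
            tostrip.add(rev)
    return tostrip


# Toy history: 0 <- 1 <- {2, 3} <- 4 (merge of 2 and 3), and 0 <- 5.
toy_parents = [(), (0,), (1,), (1,), (2, 3), (0,)]
stripped = descendants_of_roots(toy_parents, {2, 3})
```

The cost is one pass over the revisions above the lowest root, however many roots are passed in, which avoids the per-root walk the commit message calls out.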

144 Commits
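
The "strip: use the 'finally: tr.release' pattern during stripping" entry (618daeb9ac) explains why aborting in every exception case can shadow the original error. The sketch below illustrates the pattern with a toy class. `ToyTransaction` and `run` are hypothetical names for illustration and do not reproduce Mercurial's transaction API.

```python
class ToyTransaction:
    """Hypothetical transaction used only to illustrate the pattern."""

    def __init__(self, postclose=None):
        self.state = "open"
        self._postclose = postclose

    def close(self):
        if self.state != "open":
            raise RuntimeError("transaction is not open")
        self.state = "closed"
        # A post-close callback may raise after the transaction is closed.
        if self._postclose is not None:
            self._postclose()

    def abort(self):
        if self.state != "open":
            raise RuntimeError("cannot abort a transaction that is not open")
        self.state = "aborted"

    def release(self):
        # Abort only if we left the scope without closing.
        if self.state == "open":
            self.abort()


def run(tr, work):
    """Run `work` inside `tr` using the try/finally release pattern."""
    try:
        work()
        tr.close()
    finally:
        tr.release()
```

With this shape, an exception raised by the post-close callback propagates unchanged, because `release()` does nothing once the transaction is closed. An exception raised by `work()` still leaves the transaction aborted.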