sapling

mirror of https://github.com/facebook/sapling.git synced 2024-10-09 16:31:02 +03:00

Author	SHA1	Message	Date
Mads Kiilerich	09567db49a	spelling: trivial spell checking	2015-10-17 00:58:46 +02:00
Matt Mackall	3b6391ff9a	copies: group bothnew with other sets	2015-08-19 15:40:13 -05:00
Matt Mackall	a4a77851e3	copies: rename renamedelete to renamedeleteset for clarity	2015-08-19 15:32:27 -05:00
Matt Mackall	505912bcc2	copies: move _makegetfctx calls into checkcopies	2015-08-19 15:26:08 -05:00
Matt Mackall	babc6236a3	copies: factor out setupctx into _makegetfctx This reduces the scope of mergecopies a bit	2015-08-19 15:17:33 -05:00
Matt Mackall	6be7bd49e6	copies: avoid reference to c1/c2 in makectx	2015-08-21 15:12:58 -05:00
Matt Mackall	846cb052bb	copies: move debug statement to appropriate place	2015-08-19 15:11:17 -05:00
Matt Mackall	06dcd73aae	copies: rename diverge2 to divergeset for clarity	2015-08-19 14:04:54 -05:00
Matt Mackall	cf71f5908f	copies: begin separating mergecopies sides	2015-08-19 13:40:18 -05:00
Matt Mackall	a37c0c906b	copies: rename ctx() to getfctx() for clarity	2015-08-19 13:09:54 -05:00
Durham Goode	1bef6fef82	copy: add flag for disabling copy tracing Copy tracing can be up to 80% of rebase time when rebasing stacks of commits in large repos (hundreds of thousands of files). This provides the option of turning off the majority of copy tracing. It does not turn off _forwardcopies() since that is used to carry copy information inside a commit across a rebase. This will affect the situation where a user edits a file, then rebases on top of commits that have moved that file. The move will not be detected and the user will have to manually resolve the issue (possibly by redoing the rebase with this flag off). The reason to have a flag instead of trying to fix the actual copy tracing performance is that copy tracing is fundamentally an O(number of files in the repo) operation. In order to know if file X in the rebase source was copied anywhere, we have to walk the filelog for every new file that exists in the rebase destination (i.e. a file in the destination that is not in the common ancestor). Without an index that lets us trace forward (i.e. from file Y in the common ancestor forward to the rebase destination), it will never be an O(number of changes in my branch) operation. In mozilla-central, rebasing a 3 commit stack across 20,000 revs goes from 39s to 11s.	2015-01-27 11:26:27 -08:00
Gregory Szorc	9440bd18c5	copies: use absolute_import	2015-08-08 00:41:13 -07:00
Matt Mackall	fc81d2a796	merge with stable	2015-05-26 14:52:47 -05:00
Matt Mackall	bd4df663a9	mergecopies: avoid slowdown from linkrev adjustment (issue4680) checkcopies was using fctx.rev() which it was expecting would be equivalent to linkrev() but was triggering the new _adjustlinkrev path. This was making grafts and merges with large sets of potential copies very expensive.	2015-05-26 06:45:18 -05:00
Martin von Zweigbergk	68e09d510f	copies: document hack for adding '' to set of dirs The root directory is not normally added to 'dirs' instances (although I think it should be). In copies.mergecopies, we call dirname() to get the directory of a path and then check for containment in the 'dirs' instances ('d1' and 'd2'). In order to easily handle files in the root directory, '/' is added to d1/d2. This results in the empty string being added to the sets, since what comes before the slash in '/' is an empty string. This seems less than obvious, so let's document it.	2015-05-22 14:02:04 -07:00
Durham Goode	586027b77d	copies: switch to using pathutil.dirname copies had its own dirname implementation. Now that pathutil has a common one, let's use that instead.	2015-05-22 12:58:27 -07:00
Durham Goode	610230ad03	copies: add matcher parameter to copy logic This allows passing a matcher down the pathcopies() stack to _forwardcopies(). This will let us add logic in a later patch to avoid tracing copies when not necessary (like when doing hg diff -r 1 -r 2 foo.txt).	2015-04-16 11:29:30 -07:00
Durham Goode	ce308b547b	copies: pass changectx instead of manifest to _computenonoverlap The _computenonoverlap function takes two manifests to allow extensions to hook in and read the manifest nodes produced by the function. The remotefilelog extension actually needs the entire changectx instead (which includes the manifest) so it can prefetch the subset of files necessary for a sparse checkout (and the sparse checkout depends on which commit is being accessed, hence the need for the changectx). I have tests in the remotefilelog extension that cover this.	2015-04-03 15:18:34 -07:00
Matt Mackall	db55434dfb	merge with stable	2015-03-20 17:30:38 -05:00
Pierre-Yves David	41927328f0	mergecopies: reuse ancestry context when traversing file history (issue4537) Merge copies traverses file history in search of copies and renames. Since 3.3 we do "linkrev adjustment" to ensure that duplicated filelog entries do not confuse the traversal. This "linkrev adjustment" involves ancestry testing and walking the changeset graph. If we do such a walk in the changeset graph for each file, we end up with 'O(<changesets>x<files>)' complexity, which creates massive issues. For example, grafting a changeset in Mozilla's repo went from 6 seconds to more than 3 minutes. There is a mechanism to reuse such ancestor computations across all files, but it has to be set up manually in situations where it makes sense to take such a shortcut. This changeset sets this mechanism up and brings the graft time back from 3 minutes to 8 seconds. To do so, we need more control over the way 'filectx' objects are instantiated during each 'checkcopies' call that 'mergecopies' makes. We add a new 'setupctx' that configures and returns a 'filectx' factory. The function makes sure the ancestry context is properly created, and the factory makes sure it is properly installed on each returned 'filectx'.	2015-03-20 00:30:35 -07:00
Matt Mackall	ab01fb226c	copies: use linkrev for file tracing limit This lets us lazily evaluate _adjustlinkrev.	2015-02-01 16:25:12 -06:00
Pierre-Yves David	e94f338ab6	_adjustlinkrev: reuse ancestors set during rename detection (issue4514) The new linkrev adjustment mechanism makes rename detection very slow, because each file rewalks the ancestor dag. To mitigate the issue in Mercurial 3.3, we introduce a simplistic way to share the ancestors computation for the linkrev validation phase. We can reuse the ancestors in that case because we do not care about sub-branching in the ancestors graph. The cached set will be used to check if the linkrev is valid in the search context. This is the vast majority of the ancestors usage during copies search since the uncached one will only be used when linkrev is invalid, which is hopefully rare.	2015-01-30 16:02:28 +00:00
Durham Goode	f07fec4ff0	copies: added manifests to computenonoverlap Commit d1f83f500b47 changed the computenonoverlap api's to not require the manifests. We actually need the manifests in the remotefilelog extension so we can find the file nodes for the various files that change. Let's add it back to the function signature with a note explaining why. This doesn't affect any behavior.	2015-03-10 13:56:05 -07:00
Martin von Zweigbergk	2de043381d	copies: only calculate 'addedinm[12]' sets once Pass the addedinm1 and addedinm2 instead of m1, m2, ma into _computenonoverlap() instead of calculating the sets twice.	2015-02-27 14:26:22 -08:00
Martin von Zweigbergk	d4eabc6ccd	copies: calculate 'bothnew' from manifestdict.filesnotin() In the same spirit as the previous change, let's now calculate the 'bothnew' variable using manifestdict.filesnotin().	2015-02-27 14:03:01 -08:00
Martin von Zweigbergk	93f839cfd2	copies: replace _nonoverlap() by calls to manifestdict.filesnotin() Now that we have manifestdict.filesnotin(), we can write _nonoverlap() in terms of that method instead, enabling future speedups when filesnotin() gets optimized, and perhaps making the code a little clearer at the same time.	2015-02-27 14:02:30 -08:00
Martin von Zweigbergk	199e845f93	copies: move code into new manifestdict.filesnotin() method copies._computeforwardmissing() finds files in one context that are not in the other. Let's move this code into a new method on manifestdict, so m1.filesnotin(m2) can be optimized for various types of manifests (we expect more types of manifests soon).	2015-02-27 13:57:37 -08:00
Durham Goode	7b9ea7ac55	copy: move _forwardcopies file logic to a function Moves the _forwardcopies missingfiles logic to a separate function so that other extensions which need to prefetch information about the files being processed have a hook point. This saves extensions from having to recompute this information themselves, and thus saves several seconds off of various commands (like rebase).	2015-01-27 17:24:12 -08:00
Durham Goode	5a2bff7069	copy: move mergecopies file logic to a function Moves the mergecopies nonoverlap logic to a separate function so that other extensions which may need to prefetch information about the files being processed have a hook point. This saves extensions from having to recompute this information themselves, and thus saves several seconds off of various commands (like rebase).	2015-01-27 17:23:18 -08:00
Mads Kiilerich	523c87c1fe	spelling: fixes from proofreading of spell checker issues	2014-04-17 22:47:38 +02:00
Ryan McElroy	365c7718eb	amend: fix amending rename commit with diverged topologies (issue4405) This addresses the bug described in issue4405: when obsolescence markers are enabled, amending a commit with a file move can lead to the copy information being lost. However, the bug is more general and can be reproduced without obsmarkers as well, as demonstrated by Pierre-Yves and put into the updated test. Specifically, graph topology divergences between the filelogs and the changelog can cause copy information to be lost during amends.	2014-10-16 06:35:06 -07:00
Matt Mackall	f663e5fc01	duplicatecopies: move from cmdutil to copies This is in preparation for moving its primary caller into merge.py, which would be a layering violation in the current location.	2014-10-13 14:33:13 -05:00
Mads Kiilerich	c55887b864	copies: guard debug section with ui.debugflag	2014-02-25 20:31:53 +01:00
Mads Kiilerich	210347bb4f	copies: remove _checkcopies wrapper - it does no good mergecopies might be doomed but it is not dead yet ...	2014-02-25 20:31:51 +01:00
Mads Kiilerich	43ddf0086b	copies: when both sides made the same copy, report it as a copy Not used yet ... but shows up in debug output.	2014-02-25 20:29:14 +01:00
Mads Kiilerich	3ee1a27c56	diff: search beyond ancestor when detecting renames This removes an optimization that was introduced in 5a644704d5eb but was too aggressive - as indicated by how it changed test-mq-merge.t . We are walking filelogs to find copy sources and we can thus not be sure to hit the base revision and find the renamed file there - it could also be in the first ancestor of the base ... in the filelog. We are walking the filelog and can thus not easily know when we hit the first ancestor of the base revision and which filename to look for there. Instead, we use _findlimit like mergecopies do: The lower bound for how far we have to go is found from the lowest changelog revision that is an ancestor of only one of the compared revisions. Any filelog ancestor with a revision number lower than that revision will be the ancestor of both compared revisions, and there is thus no reason to go further back than that.	2013-11-16 15:46:29 -05:00
Durham Goode	8921ef9d64	copies: refactor checkcopies() into a top level method This moves checkcopies() out of mergecopies() and makes it a top level function in the copies module. This allows extensions to override it. For example, I'm developing a filelog replacement that doesn't have rev numbers so all the rev number dependent implementation here needs to be replaced by the extension. No logic is changed in this commit.	2013-05-01 10:44:21 -07:00
Bryan O'Sullivan	4a4a5dde94	scmutil: use new dirs class in dirstate and context The multiset-of-directories code was open coded in each of these modules; this change gets rid of the duplication.	2013-04-10 15:08:26 -07:00
Siddharth Agarwal	563db40b8c	copies._forwardcopies: use set operations to find missing files This is a performance win for a number of reasons: - We don't iterate over contexts, which avoids a completely unnecessary sorted call + the O(number of files) abstraction cost of doing that. - We don't check membership in a context, which avoids another O(number of files) abstraction cost. - We iterate over the manifests in C instead of Python. For a large repo with 170,000 files, this improves perfpathcopies from 0.34 seconds to 0.07. Anything that uses pathcopies, such as rebase or diff --git between two revisions, benefits.	2013-04-04 20:22:29 -07:00
Mads Kiilerich	2e43383c70	copies: report found copies sorted	2012-12-12 02:38:14 +01:00
Mads Kiilerich	ff0cc1a7f9	copies: make the loss in _backwardcopies more stable A couple of tests show slightly more correct output. That is pure coincidence.	2013-01-15 02:59:12 +01:00
Siddharth Agarwal	28f04a41d2	copies: do not track backward copies, only renames (issue3739) The inverse of a rename is a rename, but the inverse of a copy is not a copy. Presenting it as such -- in particular, stuffing it into the same dict as real copies -- causes bugs because other code starts believing the inverse copies are real. The only test whose output changes is test-mv-cp-st-diff.t. When a backwards status -C command is run where a copy is involved, the inverse copy (which was hitherto presented as a real copy) is no longer displayed. Keeping track of inverse copies is useful in some situations -- composability of diffs, for example, since adding "a" followed by an inverse copy "b" to "a" is equivalent to a rename "b" to "a". However, representing them would require a more complex data structure than the same dict in which real copies are also stored.	2012-12-26 15:04:07 -08:00
Siddharth Agarwal	76d4a91eed	copies: make debug messages more sensible The -> in debug messages is currently overloaded to mean both source to dest and dest to source. To fix this, we add explicit labels and make the arrow direction consistent.	2012-12-26 15:03:58 -08:00
Siddharth Agarwal	8e266eb31f	copies: separate moves via directory renames from explicit copies Currently the "copy" dict contains both explicit copies/moves made by a context and pending moves that need to happen because the other context moved the directory the file was in. For explicit copies, the dict stores a destination to source map, while for pending moves via directory renames, it stores a source to destination map. The merge code uses this fact in a non- obvious way to differentiate between these two cases. We make this explicit by storing these pending moves in a separate dict. The dict still has a source to destination map, but that is called out in the docstring.	2012-12-26 14:50:17 -08:00
Matt Mackall	cbbdbdd866	copies: re-include root directory in directory rename detection (issue3511)	2012-06-27 13:41:04 -05:00
Thomas Arendsen Hein	6c60809dcc	merge: show renamed on one and deleted on the other side in debug output	2012-05-23 21:34:29 +02:00
Thomas Arendsen Hein	91a8201c52	merge: warn about file deleted in one branch and renamed in other (issue3074) For divergent renames the following message is printed during merge: note: possible conflict - file was renamed multiple times to: newfile file2 When a file is renamed in one branch and deleted in the other, the file still exists after a merge. With this change a similar message is printed for mv+rm: note: possible conflict - file was deleted and renamed to: newfile	2012-05-23 20:50:16 +02:00
Thomas Arendsen Hein	4735ff24c9	merge: do not warn about copy and rename in the same transaction (issue2113)	2012-05-23 17:25:48 +02:00
Matt Mackall	598d070281	copies: use ctx.dirs() for directory rename detection	2012-02-26 16:45:59 -06:00
Matt Mackall	a5ae14e360	copies: fix mergecopies doc mapping direction	2012-02-26 15:51:56 -06:00
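
Two ideas that recur in the messages above are finding "missing" files with set operations (563db40b8c, 199e845f93) and keeping only renames when copy information is inverted (28f04a41d2). The following is a minimal Python sketch of both, not Mercurial's implementation: manifests are modelled as plain dicts mapping path to file node, and the names files_not_in and backward_renames are hypothetical helpers introduced here for illustration.

```python
def files_not_in(m1, m2):
    """Return the set of paths present in m1 but absent from m2.

    m1 and m2 are stand-ins for manifests: plain dicts of path -> node.
    (Assumption: real manifests are richer objects; only keys matter here.)
    """
    return set(m1) - set(m2)


def backward_renames(forward_copies, newer_files):
    """Invert a {dest: source} copy map, keeping only true renames.

    If the source still exists in the newer context, the entry was a copy,
    and the inverse of a copy is not a copy, so it is dropped.
    """
    renames = {}
    for dest, source in sorted(forward_copies.items()):
        if source in newer_files:
            continue  # source survives, so this was a copy, not a rename
        renames[source] = dest
    return renames


# Hypothetical data: b.txt was renamed to c.txt, a.txt was copied to d.txt.
ancestor = {"a.txt": "n1", "b.txt": "n2"}
newer = {"a.txt": "n1", "c.txt": "n2", "d.txt": "n1"}
forward = {"c.txt": "b.txt", "d.txt": "a.txt"}

print(sorted(files_not_in(newer, ancestor)))  # ['c.txt', 'd.txt']
print(backward_renames(forward, newer))       # {'b.txt': 'c.txt'}
```

Only the files returned by files_not_in need their history traced, which is why the commit messages treat this set as the natural hook point for extensions that prefetch file data.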

97 Commits