sapling

mirror of https://github.com/facebook/sapling.git synced 2024-10-11 09:17:30 +03:00

Author	SHA1	Message	Date
Pulkit Goyal	b73ca160f7	py3: use dict.update() instead of constructing lists and adding them dict.items() returned a list on Python 2, whereas on Python 3 it returns a view object. So we required a workaround. Using dict.update() is better than constructing lists as it should save us some GC churn.	2017-06-01 01:14:02 +05:30
Stanislau Hlebik	8513dd5f4b	copies: introduce getdstfctx Previously `c2` may have had an incorrect linkrev because getsrcfctx set the wrong _descendantrev. getsrcfctx() sets the descendant rev equal to srcctx.rev() (see _makegetfctx()), but for `c2` the descendant rev should be dstctx. While we were lucky that it didn't break copytracing, it made it significantly slower in some cases. Besides, it broke some external extensions, for example remotefilelog.	2017-05-29 06:06:13 -07:00
Stanislau Hlebik	52f5b22254	copies: rename getfctx to getsrcfctx In the next patch we'll use getdstfctx. Let's rename getfctx to getsrcfctx in this patch.	2017-05-29 05:58:08 -07:00
Stanislau Hlebik	789df611a1	copies: remove msrc and mdst parameters This function already has lots of parameters. And we can get manifests from contexts. So let's get msrc and mdst parameters from srcctx and dstctx.	2017-05-29 05:57:25 -07:00
Stanislau Hlebik	d7b30bb1f8	copies: add dstctx parameter Add parameter with destination context	2017-05-29 05:57:03 -07:00
Stanislau Hlebik	92e5d5f67c	copies: rename ctx to srcctx In the next diff we'll pass new dstctx parameter. Let's rename ctx to srcctx in this patch.	2017-05-29 05:56:17 -07:00
Stanislau Hlebik	aa25365b25	copies: rename m2 to mdst Small refactoring to rename m2 to the clearer mdst.	2017-05-29 05:52:15 -07:00
Stanislau Hlebik	219c6a051d	copies: rename m1 to msrc Small refactoring that renames the `m1` parameter to the clearer name `msrc`.	2017-05-29 05:52:15 -07:00
Martin von Zweigbergk	c3406ac3db	cleanup: use set literals We no longer support Python 2.6, so we can now use set literals.	2017-02-10 16:56:29 -08:00
Durham Goode	3058681421	copies: remove use of manifest.matches Convert the existing use of manifest.matches to use the new api. This is part of getting rid of manifest.matches, since it is O(manifest).	2017-03-07 09:56:11 -08:00
Gábor Stefanik	b631d16eab	graft: support grafting changes to new file in renamed directory (issue5436)	2016-12-05 17:40:01 +01:00
Durham Goode	fb55c2fbf3	dirstate: change added/modified placeholder hash length to 20 bytes Previously the added/modified placeholder hash for manifests generated from the dirstate was a 21byte long string consisting of the p1 file hash plus a single character to indicate an add or a modify. Normal hashes are only 20 bytes long. This makes it complicated to implement more efficient manifest implementations which rely on the hashes being fixed length. Let's change this hash to just be 20 bytes long, and rely on the astronomical improbability of an actual hash being these 20 bytes (just like we rely on no hash ever being the nullid). This changes the possible behavior slightly in that the hash for all added/modified entries in the dirstate manifest will now be the same (so simple node comparisons would say they are equal), but we should never be doing simple node comparisons on these nodes even with the old hashes, because they did not accurately represent the content (i.e. two files based off the same p1 file node, with different working copy contents would have the same hash (even with the appended character) in the old scheme too, so we couldn't depend on the hashes period).	2016-11-10 02:19:16 -08:00
Durham Goode	03d313b5fd	dirstate: change placeholder hash length to 20 bytes Previously the new-node placeholder hash for manifests generated from the dirstate was a 21byte long string of "!" characters. Normal hashes are only 20 bytes long. This makes it complicated to implement more efficient manifest implementations which rely on the hashes being fixed length. Let's change this hash to just be 20 bytes long, and rely on the astronomical improbability of an actual hash being 20 "!" bytes in a row (just like we rely on no hash ever being the nullid). A future diff will do this for added and modified dirstate markers as well, so we're putting the new newnodeid in node.py so there's a common place for these placeholders.	2016-11-10 02:17:22 -08:00
Gábor Stefanik	5533b05a12	merge: avoid superfluous filemerges when grafting through renames (issue5407) This is a fix for a regression introduced by the patches for issue4028. The test changes are due to us doing fewer _checkcopies searches now, which makes some test outputs revert to the pre-issue4028 behavior. That issue itself remains fixed, we only skip copy tracing for files where it isn't relevant. As a nice side effect, this makes copy detection much faster when tracing backwards through lots of renames.	2016-10-25 21:01:53 +02:00
Gábor Stefanik	14dc42e666	copies: improve assertions during copy recombination - Make sure there is nothing to recombine in non-graftlike scenarios - More pythonic assert syntax	2016-10-18 02:09:08 +02:00
Gábor Stefanik	2f48be6841	copies: make _checkcopies handle copy sequences spanning the TCA (issue4028) When working in a rotated DAG (for a graftlike merge), there can be files that are renamed both between the base and the topological CA, and between the TCA and the endpoint farther from the base. Such renames span the TCA (and thus need both passes of _checkcopies to be fully detected), but may not necessarily be divergent. Make _checkcopies return "incomplete copies" and "incomplete divergences" in this case, and let mergecopies recombine them once data from both passes of _checkcopies is available. With this patch, all known cases involving renames and grafts pass. (Developed together with Pierre-Yves David)	2016-10-11 04:39:47 +02:00
Gábor Stefanik	c03c8792d5	checkcopies: add logic to handle remotebase As the two _checkcopies passes' ranges are separated by tca, not base, only one of the two passes will actually encounter the base. Pass "remotebase" to the other pass to let it know not to expect passing over the base. This is required for handling a few unusual rename cases.	2016-10-11 04:25:59 +02:00
Gábor Stefanik	912f58ada1	mergecopies: add logic to process incomplete data We first combine incomplete copies on the two sides of the topological CA into complete copies. Any leftover incomplete copies are then combined with the incomplete divergences to reconstruct divergences spanning over the topological CA. Finally we promote any divergences falsely flagged as incomplete to full divergences. Right now, there is nothing generating incomplete copy/divergence data, so this code does nothing. Changes to _checkcopies to populate these dicts are coming later in this series.	2016-10-04 12:51:54 +02:00
Gábor Stefanik	242c4897e8	checkcopies: handle divergences contained entirely in tca::ctx During a graftlike merge, _checkcopies runs from ctx to tca, possibly passing over the merge base. If there is a rename both before and after the base, then we're actually dealing with divergent renames. If there is no rename on the other side of tca, then the divergence is contained entirely in the range of one _checkcopies invocation, and should be detected "in the loop" without having to rely on the other _checkcopies pass.	2016-10-12 11:54:03 +02:00
Gábor Stefanik	d967d939d6	mergecopies: invoke _computenonoverlap for both base and tca during merges The algorithm of _checkcopies can only walk backwards in the DAG, never forward. Because of this, the two _checkcopies passes need to run from their respective endpoints to the TCA to cover the entire subgraph where the merge is being performed. However, detection of files new in both endpoints, as well as directory rename detection, need to run with respect to the merge base, so we need lists of new files both from the TCA's and the merge base's viewpoint to correctly detect renames in a graft-like merge scenario. (Series reworked by Pierre-Yves David)	2016-10-13 02:19:43 +02:00
Pierre-Yves David	cce3e9c3ad	copies: make it possible to distinguish between _computenonoverlap invocations _computenonoverlap needs to be invoked twice during a graft, and debugging messages should be distinguishable between the two invocations	2016-10-18 00:00:43 +02:00
Gábor Stefanik	7730b47e09	copies: make _checkcopies handle simple renames in a rotated DAG This introduces a distinction between "merge base" and "topological common ancestor". During a regular merge, these two are identical. Graft, however, performs a merge in a rotated DAG, where the merge base will not be a common ancestor at all in the original DAG. To correctly find copies in case of a graft, we need to take both the merge base and the topological CA into account, and track any renames between them in reverse. Fortunately we can detect this in advance, see comment in the code about "backwards". This patch only supports finding non-divergent renames contained entirely between the merge base and the topological CA. Further patches are coming to support more complex cases. (Pierre-Yves David was involved in the cleanup of this patch.)	2016-10-13 02:03:54 +02:00
Gábor Stefanik	60bab1ec6c	copies: compute a suitable TCA if base turns out to be unsuitable This will be used later in an update to _checkcopies. (Pierre-Yves David was involved in the cleanup of this patch.)	2016-10-13 02:03:49 +02:00
Gábor Stefanik	6250f7ff54	copies: detect graft-like merges Right now, nothing changes as a result of this, but we want to handle grafts differently from ordinary merges later. (Series developed together with Pierre-Yves David)	2016-10-13 01:47:33 +02:00
Gábor Stefanik	4adc2f1a6a	checkcopies: add a sanity check against false-positive copies When grafting a copy backwards through a rename, a copy is wrongly detected, which causes the graft to be applied inappropriately, in a destructive way. Make sure that the old file name really exists in the common ancestor, and bail out if it doesn't. This fixes the aggravated case of bug 5343, although the basic issue (failure to duplicate the copy information) still occurs.	2016-10-12 21:33:45 +02:00
Pierre-Yves David	604c8243a9	mergecopies: rename 'ca' to 'base' This variable was named after the common ancestor. It is actually the merge base that might differ from the common ancestor in the graft case. We rename the variable before a larger refactoring to clarify the situation. Similar rename was also applied to 'checkcopies' in a prior changeset.	2016-10-13 01:30:14 +02:00
Pierre-Yves David	4eabc75da9	copies: move variable document from checkcopies to mergecopies It appears that 'mergecopies' is the function consuming these data so we move the documentation there.	2016-10-13 01:26:33 +02:00
Pierre-Yves David	9df147eb63	checkcopies: pass data as a dictionary of dictionaries more are coming	2016-10-11 02:21:42 +02:00
Pierre-Yves David	80ed73689f	checkcopies: move 'movewithdir' initialisation right before its usage The 'movewithdir' had a lot of related logic all around the 'mergecopies'. However it is actually never containing anything until the very last loop in that function. We move the (simplified) variable definition there for clarity	2016-10-11 02:15:23 +02:00
Pierre-Yves David	163070ae3d	checkcopies: extract the '_related' closure There is no need for it to be a closure.	2016-10-11 01:29:08 +02:00
Pierre-Yves David	2d670597fe	checkcopies: add an inline comment about the '_related' call This helps understanding the flow of the function.	2016-10-08 23:00:55 +02:00
Pierre-Yves David	71b0c4ef9c	checkcopies: minor change to comment This helped me understand the refactoring so this must be helpful.	2016-10-08 19:03:16 +02:00
Pierre-Yves David	1f966aa892	checkcopies: rename 'ca' to 'base' This variable was named after the common ancestor. It is actually the merge base that might differ from the common ancestor in the graft case. We rename the variable before a larger refactoring to clarify the situation.	2016-10-08 18:38:42 +02:00
Gábor Stefanik	8d5c0019c5	copies: don't record divergence for files needing no merge This is left over from when _checkcopies was factored out from mergecopies. The 2nd break has "of = None" before it, so it's a functionally equivalent change. The 1st one, however, causes a divergence to be recorded when a file has been renamed, but there is nothing to be merged to it. This is currently harmless, since the extra divergence is simply ignored later. However, the new _checkcopies introduced in the rest of this series does more than just record a divergence after completing the main loop, and it's important that the "post-processing" stage is really skipped for no-merge-needed renames.	2016-10-03 13:29:59 +02:00
Gábor Stefanik	5a7c7889a5	copies: mark checkcopies as internal with the _ prefix	2016-10-03 13:24:56 +02:00
Gábor Stefanik	6f3ec7c4c3	copies: split u1/u2 to u1u/u2u and u1r/u2r These will be made different in case of grafts by another patch in this series.	2016-10-03 13:23:19 +02:00
Gábor Stefanik	88c9737beb	copies: style fixes and add comment	2016-10-03 13:18:31 +02:00
Gábor Stefanik	846bad50ac	copies: limit is an optimization, and doesn't provide guarantees	2016-10-03 16:19:55 +02:00
timeless	a1cb3173a2	py3: convert to next() function next(..) was introduced in py2.6 and .next() is not available in py3 https://docs.python.org/2/library/functions.html#next	2016-05-16 21:30:53 +00:00
Durham Goode	6f7f581f5f	copies: optimize forward copy detection logic for rebases Forward copy detection (i.e. detecting what files have been moved/copied in commit X since ancestor Y) previously required diff'ing the manifests of both X and Y. This was expensive since it required reading both entire manifests and doing a set difference (they weren't already in a set because of the lazymanifest work). This cost almost 1 second on very large repositories, and happens N times for a rebase of N commits. This patch optimizes it for the case of rebase. In a rebase, we are comparing a commit against its immediate parent, and therefore we can know what files changed by looking at ctx.files(). This lets us drastically decrease the size of the set comparison, and makes it O(# of changes) instead of O(size of manifest). This makes it take 1ms instead of 1000ms.	2016-02-05 13:23:24 -08:00
Matt Mackall	a8fcfbf03d	copies: fix detection of divergent directory renames If we move all the files out of one directory, but into two different directories, we should not consider it a directory rename. The detection of this case was broken.	2016-01-13 10:10:05 -06:00
Mads Kiilerich	09567db49a	spelling: trivial spell checking	2015-10-17 00:58:46 +02:00
Matt Mackall	3b6391ff9a	copies: group bothnew with other sets	2015-08-19 15:40:13 -05:00
Matt Mackall	a4a77851e3	copies: rename renamedelete to renamedeleteset for clarity	2015-08-19 15:32:27 -05:00
Matt Mackall	505912bcc2	copies: move _makegetfctx calls into checkcopies	2015-08-19 15:26:08 -05:00
Matt Mackall	babc6236a3	copies: factor out setupctx into _makegetfctx This reduces the scope of mergecopies a bit	2015-08-19 15:17:33 -05:00
Matt Mackall	6be7bd49e6	copies: avoid reference to c1/c2 in makectx	2015-08-21 15:12:58 -05:00
Matt Mackall	846cb052bb	copies: move debug statement to appropriate place	2015-08-19 15:11:17 -05:00
Matt Mackall	06dcd73aae	copies: rename diverge2 to divergeset for clarity	2015-08-19 14:04:54 -05:00
Matt Mackall	cf71f5908f	copies: begin separating mergecopies sides	2015-08-19 13:40:18 -05:00
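
The idea behind commit 6f7f581f5f above (forward copy detection for rebases) can be illustrated with a minimal sketch. This is not the code from the commit: manifests are modeled as plain dicts mapping file name to node, and the function names `added_files_full` and `added_files_fast` and the sample data are hypothetical. It only shows why restricting the comparison to the files a commit touched (what the message calls ctx.files()) gives the same answer as a full manifest difference when the commit is compared against its immediate parent.

```python
def added_files_full(parent_manifest, child_manifest):
    """Baseline: set difference over both whole manifests.

    Cost grows with the size of the manifests.
    """
    return sorted(set(child_manifest) - set(parent_manifest))


def added_files_fast(parent_manifest, child_manifest, changed_files):
    """Shortcut, valid only when parent_manifest is the immediate parent.

    Any file present in the child but not in the parent must be listed in
    changed_files, so only those names are checked. Cost grows with the
    number of changes. changed_files stands in for ctx.files() (assumption).
    """
    return sorted(
        f for f in changed_files
        if f in child_manifest and f not in parent_manifest
    )


# Hypothetical sample data: the child modifies b.txt and adds c.txt.
parent = {"a.txt": "n1", "b.txt": "n2"}
child = {"a.txt": "n1", "b.txt": "n3", "c.txt": "n4"}
changed = ["b.txt", "c.txt"]

full = added_files_full(parent, child)
fast = added_files_fast(parent, child, changed)
```

Both calls report only c.txt as newly added; the fast variant never looks at a.txt, which is the point of the optimization on very large manifests.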


138 Commits