sapling

mirror of https://github.com/facebook/sapling.git synced 2024-10-10 08:47:12 +03:00

Author	SHA1	Message	Date
Martin von Zweigbergk	ed4a2d83be	changegroup: inline 'publishing' variable in apply()	2017-06-19 00:06:23 -07:00
Martin von Zweigbergk	02b210363d	changegroup: rename "dh" to the clearer "deltaheads" We have a lot of frequently used abbreviations, but this is not one of them.	2017-06-15 13:47:54 -07:00
Martin von Zweigbergk	f2408d8030	changegroup: rename "srccontent" to "cgnodes" It's the list of nodes in the incoming changegroup, so "cgnodes" made more sense to me.	2017-06-15 13:42:41 -07:00
Durham Goode	a73dbb6c8d	changegroup: add bundlecaps back Commit 9233182ea547d0aa removed the unused bundlecaps argument from the changegroup code. While it is unused in core Mercurial, it was an important feature for the remotefilelog extension because it allowed the exchange layer to communicate to the changegroup packer that this was a shallow repo and that filelogs should not be included. Without bundlecaps, there is currently no other way to pass that information along without a more extensive refactor of exchange, bundle, and changegroup code. This patch backs out the original removal, and merges it with some recent changes to changegroup apis.	2017-05-15 09:35:27 -07:00
Pierre-Yves David	dc1e05ed68	caches: stop warming the cache after changegroup application Now that we guarantee that the branchmap cache is updated at the end of the transaction, we can drop this update. This removes a problematic case with nested transactions where the new cache could be written to disk before the transaction is finished (and even rolled back). Such a premature cache write was visible in the following tests: * tests/test-acl.t * tests/test-rebase-conflicts.t In addition, running the cache update later means having more data about the state of the repository (in particular: phases). So we can generate caches with more information. This creates harmless changes to the following tests: * tests/test-hardlinks-whitelisted.t * tests/test-hardlinks.t * tests/test-phases.t * tests/test-tags.t * tests/test-inherit-mode.t	2017-05-02 18:57:52 +02:00
Pierre-Yves David	9c635f53f5	caches: move the 'updating the branch cache' message in 'updatecaches' We are about to remove the branchmap cache update in changegroup application. There is a debug message alongside this update that we do not want to lose. We move the message beforehand to simplify the test update in the next changeset. The message move is quite noisy and isolating that noise is useful. Most test updates are just line reordering since the message is issued at a later point during the transaction. After this change, the message is displayed in more cases since local commit creation also issues it.	2017-05-02 22:27:44 +02:00
Pierre-Yves David	5149c68aae	changegroup: deprecate 'getlocalchangegroup' (API) We have 'getchangegroup' with a shorter name for exactly the same feature. Now that all users are gone we can formally deprecate it.	2017-05-04 12:43:41 +02:00
Pierre-Yves David	cc5d603ef2	changegroup: deduplicate 'getlocalchangegroup' The two functions 'getlocalchangegroup' and 'getchangegroup' have been strictly identical for multiple years ('getchangegroup' had a deprecated docstring). We'll drop one of them (getlocalchangegroup, since it has the longest name). However, we need to migrate all users of the dropped one to the new one before we can deprecate it. In the meantime we drop one of the duplicated definitions and the outdated docstring.	2017-05-04 12:36:45 +02:00
Martin von Zweigbergk	feebdd58e5	changegroup: delete unused 'bundlecaps' argument (API)	2017-05-02 23:47:10 -07:00
Martin von Zweigbergk	2e8637ffda	merge with stable	2017-03-24 08:37:26 -07:00
Gregory Szorc	6fb64a9cca	changegroup: store old heads as a set Previously, the "oldheads" variable was a list. On a repository at Mozilla with 46,492 heads, profiling revealed that list membership testing was dominating execution time of applying small changegroups. This patch converts the list of old heads to a set. This makes membership testing significantly faster. On the aforementioned repository with 46,492 heads: $ hg unbundle <file with 1 changeset> before: 18.535s wall after: 1.303s Consumers of this variable only check for truthiness (`if oldheads`), length (`len(oldheads)`), and (most importantly) item membership (`h not in oldheads` - which occurs twice). So, the change to a set should be safe and suitable for stable. The practical effect of this change is that changegroup application and related operations (like `hg push`) no longer exhibit an O(n^2) CPU explosion as the number of heads grows.	2017-03-23 19:54:59 -07:00
Remi Chaintron	6d11b9177b	revlog: add 'raw' argument to revision and _addrevision This patch introduces a new 'raw' argument (defaults to False) to revlog's revision() and _addrevision() methods. When the 'raw' argument is set to True, it indicates the revision data should be handled as raw data by the flagprocessor. Note: Given revlog.addgroup() calls are restricted to changegroup generation, we can always set raw to True when calling revlog._addrevision() from revlog.addgroup().	2017-01-05 17:16:07 +00:00
Pierre-Yves David	b77eb86598	changegroup: simplify logic around enabling changegroup 03 There were multiple spots that took care of adding '03' as a supported changegroup version under different conditions. We gather them all in one location for simplicity. The 'supportedincomingversions' function is now doing nothing, but I kept it around because it looks like a great hooking point for extensions. (Note that we should probably just get changegroup3 out of experimental now, but that would be a patch with a much wider scope).	2016-12-19 04:25:18 +01:00
Pierre-Yves David	281f6d3210	changegroup: pass 'repo' to allsupportedversions In the next changesets, we will introduce more logic directly related to the repository to decide which versions have to be supported. So we now directly pass the repo object instead of just ui.	2016-12-19 04:29:33 +01:00
Pierre-Yves David	c066ecb20b	changegroup: simplify 'allsupportedversions' logic Discarding '03' to add it back is a bit strange. Instead we only discard it when needed.	2016-12-19 04:31:13 +01:00
Stanislau Hlebik	da605718f2	cg1packer: fix `compressed` method `cg1packer.compressed()` returns True even if `self._type` is 'UN'. This patch fixes it.	2016-12-14 09:53:56 -08:00
Gregory Szorc	ce177e2b4d	changegroup: use compression engines API The new API doesn't have the equivalence for None and 'UN' so we introduce code to use 'UN' explicitly.	2016-11-07 18:38:13 -08:00
Durham Goode	1a774a75e3	changegroup: remove remaining uses of repo.manifest The remaining uses of repo.manifest in the changegroup module are treating the manifest exclusively as a revlog, so let's replace them with instances of the revlog directly. This is part of dropping all dependencies on repo.manifest in favor of repo.manifestlog.	2016-11-08 08:03:43 -08:00
Durham Goode	f952eca1af	manifest: get rid of manifest.readshallowfast This removes manifest.readshallowfast and converts its one user to use manifestlog instead.	2016-11-02 17:10:47 -07:00
Gregory Szorc	f6f301e95d	changegroup: use changelogrevision() Using offsets for accessing changelog entries isn't very readable. As a bonus, changelog.changelogrevision() also accepts a revision, so we don't need to perform the inline node resolution either.	2016-11-01 18:29:09 -07:00
Gregory Szorc	881f361857	changegroup: cache changelog and manifestlog outside of loop History has taught us that repo.changelog can add significant overhead to loops. So cache the changelog instance outside of the loop to avoid the lookup. While we're here, do the same for manifestlog, since each loop would otherwise initialize a new manifestlog instance.	2016-11-01 18:28:03 -07:00
Pulkit Goyal	56031921a5	py3: convert the mode argument of os.fdopen to unicodes (2 of 2)	2017-02-13 22:15:28 +05:30
Gregory Szorc	722900ff91	changegroup: increase write buffer size to 128k By default, Python defers to the operating system for choosing the default buffer size on opened files. On my Linux machine, the default is 4k, which is really small for 2016. This patch bumps the write buffer size when writing changegroups/bundles to 128k. This matches the 128k read buffer we already use on revlogs. It's worth noting that this only impacts when writing to an explicit file (such as during `hg bundle`). Buffers when writing to bundle files via the repo vfs or to a temporary file are not impacted. When producing a none-v2 bundle file of the mozilla-unified repository, this change caused the number of write() system calls to drop from 952,449 to 29,788. After this change, the most frequent system calls are fstat(), read(), lseek(), and open(). There were 2,523,672 system calls after this patch (so a net decrease of ~950k is statistically significant). This change shows no performance change on my system. But I have a high-end system with a fast SSD. It is quite possible this change will have a significant impact on network file systems, where extra network round trips due to excessive I/O system calls could introduce significant latency.	2016-10-16 13:35:23 -07:00
Pierre-Yves David	667d10975b	changegroup: skip delta when the underlying revlog does not use them Revlogs can now be configured to store full snapshots only. This is used on the changelog. However, the changegroup packing was still recomputing deltas to be sent over the wire. We now just reuse the full snapshot directly in this case, skipping delta computation. This provides us with a large speed up (-30%): # perfchangegroupchangelog on mercurial ! wall 2.010326 comb 2.020000 user 2.000000 sys 0.020000 (best of 5) ! wall 1.382039 comb 1.380000 user 1.370000 sys 0.010000 (best of 8) # perfchangegroupchangelog on pypy ! wall 5.792589 comb 5.780000 user 5.780000 sys 0.000000 (best of 3) ! wall 3.911158 comb 3.920000 user 3.900000 sys 0.020000 (best of 3) # perfchangegroupchangelog on mozilla central ! wall 20.683727 comb 20.680000 user 20.630000 sys 0.050000 (best of 3) ! wall 14.190204 comb 14.190000 user 14.150000 sys 0.040000 (best of 3) Many tests have to be updated because of the change in bundle content. All these updates have been verified. Because diffing the changelog was not very valuable, the resulting bundles have a similar size (often a bit smaller): # full bundle of mozilla central with delta: 1142740533B without delta: 1142173300B So this is a win across the board.	2016-10-14 01:31:11 +02:00
Gregory Szorc	eb9f859c39	changegroup: document deltaparent's choice of previous revision As part of debugging low-level changegroup generation, I came across what I initially thought was a weird behavior: changegroup v2 is choosing the previous revision in the changegroup as a delta base instead of p1. I was tempted to rewrite this to use p1, as p1 will delta better than prev in the common case. However, I realized that taking p1 as the base would potentially require resolving a revision fulltext and thus require more CPU for e.g. server-side processing of getbundle requests. This patch tweaks the code comment to note the choice of behavior. It also notes there is room for a flag or config option to tweak this behavior later: using p1 as the delta base would likely make changegroups smaller at the expense of more CPU, which could be beneficial for things like clone bundles.	2016-10-13 12:49:47 +02:00
Durham Goode	23d229132c	manifest: add manifestctx.readdelta() This adds an implementation of readdelta to the new manifestctx class and adds a couple consumers of it. This currently appears to have some duplicate code, but future patches cause this function to diverge when things like "shallow" are introduced.	2016-09-13 16:25:21 -07:00
Pierre-Yves David	7775b9bfe7	computeoutgoing: move the function from 'changegroup' to 'exchange' Now that all users are in exchange, we can safely move the code in the 'exchange' module. This function is really about processing the argument of a 'getbundle' call, so it even makes sense to do so.	2016-08-09 17:06:35 +02:00
Pierre-Yves David	1b40b7e1c5	getchangegroup: take an 'outgoing' object as argument (API) There are various versions of this function that differ mostly by the way they define the bundled set. The flexibility is now available in the outgoing object itself, so we move the complexity into the callers themselves. This will allow us to remove a good share of the similar functions to obtain a changegroup in the 'changegroup.py' module. An important side effect is that we stop calling 'computeoutgoing' in 'getchangegroup'. This is fine as code that needs such argument processing is actually going through the 'exchange' module, which already calls this function itself.	2016-08-09 17:00:38 +02:00
Pierre-Yves David	cca37f1814	outgoing: add a 'missingroots' argument This argument can be used instead of 'commonheads' to determine the 'outgoing' set. We remove the outgoingbetween function as its role can now be handled by 'outgoing' itself. I've thought of using an external function instead of making the constructor more complicated. However, there is low hanging fruit to improve the current code flow by storing some side products of the processing of 'missingroots'. So in my opinion it makes sense to add all this to the class.	2016-08-09 22:31:38 +02:00
Pierre-Yves David	f460bb8823	outgoing: pass a repo object to the constructor We are about to introduce more code constructing such objects in the code base. It will be more convenient to pass a repository object; all current users already operate at the repository level anyway. More changes to the constructor arguments are coming in later changesets.	2016-08-09 15:26:53 +02:00
Gregory Szorc	aa618850dc	changegroup: move branch cache debug message to proper location Before, we logged about performing a branch cache update when we weren't actually doing it. Fix that.	2016-08-08 22:06:07 -07:00
Augie Fackler	715ea0d1cc	changegroup: use `iter(callable, sentinel)` instead of while True This is functionally equivalent, but is a little more concise.	2016-08-05 13:59:58 -04:00
Gregory Szorc	9f5f743c8a	discovery: move code to create outgoing from roots and heads changegroup.changegroupsubset() contained somewhat low-level code for constructing an "outgoing" instance from a list of roots and heads nodes. It feels like discovery.py is a more appropriate location for this code. This code can definitely be optimized, as outgoing.missing will recompute the set of changesets we've already discovered from cl.between(). But code shouldn't be refactored during a move, so I've simply inserted a TODO calling attention to that.	2016-08-03 22:07:52 -07:00
Gregory Szorc	4ad5f2e492	bundle2: store changeset count when creating file bundles The bundle2 changegroup part has an advisory param saying how many changesets are in the part. Before this patch, we were setting this part when generating bundle2 parts via the wire protocol but not when generating local bundle2 files. A side effect of not setting the changeset count part is that progress bars don't work when applying changesets. As the tests show, this impacted clone bundles, shelve, backup bundles, `hg unbundle`, and anything touching bundle2 files. This patch adds a backdoor to allow us to pass state from changegroup generation into the unbundler. We store the number of changesets in the changegroup in this state and use it to populate the aforementioned advisory part parameter when generating the bundle2 bundle. I concede that I'm not thrilled by how state is being passed in changegroup.py (it feels a bit hacky). I would love to overhaul the rather confusing set of functions in changegroup.py with something that passes rich objects around instead of e.g. low-level generators. However, given the code freeze for 3.9 is imminent, I'd rather not undertake this endeavor right now. This feels like the easiest way to get the parameter added to the changegroup part.	2016-07-17 15:13:51 -07:00
Martin von Zweigbergk	6612ed3d4a	changegroup: don't send empty subdirectory manifest groups When grafting/rebasing, it is common for multiple changesets to make the same change to a subdirectory. When writing the revlog for the directory, the revlog code already takes care of not writing the entry again. In 3eb9fa4180d3 (changegroup: prune subdirectory dirlogs too, 2016-02-12), I added the corresponding code in changegroup (not sending entries the client already has), but I forgot to avoid sending the entire changegroup if no nodes remained in the pruned set. Although that's harmless besides the wasted network traffic, the receiving side was checking for it (copied from the changegroup code for handling files). This resulted in the client crashing with: abort: received dir revlog group is empty Fix by simply not emitting a changegroup for the directory if there were no changes in it. This matches how files are handled.	2016-06-16 15:15:33 -07:00
Augie Fackler	8945f7f25d	changegroup: extract method that sorts nodes to send The current implementation of narrowhg needs to influence the order in which nodes are sent to the client. adgar@ and I think this is fixable, but it's going to require pretty substantial time investment, so in the interim we'd like to extract this method. I think it makes the group() code a little more obvious, as it took us a couple of tries to isolate the exact behavior we were observing.	2016-05-12 22:29:05 -04:00
Martin von Zweigbergk	4cc86f7b27	bundle: move writebundle() from changegroup.py to bundle2.py (API) writebundle() writes a bundle2 bundle or a plain changegroup1. Imagine away the "2" in "bundle2.py" for a moment and this change should make sense. The bundle wraps the changegroup, so it makes sense that it knows about it. Another sign that this is correct is that the delayed import of bundle2 in changegroup goes away. I'll leave it for another time to remove the "2" in "bundle2.py" (alternatively, extract a new bundle.py from it).	2016-03-28 14:41:29 -07:00
Martin von Zweigbergk	daa3928fb2	changegroup: clear progress callback after changelog processing The progress callback is replaced by one for manifests after changelog processing is done, but let's not depend on manifests replacing the value and instead explicitly clear it.	2016-02-29 09:26:43 -08:00
Martin von Zweigbergk	e8ad80f690	changegroup: progress for added files is not measured in "chunks" The "prog" class cg1unpacker.apply() has the unit set to "chunks". This is not correct for files, where the file itself is the unit. The unit is not usually printed, which is probably why this has not been fixed yet. It can be shown with e.g. "--config progress.format='topic number unit'".	2016-02-28 22:51:07 -08:00
Martin von Zweigbergk	8d4ca9dc03	changegroup: exclude submanifests from manifest progress The progress callback for manifests is cleared outside of _unpackmanifests(), which means it will remain in effect while pulling subdirectory manifests when using treemanifests. Since the total number of revisions used for the progress is the number of changesets, the total number of treemanifest revisions is usually larger than that. One effect of this is that the ETA is negative. It's hard to estimate the number of subdirectory revisions, so let's just exclude them from progress for now.	2016-02-28 21:15:06 -08:00
Gregory Szorc	3774b02abb	changegroup: use changelog.readfiles We have a dedicated function to get just the list of files in a changelog entry. Use it. This will presumably speed up changegroup application since we're no longer decoding the entire changelog entry. But I didn't measure the impact.	2016-02-27 23:06:05 -08:00
Martin von Zweigbergk	225ad0fad5	changegroup: drop special-casing of flat manifests Since 37e42a0009a4 (changegroup: avoid iterating the whole manifest, 2015-12-04), the manifest linkrev callback iterates over only the files that were touched according to the changeset. Before that change, we iterated over all files returned in manifest.readfast(). That method returns the files in the delta, if the delta parent is a parent, otherwise it returns the full manifest. Most manifest revisions end up using one of the parents as its delta parent, so most of the time, the method returns a short manifest. It seems that that happens often enough that it doesn't really matter; I could not reproduce the timings reported in that change. Since the treemanifest code now works quite differently, and since that code also works correctly for flat manifests, let's drop the special-casing of flat manifests.	2016-02-22 14:43:14 -08:00
Martin von Zweigbergk	58c3ff9aaf	changegroup: fix treemanifests on merges The current code for generating treemanifest revisions takes the list of files in the changeset and finds the directories from them. This does not work for merges, since a merge may pick file A from one side and file B from another and neither of them would appear in the changeset's "files" list, but the manifest would still change. Fix this by instead walking the root manifest log for all needed revisions, storing all needed file and subdirectory revisions, then recursively visiting the subdirectories. This also turns out to be faster: cloning a version of hg core converted to treemanifests went from ~28s to ~19s (timing somewhat unfair: before this patch, timed until crash; after this patch, timed until manifests complete). The new algorithm is used only on treemanifest repos. Although it works equally well on flat manifests, we leave the iteration over files in the changeset for flat manifests for now.	2016-02-12 23:09:09 -08:00
Martin von Zweigbergk	c1d77f8a77	changegroup: write root manifests and subdir manifests in a single loop This is another step towards making the manifest generation recurse along the directory trees. The loop over 'tmfnodes' now takes the form of a queue. At this point, we only add to the queue twice: we add the root manifests, and, while visiting the root manifest revisions, we add all subdirectory revisions (for treemanifest repos). Thus, any iterations over 'tmfnodes' after the first will not add any items and the "queue" will just keep shrinking.	2016-02-12 23:30:18 -08:00
Martin von Zweigbergk	58d674fbd1	changegroup: introduce makelookupmflinknode(dir) This is another step towards making the manifest generation recurse along the directory trees. It makes the two calls to _packmanifests() more similar.	2016-02-12 23:26:15 -08:00
Martin von Zweigbergk	9381017496	changegroup: prune subdirectory dirlogs too We already prune changesets, root manifests and files whose linkrev is in the set of common revisions. We should do the same for dirlogs.	2016-02-12 21:21:28 -08:00
Martin von Zweigbergk	3761b3f9e7	changegroup: include subdirectory manifests in verbose size When verbose logging is on, we report the size in bytes of the manifest data in the changegroup. For files, we report the size per file, but I'm not sure we need that level of detail (i.e. size per directory manifest). Instead, report a single figure for the size of root manifest plus submanifests.	2016-02-12 15:42:16 -08:00
Martin von Zweigbergk	cd0a0297ee	changegroup: make _packmanifests() dumber The next few patches will rewrite the manifest generation code to work with merges. We will then walk dirlogs recursively. This prepares for that by moving much of the treemanifest code out of _packmanifests() and into generatemanifests(). For this to work, it also adds _manifestsdone() method that returns the "end of manifests" close chunk for cg3 and an empty string for cg1 and cg2.	2016-02-12 15:18:56 -08:00
Martin von Zweigbergk	fb3a96fcf4	changegroup: extract generatemanifests() The changegroup.generate() function is pretty long, so let's extract the manifest generation part of it.	2016-02-11 20:19:48 -08:00
Martin von Zweigbergk	86ca76bafe	changegroup: fix pulling to treemanifest repo from flat repo (issue5066) In b89de5ee5b31 (changegroup: don't support versions 01 and 02 with treemanifests, 2016-01-19), I stopped supporting use of cg1 and cg2 with treemanifest repos. What I had not considered was that it's perfectly safe to pull to a treemanifest repo using any changegroup version. As reported in issue5066, I therefore broke pull from old repos into a treemanifest repo. It was not covered by the test case, because that pulled from a local repo while enabling treemanifests, which enabled treemanifests on the source repo as well. After switching to pulling via HTTP, it breaks. Fix by splitting up changegroup.supportedversions() into supportedincomingversions() and supportedoutgoingversions().	2016-01-27 09:07:28 -08:00

245 Commits
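
A minimal Python sketch of two idioms described in the log above: storing old heads as a set for fast membership tests (6fb64a9cca) and looping with `iter(callable, sentinel)` instead of `while True` (715ea0d1cc). The function names `new_heads` and `read_chunks` are hypothetical and are not taken from Mercurial's changegroup module; they only illustrate the techniques under that assumption.

```python
import io


def new_heads(current_heads, old_heads):
    """Return the heads that were not present before (hypothetical helper)."""
    # Convert once to a set: membership tests are O(1) on average,
    # whereas "h not in some_list" scans the list on every test.
    old = set(old_heads)
    return [h for h in current_heads if h not in old]


def read_chunks(stream, size):
    """Iterate over fixed-size chunks of a binary stream (hypothetical helper)."""
    # iter(callable, sentinel) keeps calling the callable and stops as soon
    # as it returns the sentinel, here the empty bytes object at end of stream.
    return iter(lambda: stream.read(size), b"")


if __name__ == "__main__":
    print(new_heads([b"a", b"b", b"c"], [b"b"]))
    print(list(read_chunks(io.BytesIO(b"abcdefg"), 3)))
```

The set conversion keeps truthiness, `len()`, and membership semantics unchanged, which are the only three uses the commit message lists for the old-heads variable.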