sapling

mirror of https://github.com/facebook/sapling.git synced 2024-10-09 00:14:35 +03:00

Author	SHA1	Message	Date
Yuya Nishihara	027352b7a8	streamclone: comment why path auditing is disabled in generatev1() Copied from 8809f5acb29a. I wasn't sure whether it's for optimization or suppressing unwanted error.	2017-07-07 23:19:31 +09:00
Yuya Nishihara	954fcd3b6c	streamclone: close large revlog files explicitly in generatev1()	2017-07-07 23:25:16 +09:00
Pierre-Yves David	a9e2461dd2	streamclone: stop using 'vfs.mustaudit = False' Now that each call disables the auditing on its own, we can safely drop the mustaudit usage. No other code is modified.	2017-07-02 04:26:34 +02:00
Pierre-Yves David	6da87d469e	vfs: simplify path audit disabling in stream clone The whole 'mustaudit' API is quite complex compared to its actual usage by its unique user in stream clone. Instead we add an "auditpath" parameter to 'vfs.__call__'. The stream clone code then explicitly opens files with path auditing disabled. The 'mustaudit' API will be cleaned up in the next changeset.	2017-07-02 02:28:04 +02:00
Pierre-Yves David	54d53f6ed6	configitems: register the 'server.uncompressedallowsecret' config	2017-06-30 03:44:14 +02:00
Gregory Szorc	bc8582fc01	streamclone: consider secret changesets (BC) (issue5589) Previously, a repo containing secret changesets would be served via stream clone, transferring those secret changesets. While secret changesets aren't meant to imply strong security (if you really want to keep them secret, others shouldn't have read access to the repo), we should at least make an effort to protect secret changesets when possible. After this commit, we no longer serve stream clones for repos containing secret changesets by default. This is backwards incompatible behavior. In case anyone is relying on the behavior, we provide a config option to opt into the old behavior. Note that this defense is only beneficial for remote repos accessed via the wire protocol: if a client has access to the files backing a repo, they can get to the raw data and see secret revisions.	2017-06-09 10:41:13 -07:00
Siddharth Agarwal	3bf516869d	clone: warn when streaming was requested but couldn't be performed This helps both users and the people who support them figure out why a stream clone couldn't be performed. In an upcoming patch we're going to add a way for servers to hard abort on a full getbundle. In those cases servers might expect clients to perform a stream clone, so it's important to communicate why one couldn't be done.	2017-05-08 20:01:06 -07:00
Simon Farnsworth	e0b70e4f7f	mercurial: switch to util.timer for all interval timings util.timer is now the best available interval timer, at the expense of not having a known epoch. Let's use it whenever the epoch is irrelevant.	2017-02-15 13:17:39 -08:00
Mads Kiilerich	38cb771268	spelling: fixes of non-dictionary words	2016-10-17 23:16:55 +02:00
FUJIWARA Katsunori	ff0a456116	streamclone: clear caches after writing changes into files for visibility Before this patch, streamclone-ed changes are invisible via @filecache properties to in-process procedures before closing the transaction (e.g. a pretxnclose Python hook), if the corresponding property is cached before consumev1(). Strictly speaking, caching should occur inside the (store) lock for the transaction. repo.invalidate() after closing the transaction is too late to force @filecache properties to be reloaded from changed files at next access. For visibility of streamclone-ed changes to in-process procedures before closing the transaction, this patch clears caches just after writing changes into files. BTW, regardless of the change in this patch, clearing cached properties in consumev1() causes inconsistency if (1) a transaction is started and (2) any @filecache property is changed before consumev1(). This patch also adds a comment about fixing this (potential) inconsistency in the future.	2016-09-12 03:06:29 +09:00
FUJIWARA Katsunori	e23dee13b3	streamclone: force @filecache properties to be reloaded from file Before this patch, consumev1() invokes repo.invalidate() after closing the transaction, to force @filecache properties to be reloaded from files at next access, because streamclone writes data into files directly. But this doesn't work as expected in the case below: 1. At closing the transaction, repo._refreshfilecachestats() refreshes the file stat of each @filecache property with streamclone-ed files. This means that in-memory properties are treated as valid. 2. But streamclone doesn't change in-memory properties. This means that in-memory properties are actually invalid. 3. repo.invalidate() just forces the file stat of @filecache properties to be examined at the first access after it. Such examination should conclude that reloading from file isn't needed, because the file stat was already refreshed at (1). Therefore, invalid in-memory cached properties (2) are unintentionally treated as valid (1). This patch invokes repo.invalidate() with clearfilecache=True, to force @filecache properties to be reloaded from file at next access. BTW, it is accidental that repo.invalidate() without clearfilecache=True in the streamclone case seems to work as expected before this patch. If the transaction is started via a "filtered repo" object, repo._refreshfilecachestats() tries to refresh the file stat of each @filecache property on the "filtered repo" object, even though all of them are stored in the "unfiltered repo" object. In this case, repo._refreshfilecachestats() unintentionally does nothing, but this unexpected behavior causes @filecache properties to be reloaded after repo.invalidate(). This is the reason why this patch should be applied before making _refreshfilecachestats() correctly refresh the file stat of @filecache properties.	2016-09-12 03:06:28 +09:00
Matt Mackall	97d8dbf685	merge with stable	2016-03-15 14:10:46 -07:00
Mads Kiilerich	39acd325e0	streamclone: fix error when store files grow while stream cloning Effectively a backout of d573a437d564, but updated to using 'with'.	2016-03-13 02:29:11 +01:00
Anton Shestakov	245ded8e7d	streamclone: specify unit for ui.progress when handling data	2016-03-11 22:28:27 +08:00
Gregory Szorc	a05892eae0	streamclone: use backgroundfilecloser (issue4889) Closing files that have been appended to is slow on Windows/NTFS. CloseHandle() calls on this platform often take 1-10ms - and that's on my i7-6700K Skylake processor with a modern and fast SSD. Contrast with other I/O operations, such as writing data, which take <100us. This means that creating/appending thousands of files can add significant overhead. For example, cloning mozilla-central creates ~232,000 revlog files. Assuming 1ms per CloseHandle(), that yields 232s (3:52) of wall time waiting for file closes! The impact of this overhead can be measured most directly when applying stream clone bundles. Applying these files is effectively uncompressing a tar archive (read: it's very fast). Using a RAM disk (read: no I/O wait), the difference in wall time for a `hg debugapplystreamclonebundle` for a ~1731 MB mozilla-central bundle between Windows and Linux from the same machine is drastic: Linux: ~12.8s (128MB/s) Windows: ~352.0s (4.7MB/s) Windows is ~27.5x slower. Yikes! After this patch: Linux: ~12.8s (128MB/s) Windows: ~102.1s (16.1MB/s) Windows is now ~3.4x faster. Unfortunately, it is still ~8x slower than Linux. Profiling reveals a few hot code paths that could likely be improved. But those are for other patches. This patch introduces test-clone-uncompressed.t because existing tests of `clone --uncompressed` are scattered about and adding a variation for background thread closing to e.g. test-http.t doesn't feel correct.	2016-01-14 13:44:01 -08:00
Gregory Szorc	ba2d05e908	streamclone: indent code This will make the subsequent patch easier to read.	2016-01-02 16:11:36 -08:00
Gregory Szorc	9128d3d945	streamclone: extract code for reading header fields So it can be called from another consumer in a future patch.	2016-01-14 22:48:54 -08:00
Bryan O'Sullivan	13360de2f3	with: use context manager for transaction in consumev1	2016-01-15 13:14:49 -08:00
Bryan O'Sullivan	cde011507a	with: use context manager in streamclone consumev1	2016-01-15 13:14:49 -08:00
Bryan O'Sullivan	9b486e52cc	with: use context manager in maybeperformlegacystreamclone	2016-01-15 13:14:49 -08:00
Bryan O'Sullivan	721f51151e	with: use context manager in streamclone generatev1	2016-01-15 13:14:50 -08:00
Bryan O'Sullivan	47caeb4184	i18n: don't translate a transaction name	2016-01-15 13:14:49 -08:00
Gregory Szorc	054fcca201	streamclone: use context manager for writing files These are the file writes that have the most to gain from background I/O. Plug in a context manager so I can design the background I/O mechanism with context managers in mind.	2016-01-02 15:09:58 -08:00
Gregory Szorc	34d6b4f44c	streamclone: use read() We have a convenience API for reading the full contents of a file. Use it.	2016-01-02 15:14:55 -08:00
Gregory Szorc	9f922bdde8	streamclone: support for producing and consuming stream clone bundles Up to this point, stream clones only existed as a dynamically generated data format produced and consumed during streaming clones. In order to support this efficient cloning format with the clone bundles feature, we need a more formal, on disk representation of the streaming clone data. This patch introduces a new "bundle" type for streaming clones. Unlike existing bundles, it does not contain changegroup data. It does, however, share the same concepts like the 4 byte header which identifies the type of data that follows and the 2 byte abbreviation for compression types (of which only "UN" is currently supported). The new bundle format is essentially the existing stream clone version 1 data format with some headers at the beginning. Content negotiation at stream clone request time checked for repository format/requirements compatibility before initiating a stream clone. We can't do active content negotiation when using clone bundles. So, we put this set of requirements inside the payload so consumers have a built-in mechanism for checking compatibility before reading and applying lots of data. Of course, we will also advertise this requirements set in clone bundles. But that's for another patch. We currently don't have a mechanism to produce and consume this new bundle format. This will be implemented in upcoming patches. It's worth noting that if a legacy client attempts to `hg unbundle` a stream clone bundle (with the "HGS1" header), it will abort with: "unknown bundle version S1," which seems appropriate.	2015-10-17 11:14:52 -07:00
Pierre-Yves David	30913031d4	error: get Abort from 'error' instead of 'util' The home of 'Abort' is 'error' not 'util' however, a lot of code seems to be confused about that and gives all the credit to 'util' instead of the hardworking 'error'. In a spirit of equity, we break the cycle of injustice and give back to 'error' the respect it deserves. And screw that 'util' poser. For great justice.	2015-10-08 12:55:45 -07:00
Gregory Szorc	c70ae254f7	streamclone: move "streaming all changes" message location Previously, the message was printed after we requested and started processing the remote stream. This seems like something that we should do before calling out to the remote. Moving it also makes it easier to deal with the bundle2 implementation.	2015-10-04 12:07:01 -07:00
Gregory Szorc	9250e99393	streamclone: move payload header generation into own function The stream clone data over the wire protocol contains a header line indicating total file count and data size. In bundle2, this metadata can be captured by a part parameter and doesn't need to be in the body. In preparation for bundle2, have generatev1() return the raw metadata and move the header generation to its own function.	2015-10-04 19:06:06 -07:00
Gregory Szorc	f101d4721b	streamclone: move payload header line consumption bundle2 parts have parameters. These are a logical place for "header" data such as the file count and payload size of stream clone data. In preparation for supporting stream clones with bundle2, move the consumption of the header line from the payload into maybeperformlegacystreamclone(). Note: the header line is still being emitted by generatev1(). This will be addressed in a subsequent patch.	2015-10-04 18:44:46 -07:00
Gregory Szorc	6868c29dd6	streamclone: teach canperformstreamclone to be bundle2 aware We add an argument to canperformstreamclone() to return False if a bundle2 stream clone is available. This will enable the legacy stream clone step to no-op when a bundle2 stream clone is supported. The commented code will be made active when bundle2 supports streaming clone. This patch does foreshadow the introduction of the "stream" bundle2 capability and its "v1" sub-capability. The bundle2 capability mirrors the existing "stream" capability and is needed so clients know whether a server explicitly supports streaming clones over bundle2 (servers up to this point support bundle2 without streaming clone support). The sub-capability will denote which data formats and variations are supported. Currently, the value "v1" denotes the existing streaming clone data format, which I intend to reuse inside a bundle2 part. My intent is to eventually introduce alternate data formats that can be produced and consumed more efficiently. Having a sub-capability means we don't need to introduce a new top-level bundle2 capability when new formats are introduced. This doesn't really have any implications beyond making the capabilities namespace more organized.	2015-10-04 18:35:19 -07:00
Gregory Szorc	77cf036fc0	streamclone: refactor canperformstreamclone to accept a pullop This isn't strictly necessary. But a lot of pull functionality accepts a pulloperation so extra state can be added easily. It also enables extensions to perform more powerful things.	2015-10-04 11:50:42 -07:00
Gregory Szorc	f283944ccb	streamclone: rename and document maybeperformstreamclone() Upcoming patches will introduce bundle2 based streaming clones. Add "legacy" to the function name and add a docstring clarifying the intent of the function.	2015-10-04 11:34:28 -07:00
Gregory Szorc	fc2b0ba2f0	streamclone: move applyremotedata() into maybeperformstreamclone() Future work around stream cloning will be implemented in a bundle2 world. This code will only be used in the legacy code path and doesn't need to be abstracted or extensible.	2015-10-04 11:27:10 -07:00
Gregory Szorc	eeba469be5	branchmap: move branch cache code out of streamclone.py This is low-level branch map and cache manipulation code. It deserves to live next to similar code in branchmap.py. Moving it also paves the road for multiple consumers, such as a bundle2 part handler that receives branch mappings from a remote. This is largely a mechanical move, with only variable names and indentation being changed.	2015-10-03 09:53:56 -07:00
Gregory Szorc	b191f901f5	streamclone: move streamin() into maybeperformstreamclone() streamin() only had a single consumer. And it always only ever will because it is strongly coupled with the current, soon-to-be-superseded-by-bundle2 functionality. The return value has been dropped because nobody was using it.	2015-10-02 23:08:15 -07:00
Gregory Szorc	8ac7d32ad1	streamclone: refactor maybeperformstreamclone to take a pullop Just like all the other pull steps. Consistency is good. This seems a little excessive right now since maybeperformstreamclone is such a short function. This will be addressed in a subsequent patch.	2015-10-04 11:20:52 -07:00
Gregory Szorc	37cd17cd86	streamclone: add explicit check for empty local repo Stream clone doesn't work with non-empty local repositories. In upcoming patches, we'll move stream cloning to the regular pull code path. Add an explicit check on the repository being empty to prevent streaming clones to non-empty repos.	2015-10-02 21:53:25 -07:00
Gregory Szorc	21ac8b474d	streamclone: refactor code for deciding to stream clone Having this in a standalone function will eventually enable bundle2 to share code with the bundle1 code path. While I was here, I also added some comments to add clarity.	2015-10-02 22:22:11 -07:00
Gregory Szorc	db39558e0c	streamclone: move streaming clone logic from localrepo This is the last remnants of streaming clone code in localrepo.py. This is a mostly mechanical transplant of code to a new file. Only a rewrite of "self" to "repo" was performed. The code will be significantly refactored in upcoming patches. So don't scrutinize it too closely.	2015-10-02 21:39:04 -07:00
Gregory Szorc	40483f6f59	streamclone: move _allowstream() from wireproto While we're moving things into streamclone.py...	2015-10-02 16:24:56 -07:00
Gregory Szorc	d8e74180f0	streamclone: move code out of exchange.py We bulk move functions from exchange.py related to streaming clones. Function names were renamed slightly to drop a component redundant with the module name. Docstrings and comments referencing old names and locations were updated accordingly.	2015-10-02 16:05:52 -07:00
Gregory Szorc	e5b1fcee2d	streamclone: move stream_in() from localrepo Another basic content move. The underscore from the function name was removed to comply with naming standards.	2015-10-02 15:58:24 -07:00
Gregory Szorc	afd8c0b560	streamclone: move applystreamclone() from localrepo.py Upcoming patches will modernize the streaming clone code. Streaming clone data and code kind of lives in its own world. exchange.py is arguably the most appropriate existing location for it. However, over a dozen patches from now it became apparent that there was a lot of code related to streaming clones and that having it contained within its own module would make it easier to comprehend. So, we establish streamclone.py. It's worth noting that streamclone.py existed a long time ago, last seen in the 1.6 release. It was removed in 2cd3dd86758c. The function was renamed as part of the move because its old name was redundant with the new module name. The only other content change was "self" was renamed to "repo" and minor grammar in the docstring was updated.	2015-10-02 15:51:32 -07:00
Dirkjan Ochtman	2c0f8ea6a7	protocol: move the streamclone implementation into wireproto	2010-07-20 20:52:23 +02:00
Dirkjan Ochtman	ce4ed80c7a	protocol: convert StreamException to generated error code This makes it much easier to handle these errors at the transport level.	2010-07-16 22:20:19 +02:00
Nicolas Dumazet	7f1a963829	pylint, pyflakes: remove unused or duplicate imports	2010-04-14 17:58:10 +09:00
Matt Mackall	b5b825953f	streaming: actually change default	2010-02-09 14:12:34 -06:00
Matt Mackall	b7afbe529a	streamclone: allow uncompressed clones by default	2010-02-07 15:31:53 +01:00
Matt Mackall	595d66f424	Update license to GPLv2+	2010-01-19 22:20:08 -06:00
Matt Mackall	3e6199cea0	Merge with -stable	2009-09-30 21:42:51 -05:00

82 Commits