sapling

mirror of https://github.com/facebook/sapling.git synced 2024-10-11 09:17:30 +03:00

Author	SHA1	Message	Date
Gregory Szorc	8509056f34	sparse: add a requirement when a repository uses sparse (BC) The presence of a sparse checkout can confuse legacy clients or clients without sparse enabled for reasons that should be obvious. This commit introduces a new repository requirement that tracks whether sparse is enabled. The requirement is added when a sparse config is activated and removed when the sparse config is reset. The localrepository constructor has been taught to not open repos with this requirement unless the sparse feature is enabled. It yields a more actionable error message than what you would get if the lockout were handled strictly at the requirements verification phase. Old clients that aren't sparse aware will see the generic "repository requires features unknown to this Mercurial" error, however. The new requirement has "exp" in its name to reflect the experimental nature of sparse. There's a chance that the eventual non-experimental feature won't change significantly and we could have squatted on the "sparse" requirement without ill effect. If that happens, we can teach new clients to still recognize the old name. But I suspect we'll sneak in some BC and we'll want a new requirement to convey new meaning. Differential Revision: https://phab.mercurial-scm.org/D110	2017-07-17 11:45:38 -07:00
Gregory Szorc	efbb740737	revlog: skeleton support for version 2 revlogs There are a number of improvements we want to make to revlogs that will require a new version - version 2. It is unclear what the full set of improvements will be or when we'll be done with them. What I do know is that the process will likely take longer than a single release, will require input from various stakeholders to evaluate changes, and will have many contentious debates and bikeshedding. It is unrealistic to develop revlog version 2 up front: there are just too many uncertainties that we won't know until things are implemented and experiments are run. Some changes will also be invasive and prone to bit rot, so sitting on dozens of patches is not practical. This commit introduces skeleton support for version 2 revlogs in a way that is flexible and not bound by backwards compatibility concerns. An experimental repo requirement for denoting revlog v2 has been added. The requirement string has a sub-version component to it. This will allow us to declare multiple requirements in the course of developing revlog v2. Whenever we change the in-development revlog v2 format, we can tweak the string, creating a new requirement and locking out old clients. This will allow us to make as many backwards incompatible changes and experiments to revlog v2 as we want. In other words, we can land code and make meaningful progress towards revlog v2 while still maintaining extreme format flexibility up until the point we freeze the format and remove the experimental labels. To enable the new repo requirement, you must supply an experimental and undocumented config option. But not just any boolean flag will do: you need to explicitly use a value that no sane person should ever type. This is an additional guard against enabling revlog v2 on an installation it shouldn't be enabled on. The specific scenario I'm trying to prevent is say a user with a 4.4 client with a frozen format enabling the option but then downgrading to 4.3 and accidentally creating repos with an outdated and unsupported repo format. Requiring a "challenge" string should prevent this. Because the format is not yet finalized and I don't want to take any chances, revlog v2's version is currently 0xDEAD. I figure squatting on a value we're likely never to use as an actual revlog version to mean "internal testing only" is acceptable. And "dead" is easily recognized as something meaningful. There is a bunch of cleanup that is needed before work on revlog v2 begins in earnest. I plan on doing that work once this patch is accepted and we're comfortable with the idea of starting down this path.	2017-05-19 20:29:11 -07:00
Gregory Szorc	7d51a8278d	revlog: remove some revlogNG terminology RevlogNG is not such a good name when it is no longer the newest revlog version. Since we'll soon have revlog version 2, let's remove some references to it.	2017-05-19 20:14:31 -07:00
Augie Fackler	ec67fe37fb	merge with stable	2017-05-04 00:26:55 -04:00
Matt Harbison	44c7a911a4	help: spelling fixes	2017-05-03 22:07:47 -04:00
Siddharth Agarwal	d28bc68b23	internals: document that "branches" is a legacy wire command Modern clients use a different discovery mechanism.	2017-05-03 14:07:14 -07:00
Yuya Nishihara	cb8fe16f12	help: fix layout of pre-formatted text	2017-03-09 12:55:48 +09:00
Augie Fackler	b3e554d062	internals: add some brief documentation about censor	2017-01-23 20:17:24 -05:00
Kim Alvefur	977cd43980	help: align description of 'base rev' with reality [issue5488] The text about revlogs seems to be wrong about -1 being used to indicate the start of a delta chain. Attempt to correct this.	2017-02-28 15:19:08 +01:00
Kyle Lippincott	2dc3df2798	help: fix internals.changegroups Add information about tree manifests, copy edit the text and fix up a few ambiguities. The document also contains a few additional fixes from Siddharth Agarwal <sid0@fb.com>, who used it to build a parser for changegroups in Rust.	2017-03-01 18:37:34 -08:00
Dan Villiom Podlaski Christiansen	66c3976319	share: add --relative flag to store a relative path to the source Storing a relative path the source repository is useful when exporting repositories over the network or when they're located on external drives where the mountpoint isn't always fixed. Currently, Mercurial interprets paths in `.hg/shared` relative to $PWD. I suspect this is very much unintentional, and you have to manually edit `.hg/shared` in order to trigger this behaviour. However, on the off chance that someone might rely on it, I added a new capability called 'relshared'. In addition, this makes earlier versions of Mercurial fail with a graceful error. I should note that I haven't tested this patch on Windows.	2017-02-13 14:05:24 +01:00
Martin von Zweigbergk	ad5f4ef8a6	revlog: give EXTSTORED flag value to narrowhg Narrowhg has been using "1 << 14" as its revlog flag value for a long time. We (Google) have many repos with that value in production already. When the same value was reserved for EXTSTORED, it made those repos invalid. Upgrading them will be a little painful. We should clearly have reserved the value for narrowhg a long time ago. Since the EXTSTORED flag is not yet in any release and Facebook also says they have not started using it in production, so it should be okay to change it. This patch gives the current value (1 << 14) back to narrowhg and gives a new value (1 << 13) to EXTSTORED.	2017-01-17 11:25:02 -08:00
Martin von Zweigbergk	a445384510	help: don't let tools reflow revlog flags list Before this change, the text about revlog flags was reflowed into a single paragraph, which made it a bit hard to read. I don't even know the rules around this, but adding a blank line before each flag seems to prevent the reflowing.	2017-01-17 11:45:10 -08:00
Martin von Zweigbergk	0ecfe18db3	help: format revlog.txt more closely to result The rendered text has spaces before each item in the list	2017-01-17 11:29:06 -08:00
Gregory Szorc	73aa240fe1	util: declare wire protocol support of compression engines This patch implements a new compression engine API allowing compression engines to declare support for the wire protocol. Support is declared by returning a compression format string identifier that will be added to payloads to signal the compression type of data that follows and default integer priorities of the engine. Accessor methods have been added to the compression engine manager class to facilitate use. Note that the "none" and "bz2" engines declare wire protocol support but aren't enabled by default due to their priorities being 0. It is essentially free from a coding perspective to support these compression formats, so we do it in case anyone may derive use from it.	2016-12-24 13:51:12 -07:00
Gregory Szorc	0d089ecdf9	internals: document compression negotiation As part of adding zstd support to all of the things, we'll need to teach the wire protocol to support non-zlib compression formats. This commit documents how we'll implement that. To understand how we arrived at this proposal, let's look at how things are done today. The wire protocol today doesn't have a unified format. Instead, there is a limited facility for differentiating replies as successful or not. And, each command essentially defines its own response format. A significant deficiency in the current protocol is the lack of payload framing over the SSH transport. In the HTTP transport, chunked transfer is used and the end of an HTTP response body (and the end of a Mercurial command response) can be identified by a 0 length chunk. This is how HTTP chunked transfer works. But in the SSH transport, there is no such framing, at least for certain responses (notably the response to "getbundle" requests). Clients can't simply read until end of stream because the socket is persistent and reused for multiple requests. Clients need to know when they've encountered the end of a request but there is nothing simple for them to key off of to detect this. So what happens is the client must decode the payload (as opposed to being dumb and forwarding frames/packets). This means the payload itself needs to support identifying end of stream. In some cases (bundle2), it also means the payload can encode "error" or "interrupt" events telling the client to e.g. abort processing. The lack of framing on the SSH transport and the transfer of its responsibilities to e.g. bundle2 is a massive layering violation and a wart on the protocol architecture. It needs to be fixed someday by inventing a proper framing protocol. So about compression. The client transport abstractions have a "_callcompressable()" API. This API is called to invoke a remote command that will send a compressible response. The response is essentially a "streaming" response (no framing data at the Mercurial layer) that is fed into a decompressor. On the HTTP transport, the decompressor is zlib and only zlib. There is currently no mechanism for the client to specify an alternate compression format. And, clients don't advertise what compression formats they support or ask the server to send a specific compression format. Instead, it is assumed that non-error responses to "compressible" commands are zlib compressed. On the SSH transport, there is no compression at the Mercurial protocol layer. Instead, compression must be handled by SSH itself (e.g. `ssh -C`) or within the payload data (e.g. bundle compression). For the HTTP transport, adding new compression formats is pretty straightforward. Once you know what decompressor to use, you can stream data into the decompressor until you reach a 0 size HTTP chunk, at which point you are at end of stream. So our wire protocol changes for the HTTP transport are pretty straightforward: the client and server advertise what compression formats they support and an appropriate compression format is chosen. We introduce a new HTTP media type to hold compressed payloads. The header of the payload defines the compression format being used. Whoever is on the receiving end can sniff the first few bytes route to an appropriate decompressor. Support for multiple compression formats is advertised on both server and client. The server advertises a "compression" capability saying which compression formats it supports and in what order they are preferred. Clients advertise their support for multiple compression formats and media types via the introduced "X-HgProto" request header. Strictly speaking, servers don't need to advertise which compression formats they support. But doing so allows clients to fail fast if they don't support any of the formats the server does. This is useful in situations like sending bundles, where the client may have to perform expensive computation before sending data to the server. Rather than simply advertise a list of supported compression formats, we introduce an additional "httpmediatype" server capability advertising which media types the server supports. This means servers are explicit about what formats they exchange. IMO, this is superior to inferring support from other capabilities (like "compression"). By advertising compression support on each request in the "X-HgProto" header and media type and direction at the server level, we are able to gradually transition existing commands/responses to the new media type and possibly compression. Contrast with the old world, where we only supported a single media type and the use of compression was built-in to the semantics of the command on both client and server. In the new world, if "application/mercurial-0.2" is supported, compression is supported. It's that simple. It's worth noting that we explicitly don't use "Accept," "Accept-Encoding," "Content-Encoding," or "Transfer-Encoding" for content negotiation and compression. People knowledgeable of the HTTP specifications will say that we should use these because that's what they are designed to be used for. They have a point and I sympathize with the argument. Earlier versions of this commit even defined supported media types in the "Accept" header. However, my years of experience rolling out services leveraging HTTP has taught me to not trust the HTTP layer, especially if you are going outside the normal spec (such as using a custom "Content-Encoding" value to represent zstd streams). I've seen load balancers, proxies, and other network devices do very bad and unexpected things to HTTP messages (like insisting zlib compressed content is decoded and then re-encoded at a different compression level or even stripping compression completely). I've found that the best way to avoid surprises when writing protocols on top of HTTP is to use HTTP as a dumb transport as much as possible to minimize the chances that an "intelligent" agent between endpoints will muck with your data. While the widespread use of TLS is mitigating many intermediate network agents interfering with HTTP, there are still problems at the edges, with e.g. the origin HTTP server needing to convert HTTP to and from WSGI and buggy or feature-lacking HTTP client implementations. I've found the best way to avoid these problems is to avoid using headers like "Content-Encoding" and to bake as much logic as possible into media types and HTTP message bodies. The protocol changes in this commit do rely on a custom HTTP request header and the "Content-Type" headers. But we used them before, so we shouldn't be increasing our exposure to "bad" HTTP agents. For the SSH transport, we can't easily implement content negotiation to determine compression formats because the SSH transport has no content negotiation capabilities today. And without a framing protocol, we don't know how much data to feed into a decompressor. So in order to implement compression support on the SSH transport, we'd need to invent a mechanism to represent content types and an outer framing protocol to stream data robustly. While I'm fully capable of doing that, it is a lot of work and not something that should be undertaken lightly. My opinion is that if we're going to change the SSH transport protocol, we should take a long hard look at implementing a grand unified protocol that attempts to address all the deficiencies with the existing protocol. While I want this to happen, that would be massive scope bloat standing in the way of zstd support. So, I've decided to take the easy solution: the SSH transport will not gain support for multiple compression formats. Keep in mind it doesn't support any compression today. So essentially nothing is changing on the SSH front.	2016-12-24 13:56:36 -07:00
Remi Chaintron	66071d6de5	revlog: REVIDX_EXTSTORED flag This flag will be used by the lfs extension to mark the revision data as stored externally.	2017-01-05 17:16:51 +00:00
Remi Chaintron	0c2f48e6cc	documentation: better censor flag documentation	2016-12-22 19:08:38 -05:00
Remi Chaintron	25e30cf1ed	censor: flag internal documentation	2016-11-23 17:36:35 +00:00
Gregory Szorc	247c663431	help: clarify contents of revlog index The previous wording indicated that field at index 3 was the size of the decompressed chunk, not the size of the full revision text.	2016-11-22 18:13:02 -08:00
Gregory Szorc	57934ad4c1	help: document wire protocol commands	2016-08-22 19:50:21 -07:00
Gregory Szorc	a167c55bc7	help: document wire protocol "handshake" protocol There isn't a formal handshake protocol in the wire protocol. But clients almost certainly need to perform particular actions before they can communicate with a server optimally. So document what that is so people understand what's going on at connection establishment time.	2016-08-22 19:49:59 -07:00
Gregory Szorc	378142967e	help: document wire protocol capabilities All capabilities from the history of the project are now documented.	2016-08-22 19:48:31 -07:00
Gregory Szorc	b21abff69d	help: document wire protocol transport protocols The HTTP and SSH transport protocols are documented. This includes how commands and arguments are serialized as well as response types.	2016-08-22 19:47:34 -07:00
Gregory Szorc	5c136c2209	help: internals topic for wire protocol The Mercurial wire protocol is under-documented. This includes a lack of source docstrings and comments as well as pages on the official wiki. This patch adds the beginnings of "internals" documentation on the wire protocol. The documentation should have nearly complete coverage on the lower-level parts of the protocol, such as the different transport mechanims, how commands and arguments are sent, capabilities, and, of course, the commands themselves. As part of writing this documentation, I discovered a number of deficiencies in the protocol and bugs in the implementation. I've started sending patches for some of the issues. I hope to send a lot more. This patch starts with the scaffolding for a new internals page.	2016-08-22 19:46:39 -07:00
Gregory Szorc	efb3c4ff63	help: don't try to render a section on sub-topics This patch subtly changes the behavior of the parsing of "X.Y" values to not set the "section" variable when rendering a known sub-topic. Previously, "section" would be the same as the sub-topic name. This required the sub-topic RST to have a section named the same as the sub-topic name. When I made this change, the descriptions from help.internalstable started being rendered in command line output. This didn't look correct to me, as it didn't match the formatting of main help pages. I corrected this by moving the top section to help.internalstable and changing the section levels of all the "internals" topics. The end result is that "internals" topics now match the rendering of main topics on both the CLI and HTML. And, "internals" topics no longer require a main section matching the name of the topic.	2016-08-06 17:04:22 -07:00
Matt Harbison	8d72832828	help: fix the display for `hg help internals.revlogs` (issue5227) It previously aborted saying the help section wasn't found. Credit to Yuya for figuring out the fix.	2016-05-08 22:28:09 -04:00
Gregory Szorc	db9b5063c1	help: document sharing of revlog header with revision 0 The previous docs were incorrect about there being a discrete header on revlogs.	2016-03-19 15:17:33 -07:00
Gregory Szorc	891be7d008	help: document requirements We didn't have unified documentation of the various repository requirements. This patch changes that.	2016-03-12 18:51:07 -08:00
Gregory Szorc	8e5f315fff	internals: document revlog format It seems like a good idea to document the revlog format. There is a lot more that could be added to this documentation. But you have to start somewhere.	2015-12-30 16:21:57 -07:00
Augie Fackler	e4e988fdf5	changegroups: add documentation for cg3	2015-12-18 09:57:35 -05:00
Gregory Szorc	0507fa0c45	help: add documentation for bundle types Bundle types and the high-level data format of each bundle isn't documented anywhere. Let's document this as well. Obviously there are many more details about bundles that could be written about. But you have to start somewhere.	2015-12-13 11:27:52 -08:00
Gregory Szorc	15c267d26e	help: add documentation for changegroup formats There is no formal location for spec-like technical/internal docs. The repository makes sense as such a location because spec-like documentation should be reviewed (ruling out a wiki). mpm has also stated that he would like this documentation to be part of the built-in help system. So, we establish an "internals" sub-directory to hold this class of documentation. The format of changegroups does not appear to be documented anywhere, even in source code. It therefore seemed like an appropriate first thing to document. This patch adds low-level documentation of versions 1 and 2 of the changegroup foromat. It currently only describes the raw data format. There is probably room to write higher-level documentation on strategies for producing and consuming the data. We'll leave that for another day. The added file is not yet accessible via `hg help` nor via hgweb. Support for this will follow in subsequent patches.	2015-10-25 00:19:45 +01:00

33 Commits