sapling

mirror of https://github.com/facebook/sapling.git synced 2024-10-11 09:17:30 +03:00

Author	SHA1	Message	Date
Pierre-Yves David	d03144db9c	hidden: extract the code generating "filtered rev" error for wrapping The goal is to help experimentation in extensions (i.e. evolve) around more advanced messages.	2017-04-15 18:13:10 +02:00
Matt Harbison	6d898e296f	serve: add support for Mercurial subrepositories I've been using `hg serve --web-conf ...` with a simple '/=projects/**' [paths] configuration for awhile without issue. Let's ditch the need for the manual configuration in this case, and limit the repos served to the actual subrepos. This doesn't attempt to handle the case where a new subrepo appears while the server is running. That could probably be handled with a hook if somebody wants it. But it's such a rare case, it probably doesn't matter for the temporary serves. The main repo is served at '/', just like a repository without subrepos. I'm not sure why the duplicate 'adding ...' lines appear on Linux. They don't appear on Windows (see 3f4ff1bdf101), so they are optional. Subrepositories that are configured with '../path' or absolute paths are not cloneable from the server. (They aren't cloneable locally either, unless they also exist at their configured source, perhaps via the share extension.) They are still served, so that they can be browsed, or cloned individually. If we care about that cloning someday, we can probably just add the extra entries to the webconf dictionary. Even if the entries use '../' to escape the root, only the related subrepositories would end up in the dictionary.	2017-04-15 18:05:40 -04:00
Matt Harbison	0181beb642	hgwebdir: allow a repository to be hosted at "/" This can be useful in general, but will also be useful for hosting subrepos, with the main repo at /.	2017-03-31 23:00:41 -04:00
Gregory Szorc	2d0781917d	httppeer: eliminate decompressresponse() proxy Now that the response instance itself is wrapped with error handling, we no longer need this code. This code became dead with the previous patch because the added code catches HTTPException and re-raises as something else.	2017-04-14 00:03:30 -07:00
Gregory Szorc	4958a4d6ca	httppeer: wrap HTTPResponse.read() globally There were a handful of places in the code where HTTPResponse.read() was called with no explicit error handling or with inconsistent error handling. In order to eliminate this class of bug, we globally swap out HTTPResponse.read() with a unified error handler. I initially attempted to fix all call sites. However, after going down that rabbit hole, I figured it was best to just change read() to do what we want. This appears to be a worthwhile change, as the tests demonstrate many of our uncaught exceptions go away. To better represent this class of failure, we introduce a new error type. The main benefit over IOError is it can hold a hint. I'm receptive to tweaking its name or inheritance.	2017-04-14 00:33:56 -07:00
Gregory Szorc	8637678d4e	phases: emit phases to pushkey protocol in deterministic order An upcoming test will report exact bytes sent over the wire protocol. Without this change, the ordering of phases listkey data is non-deterministic.	2017-04-13 22:12:04 -07:00
Gregory Szorc	2ccb65a5bc	keepalive: send HTTP request headers in a deterministic order An upcoming patch will add low-level testing of the bytes being sent over the wire. As part of developing that test, I discovered that the order of headers in HTTP requests wasn't deterministic. This patch makes the order deterministic to make things easier to test.	2017-04-13 18:04:38 -07:00
Denis Laxalde	9e99218a46	revset: properly parse "descend" argument of followlines() We parse "descend" symbol as a Boolean using getboolean (prior extraction by getargsdict already checked that it is a symbol). In tests, check for error cases and vary Boolean values here and there.	2017-04-15 11:29:42 +02:00
Denis Laxalde	f3c282d63c	revsetlang: add a getboolean helper function This will be used to parse followlines's "descend" argument.	2017-04-15 11:26:09 +02:00
Pierre-Yves David	53505593ab	track-tags: write all tag changes to a file The tag change information we compute is now written to disk. This gives hooks full access to that data. The format picked for that file uses a 2-character prefix for the action: -R: tag removed +A: tag added -M: tag moved (old value) +M: tag moved (new value) This format allows hooks to easily select the lines that matter to them without having to post-process the file too much. Here are a couple of examples: * to select all newly tagged changesets, match "^+", * to detect tag moves, match "^.M", * to detect tag deletions, match "-R". Once again we rely on the fact that the tag tests run through all possible situations to test this change.	2017-03-28 10:15:02 +02:00
Pierre-Yves David	cd08df0c89	track-tags: compute the actual differences between tags pre/post transaction We now compute the actual differences between tags before and after the transaction. This catches a couple of false positives in the tests. We compute the full difference since we are about to make this data available to hooks in the next changeset.	2017-03-28 10:14:55 +02:00
Pierre-Yves David	ac782d2423	track-tags: introduce first bits of tags tracking during transaction This changeset introduces detection of tag changes during a transaction. When this happens a 'tag_moved=1' argument is set for hooks, similar to what we do for bookmarks and phases. This code is disabled by default as there are still various performance concerns. Some require a smarter use of our existing tag caches and others require rework around the transaction logic to skip execution when unneeded. These performance improvements have been delayed; I would like to be able to experiment and stabilize the feature behavior first. Later changesets will push the concept further and provide a way for hooks to know what the actual changes introduced by the transaction are. Similar work is needed for the other families of changes (bookmark, phase, obsolescence, etc). Upgrade of the transaction logic will likely be performed at the same time. The current code can report some false positives when the .hgtags file changes but the resulting tags are unchanged. This will be fixed in the next changeset. For testing, we simply globally enable a hook in the tag test as all the possible tag update cases should exist there. A couple of them show the false positive mentioned above. See the in-code documentation for more details.	2017-03-28 06:38:09 +02:00
Pierre-Yves David	0736144919	tags: introduce a function to return a valid fnodes list from revs This will get used to compare tags between two sets of revisions during a transaction (pre and post heads). The end goal is to be able to track tag movement in transaction hooks.	2017-03-28 05:06:56 +02:00
Denis Laxalde	631e6988ca	context: possibly yield initial fctx in blockdescendants() If initial 'fctx' has changes in line range with respect to its parents, we yield it first. This makes 'followlines(..., descend=True)' consistent with 'descendants()' revset which yields the starting revision. We reuse one iteration of blockancestors() which does exactly what we want. In test-annotate.t, adjust 'startrev' in one case to cover the situation where the starting revision does not touch specified line range.	2017-04-14 14:25:06 +02:00
Denis Laxalde	559326afdb	context: add an assertion checking linerange consistency in blockdescendants() If this assertion fails, this indicates a flaw in the algorithm. So fail fast instead of possibly producing wrong results. Also extend the target line range in test to catch a merge changeset with all its parents.	2017-04-14 14:09:26 +02:00
Kostia Balytskyi	64a48b9fb1	windows: add win32com.shell to demandimport ignore list Module 'appdirs' tries to import win32com.shell (and catch ImportError as an indication of failure) to check whether some further functionality should be implemented one way or another [1]. Of course, demandimport lets it down, so if we want appdirs to work we have to add it to demandimport's ignore list. The reason we want appdirs to work is because it is used by setuptools [2] to determine egg cache location. Only fairly recent versions of setuptools depend on this so people don't see this often. [1] https://github.com/ActiveState/appdirs/blob/master/appdirs.py#L560 [2] `aae0a92811/pkg_resources/__init__.py (L1369)`	2017-04-14 12:34:26 -07:00
Bryan O'Sullivan	60b68c00eb	stdio: raise StdioError if something goes wrong in ui.flush The prior code used to ignore all errors, which was intended to deal with a decade-old problem with writing to broken pipes on Windows. However, that code inadvertently went a lot further, making it impossible to detect all I/O errors on stdio ... but only sometimes. What actually happened was that if Mercurial wrote less than a stdio buffer's worth of output (the overwhelmingly common case for most commands), any error that occurred would get swallowed here. But if the buffering strategy changed, an unhandled IOError could be raised from any number of other locations. Because we now have a top-level StdioError handler, and ui._write and ui._write_err (and now flush!) will raise that exception, we have one rational place to detect and handle these errors.	2017-04-11 14:54:12 -07:00
Bryan O'Sullivan	dd48bd8237	stdio: raise StdioError if something goes wrong in ui._write_err The prior code used to ignore certain classes of error, which was not the right thing to do.	2017-04-11 14:54:12 -07:00
Bryan O'Sullivan	7b0fee3bf9	stdio: raise StdioError if something goes wrong in ui._write	2017-04-11 14:54:12 -07:00
Bryan O'Sullivan	ebffdb4558	stdio: catch StdioError in dispatch.run and clean up appropriately We attempt to report what went wrong, and more importantly exit the program with an error code. (The exception we catch is not yet raised anywhere in the code.)	2017-04-11 14:54:12 -07:00
Bryan O'Sullivan	84ac0ade7c	stdio: add machinery to identify failed stdout/stderr writes Mercurial currently fails to notice failures to write to stdout or stderr. A correctly functioning command line tool should detect this and exit with an error code. To achieve this, we need a little extra plumbing, which we start adding here.	2017-04-11 14:54:12 -07:00
Bryan O'Sullivan	287bd28acf	atexit: switch to home-grown implementation	2017-04-11 14:54:12 -07:00
Bryan O'Sullivan	0c663fe04d	ui: add special-purpose atexit functionality In spite of its longstanding use, Python's built-in atexit code is not suitable for Mercurial's purposes, for several reasons: * Handlers run after application code has finished. * Because of this, the code that runs handlers swallows exceptions (since there's no possible stacktrace to associate errors with). If we're lucky, we'll get something spat out to stderr (if stderr still works), which of course isn't any use in a big deployment where it's important that exceptions get logged and aggregated. * Mercurial's current atexit handlers make unfortunate assumptions about process state (specifically stdio) that, coupled with the above problems, make it impossible to deal with certain categories of error (try "hg status > /dev/full" on a Linux box). * In Python 3, the atexit implementation is completely hidden, so we can't hijack the platform's atexit code to run handlers at a time of our choosing. As a result, here's a perfectly cromulent atexit-like implementation over which we have control. This lets us decide exactly when the handlers run (after each request has completed), and control what the process state is when that occurs (and afterwards).	2017-04-11 14:54:12 -07:00
Denis Laxalde	761577866a	context: follow all branches in blockdescendants() In the initial implementation of blockdescendants (and thus followlines(..., descend=True) revset), only the first branch encountered in descending direction was followed. Update the algorithm so that all children of a revision ('x' in code) are considered. Accordingly, we need to prevent a child revision from being yielded multiple times when it gets visited through different paths, so we skip 'i' when this occurs. Finally, since we now consider all parents of a possible child touching a given line range, we take care of yielding the child if it has a diff in the specified line range with at least one of its parents (same logic as blockancestors()).	2017-04-14 08:55:18 +02:00
Jun Wu	dcf42da6e9	pager: set some environment variables if they're not set Git did this already [1] [2]. We want this behavior too [3]. This provides a better default user experience (like, supporting colors) if users have things like "PAGER=less" set, which is not uncommon. The environment variables are provided by a method so extensions can override them on demand. [1]: `6a5ff7acb5/pager.c (L87)` [2]: `6a5ff7acb5/Makefile (L1545)` [3]: https://www.mercurial-scm.org/pipermail/mercurial-devel/2017-March/094780.html	2017-04-13 08:27:19 -07:00
Augie Fackler	fe10e9b912	sshpeer: fix docstring typo	2017-04-13 14:48:18 -04:00
Augie Fackler	6278186d0b	util: pass sysstrs to warnings.filterwarnings Un-breaks the Python 3 build.	2017-04-13 13:12:49 -04:00
Pierre-Yves David	dc3a34b74e	vfs: deprecate all old classes in scmutil Now that all vfs classes have moved to the vfs module, we can deprecate the old ones.	2017-04-03 14:21:38 +02:00
Pierre-Yves David	a185960897	util: add a way to issue deprecation warning without a UI object Our current deprecation warning mechanism relies on a ui object. There are cases where we cannot have access to the UI object. As a general rule, we avoid using the Python mechanism for deprecation warnings because, up to Python 2.6, it exposes warnings to unsuspecting users who cannot do anything to deal with them. So we build a "safe" strategy that hides these warnings behind a flag in an environment variable. The test runner sets this flag so that tests show these warnings. This will help us mark APIs as deprecated so that extensions can update their code.	2017-04-04 11:03:29 +02:00
Denis Laxalde	160d0b298e	gitweb: plug followlines UI in filerevision view Mostly copy CSS rules from style-paper.css into style-gitweb.css. The only modification is addition of !important on "background-color" rule for "pre.sourcelines > span.followlines-selected" selector as the background color is otherwise overridden by "pre.sourcelines.stripes > :nth-child(4n+4)" rule.	2017-04-13 09:49:48 +02:00
Denis Laxalde	14cc343c76	gitweb: handle "patch" query parameter in filelog view As for paper style, in d9b8811bed4a, we display "diff" data as an additional row in the table of revision entries for the gitweb template. Also, as these additional diff rows have a white background, they may be confused with log entry rows ("age", "author", "description", "links") of even parity (parity0 also have a white background). So we disable parity colors for log entry rows when diff is displayed and fix the color to the "dark" parity (i.e. parity1 #f6f6f0) so that it's always distinguishable from	2017-04-13 10:04:09 +02:00
Denis Laxalde	8806e20e50	gitweb: add information about "linerange" filtering in filelog view As for paper style, in a58e79a03a6e, we display a "(following lines <fromline>:<toline> <a href='...'>back to filelog</a>)" message alongside the file name when "linerange" query parameter is present.	2017-04-13 09:59:58 +02:00
Gábor Stefanik	387861cc38	util: fix human-readable printing of negative byte counts Apply the same human-readable printing rules to negative byte counts as to positive ones. Fixes output of debugupgraderepo.	2017-04-10 18:16:30 +02:00
Gregory Szorc	6c7c4762ec	show: implement underway view This is the beginning of a wip/smartlog view. It is basically a manually constructed (read: fast) revset function to collect "relevant" changesets combined with a custom template and a graph displayer. It obviously needs a lot of work. I'd like to get something usable in 4.2 so `hg show` has some value to end-users. Let the bikeshedding begin.	2017-04-12 20:31:15 -07:00
Gregory Szorc	e9dd2f7a3f	pycompat: import correct cookie module on Python 3 http.cookielib doesn't exist. http.cookiejar does and it contains the symbols we need. This fixes test failures on Python 3.	2017-04-12 18:42:20 -07:00
Denis Laxalde	5544045959	hgweb: add a link to followlines in descending direction We change the content of the followlines popup to display two links inviting to follow the history of selected lines in ascending (as before) and descending directions. The popup now renders as: follow history of lines <fromline>:<toline>: <a href=...>ascending</a> / <a href=...>descending</a>	2017-04-10 17:36:40 +02:00
Denis Laxalde	bd52f5d831	hgweb: handle a "descend" query parameter in filelog command When this "descend" query parameter is present along with "linerange" parameter, we get revisions following line range in descending order. The parameter has no effect without "linerange".	2017-04-10 16:23:41 +02:00
Yuya Nishihara	ee998576d8	worker: flush messages written by child processes before exit I found some child outputs were lost while testing the previous patch. Since os._exit() does nothing special, we need to do that explicitly.	2017-02-25 12:48:50 +09:00
Rishabh Madan	d0ac5a9dcb	ui: replace obsolete default-push with default:pushurl (issue5485) Default-push has been deprecated in favour of default:pushurl. But "hg clone" still inserts this in every hgrc file it creates. This patch updates the message by replacing default-push with default:pushurl and also makes the necessary changes to test files.	2017-02-25 16:57:21 +05:30
FUJIWARA Katsunori	47ba9fae77	worker: ignore meaningless exit status indication returned by os.waitpid() Before this patch, worker implementation assumes that os.waitpid() with os.WNOHANG returns '(0, 0)' for still running child process. This is explicitly specified as below in Python API document. os.WNOHANG The option for waitpid() to return immediately if no child process status is available immediately. The function returns (0, 0) in this case. On the other hand, POSIX specification doesn't define the "stat_loc" value returned by waitpid() with WNOHANG for such child process. http://pubs.opengroup.org/onlinepubs/9699919799/functions/waitpid.html CPython implementation for os.waitpid() on POSIX doesn't take any care of this gap, and this may cause unexpected "exit status indication" even on POSIX conformance platform. For example, os.waitpid() with os.WNOHANG returns non-zero "exit status indication" on FreeBSD. This implies os.kill() with own pid or sys.exit() with non-zero exit code, even if no child process fails. To ignore meaningless exit status indication returned by os.waitpid(), this patch skips subsequent steps forcibly, if os.waitpid() returns 0 as pid. This patch also arranges examination of 'p' value for readability. FYI, there are some issues below about this behavior reported for CPython. https://bugs.python.org/issue21791 https://bugs.python.org/issue27808	2017-02-25 01:07:52 +09:00
Siddharth Agarwal	7d1a6f9777	bundle2: fix assertion that 'compression' hasn't been set `n.lower()` will return `compression`, not `Compression`.	2017-02-13 11:43:12 -08:00
Pierre-Yves David	43b1ef004c	wireproto: properly report server Abort during 'getbundle' Previously, an Abort raised during a 'getbundle' call was poorly reported (HTTP-500 for http, some scary messages for ssh). Abort errors have been properly reported for "push" for a long time; there is no reason for 'getbundle' to be different. We now properly catch such errors and report them back in the best way available. For bundle2, we issue a valid bundle2 reply (as expected by the client) with an 'error:abort' part. With bundle1 we do the best we can depending on whether http or ssh is used.	2017-02-10 18:20:58 +01:00
Pierre-Yves David	695fa85daa	getbundle: cleanly handle remote abort during getbundle bundle2 allows the server to report errors explicitly. This was initially implemented for push, but there is no reason not to use it for pull too. This changeset adds logic similar to that in 'unbundle' to the client side of 'getbundle'. That logic makes sure the error is properly reported as "remote". This will allow the server side of getbundle to send a clean "Abort" message in the next changeset.	2017-02-10 18:17:20 +01:00
Pierre-Yves David	d00dbd00d9	bundle1: fix bundle1-denied reporting for pull over ssh Changeset a0966f529e1b introduced a config option to have the server deny pull using bundle1. The original protocol was not really designed to allow that kind of error reporting, so a hack was used. It turned out that the hack only works on HTTP and that the ssh server hangs forever when it is used. After further digging, there is no way to report the error in a unified way. Using `ooberror` freezes ssh and raising 'Abort' makes HTTP return an HTTP-500 without further details. So with sadness we implement a version that dispatches according to the protocol used. Now the error is properly reported, but we still have an ungraceful abort after that. The protocol does not allow anything better to happen using bundle1.	2017-02-10 18:06:08 +01:00
Pierre-Yves David	5b07cfa3b3	bundle1: display server abort hint during unbundle The code was printing the abort message but not the hint. This is now fixed.	2017-02-10 17:56:52 +01:00
Pierre-Yves David	64f57e513b	bundle1: fix bundle1-denied reporting for push over ssh Changeset a0966f529e1b introduced a config option to have the server deny push using bundle1. The original protocol was not really designed to allow this kind of error reporting, so a hack was used. It turned out that the hack only works on HTTP and that the ssh wire peer hangs forever when the same hack is used. After further digging, there is no way to report the error in a unified way. Using 'ooberror' freezes ssh and raising 'Abort' makes HTTP return an HTTP500 without further details. So with sadness we implement a version that dispatches according to the protocol used. We also add a test for pushing over ssh to make sure we won't regress in the future. That test shows that the hint is missing; this is another bug, fixed in the next changeset.	2017-02-10 17:56:59 +01:00
Pierre-Yves David	e8a7ecc281	bundle2: keep hint close to the primary message when remote abort The remote hint message was ignored when reporting the remote error and passed to the local generic abort error. I think I might initially have tried to avoid reimplementing logic controlling the hint display depending on the verbosity level. However, first, there does not seem to be any such verbosity-related logic, and second, the result was wrong as the primary error and the hint were split apart. We now properly print the hint as remote output.	2017-02-10 17:56:47 +01:00
FUJIWARA Katsunori	2afd920706	misc: update year in copyright lines This patch also makes some expected output lines in tests glob-ed for persistence of them. BTW, files below aren't yet changed in 2017, but this patch also updates copyright of them, because: - mercurial/help/hg.1.txt almost all of "man hg" output comes from online help of hg command, and is already changed in 2017 - mercurial/help/hgignore.5.txt - mercurial/help/hgrc.5 "copyright 2005-201X Matt Mackall" in them mentions about copyright of Mercurial itself	2017-02-12 02:23:33 +09:00
Mads Kiilerich	6945cf0f5b	merge: more safe detection of criss cross merge conflict between dm and r 0b5f1f2efc77 introduced handling of a crash in this case. A review comment suggested that it was not entirely obvious that a 'dm' always would have a 'r' for the source file. To mitigate that risk, make the code more conservative and make less assumptions.	2017-02-01 02:10:30 +01:00
Mads Kiilerich	120b66d101	merge: fix crash on criss cross merge with dir move and delete (issue5020) Work around that 'dm' in the data model only can have one operation for the target file, but still can have multiple and conflicting operations on the source file where the other operation is a 'rm'. The move would thus fail with 'abort: No such file or directory'. In this case it is "obvious" that the file should be removed, either before or after moving it. We thus keep the 'rm' of the source file but drop the 'dm'. This is not a pretty fix but quite "obviously" safe (famous last words...) as it only touches a rare code path that used to crash. It is possible that it would be better to swap the files for 'dm' as suggested on https://bz.mercurial-scm.org/show_bug.cgi?id=5020#c13 but it is not entirely obvious that it not just would create conflicts on the other file. That can be revisited later.	2017-01-31 03:25:59 +01:00
Martin von Zweigbergk	06f115a93e	util: make sortdict.keys() return a copy dict.keys() is documented to return a copy, so it's surprising that sortdict.keys() did not. I noticed this because we have an extension that calls readlocaltags(). That method tries to remove any tags that point to non-existent revisions (most likely stripped). However, since it's unintentionally working on the instance it's modifying, it sometimes fails to remove tags when there are multiple bad tags in a row. This was not caught because localrepo.tags() does an additional layer of filtering. sortdict is also used in other places, but I have not checked whether its keys() and/or __delitem__() methods are used there.	2017-01-30 22:58:56 -08:00
Yuya Nishihara	74023f2b13	revset: prevent using outgoing() and remote() in hgweb session (BC) outgoing() and remote() may stall for long due to network I/O, which seems unsafe per definition, "whether a predicate is safe for DoS attack." But I'm not 100% sure about this. If our concern isn't elapsed time but CPU resource, these predicates are considered safe. Perhaps that would be up to the web/application server configuration? Anyway, outgoing() and remote() wouldn't be useful in hgweb, so I think it's okay to ban them.	2017-01-20 21:33:18 +09:00
Sean Farley	e145fc2df7	ui: rename tmpdir parameter to more specific repopath This was requested by Augie and I agree that repopath is more descriptive.	2017-01-18 18:25:51 -08:00
Gregory Szorc	9c03a7696d	statprof: require input file statprof has a __main__ handler that allows viewing of previously written data files. As Yuya pointed out during review, 82ee01726a77 broke this. This patch fixes that.	2017-01-18 22:45:07 -08:00
Sean Farley	a405503f7a	cmdutil: add tmpdir parameter to ui.edit calls	2017-01-16 21:15:21 -08:00
Sean Farley	9280f19af2	ui: add a parameter to set the temporary directory for edit Until callsites are updated, this will have no effect. Once callsites are updated, specifying experimental.editortmpinhg will create editor temporary files in a subdirectory of .hg, which will make it easier for tool integrations to determine what repository is in play when they're asked to edit an hg-related file.	2017-01-16 21:05:22 -08:00
Pulkit Goyal	f38d10e539	help: update help for `hg update` which was misleading (issue5427)	2017-01-18 03:44:19 +05:30
Matt Harbison	511b164fad	templater: add '{envvars}' to access environment variables Since the option for ui.exportableenviron is experimental, so is this template until the underlying API is sorted out.	2017-01-17 23:12:54 -05:00
Matt Harbison	5a63dbb230	ui: introduce an experimental dict of exportable environment variables Care needs to be taken to prevent leaking potentially sensitive environment variables through hgweb, if template support for environment variables is to be introduced. There are a few ideas about the API for preventing accidental leaking [1]. Option 3 seems best from the POV of not needing to configure anything in the normal case. I couldn't figure out how to do that, so guard it with an experimental option for now. [1] https://www.mercurial-scm.org/pipermail/mercurial-devel/2017-January/092383.html	2017-01-17 23:05:12 -05:00
Martin von Zweigbergk	ad5f4ef8a6	revlog: give EXTSTORED flag value to narrowhg Narrowhg has been using "1 << 14" as its revlog flag value for a long time. We (Google) have many repos with that value in production already. When the same value was reserved for EXTSTORED, it made those repos invalid. Upgrading them will be a little painful. We should clearly have reserved the value for narrowhg a long time ago. Since the EXTSTORED flag is not yet in any release and Facebook also says they have not started using it in production, it should be okay to change it. This patch gives the current value (1 << 14) back to narrowhg and gives a new value (1 << 13) to EXTSTORED.	2017-01-17 11:25:02 -08:00
Martin von Zweigbergk	a445384510	help: don't let tools reflow revlog flags list Before this change, the text about revlog flags was reflowed into a single paragraph, which made it a bit hard to read. I don't even know the rules around this, but adding a blank line before each flag seems to prevent the reflowing.	2017-01-17 11:45:10 -08:00
Martin von Zweigbergk	0ecfe18db3	help: format revlog.txt more closely to result The rendered text has spaces before each item in the list	2017-01-17 11:29:06 -08:00
Denis Laxalde	86ca3ec602	hgweb: simplify calculation of first revision in filelog command	2017-01-17 09:19:24 +01:00
Denis Laxalde	8eecb0ced7	hgweb: restore ascending iteration on revs in filelog web command Follow-up on e082a1597833. Adjust back the "parity" generator's offset to keep rendering the same.	2017-01-17 09:17:29 +01:00
Denis Laxalde	779e08447b	revset: add a 'descend' argument to followlines to return descendants This is useful to follow changes in a block of lines forward in the history (for instance, when one wants to find out how a function evolved from a point in history). We added a 'descend' parameter to followlines(), which defaults to False. If True, followlines() returns descendants of startrev. Because context.blockdescendants() does not follow renames, these are not followed by the revset either, so history will end when a rename occurs (as can be seen in tests).	2017-01-16 09:24:47 +01:00
Denis Laxalde	d7409a0458	context: add a blockdescendants function This is symmetrical with blockancestors() and yields descendants of a filectx with changes in the given line range. The noticeable difference is that the algorithm does not follow renames (probably because filelog.descendants() does not), so we are missing branches with renames.	2017-04-10 15:11:36 +02:00
Gregory Szorc	ef4d6a1617	url: support auth.cookiesfile for adding cookies to HTTP requests Mercurial can't currently send cookies as part of HTTP requests. Some authentication systems use cookies, so adding support for sending cookies seems like a useful feature. This patch implements support for reading cookies from a file and automatically sending them as part of the request. We rely on the "cookiejar" Python module to do the heavy lifting of parsing cookies files. We currently only support the Mozilla (really Netscape-era) cookie format. There is another format supported by cookielib and we may want to consider using that, especially since the Netscape cookie parser can't parse ports. It wasn't immediately obvious to me what the format of the other parser is, so I didn't know how to test it. I /think/ it might be literal "Cookie" header values, but I'm not sure. If it is more robust than the Netscape format, we may want to just support it.	2017-03-09 22:40:52 -08:00
Gregory Szorc	bd7f2afe30	httpconnection: allow a global auth.cookiefile config entry This foreshadows support for defining a cookies file.	2017-03-09 22:35:10 -08:00
Gregory Szorc	3c5a0a039c	util: make cookielib module available In preparation for supporting sending cookies on HTTP requests.	2017-03-09 21:35:21 -08:00
Pierre-Yves David	010d017cdd	crecord: avoid setting non-existing SIGTSTP signal on windows (issue5512) Windows does not have a SIGTSTP so we avoid setting the handler if the signal is unknown.	2017-04-06 11:28:25 +02:00
Pierre-Yves David	5a12bd8592	crecord: ensure we reinstall the SIGTSTP handler Previously, exceptions would prevent the reinstallation of the signal.	2017-04-06 11:25:13 +02:00
Pierre-Yves David	6ab2d25fb5	crecord: avoid setting non-existing signal SIGWINCH on windows Windows does not have a SIGWINCH so we avoid setting the handler if the signal is unknown.	2017-04-06 11:25:33 +02:00
Pierre-Yves David	75f4f604c1	crecord: ensure we reinstall the SIGWINCH handler Previously, an exception in _main(...) would prevent the reinstallation of the signal.	2017-03-26 15:06:09 +02:00
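The two ideas in the crecord signal commits above (skip signals the platform lacks, and always put the old handler back) can be sketched as follows. The helper name `withhandler` is hypothetical and this is not the crecord code itself.

```python
import signal


def withhandler(signame, handler, func):
    """Run func() with handler installed for the named signal, if it exists.

    The previous handler is always reinstalled, even when func() raises.
    """
    # Some platforms lack certain signals (e.g. no SIGTSTP on Windows).
    signum = getattr(signal, signame, None)
    if signum is None:
        return func()
    previous = signal.signal(signum, handler)
    try:
        return func()
    finally:
        # Runs on both normal return and exception.
        signal.signal(signum, previous)
```

The `try`/`finally` is the part that addresses the "exceptions would prevent the reinstallation" problem; the `getattr` lookup covers the missing-signal case.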
Pierre-Yves David	83f005a5e4	crecord: extract most of 'main' into a sub function Some setup and cleanup is necessary around the main code. That setup/cleanup code needs multiple adjustments, so we extract the core code into its own function first for clarity.	2017-03-26 15:05:12 +02:00
Yuya Nishihara	35d42be491	templater: add shorthand for building a dict like {"key": key} Like field init shorthand of Rust. This is convenient for building a JSON object from selected keywords. This means dict() won't support Python-like dict(iterable) syntax because it's ambiguous. Perhaps it could be implemented as 'mapdict(xs % (k, v))'.	2017-04-03 23:13:49 +09:00
Yuya Nishihara	d86057a7bc	templater: find keyword name more thoroughly on filtering error Before, it could spill an internal representation of compiled template such as [(<function runsymbol at 0x....>, 'extras'), ...]. Show less cryptic message if no symbol found. New findsymbolicname() function will be also used by dict() constructor.	2017-04-08 23:33:32 +09:00
Yuya Nishihara	ada544b9a5	templater: add dict() constructor It's troublesome to build JSON by template, so let's add programmatic way.	2017-04-03 22:54:06 +09:00
Yuya Nishihara	2274942817	templatekw: add public function to wrap a dict by _hybrid object	2017-04-05 22:28:09 +09:00
Yuya Nishihara	f8dcd91891	templatekw: add public function to wrap a list by _hybrid object	2017-04-05 22:25:36 +09:00
Yuya Nishihara	17d1580914	templatekw: add default implementation of _hybrid.gen This is convenient for new template keyword, which doesn't need to support the legacy list hack (provided by _showlist()), but still wants to have a string representation.	2017-04-12 21:10:47 +09:00
Yuya Nishihara	e70ac1c73a	parser: preserve order of keyword arguments This helps building dict(key1=value1, ...) in deterministic way.	2017-04-09 11:58:27 +09:00
Yuya Nishihara	33d96b70bc	parser: extend buildargsdict() to support arbitrary number of **kwargs Prepares for adding dict(key1=value1, ...) template function. More tests will be added later.	2017-04-03 22:07:09 +09:00
Yuya Nishihara	2b723f40bc	parser: verify excessive number of args excluding kwargs in buildargsdict() This makes the next patch slightly simpler. We don't need to check the excessive number of keyword arguments since unknown and duplicated kwargs are rejected.	2017-04-08 20:07:37 +09:00
Pierre-Yves David	63f0ebdb7f	upgrade: simplify the "origin" dispatch in dry run We could compute the final set we need directly.	2017-04-11 00:03:11 +02:00
Pierre-Yves David	0befb32302	upgrade: use 'improvement' object for action too This simplifies multiple pieces of code. For now we restrict this upgrade to the top level function to keep this patch simple.	2017-04-10 23:11:45 +02:00
Pierre-Yves David	8343e068f0	upgrade: implement equality for 'improvement' object Through the code, we use a mix of 'improvement' object and string. Having a single type would be simpler. For this we need the object to be comparable.	2017-04-10 23:10:03 +02:00
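Making an object comparable, as the message above describes, usually means defining equality and a matching hash. A minimal sketch, assuming an improvement is identified by its name (the attribute names here are illustrative, not taken from the patch):

```python
class improvement(object):
    """Toy stand-in for an upgrade 'improvement' (attributes are assumed)."""

    def __init__(self, name, description):
        self.name = name
        self.description = description

    def __eq__(self, other):
        if not isinstance(other, improvement):
            # Let Python try the reflected comparison or fall back to identity.
            return NotImplemented
        return self.name == other.name

    def __ne__(self, other):
        result = self.__eq__(other)
        return result if result is NotImplemented else not result

    def __hash__(self):
        # Equal objects must hash equally so sets and dicts behave.
        return hash(self.name)
```

With equality and hashing in place, such objects can be stored in sets and compared directly instead of mixing objects and plain strings.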
Pierre-Yves David	28e1ded0a7	upgrade: simplify some of the initial dispatch for dry run Since we already have the list of deficiencies, we can use it directly.	2017-04-10 22:15:17 +02:00
Pierre-Yves David	917b0eb147	upgrade: simplify 'determineactions' Since we only take 'deficiencies', we can simplify the function and clarify its arguments.	2017-04-07 18:39:27 +02:00
Pierre-Yves David	dbe4fb45ab	upgrade: filter optimizations outside of 'determineactions' This sounds like higher level logic to process arguments. Moving it out of 'determineactions' will allow passing only deficiencies to the function. Then, in a future changeset, we will remove dispatch on "improvement type" within the function. See next changeset for details.	2017-04-11 23:46:16 +02:00
Pierre-Yves David	74208899a2	upgrade: directly iterate over optimisations Since we already have the list of optimisations independent from the deficiencies, we can use it directly. (we make a dual assignment in this changeset to simplify the next one)	2017-04-07 18:46:27 +02:00
Pierre-Yves David	a5369d6f5d	upgrade: simplify optimisations validation Since we fetch optimizations distinctly from the deficiencies, we can simplify some code.	2017-04-10 21:01:06 +02:00
Pierre-Yves David	6e51b0fbf0	upgrade: split finding deficiencies from finding optimisations Our ultimate goal is to make it easier to get a diagnostic of the repository format. An important first step for that is to separate the part related to the repository format from the optimisations. We start by having two different functions returning the two categories of possible "improvement".	2017-04-10 21:00:52 +02:00
Pierre-Yves David	a73976f4f4	upgrade: update the copyright statement	2017-04-11 22:07:40 +02:00
Pierre-Yves David	ef922b1da6	upgrade: update the header comment	2017-04-11 22:07:15 +02:00
Pierre-Yves David	f958f09136	upgrade: import 'localrepo' globally The in-function imports mention a cycle that seems to no longer be relevant. As a result, we just import it globally.	2017-04-11 22:01:13 +02:00
Matt Harbison	38d197a30d	windows: add context manager support to mixedfilemodewrapper I stumbled into this in the next patch. The difference between getting a context manager capable object or not from vfs classes was as subtle as adding a '+' to the file mode.	2017-04-11 21:38:11 -04:00
Pierre-Yves David	a8ff8b5088	bundle2: move 'seek' and 'tell' methods off the unpackermixin class These methods are unrelated to unpacking. They are used internally by the 'unbundlepart' class only. So we move them there as private methods. In the same go, we clarify their internal role in their docstring.	2017-04-09 19:09:07 +02:00
Yuya Nishihara	073239ae67	templater: port pad() to take keyword arguments This is another example where keyword arguments can be actually useful.	2017-04-03 22:23:52 +09:00
Yuya Nishihara	85fe439717	templater: add support for keyword arguments Unlike revset, function arguments are pre-processed in templater. That's why we need to define argspec per function. An argspec field looks somewhat redundant in @templatefunc definition as a name field contains human-readable list of arguments. I'll make function doc be built from argspec later. Ported separate() function as an example.	2017-04-03 21:22:39 +09:00
Yuya Nishihara	1d5bb45321	templater: add parsing rule for key-value pair Based on the revset implementation, ef14ee493cf7. This patch also adjusts the test as '=' is now a valid token.	2017-04-03 20:55:55 +09:00
Yuya Nishihara	fe158d1bad	templater: adjust binding strengths to make room for key-value operator Changed as follows: - template ops (%, \|): +10 - arithmetic ops: +1 (but "negate" should be greater than "%")	2017-04-03 20:44:05 +09:00
Yuya Nishihara	ef29c2e54c	templater: sort token table by binding strength Just for readability.	2017-04-03 20:37:25 +09:00
Yuya Nishihara	0aa51ecaec	templater: make _hybrid provide more list/dict-like methods So the JSON filter works.	2017-04-04 22:31:59 +09:00
Yuya Nishihara	e6e5ca157b	templater: hide private variable of _hybrid	2017-04-04 22:20:06 +09:00
Yuya Nishihara	e6ea93a8d4	templater: remove __iter__() from _hybrid, resolve it explicitly The goal is to fix "{hybrid_obj\|json}" output. A _hybrid object must act as a list or a dict as well as a generator of legacy template strings. Before, _hybrid.__iter__() was assigned for legacy template, which conflicted with list.__iter__() API. This patch drops _hybrid.__iter__() and makes stringify/flatten functions unwrap a generator instead.	2017-04-04 22:19:02 +09:00
Denis Laxalde	098c0d5368	context: extract _changesinrange() out of blockancestors() We'll need it to write a blockdescendants function in next changeset.	2017-01-16 09:22:32 +01:00
Pulkit Goyal	5a0e39fb56	util: add length argument to util.buffer() util.buffer() either returns inbuilt buffer function or defines a new one which slices. The inbuilt buffer() also has a length argument which is missing from the ones we defined. This patch adds that length argument.	2017-01-14 20:05:15 +05:30
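The slicing fallback with the added length argument might look like the sketch below. It assumes a `memoryview`-based fallback for illustration; the patch itself may differ in detail.

```python
def buffer(sliceable, offset=0, length=None):
    """Return a zero-copy view of sliceable starting at offset.

    If length is given, the view covers at most that many bytes.
    """
    view = memoryview(sliceable)
    if length is not None:
        return view[offset:offset + length]
    return view[offset:]
```

As with plain slicing, a length that runs past the end of the data is truncated rather than treated as an error.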
Pulkit Goyal	3c7388da12	py3: replace pycompat.getenv with encoding.environ.get pycompat.getenv returns os.getenvb on py3 which is not available on Windows. This patch replaces them with encoding.environ.get and checks to ensure no new instances of os.getenv or os.setenv are introduced.	2017-01-15 13:17:05 +05:30
Yuya Nishihara	f3733be9e2	patch: check length of git index header only if integer is specified Otherwise TypeError would be raised. Follows up 062245c938a0.	2017-01-15 16:33:15 +09:00
Gregory Szorc	765aada92f	localrepo: experimental support for non-zlib revlog compression The final part of integrating the compression manager APIs into revlog storage is the plumbing for repositories to advertise they are using non-zlib storage and for revlogs to instantiate a non-zlib compression engine. The main intent of the compression manager work was to zstd all of the things. Adding zstd to revlogs has proved to be more involved than other places because revlogs are... special. Very small inputs and the use of delta chains (which are themselves a form of compression) are a completely different use case from streaming compression, which bundles and the wire protocol employ. I've conducted numerous experiments with zstd in revlogs and have yet to formalize compression settings and a storage architecture that I'm confident I won't regret later. In other words, I'm not yet ready to commit to a new mechanism for using zstd - or any other compression format - in revlogs. That being said, having some support for zstd (and other compression formats) in revlogs in core is beneficial. It can allow others to conduct experiments. This patch introduces highly experimental support for non-zlib compression formats in revlogs. Introduced is a config option to control which compression engine to use. Also introduced is a namespace of "exp-compression-" requirements to denote support for non-zlib compression in revlogs. I've prefixed the namespace with "exp-" (short for "experimental") because I'm not confident of the requirements "schema" and in no way want to give the illusion of supporting these requirements in the future. I fully intend to drop support for these requirements once we figure out what we're doing with zstd in revlogs. A good portion of the patch is teaching the requirements system about registered compression engines and passing the requested compression engine as an opener option so revlogs can instantiate the proper compression engine for new operations. 
That's a verbose way of saying "we can now use zstd in revlogs!" On an `hg pull` conversion of the mozilla-unified repo with no extra redelta settings (like aggressivemergedeltas), we can see the impact of zstd vs zlib in revlogs: $ hg perfrevlogchunks -c ! chunk ! wall 2.032052 comb 2.040000 user 1.990000 sys 0.050000 (best of 5) ! wall 1.866360 comb 1.860000 user 1.820000 sys 0.040000 (best of 6) ! chunk batch ! wall 1.877261 comb 1.870000 user 1.860000 sys 0.010000 (best of 6) ! wall 1.705410 comb 1.710000 user 1.690000 sys 0.020000 (best of 6) $ hg perfrevlogchunks -m ! chunk ! wall 2.721427 comb 2.720000 user 2.640000 sys 0.080000 (best of 4) ! wall 2.035076 comb 2.030000 user 1.950000 sys 0.080000 (best of 5) ! chunk batch ! wall 2.614561 comb 2.620000 user 2.580000 sys 0.040000 (best of 4) ! wall 1.910252 comb 1.910000 user 1.880000 sys 0.030000 (best of 6) $ hg perfrevlog -c -d 1 ! wall 4.812885 comb 4.820000 user 4.800000 sys 0.020000 (best of 3) ! wall 4.699621 comb 4.710000 user 4.700000 sys 0.010000 (best of 3) $ hg perfrevlog -m -d 1000 ! wall 34.252800 comb 34.250000 user 33.730000 sys 0.520000 (best of 3) ! wall 24.094999 comb 24.090000 user 23.320000 sys 0.770000 (best of 3) Only modest wins for the changelog. But manifest reading is significantly faster. What's going on? One reason might be data volume. zstd decompresses faster. So given more bytes, it will put more distance between it and zlib. Another reason is size. In the current design, zstd revlogs are larger*: debugcreatestreamclonebundle (size in bytes) zlib: 1,638,852,492 zstd: 1,680,601,332 I haven't investigated this fully, but I reckon a significant cause of larger revlogs is that the zstd frame/header has more bytes than zlib's. For very small inputs or data that doesn't compress well, we'll tend to store more uncompressed chunks than with zlib (because the compressed size isn't smaller than original). This will make revlog reading faster because it is doing less decompression. 
Moving on to bundle performance: $ hg bundle -a -t none-v2 (total CPU time) zlib: 102.79s zstd: 97.75s So, marginal CPU decrease for reading all chunks in all revlogs (this is somewhat disappointing). $ hg bundle -a -t <engine>-v2 (total CPU time) zlib: 191.59s zstd: 115.36s This last test effectively measures the difference between zlib->zlib and zstd->zstd for revlogs to bundle. This is a rough approximation of what a server does during `hg clone`. There are some promising results for zstd. But not enough for me to feel comfortable advertising it to users. We'll get there...	2017-01-13 20:16:56 -08:00
Gregory Szorc	94d36bba2d	revlog: use compression engine APIs for decompression Now that compression engines declare their header in revlog chunks and can decompress revlog chunks, we refactor revlog.decompress() to use them. Making full use of the property that revlog compressor objects are reusable, revlog instances now maintain a dict mapping an engine's revlog header to a compressor object. This is not only a performance optimization for engines where compressor object reuse can result in better performance, but it also serves as a cache of header values so we don't need to perform redundant lookups against the compression engine manager. (Yes, I measured and the overhead of a function call versus a dict lookup was observed.) Replacing the previous inline lookup table with a dict lookup was measured to make chunk reading ~2.5% slower on changelogs and ~4.5% slower on manifests. So, the inline lookup table has been mostly preserved so we don't lose performance. This is unfortunate. But many decompression operations complete in microseconds, so Python attribute lookup, dict lookup, and function calls do matter. The impact of this change on mozilla-unified is as follows: $ hg perfrevlogchunks -c ! chunk ! wall 1.953663 comb 1.950000 user 1.920000 sys 0.030000 (best of 6) ! wall 1.946000 comb 1.940000 user 1.910000 sys 0.030000 (best of 6) ! chunk batch ! wall 1.791075 comb 1.800000 user 1.760000 sys 0.040000 (best of 6) ! wall 1.785690 comb 1.770000 user 1.750000 sys 0.020000 (best of 6) $ hg perfrevlogchunks -m ! chunk ! wall 2.587262 comb 2.580000 user 2.550000 sys 0.030000 (best of 4) ! wall 2.616330 comb 2.610000 user 2.560000 sys 0.050000 (best of 4) ! chunk batch ! wall 2.427092 comb 2.420000 user 2.400000 sys 0.020000 (best of 5) ! wall 2.462061 comb 2.460000 user 2.400000 sys 0.060000 (best of 4) Changelog chunk reading is slightly faster but manifest reading is slower. What gives? 
On this repo, 99.85% of changelog entries are zlib compressed (the 'x' header). On the manifest, 67.5% are zlib and 32.4% are '\0'. This patch swapped the test order of 'x' and '\0' so now 'x' is tested first. This makes changelogs faster since they almost always hit the first branch. This makes a significant percentage of manifest '\0' chunks slower because that code path now performs an extra test. Yes, I too can't believe we're able to measure the impact of an if..elif with simple string compares. I reckon this code would benefit from being written in C...	2017-01-13 19:58:00 -08:00
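The header-based dispatch discussed above can be sketched as a plain if/elif chain. This is a simplified illustration, not the revlog code: it covers only the 'x', '\0' and 'u' headers mentioned in these messages and raises for anything else, where the real code would consult the registered compression engines.

```python
import zlib


def decompress(chunk):
    """Decode one revlog-style chunk based on its first byte."""
    if not chunk:
        return chunk
    header = chunk[0:1]
    if header == b"x":
        # zlib stream; the most common case here, so it is tested first.
        return zlib.decompress(chunk)
    elif header == b"\0":
        # Stored raw; the data itself starts with a NUL byte.
        return chunk
    elif header == b"u":
        # Stored uncompressed behind a one-byte marker.
        return chunk[1:]
    raise ValueError("unknown compression type %r" % header)
```

The order of the branches is the point of the measurement above: whichever header is most frequent in a given revlog benefits from being tested first.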
Denis Laxalde	e0d6f05072	hgweb: build the "entries" list directly in filelog command There's no apparent reason to have this "entries" generator function that builds a list and then yields its elements in reverse order and which is only called to build the "entries" list. So just build the list directly, in reverse order. Adjust "parity" generator's offset to keep rendering the same.	2017-01-13 10:22:25 +01:00
Yuya Nishihara	5d86e43147	ui: check EOF of getpass() response read from command-server channel readline() returns '' only when EOF is encountered, in which case, Python's getpass() raises EOFError. We should do the same to abort the session as "response expected." This bug was reported to https://bitbucket.org/tortoisehg/thg/issues/4659/	2017-01-14 20:31:35 +09:00
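The EOF check described above relies on `readline()` returning an empty string only at end of input. A minimal sketch, with a hypothetical helper name and an in-memory stream standing in for the command-server channel:

```python
import io


def readresponse(channel):
    """Read one line from a file-like channel; treat '' as end of input."""
    line = channel.readline()
    if not line:
        # readline() returns '' only at EOF; an empty answer would be '\n'.
        raise EOFError("response expected")
    return line.rstrip("\n")


# An empty answer is still a valid response; only true EOF raises.
answer = readresponse(io.StringIO("secret\n"))
```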
Gregory Szorc	550169e48e	help: make "mergetool" an alias for "merge-tools" I've probably typed `hg help mergetool` dozens of times. I'm tired of it not working.	2017-01-13 21:21:02 -08:00
Matthieu Laneuville	1146ca6217	templatekw: force noprefix=False to ensure diffstat consistency (issue4755) The result of diffstatdata should not depend on having noprefix set or not, as was reported in issue 4755. Forcing noprefix to false on call makes sure the parser receives the diff in the correct format and returns the proper result. Another way to fix this would have been to change the regular expressions in path.diffstatdata(), but that would have introduced many unnecessary special cases.	2017-01-12 21:06:55 +09:00
Pierre-Yves David	b3ce804dcd	similar: remove caching from the module level To prevent Bad Things™ from happening, let's rework the logic to not use util.cachefunc.	2017-01-13 11:42:36 -08:00
Sean Farley	7335c165eb	patch: add label for coloring the similarity extended header Just like the summary says, this will colorize the: similarity index 88% line in the diff output.	2017-01-09 11:01:45 -08:00
Sean Farley	311a50fdae	patch: use opt.showsimilarity to calculate and show the similarity Tests have been added.	2017-01-09 11:24:18 -08:00
Sean Farley	bf5e8cb800	patch: add similarity config knob in experimental section This config knob will control whether or not to show the similarity calculation in the diff output: diff --git a/README.md b/foo.md similarity index 88% rename from README.md rename to foo.md --- a/README.md +++ b/foo.md	2017-01-09 10:51:44 -08:00
Sean Farley	8fc2b48eb5	similar: move score function to module level Future patches will use this to report the similarity of a rename / copy in the patch output.	2017-01-07 20:47:57 -08:00
Yuya Nishihara	5ade140d5c	revset: abuse x:y syntax to specify line range of followlines() This slightly complicates the parsing (see the previous patch), but the overall result seems not bad. I keep x:, :y and : for future extension.	2017-01-09 17:58:19 +09:00
Yuya Nishihara	615f3c1669	revset: do not transform range* operators in parsed tree This allows us to handle x:y range as a general range object. A primary user of it is followlines().	2017-01-09 16:55:56 +09:00
Yuya Nishihara	0f4a24bbbf	revset: add default value to getinteger() helper This seems handy.	2017-01-09 17:45:11 +09:00
Yuya Nishihara	49d42c696d	revset: factor out getinteger() helper We have 4 revset functions that take integer arguments, and they handle their arguments in slightly different ways. This patch unifies them: - getstring() in place of getsymbol(), which is more consistent with the handling of integer revisions (both 1 and '1' are valid) - say "expects" instead of "requires" for type errors We don't need to catch TypeError since getstring() must return a string.	2017-01-09 17:39:44 +09:00
Yuya Nishihara	a73b0aaf6b	revset: rename rev argument of followlines() to startrev The rev argument has the same meaning as startrev of follow(), and I think startrev is more informative. followlines() is a new function, so we can still make a BC-breaking change now.	2017-01-09 16:16:26 +09:00
Yuya Nishihara	a0c3bc199a	help: use :hg: role and canonical name to point to revset string patterns Follows up ae418afed3f6. Now that revisions.txt and revsets.txt have been merged, use revisions.* as a pointer.	2017-01-13 23:48:21 +09:00
Gregory Szorc	4a3b8df214	util: compression APIs to support revlog decompression Previously, compression engines had APIs for performing revlog compression but no mechanism to perform revlog decompression. This patch changes that. Revlog decompression is slightly more complicated than compression because in the compression case there is (currently) only a single engine that can be used at a time. However for decompression, a revlog could contain chunks from multiple compression engines. This means decompression needs to map to multiple engines and decompressors. This functionality is outside the scope of this patch. But it drives the decision for engines to declare a byte header sequence that identifies revlog data as belonging to an engine and an API for obtaining an engine from a revlog header.	2017-01-02 13:27:20 -08:00
Anton Shestakov	9427025e13	crecord: add an experimental option for space key to move cursor down I really want to have an option of toggling a selection on a line and also moving cursor down as a single keystroke. It also kinda makes sense for space key to do this, because some other curses UIs in the wild do this (e.g. various file managers, htop). So I got an idea to make a config option that defaults to False for compatibility, but allows making crecord UI a lot more useful for people with big hunks. We add this as an experimental option to experiment with this behavior.	2017-01-08 10:08:29 +08:00
Gregory Szorc	24c1205d69	revlog: use compression engine API for compression This commit swaps in the just-added revlog compressor API into the revlog class. Instead of implementing zlib compression inline in compress(), we now store a cached-on-first-use revlog compressor on each revlog instance and invoke its "compress()" method. As part of this, revlog.compress() has been refactored a bit to use a cleaner code flow and modern formatting (e.g. avoiding parenthesis around returned tuples). On a mozilla-unified repo, here are the "compress" times for a few commands: $ hg perfrevlogchunks -c ! wall 5.772450 comb 5.780000 user 5.780000 sys 0.000000 (best of 3) ! wall 5.795158 comb 5.790000 user 5.790000 sys 0.000000 (best of 3) $ hg perfrevlogchunks -m ! wall 9.975789 comb 9.970000 user 9.970000 sys 0.000000 (best of 3) ! wall 10.019505 comb 10.010000 user 10.010000 sys 0.000000 (best of 3) Compression times did seem to slow down just a little. There are 360,210 changelog revisions and 359,342 manifest revisions. For the changelog, mean time to compress a revision increased from ~16.025us to ~16.088us. That's basically a function call or an attribute lookup. I suppose this is the price you pay for abstraction. It's so low that I'm not concerned.	2017-01-02 11:22:52 -08:00
Gregory Szorc	29c30e4b7e	util: compression APIs to support revlog compression As part of "zstd all of the things," we need to teach revlogs to use non-zlib compression formats. Because we're routing all compression via the "compression manager" and "compression engine" APIs, we need to introduce functionality there for performing revlog operations. Ideally, revlog compression and decompression operations would be implemented in terms of simple "compress" and "decompress" primitives. However, there are a few considerations that make us want to have a specialized primitive for handling revlogs: 1) Performance. Revlogs tend to do compression and especially decompression operations in batches. Any overhead for e.g. instantiating a "context" for performing an operation can be noticed. For this reason, our "revlog compressor" primitive is reusable. For zstd, we reuse the same compression "context" for multiple operations. I've measured this to have a performance impact versus constructing new contexts for each operation. 2) Specialization. By having a primitive dedicated to revlog use, we can make revlog-specific choices and leave the door open for more functionality in the future. For example, the zstd revlog compressor may one day make use of dictionary compression. A future patch will introduce a decompress() on the compressor object. The code for the zlib compressor is basically copied from revlog.compress(), although it doesn't handle the empty input case, the null first byte case, and the 'u' prefix case. These cases will continue to be handled in revlog.py once that code is ported to use this API.	2017-01-02 12:39:03 -08:00
Gregory Szorc	1a6670d670	revlog: move decompress() from module to revlog class (API) Upcoming patches will convert revlogs to use the compression engine APIs to perform all things compression. The yet-to-be-introduced APIs support a persistent "compressor" object so the same object can be reused for multiple compression operations, leading to better performance. In addition, compression engines like zstd may wish to tweak compression engine state based on the revlog (e.g. per-revlog compression dictionaries). A global and shared decompress() function will shortly no longer make much sense. So, we move decompress() to be a method of the revlog class. It joins compress() there. On the mozilla-unified repo, we can measure the impact of this change on reading performance: $ hg perfrevlogchunks -c ! chunk ! wall 1.932573 comb 1.930000 user 1.900000 sys 0.030000 (best of 6) ! wall 1.955183 comb 1.960000 user 1.930000 sys 0.030000 (best of 6) ! chunk batch ! wall 1.787879 comb 1.780000 user 1.770000 sys 0.010000 (best of 6) ! wall 1.774444 comb 1.770000 user 1.750000 sys 0.020000 (best of 6) "chunk" appeared to become slower but "chunk batch" got faster. Upon further examination by running both sets multiple times, the numbers appear to converge across all runs. This tells me that there is no perceived performance impact to this refactor.	2017-01-02 13:00:16 -08:00
Gregory Szorc	df8167ed29	revlog: make compressed size comparisons consistent revlog.compress() compares the compressed size to the input size and throws away the compressed data if it is larger than the input. This is the correct thing to do, as storing compressed data that is larger than the input takes up more storage space and makes reading slower. However, the comparison was implemented inconsistently. For the streaming compression mode, we threw away the result if it was greater than or equal to the input size. But for the one-shot compression, we threw away the compression only if it was greater than the input size! This patch changes the comparison for the simple case so it is consistent with the streaming case. As a few tests demonstrate, this adds 1 byte to some revlog entries. This is because of an added 'u' header on the chunk. It seems somewhat wrong to increase the revlog size here. However, IMO the cost of 1 byte in storage is insignificant compared to the performance gains of avoiding decompression. This patch should invite questions around the heuristic for throwing away compressed data. For example, I'd argue we should be more liberal about rejecting compressed data, additionally doing so where the number of bytes saved fails to reach a threshold. But we can have this discussion another time.	2017-01-02 11:50:17 -08:00
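The "greater than or equal" rule from the message above can be sketched like this. It is a simplified illustration using zlib only; the function name and the (header, payload) return shape are assumptions for the example, not the revlog API.

```python
import zlib


def compresschunk(data):
    """Return (header, payload) for a revlog-style chunk."""
    if not data:
        return b"", data
    compressed = zlib.compress(data)
    if len(compressed) >= len(data):
        # Compression did not help (or only tied): store the input as-is.
        if data[0:1] == b"\0":
            # NUL-led data is recognisable on its own and needs no marker.
            return b"", data
        return b"u", data
    return b"", compressed
```

Using `>=` rather than `>` means a tie is stored uncompressed, which costs one marker byte but avoids a decompression on every read.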
Sean Farley	3c1cbd7c9b	similar: rename local variable to not collide with previous Future patches will move the score function to the module level, so let's not shadow that.	2017-01-07 20:43:49 -08:00
Sean Farley	25acd53e01	patch: add label for coloring the index extended header Just like the summary says, this will colorize the: index 3d3ba4b65e11..57274a0f46b2 100644 line in the diff output.	2017-01-09 10:59:45 -08:00
Sean Farley	14adabd19a	patch: add index line for diff output This helps highlighting in third-party diff coloring (which assumes git output) and maintains pedantic correctness with diff --git. Tests will be added at the end of the series.	2016-12-31 15:41:57 -06:00
Sean Farley	8cd1b5827c	patch: add config knob for displaying the index header This config knob can take an integer between 0 and 40 or a keyword ('none', 'short', 'full') to control the length of hash to output. It will display diffs with the git index header as such, diff --git a/mercurial/mdiff.py b/mercurial/mdiff.py index 112edf7..d6b52c5 100644 We'll put this in the experimental section for now.	2017-01-09 11:13:47 -08:00
Martin von Zweigbergk	9e63f2d21c	bisect: refer directly to bisect() revset predicate in help We have specific syntax for displaying the help text for a particular revset predicate, so let's refer directly to the bisect() revset in the verbose bisect help. It seems likely that the user doesn't care about other revsets at that point, so they will probably not miss the text about the other revset predicates.	2017-01-12 12:05:23 -08:00
Martin von Zweigbergk	029203f29d	help: remove now-redundant pointer to revsets help "hg help revisions" and "hg help revsets" now point to the same text, so drop the revsets reference.	2017-01-12 11:52:05 -08:00
Matt Harbison	d3bfb5a06a	help: eliminate duplicate text for revset string patterns There's no reason to duplicate this so many times, and it's likely an instance will be missed if support for a new pattern is added and documented. The stringmatcher is mostly used by revsets, though it is also used for the 'tag' related templates, and namespace filtering in the journal extension. So maybe there's a better place to document it. `hg help patterns` seems inappropriate, because that is all file pattern matching. While here, indicate how to perform case insensitive regex searches.	2017-01-07 23:35:35 -05:00
Matt Harbison	e0b76f5323	revset: add regular expression support to 'desc' This is a case insensitive predicate like 'author', so it conforms to the existing behavior of performing a case insensitive regex.	2017-01-07 21:26:32 -05:00
Matt Harbison	840ab22fff	revset: stop lowercasing the regex pattern for 'author' It was probably unintentional for regex, as the meaning of some sequences like \S and \s is actually inverted by changing the case. For backward compatibility however, the matching is forced to case insensitive.	2017-01-11 22:42:10 -05:00
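Why lowercasing a regex pattern is unsafe can be shown with a small example. The pattern and author string below are made up; the point is only that `\S` (non-whitespace) becomes `\s` (whitespace) when lowercased, while a case-insensitive flag leaves the pattern's meaning intact.

```python
import re

# \S matches one non-whitespace character, followed by the literal "mith".
pattern = r"\Smith"
author = "John SMITH"

# Old approach: lowercase both sides. The pattern is now r"\smith",
# i.e. a whitespace character followed by "mith".
lowered = re.compile(pattern.lower())
lowered_match = lowered.search(author.lower())

# Safer approach: keep the pattern as written and ignore case when matching.
insensitive = re.compile(pattern, re.IGNORECASE)
insensitive_match = insensitive.search(author)
```

Here the lowercased pattern fails to match while the case-insensitive one succeeds.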
Gregory Szorc	abe1c0e17e	repair: clean up stale lock file from store backup Since we did a directory rename on the stores, the source repository's lock path now references the dest repository's lock path and the dest repository's lock path now references a non-existent filename. So releasing the lock on the source will unlock the dest and releasing the lock on the dest will no-op because it fails due to file not found. So we clean up the dest's lock manually.	2016-11-24 18:45:29 -08:00
Gregory Szorc	a400e3d753	repair: copy non-revlog store files during upgrade The store contains more than just revlogs. This patch teaches the upgrade code to copy regular files as well. As the test changes demonstrate, the phaseroots file is now copied.	2016-11-24 18:34:50 -08:00
Gregory Szorc	93504084a0	repair: migrate revlogs during upgrade Our next step for in-place upgrade is to migrate store data. Revlogs are the biggest source of data within the store and a store is useless without them, so we implement their migration first. Our strategy for migrating revlogs is to walk the store and call `revlog.clone()` on each revlog. There are some minor complications. Because revlogs have different storage options (e.g. changelog has generaldelta and delta chains disabled), we need to obtain the correct class of revlog so inserted data is encoded properly for its type. Various attempts at implementing progress indicators that didn't lead to frustration from false "it's almost done" indicators were made. I initially used a single progress bar based on number of revlogs. However, this quickly churned through all filelogs, got to 99% then effectively froze at 99.99% when it got to the manifest. So I converted the progress bar to total revision count. This was a little bit better. But the manifest was still significantly slower than filelogs and it took forever to process the last few percent. I then tried both revision/chunk bytes and raw bytes as the denominator. This had the opposite effect: because so much data is in manifests, it would churn through filelogs without showing much progress. When it got to manifests, it would fill in 90+% of the progress bar. I finally gave up having a unified progress bar and instead implemented 3 progress bars: 1 for filelog revisions, 1 for manifest revisions, and 1 for changelog revisions. I added extra messages indicating the total number of revisions of each so users know there are more progress bars coming. I also added extra messages before and after each stage to give extra details about what is happening. Strictly speaking, this isn't necessary. But the numbers are impressive. 
For example, when converting a non-generaldelta mozilla-central repository, the messages you see are: migrating 2475593 total revisions (1833043 in filelogs, 321156 in manifests, 321394 in changelog) migrating 1.67 GB in store; 2508 GB tracked data migrating 267868 filelogs containing 1833043 revisions (1.09 GB in store; 57.3 GB tracked data) finished migrating 1833043 filelog revisions across 267868 filelogs; change in size: -415776 bytes migrating 1 manifests containing 321156 revisions (518 MB in store; 2451 GB tracked data) That "2508 GB" figure really blew me away. I had no clue that the raw tracked data in mozilla-central was that large. Granted, 2451 GB is in the manifest and "only" 57.3 GB is in filelogs. But still. It's worth noting that gratuitous loading of source revlogs in order to display numbers and progress bars does serve a purpose: it ensures we can open all source revlogs. We don't want to spend several minutes copying revlogs only to encounter a permissions error or similar later. As part of this commit, we also add swapping of the store directory to the upgrade function. After revlogs are converted, we move the old store into the backup directory then move the temporary repo's store into the old store's location. On well-behaved systems, this should be 2 atomic operations and the window of inconsistency should be very narrow. There are still a few improvements to be made to store copying and upgrading. But this commit gets the bulk of the work out of the way.	2016-12-18 17:00:15 -08:00
Gregory Szorc	4dbc7459c8	revlog: add clone method Upcoming patches will introduce functionality for in-place repository/store "upgrades." Copying the contents of a revlog feels sufficiently low-level to warrant being in the revlog class. So this commit implements that functionality. Because full delta recomputation can be very expensive (we're talking several hours on the Firefox repository), we support multiple modes of execution with regards to delta (re)use. This will allow repository upgrades to choose the "level" of processing/optimization they wish to perform when converting revlogs. It's not obvious from this commit, but "addrevisioncb" will be used for progress reporting.	2016-12-18 17:02:57 -08:00
Gregory Szorc	b9b6954ea9	repair: begin implementation of in-place upgrading Now that all the upgrade planning work is in place, we can start doing the real work: actually upgrading a repository. The main goal of this commit is to get the "framework" for running in-place upgrade actions in place. Rather than get too clever and low-level with regards to in-place upgrades, our strategy is to create a new, temporary repository, copy data to it, then replace the old data with the new. This allows us to reuse a lot of code in localrepo.py around store interaction, which will eventually consume the bulk of the upgrade code. But we have to start small. This patch implements adding new repository requirements. But it still sets up a temporary repository and locks it and the source repo before performing the requirements file swap. This means all the plumbing is in place to implement store copying in subsequent commits.	2016-12-18 16:59:04 -08:00
Gregory Szorc	a3569d4b71	repair: determine what upgrade will do This commit introduces code for determining what actions/improvements an upgrade should perform. The "upgradefindimprovements" function introduces a mechanism to return a list of improvements that can be made to a repository. Each improvement is effectively an action that an upgrade will perform. Associated with each of these improvements is metadata that will be used to inform users what's wrong and what an upgrade will do. Each "improvement" is categorized as a "deficiency" or an "optimization." TBH, I'm not thrilled about the terminology and am receptive to constructive bikeshedding. The main difference between a "deficiency" and an "optimization" is a deficiency is always corrected (if it deviates from the current config) and an "optimization" is an optional action that goes above and beyond to improve the state of the repository (usually by requiring more CPU during upgrade). Our initial set of improvements identifies missing repository requirements, a single, easily correctable problem with changelog storage, and a set of "optimizations" related to delta recalculation. The main "upgraderepo" function has been expanded to handle improvements. It queries for the list of improvements and determines which of them will run based on the current repository state and user I went through numerous iterations of the output format before settling on a ReST-inspired definition list format. (I used bulleted lists in the first submission of this commit and could not get it to format just right.) Even with the various iterations, I'm still not super thrilled with the format. But, this is a debug* command, so that should mean we can refine the output without BC concerns.	2016-12-18 16:51:09 -08:00
Gregory Szorc	f42e2dcaac	repair: implement requirements checking for upgrades This commit introduces functionality for upgrading a repository in place. The first part that's implemented is testing for upgrade "compatibility." This is done by examining repository requirements. There are 5 functions returning sets of requirements that control upgrading. Why so many functions? Mainly to support extensions. Functions are easier to monkeypatch than module variables. Astute readers will see that we don't support "manifestv2" and "treemanifest" requirements in the upgrade mechanism. I don't have a great answer for why other than this is a complex set of patches and I don't want to deal with the complexity of these experimental features just yet. We can teach the upgrade mechanism about them later, once the basic upgrade mechanism is in place. This commit also introduces the "upgraderepo" function. This will be our main routine for performing an in-place upgrade. Currently, it just implements requirements checking. The structure of some code in this function may look a bit weird (e.g. the inline function that is only called once). But this will make sense after future commits.	2016-12-18 16:16:54 -08:00
Gregory Szorc	16568ee7f0	debugcommands: stub for debugupgraderepo command Currently, if Mercurial introduces a new repository/store feature or changes behavior of an existing feature, users must perform an `hg clone` to create a new repository with hopefully the correct/optimal settings. Unfortunately, even `hg clone` may not give the correct results. For example, if you do a local `hg clone`, you may get hardlinks to revlog files that inherit the old state. If you `hg clone` from a remote or `hg clone --pull`, changegroup application may bypass some optimization, such as converting to generaldelta. Optimizing a repository is harder than it seems and requires more than a simple `hg` command invocation. This commit starts the process of changing that. We introduce `hg debugupgraderepo`, a command that performs an in-place upgrade of a repository to use new, optimal features. The command is just a stub right now. Features will be added in subsequent commits. This commit does foreshadow some of the behavior of the new command, notably that it doesn't do anything by default and that it takes arguments that influence what actions it performs. These will be explained more in subsequent commits.	2016-11-24 16:24:09 -08:00
Matt Harbison	86e0681833	util: teach stringmatcher to handle forced case insensitive matches The 'author' and 'desc' revsets are documented to be case insensitive. Unfortunately, this was implemented in 'author' by forcing the input to lowercase, including for regex like '\B'. (This actually inverts the meaning of the sequence.) For backward compatibility, we will keep that a case insensitive regex, but by using matcher options instead of brute force. This doesn't preclude future hypothetical 'icase-literal:' style prefixes that can be provided by the user. Such user specified cases can probably be handled up front by stripping 'icase-', setting the variable, and letting it drop through the existing code.	2017-01-11 21:47:19 -05:00


18378 Commits