sapling

mirror of https://github.com/facebook/sapling.git synced 2024-10-10 16:57:49 +03:00

Author	SHA1	Message	Date
Siddharth Agarwal	d24e970042	revlog: add a fast method for getting a list of chunks This moves _chunkraw into the loop. Doing that improves revlog decompression -- in particular, manifest decompression -- significantly. For a 20 MB manifest which is the result of a > 40k delta chain, hg perfmanifest improves from 0.55 seconds to 0.49 seconds.	2013-09-06 16:31:35 -07:00
Wojciech Lopata	3a79365ccc	revlog: pass node as an argument of addrevision This change will allow revlog subclasses that override 'checkhash' method to use custom strategy of computing nodeids without overriding 'addrevision' method. In particular this change is necessary to implement manifest compression.	2013-08-19 11:25:23 -07:00
Wojciech Lopata	299c718f66	revlog: extract 'checkhash' method Extract method that decides whether nodeid is correct for a particular revision text and parent nodes. Having this method extracted will allow revlog subclasses to implement a custom way of computing nodes. In particular this change is necessary to implement manifest compression.	2013-08-19 11:06:38 -07:00
Matt Mackall	06155d5c8a	revlog: handle hidden revs in _partialmatch (issue3979) Looking up hidden prefixes could cause a no node exception. Looking up unique non-hidden prefixes could be ambiguous.	2013-07-23 17:28:12 -05:00
Durham Goode	e750755eba	revlog: add exception when linkrev == nullrev When we deployed the latest crew mercurial to our users, a few of them had issues where a filelog would have an entry with a -1 linkrev. This caused operations like rebase and amend to create a bundle containing the entire repository, which took a long time. I don't know what the issue is, but adding this check should prevent repos from getting in this state, and should help us pinpoint the issue next time it happens.	2013-06-17 19:44:00 -07:00
Sune Foldager	6bd4fdfe9d	bundle-ng: move group into the bundler No additional semantic changes made.	2013-05-10 21:03:01 +02:00
Alexander Plavin	48936f264c	revlog: fix a regression with null revision Introduced in the patch which fixes issue3497 Part of that patch was erroneously submitted and it shouldn't be in the code	2013-04-18 16:46:09 +04:00
Alexander Plavin	829cf92d16	log: fix behavior with empty repositories (issue3497) Make output in this special case consistent with general case one.	2013-04-17 00:29:54 +04:00
Bryan O'Sullivan	512383d40e	revlog: don't cross-check ancestor result against Python version	2013-04-16 10:08:20 -07:00
Bryan O'Sullivan	c6b9f1099d	parsers: a C implementation of the new ancestors algorithm The performance of both the old and new Python ancestor algorithms depends on the number of revs they need to traverse. Although the new algorithm performs far better than the old when revs are numerically and topologically close, both algorithms become slow under other circumstances, taking up to 1.8 seconds to give answers in a Linux kernel repo. This C implementation of the new algorithm is a fairly straightforward transliteration. The only corner case of interest is that it raises an OverflowError if the number of GCA candidates found during the first pass is greater than 24, to avoid the dual perils of fixnum overflow and trying to allocate too much memory. (If this exception is raised, the Python implementation is used instead.) Performance numbers are good: in a Linux kernel repo, time for "hg debugancestors" on two distant revs (24bf01de7537 and c2a8808f5943) is as follows: Old Python: 0.36 sec New Python: 0.42 sec New C: 0.02 sec For a case where the new algorithm should perform well: Old Python: 1.84 sec New Python: 0.07 sec New C: measures as zero when using --time (This commit includes a paranoid cross-check to ensure that the Python and C implementations give identical answers. The above performance numbers were measured with that check disabled.)	2013-04-16 10:08:20 -07:00
Bryan O'Sullivan	59b785a485	revlog: choose a consistent ancestor when there's a tie Previously, we chose a rev based on numeric ordering, which could cause "the same merge" in topologically identical but numerically different repos to choose different merge bases. We now choose the lexically least node; this is stable across different revlog orderings.	2013-04-16 10:08:19 -07:00
Bryan O'Sullivan	4a3a46aff6	ancestor: a new algorithm that is faster for nodes near tip Instead of walking all the way to the root of the DAG, we generate a set of candidate GCA revs, then figure out which ones will win the race to the root (usually without needing to traverse all the way to the root). In the common case of nodes that are close to each other in both revision number and topology, this is usually a big win: it makes "hg --time debugancestors" up to 9 times faster than the more general ancestor function when measured on heads of the linux-2.6 hg repo. Victory is not assured, however. The older function can still win by a large margin if one node is much closer to the root than the other, or by a much smaller amount if one is an ancestor of the other. For now, we've also got a small paranoid harness function that calls both ancestor functions on every input and ensures that they give equivalent answers. Even without the checker function, the old ancestor function needs to stay alive for the time being, as its generality is used by context.filectx.merge.	2013-04-16 10:08:18 -07:00
Benoit Boissinot	41300c28e0	revlog: document v0 format	2013-02-09 12:08:02 +01:00
Siddharth Agarwal	4d560304bb	revlog: move ancestor generation out to a new class This refactoring is to prepare for implementing lazy membership.	2012-12-18 10:14:01 -08:00
Siddharth Agarwal	cca6ff3076	revlog: remove incancestors since it is no longer used	2012-12-17 15:08:37 -08:00
Siddharth Agarwal	6d5198a5a3	revlog.ancestors: add support for including revs This is in preparation for an upcoming refactoring. This also fixes a bug in incancestors, where if an element of revs was an ancestor of another it would be generated twice.	2012-12-17 15:13:51 -08:00
Pierre-Yves David	9ac120f569	revlog: allow reverse iteration with revlog.revs We often need to perform rev iteration in reverse order. This changeset makes it possible to do so, in order to avoid costly reverse or reversed() calls later.	2012-11-21 00:42:05 +01:00
Siddharth Agarwal	b05b94a300	revlog: add rev-specific variant of findmissing This will be used by rebase in an upcoming commit.	2012-11-26 10:48:24 -08:00
Siddharth Agarwal	76a23a18f8	revlog: switch findmissing to use ancestor.missingancestors This also speeds up other commands that use findmissing, like incoming and merge --preview. With a large linear repository (>400000 commits) and with one incoming changeset, incoming is sped up from around 4-4.5 seconds to under 3.	2012-11-26 11:02:48 -08:00
Durham Goode	cce0517fb6	commit: increase perf by avoiding unnecessary filteredrevs check When committing to a repo with lots of history (>400000 changesets) the filteredrevs check (added with 373606589de5) in changelog.py takes a bit of time even if the filteredrevs set is empty. Skipping the check in that case shaves 0.36 seconds off a 2.14 second commit. A 17% gain.	2012-11-16 15:39:12 -08:00
Pierre-Yves David	6d6a3d27a5	clfilter: split `revlog.headrevs` C call from python code Make the pure python implementation of headrevs available to derived classes. It is important because filtering logic applied by `revlog` derived class won't have effect on `index`. We want to be able to bypass this C call to implement our own.	2012-09-03 14:19:45 +02:00
Pierre-Yves David	23fb63d637	clfilter: handle non contiguous iteration in `revlog.headrevs` This prepares changelog level filtering. We can't assume that any revision can be a head because filtered revisions need to be excluded. New algorithm: - All revisions now start as "non heads", - every revision we iterate over is made a candidate head, - parents of iterated revisions are definitely not heads. Filtered revisions are never iterated over and never considered as candidate heads.	2012-09-03 14:12:45 +02:00
Pierre-Yves David	6981326b92	clfilter: make the revlog class responsible for all its iteration This prepares changelog level filtering. We need the algorithms used in revlog to work on a subset of revisions. To achieve this, the use of explicit ranges of revisions is banned. `range` and `xrange` calls are replaced by a `revlog.irevs` method. A filtered subclass can then override the `irevs` method to filter out revisions.	2012-09-20 19:00:59 +02:00
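The head-detection steps listed in the entry above can be sketched as follows. This is a minimal illustration, not the code from the commit: `parentrevs` is a hypothetical callable returning `(p1, p2)` for a revision, `revs` is assumed to yield only non-filtered revisions in ascending order, and `-1` is assumed to stand for the null revision.

```python
def headrevs(parentrevs, revs, count):
    """Return the heads among the iterated (non-filtered) revisions.

    parentrevs -- hypothetical callable: rev -> (p1, p2), -1 for "no parent"
    revs       -- non-filtered revision numbers, in ascending order
    count      -- total number of revisions, filtered ones included
    """
    # Every revision starts as "non head". The extra trailing slot absorbs
    # writes for the null revision, because index -1 is the last element.
    ishead = [False] * (count + 1)
    for r in revs:
        ishead[r] = True            # each iterated revision is a candidate
        for p in parentrevs(r):
            ishead[p] = False       # a parent is definitely not a head
    return [r for r in range(count) if ishead[r]]


# Illustrative history: 0 <- 1, with 2 and 3 both children of 1.
parents = {0: (-1, -1), 1: (0, -1), 2: (1, -1), 3: (1, -1)}
all_heads = headrevs(parents.get, [0, 1, 2, 3], 4)
without_3 = headrevs(parents.get, [0, 1, 2], 4)   # rev 3 filtered out
without_2_3 = headrevs(parents.get, [0, 1], 4)    # revs 2 and 3 filtered out
```

Because filtered revisions are never iterated, they neither become candidates nor clear the flag on their parents, which is why revision 1 turns into a head once both of its children are filtered.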
Mads Kiilerich	2f4504e446	fix trivial spelling errors	2012-08-15 22:38:42 +02:00
Matt Mackall	5b06da939f	backout 94ae81a4e338 This may have allowed unbounded I/O sizes with the current chunk retrieval code.	2012-07-12 14:20:34 -05:00
Martin Geisler	c52341ae3f	merge with main	2012-07-12 10:03:50 +02:00
Friedrich Kastner-Masilko	a6245a11d3	revlog: fix for generaldelta distance calculation The decision whether or not to store a full snapshot instead of a delta is done based on the distance value calculated in _addrevision.builddelta(rev). This calculation traditionally used the fact of deltas only using the previous revision as base. Generaldelta mechanism is changing this, yet the calculation still assumes that current-offset minus chainbase-offset equals chain-length. This appears to be wrong. This patch corrects the calculation by means of using the chainlength function if Generaldelta is used.	2012-07-11 12:38:42 +02:00
Bryan O'Sullivan	26f2c363fd	revlog: make compress a method This allows an extension to optionally use a new compression type based on the options applied by the repo to the revlog's opener. (decompress doesn't need the same treatment, as it can be replaced using extensions.wrapfunction, and can figure out which compression algorithm is in use based on the first byte of the compressed payload.)	2012-06-25 13:56:13 -07:00
Joshua Redstone	09130c5cf2	revlog: remove reachable and switch call sites to ancestors This change does a trivial conversion of callsites to ancestors. Followon diffs will switch the callsites over to revs.	2012-06-08 08:39:44 -07:00
Joshua Redstone	70aeee0070	revlog: add incancestors, a version of ancestors that includes revs listed ancestors() returns the ancestors of revs provided. This func is like that except it also includes the revs themselves in the total set of revs generated.	2012-06-08 07:59:37 -07:00
Thomas Arendsen Hein	01adf7776d	merge heads	2012-06-07 15:55:12 +02:00
Brad Hall	f20c06750f	revlog: zlib.error sent to the user (issue3424) Give the user the zlib error message instead of a backtrace when decompression fails.	2012-06-04 14:46:42 -07:00
Joshua Redstone	e38b770424	revlog: add optional stoprev arg to revlog.ancestors() This will be used as a step in removing reachable() in a future diff. Doing it now because bryano is in the process of rewriting ancestors in C. This depends on bryano's patch to replace *revs with revs in the declaration of revlog.ancestors.	2012-06-01 15:44:13 -07:00
Bryan O'Sullivan	141bd09daa	revlog: descendants(*revs) becomes descendants(revs) (API) Once again making the API more rational, as with ancestors.	2012-06-01 12:45:16 -07:00
Bryan O'Sullivan	6ba97b40c1	revlog: ancestors(*revs) becomes ancestors(revs) (API) Accepting a variable number of arguments as the old API did is deeply ugly, particularly as it means the API can't be extended with new arguments. Partly as a result, we have at least three different implementations of the same ancestors algorithm (!?). Most callers were forced to call ancestors(*somelist), adding to both inefficiency and ugliness.	2012-06-01 12:37:18 -07:00
Bryan O'Sullivan	abdf4a8227	util: subclass deque for Python 2.4 backwards compatibility It turns out that Python 2.4's deque type is lacking a remove method. We can't implement remove in terms of find, because it doesn't have find either.	2012-06-01 17:05:31 -07:00
Bryan O'Sullivan	bef5b61512	cleanup: use the deque type where appropriate There have been quite a few places where we pop elements off the front of a list. This can turn O(n) algorithms into something more like O(n**2). Python has provided a deque type that can do this efficiently since at least 2.4. As an example of the difference a deque can make, it improves perfancestors performance on a Linux repo from 0.50 seconds to 0.36.	2012-05-15 10:46:23 -07:00
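The cost difference described in the entry above comes from `list.pop(0)` shifting every remaining element, whereas `collections.deque.popleft()` runs in constant time. A minimal sketch of a breadth-first ancestor walk using a deque follows; the `ancestors` function and the `parentrevs` callable are illustrative stand-ins, not Mercurial's implementation.

```python
from collections import deque


def ancestors(parentrevs, revs):
    """Return the ancestors of `revs` in breadth-first order.

    parentrevs -- hypothetical callable: rev -> (p1, p2), -1 for "no parent"
    """
    visit = deque(revs)
    seen = set()
    found = []
    while visit:
        rev = visit.popleft()       # O(1); list.pop(0) would be O(n)
        for parent in parentrevs(rev):
            if parent != -1 and parent not in seen:
                seen.add(parent)
                visit.append(parent)
                found.append(parent)
    return found


# Illustrative history: 0 <- 1, with 2 and 3 both children of 1.
parents = {0: (-1, -1), 1: (0, -1), 2: (1, -1), 3: (1, -1)}
result = ancestors(parents.get, [3])
```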
Bryan O'Sullivan	a49ea963d7	revlog: switch to a C version of headrevs The C implementation is more than 100 times faster than the Python version (which is still available as a fallback). In a repo with 330,000 revs and a stale .hg/cache/tags file, this patch improves the performance of "hg tip" from 2.2 to 1.6 seconds.	2012-05-19 19:44:58 -07:00
Matt Mackall	42c30757a2	revlog: don't handle long for revision matching The underlying C code doesn't support indexing by longs, there are no legitimate reasons to use a long, and longs should generally be converted to ints at a higher level by context's constructor.	2012-05-21 16:36:09 -05:00
Brodie Rao	a7ef0a0cc5	cleanup: "not x in y" -> "x not in y"	2012-05-12 16:00:57 +02:00
Bryan O'Sullivan	058dfb801d	revlog: speed up prefix matching against nodes The radix tree already contains all the information we need to determine whether a short string is an unambiguous node identifier. We now make use of this information. In a kernel tree, this improves the performance of "hg log -q -r24bf01de75" from 0.27 seconds to 0.06.	2012-05-12 10:55:08 +02:00
Matt Mackall	a97dbbe308	revlog: backout df8c4d732869 This regresses performance of 'hg branches', presumably because it's visiting the revlog in the wrong order. This suggests we either need to fix the branch code or add some read-behind to mitigate the effect.	2012-04-27 13:07:29 -05:00
Patrick Mezard	e9454c243f	revlog: fix partial revision() docstring (from f4a6c9197dbd)	2012-04-13 10:14:59 +02:00
Matt Mackall	a6546db90e	revlog: drop some unneeded rev.node calls in revdiff	2012-04-13 22:55:46 -05:00
Bryan O'Sullivan	62554752c6	revlog: avoid an expensive string copy This showed up in a statprof profile of "hg svn rebuildmeta", which is read-intensive on the changelog. This two-line patch improved the performance of that command by 10%.	2012-04-12 20:26:33 -07:00
Matt Mackall	4e0b41f193	revlog: increase readahead size	2012-04-13 21:35:48 -05:00
Bryan O'Sullivan	dc46676e81	parsers: use base-16 trie for faster node->rev mapping This greatly speeds up node->rev lookups, with results that are often user-perceptible: for instance, "hg --time log" of the node associated with rev 1000 on a linux-2.6 repo improves from 0.3 seconds to 0.03. I have not found any instances of slowdowns. The new perfnodelookup command in contrib/perf.py demonstrates the speedup more dramatically, since it performs no I/O. For a single lookup, the new code is about 40x faster. These changes also prepare the ground for the possibility of further improving the performance of prefix-based node lookups.	2012-04-12 14:05:59 -07:00
Matt Mackall	055cba03a8	revlog: allow retrieving contents by revision number	2012-04-08 12:38:02 -05:00
Matt Mackall	30645d82e7	revlog: add hasnode helper method	2012-04-07 15:43:18 -05:00
Pierre-Yves David	15ab7ccd15	revlog: make addgroup return a list of nodes contained in the added source This list will contain any node seen in the source, not only the added ones. This is intended to allow phases to be moved according to what was pushed by the client, not only what was added.	2012-01-13 01:29:03 +01:00
Pierre-Yves David	a51dc67424	revlog: improve docstring for findcommonmissing	2012-01-09 04:15:31 +01:00
Steven Brown	3ebdb5ed19	revlog: clarify strip docstring "readd" -> "re-add" I misread it as "read".	2012-01-10 22:35:25 +08:00
Matt Mackall	864ce9da04	misc: adding missing file close() calls Spotted by Victor Stinner <victor.stinner@haypocalc.com>	2011-11-03 11:24:55 -05:00
Greg Ward	bc1dfb1ac9	atomictempfile: make close() consistent with other file-like objects. The usual contract is that close() makes your writes permanent, so atomictempfile's use of close() to discard writes (and rename() to keep them) is rather unexpected. Thus, change it so close() makes things permanent and add a new discard() method to throw them away. discard() is only used internally, in __del__(), to ensure that writes are discarded when an atomictempfile object goes out of scope. I audited mercurial.*, hgext.*, and ~80 third-party extensions, and found no one using the existing semantics of close() to discard writes, so this should be safe.	2011-08-25 20:21:04 -04:00
Augie Fackler	ea2e868e0f	revlog: use getattr instead of hasattr	2011-07-25 15:43:55 -05:00
Matt Mackall	1b52b02896	check-code: catch misspellings of descendant This word is fairly common in Mercurial, and easy to misspell.	2011-06-07 17:02:54 -05:00
Sune Foldager	7db447dd4c	revlog: bail out earlier in group when we have no chunks	2011-06-03 20:32:54 +02:00
Martin Geisler	af8a35e078	check-code: flag 0/1 used as constant Boolean expression	2011-06-01 12:38:46 +02:00
Matt Mackall	66805ccfed	revlog: stop exporting node.short	2011-05-21 15:01:28 -05:00
Matt Mackall	a6f2ad6f1e	revlog: drop base() again deltaparent does what's needed, and more "portably".	2011-05-18 17:05:30 -05:00
Sune Foldager	9a73f9bed3	revlog: linearize created changegroups in generaldelta revlogs This greatly improves the speed of the bundling process, and often reduces the bundle size considerably. (Although if the repository is already ordered, this has little effect on both time and bundle size.) For non-generaldelta clients, the reduced bundle size translates to a reduced repository size, similar to shrinking the revlogs (which uses the exact same algorithm). For generaldelta clients the difference is minor. When the new bundle format comes, reordering will not be necessary since we can then store the deltaparent relationships directly. The eventual default behavior for clients and servers is presented in the table below, where "new" implies support for GD as well as the new bundle format:

                    old client                  new client
old server          old bundle, no reorder      old bundle, no reorder
new server, non-GD  old bundle, no reorder[1]   old bundle, no reorder[2]
new server, GD      old bundle, reorder[3]      new bundle, no reorder[4]

[1] reordering is expensive on the server in this case, skip it
[2] client can choose to do its own redelta here
[3] reordering is needed because otherwise the pull does a lot of extra work on the server
[4] reordering isn't needed because client can get deltabase in bundle format

Currently, the default is to reorder on GD-servers, and not otherwise. A new setting, bundle.reorder, has been added to override the default reordering behavior. It can be set to either 'auto' (the default), or any true or false value as a standard boolean setting, to either force the reordering on or off regardless of generaldelta. Some timing data from a relatively branchy test repository follows. All bundling is done with --all --type none options.

Non-generaldelta, non-shrunk repo:
-----------------------------------
Size: 276M
Without reorder (default):
  Bundle time: 14.4 seconds
  Bundle size: 939M
With reorder:
  Bundle time: 1 minute, 29.3 seconds
  Bundle size: 381M

Generaldelta, non-shrunk repo:
-----------------------------------
Size: 87M
Without reorder:
  Bundle time: 2 minutes, 1.4 seconds
  Bundle size: 939M
With reorder (default):
  Bundle time: 25.5 seconds
  Bundle size: 381M	2011-05-18 23:26:26 +02:00
Sune Foldager	c222fc4662	changelog: don't use generaldelta	2011-05-16 13:06:48 +02:00
Sune Foldager	d7f01e602b	revlog: get rid of defversion defversion was a property (later option) on the store opener, used to propagate the changelog revlog format to the other revlogs, so they would be created with the same format. This required that the changelog instance was created before any other revlog; an invariant that wasn't directly enforced (or documented) anywhere. We now use the revlogv1 requirement instead, which is transferred to the store opener options. If this option is missing, v0 revlogs are created.	2011-05-16 12:44:34 +02:00
Matt Mackall	608041d55e	revlog: restore the base method	2011-05-15 11:50:15 -05:00
Sune Foldager	2ce60e2564	revlog: improve delta generation heuristics for generaldelta Without this change, pulls (and clones) into a generaldelta repository could generate very inefficient revlogs, the size of which could be at least twice the original size. This was caused by the generated delta chains covering too large distances, causing new chains to be built far too often. This change addresses the problem by forcing a delta against second parent or against the previous revision, when the first parent delta is in danger of creating a long chain.	2011-05-12 15:24:33 +02:00
Sune Foldager	7b30600f6b	revlog: fix bug in chainbase cache The bug didn't cause corruption, and thus wasn't caught in hg verify or in tests. It could lead to delta chains longer than normally allowed, by affecting the code that decides when to add a full revision. This could, in turn, lead to performance regression.	2011-05-12 13:47:17 +02:00
Sune Foldager	762090a2c7	revlog: add docstring to _addrevision	2011-05-11 11:04:44 +02:00
Sune Foldager	1c7dece034	revlog: support writing generaldelta revlogs With generaldelta switched on, deltas are always computed against the first parent when adding revisions. This is done regardless of what revision the incoming bundle, if any, is deltaed against. The exact delta building strategy is subject to change, but this will not affect compatibility. Generaldelta is switched off by default.	2011-05-08 21:32:33 +02:00
Sune Foldager	8bdf02181a	revlog: support reading generaldelta revlogs Generaldelta is a new revlog global flag. When it's turned on, the base field of each revision entry holds the deltaparent instead of the base revision of the current delta chain. This allows for great potential flexibility when generating deltas, as any revision can serve as deltaparent. Previously, the deltaparent for revision r was hardcoded to be r - 1. The base revision of the delta chain can still be accessed as before, since it is now computed in an iterative fashion, following the deltaparents backwards.	2011-05-07 22:40:17 +02:00
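The difference between the two interpretations of the base field can be sketched as follows. This is a simplified model rather than the revlog code: `base_field` is a hypothetical list holding each revision's base field, and a full snapshot is assumed to be marked by a base field equal to its own revision number.

```python
def chainbase(base_field, rev, generaldelta):
    """Return the full-snapshot revision that `rev`'s delta chain starts from.

    base_field   -- hypothetical list: base_field[r] is revision r's base field
    generaldelta -- True if the base field stores the delta parent
    """
    if not generaldelta:
        # Classic layout: the base field already names the chain base,
        # and the delta parent of r is implicitly r - 1.
        return base_field[rev]
    # Generaldelta: follow delta parents backwards until reaching a
    # revision that is its own base (assumed marker for a full snapshot).
    while base_field[rev] != rev:
        rev = base_field[rev]
    return rev


# Classic layout: revs 0-2 form one chain, revs 3-4 another.
classic = [0, 0, 0, 3, 3]
# Generaldelta layout: rev 1 deltas against 0, revs 2 and 3 against 1,
# and rev 4 is a full snapshot.
general = [0, 0, 1, 1, 4]
base_classic = chainbase(classic, 4, generaldelta=False)
base_general = chainbase(general, 3, generaldelta=True)
```

Walking the chain on every call costs time proportional to the chain length, which is the repeated work that the chainbase cache mentioned in the neighbouring entry is meant to avoid.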
Sune Foldager	88485e9322	revlog: calculate base revisions iteratively This is in preparation for generaldelta, where the revlog entry base field is reinterpreted as the deltaparent. For that reason we also rename the base function to chainbase. Without generaldelta, performance is unaffected, but generaldelta will suffer from this in _addrevision, since delta chains will be walked repeatedly. A cache has been added to eliminate this problem completely.	2011-05-07 22:40:14 +02:00
Sune Foldager	be6386433b	revlog: remove the last bits of punched/shallow Most of it was removed in fa05c723ac8c, but a few pieces were accidentally left behind.	2011-05-07 22:37:40 +02:00
Martin Geisler	d04646b8d9	revlog: use real Booleans instead of 0/1 in nodesbetween	2011-05-06 12:09:20 +02:00
Sune Foldager	750dcd7b48	revlog: compute correct deltaparent in the deltaparent function It now returns nullrev for chain base revisions, since they are conceptually deltas against nullrev. The revdiff function was updated accordingly.	2011-05-05 18:05:24 +02:00
Sune Foldager	bb96ed66fc	revlog: remove support for punched/shallow The feature was never finished, and there has been restructuring going on since it was added.	2011-05-05 12:46:02 +02:00
Sune Foldager	d959ff1e97	revlog: remove support for parentdelta We will introduce a more powerful and general delta concept instead, called generaldelta.	2011-05-05 12:55:12 +02:00
Peter Arrenbrecht	75fa0e5ea9	discovery: add new set-based discovery Adds a new discovery method based on repeatedly sampling the still undecided subset of the local node graph to determine the set of nodes common to both the client and the server. For small differences between client and server, it uses about the same or slightly fewer roundtrips than the old tree-based discovery. For larger differences, it typically reduces the number of roundtrips drastically (from 150 to 4, for instance). The old discovery code now lives in treediscovery.py, the new code is in setdiscovery.py. Still missing is a hook for extensions to contribute nodes to the initial sample. For instance, Augie's remotebranches could contribute the last known state of the server's heads. Credits for the actual sampler and computing common heads instead of bases go to Benoit Boissinot.	2011-05-02 19:21:30 +02:00
Benoit Boissinot	b805aced54	unbundler: separate delta and header parsing Add header parsing for changelog and manifest (currently no headers might change for next-gen bundle).	2011-04-30 19:01:24 +02:00
Benoit Boissinot	e3152ec807	changegroup: new bundler API	2011-04-30 11:03:28 +02:00
Benoit Boissinot	c5f5260aea	bundler: make parsechunk return the base revision of the delta	2011-04-30 10:00:41 +02:00
Sune Foldager	9b847e3562	revlog: introduce _chunkbase to allow filelog to override Used by revlog.revision to retrieve the base-chunk in a delta chain.	2011-04-30 16:33:47 +02:00
Alexander Solovyov	0eb3836642	remove unused imports and variables	2011-04-30 13:59:14 +02:00
Matt Mackall	1fb0b59ceb	changegroup: introduce bundler objects This makes the bundler pluggable at lower levels.	2011-03-31 15:24:06 -05:00
Matt Mackall	c9e7d5507f	changegroup: add revlog to the group callback	2011-03-28 11:18:56 -05:00
Matt Mackall	d9e86660be	changegroup: move sorting down into group	2011-03-28 11:18:56 -05:00
Matt Mackall	f94b6206a0	changegroup: combine infocollect and lookup callbacks	2011-03-28 11:18:56 -05:00
Matt Mackall	af08071ace	changegroup: drop unused fullrev This is unfinished and unused and complicates expanding the wire protocol.	2011-03-24 17:16:30 -05:00
Matt Mackall	f689cccd2c	revlog: change variable name to avoid reuse	2011-03-26 17:12:02 -05:00
Peter Arrenbrecht	6646f48826	wireproto: add getbundle() function getbundle(common, heads) -> bundle Returns the changegroup for all ancestors of heads which are not ancestors of common. For both sets, the heads are included in the set. Intended to eventually supersede changegroupsubset and changegroup. Uses heads of common region to exclude unwanted changesets instead of bases of desired region, which is more useful and easier to implement. Designed to be extensible with new optional arguments (which will have to be guarded by corresponding capabilities).	2011-03-23 16:02:11 +01:00
Dan Villiom Podlaski Christiansen	ec590d5cd4	explicitly close files Add missing calls to close() to many places where files are opened. Relying on reference counting to catch them soon-ish is not portable and fails in environments with a proper GC, such as PyPy.	2010-12-24 15:23:01 +01:00
Matt Mackall	f549de8060	revlog: remove stray test in rev()	2011-01-21 16:26:01 -06:00
Matt Mackall	856c224de7	revlog: pass rev to _checkhash	2011-01-18 15:55:48 -06:00
Matt Mackall	275d2d9cb0	revlog: incrementally build node cache with linear searches This avoids needing to prime the cache for operations like verify which visit most or all of the index.	2011-01-18 15:55:46 -06:00
Benoit Boissinot	8acffa3308	revlog: explicit test and explicit variable names	2011-01-16 12:25:46 +01:00
Benoit Boissinot	3ada8fe22e	revlog: if the nodemap is set, use the fast version of revlog.rev()	2011-01-16 12:24:48 +01:00
Benoit Boissinot	383d62511b	revlog/parseindex: construct the nodemap if it is empty	2011-01-15 15:06:53 +01:00
Benoit Boissinot	4072e97b7c	revlog: always add the magic nullid/nullrev entry in parseindex	2011-01-15 13:02:19 +01:00
Benoit Boissinot	b75c111431	revlog/parseindex: no need to pass the file around	2011-01-15 15:04:58 +01:00
Matt Mackall	1e3dbac7f5	revlog: do revlog node->rev mapping by scanning Now that the nodemap is lazily created, we use linear scanning back from tip for typical node to rev mapping. Given that nodemap creation is O(n log n) and revisions searched for are usually very close to tip, this is often a significant performance win for a small number of searches. When we do end up building a nodemap for bulk lookups, the scanning function is replaced with a hash lookup.	2011-01-11 21:52:03 -06:00
Matt Mackall	a1c37f5749	revlog: introduce a cache for partial lookups Partial lookups are always O(n), and often we look up the same one multiple times.	2011-01-11 17:12:32 -06:00
Matt Mackall	846d35e24f	revlog: only build the nodemap on demand	2011-01-11 17:01:04 -06:00
Matt Mackall	efaaee2894	revlog: remove lazy index	2011-01-04 14:12:52 -06:00
Matt Mackall	a399da7502	revlog: break hash checking into subfunction	2011-01-06 17:04:41 -06:00
Martin Geisler	6a3d9310ab	code style: prefer 'is' and 'is not' tests with singletons	2010-11-22 18:15:58 +01:00
Nicolas Dumazet	f48c256c16	revlog: fix descendants() if nullrev is in revs We were not returning the correct result if nullrev was in revs, as we are checking parent(currentrev) != nullrev before yielding currentrev. test-convert-hg-startrev was wrong: if we start converting from rev -1 and onwards, all the descendants of -1 (full repo) should be converted.	2010-11-07 18:23:48 +09:00
Nicolas Dumazet	31a27ed9b9	revlog: if start is nullrev, end is always a descendant	2010-11-07 18:16:07 +09:00
Matt Mackall	90141f6021	revlog: choose best delta for parentdelta (issue2466) When parentdelta is enabled, we choose the delta that has the minimum distance to its base. Otherwise, base may be sufficiently far away to require a full version, resulting in greatly reduced compression.	2010-10-30 02:47:34 -05:00
Matt Mackall	d0eecfc8e5	revlog: precalculate p1 and p2 revisions	2010-10-30 02:47:34 -05:00
Matt Mackall	9e1ed89cc3	revlog: extract delta building to a subfunction	2010-10-30 02:47:34 -05:00
Matt Mackall	4d0c140fd2	revlog: simplify cachedelta handling	2010-10-30 02:47:34 -05:00
Matt Mackall	a758f18a0a	revlog: fix buildtext local scope buildtext stores its result in _addrevision scope to avoid repeated builds; cachedelta is already visible.	2010-10-30 02:47:34 -05:00
Benoit Boissinot	e33ea43988	revlog.addgroup(): always use _addrevision() to add new revlog entries This makes parentdelta clone support pulling.	2010-10-08 18:00:19 -05:00
Benoit Boissinot	617749c5ab	revlog._addrevision(): allow text argument to be None, build it lazily	2010-10-08 18:00:16 -05:00
Matt Mackall	aa7fff48c0	bundle: move chunk parsing into unbundle class	2010-09-19 13:12:45 -05:00
Matt Mackall	4b4d939b00	bundle: get rid of chunkiter	2010-09-19 12:51:54 -05:00
Benoit Boissinot	0a73b8e369	mdiff.patch(): add a special case for when the base text is empty remove the special casing from revlog.addgroup()	2010-08-23 13:28:04 +02:00
Benoit Boissinot	cdd70dbbdc	revlog: add rawsize(), identical to size() but not subclassed by filelog	2010-08-23 13:24:19 +02:00
Benoit Boissinot	6b8f2ae045	revlog.addrevision(): move computation of nodeid in addrevision() The check "if node in nodemap" is already done earlier in addgroup().	2010-08-22 23:17:17 +02:00
Benoit Boissinot	25c2d76480	revlog: fix docstring	2010-08-21 19:31:59 +02:00
Benoit Boissinot	70194e7582	deltaparent(): don't return nullrev for a revision containing a full snapshot this allows us to simplify manifest.readdelta and revlog.revdiff	2010-08-21 19:30:42 +02:00
Vishakh H	f9137b572d	revlog: addgroup re-adds punched revisions for missing parents While reading changegroup if a node with missing parents is encountered, we add a punched entry in the index with null parents for the missing parent node.	2010-08-13 19:42:28 +05:30
Vishakh H	6a818df747	revlog: generate full revisions when parent node is missing The full revision is sent if the first parent, against which diff is calculated, is missing at remote. This happens in the case of shallow clones.	2010-08-13 19:41:51 +05:30
Benoit Boissinot	219b0cf9cf	revlog.revision(): inline deltachain computation	2010-08-20 00:17:50 +02:00
Benoit Boissinot	8b484e5088	revlog.revision(): remove debug code	2010-08-20 00:17:50 +02:00
Benoit Boissinot	4b8002ed12	revlog.revision(): don't use nullrev as the default value for the cache It is probably a bug if the deltachain computation thinks there was a cache hit at nullrev. Use None instead, this will never trigger a cache hit.	2010-08-20 00:17:50 +02:00
Benoit Boissinot	841f801067	revlog.revision(): minor cleanup Rename some variables, making the names more obvious (in particular, "cache" was actually two different variables). Move code around, moving the index preloading before the deltachain computation; without that, index preloading was useless (everything was read in deltachain).	2010-08-20 00:17:50 +02:00
Benoit Boissinot	35b0c5591f	parentdelta: fix computation of base rev (fixes issue2337) Refactor revlog._addrevision() and put the correct base rev in the parent-delta case: base(rev) should always be equal to the first full snapshot that is needed by the delta chain, in both the parent-delta and tip-delta cases. Before this fix, the base rev was wrong in most cases (and in the case where p1 == nullid, this triggered the bug from issue2337). This means that repositories converted to parent-delta earlier are corrupted and need to be reconverted.	2010-08-18 19:37:23 +02:00
Benoit Boissinot	cbd0c9fbc1	revlog._addrevision(): make the parent of the cached delta explicit	2010-08-18 19:45:52 +02:00
Matt Mackall	d6488d66cb	revlog: optimize deltachain	2010-08-15 23:13:56 -05:00
Pradeepkumar Gayam	2257cdb06d	revlog: append delta against p1	2010-08-10 22:27:41 +05:30
Pradeepkumar Gayam	dada7ee47c	revlog: teach revlog to construct a revision from parentdeltas	2010-08-10 22:27:16 +05:30
Pradeepkumar Gayam	eeeffb2335	revlog: deltachain() returns chain of revs needed to construct a revision	2010-08-10 22:26:08 +05:30
Pradeepkumar Gayam	65e80a1e8d	revlog: parentdelta flags for revlog index	2010-08-10 22:25:08 +05:30
Matt Mackall	c0eb9c1315	Merge with stable	2010-08-06 12:59:13 -05:00
Matt Mackall	6426194b87	revlog: drop cache after use to save memory footprint If we reconstruct back to back large versions, we need to drop the cache first to avoid doubling memory usage.	2010-08-05 16:17:17 -05:00
Vishakh H	04014eeaa3	revlog: add shallow header flag REVLOGSHALLOW header flag to mark revlog as shallow. The _shallow attribute of the revlog is used to check if the header flag is set.	2010-08-03 19:38:19 +05:30
Vishakh H	22c7674a18	revlog: add punched revision flag index flag to identify a revision as punched, i.e. it contains no data. REVIDX_PUNCHED_FLAG = 2, is used to mark a revision as punched. REVIDX_KNOWN_FLAGS is the accumulation of all index flags.	2010-08-03 19:38:19 +05:30
Pradeepkumar Gayam	fde50bd3d6	revlog: add a flags method that returns revision flags	2010-07-27 01:16:38 +05:30
Benoit Boissinot	ccef97c636	chunkbuffer: split big strings directly in chunkbuffer	2010-07-25 13:10:57 +02:00
Nicolas Dumazet	658e3fee6a	merge with stable	2010-07-13 22:56:01 +09:00
Nicolas Dumazet	6e75efdbcb	cmp: document the fact that we return True if content is different This is similar to the __builtin__.cmp behaviour, but still not straightforward, as the everyday meaning of a comparison usually is "find out if they are different".	2010-07-09 11:02:39 +09:00
Renato Cunha	e0e017d51d	revlog: Marked classic int divisions as such.	2010-07-01 19:27:02 -03:00
Martin Geisler	74ffb9edd7	revlog: fix inconsistent comment formatting	2010-06-10 17:10:05 +02:00
Matt Mackall	94c9deabbe	static-http: disable lazy parsing This only hits if you're crazy enough to use static-http on a repository with revlogs larger than 1M. Don't do it.	2010-05-11 16:28:09 -05:00
Benoit Boissinot	99744e205a	merge with stable	2010-04-15 15:35:06 +02:00
Benoit Boissinot	042c3ae681	add documentation for revlog._prereadsize	2010-04-15 15:21:21 +02:00
Benoit Boissinot	9037401fec	merge with stable	2010-04-15 13:52:41 +02:00
Greg Ward	81511c58c5	revlog: fix lazyparser.__iter__() to return all revisions (issue2137) Previously, it only returned revisions that were in the revlog when it was originally opened; revisions added since then were invisible. This broke revlog._partialmatch() and therefore repo.lookup(). (Credit to Benoit Boissinot for simplifying my original test script and for the actual fix.)	2010-04-14 15:06:40 -04:00
Greg Ward	fac8084184	revlog: factor out _maxinline global. This lets us change the threshold at which a *.d file will be split out, which should make it much easier to construct test cases that probe revlogs with a separate data file. (issue2137)	2010-04-13 17:58:38 -04:00
Greg Ward	01987f5d3e	revlog: factor out _maxinline global. This lets us change the threshold at which a *.d file will be split out, which should make it much easier to construct test cases that probe revlogs with a separate data file. (issue2137)	2010-04-13 17:58:38 -04:00
Benoit Boissinot	b8ee2dc847	revlog: put graph related functions together	2010-04-13 22:06:17 +02:00
Benoit Boissinot	9e42f82266	revlog.size: remove alternate implementation (revlogv0 specific) it's only useful for revlogv0 anyway, revlogNG has the uncompressed size in the index.	2010-02-09 14:02:07 +01:00
Benoit Boissinot	a2e24d1e23	revlog: don't silently discard revlog flags on revlogv0	2010-02-08 17:28:19 +01:00
Dirkjan Ochtman	71e802a601	revlog: fix up previously stupid API change	2010-02-06 12:47:17 +01:00
Dirkjan Ochtman	10416e7207	revlog: add a fast path for checking ancestry	2010-02-06 11:27:22 +01:00
Vsevolod Solovyov	d3e723589a	add options dict to localrepo.store.opener and use it for defversion	2010-02-05 19:10:26 +01:00
Matt Mackall	8d99be19f0	many, many trivial check-code fixups	2010-01-25 00:05:27 -06:00
Matt Mackall	cd3ef170f7	Merge with stable	2010-01-19 22:45:09 -06:00
Matt Mackall	595d66f424	Update license to GPLv2+	2010-01-19 22:20:08 -06:00
Greg Ward	d9c710d107	revlog: rewrite several method docstrings - methods: findmissing(), nodesbetween(), descendants(), ancestors() - the goal is precise, concise, accurate, grammatical, understandable, consistently formatted docstrings	2009-12-10 09:35:43 -05:00
Benoit Boissinot	80a458a464	pychecker: remove unused local variables	2009-10-31 17:04:46 +01:00
Greg Ward	d0559b076a	Improve some docstrings relating to changegroups and prepush().	2009-09-08 17:58:59 -04:00
Benoit Boissinot	a98dab4110	manifest/revlog: do not let the revlog cache mutable objects If a buffer of a mutable object is passed to revlog.addrevision(), the revlog will happily store it in its cache. Later, when the revlog reuses the cached entry, if the manifest modified the object in between, all kinds of bugs appear. We fix it by: - passing immutable objects to addrevision() if they are already available - only storing the text in the cache if it's of str type Then we can remove the conversion of the cache entry to str() during retrieval. That was probably just there hiding the bug for the common cases but not really fixing it.	2009-09-04 10:47:55 +02:00
Alejandro Santos	3183e52503	compat: use // for integer division	2009-07-05 11:00:44 +02:00
Martin Geisler	7fc350327e	revlog: make triple-quoted string a real comment	2009-05-31 00:58:20 +02:00
Matt Mackall	28a532efb7	revlog: refactor chunk cache interface again - chunk to _chunk - _prime to _chunkraw - _chunkclear for cache clearing - _chunk calls _chunkraw - clean up _prime a bit - simplify users in revision and checkinlinesize - drop file descriptor passing (we're better off opening fds lazily)	2009-05-27 16:01:34 -05:00
Matt Mackall	4dc9e4309d	revlog: report indexfile rather than datafile for integrity check	2009-05-27 14:44:54 -05:00
Matt Mackall	df215f1dab	revlog: move stat inside lazyparser	2009-05-27 14:44:51 -05:00
Benoit Boissinot	1d46ec6dc3	changegroup: the node list might be an empty generator (fix issue1678)	2009-05-27 02:46:59 +02:00
Martin Geisler	4176f5b789	replace xrange(0, n) with xrange(n)	2009-05-25 23:06:11 +02:00
Benoit Boissinot	d861db9be5	revlog: fix undefined variable introduced in 042cecef3ad7	2009-05-25 13:52:09 +02:00
Matt Mackall	2c2304d400	revlog: fix reading of larger revlog indices on Windows	2009-05-23 11:53:23 -05:00
Martin Geisler	5d72c242d8	Merge with stable	2009-12-12 23:03:05 +01:00
Benoit Boissinot	795d30a996	revlog: add fast path to ancestor	2009-12-03 01:01:49 +01:00
Martin Geisler	0a365d5ca2	use 'x is None' instead of 'x == None' The built-in None object is a singleton and it is therefore safe to compare memory addresses with is. It is also faster, how much depends on the object being compared. For a simple type like str I get: with s = "foo", s == None takes 0.25 usec and s is None takes 0.17 usec; with s = None, s == None takes 0.21 usec and s is None takes 0.17 usec.	2009-05-20 00:52:46 +02:00
Benoit Boissinot	c97e6590bb	revlog: use set instead of dict	2009-05-17 03:49:59 +02:00
Benoit Boissinot	c8bc34d692	revlog.missing(): use sets instead of a dict	2009-05-17 02:44:12 +02:00
Peter Arrenbrecht	5851b0b382	revlog: slightly tune group() by not going rev->node->rev	2009-05-14 16:00:21 +02:00
Matt Mackall	c70b6084a7	revlog: add cache priming for reconstructing delta chains	2009-05-07 19:39:45 -05:00
Matt Mackall	7aa559f96a	revlog: use chunk cache to avoid rereading when splitting inline files	2009-05-07 19:39:45 -05:00
Matt Mackall	d48241b5f7	revlog: clean up the chunk caching code	2009-05-07 19:39:45 -05:00
Matt Mackall	9f78d7798f	revlog: use index to find index size	2009-05-07 19:39:45 -05:00
Matt Mackall	00548e0791	revlog: preread revlog .i file Smaller revlogs can be read with a single read, do it on open.	2009-05-07 19:39:45 -05:00
Simon Heimberg	09ac1e6c92	separate import lines from mercurial and general python modules	2009-04-28 17:40:46 +02:00
Martin Geisler	2c8901a1b9	turn some comments back into module docstrings	2009-04-26 01:24:49 +02:00
Martin Geisler	8e4bc1e9ad	put license and copyright info into comment blocks	2009-04-26 01:13:08 +02:00
Martin Geisler	750183bdad	updated license to be explicit about GPL version 2	2009-04-26 01:08:54 +02:00
Martin Geisler	7a5147b673	rebase, revlog: use set(x) instead of set(x.keys()) The latter is both unnecessary and slower.	2009-04-25 22:25:49 +02:00
Martin Geisler	747c05d2eb	revlog: let nodestotag be a set instead of a list	2009-04-22 20:51:20 +02:00
Martin Geisler	e2222d3c43	replace set-like dictionaries with real sets Many of the dictionaries created by dict.fromkeys were emulating sets. These can now be replaced with real sets.	2009-04-22 00:57:28 +02:00
Martin Geisler	1deb417a82	util: use built-in set and frozenset This drops Python 2.3 compatibility.	2009-04-22 00:55:32 +02:00
Henrik Stuart	c1e6537e5f	strip: make repair.strip transactional to avoid repository corruption Uses a transaction instance from the local repository to journal the truncation of revlog files, such that if a strip only partially completes, hg recover will be able to finish the truncate of all the files. The potential unbundling of changes that have been backed up to be restored later will, in case of an error, have to be unbundled manually. The difference is that it will be possible to recover the repository state so the unbundle can actually succeed.	2009-04-16 15:34:03 +02:00
Benoit Boissinot	309c0e0a31	merge with -stable	2009-04-06 20:11:00 +02:00
Benoit Boissinot	e5ea532970	raise RevlogError when parser can't parse the revlog index Initial patch and test thanks to Nicolas Dumazet.	2009-04-06 19:48:11 +02:00
Nicolas Dumazet	3ced073c3b	revlog: faster hash computation when one of the parent nodes is null Because we often compute sha1(nullid), it's worthwhile to copy a precomputed hash of nullid instead of computing the same hash every time. Similarly, when one of the parents is null, we can avoid a < comparison (sort). Overall, this change adds a string equality comparison on each hash() call, but when p2 is null, we drop one string < comparison, and copy a hash instead of computing it. Since it is common to have revisions with only one parent, this change makes hash() 25% faster when cloning a big repository.	2009-03-23 15:32:29 +01:00
Peter Arrenbrecht	a2d3e23eef	cleanup: drop variables for unused return values They are unnecessary. I did leave them in localrepo.py where there is something like: _junk = foo() _junk = None to free memory early. I don't know if just `foo()` will free the return value as early.	2009-03-23 13:13:02 +01:00
Peter Arrenbrecht	bc21361ed2	cleanup: drop unused imports	2009-03-23 13:12:07 +01:00
Matt Mackall	d15d559b7c	errors: move revlog errors - create error.py for exception classes to reduce demandloading - move revlog exceptions to it - change users to import error and drop revlog import if possible	2009-01-11 22:48:28 -06:00
Matt Mackall	4bb1cd5f2b	lookup: speed up partial lookup	2008-11-12 19:11:34 -06:00
Matt Mackall	4205fad04a	revlog: speed up parents()	2008-11-12 15:58:46 -06:00
Matt Mackall	0770924171	revlog: remove delta function	2008-11-12 15:32:16 -06:00

587 Commits