sapling

mirror of https://github.com/facebook/sapling.git synced 2024-10-10 16:57:49 +03:00

Author	SHA1	Message	Date
Pulkit Goyal	dfd06a1929	revlog: raise error.WdirUnsupported from revlog.node() if wdirrev is passed When we try to run 'hg debugrevspec 'branch(wdir())'', it throws an index error and blows up. Let's raise WdirUnsupported if wdir() is passed so that we can catch that later.	2017-05-23 01:30:36 +05:30
Pulkit Goyal	26a5b62b59	revlog: raise WdirUnsupported when wdirrev is passed revlog.parentrevs() is called while evaluating ^ operator in revsets. When wdir is passed, it raises IndexError. This patch raises WdirUnsupported if wdir is passed in the function. The error will be caught in future patches.	2017-05-19 19:12:06 +05:30
Gregory Szorc	7d51a8278d	revlog: remove some revlogNG terminology RevlogNG is not such a good name when it is no longer the newest revlog version. Since we'll soon have revlog version 2, let's remove some references to it.	2017-05-19 20:14:31 -07:00
Gregory Szorc	5bcef1853c	revlog: tweak wording and logic for flags validation First, the logic around the if..elif..elif was subtly wrong and sub-optimal because all branches would be tested as long as the revlog was valid. This patch changes things so it behaves like a switch statement over the revlog version. While I was here, I also tweaked error strings to make them consistent and to read better.	2017-05-19 20:10:50 -07:00
Yuya Nishihara	4563e16232	parsers: switch to policy importer # no-check-commit	2016-08-13 12:23:56 +09:00
Gregory Szorc	8af088ee65	revlog: rename constants (API) Feature flag constants don't need "NG" in the name because they will presumably apply to non-"NG" version revlogs. All feature flag constants should also share a similar naming convention to identify them as such. And, "RevlogNG" isn't a great internal name since it isn't obvious it maps to version 1 revlogs. Plus, "NG" (next generation) is only a good name as long as it is the latest version. Since we're talking about version 2, now is as good a time as any to move on from that naming.	2017-05-17 19:52:18 -07:00
Jun Wu	718861a5f7	changelog: make sure datafile is 00changelog.d (API) 0ad0d26ff7 makes it possible for changelog datafile to be "00changelog.i.d", which is wrong. This patch adds an explicit datafile parameter to fix it.	2017-05-17 20:14:27 -07:00
Martin von Zweigbergk	c3406ac3db	cleanup: use set literals We no longer support Python 2.6, so we can now use set literals.	2017-02-10 16:56:29 -08:00
Jun Wu	4656f56bb3	flagprocessor: add a fast path when flags is 0 When flags is 0, _processflags could be a no-op instead of iterating through the flag bits.	2017-05-10 16:17:58 -07:00
Jun Wu	2c11c92a85	revlog: move part of "addrevision" to "addrawrevision" "addrawrevision" will be the public API to reuse revision rawdata elsewhere. It will be used by a future patch.	2017-05-09 21:27:06 -07:00
Gregory Szorc	5d6e940365	revlog: rename _chunkraw to _getsegmentforrevs() This completes our rename of internal revlog methods to distinguish between low-level raw revlog data "segments" and higher-level, per-revision "chunks." perf.py has been updated to consult both names so it will work against older Mercurial versions.	2017-05-06 12:12:53 -07:00
Gregory Szorc	46413ff643	revlog: rename internal functions containing "chunk" to use "segment" Currently, "chunk" is overloaded in revlog terminology to mean multiple things. One of them refers to a segment of raw data from the revlog. This commit renames various methods only used within revlog.py to have "segment" in their name instead of "chunk." While I was here, I also made the names more descriptive. e.g. "_loadchunk()" becomes "_readsegment()" because it actually does I/O.	2017-05-06 12:02:12 -07:00
Jun Wu	0606028aff	revlog: make "size" diverge from "rawsize" Previously, revlog.size was equal to revlog.rawsize. However, the flag processor framework could make a difference - "size" could mean len(revision(raw=False)), while "rawsize" means len(revision(raw=True)). This patch makes it so. This corrects "hg status" output when flag processor is involved. The call stack looks like: basectx.status -> workingctx._buildstatus -> workingctx._dirstatestatus -> workingctx._checklookup -> filectx.cmp -> filelog.cmp -> filelog.size -> revlog.size	2017-04-09 12:53:31 -07:00
Jun Wu	e557e14680	revlog: avoid applying delta chain on cache hit Previously, revlog.revision(raw=False) may try to apply the delta chain on _cache hit. That happens if flags are non-empty. This patch makes rawtext reused so delta chain application is avoided. "_cache" and "rev" are moved a bit to avoid unnecessary assignments.	2017-04-02 18:40:13 -07:00
Jun Wu	5f26616d71	revlog: indent block to make review easier	2017-04-02 18:29:24 -07:00
Jun Wu	2ab18ee566	revlog: avoid calculating "flags" twice in revision() This is more consistent with other code in "revision()" - prefer performance to code length.	2017-04-02 18:25:12 -07:00
Jun Wu	20165e0767	revlog: use raw revision for rawsize When writing the revlog-ng index, the third field is len(rawtext). See revlog._addrevision: textlen = len(rawtext) .... e = (offset_type(offset, flags), l, textlen, base, link, p1r, p2r, node) self.index.insert(-1, e) Therefore, revlog.index[rev][2] returned by revlog.rawsize should be len(rawtext), where "rawtext" is revlog.revision(raw=True). Unfortunately it's hard to add a test for this code path because "if l >= 0" catches most cases.	2017-04-02 18:57:03 -07:00
Jun Wu	7151069c4a	revlog: add a fast path for revision(raw=False) On a cache hit with empty flags, no flag processor runs and "text" equals "rawtext". So we check flags, and return rawtext. This resolves the performance issue introduced by a previous patch.	2017-03-30 21:21:15 -07:00
Jun Wu	ae8c9ce375	revlog: make _addrevision only accept rawtext All 3 users of _addrevision use raw: - addrevision: passing rawtext to _addrevision - addgroup: passing rawtext and raw=True to _addrevision - clone: passing rawtext to _addrevision There is no real user using _addrevision(raw=False). On the other hand, _addrevision is low-level code dealing with raw revlog deltas and rawtexts. It should not transform rawtext to non-raw text. This patch removes the "raw" parameter from "_addrevision", and does some rename and doc change to make it clear that "_addrevision" expects rawtext. Archeology shows 886a08012bbe added "raw" flag to "_addrevision", follow-ups fe1e206cb389 and 1cfa6239c923 seem to make the flag unnecessary. test-revlog-raw.py no longer complains.	2017-03-30 18:38:03 -07:00
Jun Wu	9a6035a980	revlog: use raw revisions in clone test-revlog-raw.py now shows "clone test passed", but there is more to fix.	2017-03-30 18:24:23 -07:00
Jun Wu	2468c838bd	revlog: use raw revisions in revdiff See the added comment. revdiff is meant to output the raw delta that will be written to revlog. It should use raw. test-revlog-raw.py now shows "addgroupcopy test passed", but there is more to fix.	2017-03-30 18:23:27 -07:00
Jun Wu	558f5cce61	revlog: use raw content when building delta Using external content provided by flagprocessor when building revlog delta is wrong, because deltas are applied to raw contents in revlog. This patch fixes the above issue by adding "raw=True". test-revlog-raw.py now shows "local test passed", but there is more to fix.	2017-03-30 17:58:03 -07:00
Jun Wu	50b232c61f	revlog: fix _cache usage in revision() As documented at revlog.__init__, revlog._cache stores raw text. The current read and write usage of "_cache" in revlog.revision lacks a raw=True check. This patch fixes that by adding a check on raw, and storing rawtext explicitly in _cache. Note: it may slow down the cache hit code path when raw=False and flags=0. That performance issue will be fixed in a later patch. test-revlog-raw now points us to a new problem.	2017-03-30 15:34:08 -07:00
Jun Wu	fec2bbc9e9	revlog: rename some "text"s to "rawtext" This makes code easier to understand. "_addrevision" is left untouched - it will be changed in a later patch.	2017-03-30 14:56:09 -07:00
Jun Wu	b483b1b607	revlog: clarify flagprocessor documentation The words "text", "newtext", "bool" could be confusing. Use explicit "text" or "rawtext" and document more about the "bool".	2017-03-30 07:59:48 -07:00
Jun Wu	7f99b86dbd	revlog: avoid unnecessary node -> rev conversion	2017-03-29 16:23:04 -07:00
Yuya Nishihara	5a92909ce0	py3: fix slicing of byte string in revlog.compress() I tried .startswith('\0'), but data wasn't always a bytes nor a bytearray.	2017-03-26 17:12:06 +09:00
Augie Fackler	8471f9f823	revlog: use pycompat.maplist to eagerly evaluate map on Python 3 According to Pulkit, this should fix `hg status --all` on Python 3.	2017-03-21 17:39:49 -04:00
Augie Fackler	e0f1b901d8	revlog: use int instead of long By my reading of PEP 237[0], this is completely safe and has been since Python 2.2. 0: https://www.python.org/dev/peps/pep-0237/	2017-03-19 01:05:28 -04:00
Augie Fackler	bc09440907	revlog: use bytes() instead of str() to get data from memoryview Fixes `files -v` on Python 3.	2017-03-12 15:27:02 -04:00
Augie Fackler	03a50eb15f	revlog: use bytes() to ensure text from _chunks is a reasonable type	2017-03-12 03:32:38 -04:00
Augie Fackler	58dedd9fd0	revlog: extract first byte of revlog with a slice so it's portable	2017-03-12 00:49:49 -05:00
Martin von Zweigbergk	ad5f4ef8a6	revlog: give EXTSTORED flag value to narrowhg Narrowhg has been using "1 << 14" as its revlog flag value for a long time. We (Google) have many repos with that value in production already. When the same value was reserved for EXTSTORED, it made those repos invalid. Upgrading them will be a little painful. We should clearly have reserved the value for narrowhg a long time ago. Since the EXTSTORED flag is not yet in any release and Facebook also says they have not started using it in production, so it should be okay to change it. This patch gives the current value (1 << 14) back to narrowhg and gives a new value (1 << 13) to EXTSTORED.	2017-01-17 11:25:02 -08:00
Gregory Szorc	765aada92f	localrepo: experimental support for non-zlib revlog compression The final part of integrating the compression manager APIs into revlog storage is the plumbing for repositories to advertise they are using non-zlib storage and for revlogs to instantiate a non-zlib compression engine. The main intent of the compression manager work was to zstd all of the things. Adding zstd to revlogs has proved to be more involved than other places because revlogs are... special. Very small inputs and the use of delta chains (which are themselves a form of compression) are a completely different use case from streaming compression, which bundles and the wire protocol employ. I've conducted numerous experiments with zstd in revlogs and have yet to formalize compression settings and a storage architecture that I'm confident I won't regret later. In other words, I'm not yet ready to commit to a new mechanism for using zstd - or any other compression format - in revlogs. That being said, having some support for zstd (and other compression formats) in revlogs in core is beneficial. It can allow others to conduct experiments. This patch introduces highly experimental support for non-zlib compression formats in revlogs. Introduced is a config option to control which compression engine to use. Also introduced is a namespace of "exp-compression-" requirements to denote support for non-zlib compression in revlogs. I've prefixed the namespace with "exp-" (short for "experimental") because I'm not confident of the requirements "schema" and in no way want to give the illusion of supporting these requirements in the future. I fully intend to drop support for these requirements once we figure out what we're doing with zstd in revlogs. A good portion of the patch is teaching the requirements system about registered compression engines and passing the requested compression engine as an opener option so revlogs can instantiate the proper compression engine for new operations. 
That's a verbose way of saying "we can now use zstd in revlogs!" On an `hg pull` conversion of the mozilla-unified repo with no extra redelta settings (like aggressivemergedeltas), we can see the impact of zstd vs zlib in revlogs: $ hg perfrevlogchunks -c ! chunk ! wall 2.032052 comb 2.040000 user 1.990000 sys 0.050000 (best of 5) ! wall 1.866360 comb 1.860000 user 1.820000 sys 0.040000 (best of 6) ! chunk batch ! wall 1.877261 comb 1.870000 user 1.860000 sys 0.010000 (best of 6) ! wall 1.705410 comb 1.710000 user 1.690000 sys 0.020000 (best of 6) $ hg perfrevlogchunks -m ! chunk ! wall 2.721427 comb 2.720000 user 2.640000 sys 0.080000 (best of 4) ! wall 2.035076 comb 2.030000 user 1.950000 sys 0.080000 (best of 5) ! chunk batch ! wall 2.614561 comb 2.620000 user 2.580000 sys 0.040000 (best of 4) ! wall 1.910252 comb 1.910000 user 1.880000 sys 0.030000 (best of 6) $ hg perfrevlog -c -d 1 ! wall 4.812885 comb 4.820000 user 4.800000 sys 0.020000 (best of 3) ! wall 4.699621 comb 4.710000 user 4.700000 sys 0.010000 (best of 3) $ hg perfrevlog -m -d 1000 ! wall 34.252800 comb 34.250000 user 33.730000 sys 0.520000 (best of 3) ! wall 24.094999 comb 24.090000 user 23.320000 sys 0.770000 (best of 3) Only modest wins for the changelog. But manifest reading is significantly faster. What's going on? One reason might be data volume. zstd decompresses faster. So given more bytes, it will put more distance between it and zlib. Another reason is size. In the current design, zstd revlogs are larger*: debugcreatestreamclonebundle (size in bytes) zlib: 1,638,852,492 zstd: 1,680,601,332 I haven't investigated this fully, but I reckon a significant cause of larger revlogs is that the zstd frame/header has more bytes than zlib's. For very small inputs or data that doesn't compress well, we'll tend to store more uncompressed chunks than with zlib (because the compressed size isn't smaller than original). This will make revlog reading faster because it is doing less decompression. 
Moving on to bundle performance: $ hg bundle -a -t none-v2 (total CPU time) zlib: 102.79s zstd: 97.75s So, marginal CPU decrease for reading all chunks in all revlogs (this is somewhat disappointing). $ hg bundle -a -t <engine>-v2 (total CPU time) zlib: 191.59s zstd: 115.36s This last test effectively measures the difference between zlib->zlib and zstd->zstd for revlogs to bundle. This is a rough approximation of what a server does during `hg clone`. There are some promising results for zstd. But not enough for me to feel comfortable advertising it to users. We'll get there...	2017-01-13 20:16:56 -08:00
Gregory Szorc	94d36bba2d	revlog: use compression engine APIs for decompression Now that compression engines declare their header in revlog chunks and can decompress revlog chunks, we refactor revlog.decompress() to use them. Making full use of the property that revlog compressor objects are reusable, revlog instances now maintain a dict mapping an engine's revlog header to a compressor object. This is not only a performance optimization for engines where compressor object reuse can result in better performance, but it also serves as a cache of header values so we don't need to perform redundant lookups against the compression engine manager. (Yes, I measured and the overhead of a function call versus a dict lookup was observed.) Replacing the previous inline lookup table with a dict lookup was measured to make chunk reading ~2.5% slower on changelogs and ~4.5% slower on manifests. So, the inline lookup table has been mostly preserved so we don't lose performance. This is unfortunate. But many decompression operations complete in microseconds, so Python attribute lookup, dict lookup, and function calls do matter. The impact of this change on mozilla-unified is as follows: $ hg perfrevlogchunks -c ! chunk ! wall 1.953663 comb 1.950000 user 1.920000 sys 0.030000 (best of 6) ! wall 1.946000 comb 1.940000 user 1.910000 sys 0.030000 (best of 6) ! chunk batch ! wall 1.791075 comb 1.800000 user 1.760000 sys 0.040000 (best of 6) ! wall 1.785690 comb 1.770000 user 1.750000 sys 0.020000 (best of 6) $ hg perfrevlogchunks -m ! chunk ! wall 2.587262 comb 2.580000 user 2.550000 sys 0.030000 (best of 4) ! wall 2.616330 comb 2.610000 user 2.560000 sys 0.050000 (best of 4) ! chunk batch ! wall 2.427092 comb 2.420000 user 2.400000 sys 0.020000 (best of 5) ! wall 2.462061 comb 2.460000 user 2.400000 sys 0.060000 (best of 4) Changelog chunk reading is slightly faster but manifest reading is slower. What gives? 
On this repo, 99.85% of changelog entries are zlib compressed (the 'x' header). On the manifest, 67.5% are zlib and 32.4% are '\0'. This patch swapped the test order of 'x' and '\0' so now 'x' is tested first. This makes changelogs faster since they almost always hit the first branch. This makes a significant percentage of manifest '\0' chunks slower because that code path now performs an extra test. Yes, I too can't believe we're able to measure the impact of an if..elif with simple string compares. I reckon this code would benefit from being written in C...	2017-01-13 19:58:00 -08:00
Gregory Szorc	24c1205d69	revlog: use compression engine API for compression This commit swaps in the just-added revlog compressor API into the revlog class. Instead of implementing zlib compression inline in compress(), we now store a cached-on-first-use revlog compressor on each revlog instance and invoke its "compress()" method. As part of this, revlog.compress() has been refactored a bit to use a cleaner code flow and modern formatting (e.g. avoiding parenthesis around returned tuples). On a mozilla-unified repo, here are the "compress" times for a few commands: $ hg perfrevlogchunks -c ! wall 5.772450 comb 5.780000 user 5.780000 sys 0.000000 (best of 3) ! wall 5.795158 comb 5.790000 user 5.790000 sys 0.000000 (best of 3) $ hg perfrevlogchunks -m ! wall 9.975789 comb 9.970000 user 9.970000 sys 0.000000 (best of 3) ! wall 10.019505 comb 10.010000 user 10.010000 sys 0.000000 (best of 3) Compression times did seem to slow down just a little. There are 360,210 changelog revisions and 359,342 manifest revisions. For the changelog, mean time to compress a revision increased from ~16.025us to ~16.088us. That's basically a function call or an attribute lookup. I suppose this is the price you pay for abstraction. It's so low that I'm not concerned.	2017-01-02 11:22:52 -08:00
Gregory Szorc	1a6670d670	revlog: move decompress() from module to revlog class (API) Upcoming patches will convert revlogs to use the compression engine APIs to perform all things compression. The yet-to-be-introduced APIs support a persistent "compressor" object so the same object can be reused for multiple compression operations, leading to better performance. In addition, compression engines like zstd may wish to tweak compression engine state based on the revlog (e.g. per-revlog compression dictionaries). A global and shared decompress() function will shortly no longer make much sense. So, we move decompress() to be a method of the revlog class. It joins compress() there. On the mozilla-unified repo, we can measure the impact of this change on reading performance: $ hg perfrevlogchunks -c ! chunk ! wall 1.932573 comb 1.930000 user 1.900000 sys 0.030000 (best of 6) ! wall 1.955183 comb 1.960000 user 1.930000 sys 0.030000 (best of 6) ! chunk batch ! wall 1.787879 comb 1.780000 user 1.770000 sys 0.010000 (best of 6) ! wall 1.774444 comb 1.770000 user 1.750000 sys 0.020000 (best of 6) "chunk" appeared to become slower but "chunk batch" got faster. Upon further examination by running both sets multiple times, the numbers appear to converge across all runs. This tells me that there is no perceived performance impact to this refactor.	2017-01-02 13:00:16 -08:00
Gregory Szorc	df8167ed29	revlog: make compressed size comparisons consistent revlog.compress() compares the compressed size to the input size and throws away the compressed data if it is larger than the input. This is the correct thing to do, as storing compressed data that is larger than the input takes up more storage space and makes reading slower. However, the comparison was implemented inconsistently. For the streaming compression mode, we threw away the result if it was greater than or equal to the input size. But for the one-shot compression, we threw away the compression only if it was greater than the input size! This patch changes the comparison for the simple case so it is consistent with the streaming case. As a few tests demonstrate, this adds 1 byte to some revlog entries. This is because of an added 'u' header on the chunk. It seems somewhat wrong to increase the revlog size here. However, IMO the cost of 1 byte in storage is insignificant compared to the performance gains of avoiding decompression. This patch should invite questions around the heuristic for throwing away compressed data. For example, I'd argue we should be more liberal about rejecting compressed data, additionally doing so where the number of bytes saved fails to reach a threshold. But we can have this discussion another time.	2017-01-02 11:50:17 -08:00
Gregory Szorc	4dbc7459c8	revlog: add clone method Upcoming patches will introduce functionality for in-place repository/store "upgrades." Copying the contents of a revlog feels sufficiently low-level to warrant being in the revlog class. So this commit implements that functionality. Because full delta recomputation can be very expensive (we're talking several hours on the Firefox repository), we support multiple modes of execution with regards to delta (re)use. This will allow repository upgrades to choose the "level" of processing/optimization they wish to perform when converting revlogs. It's not obvious from this commit, but "addrevisioncb" will be used for progress reporting.	2016-12-18 17:02:57 -08:00
Remi Chaintron	66071d6de5	revlog: REVIDX_EXTSTORED flag This flag will be used by the lfs extension to mark the revision data as stored externally.	2017-01-05 17:16:51 +00:00
Remi Chaintron	dfc79cbfc3	revlog: flag processor Add the ability for revlog objects to process revision flags and apply registered transforms on read/write operations. This patch introduces: - the 'revlog._processflags()' method that looks at revision flags and applies flag processors registered on them. Due to the need to handle non-commutative operations, flag transforms are applied in stable order but the order in which the transforms are applied is reversed between read and write operations. - the 'addflagprocessor()' method allowing to register processors on flags. Flag processors are defined as a 3-tuple of (read, write, raw) functions to be applied depending on the operation being performed. - an update on 'revlog.addrevision()' behavior. The current flagprocessor design relies on extensions to wrap around 'addrevision()' to set flags on revision data, and on the flagprocessor to perform the actual transformation of its contents. In the lfs case, this means we need to process flags before we meet the 2GB size check, leading to performing some operations before it happens: - if flags are set on the revision data, we assume some extensions might be modifying the contents using the flag processor next, and we compute the node for the original revision data (still allowing extension to override the node by wrapping around 'addrevision()'). - we then invoke the flag processor to apply registered transforms (in lfs's case, drastically reducing the size of large blobs). - finally, we proceed with the 2GB size check. Note: In the case a cachedelta is passed to 'addrevision()' and we detect the flag processor modified the revision data, we chose to trust the flag processor and drop the cachedelta.	2017-01-10 16:15:21 +00:00
Remi Chaintron	bd07cff7ec	revlog: pass revlog flags to addrevision Adding the ability to pass flags to addrevision instead of simply passing default flags to _addrevision will allow extensions relying on flag transforms to wrap around addrevision() in order to update revlog flags. The first use case of this patch will be the lfs extension marking nodes as stored externally when the contents are larger than the defined threshold. One of the reasons for setting flags in addrevision() wrappers in the flag processor design is that it makes it possible to detect files larger than the 2GB limit before the check is performed, which allows lfs to transform the contents into metadata.	2017-01-05 17:16:07 +00:00
Remi Chaintron	6d11b9177b	revlog: add 'raw' argument to revision and _addrevision This patch introduces a new 'raw' argument (defaults to False) to revlog's revision() and _addrevision() methods. When the 'raw' argument is set to True, it indicates the revision data should be handled as raw data by the flagprocessor. Note: Given revlog.addgroup() calls are restricted to changegroup generation, we can always set raw to True when calling revlog._addrevision() from revlog.addgroup().	2017-01-05 17:16:07 +00:00
Remi Chaintron	cc88d4a3c4	revlog: merge hash checking subfunctions This patch factors the behavior of both methods into 'checkhash'.	2016-12-13 14:21:36 +00:00
Cotizo Sima	56dfac3f63	revlog: ensure that flags do not overflow 2 bytes This patch adds a line that ensures we are not mistakenly setting a set of flags that overflows the 2 bytes they are allocated. Given the way the data is packed in the revlog header, overflowing 2 bytes will result in setting a wrong offset.	2016-11-28 04:34:01 -08:00
Augie Fackler	2d9c1e8476	revlog: avoid shadowing several variables using list comprehensions	2016-11-10 16:34:43 -05:00
Gregory Szorc	83ab000007	revlog: optimize _chunkraw when startrev==endrev In many cases, _chunkraw() is called with startrev==endrev. When this is true, we can avoid an extra index lookup and some other minor operations. On the mozilla-unified repo, `hg perfrevlogchunks -c` says this has the following impact: ! read w/ reused fd ! wall 0.371846 comb 0.370000 user 0.350000 sys 0.020000 (best of 27) ! wall 0.337930 comb 0.330000 user 0.300000 sys 0.030000 (best of 30) ! read batch w/ reused fd ! wall 0.014952 comb 0.020000 user 0.000000 sys 0.020000 (best of 197) ! wall 0.014866 comb 0.010000 user 0.000000 sys 0.010000 (best of 196) So, we've gone from ~25x slower than batch to ~22.5x slower. At this point, there's probably not much else we can do except implement an optimized function in the index itself, including in C.	2016-10-23 10:40:33 -07:00
Gregory Szorc	4d79c96e22	revlog: inline start() and end() for perf reasons When I implemented `hg perfrevlogchunks`, one of the things that stood out was N * _chunk() calls was ~38x slower than 1 _chunks() call. Specifically, on the mozilla-unified repo: N_chunk: 0.528997s 1_chunks: 0.013735s This repo has 352,097 changesets. So the average time per changeset comes out to: N_chunk: 1.502us 1_chunks: 0.039us If you extrapolate these numbers to a repository with 1M changesets, that comes out to 1.502s versus 0.039s, which is significant. At these latencies, Python attribute lookups and function calls matter. So, this patch inlines some code to cut down on that overhead. The impact of this patch on N*_chunk() calls is clear: ! wall 0.528997 comb 0.520000 user 0.500000 sys 0.020000 (best of 19) ! wall 0.367723 comb 0.370000 user 0.350000 sys 0.020000 (best of 27) So, we go from ~38x slower to ~27x. A nice improvement. But there's still a long way to go. It's worth noting that functionality like revsets perform changelog lookups one revision at a time. So this code path is worth optimizing.	2016-10-22 15:41:23 -07:00
Gregory Szorc	52757f4357	revlog: reorder index accessors to match data structure order Index entries are ordered tuples. We have accessors in the revlog class to map tuple offsets to names. To help reinforce the order, reorder the methods so they match the order of elements in the tuple. While I'm here, also sneak in some minimal documentation.	2016-10-23 09:34:55 -07:00
Pierre-Yves David	b03bd97b6a	revlog: make 'storedeltachains' a "public" attribute The next changeset will make that attribute read by the changegroup packer. We make it "public" beforehand.	2016-10-14 02:25:08 +02:00
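
The size comparison described in commit df8167ed29 above (discard the compressed result when it is greater than or equal to the input size, and mark the stored chunk with a 'u' header) can be illustrated with a minimal sketch. This is not the code from revlog.py: the function name compress_chunk and its (header, payload) return shape are assumptions made for illustration, and the sketch always uses one-shot zlib. The treatment of chunks that begin with '\0' follows the '\0' header mentioned in commit 94d36bba2d and is likewise an assumption about the details.

```python
import zlib


def compress_chunk(data: bytes) -> tuple[bytes, bytes]:
    """Return a (header, payload) pair for one chunk (hypothetical helper).

    The compressed form is kept only when it is strictly smaller than the
    input, so "greater than or equal" means the input is stored as-is.
    """
    if not data:
        return b"", data

    compressed = zlib.compress(data)
    if len(compressed) >= len(data):
        # Compression did not save anything: store the original bytes.
        if data[0:1] == b"\0":
            # Assumed convention: a leading NUL already marks raw data.
            return b"", data
        # Otherwise a one-byte 'u' header marks the chunk as uncompressed.
        return b"u", data

    # zlib output identifies itself (it starts with 'x' at default settings).
    return b"", compressed
```

For a one-byte input the zlib output is larger than the input, so the sketch stores the byte uncompressed behind a 'u' header, which is the 1-byte growth the commit message mentions.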

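The flag processor ordering described in commit dfc79cbfc3 (transforms applied in a stable order that is reversed between read and write) and the zero-flags fast path from commit 4656f56bb3 can be sketched as follows. Everything here is hypothetical: the flag values, FLAG_ORDER, PROCESSORS, and process_flags are illustrative names, not Mercurial's API, and the choice of which operation uses the reversed order is an assumption. The "raw" member of each 3-tuple is left unused.

```python
FLAG_A = 1 << 0  # hypothetical flag bits, not Mercurial's real values
FLAG_B = 1 << 1
FLAG_ORDER = [FLAG_A, FLAG_B]  # stable order shared by reads and writes


def add_prefix(text: bytes) -> bytes:
    return b"A:" + text


def strip_prefix(text: bytes) -> bytes:
    if not text.startswith(b"A:"):
        raise ValueError("missing prefix")
    return text[2:]


def reverse_bytes(text: bytes) -> bytes:
    return text[::-1]


# Each processor is a (read, write, raw) 3-tuple, as in the commit message.
PROCESSORS = {
    FLAG_A: (strip_prefix, add_prefix, None),
    FLAG_B: (reverse_bytes, reverse_bytes, None),
}


def process_flags(text: bytes, flags: int, operation: str) -> bytes:
    """Apply the registered transforms for every flag set in `flags`."""
    if operation not in ("read", "write"):
        raise ValueError("operation must be 'read' or 'write'")
    if flags == 0:
        return text  # fast path: nothing to transform

    # Assumption: reads walk FLAG_ORDER forward, writes walk it backward,
    # so a read undoes the transforms in the opposite order of the write.
    order = FLAG_ORDER if operation == "read" else list(reversed(FLAG_ORDER))
    for flag in order:
        if flags & flag:
            read, write, _raw = PROCESSORS[flag]
            text = read(text) if operation == "read" else write(text)
    return text
```

Because the two transforms do not commute, reading with the same order as writing would fail to recover the original text; reversing the order is what makes the round trip work.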

587 Commits