
1646 Commits

Author SHA1 Message Date
Pierre-Yves David
66973f9ece configitems: register the 'format.dotencode' config 2017-06-30 03:42:22 +02:00
Pierre-Yves David
6f55ce8db4 configitems: register the 'format.aggressivemergedeltas' config 2017-06-30 03:42:20 +02:00
Pierre-Yves David
7c5463c25b revlog: add an experimental option to mitigate delta issues (issue5480)
The general delta heuristic for selecting a delta base does not scale with the
number of branches. The delta base is frequently too far away to be able to reuse
a chain according to the "distance" criteria. This leads to the insertion of
larger deltas (or even full texts) that themselves push the bases for the next
deltas further away, leading to more large deltas and full texts. These full
texts and frequent recomputations throw Mercurial performance into disarray.

For example, on a slightly large repository:

  280 000 files (2 150 000 versions)
  430 000 changesets (10 000 topological heads)

The numbers below compare the repository with and without the distance criteria:

manifest size:
    with:    21.4 GB
    without:  0.3 GB

store size:
    with:    28.7 GB
    without:  7.4 GB

bundle last 15 000 revisions:
    with:    800 seconds
             971 MB
    without:  50 seconds
              73 MB

unbundle time (of the last 15K revisions):
    with:    1150 seconds (~19 minutes)
    without:   35 seconds

Similar issues have been observed in other repositories.


Adding a new option or "feature" on stable is uncommon. However, given that this
issue is making Mercurial practically unusable, I'm exceptionally targeting
this patch for stable.

What is actually needed is a full rework of the delta building and reading
logic. However, that will be a longer process and churn not suitable for stable.

In the meantime, we introduce a quick and dirty mitigation of this in the
'experimental' config space. The new option introduces a way to set the maximum
amount of memory usable to store a diff in memory. This extends the ability of
Mercurial to create chains without removing all safeguards regarding memory
access. The option should be phased out when core has a more proper solution
available.

Setting the limit to '0' removes all limits; setting it to '-1' uses the default
limit (textsize x 4).
2017-06-23 13:49:34 +02:00
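
Note on 7c5463c25b: the limit semantics from the last paragraph of the message
can be sketched as follows. This is a minimal illustration and not the revlog
code: `effective_limit` and `fits` are hypothetical names, and only the '0',
'-1' and 'textsize x 4' rules come from the commit message.

```python
def effective_limit(configured, textsize):
    """Return the maximum allowed size in bytes, or None for 'no limit'.

    Hypothetical helper: only the 0 / -1 / textsize * 4 rules are taken
    from the commit message.
    """
    if configured == 0:
        return None            # '0' removes all limits
    if configured < 0:
        return textsize * 4    # '-1' falls back to the default limit
    return configured          # any positive value is used as given


def fits(size, configured, textsize):
    """Tell whether a candidate of ``size`` bytes is acceptable."""
    limit = effective_limit(configured, textsize)
    return limit is None or size <= limit
```

With a 1000 byte text, `-1` yields a 4000 byte limit, while `0` accepts any
size.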
FUJIWARA Katsunori
0f2623ec07 localrepo: factor out base of filecache annotation class
storecache does not need to be derived from repofilecache.

The changes in this patch allow repofilecache and storecache to implement
their own __init__() differently from each other.
2017-06-30 01:47:49 +09:00
Pulkit Goyal
d1e9e38065 py3: use '%d' instead of '%s' for integers
Python 3 does not let you use '%s' for integers when formatting bytes.
2017-06-17 14:53:25 +05:30
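
Note on d1e9e38065: a small illustration of the difference, assuming the
formatting in question is done on bytes (the restriction does not apply to
Python 3 `str`). The function names are made up for this example.

```python
def format_rev(rev):
    # b'%d' accepts an integer on both Python 2 and Python 3.5+
    return b'rev %d' % rev


def broken_format_rev(rev):
    # On Python 3, b'%s' needs a bytes-like object, so an int raises TypeError
    return b'rev %s' % rev
```

On Python 3, `format_rev(42)` returns `b'rev 42'`, while
`broken_format_rev(42)` raises `TypeError`.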
Martin von Zweigbergk
d75cb87451 localrepo: remove unused addchangegroup() (API)
This completes the cleanup started in f1b3c9ce0ce7 (localrepo: move
the addchangegroup method in changegroup module, 2014-04-01).
2017-06-15 15:13:18 -07:00
Siddharth Agarwal
f23bf55820 workingctx: add a way for extensions to run code at status fixup time
Some extensions like fsmonitor need to run code after dirstate.status is
called, but while the wlock is held. The extensions could grab the wlock again,
but that has its own peculiar race issues. For example, fsmonitor would not
like its state to be written out if the dirstate has changed underneath (see
issue5581 for what can go wrong in that sort of case).

To protect against these sorts of issues, allow extensions to declare that they
would like to run some code at fixup time.

fsmonitor will switch to using this in the next patch in the series.
2017-06-12 13:56:50 -07:00
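
Note on f23bf55820: the idea of running extension code after status is
computed, but while the lock is still held, can be sketched like this. Every
name below (`WorkingState`, `add_fixup_callback`, the placeholder status
result) is hypothetical and is not the API added by the commit.

```python
import threading


class WorkingState(object):
    """Toy stand-in for a working context; all names are hypothetical."""

    def __init__(self):
        self._wlock = threading.Lock()
        self._fixup_callbacks = []

    def add_fixup_callback(self, callback):
        """Register code to run at fixup time."""
        self._fixup_callbacks.append(callback)

    def status(self):
        result = {'modified': [], 'clean': ['a.txt']}  # placeholder status
        with self._wlock:
            # Callbacks run after the status is known, while the lock is
            # held, so they never have to re-acquire it themselves.
            for callback in self._fixup_callbacks:
                callback(self, result)
        return result
```

Because the callbacks run inside the locked section, an extension does not
need to grab the lock a second time and cannot race with another writer
between the status computation and its own bookkeeping.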
Gregory Szorc
2c7b5a6b43 localrepo: move filtername to __init__
This is obviously an instance attribute, not a type attribute. The
modern Python style is to use __init__ for defining these.

This exposes statichttprepo as inheriting from localrepository
without calling its __init__. As a result, its __init__ defines
a lot of variables that methods on localrepository need.
But factoring the common bits into a separate class is for another
day.
2017-06-08 23:23:37 -07:00
Gregory Szorc
943d55015e obsolete: move obsstore creation logic from localrepo
This code has more to do with obsolete.py than localrepo.py. Let's
move it there.
2017-06-08 21:54:30 -07:00
Gregory Szorc
efbb740737 revlog: skeleton support for version 2 revlogs
There are a number of improvements we want to make to revlogs
that will require a new version - version 2. It is unclear what the
full set of improvements will be or when we'll be done with them.
What I do know is that the process will likely take longer than a
single release, will require input from various stakeholders to
evaluate changes, and will have many contentious debates and
bikeshedding.

It is unrealistic to develop revlog version 2 up front: there
are just too many uncertainties that we won't know until things
are implemented and experiments are run. Some changes will also
be invasive and prone to bit rot, so sitting on dozens of patches
is not practical.

This commit introduces skeleton support for version 2 revlogs in
a way that is flexible and not bound by backwards compatibility
concerns.

An experimental repo requirement for denoting revlog v2 has been
added. The requirement string has a sub-version component to it.
This will allow us to declare multiple requirements in the course
of developing revlog v2. Whenever we change the in-development
revlog v2 format, we can tweak the string, creating a new
requirement and locking out old clients. This will allow us to
make as many backwards incompatible changes and experiments to
revlog v2 as we want. In other words, we can land code and make
meaningful progress towards revlog v2 while still maintaining
extreme format flexibility up until the point we freeze the
format and remove the experimental labels.

To enable the new repo requirement, you must supply an experimental
and undocumented config option. But not just any boolean flag
will do: you need to explicitly use a value that no sane person
should ever type. This is an additional guard against enabling
revlog v2 on an installation it shouldn't be enabled on. The
specific scenario I'm trying to prevent is, say, a user with a
4.4 client with a frozen format enabling the option but then
downgrading to 4.3 and accidentally creating repos with an
outdated and unsupported repo format. Requiring a "challenge"
string should prevent this.

Because the format is not yet finalized and I don't want to take
any chances, revlog v2's version is currently 0xDEAD. I figure
squatting on a value we're likely never to use as an actual revlog
version to mean "internal testing only" is acceptable. And
"dead" is easily recognized as something meaningful.

There is a bunch of cleanup that is needed before work on revlog
v2 begins in earnest. I plan on doing that work once this patch
is accepted and we're comfortable with the idea of starting down
this path.
2017-05-19 20:29:11 -07:00
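
Note on efbb740737: a minimal sketch of the two guards described in the
message, a versioned requirement string and a "challenge" config value. The
challenge text, the requirement prefix and the helper names are placeholders
invented for this example; they are not the strings used by Mercurial.

```python
# All strings below are placeholders invented for this sketch.
CHALLENGE = 'yes-i-really-want-the-unstable-format'
REQUIREMENT_PREFIX = 'exp-newformat.'
CURRENT_SUBVERSION = 0


def new_repo_requirements(config_value):
    """Return the requirements to write when creating a repository."""
    requirements = {'revlogv1'}
    # A plain boolean such as 'true' is deliberately not enough.
    if config_value == CHALLENGE:
        requirements.discard('revlogv1')
        requirements.add('%s%d' % (REQUIREMENT_PREFIX, CURRENT_SUBVERSION))
    return requirements


def can_open(repo_requirements, supported):
    """A client may open a repo only if it supports every requirement."""
    return set(repo_requirements) <= set(supported)
```

Bumping `CURRENT_SUBVERSION` produces a new requirement string, so a client
that only knows the previous string refuses to open the repository.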
Yuya Nishihara
e6297851af localrepo: map integer and hex wdir identifiers to workingctx
changectx.__init__() is slightly modified to take str(wdirrev) as a valid
integer revision (and raise a WdirUnsupported exception).

A test will be added by the next patch.
2016-08-19 18:40:35 +09:00
Yuya Nishihara
75533e6603 localrepo: document that __contains__() may raise LookupError 2017-05-25 23:18:02 +09:00
Pierre-Yves David
c1ca9ad6ee transaction: run _writejournal unfiltered
The function uses the length of the repository, something affected by filtering.
It seems better to use the unfiltered length here.

Credit for finding this goes to Durham Goode.
2017-05-25 01:45:52 +02:00
Augie Fackler
39eba5889f localrepo: extract bookmarkheads method to bookmarks.py
This method is only used internally by destutil, and it's obscure
enough I'm willing to just move it without a deprecation warning,
especially since the new method has more constrained functionality.

Design-wise I'd also like to get active bookmark handling folded into
the bookmark store, so that we don't squirrel away an extra attribute
for the active bookmark on the repository object.
2017-05-18 16:43:56 -04:00
Augie Fackler
33cedfa925 localrepo: mark walk convenience method as deprecated (API) 2017-05-18 18:01:48 -04:00
Augie Fackler
c46d888391 localrepo: migrate to context manager for changing dirstate parents 2017-05-18 17:11:14 -04:00
Pierre-Yves David
705173411e cache: make the cache updated callback easily accessible to extension
This will help extensions benefit from this new logic. As a side effect, this
clarifies the 'transaction' method a little bit.
2017-05-19 13:09:23 +02:00
Gregory Szorc
0d15165c74 localrepo: reformat set literals
Putting multiple elements on the same line makes diffs harder
to read. Switch to one line per element so future changes are
easier on the eyes.
2017-05-17 20:01:29 -07:00
Martin von Zweigbergk
3bc2187d25 match: remove ispartial()
The function was added in c2498bb6d298 (match: add match.ispartial(),
2015-05-15) for use by narrowhg, but narrowhg never ended up needing
it.
2017-05-17 09:43:50 -07:00
Gregory Szorc
ae8cb885e7 changelog: load pending file directly
When changelogs are written, a copy of the index (or inline revlog)
may be written to an 00changelog.i.a file to facilitate hooks and
other processes having access to the pending data before it is
finalized.

The way it works today, the localrepo class loads the changelog
like normal. Then, if it detects a pending transaction, it asks
the changelog class to load a pending changelog. The changelog
class looks for a 00changelog.i.a file. If it exists, it is
loaded and internal data structures on the new revlog class are
copied to the original instance.

The existing mechanism is inefficient because it loads 2 revlog
files. The index, node map, and chunk cache for 00changelog.i
are thrown away and replaced by those for 00changelog.i.a.

The existing mechanism is also brittle because it is a layering
violation to access the data structures being accessed. For example,
the code copies the "chunk cache" because for inline revlogs
this cache contains the raw revision chunks and allows the original
changelog/revlog instance to access revision data for these pending
revisions. This whole behavior of course relies on the revlog
constructor reading the entirety of an inline revlog into memory
and caching it. That's why it is brittle. (I discovered all this
as part of modifying behavior of the chunk cache.)

This patch streamlines the loading of a pending 00changelog.i.a
revlog by doing it directly in the changelog constructor if told
to do so. When this code path is active, we no longer load the
00changelog.i file at all.

The only negative outcome I see from this change is if loading
00changelog.i was somehow facilitating a role. But I can't imagine
what that would be because we throw away its data (the index data
structures are replaced and inline revision data is replaced via
the chunk cache) and since 00changelog.i.a is a copy of
00changelog.i, file content should be identical, so there should
be no meaningful file integrity checking at play. I think this was
all just sub-optimal code.
2017-05-13 16:26:43 -07:00
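
Note on ae8cb885e7: the streamlined loading can be pictured as choosing the
index file once, up front, instead of loading 00changelog.i and then swapping
in the pending data. This is a sketch under assumptions: `pick_changelog_index`
and its `exists` / `trypending` parameters are illustrative names, not a copy
of the changelog constructor.

```python
def pick_changelog_index(exists, trypending=False):
    """Choose which index file the changelog constructor should open.

    ``exists`` is any callable that reports whether a file is present.
    """
    if trypending and exists('00changelog.i.a'):
        # Pending data is available: open it directly and never touch
        # 00changelog.i.
        return '00changelog.i.a'
    return '00changelog.i'
```

Only one revlog file is read in either case, so no index, node map or chunk
cache is built and then thrown away.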
Martin von Zweigbergk
c3406ac3db cleanup: use set literals
We no longer support Python 2.6, so we can now use set literals.
2017-02-10 16:56:29 -08:00
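
Note on c3406ac3db: for reference, the two spellings build equal sets. The
element values here are arbitrary strings chosen for the example.

```python
# Spelling required while Python 2.6 was supported
old_style = set(['dotencode', 'fncache', 'store'])

# Set literal, available from Python 2.7 onward
new_style = {'dotencode', 'fncache', 'store'}

same = old_style == new_style
```

An empty set still has to be written `set()`, since `{}` is an empty dict.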
Pierre-Yves David
9c635f53f5 caches: move the 'updating the branch cache' message in 'updatecaches'
We are about to remove the branchmap cache update in changegroup application.
There is a debug message alongside this update that we do not want to lose. We
move the message beforehand to simplify the test update in the next changeset.
The message move is quite noisy and isolating that noise is useful.

Most test updates are just line reordering since the message is issued at a
later point during the transaction.

After this change, the message is displayed in more cases since local commit
creation also issues it.
2017-05-02 22:27:44 +02:00
Pierre-Yves David
781ab337a0 caches: stop warming the cache after 'localrepo.commitctx'
Now that we guarantee that branchmap caches are updated at the end of the
transaction we can drop that one. This removes a problematic case with nested
transactions where the new cache could be written on disk before the transaction
is finished.

The test change is harmless: since we update the cache at a later point, the
dirstate has been updated in between.
2017-05-02 18:56:07 +02:00
Pierre-Yves David
6b3c96d7ef caches: call 'repo.updatecache()' in 'repo.destroyed()'
Regenerating the cache after a 'strip' or a 'rollback' is useful. So we call the
generic cache warming function as other caches than just branchmap will be
updated there in the future.

To do so, we have to make 'repo.updatecache()' able to take no arguments. In
such cases, we reload all caches.
2017-05-02 19:05:58 +02:00
Pierre-Yves David
87c7f6f271 caches: introduce a function to warm cache
We have multiple caches that gain from being kept up to date. For example in a
server setup, we want to make sure the branchcache cache is hot for other
read-only clients.

Right now each cache tries to update itself in place where new data has been
added. However, the approach is error prone (we might miss some spot) and
fragile. When nested transactions are involved, such cache updates might happen
before a top level transaction is committed, writing caches for uncommitted
data on disk.

Having a single entry point, run at the end of each successful transaction,
helps to ensure the cache is up to date and refreshed at the right time.

We start with updating the branchmap cache but others will come.
2017-05-02 21:39:43 +02:00
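
Note on 87c7f6f271: the "single entry point, run at the end of each successful
transaction" can be sketched with a toy repository that counts nesting depth.
All class and attribute names here are hypothetical; the real transaction
machinery is considerably more involved.

```python
class Repo(object):
    """Toy repository; every name in this sketch is hypothetical."""

    def __init__(self):
        self._depth = 0
        self.warmed = []      # records each cache warm-up, for illustration

    def updatecaches(self):
        """Single entry point for warming caches."""
        self.warmed.append('branchmap')

    def transaction(self):
        return _Transaction(self)


class _Transaction(object):
    def __init__(self, repo):
        self._repo = repo

    def __enter__(self):
        self._repo._depth += 1
        return self

    def __exit__(self, exc_type, exc, tb):
        self._repo._depth -= 1
        # Warm caches only once the outermost transaction has succeeded,
        # so nothing is written for uncommitted data.
        if exc_type is None and self._repo._depth == 0:
            self._repo.updatecaches()
        return False
```

Closing a nested transaction leaves `warmed` untouched; the update happens
once, when the outermost transaction closes without error.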
Pierre-Yves David
a1a70e3fbc transaction: track newly introduced revisions
Revisions are not the data that will unlock the most new capabilities.
However, they are the simplest thing to track and still unlock some nice
improvements with regard to caching.

We plug ourselves in at the changelog level to make sure we do not miss any
revision additions.

The 'revs' set is configured at the repository level because the transaction
itself does not need to know that much about the business logic.
2017-05-02 18:45:51 +02:00
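
Note on a1a70e3fbc: a rough sketch of the split of responsibilities described
above. The repository level sets up a 'revs' set on the transaction, and the
changelog records every revision it adds. The class layout and the
`open_transaction` helper are assumptions made for this example.

```python
class Transaction(object):
    """Generic transaction: it only carries a dict of tracked changes."""

    def __init__(self):
        self.changes = {}


class Changelog(object):
    def __init__(self):
        self._revs = []

    def add(self, data, tr):
        rev = len(self._revs)
        self._revs.append(data)
        # Record the new revision if the transaction was set up to track it.
        tracked = tr.changes.get('revs')
        if tracked is not None:
            tracked.add(rev)
        return rev


def open_transaction():
    """Repository-level setup: the transaction itself knows nothing of revs."""
    tr = Transaction()
    tr.changes['revs'] = set()
    return tr
```

Hooking in at the changelog means every code path that adds a revision is
covered without each caller having to remember to report it.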
Martin von Zweigbergk
dfa0866489 localrepo: reuse exchange.bundle2requested()
It seems like localrepo.getbundle() is trying to do the same thing, so
let's just call the method. That way we get the same condition as
there (matching any "HG2" prefix, not only "HG20").
2017-05-03 10:33:26 -07:00
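
Note on dfa0866489: the condition mentioned in the message, matching any "HG2"
prefix rather than only "HG20", might look like the sketch below. It is
written from the description above, not copied from exchange.py, so the
handling of a missing capability set is an assumption.

```python
def bundle2requested(bundlecaps):
    """Report whether any advertised capability asks for bundle2."""
    if bundlecaps is None:
        return False            # assumption: no capabilities means no bundle2
    return any(cap.startswith('HG2') for cap in bundlecaps)
```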
Pierre-Yves David
95ea84f11b cleanup: drop the deprecated 'localrepo._link' method
This was deprecated in favor of 'localrepo.wvfs.islink'. We can now drop it for the
future 4.3.
2017-05-02 02:05:39 +02:00
Pierre-Yves David
258c50d8f2 cleanup: drop the deprecated 'localrepo.wfile' method
This was deprecated in favor of 'localrepo.wvfs.join'. We can now drop it for the
future 4.3.
2017-05-02 02:04:55 +02:00
Pierre-Yves David
511036e5cd cleanup: drop the deprecated 'localrepo.join' method
This was deprecated in favor of 'localrepo.vfs.join'. We can now drop it for the
future 4.3.
2017-05-02 02:03:56 +02:00
Pierre-Yves David
18be67b624 cleanup: drop the deprecated 'localrepo.tag' method
This was deprecated in favor of 'mercurial.tags.tag'. We can now drop it for the
future 4.3.
2017-05-02 02:03:04 +02:00
Pierre-Yves David
a1ea2991e5 cleanup: drop the deprecated 'localrepo.opener' method
This was deprecated in favor of 'localrepo.vfs'. We can now drop it for the
future 4.3.
2017-05-02 02:01:47 +02:00
Pierre-Yves David
aa3d41a9fe cleanup: drop the deprecated 'localrepo.wopener' method
This was deprecated in favor of 'localrepo.wvfs'. We can now drop it for the
future 4.3.
2017-05-02 02:01:15 +02:00
Pierre-Yves David
53505593ab track-tags: write all tag changes to a file
The tag changes information we compute is now written to disk. This gives
hooks full access to that data.

The format picked for that file uses a 2-character prefix for the action:

    -R: tag removed
    +A: tag added
    -M: tag moved (old value)
    +M: tag moved (new value)

This format allows hooks to easily select the lines that matter to them without
having to post process the file too much. Here are a couple of examples:

 * to select all newly tagged changesets, match "^+",
 * to detect tag move, match "^.M",
 * to detect tag deletion, match "-R".

Once again we rely on the fact the tag tests run through all possible
situations to test this change.
2017-03-28 10:15:02 +02:00
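
Note on 53505593ab: a hook written in Python could filter the file along these
lines. The sample content is invented: the message only specifies the
two-character prefix, so the rest of each line (a node and a tag name
separated by spaces) is an assumption, as is the `select` helper.

```python
import re

# Hypothetical file content; only the two-character prefixes come from
# the commit message.
SAMPLE = """\
+A 1111 release-1.0
-M 2222 stable
+M 3333 stable
-R 4444 old-tag
"""


def select(lines, pattern):
    """Keep the lines matching ``pattern``."""
    regex = re.compile(pattern)
    return [line for line in lines if regex.search(line)]


lines = SAMPLE.splitlines()
newly_tagged = select(lines, r'^\+')   # '+' must be escaped in Python's re
moved = select(lines, r'^.M')
removed = select(lines, r'^-R')
```

Note that in Python's `re` syntax the "^+" pattern from the message has to be
written `^\+`, because an unescaped `+` is a repetition operator.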
Pierre-Yves David
cd08df0c89 track-tags: compute the actual differences between tags pre/post transaction
We now compute the actual differences between tags before and after the
transaction. This catches a couple of false positives in the tests.

We compute the full difference since we are about to make this data available
to hooks in the next changeset.
2017-03-28 10:14:55 +02:00
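
Note on cd08df0c89: computing the difference between two tag mappings can be
sketched as a plain dictionary comparison. The function below is an
illustration, not the code from the tags module; it assumes tags are held as
name-to-node dicts and reuses the -R/+A/-M/+M action codes listed under
53505593ab.

```python
def difftags(old, new):
    """Return (action, tag, node) entries describing how tags changed."""
    entries = []
    for tag in sorted(set(old) | set(new)):
        before = old.get(tag)
        after = new.get(tag)
        if before == after:
            continue                    # unchanged tags are not reported
        if after is None:
            entries.append(('-R', tag, before))
        elif before is None:
            entries.append(('+A', tag, after))
        else:
            # A move is reported as its old value then its new value.
            entries.append(('-M', tag, before))
            entries.append(('+M', tag, after))
    return entries
```

Comparing the resulting tags, rather than checking whether .hgtags changed,
is what avoids reporting a change when the file differs but the tags do not.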
Pierre-Yves David
ac782d2423 track-tags: introduce first bits of tags tracking during transaction
This changeset introduces detection of tags changes during transaction. When
this happens a 'tag_moved=1' argument is set for hooks, similar to what we do
for bookmarks and phases.

This code is disabled by default as there are still various performance
concerns.  Some require a smarter use of our existing tag caches and some other
require rework around the transaction logic to skip execution when unneeded.
These performance improvements have been delayed; I would like to be able to
experiment and stabilize the feature behavior first.

Later changesets will push the concept further and provide a way for hooks to
know what are the actual changes introduced by the transaction. Similar work
is needed for the other families of changes (bookmark, phase, obsolescence,
etc). Upgrade of the transaction logic will likely be performed at the same
time.

The current code can report some false positives when the .hgtags file changes
but the resulting tags are unchanged. This will be fixed in the next changeset.

For testing, we simply globally enable a hook in the tag test as all the
possible tag update cases should exist there. A couple of them show the false
positive mentioned above.

See the in-code documentation for more details.
2017-03-28 06:38:09 +02:00
Pierre-Yves David
8d44f66739 localrepo: fix deprecation version for 'repo._link'
The patch lingered for a while and nobody noticed when it was resubmitted.
2017-04-04 16:49:12 +02:00
Pierre-Yves David
630da1c31c localrepo: fix deprecation version for 'repo.join'
The patch lingered for a while and nobody noticed when it was resubmitted.
2017-04-04 16:48:58 +02:00
Pierre-Yves David
ef96a2bca6 tags: only return 'alltags' in 'findglobaltags'
This is a minor update along the way. We simplify the 'findglobaltags' function to
only return the tags. Since no existing data is reused, we know that all tags
returned are global and we can let the caller get that information if it cares
about it.
2017-03-28 07:41:23 +02:00
Pierre-Yves David
8da79dae5a tags: do not feed dictionaries to 'findglobaltags'
The code asserts that these dictionaries are empty. So we can be more explicit
and have the function return the dictionaries directly.
2017-03-28 06:13:49 +02:00
Pierre-Yves David
368236438f tags: deprecated 'repo.tag'
All users are gone. We can now celebrate the removal of some extra lines from the
'localrepo' class.
2017-03-27 16:00:47 +02:00
Pierre-Yves David
8141af4eed tags: move 'repo.tag' in the 'tags' module
Similar logic: pretty much nobody uses this method (which creates a tag), so we
move it into the 'tags' module where it belongs.
2017-03-27 15:58:31 +02:00
Pierre-Yves David
f1cbb59e75 tags: move '_tags' from 'repo' to 'tags' module
As far as I understand, that function does not need to be on the local repository
class, so we extract it into the 'tags' module where it will be nice and
comfortable. We keep the '_' in the name since its only user will follow in the
next changeset.
2017-03-27 15:55:07 +02:00
Ryan McElroy
372012a68d localrepo: use tryunlink 2017-03-21 06:50:28 -07:00
Ryan McElroy
2c2aec06d7 localrepo: improve vfs documentation
At the beginning of March, I promised Yuya that I would follow up a comment I
made on a patch with improved documentation for these vfs objects. Also hat tip
to Pierre-Yves for adding the documentation here in the first place.
2017-03-21 06:50:42 -07:00
Augie Fackler
8e92fda6f8 localrepo: use node.hex instead of awkward .encode('latin1')
Spotted as an option by Yuya. Thanks!
2017-03-20 22:06:57 -04:00
Augie Fackler
18340f960e localrepo: forcibly copy list of filecache keys
On Python 3, keys() is more like iterkeys(), so we got in trouble for
mutating the dict while we're iterating here. Since the list of caches
should be relatively small, work around this difference by just
forcing a copy of the key list.
2017-03-19 01:11:00 -04:00
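
Note on 18340f960e: the Python 3 behavior described above can be reproduced
with a plain dict. The function names are made up for this example.

```python
def clear_unsafe(cache):
    # On Python 3, keys() is a live view: deleting while iterating over it
    # raises RuntimeError.
    for key in cache.keys():
        del cache[key]


def clear_safe(cache):
    # Forcing a copy of the key list makes the mutation safe on both
    # Python 2 and Python 3.
    for key in list(cache.keys()):
        del cache[key]
```

The copy costs one small list per call, which is negligible when the number
of cache keys is small.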
Augie Fackler
00cba0f12b localrepo: turn hook kwargs back into strs before calling hook
It might be better to ensure that the hook kwargs dict only has str
keys on Python 3. I'm torn.
2017-03-19 01:10:02 -04:00
Augie Fackler
092cc849d4 localrepo: ensure transaction id is fully bytes on py3 2017-03-19 01:08:59 -04:00
Gregory Szorc
5ca0f908bf py3: add __bool__ to every class defining __nonzero__
__nonzero__ was renamed to __bool__ in Python 3. This patch simply
aliases __bool__ to __nonzero__ for every class implementing
__nonzero__.
2017-03-13 12:40:14 -07:00
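
Note on 5ca0f908bf: the aliasing pattern looks like this on a toy class
(`Changes` is a made-up name, not a class from Mercurial).

```python
class Changes(object):
    def __init__(self, items):
        self._items = list(items)

    def __nonzero__(self):          # consulted by Python 2
        return bool(self._items)

    __bool__ = __nonzero__          # consulted by Python 3
```

Without the alias, Python 3 would ignore `__nonzero__` and treat every
instance as true, since the class defines neither `__bool__` nor `__len__`.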