Without this patch, if the server sets preferuncompressed, there's no way for
clients to override that and force a non-streaming clone. With this patch, we
extend the meaning of --pull to also override preferuncompressed and force a
non-streaming clone.
The logic to read a requires file resides in scmutil, so it's only logical that
the logic to write the file should be there too.
And now I've typed logic too many times it no longer looks like a word.
Logically.
Localrepo's __init__ function creates a local requirements set to add to before
it replaces the instance's set with it. This is unnecessary, so let's just
use the instance's set directly.
To avoid confusion from overloading of the variable name "requirements", the
requirements is renamed to remotereqs for localrepo's stream_in() function.
Localrepo's _applyrequirements function isn't very straightforward about what
it does. Its purpose is to both act as a setter for the requirements attribute,
and to apply appropriate requirements to the opener's configuration.
This change makes the function just focus on the latter responsibility. We
rename it as such, and make setting the requirements attribute the
responsibility of the caller.
The init function used to create a local list, and then convert it to a set
before assigning it as a data attribute. This change simplifies the function by
having it always be a set, requiring no conversion.
Localrepo's requirements class variable was introduced in 87e5a9d0fbb1 to make
requirements modifiable by extensions. A main access point, _baserequirements,
still exists, but this change undoes making the class variable to begin with.
Without this simplification, there is a class variable with a default value
that is only copied, but never directly used. This behavior is moved directly
into the _baserequirements function.
Using 'addfinalize' to generate 'fncache' means that no pending version of the
file will be generated for the hooks. We would have to use the
'addfilegenerator' method to get such result. However the 'fncachevfs' (who
decide that a write is necessary) have no access to the transaction to register
such file generation at add time. Having the transaction accessible to the 'vfs'
is too much trouble for no benefit. This outdated 'fncache' file at hook time is
not expected to be an issue.
The previous move from 'onclose' to 'addfinalize' had no impact on this timing.
I'm documenting it now because I looked at it.
This allow to gracefully report the failure of the bookmark push and carry on.
Before this change set. Local push would plain quit and wireprotocol would
failed in various ungraceful way.
The goal is to allow access to file outside ofthe store directory from the
transaction. The obvious target are the `bookmarks` file. But we can envision
usage for cache too.
We keep passing a main opener explicitly because a lot of code rely on this
default opener. The main opener (operating on store) is using an empty key ''.
The current heuristic for deciding between storing delta and full texts
is based on ratio of (sizeofdeltas)/(sizeoffulltext).
In some cases (for example a manifest for ahuge repo) this approach
can result in extremely long delta chains (~30,000) which are very slow to
read. (In the case of a manifest ~500ms are added to every hg command because of that).
This commit introduces "revlog.maxchainlength" configuration variable that will
limit delta chain length.
Instead of calling 'cl.finalize()' by hand (possibly at a bogus time) we
register it in the transaction during 'delayupdate' and rely on 'tr.close()' to
call it at the right time.
The 'delayupdate' method now takes a transaction object and registers its
'_writepending' method for execution in 'transaction.writepending()'. The hook can then
use 'transaction.writepending()' directly.
At some point this will allow the addition of other file creation
during writepending.
History rewriting commands like histedit tend to use temporary
commits. They may schedule hook execution on these temporary commits
for after the lock has been released. But temporary commits are likely
to have been stripped before the lock is released (and the hook run).
Hook executed for missing revisions leads to various crashes.
We disable hooks execution for revision missing in the repo. This
provides a dirty but simple fix to user issues.
On streaming clone, we were priming the local branch cache with the
remote branchmap, without checking which heads were closed.
This fixes an issue introduced in:
changeset: 17740:f8d7aaf86507
user: Tomasz Kleczek <tomasz.kleczek@fb.com>
date: Wed Oct 03 13:19:53 2012 -0700
summary: branchcache: fetch source branchcache during clone (issue3378)
that was exposed in 2.9 by:
changeset: 20192:6c385e85aa05
user: Brodie Rao <brodie@sf.io>
date: Mon Sep 16 01:08:29 2013 -0700
summary: branches: simplify with repo.branchmap().iterbranches()
8a92e6790099 broke bookmarks getting copied during uncompressed clones. Since
most of the pull logic has been moved into exchange.py, lets just call
exchange.pull to fix up the repo with the latest bits after the streaming clone
has bootstrapped the repo. This keeps us from having to duplicate the bookmark
logic.
We are changing all integers that denote the size of a chunk to read to int32.
There are two main motivations for that.
First, we change everything to the same width (32 bits) to make it possible for
a reasonably agnostic actor to forward a bundle2 without any extra processing.
With this change, this could be achieved by just reading int32s and forwarding
chunks of the size read. A bit a smartness would be logic to detect the end of
stream but nothing too complicated.
Second, we need some capacity to transmit special information during the bundle
processing. For example we would like to be able to raise an exception while a
part is being read if this exception happend while this part was generated.
Having signed integer let us use negative numbers to trigger special events
during the parsing of the bundle.
The format is renamed for B2X to B2Y because this breaks binary
compatibility. The B2X format support is dropped. It was experimental to
allow this kind of things. All elements not directly related to the binary
format remain flagged "b2x" because they are still compatible.
The basic obsolete option is allowing the creation of obsolete markers. This
does not enable other features, such as allowing unstable commits or exchanging
obsolete markers.
Previously, obstore read the obsolete._enabled flag to determine whether to
allow writes to the obstore. Since obsolete._enabled will be moving into a repo
specific config, we can't read it globally, and therefore must pass the
information into the constructor.
Now that we have a separate variable for the original 'm1' manifest,
we can safely update the nodeid of the file in the new manifest in the
same place as we update the flags.
In localrepo.commitctx(), p1's manifest is copied and used as the
basis for the manifest that is about to be committed. The way the copy
is updated makes it safe to use it where the original p1's manifest is
wanted. For readability, though, a separate variable for each purpose
would be clearer. Make it so.
This option controls what version of the binary format to use when creating a new
obsstore file.
Default is still the old format. No safeguards are currently placed around the
option value, but no clueless users are in danger of harm since it is
undocumented.
I verified that changed was never false (it was always a 2-tuple) by
adding
@@ -220,6 +225,8 @@ class manifest(revlog.revlog):
def add(self, map, transaction, link, p1=None, p2=None,
changed=None):
+ if not changed:
+ assert False, 'changed was %r' % changed
# if we're using the cache, make sure it is valid and
# parented by the same node we're diffing against
if not (changed and p1 and (p1 in self._mancache)):
and observing that the test suite still passed. Making all the
arguments required should help future readers understand what's going
on here.
All the logic of this function is in the `exchange.pull` function for some time.
We just stop calling `localrepo.pull` in `command.pull` to have access to more
information. Leaving `localrepo.pull` in place will let third-party extensions
wrap it but it would never be called by `hg pull` making the wrapping useless.
Therefore, the method is removed so that third-party code fail noisily and
get properly upgraded.
We mimic what was done for `push` for similar reason. We are about to drop
`localrepo.pull` (for consistency with dropping `localrepo.push` and we better
have an API as extensible as `push` is.
Find explanations about localrepo.push removal in 88d9d4ec499e.
There is no reason for bookmarks to get a special treatment. As a first step we
move the code as is in the `exchange.pull` function. Integration with the rest
of the flow will come later.
Adding bookmarks to pull means that most clone paths are now pulling bookmarks
through pull. We ensure that bookmark-update messages are properly suppressed in
that case.
In test-pull-http.t the 'requesting all changes' message disappear because we
now get the authentication error on the `listkeys`command before such message
is printed.
We'll add bookmark-related arguments to `repo.pull` so we need to widen the
signature of `repo.pull`. We should probably kill `repo.pull` now that
`repo.push` is dead but this is outside the scope of this series.
All the logic of this function is in the exchange.push function for some time.
We just stop calling `localrepo.push` in `command.push` to have access to more
information. Leaving `localrepo.push` in place will let third-party extensions
wrap it but it would never be called by `hg push` making the wrapping useless.
Therefore, the method is removed so that third-party code fail noisily and
get properly upgraded.
Returning the pushop object gives access to more information (upcoming bookmark
push result for example). `localrepo.push` currently extracts the `cgresult` for
callers.
This wraps all the locations of dirstate.setparent with the appropriate
begin/endparentchange calls. This will prevent exceptions during those calls
from causing incoherent dirstates (issue4353).
It's possible for the dirstate to become incoherent (issue4353) if there is an
exception in the middle of the dirstate parent and entries being written (like
if the user ctrl+c's). This change adds begin/endparentchange which a future
patch will require to be set before changing the dirstate parent. This will
allow us to prevent writing the dirstate in the event of an exception while
changing the parent.
This makes join and wjoin behave in the same way as os.path.join. That is, it
makes it possible to pass more than one path element to them.
Note that the first path element is still required, as it was before this patch,
and as is the case for os.path.join.
The internal API used IOError to indicate that a file should be marked as
removed.
There is some correlation between IOError (especially with ENOENT) and files
that should be removed, but using IOErrors to represent file removal internally
required some hacks.
Instead, use the value None to indicate that the file not is present.
Before, spurious IO errors could cause commits that silently removed files.
They will now be reported like all other IO errors so the root cause can be
fixed.
In all the remaining cases the comprehension variable is used for the same
thing as a previous loop variable.
This will mute some pyflakes "list comprehension redefines" warnings.
The phase cache file is no longer written on lock release, it is now handled by
the transaction (as changesets and obsolescence markers are).
(Hooray)
As we stop relying on the lock to write phase, repos with no existing phase
information will need to wait for a phase move or a strip to happen in order to
get the first write in the `phaseroots` file. This explain the change in
test-inherit-mode.t.
This should not have any side effects but in very obscure cases where
people interact with pre-2.1 and post-2.1 versions of Mercurial on the
same repo while having MQ patches applied but the MQ extension
disabled from time to time. A case unlikely enough to not be worth
preserving the old behavior with awful hacks.
We now pass a transaction option to this phase movement function. The
object is currently not used by the function, but it will be in the
future.
All call sites have been updated. Most call sites were already enclosed in a
transaction for a long time. The handful of others have been recently
updated in previous commit.
The ``getbundle`` function was initially design to return a changegroup bundle.
However, bundle2 allows transmitting a wide range of data. Some bundle2
requests may not include a changegroup at all.
Before this changeset, the client would request a changegroup for
``heads=[nullid]`` and receive an empty changegroup.
We introduce an official boolean parameter, ``cg``, that can be set
to false to disable changegroup generation on getbundle. A new bundle2
capability is introduced to let the client know.
The onclose() closure added in 5264dbb2d4ee held a regular reference to
the transaction object. This was causing the transaction to not gc and
a leak to occur.
The closure now holds a reference to the weakref instance and the leak
goes away.
Same drill again. We catch the PushRaced error, check if it cames from
a bundle2 processing, if so we turn it into a bundle2 with a part
transporting error information to be reraised client side.
If the heads on the server differ from the ones reported seen by the client at
bundle time, we raise a PushRaced exception. However, the part raising the
exception was broken.
To fix it, we move the PushRaced class in the error module so it can be
accessible everywhere without an import cycle.
A test is also added to prevent regression.
We highlight the fact that this is experimental by moving it to an "experimental"
section, and we match the config name with the server capability name
`bundle2-exp`.
For the same reason, we advertise this bundle2 implementation and format as
experimental. This will leave room for field testing in 3.0 but won't conflict
with a stable implementation in 3.1.
The current implementation of bundle2 is still very experimental and the 3.0
freeze is yesterday. The current bundle2 format has never been field-tested, so
we rename the header to HG2X. This leaves the HG20 header available for real
usage as a stable format in Mercurial 3.1.
We won't guarantee that future mercurial versions will keep supporting this
`HG2X` format.
We can now retrieve them from the server during push. The capabilities are
encoded the same way as in `replycaps` part (with an extra layer of urlquoting
to escape separators).
If two revisions are linearly related, there will only be one ancestor, and
commonancestors and commonancestorsheads would give the same result.
commonancestorsheads is however slightly simpler, faster and more correct.
After ``listkeys`` we can now include ``pushkey`` request in a bundle2. The part
uses a very simple scheme closest as possible to the current wireproto command
for ``pushkey``. We may eventually decide for a more sophisticated part format
before the protocol becomes final.
A new ``listkeys`` is supported by getbundle. It is a list of namespaces whose
content should be included in the bundle.
An appropriate entry has been added to the wireproto map of getbundle arguments
and a new bundle2 capability is advertised.
There are still no codes that request such parts in core mercurial.
The path of the righteous man is beset on all sides by the inequities of the
selfish and the tyranny of evil men. Blessed is he, who in the name of charity
and good will, shepherds the weak through the valley of darkness, for he is
truly Mercurial's keeper and the finder of robust methods. And I will strike
down upon thee with great vengeance and furious anger those who would attempt
to poison and destroy Mercurial's codebase. And you will know my name is the
Lord when I lay my vengeance upon thee.
This patch removes the last of the 'working' variable that was sprinkled
throughout localrepo.status which paves the way for future patches to use the
object oriented design of contexts to handle calculating the status.
With the machinery in place, we use context.subrev instead of testing for a
workingctx directly. This allows more flexibility for later patches that will
add memctx to the mix.
When a bundle2 is pushed we return a bundle instead of an integer. We use to
return a binary stream. We now return a `bundle20` bundler to make the life of
wireprotocol implementation simpler.
This input will have to travel over the wire anyway, so we feed the peer method
with a simple binary stream and rely on the server side to use `readbundle`
to create the python object.
The test output changes because the bundle is created marginally sooner and the
debug output interleaves in a different way.
For friendliness with the wire protocol implementation, the `exchange.getbundle` now
returns a binary stream when asked for a bundle2. We detect a bundle2 request and
upgrade the binary stream to an unbundler object.
In the future the unbundler may gain feature to look like a binary stream, but
we are not quite there yet.
This patch introduces "prepushoutgoinghooks" to extend outgoing check
before pushing changesets to remote easily.
This chooses the function returning "util.hooks" instead of the one to
be overridden.
The latter may cause problems silently, if one of overriders forgets
(or fails) to execute a kind of "super(xxx, self).overridden(...)". In
the other hand, the former can ensure that all registered functions
are invoked, unless one of them raises an exception.
Before this patch, "localrepository.undofiles()" returns list of
absolute filename of undo files.
This patch makes it return list of tuples "(vfs, relative filename)"
to access undo files via vfs.
This patch also changes "repair.strip()", which is the only user of
"localrepository.undofiles()".
Localrepo now supports the unbundle method of pushing changegroups. We
plan to use the unbundle call for bundle2 so it is important that all
peers supports it. The `peer.unbundle` and `peer.addchangegroup` code
path have small difference so cause some test output changes. None of those
changes seems problematic.
caps.remove('bundle2') was throwing an exception if bundle2 wasn't present in
the capabilities. This was causing test-static-http.t to hang. Let's just use
discard, so we don't get an exception.
This changeset refactors the pull code to use a bundle2 when available. We keep
bundle2 disabled by default. The current code is not ready for prime time.
Ultimately we'll want to unify the API of `bunde10` and `bundle20` to have less
different code. But for now, testing the bundle2 exchange flow is an higher
priority.
This function can return a `HG10` or `HG20` bundle. It uses the `bundlecaps`
parameters to decides which one to return.
This is a distinct function from `changegroup.getbundle` for two reasons. First
the APIs of `bundle10` and `bundle20` are not compatible yet. The two functions
may be reunited in the future. Second `exchange.getbundle` will grow parameters
for all kinds of data (phases, obsmarkers, ...) so it's better to keep the
changegroup generation in its own function for now.
This function will be used it in the next changesets.
This is a gratuitous code move aimed at reducing the localrepo bloatness.
The method had few callers, not enough to be kept in local repo.
The peer API stay unchanged.
This is a gratuitous code move aimed at reducing the localrepo bloatness.
The method had few callers, not enough to be kept in local repo.
The peer API remains unchanged.
This is a gratuitous code move aimed at reducing the localrepo bloatness.
The method had few callers, not enough to be kept in local repo.
The peer API remains unchanged.
The `pushoperation` object contains strictly more data the arguments currently
passed to `localrepo.checkpush` we pass the new object instead. This function is
used by MQ to abort push that includes MQ changesets.
Note: extension that may use this function will have to align.
The fncache was not being properly invalidated each time the lock was taken, so
in theory it could contain old data from prior to the caller having the lock.
This changes it to be invalidated as soon as the lock is taken (same as all our
other caches).
Previously the fncache was written at lock.release time. This meant it was not
tracked by a transaction, and if an error occurred during the fncache write it
would fail to update the fncache, but would not rollback the transaction,
resulting in an fncache that was not in sync with the files on disk (which
causes verify to fail, and causes streaming clones to not copy all the revlogs).
This uses the new transaction backup mechanism to make the fncache transacted.
It also moves the fncache from being written at lock.release time, to being
written at transaction.close time.
Streaming clones were writing to files outside of a transaction. Soon the
fncache will be written at transaction close time, so we need streaming clones
to be in a transaction.
Before this patch, "localrepository.commit()" omits ".hgsubstate" from
"modified" (changes[0]) and "removed" (changes[2]) file list before
checking subrepositories, but leaves one in "added" (changes[1]) as it
is.
Then, "localrepository.commit()" adds ".hgsubstate" into "modified" or
"removed" list forcibly, according to subrepository statuses.
If "added" contains ".hgsubstate", the committed context will contain
two ".hgsubstate" in its "files": one from "added" (not omitted one),
and another from "modified" or "removed" (newly added one).
How many times ".hgsubstate" appears in "files" changes node hash,
even though revision content is same, because node hash calculation
uses the specified "files" directly (without duplication check or so).
This means that node hash of committed revision changes according to
existence of ".hgsubstate" in "added" at "localrepository.commit()".
".hgsubstate" is treated as "added", not only in accidental cases, but
also in the case of "qpush" for the patch adding ".hgsubstate".
This patch omits ".hgsubstate" also from "added" files before checking
subrepositories. This patch also omits ".hgsubstate" exclusion in
"qnew"/"qrefresh" introduced by changeset bbb8109a634f, because this
patch makes them meaningless.
"hg parents --template '{files}\n'" newly added to "test-mq-subrepo.t"
enhances checking unexpected multiple appearances of ".hgsubstate" in
"files" of created/refreshed MQ revisions.
Before this patch, "localrepository.commit()" invokes specified
"editor" to edit commit message manually, and saves it after checking
sub-repositories.
This may lose manually edited commit message, if unexpected exception
is raised while checking (or commiting recursively) sub-repositories.
This patch saves manually edited commit message as soon as possible.
Before this changeset local clone of a repo with hidden changeset would include
then in the clone (why not) and turn them public (plain wrong). This happened
because the copy clone publish by dropping the phaseroot file entirely making
everything in the repo public (and therefore immune to obsolescence marker).
This changeset takes the simplest fix, we deny the copy clone in the case of hidden
changeset falling back to pull clone that will exclude them from the clone and
therefore not turning them public.
A smarter version of copy clone could be done, but I prefer to go for the
simplest solution first.
Performance benchmarking:
$ time hg log -qf -l1
...
real 0m1.420s
user 0m1.249s
sys 0m0.167s
$ time ~/local/hg/hg log -qf -l1
...
real 0m0.719s
user 0m0.614s
sys 0m0.103s
MQ extension will wrap this function to invalidate its state.
repo.invalidate cannot be wrapped for this purpose because qpush obtains
repo.lock in the middle of the operation, triggering repo.invalidate. Also,
it seems wrong to obtain lock earlier because mq data is non-store parts.
The localrepo class if far too big. Push and pull logic are extracted and
reworked to better fit with the fact we exchange more than bundle now.
This changeset extract the pulh code. later changeset will slowly slice it into
smaller brick.
The localrepo.pull method is kept for now to limit impact on user code. But it
will be ultimately removed, now that the public API is hold by peer.
changegroupsubset will take the parents of the roots to find the bases.
Other parts of Mercurial do not expect that, and a result of that is that some
bundles contain more changesets than necessary.
No real changes here - just renaming a parameter to document what it is.
Now that we have the machinery in place, we call the context method for
changing the match object in the case of comparing the working directory with
its parent.
This removes the last of the hard-coded workingctx knowledge in localrepo
paving the way for us to remove localrepo.status completely.
This patch maintains the fast path for workingctx which is to not build a
manifest if the working directory is being compared to its parent since, in
this case, we can just copy the parent manifest.
This unpacking is unneeded now because previous patches have removed the need
for this method to modify each of these variables in favor of passing the whole
set around to pre/post hook methods of the corresponding context object.
This movement is a small step in getting rid of the 'if ... else' logic for
testing the current working directory with its parent. Previously, the deleted,
unknown, and ignored variables were set in a combination of before an 'if
... else' block and within the block. This moves the variables to be set
outside the loop in one common place.
This is a small patch to help streamline keeping tracking of the list of files
for status in a variable already called 'r' ('s' is for subrepos in this
method). We now move the setting of it out of an 'if' block so that we can
later refactor more code into the context objects.
This is a slight tweak to how localrepo.status calculates what files have
changed. By forcing a changectx to be first operator and anything not a
changectx to be the second operator, we can later exploit this to allow
refactoring the status operation as a method of a context object.
Furthermore, this change will speed up 'hg diff --reverse' when used with the
working directory because the code will now hit a fast path without needing to
calculate an unneeded second manifest.
A message like this was sometimes shown when pushing:
remote: waiting for lock on repository foo held by 'mercurial:20858'
That could scare users, making them wonder whether the push actually succeeded.
To mitigate that fear, issue an additional "warning" such as:
got lock after 2 seconds
The return value from lock.lock.lock() was unused - instead we return the
delay.
This also adds the first test coverage for waiting for locks.
The localrepo class if far too big. Push and pull logic will be extracted and
reworked to better fit with the fact they now exchange more than plain changeset
bundle.
This changeset extract the push code. later changeset will slowly slice this
over 200 hundred lines and 8 indentation level function into smaller saner
brick.
The localrepo.push method is kept for now to limit impact on user code. But it
will be ultimately removed, now that the public supposed API is hold by peer.
Now that discovery is working on unfiltered changeset, I had a good occasion to
look at that bug again. This let me realise that a trivial node vs rev
comparision was the cause of this two years old bugs…
Happy second birthday phases!
Was the behavior correct and the description wrong so it should be updated as
in this patch? Or should the code work as the documentation says?
Both ways could make some sense ... but none of them are obvious in all cases.
One place where it currently cause problems is when the current revision has
another branch head that is closer to tip but closed. 'hg rebase' refuses to
rebase to that as it only see the tip-most unclosed branch head which is the
current revision.
/me kind of likes named branches, but no so much how branch closing works ...
The discovery is not yet ready for filtered repo. Pull was using filtered for
its discovery which is wrong. It worked by dumb luck because discovery mainly
use funtion that does not respect the filtering.
Trying to makes discovery work on filtered repo revealed this bug.
When all changesets in the local repo are either being pushed or remotly known,
we can take a fast path when bundling changeset because we are certain all local
deltas are computed againts base known remotely.
So we have a check to detect this situation, when we did a bare push and nothing
was excluded.
In a coming refactoring, the discovery will run on filtered view and the content
of `outgoing.excluded` will just include unserved (secret) changeset not filtered by the
repoview used to call push (usually "visible"). So we need to check if there is
both no excluded changeset and nothing filtered by the current repoview.
Before that changes, pulled revision that happend to be already known locally
(so, not actually added) was not taken into account when computing the new
common set between local and remote.
It appears that we already know the heads of the pulled set. It is in the
`rheads` variable, so we are just using it and everything is works fine.
We are dropping the, now useless, computation of `added` set in the process.
The double-for form of list comprehensions gets particularly unreadable
when you throw in an 'if' condition. This expands the only remaining
instance of the double-for syntax in our codebase into a loop.
Push is currently allowed to create a new head if there is a remote
bookmark that will be updated to point to the new head. If the
bookmark is not known remotely then push aborts, even if a -B argument
is about to push the bookmark. This change allows push to continue in
this case. This does not require a wireproto force.
Running perfmoonwalk on the Mercurial repo (with almost 20,000 changesets) on
Mac OS X with an SSD, before this change:
$ hg --config format.chunkcachesize=1024 perfmoonwalk
! wall 2.022021 comb 2.030000 user 1.970000 sys 0.060000 (best of 5)
(16,154 cache hits, 3,840 misses.)
$ hg --config format.chunkcachesize=4096 perfmoonwalk
! wall 1.901006 comb 1.900000 user 1.880000 sys 0.020000 (best of 6)
(19,003 hits, 991 misses.)
$ hg --config format.chunkcachesize=16384 perfmoonwalk
! wall 1.802775 comb 1.800000 user 1.800000 sys 0.000000 (best of 6)
(19,746 hits, 248 misses.)
$ hg --config format.chunkcachesize=32768 perfmoonwalk
! wall 1.818545 comb 1.810000 user 1.810000 sys 0.000000 (best of 6)
(19,870 hits, 124 misses.)
$ hg --config format.chunkcachesize=65536 perfmoonwalk
! wall 1.801350 comb 1.810000 user 1.800000 sys 0.010000 (best of 6)
(19,932 hits, 62 misses.)
$ hg --config format.chunkcachesize=131072 perfmoonwalk
! wall 1.805879 comb 1.820000 user 1.810000 sys 0.010000 (best of 6)
(19,963 hits, 31 misses.)
We may want to change the default size in the future based on testing and
user feedback.
Before this patch, phase of newly created commit is determined by
"phases.new-commit" configuration regardless of phase of state in each
subrepositories.
For example, this may cause the "public" revision in the parent
repository referring the "secret" one in subrepository.
This patch checks phase of state in each subrepositories before
committing in the parent, and aborts or changes phase of newly created
commit if subrepositories have more restricted phase than the parent.
This patch uses "follow" as default value of "phases.checksubrepos"
configuration, because it can keep consistency between phases of the
parent and subrepositories without breaking existing tool chains.
This patch makes "lock.lock.__init__()" take both vfs and lock file
path relative to vfs, instead of absolute path to lock file.
This allows lock file to be accessed via vfs.
Before this patch, "localrepo.py" has many methods defining local
variable "lock", even though it imports "lock" module as "lock". This
ambiguity decreases readability.
Copying a repos local configuration to another repo is a bad idea because the
2nd repo gets the configuration of the first. Prevent this by really calling
repo.baseui.copy when repo.ui.copy is called.
This requires some changes in commandserver which needs to clone repo.ui for
rejecting temporary changes.
This patch has its roots back in the topic "repo isolation" around 68ae3063a47d
and was suggested by mpm.
This patch adds "updateremote()", which uses "compare()" to compare
bookmarks between the local and the remote repositories, to replace
pushing local bookmarks in "localrepository.push()".
Before this patch, each feature setup functions for localrepository
class should examine whether corresponding extension is enabled or not
by themselves.
This patch invokes only feature setup functions defined in module of
enabled extensions, and it makes implementation of feature setup
functions easier and simpler.
The changegroup hook runs when the repo lock is released, but it's possible that
multiple transactions have happened during that single lock and therefore the
commits the hook was for may be gone. This is the case in the shelve extension
where it adds a commit and strips it in the same lock but different
transactions (which results in warning messages during unshelve on hgsubversion
repos).
A real fix would be to attach the hook to the transaction instead, but that
might have unknown consequences. Since we're this close to code-freeze, this fix
just prevents the hook from running if the commit disappeared.
repo.transaction always writes to stderr when a transaction aborts. In order to
be able to abort a transaction quietly (e.g shelve needs a temporary view on
the repo) we need to make the report level configurable.
A `unfilteredpropertycache` is a kind of `propertycache` used on `localrepo` to
unsure it will always be run against unfiltered repo and stored only once.
As the cached value is never stored in the repoview instance, the descriptor
will always be called. Before this patch such calls always result in a call to
the `__get__` method of the `propertycache` on the unfiltered repo. That was
recomputing a new value on every access through a repoview.
We can't prevent the repoview's `unfilteredpropertycache` to get called on every
access. In that case the new code makes a standard attribute access to the
property. If a value is cached it will be used.
The `propertycache` test file have been augmented with test about this issue.
Make push -r/-B update only these bookmarks that point to pushed revisions
or their ancestors, so we can be sure that commit pointed by bookmark is
present in the remote reposiory. Previously push tried to update all shared
bookmarks.
Before this patch, all localrepositories support same features,
because supported features are managed by the class variable
"supported" of "localrepository".
For example, "largefiles" feature provided by largefiles extension is
recognized as supported, by adding the feature name to "supported" of
"localrepository".
So, commands handling multiple repositories at a time like below
misunderstand that such features are supported also in repositories
not enabling corresponded extensions:
- clone/pull from or push to localhost
- recursive execution in subrepo tree
"reposetup()" can't be used to fix this problem, because it is invoked
after checking whether supported features satisfy ones required in the
target repository.
So, this patch adds the set object named as "featuresetupfuncs" to
"localrepository" to manage hook functions to setup supported features
of each repositories.
If any functions are added to "featuresetupfuncs", they are invoked,
and information about supported features is managed in each
repositories individually.
This patch also adds checking below:
- pull from localhost: whether features supported in the local(= dst)
repository satisfies ones required in the remote(= src)
- push to localhost: whether features supported in the remote(= dst)
repository satisfies ones required in the local(= src)
Managing supported features by the class variable means that there is
no difference of supported features between each instances of
"localrepository" in the same Python process, so such checking is not
needed before this patch.
Even with this patch, if intermediate bundlefile is used as pulling
source, pulling indirectly from the remote repository, which requires
features more than ones supported in the local, can't be prevented,
because bundlefile has no information about "required features" in it.
A symlink's target should never be empty in normal use. Solaris and some BSDs
do allow empty symlinks to be created (with varying semantics on dereference),
but a symlink placeholder that started off as empty is either
- going to be empty, in which case ignoring it is fine, since it's unchanged, or
- going to not be empty, in which case this check is irrelevant.
The refactoring of all the context objects allows us to simply pass a basectx
to the __new__ constructor and have it return the same object without
allocating new memory.
Moving the logic that adds files to a changegroup to a different function
allows extensions to override it and customize the way filelogs are added
to changegroups.
No logic is changed.
Create a simple start() method to pass the lookup function until bundler
becomes smarter and gets a repo object.
Since we now create the bundler for the whole lifetime, we need to pass it
down to revlog methods.
Having the permission to lock the source repo on push is now optional. When the
repo cannot be locked, phase are not changed locally. A status message is issue
when some actual phase movement are skipped:
cannot lock source repo, skipping local public phase update
A debug message with the exact reason of the locking failure is issued in all
case.
Having a dedicated function will allow us to experiment with other exchange
strategies in an extension. As we have no solid clues about how to do it right,
being able to experiment is vital.
Some transaction tricks are necessary for pull. But nothing too scary.
Having a dedicated function will allows us to experiment with other exchange
strategies in an extension. As we have no solid clues about how to do it right,
being able to experiment is vital.
I intended a more ambitious extraction of push logic, but we are far too
advanced in the release cycle for it.
This patch makes "_journalfiles()" return a list of pairs of journal
file and corresponded vfs instance instead of a list of journal files
in full path, to use "vfs.rename()" instead of "util.rename()" in
"aftertrans()".
"undofiles()" still returns a list of undo files in full path, because
"repair.strip()" expects such list. It'll be also made to return a
list of pairs of undo file and corresponded vfs at vfs migration for
"repair.strip()" in the near future.
In the point of view of efficiency, "vfs" instance created in this
patch should be passed to and reuse in "store.store()" invocation just
after patched code block, because "store" object is initialized by vfs
created with "self.sharedpath".
eBut to focus just on migration from direct file I/O API accessing to
vfs, this patch uses created vfs as temporary one. Refactoring around
"store.store()" invocation will be done in the future.
Before this patch, vfs constructor applies both "util.expandpath()"
and "os.path.realpath()" on "base" path, if "expand" is True.
This patch splits it into "realpath" and "expandpath", to apply each
functions separately: this splitting can allow to use vfs also where
one of each is not needed.
When the strip command is run, it calls repo.destroyed, which in turn checks if
we read _phasecache, and if we did calls filterunknown on it and flushes the
changes immediately. But in some cases, nothing causes _phasecache to be read,
so we miss out on this and the file remains the same on-disk.
Then a call to invalidate comes, which should refresh _phasecache if it
changed, but it didn't, so it keeps using the old one with the stripped
revision which causes an IndexError.
Test written by Yuya Nishihara.
Publishing server may contains draft changeset when they are created locally. As
publishing is the default, it is actually fairly common. Because of this
"inconsistency" phases synchronization may be done even to publishing server.
This may cause severe issues for subrepo. It is possible to reference read-only
repository as subrepo. Push in a super repo recursively push subrepo. Those
pushes to potential read only repo are not optional, they are "suffered" not
"choosed". This does not break because as the repo is untouched the push is
supposed to be empty. If the reference repo locally contains draft changesets, a
courtesy push is triggered to turn them public. As the repo is read only, the
push fails (after possible prompt asking for credential). Failure of the
sub-push aborts the whole subrepo push. This force the user to define a custom
default-push for such subrepo.
This changeset introduce a prevention of this error client side by skipping the
courtesy phase synchronisation in problematic situation. The phases
synchronisation is skipped when four conditions are gathered:
- this is a subrepo push, (normal push to read-only repo)
- and remote support phase
- and remote is publishing
- and no changesets was pushed (if we pushed changesets, repo is not read only)
The internal config option used in this version is not definitive. It is here to
demonstrate a working fix to the issue.
In the future we probably wants to track subrepo changes and avoid pushing to
untouched one. That will prevent any attempt to push to read-only or unreachable
subrepo.
Another fix to prevent courtesy push from older clients to push to newer server
is also still needed.
When you have obsolescence marker that apply to a pulled changesets, the added
changeset is immediately filtered. Then the list of added changeset needs to be
build against and unfiltered repo.
This saves us a couple of dict lookups in the common case, and improves
the performance of the status method by 5% (measured with util.timed)
in a repo with a large manifest.
User 'timeless' in irc mentioned that having the blackbox be
translated would result in logs that:
- may be mixed language, if multiple users use the same repo
- are not google searchable (since searching for english gives more
results)
- might not be readable by an admin if the employee is using hg in
his native language
And therefore we should log everything in english.
The blackbox was logging every head after every incoming group.
Now we only log the heads that have changed.
Added a test. Moved the hooks test to the bottom of the file since
the hooks interfer with the tests after it.
Logs incoming changes to a repo to ui.log(). Includes the number of changes
and the hashes of the heads after the new changes.
Example log line:
2013/02/09 08:35:19 durham> 1 incoming changes - new heads: cb9a9f314b8b
Currently the blackbox logs the unix user that is performing the push/pull.
It would be nice to log the http authorized user as well so it works with
hgweb, but that's outside the scope of this commit.
This pulls some of the logic for the cleanup that needs to happen
after a commit has been made otu of localrepo.commit and into
workingctx. This is part of a larger refactoring effort that will
eventually allow us to perform some types of merges in-memory.
This changes localrepo.commit to use the workingctx it creates form
the munged output of localrepo.status while running some precommit
validation. Specifically, it uses functions that were already present
on the workingctx. I believe this is a net readabilty improvement,
and that this makes these lines consistent with the refactoring in a
subsequent patch that pulls some of the validation logic into
workingctx so that it can be reused elsewhere.
localrepo.commit creates a workingctx, calls self.status, does some
munging on the changes status returns, does some validation on those
changes, and then creates a new workingctx from the changes. This
moves the creation of the new workginctx ahead of some validation,
with the intention of refactoring some of that validation logic into
the workingctx, so that it can be reused elsewhere.
Accepting those will lead to "mild corruption", correctly reported as
an error by hg verify, but often not a problem in practice.
Enabled when server.validate is switched on.
Writing cache for unfiltered repo only is barely useful, Most repo user are now
at least use the `hidden` filter. This changeset now assigns the remote cache
for a filter as low as possible for a wider reuse as possible.
Before this changesets the `destroyed` function updated the branchcache for
unfiltered repository. As seen in a previous changeset, Read only repo does
not cares about the unfiltered repo. We now update it for `unserved`.
The strip code used a trick to lower the cost of branchcache update after a
strip. However is less necessary since we have branchcache collaboration.
Invalid branchcache are likely to be cheaply rebuilt again a near subset of the
repo.
Moreover, this trick would need update to be relevant in the now filtered
repository world. It currently update the unfiltered branchcache that few people
cares about. Make it smarter on that aspect would need complexes update of the
calling logic
So this mechanism is:
- Arguably needed,
- Currently irrelevant,
- Hard to update
and I'm dropping it.
We now update the branchcache in all case by courtesy of the read only reader.
This changeset have a few expected impact on the testsuite are different cache
are updated.
The `commitctx` and `addchangegroup` methods of repo upgrade branchcache after
completion. This behavior aims to keep the branchcache in sync for read only
process as hgweb. See b4909adfc093 for details.
Since changelog filtering is used, those calls only update the cache for unfiltered repo.
One of no interest for typical read only process like hgweb.
Note: By chance in basic case, `repo.unfiltered() == repo.filtered('unserved')`
This changesets have the "unserved" cache updated instead. I think this is the
only cache that matter for hgweb.
We could imagine updating all possible branchcaches instead but:
- I'm not sure it would have any benefit impact. It may even increase the odd of
all cache being invalidated.
- This is more complicated change.
So I'm going for updating a single cache only which is already better that
updating a cache nobody cares about.
This changeset have a few expected impact on the testsuite are different cache
are updated.
Now that changelog filtering is in place, it's become evident that
naming the filters according to the set of revs _not_ included in the
filtered changelog is confusing. This is especially evident in the
collaborative branch cache scheme.
This changes the names of the filters to reflect the revs that _are_
included:
hidden -> visible
unserved -> served
mutable -> immutable
impactable -> base
repoview.filteredrevs is renamed to filterrevs, so that callers read a
bit more sensibly, e.g.:
filterrevs('visible') # filter revs according to what's visible
Add sorted() in places found by testing with PYTHONHASHSEED=random and code
inspection.
An alternative to sprinkling sorted() all over would be to change substate to a
custom dict with sorted iterators...