Initializing a subrepo when one doesn't exist is the right thing to do when the
parent is being updated, but in few other cases. Unfortunately, there isn't
enough context in the subrepo module to distinguish this case. This same issue
can be caused with other subrepo aware commands, so there is a general issue
here beyond the scope of this fix.
A simpler attempt I tried was to add an '_updating' boolean to localrepo, and
set/clear it around the call to mergemod.update() in hg.updaterepo(). That
mostly worked, but doesn't handle the case where archive will clone the subrepo
if it is missing. (I vaguely recall that there may be other commands that will
clone if needed like this, but certainly not all do. It seems both handy, and a
bit surprising for what should be a read only operation. It might be nice if
all commands did this consistently, but we probably need Angel's subrepo caching
first, to not make a mess of the working directory.)
I originally handled 'Exception' in order to pick up the Aborts raised in
subrepo.state(), but this turns out to be unnecessary because that is called
once and cached by ctx.sub() when iterating the subrepos.
It was suggested in the bug discussion to skip looking at the subrepo links
unless -S is specified. I don't really like that idea because missing a subrepo
or (less likely, but worse) a corrupt .hgsubstate is a problem of the parent
repo when checking out a revision. The -S option seems like a better fit for
functionality that would recurse into each subrepo and do a full verification.
Ultimately, the default value for 'allowcreate' should probably be flipped, but
since the default behavior was to allow creation, this is less risky for now.
updatetotally() might be invoked outside wlock scope (e.g. invocation
via postincoming() at "hg unbundle" or "hg pull").
In such case, acquisition of wlock is needed for consistent view,
because parallel "hg update" and/or "hg bookmarks" might change
working directory status while executing updatetotally().
Strictly speaking, truly consistent updating should acquire also store
lock, because active bookmark might be moved to another one outside
wlock scope (e.g. pulling from other repository causes updating
current active one).
Acquisition of wlock in this patch ensures consistency in as same
level as past "hg update".
This patch centralizes similar code paths to update the working
directory with extra care for non-file components (e.g. bookmark) into
newly added function updatetotally().
'if True' at the beginning of updatetotally() is redundant at this
patch, but useful to reduce amount of changes in subsequent patch.
There are race conditions between clients performing a shared clone
to pooled storage:
1) Clients race to create the new shared repo in the pool directory
2) 1 client is seeding the repo in the pool directory and another goes
to share it before it is fully cloned
We prevent these race conditions by obtaining a lock in the pool
directory that is derived from the name of the repo we will be
accessing.
To test this, a simple generic "lockdelay" extension has been added.
The extension inserts an optional, configurable delay before or after
lock acquisition. In the test, we delay 2 seconds after lock acquisition
in the first process and 1 second before lock acquisition in the 2nd
process. This means the first process has 1s to obtain the lock. There
is a race condition here. If we encounter it in the wild, we could
change the dummy extension to wait on the lock file to appear instead
of relying on timing. But that's more complicated. Let's see what
happens first.
As part of writing an extension that wished to share an arbitrary piece
of data among shared repos, I had to reimplement a significant part of
hg.share in order to obtain localrepository instances for the source
and destination.
This patch establishes a function in hg.py that will be called after a
share is performed. It is passed localrepository instances so extensions
can easily perform additional actions at share time. We move hgrc and
shared file writing there because this function is a logical place for
it.
A side effect of the refactor is writing of the shared file now occurs
before updating. This seems more appropriate and shouldn't have any
impact on real world behavior.
When pooled storage is enabled, `hg clone` will initialize a repo
from a local repo using the store sharing mechanism then pull from
the originally requested repo.
Before this patch, the working directory update occurred between
these steps. This meant that we would only update to revisions that
were already present in the local pooled storage.
This patch moves the update to after we pull from the originally
requested repository so we may check out a revision that didn't yet
exist locally. In other words, it makes the behavior like normal
`hg clone`.
Before this patch, 'cachedlocalrepo' always caches "visible" repoview
object, because 'cachedlocalrepo' uses "visible" repoview returned by
'hg.repository()' without any additional processing.
If the client of 'cachedlocalrepo' wants "served" repoview, some
objects to be cached are discarded unintentionally.
1. 'cachedlocalrepo' newly caches "visible" repoview object
(call it VIEW1)
2. 'cachedlocalrepo' returns VIEW1 to the client of it at 'fetch()'
3. the client gets "served" repoview object by 'filtered("served")'
on VIEW1 (call this "served" repoview VIEW2)
4. accessing to 'repo.changelog' implies:
- instantiation of changelog via 'localrepository.changelog'
- instantiation of "filtered changelog" via 'repoview.changelog'
5. "filtered changelog" above is cached in VIEW2
6. VIEW2 is discarded after processing, because there is no
reference to it
7. 'cachedlocalrepo' returns VIEW1 cached at (1) above to the
client at next 'fetch()'
8. 'filtered("served")' on VIEW1 at the client side creates new
"served" repoview again, because VIEW1 is "visible"
(call this new "served" repoview VIEW3)
9. accessing to 'repo.changelog' implies instantiation of filtered
changelog again, because "filtered changelog" is cached in
VIEW2 at (5), but not in VIEW3 currently used
10. (go to (7) above)
As described above, "served" repoview object and "filtered changelog"
cached in it are discarded always, even if the repository itself
hasn't been changed since last access.
For example, in the case of 'hgweb_mod.hgweb', "newly caching" occurs,
when:
- all cached objects are already assigned to another threads
(in this case, repoview is created in 'cachedlocalrepo.copy()')
- or, stat of '00changelog.i' is changed from last access
(in this case, repoview is created in 'cachedlocalrepo.fetch()')
once changes are pushed via HTTP, this always occurs.
The root cause of this inefficiency is that 'cachedlocalrepo' always
caches "visible" repoview object, even if the client of it wants
another view.
To make 'cachedlocalrepo' cache appropriate repoview object, this
patch adds additional filtering on the repo object returned by
'hg.repository()'. It is assumed that initial repoview object should
be already filtered by expected view.
After this patch:
- 'filtered("served")' on VIEW1 at (3)/(7) above returns VIEW1
itself, because VIEW1 is now "served", and
- VIEW2 and VIEW3 equal VIEW1
- therefore, "filtered changelog" is cached in VIEW1, and reused
intentionally
In an upcoming patch we'll have different behavior here for when 'merge
--force' is used as opposed to when other kinds of force operations are
performed, like rebases.
The home of 'Abort' is 'error' not 'util' however, a lot of code seems to be
confused about that and gives all the credit to 'util' instead of the
hardworking 'error'. In a spirit of equity, we break the cycle of injustice and
give back to 'error' the respect it deserves. And screw that 'util' poser.
For great justice.
In 03e658289ff6, there was an attempt to fallback to looking up the update
revision assuming it was always a rev but the documentation states:
True means update to default rev, anything else is treated as a revision
Therefore, we should only fallback to looking up the update rev if it is not
True. This bug was found in hg-git and I couldn't think of a test that does
this in pure Mercurial since the source repository is checked for the revision
as well (and therefore gracefully falls back).
cachedlocalrepo.copy() didn't actually create new localrepository
instances. This meant that the new thread isolation code in hgweb wasn't
actually using separate localrepository instances, even though it was
properly using separate cachedlocalrepo instances.
Because the behavior of the API changed, the single caller in hgweb had
to be refactored to always call _webifyrepo() or it may not have used
the proper filter.
I confirmed via print() debugging that id(repo) is in fact different on
each thread. This was not the case before.
For reasons I can't yet explain, this does not fix issue4756. I suspect
there is shared cache somewhere that isn't thread safe.
hgweb contained code for determining whether a cached localrepository
instance was up to date. This code was way too low-level to be in
hgweb.
This functionality has been moved to a new "cachedlocalrepo" class
in hg.py. The code has been changed slightly to facilitate use
inside a class. hgweb has been refactored to use the new API.
As part of this refactor, hgweb.repo no longer exists! We're very close
to using a distinct repo instance per thread.
The new cache records state when it is created. This intelligence
prevents an extra localrepository from being created on the first
hgweb request. This is why some redundant output from test-extension.t
has gone away.
Before this patch if clone --updaterev points to branch which head
on src repo wasnt in dest repo, clone updated dest repo to
default branch. After applying this patch, if changeset from
src repo pointing at given branch is not in dest repo, it searches
for changeset pointing for given branch locally in dest repo.
Lookup in destination repo:
559: uprev = destrepo.lookup(update)
is wrapped by try/except block to preserve current behaviour when
given revset to -u is not found - it will not fail,but silently update
dest repo to head of default branch.
Before this patch, when auto sharing is enabled, 'hg.clone()' tries to
create local clone regardless of locality of the clone destination on
the host, and causes failure.
To avoid auto sharing when the clone destination is remote, this patch
adds examination of 'islocal(dest)' before auto sharing in
'hg.clone()'.
'islocal(dest)' is examined after 'sharepool', because:
- the former is more expensive than the latter
- without enabling share extension, the later is always negative
Many 3rd party consumers of Mercurial have created wrappers to
essentially perform clone+share as a single operation. This is
especially popular in automated processes like continuous integration
systems. The Jenkins CI software and Mozilla's Firefox release
automation infrastructure have both implemented custom code that
effectively perform clone+share. The common use case here is that
clients want to obtain N>1 checkouts while minimizing disk space and
network requirements. Furthermore, they often don't care that a clone
is an exact mirror of a remote: they are simply looking to obtain
checkouts of specific revisions.
When multiple third parties implement a similar feature, it's a good
sign that the feature is worth adding to the core product. This patch
adds support for an easy-to-use clone+share feature.
The internal "clone" function now accepts options to control auto
sharing during clone. When the auto share mode is active, a store will
be created/updated under the base directory specified and a new
repository pointing to the shared store will be created at the path
specified by the user.
The share extension has grown the ability to pass these options into
the clone command/function.
No command line options for this feature are added because we don't
feel the feature will be popular enough to warrant their existence.
There are two modes for auto share mode. In the default mode, the shared
repo is derived from the first changeset (rev 0) in the remote
repository. This enables related repositories existing at different URLs
to automatically use the same storage. In environments that operate
several repositories (separate repo for branch/head/bookmark or separate
repo per user), this has the potential to drastically reduce storage
and network requirements. In the other mode, the name is derived from the
remote's path/URL.
Python 2.6 introduced the "except type as instance" syntax, replacing
the "except type, instance" syntax that came before. Python 3 dropped
support for the latter syntax. Since we no longer support Python 2.4 or
2.5, we have no need to continue supporting the "except type, instance".
This patch mass rewrites the exception syntax to be Python 2.6+ and
Python 3 compatible.
This patch was produced by running `2to3 -f except -w -n .`.
While hopefully atypical, there are reasons that a subrepository revision can be
lost that aren't covered by corruption of the .hgsubstate revlog. Such things
can happen when a subrepo is amended, stripped or simply isn't pulled from
upstream because the parent repo revision wasn't updated yet. There's no way to
know if it is an error, but this will find potential problems sooner than when
some random revision is updated.
Until recently, convert made no attempt at rewriting the .hgsubstate file. The
impetuous for this is to verify the conversion of some repositories, and this is
orders of magnitude faster than a bash script from 0..tip that does an
'hg update -C $rev'. But it is equally useful to determine if everything has
been pulled down before taking a thumb drive on the go.
It feels somewhat wrong to leave this out of verifymod (mostly because the file
is already read in there, and the final summary is printed before the subrepos
are checked). But verifymod looks very low level, so importing subrepo stuff
there seems more wrong.
If a "thing" is callable but raises TypeError for some reason, a callable
object would be returned. Thereafter, unfriendly traceback would be displayed:
Traceback (most recent call last):
...
File "mercurial/hg.pyc", line 119, in _peerorrepo
obj = _peerlookup(path).instance(ui, path, create)
AttributeError: 'function' object has no attribute 'instance'
Instead, we should show the reason why "thing(path)" didn't work:
Traceback (most recent call last):
...
File "hggit/__init__.py", line 89, in _local
p = urlcls(path).localpath()
TypeError: 'NoneType' object is not callable
If a "thing" is not callable, it must be a module or an object that implements
instance(). If that module didn't have instance(), the error message would be
"<unloaded module 'foo'> object is not callable". It doesn't make perfect sense,
but it isn't so bad as it can blame which module went wrong.
Today, the terms 'active' and 'current' are interchangeably used throughout the
codebase in reference to the active bookmark (the bookmark that will be updated
with the next commit). This leads to confusion among developers and users.
This patch is part of a series to standardize the usage to 'active' throughout
the mercurial codebase and user interface.
Although Python supports `X = Y if COND else Z`, this was only
introduced in Python 2.5. Since we have to support Python 2.4, it was
a very common thing to write instead `X = COND and Y or Z`, which is a
bit obscure at a glance. It requires some intricate knowledge of
Python to understand how to parse these one-liners.
We change instead all of these one-liners to 4-liners. This was
executed with the following perlism:
find -name "*.py" -exec perl -pi -e 's,(\s*)([\.\w]+) = \(?(\S+)\s+and\s+(\S*)\)?\s+or\s+(\S*)$,$1if $3:\n$1 $2 = $4\n$1else:\n$1 $2 = $5,' {} \;
I tweaked the following cases from the automatic Perl output:
prev = (parents and parents[0]) or nullid
port = (use_ssl and 443 or 80)
cwd = (pats and repo.getcwd()) or ''
rename = fctx and webutil.renamelink(fctx) or []
ctx = fctx and fctx or ctx
self.base = (mapfile and os.path.dirname(mapfile)) or ''
I also added some newlines wherever they seemd appropriate for readability
There are probably a few ersatz ternary operators still in the code
somewhere, lurking away from the power of a simple regex.
The next patch will enable verification by using the system's CA store if
possible, which means we would have to distinguish None (=use default) from
'' (=--insecure). This smells bug-prone and provides no way to override
web.cacerts to forcibly use the system's store by --config argument.
This patch changes the meaning of web.cacerts as follows:
value behavior
------- ---------------------------------------
None/'' use default
'!' never use CA certs (set by --insecure)
<path> verify by the specified CA certificates
Values other than <path> are for internal use and therefore undocumented.
0e5f75a52c9d introduced a way to share bookmarks. When a repository share that
shares bookmarks was created, a .hg/bookmarks.shared file was created to mark
the repository share as one that shares its bookmarks.
We have plans to introduce other levels of sharing, including a "full share"
mode. Rather than creating a new ".shared" file for each new thing that we may
want to share It seems better to create a single "shared" file that will list
what is shared for a given shared repository. This should make it much easier
to get a list of everything that is shared by a given shared repository.
The shared file contains a list of shared "items" (such as bookmarks). Each
shared "item" is added as a new line in the file. For now the only possible
entry in the file is "bookmarks".
This change touches every module in which repository.opener was being used, and
changes it for the equivalent repository.vfs. This is meant to make it easier
to split the repository.vfs into several separate vfs.
It should now be possible to remove localrepo.opener.
This patch adds the -B/--bookmarks option to the share command added by the
share extension. All it does for now is create a marker, 'bookmarks.shared',
that will be used by future code to implement the sharing functionality.
This just copies the same local sample hgrc, except it sets the
default path to the repo it was cloned from.
This is cut-and-paste from the local sample hgrc, but I think it's
acceptable, since the two pieces of code are right next to each other
and they're small. There is danger of them going out of synch, but it
would complicate the code too much to get rid of this C&P.
I also add ui as an import to hg.py, but with demandimport, this
should not be a noticeable performance hit.
We need to explicitly push all local bookmarks when doing the clone from a local
repo to a remote one through ssh. This will let us remove the manual export of
bookmarks in clone and rely on the official exchange in push and pull instead.
Now that the standard pull function includes bookmarks, we need to ensure that
a copy clone also copies the bookmarks. This will let us drop the manual bookmark
pulling done independently during a clone.
Some users clone from a server before ever running 'hg config --edit',
so they don't see our helpful template for things like enabling the
username. Attempt to give them some helpful guidance.
Local clones copy/hardlink over files directly, so the branchcaches should all
be valid. There's a slight chance that a read operation would update one of the
branchcaches, but we hold a lock over the source repo so they shouldn't cause
an invalid branchcache to be copied over, just a different, valid one.