You can get into trouble if you commit, update back to an older
changeset, and then rollback. The update removes your valuable changes
from the working dir, then rollback removes them history. Oops: you've
just irretrievably lost data running nothing but core Mercurial
commands. (More subtly: rollback from a shared clone that was already
at an older changeset -- no update required, just rollback from the
wrong directory.)
The fix assumes that only "commit" transactions have irreplaceable
data, and allows rolling back non-commit transactions as always. But
when rolling back a commit, check that the working dir is checked out
to tip, i.e. the changeset we're about to destroy. If not, abort. You
can get back the old (dangerous) behaviour with --force.
If the working dir parent was destroyed by rollback, then the old
behaviour is perfectly reasonable: restore dirstate, branch, and
bookmarks. That way the working dir moves back to an existing
changeset rather than becoming an orphan.
But if the working dir parent was unaffected -- say, you updated to an
older changeset and then did rollback -- then it's silly to restore
dirstate and branch. So don't do that. Leave the status of the working
dir alone. (But always restore bookmarks, because that file refers to
changeset IDs that may have been destroyed.)
- clarify how we parse undo.desc
- fix bad grammar in an error message
- factor out ui local
- rename some local variables
- standardize string quoting
This is extremely handy for those occasional circumstances where you
need to edit .hg/sharedpath manually, since modern Unix text editors
make it surprisingly difficult to create a text file with no trailing
newline.
The usual contract is that close() makes your writes permanent, so
atomictempfile's use of close() to *discard* writes (and rename() to
keep them) is rather unexpected. Thus, change it so close() makes
things permanent and add a new discard() method to throw them away.
discard() is only used internally, in __del__(), to ensure that writes
are discarded when an atomictempfile object goes out of scope.
I audited mercurial.*, hgext.*, and ~80 third-party extensions, and
found no one using the existing semantics of close() to discard
writes, so this should be safe.
We refresh the stat info when releasing repo.wlock(), right after writing it.
Also, invalidate the dirstate by deleting its attribute. This will force a
stat by the decorator that actually checks if anything changed, rather than
reading it again every time.
Note that prior to this, there was a single dirstate instance created for a
localrepo. It was invalidated by calling dirstate.invalidated(), clearing
its internal attributes.
As a consequence, the following construct is no longer safe:
ds = repo.dirstate # keep a reference to the repo's dirstate
wlock = repo.wlock()
try:
ds.setparents(...)
finally:
wlock.release() # dirstate should be written here
Since it's possible that the dirstate was modified between lines #1 and #2,
therefore changes to the old dirstate won't get written when the lock releases,
because a new instance was created by the decorator.
util is never imported by any other name than util, so this is mostly just a
simple search and replace from util.localpath to util.urllocalpath (assuming
other uses of util.localpath already has been renamed).
It has substantially different semantics from forget at the command
layer, so change it to avoid confusion.
We can't simply combine it with remove because we need to explicitly
drop non-added files in some cases like commit.
This greatly improves the speed of the bundling process, and often reduces the
bundle size considerably. (Although if the repository is already ordered, this
has little effect on both time and bundle size.)
For non-generaldelta clients, the reduced bundle size translates to a reduced
repository size, similar to shrinking the revlogs (which uses the exact same
algorithm). For generaldelta clients the difference is minor.
When the new bundle format comes, reordering will not be necessary since we
can then store the deltaparent relationsships directly. The eventual default
behavior for clients and servers is presented in the table below, where "new"
implies support for GD as well as the new bundle format:
old client new client
old server old bundle, no reorder old bundle, no reorder
new server, non-GD old bundle, no reorder[1] old bundle, no reorder[2]
new server, GD old bundle, reorder[3] new bundle, no reorder[4]
[1] reordering is expensive on the server in this case, skip it
[2] client can choose to do its own redelta here
[3] reordering is needed because otherwise the pull does a lot of extra
work on the server
[4] reordering isn't needed because client can get deltabase in bundle
format
Currently, the default is to reorder on GD-servers, and not otherwise. A new
setting, bundle.reorder, has been added to override the default reordering
behavior. It can be set to either 'auto' (the default), or any true or false
value as a standard boolean setting, to either force the reordering on or off
regardless of generaldelta.
Some timing data from a relatively branch test repository follows. All
bundling is done with --all --type none options.
Non-generaldelta, non-shrunk repo:
-----------------------------------
Size: 276M
Without reorder (default):
Bundle time: 14.4 seconds
Bundle size: 939M
With reorder:
Bundle time: 1 minute, 29.3 seconds
Bundle size: 381M
Generaldelta, non-shrunk repo:
-----------------------------------
Size: 87M
Without reorder:
Bundle time: 2 minutes, 1.4 seconds
Bundle size: 939M
With reorder (default):
Bundle time: 25.5 seconds
Bundle size: 381M
defversion was a property (later option) on the store opener, used to propagate
the changelog revlog format to the other revlogs, so they would be created with
the same format.
This required that the changelog instance was created before any other revlog;
an invariant that wasn't directly enforced (or documented) anywhere.
We now use the revlogv1 requirement instead, which is transfered to the store
opener options. If this option is missing, v0 revlogs are created.
With generaldelta switched on, deltas are always computed against the first
parent when adding revisions. This is done regardless of what revision the
incoming bundle, if any, is deltaed against.
The exact delta building strategy is subject to change, but this will not
affect compatibility.
Generaldelta is switched off by default.
The Mercurial 1.9 release is moving a lot of stuff around anyway and we are
already moving path_auditor from util.py to scmutil.py for that release.
So this seems like a good opportunity to do such a rename. It also strengthens
the current project policy to avoid underbars in names.
Before this patch undo.bookmarks was created on bookmarks write and
not with other transaction-related files. There were two issues: first
is that if you have changed bookmarks few times after a transaction
happened, rollback will give you a state which can point to
non-existing revision. Second is that if you have not changed
bookmarks after a transaction, rollback will touch your state anyway.
This change also adds `localrepo._writejournal` method, which can be
used by other extensions to save their transaction-related backup in
right time.
This improves the misleading error message
$ hg identify
abort: there is no Mercurial repository here (.hg not found)!
to the more explicit
$ hg identify
abort: requirement 'fake' not supported!
for all commands in commands.optionalrepo, which includes the identify
and serve commands in particular.
This is for the case when a new entry in .hg/requires will be defined
in a future Mercurial release.
Previously, when rolling back a transaction, some users could be confused
between the level to which the store is rolled back, and the new parents
of the working directory.
$ hg rollback
rolling back to revision 4 (undo commit)
With this change:
$ hg rollback
repository tip rolled back to tip revision 4 (undo commit)
working directory now based on revision 2 and 1
So now the user can realize that the store has been rolled back to an older
tip, but also that the working directory may not on the tip (here we are
rolling back the merge of the heads 2 and 1)
The default behaviour is to commit subrepositories with uncommitted changes. In
my experience this is usually undesirable:
- Changes to dependencies are often debugging leftovers
- Real changes should generally be applied on the source project directly,
tested then committed. This is not always possible, subversion subrepos may
include only a small part of the source project, without the tests.
Setting ui.commitsubrepos=no will now abort commits containing such modified
subrepositories like:
$ hg --config ui.commitsubrepos=no ci -m msg
abort: uncommitted changes in subrepo sub
I ruled out the hook solution because it does not easily take --include/exclude
options in account. Also, my main concern is whether this flag could cause
problems with extensions. If there are legitimate reasons for callers to
override this behaviour (I could not find any), they might either override at ui
level, or we could add an argument to localrepo.commit() later.
v2:
- Renamed ui.commitsubs to ui.commitsubrepos
- Mention the configuration entry in hg help subrepos
Add missing calls to close() to many places where files are
opened. Relying on reference counting to catch them soon-ish is not
portable and fails in environments with a proper GC, such as PyPy.
Defers updating the fncache file with newly added entries to the end of
the transaction (on e.g. pull), doing a single open call on the fncache
file, instead of opening and closing it each time a new entry is added
to the store.
Implemented by adding a new abstract write() function on store.basicstore
and registering it as a release function on the store lock in
localrepo.lock (compare with dirstate.write).
store.fncachestore overrides write() from basicstore and calls a new
write function on the fncache object, which writes all entries to the
fncache file if it's dirty.
store.fncache.add() now just marks itself as dirty if a new name is added.
Some extensions (e.g. hgsubversion) completely override push command. Because
extensions load order is unspecified, if hgsubversion loads before mq, mq
checks about not pushing applied patches will be bypassed. Short of finding a
way to fix load order, extracting the checking logic will allow
hgsubversion-like extensions to run the check themselves.
The generation of cache files like tags.cache and branchheads.cache is not an
actual reflection of things changing in the whole of the .hg directory (like eg
a commit or a rebase or something) but instead these cache files are just part
of bookkeeping. As such its convienant to allow various clients to ignore file
events to do with these cache files which would otherwise cause a double
refresh. Eg one refresh might occur after a commit, but the act of refreshing
after the commit would cause Mercurial to generate a new branchheads.cache which
would then cause a second refresh, for clients.
However if these cache files are moved into a directory like eg .hg/cache/ then
GUI clients on OSX (and possibly other platforms) can happily ignore file events
in this cache directory.
These leaks may occur in environments that don't employ a reference
counting GC, i.e. PyPy.
This implies:
- changing opener(...).read() calls to opener.read(...)
- changing opener(...).write() calls to opener.write(...)
- changing open(...).read(...) to util.readfile(...)
- changing open(...).write(...) to util.writefile(...)
This speeds up the in-memory version of debugbuilddag that I'm
working on considerably for the case where we want to build just
a 00changelog.i (for discovery tests, for instance).
There are a couple of test changes because node ids in tests
have changed.
The changes to the patch names in test-mq-qdelete.t were required
because they could collide with nodeid abbreviations and newly
actually do (patch "c" collides with id "cafe..." for patch "b").
All tests repeatedly passes with a3900c75ca8c on some machines, but on other
machines it regularly causes failure in test-mv-cp-st-diff.t, such as:
@@ -203,6 +203,7 @@
- working to root: --rev 0
M a
+ M x/x
A b
a
The introduction of the new URL parsing code has created a startup
time regression. This is mainly due to the use of url.hasscheme() in
the ui class. It ends up importing many libraries that the url module
requires.
This fix helps marginally, but if we can get rid of the urllib import
in the URL parser all together, startup time will go back to normal.
perfstartup time before the URL refactoring (707e4b1e8064):
! wall 0.050692 comb 0.000000 user 0.000000 sys 0.000000 (best of 100)
current startup time (9ad1dce9e7f4):
! wall 0.070685 comb 0.000000 user 0.000000 sys 0.000000 (best of 100)
after this change:
! wall 0.064667 comb 0.000000 user 0.000000 sys 0.000000 (best of 100)
This is a long desired cleanup and paves the way for new discovery.
To specify subsets for bundling changes, all code should use the heads
of the desired subset ("heads") and the heads of the common subset
("common") to be excluded from the bundled set. These can be used
revlog.findmissing instead of revlog.nodesbetween.
This fixes an actual bug exposed by the change in test-bundle-r.t
where we try to bundle a changeset while specifying that said changeset
is to be assumed already present in the target. This used to still
bundle the changeset. It no longer does. This is similar to the bugs
fixed by the recent switch to heads/common for incoming/pull.
Updating the branch cache is quadratic to the amount of heads in the
repository. One consequence of this was that cloning a pathological
repository with 10,000 heads (and nothing else) took hours of CPU
time.
This patch makes one of the inner loop much faster, by removing a
changectx instantiation, and removes another entirely in cases where
there are no candidate branch heads which descend from other branch
heads.
New argument is silently ignored by both HTTP and SSH servers.
This means we can, for instance, add new flags to getbundle()
to request advanced features (like lightweight-copy-aware bundles),
and older servers will silently ignore this request and send back
a plain bundle.
If a closed head gets pulled, we currently see (example):
$ hg pull
pulling from $TESTTMP/repo2
searching for changes
adding changesets
adding manifests
adding file changes
added 2 changesets with 1 changes to 1 files (+1 heads)
(run 'hg heads' to see heads, 'hg merge' to merge)
A subsequent 'hg heads' doesn't show that head because it is closed.
This patch improves the UI response texts for that same use case to:
$ hg pull
pulling from $TESTTMP/repo2
searching for changes
adding changesets
adding manifests
adding file changes
added 2 changesets with 1 changes to 1 files
(run 'hg update' to get a working copy)
That is, the part "(+1 heads)" is not shown in that case any longer.
If a file is deleted (rm, not 'hg rm') from the working dir
an attempt to run 'hg diff -r X', with the file being present in X will
cause an abort.
We didn't check if the file has been deleted from the working dir
and later on tried to open it to compare with the one from X, causing the abort.
This fix adds that check. Consequently, no output will be returned.
The manifest value of a file will never be false when "not parentworking", and
the expensive content comparision would thus fortunately never be reached. (If
it was reached it would be wrong for example in case of renames.)
This code once handled status against working directory, but that has been done
elsewhere for a long time.
This replaces util.drop_scheme() with url.localpath(), using url.url for
parsing instead of doing it on its own. The function is moved from
util to url to avoid an import cycle.
hg.localpath() is removed in favor of using url.localpath(). This
provides more consistent behavior between "hg clone" and other
commands.
To preserve backwards compatibility, URLs like bundle://../foo still
refer to ../foo, not /foo.
If a URL contains a scheme, percent-encoded entities are decoded. When
there's no scheme, all characters are left untouched.
Comparison of old and new behaviors:
URL drop_scheme() hg.localpath() url.localpath()
=== ============= ============== ===============
file://foo/foo /foo foo/foo /foo
file://localhost:80/foo /foo localhost:80/foo /foo
file://localhost:/foo /foo localhost:/foo /foo
file://localhost/foo /foo /foo /foo
file:///foo /foo /foo /foo
file://foo (empty string) foo /
file:/foo /foo /foo /foo
file:foo foo foo foo
file:foo%23bar foo%23bar foo%23bar foo#bar
foo%23bar foo%23bar foo%23bar foo%23bar
/foo /foo /foo /foo
Windows-related paths on Windows:
URL drop_scheme() hg.localpath() url.localpath()
=== ============= ============== ===============
file:///C:/foo C:/C:/foo /C:/foo C:/foo
file:///D:/foo C:/D:/foo /D:/foo D:/foo
file://C:/foo C:/foo C:/foo C:/foo
file://D:/foo C:/foo D:/foo D:/foo
file:////foo/bar //foo/bar //foo/bar //foo/bar
//foo/bar //foo/bar //foo/bar //foo/bar
\\foo\bar //foo/bar //foo/bar \\foo\bar
Windows-related paths on other platforms:
file:///C:/foo C:/C:/foo /C:/foo C:/foo
file:///D:/foo C:/D:/foo /D:/foo D:/foo
file://C:/foo C:/foo C:/foo C:/foo
file://D:/foo C:/foo D:/foo D:/foo
file:////foo/bar //foo/bar //foo/bar //foo/bar
//foo/bar //foo/bar //foo/bar //foo/bar
\\foo\bar //foo/bar //foo/bar \\foo\bar
For more information about file:// URL handling, see:
http://www-archive.mozilla.org/quality/networking/testing/filetests.html
Related issues:
- issue1153: File URIs aren't handled correctly in windows
This patch should preserve the fix implemented in
5c92d05b064e. However, it goes a step further and "promotes"
Windows-style drive letters from being interpreted as host names to
being part of the path.
- issue2154: Cannot escape '#' in Mercurial URLs (#1172 in THG)
The fragment is still interpreted as a revision or a branch, even in
paths to bundles. However, when file: is used, percent-encoded
entities are decoded, so file:test%23bundle.hg can refer to
test#bundle.hg ond isk.
split can be more readable for longer lists like the list in
dirstate.invalidate. As dirstate.invalidate is used in wlock() and therefoe
used heavily, I think it's worth avoiding a split there too.
This patch avoids empty commit when .hgsubstate is dirty. Empty commit
was caused by .hgsubstate being updated back to the state of the
working copy parent when committing, if a user had changed it manually
and not made any changes in subrepositories.
The subrepository state from the working copies parent is compared
with the state calculated as a result of trying to commit the
subrepositories. If the two states are the same, then return None
otherwise the commit is just done.
The line: "committing subrepository x" will be written if there is
nothing committed, but .hgsubstate is dirty for x subrepository.
This uses the same strategy as progress for pulls, estimating manifests
based on changeset count and estimating file count by files list in
each changeset.
The opener already unlinks the filename before 'w'riting, for both
the symlink and the normal file case. It also now resets the flags
for normal files on 'w'rite, which makes this os.unlink call completely
redundant.
For Windows, removing this extra unlink call helps to avoid tripping
issue2524 (os.unlink followed by a write).
Previously, branch names were ideally manipulated as UTF-8 strings,
because they were stored as UTF-8 in the dirstate and the changelog
and could not be safely converted to the local encoding and back.
However, only about 80% of branch name code was actually using the
right encoding conventions. This patch uses the localstr addition to
allow working on branch names as local strings, which simplifies
handling so that the previously incorrect code becomes correct.
When issuing `hg pull -r REV` in a repo with no common ancestor with the
remote repo, the message 'requesting all changes' is printed, even though only
the changese that are ancestors of REV are actually requested. This can be
confusing for users (see
http://www.selenic.com/pipermail/mercurial/2010-October/035508.html).
This silences the message if (and only if) the '-r' option was passed.
- Mac OS X has problems with filenames starting with '._'
(e.g. '.FOO' -> '._f_o_o' is now encoded as '~2e_f_o_o')
- Explorer of Windows Vista and Windows 7 strip leading spaces of
path elements of filenames when copying trees
Above problems are avoided by encoding the first space (as '~20') or
period (as '~2e') of all path elements.
This introduces a new entry 'dotencode' in .hg/requires, that is,
a new repository filename layout (inside .hg/store).
Newly created repositories require 'dotencode' by default. Specifying
[format]
dotencode = False
in a config file will use the old format instead.
Prior Mercurial versions will abort with the message
abort: requirement 'dotencode' not supported!
when trying to access a local repository that requires 'dotencode'.
New 'dotencode' repositories can be converted to the previous
repository format with
hg --config format.dotencode=0 clone --pull repoA repoB