Without the blank line, the minirst parser renders
Term-1
Line-1
Line-2
Term-2
Line-1
as
Term-1
Line-1
Line-2 Term-2 Line-1
because the second term is seen as a paragraph.
git patches may require copies to be handled out-of-order. For instance, take
the following sequence:
* modify a
* copy a into b
Here, we have to generate b from a before its modification. To do so,
applydiff() was scanning for copy metadata and performing the copies before
processing the other changes in-order. While smart and efficient, this approach
complicates things by handling file copies and file creations at different
places and times. While a new file must not exist before being patched a copied
file already exists before applying the first hunk.
Instead of copying the files at their final destination before patching, we
store them in a temporary file location and retrieve them when patching. The
filestore always stores file content in real files but nothing prevents adding
a cache layer. The filestore class was kept separate from fsbackend for at
least two reasons:
- This class is likely to be reused as a temporary result store for a future
repository patching call (entries just have to be extended to contain copy
sources).
- Delegating this role to backends might be more efficient in a repository
backend case: the source files are already available in the repository itself
and do not need to be copied again. It also means that third-parties backend
would have to implement two other methods. If we ever decide to merge the
filestore feature into backend, a minimalistic approach would be to compose
with filestore directly. Keep in mind this copy overhead only applies for
copy/rename sources, and may even be reduced to copy sources which have to
handled ahead of time.
The patcher has to know if a file is being created or removed to check if the
target already exists, or to actually unlink the file when a hunk emptying it
is applied. This was done by embedding the creation/removal information in the
first (and only) hunk attached to the file.
There are two problems with this approach:
- creation/removal is really a property of the file being patched and not its
hunk.
- for regular patches, file creation cannot be deduced at parsing time: there
are case where the *stripped* file paths must be compared. Modifying hunks
after their creation is clumsy and prevent further refactorings related to
copies handling.
Instead, we delegate this job to selectfile() which has all the relevant
information, and remove the hunk createfile() and rmfile() methods.
workingctx.remove(list, unlink=True) is unsuited here, because it does too
much: it also unlinks added files. But the command 'hg remove' is specified
to *never* unlink added files.
Instead, we now unlink the files at the commands.remove level (if --after was
not specified) and use workingctx.forget for all files.
As an added bonus, this happens to eliminate a wlock acquire/release pair,
since the previous implementation caused
acquire wlock
release wlock
acquire wlock
release wlock
where the first pair of acquire/release was caused by the workingctx.forget
call, and the second by the workingctx.remove call.
Currently, if there is a bare git subrepo, but it is at the "right"
revision, calling dirty() will error because diff-index does not work
on bare repos. This patch makes it so bare subrepos are always
considered dirty.
and check if we got one before creating.
note that the contents of the ui object might change after
dispatch() returns (by options passed through --config for example),
to ensure it doesn't, pass a copy() of it.
Restore the previous diffstat behaviour of scaling by the maximum number of
changes to a single file. Changeset 7bb0e22a7988 modified the diffstat to be
scaled by the total number of changes. This seems to have been unintentional.
Firstly, I think we should do this for all new wire commands, just
to be on the safe side. So I want to get this into the 1.9 release.
Secondly, there actually is potential here that sometimes the server
can know that the number of its nodes which can possibly still be
undecided on the client is small. It might then just send them along
directly (cutting short the end game). This, however, requires
walking the graph on the server, which can be expensive, so for the
moment we're not actually doing it.
It has substantially different semantics from forget at the command
layer, so change it to avoid confusion.
We can't simply combine it with remove because we need to explicitly
drop non-added files in some cases like commit.
This new Python code should be equivalent in behavior to the if
statement at line 312 of parsers.c. Without this, the pure-python
parsers improperly ignore truncated revlogs as created in
test-verify.t.
Changeset 20b319765bcf introduced the unbundlehash capability and
unconditionally hashed the heads on the client side. By mistake, the
heads were also cased in the heads == ['force'] case.
extensions that depend on other extensions (such as record) use this pattern
to check if the dependant extension is available:
try:
mq = extensions.find('mq')
except KeyError:
return
but since if an error occurs while loading an extension it leaves its entry
in the _extensions map as None, we want to raise in that situation too.
(rather than adding another check if the return value is None)
Failing to do so makes it impossible to use the memctx API to create a
changeset with a commit message or username outside of the current
encoding.encoding setting.
requires ctypes
Why is posixfile a class?
Because the implementation needs to use the Python library call os.fdopen [1],
which sets the 'name' attribute on the Python file object it creates to the
mostly meaningless string '<fdopen>', since file descriptors don't have a name.
But users of posixfile depend on the name attribute [2] being set to a proper
value, like Python's built-in 'open' function sets it on file objects.
Python file's name attribute is read-only, so we can't just assign to it after
the file object has alrady been created.
To solve this problem, we save the name of the file on a wrapper object,
and delegate the file function calls to the wrapped (private) file object
using __getattr__.
[1] http://docs.python.org/library/os.html#os.fdopen
[2] http://docs.python.org/library/stdtypes.html#file.name
Extensions can hook discovery.findcommonincoming to filter out unwanted remote
changesets. This patch makes getremotechanges respect the changed remote heads
returned by such extensions.
gitpatch objects emitted by iterhunks() were referencing file paths unmodified
from the input patch. _applydif() made them usable by modifying the gitpatch
objects in-place with specified path strip level. The same modified objects
were then reused by iterhunks() generator. _applydiff() now copies and update
the paths which completely decouples both routines.
As a side effect, the "git" event now receives only metadata about
copies/renames to perform the necessary copies ahead of time. Other actions are
handled in the "file" event.
Patch changes are emitted by iterhunks() in two separate events: 'file' when
hunks have to be applied and 'git' to describe other modifications like copies
or mode changes. Note that a file which mode is changed and which content is
modified by the same patch will be emitted in both events. It is more
convenient to handle all file modifications in a single event. This patch
"zips" git actions with regular changes so both kinds can be emitted at the
same place.
git afile/bfile are extracted twice, once when reading a 'diff --git', and
again when reading a unified hunk. The problem is not all git blocks have
unified hunks (renames just have metadata) and they were not extracted the same
way. This is what this patch unifies.
gitpatch objects emitted by iterhunks() are modified in place by applydiff().
Processing them earlier improves iterhunks() isolation. applydiff() modifying
them should still be fixed though.
_updatedir() is no longer used by internalpatch()
The change in test-mq-missingfiles.t comes from workingbackend not considering
the missing 'b' file as changed, thus not calling addremove() on it.
This introduces a performance regression for large files, as they will be
copied just to be clobbered afterwards since binary patching does not use
deltas. But it simplifies the code and the previous optimization will be
reintroduced later in a better way.
_applydiff() patcher argument was added to help hgsubversion like extension
monkeypatching the patching process. While it could be removed at this point, I
prefer to leave it until patch.py is completely refactored and there is a valid
and tested alternative.
This greatly improves the speed of the bundling process, and often reduces the
bundle size considerably. (Although if the repository is already ordered, this
has little effect on both time and bundle size.)
For non-generaldelta clients, the reduced bundle size translates to a reduced
repository size, similar to shrinking the revlogs (which uses the exact same
algorithm). For generaldelta clients the difference is minor.
When the new bundle format comes, reordering will not be necessary since we
can then store the deltaparent relationsships directly. The eventual default
behavior for clients and servers is presented in the table below, where "new"
implies support for GD as well as the new bundle format:
old client new client
old server old bundle, no reorder old bundle, no reorder
new server, non-GD old bundle, no reorder[1] old bundle, no reorder[2]
new server, GD old bundle, reorder[3] new bundle, no reorder[4]
[1] reordering is expensive on the server in this case, skip it
[2] client can choose to do its own redelta here
[3] reordering is needed because otherwise the pull does a lot of extra
work on the server
[4] reordering isn't needed because client can get deltabase in bundle
format
Currently, the default is to reorder on GD-servers, and not otherwise. A new
setting, bundle.reorder, has been added to override the default reordering
behavior. It can be set to either 'auto' (the default), or any true or false
value as a standard boolean setting, to either force the reordering on or off
regardless of generaldelta.
Some timing data from a relatively branch test repository follows. All
bundling is done with --all --type none options.
Non-generaldelta, non-shrunk repo:
-----------------------------------
Size: 276M
Without reorder (default):
Bundle time: 14.4 seconds
Bundle size: 939M
With reorder:
Bundle time: 1 minute, 29.3 seconds
Bundle size: 381M
Generaldelta, non-shrunk repo:
-----------------------------------
Size: 87M
Without reorder:
Bundle time: 2 minutes, 1.4 seconds
Bundle size: 939M
With reorder (default):
Bundle time: 25.5 seconds
Bundle size: 381M
Most filesystem calls are already isolated in patchfile but this is not enough:
renames are performed before patchfile is available and some chmod calls are
even done outside of the applydiff call. Once all these calls are extracted
into a backend class, we can provide cleaner APIs to write to a working
directory context directly into the repository.
Previously the connection cache for keepalives didn't keep track of
ssl. This meant that when we connected to an https server after that
same server via http, both on the default port, we'd incorrectly reuse
the non-https connection as the default port meant the connection
cache key was the same.
When in plain mode with "alias" present in the exception list,
keep the aliases. This will be used later to enable auto-completion.
Signed-off-by: "Yann E. MORIN" <yann.morin.1998@anciens.enib.fr>
Let ui.plain() accept an optional parameter in the form of a feature
name (as a string) to exclude from plain mode.
The result of ui.plain is now:
- False if HGPLAIN is not set or the requested feature is in HGPLAINEXCEPT
- True otherwise
Signed-off-by: "Yann E. MORIN" <yann.morin.1998@anciens.enib.fr>
This is a feature of ctypes. Without these, pypy complains with
RuntimeWarning: C function without declared arguments called
RuntimeWarning: C function without declared return type called
As a side effect of specifying restypes, the return value of e.g. CreateFileA
is now implicitly converted to an instance of _HANDLE, so we also need to
change the definition
_INVALID_HANDLE_VALUE = -1
to
_INVALID_HANDLE_VALUE = _HANDLE(-1).value
Otherwise, tests for equality to _INVALID_HANDLE_VALUE in code like
def _getfileinfo(name):
fh = _kernel32.CreateFileA(name, 0,
_FILE_SHARE_READ | _FILE_SHARE_WRITE | _FILE_SHARE_DELETE,
None, _OPEN_EXISTING, 0, None)
if fh == _INVALID_HANDLE_VALUE:
_raiseoserror(name)
would now fail to detect an invalid handle, which in turn would lead to
exceptions raised with wrong errno values, like e.g.
>>> nlinks('missing.txt')
Traceback (most recent call last):
...
OSError: [Errno 9] missing.txt: The handle is invalid.
instead of the correct (as per this patch and before it)
>>> nlinks('missing.txt')
Traceback (most recent call last):
...
OSError: [Errno 2] missing.txt: The system cannot find the file specified.
When used as the default merge tool, or used as a --tool override,
the simplemerge script should not be allowed to raise a util.Abort
just because one of the files being merged is binary. Instead, return
1 and mark the file unresolved.
defversion was a property (later option) on the store opener, used to propagate
the changelog revlog format to the other revlogs, so they would be created with
the same format.
This required that the changelog instance was created before any other revlog;
an invariant that wasn't directly enforced (or documented) anywhere.
We now use the revlogv1 requirement instead, which is transfered to the store
opener options. If this option is missing, v0 revlogs are created.
Suppresses output (resolved paths or "not found!") when searching a path,
similar to "grep -q".
Sample usage: hg paths -q foo || echo "there is no foo"
Just prints path names (instead of "name = result") when listing all path
definitions, like "hg bookmarks -q".
Sample usage: hg paths -q | while read i; do hg incoming "$i"; done
These open the changelog and manifest, respectively, directly so you don't
need to specify the path.
The options have been added to debugindex, debugdata and debugrevlog.
The patch also fixes some minor usage-related bugs.
str(url) was recently changed to return only file:/. However, the
canonical way to represent absolute local paths is file:/// [1], which
is also expected by at least hgsubversion.
Relative paths are returned as file:the/relative/path.
[1] http://en.wikipedia.org/wiki/File_URI_scheme
Without this change, pulls (and clones) into a generaldelta repository could
generate very inefficient revlogs, the size of which could be at least twice
the original size.
This was caused by the generated delta chains covering too large distances,
causing new chains to be built far too often. This change addresses the
problem by forcing a delta against second parent or against the previous
revision, when the first parent delta is in danger of creating a long chain.
The bug didn't cause corruption, and thus wasn't caught in hg verify or in
tests. It could lead to delta chains longer than normally allowed, by
affecting the code that decides when to add a full revision. This could,
in turn, lead to performance regression.
as stated at http://mercurial.selenic.com/wiki/CodingStyle
(see also http://selenic.com/pipermail/mercurial-devel/2011-May/031139.html )
name changes done at module scope:
_build_lower_encodefun -> _buildlowerencodefun
_windows_reserved_filenames -> _winreservednames (see 42a6bcfc9d44)
MAX_PATH_LEN_IN_HGSTORE -> _maxstorepathlen
DIR_PREFIX_LEN -> _dirprefixlen
_MAX_SHORTENED_DIRS_LEN -> _maxshortdirslen
(no users of these outside the store module)
changed locals:
win_reserved -> winreserved
space_left -> spaceleft
Before this patch undo.bookmarks was created on bookmarks write and
not with other transaction-related files. There were two issues: first
is that if you have changed bookmarks few times after a transaction
happened, rollback will give you a state which can point to
non-existing revision. Second is that if you have not changed
bookmarks after a transaction, rollback will touch your state anyway.
This change also adds `localrepo._writejournal` method, which can be
used by other extensions to save their transaction-related backup in
right time.
With Python >= 2.6 the original code already works correct, therefore the
fix for issue2556 on Python <= 2.5 broke relative subrepositories with
newer versions of Python.
This happens more often than expected. Say you have an svn subrepository with
python code. Python would have generated unknown .pyc files. Now, you rebase
this setup on a revision where a directory containing python code does not
exist. Subversion is first asked to remove this directory when updating, but
will not because it contains untracked items. Then it will have to bring back
the directory after the merge but will fail because it now collides with an
untracked directory.
Using --force is not very elegant and only works with svn >= 1.5 but the only
alternative I can think of is to write our own purge command for subversion.
Inspired by critique given on StackOverflow where a user writes:
I can have a good guess at what "%USERPROFILE%" might signify but
none of the files listed in the "hg help config" output exist after
running the installer. Previous experience would suggest that
missing files mean something somewhere has gone seriously wrong.
http://stackoverflow.com/questions/2329023/2351139#2351139
localstr's hash method exists to prevent bogus matching on lossy local
encodings. For instance, we don't want 'caf?' to match 'café' in an
ASCII locale.
But when café can be losslessly encoded in the local charset, we can
simply use a normal string and avoid the hashing trick.
This avoids using localstr's hash method, which would prevent a match between